Hierarchical Organization of Behavior:
Computational, Psychological and Neural Perspectives

Organized by:
Yael Niv, Princeton University
Matthew Botvinick, Princeton University
Andrew Barto, University of Massachusetts, Amherst

This website is the product of a NIPS 2007 workshop that brought together front-line researchers from computer science, psychology and neuroscience to discuss current ideas regarding the learning and control of hierarchically structured behavior. The aim was to glean new insights by integrating knowledge from these somewhat disparate areas of active research into the hierarchical organization of behavior.

The workshop schedule can be found here or in the more official format here.

Workshop resources: (in order of talks)

  1. Yael Niv - Introduction to hierarchical reinforcement learning (slides)
  2. Matt Botvinick - Hierarchical reinforcement learning and the brain: Potential connections (slides)
  3. Rick Cooper (with Nicholas Ruh & Denis Mareschal) - The hierarchies that underlie routine behavior (slides)
  4. David Badre - Cognitive control, hierarchy, and the rostro-caudal organization of the prefrontal cortex (slides)
  5. Etienne Koechlin - Architecture of central executive functions in the human prefrontal cortex (extended abstract)
  6. Bhaskara Marthi (with Leslie Kaelbling & Tomas Lozano-Perez) - Learning hierarchical structure in policies
  7. Wilco Moerman (with Bram Bakker & Marco Wiering) - Hierarchical assignment of behaviours to subpolicies (slides,demo,abstract)
  8. Jordan Frank (with Doina Precup) - Recognizers: A study in learning how to model temporally extended behaviors
  9. Jeremy Reynolds (with Todd Braver & Randall O'Reilly) - Computational, behavioral and neuro-imaging methods investigating the hierarchical organization of prefrontal cortex and goal-oriented behavior
  10. Kai Krueger (with Peter Dayan) - Flexible shaping: How learning in small steps helps (slides)
  11. Joanna Bryson - Hierarchical organization of intelligence: Ethological and AI perspectives (slides)
  12. Frank Krueger (with Jordan Grafman) - Structured event complexes in the human prefrontal cortex
  13. Kalina Christoff - Prefrontal topography of cognitive control according to levels of abstraction
  14. Neville Mehta (with Mike Wynkoop, Soumya Ray, Prasad Tadepalli & Tom Dietterich) - Automatic induction of MAXQ hierarchies (slides,extended abstract)
  15. Jason Wolfe (with Bhaskara Marthi & Stuart Russell) - Hierarchical lookahead agents: A preliminary report (slides)
  16. Rich Sutton - Hierarchy, behavior and off-policy learning (slides)
  17. Zico Kolter (with Peter Abbeel & Andrew Ng) - Hierarchical apprenticeship learning with applications to quadruped locomotion
  18. Harm van Seijen (with Bram Bakker & Leon Kester) - Reinforcement learning with multiple, qualitatively different state representations (slides, movie about switching between state abstractions)
  19. Zachary Stein - Addressing the American problem by modeling cognitive development (slides, abstract, relevant publications can be found at Kurt Fischer's website)
  20. Andy Barto - Intrinsically motivated hierarchical reinforcement learning (slides)

Relevant papers: (to add a paper to this list please email yael at princeton dot edu)

  • MM Botvinick, Y Niv & AC Barto (submitted) - Hierarchically organized behavior and its neural foundations: A reinforcement-learning perspective PDF
    A number of people expressed interest in this not-yet-published paper which reviews the connections between RL and the psychological-neural perspective on hierarchies. This paper is also where the idea of the workshop began.

  • MM Botvinick (2007) - Multilevel structure in behaviour and in the brain: a model of Fuster's hierarchy. Phil Trans R Soc B PDF
    In this paper a hierarchically structured model spontaneously develops an 'anatomical' division of labor akin to what several of the PFC speakers discussed.

  • Various papers from the lab of Jurgen Schmidhuber (can be found on his site on hierarchical learning) -
    The hierarchical reinforcement learning systems below discover useful intermediate subgoals without a teacher providing them. Some use gradient-based subgoal generators, some search a discrete subgoal space, and some use recurrent networks to deal with partial observability (the latter being an almost automatic consequence of realistic hierarchical reinforcement learning):

    • B Bakker & J Schmidhuber (2004) - Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization. In F Groen, N Amato, A Bonarini, E Yoshida, and B Kroese (Eds.), Proceedings of the 8th Conference on Intelligent Autonomous Systems, IAS-8, Amsterdam, The Netherlands, p. 438-445

    • B Bakker & J Schmidhuber (2004) - Hierarchical reinforcement learning with subpolicies specializing for learned subgoals. In MH Hamza (Ed.), Proceedings of the 2nd IASTED International Conference on Neural Networks and Computational Intelligence, NCI 2004, Grindelwald, Switzerland, p. 125-130

    • A program search-based RL method that divides and conquers:
      R Salustowicz & J Schmidhuber (1998) - Learning to predict through PIPE and automatic task decomposition. Technical Report IDSIA-11-98, IDSIA

    • A method that decomposes POMDPs into sequences of MDPs; memory resides in the agent number (i.e., in which agent of the sequence is currently active):
      M Wiering & J Schmidhuber (1997) - HQ-Learning. Adaptive Behavior 6(2):219-246

    • Gradient-based learning of subgoal sequences:
      • J Schmidhuber & R Wahnsiedler (1992) - Planning simple trajectories using neural subgoal generators. In JA Meyer, HL Roitblat, & SW Wilson, eds, Proceedings of the 2nd International Conference on Simulation of Adaptive Behavior, p. 196-202. MIT Press
      • J Schmidhuber (1991) - Learning to generate sub-goals for action sequences. In T Kohonen, K Mäkisara, O Simula, & J Kangas, eds, Artificial Neural Networks, p. 967-972. Elsevier Science Publishers
      • J Schmidhuber (1990) - Towards compositional learning with dynamic neural networks. Technical Report FKI-129-90, Institut für Informatik, Technische Universität München

    • A theoretically optimal way of creating and solving subgoals in general reinforcement learning settings is the Goedel Machine. Goedel machine papers:
      • J Schmidhuber (2005) - Completely self-referential optimal reinforcement learners. In W Duch et al., eds, Proc. Intl. Conf. on Artificial Neural Networks ICANN'05, LNCS 3697, p. 223-233, Springer-Verlag Berlin Heidelberg
      • J Schmidhuber (2006) - Goedel machines: self-referential universal problem solvers making provably optimal self-improvements. In B Goertzel & C Pennachin, eds, Artificial General Intelligence, p. 119-226
      • J Schmidhuber (2005) - A technical justification of consciousness. 9th annual meeting of the Association for the Scientific Study of Consciousness ASSC, Caltech, Pasadena, CA
      • J Schmidhuber (2005) - Goedel machines: towards a technical justification of consciousness. In D Kudenko, D Kazakov & E Alonso, eds, Adaptive Agents and Multi-Agent Systems III LNCS 3394, p. 1-23, Springer

    • A bias-optimal way of creating and solving subgoals in the context of ordered problem sequences is the Optimal Ordered Problem Solver. OOPS Publications:
      • J Schmidhuber (2004) - Optimal ordered problem solver. Machine Learning, 54, 211-254
      • J Schmidhuber, V Zhumatiy & M Gagliolo (2004) - Bias-optimal incremental learning of control sequences for virtual robots. In F Groen, N Amato, A Bonarini, E Yoshida & B Kroese, eds, Proceedings of the 8th Conference on Intelligent Autonomous Systems, IAS-8, Amsterdam, The Netherlands, p. 658-665
      • J Schmidhuber (2003) - Bias-optimal incremental problem solving. In S Becker, S Thrun & K Obermayer, eds, Advances in Neural Information Processing Systems 15, NIPS15, MIT Press, Cambridge MA, p. 1571-1578

    • Predictor hierarchies are treated here:
      J Schmidhuber (1992) - Learning complex, extended sequences using the principle of history compression. Neural Computation, 4(2):234-242

    • A dozen or so papers on intrinsically motivated reinforcement learning (or artificial curiosity) can be found here

  • Papers from Andy Barto's lab on intrinsically motivated reinforcement learning and skill formation:

    • S Singh, AG Barto & N Chentanez (2005) - Intrinsically motivated reinforcement learning. 18th Annual Conference on Neural Information Processing Systems (NIPS), Vancouver, B.C., Canada PDF
    • AG Barto, S Singh & N Chentanez (2004) - Intrinsically motivated learning of hierarchical collections of skills. International Conference on Developmental Learning (ICDL), La Jolla, CA, USA PDF
    • O Simsek, AP Wolfe & AG Barto (2005) - Identifying useful subgoals in reinforcement learning by local graph partitioning. Proceedings of the Twenty-Second International Conference on Machine Learning ICML 05, Bonn, Germany PDF
    • A Stout, GD Konidaris & AG Barto (2005) - Intrinsically motivated reinforcement learning: A promising framework for developmental robot learning. Proceedings of the AAAI Spring Symposium on Developmental Robotics, Stanford University, Stanford, CA PDF
    • O Simsek & AG Barto (2006) - An intrinsic reward mechanism for efficient exploration. Proceedings of the Twenty-Third International Conference on Machine Learning (ICML 06), Pittsburgh, PA PDF
    • GD Konidaris & AG Barto (2006) - Building portable options: Skill transfer in reinforcement learning. University of Massachusetts Department of Computer Science Technical Report UM-CS-2006-17 PDF
    • GD Konidaris & AG Barto (2006) - An adaptive robot motivational system. Animals to Animats 9: Proceedings of the 9th International Conference on Simulation of Adaptive Behavior (SAB-06), CNR, Roma, Italy PDF
    • A Jonsson & AG Barto (2006) - Causal graph based decomposition of factored MDPs. Journal of Machine Learning Research, 7:2259-2301 PDF
    • O Simsek & AG Barto (2007) - Betweenness centrality as a basis for forming skills. University of Massachusetts, Department of Computer Science Technical Report TR-2007-26 PDF

  • Papers from Clay Holroyd's lab on hierarchical error processing:

    • OE Krigolson & CB Holroyd (2006) - Evidence for hierarchical error processing in the human brain. Neuroscience, 137, 13-17 PDF
    • OE Krigolson & CB Holroyd (2007) - Hierarchical error processing: Different errors, different systems. Brain Research, 1155, 70-80 PDF
    • OE Krigolson & CB Holroyd (2007) - Predictive information and error processing: The role of medial-frontal cortex during motor control. Psychophysiology, 44, 586-595 PDF

  • JJ Bryson (2000) - The study of sequential and hierarchical organisation of behaviour via artificial mechanisms of action selection. MPhil Dissertation, University of Edinburgh, Department of Psychology PDF

  • JJ Bryson (2000) - Hierarchy and sequence vs. full parallelism in action selection. Simulation of Adaptive Behavior 6, p.147-156 PDF

  • Papers from Daniel Polani's lab on self-motivated behaviour generation and related issues, based on information-theoretic methods:

    • Two papers that use a sensorimotor loop efficiency measure as a driver for 'self-motivated' state selection:
      • AS Klyubin, D Polani & CL Nehaniv (2005) - All else being equal be empowered. In: Advances in Artificial Life, European Conference on Artificial Life (ECAL 2005), vol. 3630 of LNAI, 393-402, Springer PDF
      • AS Klyubin, D Polani & CL Nehaniv (2005) - Empowerment: A universal agent-centric measure of control. In: Proceedings of CEC 2005, IEEE PDF

    • This one looks at 'relevant information' for an agent in the context of a RL task. It provides an information resource-limited view on RL decision making:
      • D Polani, CL Nehaniv, T Martinetz & JT Kim (2006) - Relevant information in optimized persistence vs. progeny strategies. In: Proceedings of Artificial Life X PDF

    • Two papers on an information-theoretic model for emerging representations in an agent's perception-action loop. The first is a much expanded version of the second, but is not freely accessible.
      • AS Klyubin, D Polani & CL Nehaniv (2007) - Representations of space and time in the maximization of information flow in the perception-action loop. Neural Computation, 19(9), 2387-2432 abstract
      • AS Klyubin, D Polani & CL Nehaniv (2004) - Organization of the information flow in the perception-action loop of evolved agents. In Proceedings of 2004 NASA/DoD Conference on Evolvable Hardware, IEEE Computer Society PDF
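
A recurring technical thread across many of the papers above is temporally extended action: a higher-level learner chooses among subpolicies that each run until a subgoal is reached, and its values are updated with discounting over the whole multi-step transition (SMDP Q-learning). The following self-contained sketch illustrates that idea with hand-coded subgoal options on a toy chain task. The task, the subgoal set and all names are illustrative inventions, not taken from any of the papers listed here.

```python
import random

# Toy chain task: states 0..9, reward 1.0 on reaching the goal state.
N, GOAL, GAMMA = 10, 9, 0.9
SUBGOALS = [3, 7, 9]  # hand-chosen subgoals (illustrative, not discovered)

def run_option(state, subgoal):
    """Run the fixed subpolicy 'step toward subgoal' until it terminates.
    Returns (end state, cumulative discounted reward, duration in steps)."""
    reward, t = 0.0, 0
    while state != subgoal:
        state += 1 if subgoal > state else -1
        t += 1
        if state == GOAL:
            reward += GAMMA ** (t - 1)  # reward discounted to option start
    return state, reward, t

# One Q-value per (state, option); option i means "go to SUBGOALS[i]".
Q = [[0.0] * len(SUBGOALS) for _ in range(N)]
rng = random.Random(0)

def choose(state, eps):
    """Epsilon-greedy over options whose subgoal differs from the state."""
    avail = [i for i, g in enumerate(SUBGOALS) if g != state]
    if rng.random() < eps:
        return rng.choice(avail)
    return max(avail, key=lambda i: Q[state][i])

# SMDP Q-learning: the bootstrap term is discounted by GAMMA ** t,
# where t is the actual duration of the executed option.
ALPHA = 0.5
for episode in range(200):
    s = 0
    while s != GOAL:
        o = choose(s, eps=0.1)
        s2, r, t = run_option(s, SUBGOALS[o])
        target = r + (GAMMA ** t) * max(Q[s2])
        Q[s][o] += ALPHA * (target - Q[s][o])
        s = s2
```

After training, acting greedily (choose(s, eps=0.0)) chains a few options to the goal instead of planning over primitive steps. Replacing the hand-coded SUBGOALS with automatically discovered ones is precisely the problem addressed by the subgoal-discovery papers above.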

Copyright Disclaimer: The documents linked here are included by the contributing authors as a means to ensure timely dissemination of scholarly and technical work on a non-commercial basis. Copyright and all rights therein are maintained by the authors or by other copyright holders, notwithstanding that they have offered their works here electronically. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright.
Last modified: Yael Niv, 13 May 2008