The Neuroscience of Reinforcement Learning
Yael Niv
yael at princeton dot edu

Overview and goals:

One of the most influential contributions of machine learning to understanding the human brain is the (fairly recent) formulation of learning in real-world tasks in terms of the computational framework of reinforcement learning. This confluence is not limited to abstract ideas about how trial-and-error learning should proceed: current views regarding the computational roles of extremely important brain substances (such as dopamine) and brain areas (such as the basal ganglia) draw heavily on reinforcement learning. The results of this growing line of research stand to contribute not only to neuroscience and psychology, but also to machine learning: human and animal brains are remarkably adept at learning new tasks in an uncertain, dynamic and extremely complex world. Understanding how the brain implements reinforcement learning efficiently may suggest similar solutions to problems in engineering and artificial intelligence. This tutorial will present the current state of the study of neural reinforcement learning, with an emphasis on both what it teaches us about the brain and what it teaches us about reinforcement learning.

Target Audience:

The target audience is researchers working in the field of reinforcement learning who are interested in the current state of the art of neuroscientific applications of this theoretical framework, as well as researchers working in fields related to machine learning, such as engineering and robotics. Basic knowledge of reinforcement learning (MDPs, dynamic programming, online temporal difference methods) will be assumed; basic knowledge of neuroscience or psychology will not.

Tutorial outline:

  1. Introduction: A coarse-grain overview of the brain and what we currently know about how it works
  2. Learning and decision making in animals and humans: is this really a reinforcement learning problem?
  3. Dopamine and prediction errors: what we know about dopamine, why we think it computes a temporal difference prediction error, and why we should care. Evidence for the prediction error hypothesis of dopamine
  4. Actor/Critic architectures in the basal ganglia: a distribution of functions in a learning network
  5. SARSA versus Q-learning: can dopamine reveal what algorithm the brain actually uses?
  6. Multiple learning systems in the brain: what is the evidence for both model based and model free reinforcement learning systems in the brain, why have more than one system, and how to arbitrate between them
  7. Beyond phasic dopamine: average reward reinforcement learning, tonic dopamine and the control of response vigor
  8. Risk and reinforcement learning: can the brain tell us something about learning of the variance of rewards?
  9. Open challenges and future directions: what more can reinforcement learning teach us about the brain, and where can we expect the brain to teach us about reinforcement learning?
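
For readers who want a concrete reminder of the quantities named in items 3 and 5, the following is a minimal sketch of the standard textbook forms of the temporal difference prediction error and of the SARSA and Q-learning errors. It is not taken from the tutorial slides; the function names, the toy Q-table, and the discount factor of 0.9 are assumptions chosen purely for illustration.

```python
def td_error(reward, value_next, value_current, gamma=0.9):
    """Temporal difference prediction error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_current


def sarsa_error(q, s, a, r, s_next, a_next, gamma=0.9):
    """On-policy error: bootstraps from the action actually chosen next."""
    return r + gamma * q[s_next][a_next] - q[s][a]


def q_learning_error(q, s, a, r, s_next, gamma=0.9):
    """Off-policy error: bootstraps from the highest-valued next action."""
    return r + gamma * max(q[s_next].values()) - q[s][a]


# Hypothetical two-state example: a cue state followed by a choice state.
q = {
    "cue": {"left": 0.0, "right": 0.0},
    "choice": {"left": 1.0, "right": 4.0},
}

# The agent is about to take the lower-valued action ("left") at the choice state.
delta_sarsa = sarsa_error(q, "cue", "left", 0.0, "choice", "left")
delta_q = q_learning_error(q, "cue", "left", 0.0, "choice")

print("SARSA error:", delta_sarsa)
print("Q-learning error:", delta_q)
```

In this toy case the two errors differ only because the chosen next action is not the best one: SARSA uses the value of the action actually taken, whereas Q-learning uses the maximum over actions. This difference is the kind of signature that item 5 asks whether dopamine recordings can reveal.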

Slides: (last updated 14/6/2009)

Slides for printing (no background) can be found here
Slides for viewing on screen (with all the graphics) can be found here

The tutorial will be based loosely on:

  1. Reviews
    • Y Niv (in press) - Reinforcement learning in the brain - The Journal of Mathematical Psychology. PDF
    • P Dayan & Y Niv (2008) - Reinforcement learning and the brain: The Good, The Bad and The Ugly - Current Opinion in Neurobiology, 18(2), 185-196. PDF
    • D Joel, Y Niv & E Ruppin (2002) - Actor-critic models of the basal ganglia: New anatomical and computational perspectives - Neural Networks 15, PDF
  2. Research papers
    • Y Niv, ND Daw & P Dayan (2005) - How fast to work: Response vigor, motivation and tonic dopamine - In: Y Weiss, B Scholkopf & J Platt, eds., Neural Information Processing Systems 18, 1019-1026, MIT Press. PDF
    • ND Daw, Y Niv & P Dayan (2005) - Uncertainty based competition between prefrontal and dorsolateral striatal systems for behavioral control - Nature Neuroscience, 8(12), 1704-1711. PDF
    • Y Niv, MO Duff & P Dayan (2005) - Dopamine, Uncertainty and TD Learning - Behavioral and Brain Functions 1:6 (4 May 2005), doi:10.1186/1744-9081-1-6. Open Access Full text

Presenter bio: I am an assistant professor at Princeton University, working on the interaction of reinforcement learning and state representation in early task learning. My formal training is in computational neuroscience (Interdisciplinary undergraduate program, Tel Aviv University; MA in psychobiology, Tel Aviv University; PhD at the Interdisciplinary Center for Neural Computation at The Hebrew University of Jerusalem and The Gatsby Computational Neuroscience Unit at UCL). Both my master's and doctoral theses were theoretical investigations into the implications of reinforcement learning for human and animal decision making, and the implementation of reinforcement learning in the brain.

Last modified: Yael Niv - March 2009