Reinforcement Learning

Papers and Projects

Final Projects

Textbooks:
Markov Decision Processes        Martin L. Puterman
Reinforcement Learning           Richard S. Sutton and Andrew G. Barto

First Class:
1.  Course description (postscript, html)
2.  Template for scribe notes (postscript, latex, html)
     and an explanation of LaTeX (postscript, latex, html)
3.  Slides of the first class (PowerPoint, postscript, html)

Second Class:
Finished the overview.

Third and Fourth Classes:
Model of Markov Decision Processes (MDPs) and
Finite Horizon Problems.

Lecture 3 (postscript, latex, html)
Lecture 4 (postscript, latex, html)

Homework 1 (postscript, latex, html)

Fifth and Sixth Classes:
Infinite Horizon Discounted Problems.

Lecture 5 (postscript, latex, html)
Lecture 6 (postscript, latex, html)

Homework 2 (postscript, latex, html)

Seventh, Eighth and Ninth Classes:
Learning with an unknown model.

Lecture 7 (postscript, latex, html)    Monte Carlo Algorithms
Lecture 8 (postscript, latex, html)    Temporal Difference (TD) Algorithms
Lecture 9 (postscript, latex, html)    Q-Learning (and SARSA) Algorithms

Homework 3 (postscript, latex, html)

Lectures Ten and Eleven:
Learning with a large state space.

Lecture 10 (postscript, latex, html)    TD-Gammon

Lecture 11 (postscript, latex, html)    Large state space

Lecture Twelve:
Partially Observable MDPs.

Lecture Thirteen:
Generator model and sparse sampling in large MDPs.

PROJECT (postscript, latex, html)