-----------
Tel-Aviv University - Computer Science Colloquium

Sunday, February 20, 14:15-15:15
COFFEE at 14:00

Schreiber Building, Room 309
-----------

Planning in Markov Decision Processes using Sparse Sampling Methods

Yishay Mansour

Tel Aviv University

Abstract:

In this work we investigate planing in Markov Decision Processes (MDP)
whose state space is exponential (or infinite). To describe the MDP,
we assume a generative model that receives a state and action, and
outputs the next state (according to the distribution given by the MDP).

We show that, given a generative model, one can compute efficiently
a near-optimal policy (in time which is independent from the size
of the underlying MDP).  Our technique can be viewed as a form of sparse
sampling in an MDP.

Similarly, we address the question of finding a near-optimal
policy from a limited class of policies. We show that for a
finite class of policies, it is sufficient that the number of
queries to the generative model grows only logarithmically in
the number of policies. We also show that for an infinite
class of policies, one can use the VC-dimension to bound the
number of queries to the generative model.

The talk will not assume any prior knowledge about Markov Decision Processes.

This is a joint work with Michael Kearns (AT&T) and Andrew Ng (U. C. Berkeley)
 

-----------

For colloquium schedule, see http://www.math.tau.ac.il/~zwick/colloq.html