Machine Learning and Friends Lunch

Symbolic Generalization for On-line Planning


Zhengzhu Feng
UMass

Abstract


The ability to generalize over experience is crucial to an on-line planner, both when the state space is large and when obtaining experience is expensive relative to the cost of computation. Previous work has focused on generalization using function approximation; however, such methods cannot guarantee convergence. In this talk, I will present a new generalization method based on the symbolic model-checking paradigm that guarantees convergence to the optimal solution.

Symbolic representations have been used successfully in off-line planning algorithms for Markov decision processes. I will show that a symbolic representation can also improve the performance of an on-line planner. In addition to reducing computation time, symbolic generalization can reduce the amount of costly real-world interaction required for convergence. I will present an extension of Real-Time Dynamic Programming (RTDP), called Symbolic RTDP (or sRTDP), that uses symbolic model-checking techniques to generalize its on-line experience. After each step of on-line interaction with an environment modeled as a Markov decision process, sRTDP generalizes its experience by updating a group of states rather than a single state. I will present two approaches to dynamically grouping states, each of which significantly accelerates planning in terms of both CPU time and the number of interactions with the environment.
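To make the generalized backup concrete, here is a minimal Python sketch of an RTDP loop that backs up a group of states after each real interaction, in the spirit of sRTDP. Everything in it is illustrative: the chain-world MDP, the function names, and especially the `group_of` neighborhood rule, which is a hypothetical stand-in for the symbolic grouping that sRTDP derives with model-checking operations over a factored state representation.

```python
# Illustrative sketch only: RTDP with grouped ("generalized") backups.
import random

# A toy chain MDP: states 0..N-1; actions try to move left or right but
# may fail; state N-1 is an absorbing, zero-cost goal.
N = 10
GOAL = N - 1
ACTIONS = (-1, +1)

def transitions(s, a):
    """Return a list of (next_state, probability) pairs for (s, a)."""
    if s == GOAL:
        return [(s, 1.0)]
    intended = min(max(s + a, 0), N - 1)
    return [(intended, 0.8), (s, 0.2)]  # 20% chance the move fails

def cost(s, a):
    return 0.0 if s == GOAL else 1.0

def q_value(V, s, a):
    return cost(s, a) + sum(p * V[t] for t, p in transitions(s, a))

def backup(V, s):
    """Standard Bellman backup of a single state (cost-minimizing)."""
    V[s] = min(q_value(V, s, a) for a in ACTIONS)

def group_of(s):
    """Hypothetical grouping rule: the state plus its neighbors. sRTDP
    would instead derive the group symbolically, e.g. as an abstract
    state over the features relevant to the current backup."""
    return [t for t in (s - 1, s, s + 1) if 0 <= t < N]

def sample_next(s, a):
    """Sample a successor state from the transition distribution."""
    r, acc = random.random(), 0.0
    for t, p in transitions(s, a):
        acc += p
        if r <= acc:
            return t
    return t  # guard against floating-point round-off

def srtdp_sketch(trials=200):
    V = [0.0] * N  # all-zero values: a lower bound on the cost-to-go
    for _ in range(trials):
        s = 0
        while s != GOAL:
            # Generalized step: back up a whole group, not just s.
            for t in group_of(s):
                backup(V, t)
            a = min(ACTIONS, key=lambda act: q_value(V, s, act))
            s = sample_next(s, a)
    return V

if __name__ == "__main__":
    print([round(v, 2) for v in srtdp_sketch()])
```

The all-zero initial value function matters here: with nonnegative step costs it is an admissible lower bound on the cost-to-go, which is the condition RTDP's convergence guarantee relies on. Grouped backups change how many states are updated per real step, not the fixed point being computed.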
