VDCBPI: An Approximate Scalable Algorithm for Large POMDPs

Source: University of Toronto

Favorite

Free registration required

Existing algorithms for discrete partially observable Markov decision processes can at best solve problems of a few thousand states due to two important sources of intractability: the curse of dimensionality and the policy space complexity. This paper describes a new algorithm (VDCBPI) that mitigates both sources of intractability by combining the Value Directed Compression (VDC) technique with Bounded Policy Iteration (BPI). The scalability of VDCBPI is demonstrated on synthetic network management problems with up to 33 million states.
Format:PDF Size:320.30
Date:Jan 2008