Grzes, Marek, Poupart, Pascal (2015) Incremental Policy Iteration with Guaranteed Escape from Local Optima in POMDP Planning. In: Proceedings of the 14th International Conference on Autonomous Agents and Multiagent Systems. Proceedings of the International Conference on Autonomous Agents and Multiagent Systems (AAMAS). (Access to this publication is currently restricted. You may be able to access a copy if URLs are provided) (KAR id:48652)
PDF (Restricted due to publisher copyrights)
Publisher pdf
Language: English
Restricted to Repository staff only
Official URL: http://www.aamas2015.com/en/AAMAS_2015_USB/aamas/p...
Abstract
Partially observable Markov decision processes (POMDPs) provide a natural framework to design applications that continuously make decisions based on noisy sensor measurements. The recent proliferation of smartphones and other wearable devices leads to new applications where, unfortunately, energy efficiency becomes an issue. To circumvent energy requirements, finite-state controllers can be applied because they are computationally inexpensive to execute. Additionally, when multi-agent POMDPs (e.g. Dec-POMDPs or I-POMDPs) are taken into account, finite-state controllers become one of the most important policy representations. Online methods scale the best; however, they are energy demanding. Thus, methods to optimize finite-state controllers are necessary. In this paper, we present a new, efficient approach to bounded policy iteration (BPI). BPI keeps the size of the controller small, which is a desirable property for applications, especially on small devices. However, finding an optimal or near-optimal finite-state controller of a bounded size poses a challenging combinatorial optimization problem. Exhaustive search methods clearly do not scale to larger problems, whereas local search methods are subject to local optima. Our new approach solves all of the common benchmarks on which local search methods fail, yet it scales to large problems.
Item Type: | Conference or workshop item (Paper) |
---|---|
Uncontrolled keywords: | Planning under Uncertainty; POMDP; Policy Iteration; Finite State Controller |
Subjects: | Q Science > Q Science (General) > Q335 Artificial intelligence |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Marek Grzes |
Date Deposited: | 26 May 2015 16:21 UTC |
Last Modified: | 05 Nov 2024 10:32 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/48652 (The current URI for this page, for reference purposes) |