Kent Academic Repository

Energy Efficient Execution of POMDP Policies

Grześ, Marek, Poupart, Pascal, Yang, Xiao, Hoey, Jesse (2014) Energy Efficient Execution of POMDP Policies. IEEE Transactions on Cybernetics, 45 (11). pp. 2484-2497. ISSN 2168-2267. E-ISSN 2168-2275. (doi:10.1109/TCYB.2014.2375817) (KAR id:48653)

Abstract

Recent advances in planning techniques for partially observable Markov decision processes have focused on online search techniques and offline point-based value iteration. While these techniques allow practitioners to obtain policies for fairly large problems, they assume that a non-negligible amount of computation can be done between each decision point. In contrast, the recent proliferation of mobile and embedded devices has led to a surge of applications that could benefit from state-of-the-art planning techniques if they can operate under severe constraints on computational resources. To that effect, we describe two techniques to compile policies into controllers that can be executed by a mere table lookup at each decision point. The first approach compiles policies induced by a set of alpha vectors (such as those obtained by point-based techniques) into approximately equivalent controllers, while the second approach performs a simulation to compile arbitrary policies into approximately equivalent controllers. We also describe an approach to compress controllers by removing redundant and dominated nodes, often yielding smaller and yet better controllers. Further compression and higher value can sometimes be obtained by considering stochastic controllers. The compilation and compression techniques are demonstrated on benchmark problems as well as on a mobile application that helps persons with Alzheimer's disease with wayfinding. The battery consumption of several POMDP policies is compared against that of finite-state controllers learned using the methods introduced in this paper. Experiments performed on the Nexus 4 phone show that, among the POMDP policies compared, finite-state controllers consume the least battery.
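
As a rough illustration of what "executed by a mere table lookup" can mean, the sketch below runs a deterministic finite-state controller: each node stores one action, and each (node, observation) pair maps to the next node. This is not the authors' implementation. The node numbering, action names, observation names, and the function run_controller are all hypothetical and chosen only to make the example self-contained.

```python
from typing import Dict, List, Tuple

# Hypothetical toy controller. Node ids, actions and observations are
# illustrative assumptions and are not taken from the paper.
ACTIONS: Dict[int, str] = {
    0: "listen",
    1: "listen",
    2: "listen",
    3: "open-right",
    4: "open-left",
}

# (current node, observation) -> next node
TRANSITIONS: Dict[Tuple[int, str], int] = {
    (0, "hear-left"): 1, (0, "hear-right"): 2,
    (1, "hear-left"): 3, (1, "hear-right"): 0,
    (2, "hear-left"): 0, (2, "hear-right"): 4,
    (3, "hear-left"): 0, (3, "hear-right"): 0,
    (4, "hear-left"): 0, (4, "hear-right"): 0,
}


def run_controller(observations: List[str], start: int = 0) -> Tuple[List[str], int]:
    """Return the actions chosen before each observation and the final node.

    Each decision costs two dictionary lookups; no belief update or
    search is performed at execution time.
    """
    node = start
    chosen: List[str] = []
    for obs in observations:
        chosen.append(ACTIONS[node])       # action stored at the current node
        node = TRANSITIONS[(node, obs)]    # move to the successor node
    return chosen, node


if __name__ == "__main__":
    actions, final_node = run_controller(["hear-left", "hear-left", "hear-right"])
    print(actions, final_node)
```

The per-step work here is constant and independent of the size of the underlying state space, which is the property the abstract relies on for constrained devices. A stochastic controller, also mentioned in the abstract, would instead store distributions over actions and successor nodes.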

Item Type: Article
DOI/Identification number: 10.1109/TCYB.2014.2375817
Uncontrolled keywords: Energy-efficiency, Finite-state Controllers, Knowledge compilation, Markov decision processes, Mobile Applications, POMDPs
Subjects: Q Science > Q Science (General) > Q335 Artificial intelligence
Divisions: Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing
Depositing User: Marek Grzes
Date Deposited: 26 May 2015 19:38 UTC
Last Modified: 16 Feb 2021 13:25 UTC
Resource URI: https://kar.kent.ac.uk/id/eprint/48653 (The current URI for this page, for reference purposes)

University of Kent Author Information

Grześ, Marek.

Creator's ORCID: https://orcid.org/0000-0003-4901-1539

Yang, Xiao.
