An Approximately Optimal Relative Value Learning Algorithm for Averaged MDPs with Continuous States and Actions

Hiteshi Sharma, Rahul Jain. An Approximately Optimal Relative Value Learning Algorithm for Averaged MDPs with Continuous States and Actions. In 57th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2019, Monticello, IL, USA, September 24-27, 2019. Pages 734-740, IEEE, 2019.
