Hiteshi Sharma, Rahul Jain. An Approximately Optimal Relative Value Learning Algorithm for Averaged MDPs with Continuous States and Actions. In 57th Annual Allerton Conference on Communication, Control, and Computing, Allerton 2019, Monticello, IL, USA, September 24-27, 2019, pages 734-740. IEEE, 2019.