Advantage based value iteration for Markov decision processes with unknown rewards

Pegah Alizadeh, Yann Chevaleyre, François Lévy. Advantage based value iteration for Markov decision processes with unknown rewards. In 2016 International Joint Conference on Neural Networks, IJCNN 2016, Vancouver, BC, Canada, July 24-29, 2016, pages 3837-3844. IEEE, 2016.
