Policy Learning for Time-Bounded Reachability in Continuous-Time Markov Decision Processes via Doubly-Stochastic Gradient Ascent

Ezio Bartocci, Luca Bortolussi, Tomás Brázdil, Dimitrios Milios, Guido Sanguinetti. Policy Learning for Time-Bounded Reachability in Continuous-Time Markov Decision Processes via Doubly-Stochastic Gradient Ascent. In Gul Agha, Benny Van Houdt, editors, Quantitative Evaluation of Systems - 13th International Conference, QEST 2016, Quebec City, QC, Canada, August 23-25, 2016, Proceedings. Volume 9826 of Lecture Notes in Computer Science, pages 244-259, Springer, 2016.

Abstract

Abstract is missing.