Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning

Chang Tian, An Liu, Guan Huang, Wu Luo. Successive Convex Approximation Based Off-Policy Optimization for Constrained Reinforcement Learning. IEEE Transactions on Signal Processing, 70:1609-1624, 2022.
