Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value At Risk

Dohyeong Kim, Songhwai Oh. Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value At Risk. IEEE Robotics and Automation Letters, 7(3):7644-7651, 2022. doi: 10.1109/LRA.2022.3184793

@article{KimO22a,
  title = {Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value At Risk},
  author = {Dohyeong Kim and Songhwai Oh},
  year = {2022},
  doi = {10.1109/LRA.2022.3184793},
  url = {https://doi.org/10.1109/LRA.2022.3184793},
  researchr = {https://researchr.org/publication/KimO22a},
  cites = {0},
  citedby = {0},
  journal = {IEEE Robotics and Automation Letters},
  volume = {7},
  number = {3},
  pages = {7644--7651},
}