No More Hand-Tuning Rewards: Masked Constrained Policy Optimization for Safe Reinforcement Learning

Stef Van Havermaet, Yara Khaluf, Pieter Simoens. No More Hand-Tuning Rewards: Masked Constrained Policy Optimization for Safe Reinforcement Learning. In Frank Dignum, Alessio Lomuscio, Ulle Endriss, Ann Nowé, editors, AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, Virtual Event, United Kingdom, May 3-7, 2021. Pages 1344-1352, ACM, 2021.

Authors

Stef Van Havermaet

Yara Khaluf

Pieter Simoens
