No More Hand-Tuning Rewards: Masked Constrained Policy Optimization for Safe Reinforcement Learning

Stef Van Havermaet, Yara Khaluf, Pieter Simoens. No More Hand-Tuning Rewards: Masked Constrained Policy Optimization for Safe Reinforcement Learning. In Frank Dignum, Alessio Lomuscio, Ulle Endriss, Ann Nowé, editors, AAMAS '21: 20th International Conference on Autonomous Agents and Multiagent Systems, Virtual Event, United Kingdom, May 3-7, 2021. Pages 1344-1352, ACM, 2021.
