Constrained Policy Optimization for Controlled Contextual Bandit Exploration

Mohammad Kachuee, Sungjin Lee. Constrained Policy Optimization for Controlled Contextual Bandit Exploration. In Gabriel Pedroza, Xin Cynthia Chen, José Hernández-Orallo, Xiaowei Huang, Huáscar Espinoza, Richard Mallah, John McDermid, Mauricio Castillo-Effen, editors, Proceedings of the Workshop on Artificial Intelligence Safety 2022 (AISafety 2022) co-located with the Thirty-First International Joint Conference on Artificial Intelligence and the Twenty-Fifth European Conference on Artificial Intelligence (IJCAI-ECAI-2022), Vienna, Austria, July 24-25, 2022. Volume 3215 of CEUR Workshop Proceedings, CEUR-WS.org, 2022.