Constrained Policy Optimization for Controlled Contextual Bandit Exploration

Mohammad Kachuee, Sungjin Lee. Constrained Policy Optimization for Controlled Contextual Bandit Exploration. In Gabriel Pedroza, Xin Cynthia Chen, José Hernández-Orallo, Xiaowei Huang, Huáscar Espinoza, Richard Mallah, John McDermid, Mauricio Castillo-Effen, editors, Proceedings of the Workshop on Artificial Intelligence Safety 2022 (AISafety 2022) co-located with the Thirty-First International Joint Conference on Artificial Intelligence and the Twenty-Fifth European Conference on Artificial Intelligence (IJCAI-ECAI-2022), Vienna, Austria, July 24-25, 2022. Volume 3215 of CEUR Workshop Proceedings, CEUR-WS.org, 2022.