No Free Lunch: Overcoming Reward Gaming in AI Safety Gridworlds

Mariya Tsvarkaleva, Louise A. Dennis. No Free Lunch: Overcoming Reward Gaming in AI Safety Gridworlds. In Ibrahim Habli, Mark Sujan, Simos Gerasimou, Erwin Schoitsch, Friedemann Bitsch, editors, Computer Safety, Reliability, and Security. SAFECOMP 2021 Workshops - DECSoS, MAPSOD, DepDevOps, USDAI, and WAISE, York, UK, September 7, 2021, Proceedings. Volume 12853 of Lecture Notes in Computer Science, pages 226-238, Springer, 2021. doi:10.1007/978-3-030-83906-2_18

@inproceedings{TsvarkalevaD21,
  title = {No Free Lunch: Overcoming Reward Gaming in AI Safety Gridworlds},
  author = {Mariya Tsvarkaleva and Louise A. Dennis},
  year = {2021},
  doi = {10.1007/978-3-030-83906-2_18},
  url = {https://doi.org/10.1007/978-3-030-83906-2_18},
  researchr = {https://researchr.org/publication/TsvarkalevaD21},
  cites = {0},
  citedby = {0},
  pages = {226--238},
  booktitle = {Computer Safety, Reliability, and Security. SAFECOMP 2021 Workshops - DECSoS, MAPSOD, DepDevOps, USDAI, and WAISE, York, UK, September 7, 2021, Proceedings},
  editor = {Ibrahim Habli and Mark Sujan and Simos Gerasimou and Erwin Schoitsch and Friedemann Bitsch},
  volume = {12853},
  series = {Lecture Notes in Computer Science},
  publisher = {Springer},
  isbn = {978-3-030-83906-2},
}