Zishun Yu, Yunzhe Tao, Liyu Chen, Tao Sun, Hongxia Yang. B-Coder: Value-Based Deep Reinforcement Learning for Program Synthesis. In The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024.