SAC-B: Soft Actor-Critic with Bias for Suppressing Q-value Overestimation in Off-Policy Reinforcement Learning

Han Wang, Wei Du 0010, Yanyu Xu 0001, Lizhen Cui 0002. SAC-B: Soft Actor-Critic with Bias for Suppressing Q-value Overestimation in Off-Policy Reinforcement Learning. In Proceedings of the 2025 7th International Conference on Distributed Artificial Intelligence, DAI 2025, London, United Kingdom, November 21-24, 2025. Pages 58-66, ACM, 2025.

Abstract

Abstract is missing.