Pessimistic value iteration for multi-task data sharing in Offline Reinforcement Learning

Chenjia Bai, Lingxiao Wang 0003, Jianye Hao, Zhuoran Yang, Bin Zhao 0001, Zhen Wang, Xuelong Li 0001. Pessimistic value iteration for multi-task data sharing in Offline Reinforcement Learning. Artificial Intelligence, 326:104048, January 2024.

Authors

Chenjia Bai

Lingxiao Wang 0003

Jianye Hao

Zhuoran Yang

Bin Zhao 0001

Zhen Wang

Xuelong Li 0001
