Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management

Shipra Agrawal, Randy Jia. Learning in Structured MDPs with Convex Cost Functions: Improved Regret Bounds for Inventory Management. Operations Research, 70(3):1646-1664, 2022.
