Policy Mirror Descent Inherently Explores Action Space

Yan Li, Guanghui Lan. Policy Mirror Descent Inherently Explores Action Space. SIAM Journal on Optimization, 35(1):116-156, 2025.
