- A unified framework for temporal difference methods. Dimitri P. Bertsekas. 1-7
- Efficient data reuse in value function approximation. Hirotaka Hachiya, Takayuki Akiyama, Masashi Sugiyama, Jan Peters. 8-15
- Constrained optimal control of affine nonlinear discrete-time systems using GHJB method. Lili Cui, Huaguang Zhang, Derong Liu, Yongsu Kim. 16-21
- ADHDP(λ) strategies based coordinated ramps metering with queuing consideration. Xuerui Bai, Dongbin Zhao, Jianqiang Yi. 22-27
- Algorithm and stability of ATC receding horizon control. Hongwei Zhang, Jie Huang, Frank L. Lewis. 28-35
- Online policy iteration based algorithms to solve the continuous-time infinite horizon optimal control problem. Kyriakos G. Vamvoudakis, Draguna Vrabie, Frank L. Lewis. 36-41
- Real-time motor control using recurrent neural networks. Dongsung Huh, Emanuel Todorov. 42-49
- Hierarchical optimal control of a 7-DOF arm model. Dan Liu, Emanuel Todorov. 50-57
- Coupling perception and action using minimax optimal control. Tom Erez, William D. Smart. 58-65
- A convergent recursive least squares approximate policy iteration algorithm for multi-dimensional Markov decision process with continuous state and action spaces. Jun Ma, Warren B. Powell. 66-73
- Basis function adaptation methods for cost approximation in MDP. Huizhen Yu, Dimitri P. Bertsekas. 74-81
- Executing concurrent actions with multiple Markov decision processes. Elva Corona-Xelhuantzi, Eduardo F. Morales, Luis Enrique Sucar. 82-89
- Iterative local dynamic programming. Emanuel Todorov, Yuval Tassa. 90-95
- Adaptive computation of optimal nonrandomized policies in constrained average-reward MDPs. Eugene A. Feinberg. 96-100
- The QV family compared to other reinforcement learning algorithms. Marco A. Wiering, Hado van Hasselt. 101-108
- Feature discovery in approximate dynamic programming. Philippe Preux, Sertan Girgin, Manuel Loth. 109-116
- Inferring bounds on the performance of a control policy from a sample of trajectories. Raphaël Fonteneau, Susan A. Murphy, Louis Wehenkel, Damien Ernst. 117-123
- Neural-network-based reinforcement learning controller for nonlinear systems with non-symmetric dead-zone inputs. Xin Zhang, Huaguang Zhang, Derong Liu, Yongsu Kim. 124-129
- Algorithms for variance reduction in a policy-gradient based actor-critic framework. Yogesh P. Awate. 130-136
- The knowledge gradient algorithm for online subset selection. Ilya O. Ryzhov, Warren B. Powell. 137-144
- Planning under uncertainty, ensembles of disturbance trees and kernelized discrete action spaces. Boris Defourny, Damien Ernst, Louis Wehenkel. 145-152
- Policy search with cross-entropy optimization of basis functions. Lucian Busoniu, Damien Ernst, Bart De Schutter, Robert Babuska. 153-160
- Eigenfunction approximation methods for linearly-solvable optimal control problems. Emanuel Todorov. 161-168
- Learning continuous-action control policies. Jason Pazis, Michail G. Lagoudakis. 169-176
- A theoretical and empirical analysis of Expected Sarsa. Harm van Seijen, Hado van Hasselt, Shimon Whiteson, Marco A. Wiering. 177-184
- Kalman Temporal Differences: The deterministic case. Matthieu Geist, Olivier Pietquin, Gabriel Fricout. 185-192
- Integrating sporadic imitation in Reinforcement Learning robots. Willi Richert, Ulrich Scheller, Markus Koch, Bernd Kleinjohann, Claudius Stern. 193-198
- Bounds of optimal learning. Roman V. Belavkin. 199-204
- Multiagent reinforcement learning in extensive form games with complete information. Ali Akramizadeh, Mohammad B. Menhaj, Ahmad Afshar. 205-211
- Practical numerical methods for stochastic optimal control of biological systems in continuous time and space. C. Alexander Simpkins, Emanuel Todorov. 212-218
- Path integral-based stochastic optimal control for rigid body dynamics. Evangelos Theodorou, Jonas Buchli, Stefan Schaal. 219-225
- Using reward-weighted imitation for robot Reinforcement Learning. Jan Peters, Jens Kober. 226-232
- Adaptive Critic Designs-based autonomous unmanned vehicles navigation: Application to robotic farm vehicles. H. Daniel Patiño, Santiago Tosetti, Flavio Capraro. 233-237
- Neuro-controller of cement rotary kiln temperature with adaptive critic designs. Xiaofeng Lin, Tangbo Liu, Shaojian Song, Chunning Song. 238-242