- Closed-loop control of anesthesia and mean arterial pressure using reinforcement learning. Regina Padmanabhan, Nader Meskin, Wassim M. Haddad. 1-8 [doi]
- An adaptive dynamic programming algorithm to solve optimal control of uncertain nonlinear systems. Xiaohong Cui, Yanhong Luo, Huaguang Zhang. 1-6 [doi]
- Adaptive dynamic programming for discrete-time LQR optimal tracking control problems with unknown dynamics. Yang Liu, Yanhong Luo, Huaguang Zhang. 1-6 [doi]
- Pareto Upper Confidence Bounds algorithms: An empirical study. Madalina M. Drugan, Ann Nowé, Bernard Manderick. 1-8 [doi]
- Policy gradient approaches for multi-objective sequential decision making: A comparison. Simone Parisi, Matteo Pirotta, Nicola Smacchia, Luca Bascetta, Marcello Restelli. 1-8 [doi]
- Approximate real-time optimal control based on sparse Gaussian process models. Joschka Boedecker, Jost Tobias Springenberg, Jan Wülfing, Martin A. Riedmiller. 1-8 [doi]
- Neural-network-based adaptive dynamic surface control for MIMO systems with unknown hysteresis. Lei Liu, Zhanshan Wang, Zhengwei Shen. 1-6 [doi]
- Model-based multi-objective reinforcement learning. Marco A. Wiering, Maikel Withagen, Madalina M. Drugan. 1-6 [doi]
- Adaptive fault identification for a class of nonlinear dynamic systems. Li-bing Wu, Dan Ye, Xin-Gang Zhao. 1-6 [doi]
- Event-based optimal regulator design for nonlinear networked control systems. Avimanyu Sahoo, Hao Xu, Sarangapani Jagannathan. 1-8 [doi]
- Subspace identification for predictive state representation by nuclear norm minimization. Hadrien Glaude, Olivier Pietquin, Cyrille Enderli. 1-8 [doi]
- Using approximate dynamic programming for estimating the revenues of a hydrogen-based high-capacity storage device. Vincent François-Lavet, Raphaël Fonteneau, Damien Ernst. 1-8 [doi]
- Adaptive dynamic programming-based optimal tracking control for nonlinear systems using general value iteration. Xiaofeng Lin, Qiang Ding, Weikai Kong, Chunning Song, Qingbao Huang. 1-6 [doi]
- A data-based online reinforcement learning algorithm with high-efficient exploration. Yuanheng Zhu, Dongbin Zhao. 1-6 [doi]
- Pseudo-MDPs and factored linear action models. Hengshuai Yao, Csaba Szepesvári, Bernardo Avila Pires, Xinhua Zhang. 1-9 [doi]
- Tunable and generic problem instance generation for multi-objective reinforcement learning. Deon Garrett, Jordi Bieger, Kristinn R. Thórisson. 1-8 [doi]
- Continuous-time differential dynamic programming with terminal constraints. Wei Sun, Evangelos A. Theodorou, Panagiotis Tsiotras. 1-6 [doi]
- Neural network-based adaptive optimal consensus control of leaderless networked mobile robots. Haci Mehmet Guzey, Hao Xu, Sarangapani Jagannathan. 1-6 [doi]
- Adaptive aggregated predictions for renewable energy systems. Balázs Csanád Csáji, András Kovács, József Váncza. 1-8 [doi]
- Convergence of value iterations for total-cost MDPs and POMDPs with general state and action sets. Eugene A. Feinberg, Pavlo O. Kasyanov, Michael Z. Zgurovsky. 1-8 [doi]
- Convergent reinforcement learning control with neural networks and continuous action search. Minwoo Lee, Charles W. Anderson. 1-8 [doi]
- Optimal self-learning battery control in smart residential grids by iterative Q-learning algorithm. Qinglai Wei, Derong Liu, Guang Shi, Yu Liu, Qiang Guan. 1-7 [doi]
- Data-driven partially observable dynamic processes using adaptive dynamic programming. Xiangnan Zhong, Zhen Ni, Yufei Tang, Haibo He. 1-8 [doi]
- Cognitive control in cognitive dynamic systems: A new way of thinking inspired by the brain. Simon Haykin, Ashkan Amiri, Mehdi Fatemi. 1-7 [doi]
- Nonparametric infinite horizon Kullback-Leibler stochastic control. Yunpeng Pan, Evangelos A. Theodorou. 1-8 [doi]
- Near-optimality bounds for greedy periodic policies with application to grid-level storage. Yuhai Hu, Boris Defourny. 1-8 [doi]
- Information-theoretic stochastic optimal control via incremental sampling-based algorithms. Oktay Arslan, Evangelos A. Theodorou, Panagiotis Tsiotras. 1-8 [doi]
- An analysis of optimistic, best-first search for minimax sequential decision making. Lucian Busoniu, Rémi Munos, Elod Pall. 1-8 [doi]
- Multi-objective reinforcement learning for AUV thruster failure recovery. Seyed Reza Ahmadzadeh, Petar Kormushev, Darwin G. Caldwell. 1-8 [doi]
- A two stage learning technique for dual learning in the pursuit-evasion differential game. Ahmad A. Al-Talabi, Howard M. Schwartz. 1-8 [doi]
- Heuristics for multiagent reinforcement learning in decentralized decision problems. Martin W. Allen, David Hahn, Douglas C. MacFarland. 1-8 [doi]
- Active learning for classification: An optimistic approach. Timothe Collet, Olivier Pietquin. 1-8 [doi]
- Reinforcement learning-based optimal control considering L computation time delay of linear discrete-time systems. Taishi Fujita, Toshimitsu Ushio. 1-6 [doi]
- A comparison of approximate dynamic programming techniques on benchmark energy storage problems: Does anything work? Daniel R. Jiang, Thuy V. Pham, Warren B. Powell, Daniel F. Salas, Warren R. Scott. 1-8 [doi]
- Beyond exponential utility functions: A variance-adjusted approach for risk-averse reinforcement learning. Abhijit Gosavi, Sajal K. Das, Susan L. Murray. 1-8 [doi]
- Theoretical analysis of a reinforcement learning based switching scheme. Ali Heydari. 1-6 [doi]
- On-policy Q-learning for adaptive optimal control. Sumit Kumar Jha, Shubhendu Bhasin. 1-6 [doi]
- Accelerated gradient temporal difference learning algorithms. Dominik Meyer, Remy Degenne, Ahmed Omrane, Hao Shen. 1-8 [doi]
- ADP-based optimal control for a class of nonlinear discrete-time systems with inequality constraints. Yanhong Luo, Geyang Xiao. 1-5 [doi]
- Annealing-pareto multi-objective multi-armed bandit algorithm. Saba Q. Yahyaa, Madalina M. Drugan, Bernard Manderick. 1-8 [doi]
- Model-free Q-learning over finite horizon for uncertain linear continuous-time systems. Hao Xu, Sarangapani Jagannathan. 1-6 [doi]
- Using supervised training signals of observable state dynamics to speed-up and improve reinforcement learning. Daniel L. Elliott, Charles Anderson. 1-8 [doi]