- Leveraging Hamilton-Jacobi PDEs with time-dependent Hamiltonians for continual scientific machine learning. Paula X. Chen, Tingwei Meng, Zongren Zou, Jérôme Darbon, George Em Karniadakis. 1-12 [doi]
- Data-efficient, explainable and safe box manipulation: Illustrating the advantages of physical priors in model-predictive control. Achkan Salehi, Stéphane Doncieux. 13-24 [doi]
- Gradient shaping for multi-constraint safe reinforcement learning. Yihang Yao, Zuxin Liu, Zhepeng Cen, Peide Huang, Tingnan Zhang, Wenhao Yu, Ding Zhao. 25-39 [doi]
- Continual learning of multi-modal dynamics with external memory. Abdullah Akgül, Gozde Unal, Melih Kandemir. 40-51 [doi]
- Learning to stabilize high-dimensional unknown systems using Lyapunov-guided exploration. Songyuan Zhang, Chuchu Fan. 52-67 [doi]
- An investigation of time reversal symmetry in reinforcement learning. Brett Barkley, Amy Zhang, David Fridovich-Keil. 68-79 [doi]
- HSVI-based online minimax strategies for partially observable stochastic games with neural perception mechanisms. Rui Yan, Gabriel Santos, Gethin Norman, David Parker, Marta Kwiatkowska. 80-91 [doi]
- Real-time safe control of neural network dynamic models with sound approximation. Hanjiang Hu, Jianglin Lan, Changliu Liu. 92-103 [doi]
- Tracking object positions in reinforcement learning: A metric for keypoint detection. Emma Cramer, Jonas Reiher, Sebastian Trimpe. 104-116 [doi]
- Linearised data-driven LSTM-based control of multi-input HVAC systems. Andreas Hinderyckx, Florence Guillaume. 117-129 [doi]
- The behavioral toolbox. Ivan Markovsky. 130-141 [doi]
- Learning "Look-Ahead" Nonlocal Traffic Dynamics in a Ring Road. Chenguang Zhao, Huan Yu. 142-154 [doi]
- Safe dynamic pricing for nonstationary network resource allocation. Berkay Turan, Spencer Hutchinson, Mahnoosh Alizadeh. 155-167 [doi]
- Safe online convex optimization with multi-point feedback. Spencer Hutchinson, Mahnoosh Alizadeh. 168-180 [doi]
- Controlgym: Large-scale control environments for benchmarking reinforcement learning algorithms. Xiangyuan Zhang, Weichao Mao, Saviz Mowlavi, Mouhacine Benosman, Tamer Basar. 181-196 [doi]
- On the convergence of adaptive first order methods: proximal gradient and alternating minimization algorithms. Puya Latafat, Andreas Themelis, Panagiotis Patrinos. 197-208 [doi]
- Strengthened stability analysis of discrete-time Lurie systems involving ReLU neural networks. Carl R. Richardson, Matthew C. Turner, Steve R. Gunn, Ross Drummond. 209-221 [doi]
- Interpretable data-driven model predictive control of building energy systems using SHAP. Patrick Henkel, Tobias Kasperski, Phillip Stoffel, Dirk Müller. 222-234 [doi]
- Physics-informed neural networks with unknown measurement noise. Philipp Pilar, Niklas Wahlström. 235-247 [doi]
- Adaptive online non-stochastic control. Naram Mhaisen, George Iosifidis. 248-259 [doi]
- Global rewards in multi-agent deep reinforcement learning for autonomous mobility on demand systems. Heiko Hoppe, Tobias Enders, Quentin Cappart, Maximilian Schiffer. 260-272 [doi]
- Soft convex quantization: revisiting Vector Quantization with convex optimization. Tanmay Gautam, Reid Pryzant, Ziyi Yang, Chenguang Zhu, Somayeh Sojoudi. 273-285 [doi]
- Uncertainty quantification of set-membership estimation in control and perception: Revisiting the minimum enclosing ellipsoid. Yukai Tang, Jean-Bernard Lasserre, Heng Yang. 286-298 [doi]
- Minimax dual control with finite-dimensional information state. Olle Kjellqvist. 299-311 [doi]
- An efficient data-based off-policy Q-learning algorithm for optimal output feedback control of linear systems. Mohammad Alsalti, Victor G. Lopez, Matthias Albrecht Müller. 312-323 [doi]
- Adapting Image-based RL Policies via Predicted Rewards. Weiyao Wang, Xinyuan Fang, Gregory D. Hager. 324-336 [doi]
- Piecewise regression via mixed-integer programming for MPC. Dieter Teichrib, Moritz Schulze Darup. 337-348 [doi]
- Parameter-adaptive approximate MPC: Tuning neural-network controllers without retraining. Henrik Hose, Alexander Gräfe, Sebastian Trimpe. 349-360 [doi]
- $\widetilde{O}(T^{-1})$ Convergence to (coarse) correlated equilibria in full-information general-sum Markov games. Weichao Mao, Haoran Qiu, Chen Wang, Hubertus Franke, Zbigniew Kalbarczyk, Tamer Basar. 361-374 [doi]
- Inverse optimal control as an errors-in-variables problem. Rahel Rickenbach, Anna Scampicchio, Melanie N. Zeilinger. 375-386 [doi]
- Learning soft constrained MPC value functions: Efficient MPC design and implementation providing stability and safety guarantees. Nicolas Chatzikiriakos, Kim Peter Wabersich, Felix Berkel, Patricia Pauli, Andrea Iannelli. 387-398 [doi]
- MPC-inspired reinforcement learning for verifiable model-free control. Yiwen Lu, Zishuo Li, Yihan Zhou, Na Li, Yilin Mo. 399-413 [doi]
- Real-world fluid directed rigid body control via deep reinforcement learning. Mohak Bhardwaj, Thomas Lampe, Michael Neunert, Francesco Romano, Abbas Abdolmaleki, Arunkumar Byravan, Markus Wulfmeier, Martin A. Riedmiller, Jonas Buchli. 414-427 [doi]
- On the uniqueness of solution for the Bellman equation of LTL objectives. Zetong Xuan, Alper Kamil Bozkurt, Miroslav Pajic, Yu Wang. 428-439 [doi]
- Decision boundary learning for safe vision-based navigation via Hamilton-Jacobi reachability analysis and support vector machine. Tara Toufighi, Minh Bui, Rakesh Shrestha, Mo Chen. 440-452 [doi]
- Understanding the difficulty of solving Cauchy problems with PINNs. Tao Wang, Bo Zhao, Sicun Gao, Rose Yu. 453-465 [doi]
- Signatures meet dynamic programming: Generalizing Bellman equations for trajectory following. Motoya Ohnishi, Iretiayo Akinola, Jie Xu, Ajay Mandlekar, Fabio Ramos. 466-479 [doi]
- Online decision making with history-average dependent costs. Vijeth Hebbar, Cedric Langbort. 480-491 [doi]
- Learning-based rigid tube model predictive control. Yulong Gao, Shuhao Yan, Jian Zhou, Mark Cannon, Alessandro Abate, Karl Henrik Johansson. 492-503 [doi]
- A data-driven Riccati equation. Anders Rantzer. 504-513 [doi]
- Nonconvex scenario optimization for data-driven reachability. Elizabeth Dietrich, Alex Devonport, Murat Arcak. 514-527 [doi]
- Uncertainty quantification and robustification of model-based controllers using conformal prediction. Kong Yao Chee, Thales C. Silva, M. Ani Hsieh, George J. Pappas. 528-540 [doi]
- Learning for CasADi: Data-driven models in numerical optimization. Tim Salzmann, Jon Arrizabalaga, Joel Andersson, Marco Pavone, Markus Ryll. 541-553 [doi]
- Neural operators for boundary stabilization of stop-and-go traffic. Yihuai Zhang, Ruiguo Zhong, Huan Yu. 554-565 [doi]
- Submodular information selection for hypothesis testing with misclassification penalties. Jayanth Bhargav, Mahsa Ghasemi, Shreyas Sundaram. 566-577 [doi]
- Learning and deploying robust locomotion policies with minimal dynamics randomization. Luigi Campanaro, Siddhant Gangapurwala, Wolfgang Merkt, Ioannis Havoutis. 578-590 [doi]
- Learning flow functions of spiking systems. Miguel Aguiar, Amritam Das, Karl Henrik Johansson. 591-602 [doi]
- Safe learning in nonlinear model predictive control. Johannes Buerger, Mark Cannon, Martin Doff-Sotta. 603-614 [doi]
- Efficient skill acquisition for insertion tasks in obstructed environments. Jun Yamada, Jack Collins, Ingmar Posner. 615-627 [doi]
- Balanced reward-inspired reinforcement learning for autonomous vehicle racing. Zhen Tian, Dezong Zhao, Zhihao Lin, David Flynn, Wenjing Zhao, Daxin Tian. 628-640 [doi]
- An invariant information geometric method for high-dimensional online optimization. Zhengfei Zhang, Yunyue Wei, Yanan Sui. 641-653 [doi]
- On the nonsmooth geometry and neural approximation of the optimal value function of infinite-horizon pendulum swing-up. Haoyu Han, Heng Yang. 654-666 [doi]
- Data-driven robust covariance control for uncertain linear systems. Joshua Pilipovsky, Panagiotis Tsiotras. 667-678 [doi]
- Combining model-based controller and ML advice via convex reparameterization. Junxuan Shen, Adam Wierman, Guannan Qu. 679-693 [doi]
- Pointwise-in-time diagnostics for reinforcement learning during training and runtime. Noel Brindise, Andres Felipe Posada-Moreno, Cedric Langbort, Sebastian Trimpe. 694-706 [doi]
- Expert with Clustering: Hierarchical Online Preference Learning Framework. Tianyue Zhou, Jung-Hoon Cho, Babak Rahimi Ardabili, Hamed Tabkhi, Cathy Wu. 707-718 [doi]
- Verification of neural reachable tubes via scenario optimization and conformal prediction. Albert Lin, Somil Bansal. 719-731 [doi]
- Random features approximation for control-affine systems. Kimia Kazemian, Yahya Sattar, Sarah Dean. 732-744 [doi]
- Hacking predictors means hacking cars: Using sensitivity analysis to identify trajectory prediction vulnerabilities for autonomous driving security. Marsalis T. Gibson, David Babazadeh, Claire J. Tomlin, S. Shankar Sastry. 745-757 [doi]
- Rademacher complexity of neural ODEs via Chen-Fliess series. Joshua Hanson, Maxim Raginsky. 758-769 [doi]
- Robust cooperative multi-agent reinforcement learning: A mean-field type game perspective. Muhammad Aneeq uz Zaman, Mathieu Laurière, Alec Koppel, Tamer Basar. 770-783 [doi]
- Learning ε-Nash equilibrium stationary policies in stochastic games with unknown independent chains using online mirror descent. Tiancheng Qin, S. Rasoul Etesami. 784-795 [doi]
- Uncertainty informed optimal resource allocation with Gaussian process based Bayesian inference. Samarth Gupta, Saurabh Amin. 796-812 [doi]
- Improving sample efficiency of high dimensional Bayesian optimization with MCMC. Zeji Yi, Yunyue Wei, Chu Xin Cheng, Kaibo He, Yanan Sui. 813-824 [doi]
- SpOiLer: Offline reinforcement learning using scaled penalties. Padmanaba Srinivasan, William J. Knottenbelt. 825-838 [doi]
- Towards safe multi-task Bayesian optimization. Jannis O. Lübsen, Christian Hespe, Annika Eichler. 839-851 [doi]
- Mixing classifiers to alleviate the accuracy-robustness trade-off. Yatong Bai, Brendon G. Anderson, Somayeh Sojoudi. 852-865 [doi]
- Design of observer-based finite-time control for inductively coupled power transfer system with random gain fluctuations. Satheesh Thangavel, Rathinasamy Sakthivel. 866-875 [doi]
- Learning robust policies for uncertain parametric Markov decision processes. Luke Rickard, Alessandro Abate, Kostas Margellos. 876-889 [doi]
- Conditions for parameter unidentifiability of linear ARX systems for enhancing security. Xiangyu Mao, Jianping He, Chengpu Yu, Chongrong Fang. 890-901 [doi]
- Meta-learning linear quadratic regulators: A policy gradient MAML approach for model-free LQR. Leonardo Felipe Toso, Donglin Zhan, James Anderson, Han Wang. 902-915 [doi]
- A large deviations perspective on policy gradient algorithms. Wouter Jongeneel, Daniel Kuhn, Mengmeng Li. 916-928 [doi]
- Deep model-free KKL observer: A switching approach. Johan Peralez, Madiha Nadri. 929-940 [doi]
- In vivo learning-based control of microbial populations density in bioreactors. Sara Maria Brancato, Davide Salzano, Francesco De Lellis, Davide Fiore, Giovanni Russo, Mario di Bernardo. 941-953 [doi]
- Bounded robustness in reinforcement learning via lexicographic objectives. Daniel Jarne Ornia, Licio Romao, Lewis Hammond, Manuel Mazo Jr., Alessandro Abate. 954-967 [doi]
- System-level safety guard: Safe tracking control through uncertain neural network dynamics models. Xiao Li, Yutong Li, Anouck Girard, Ilya V. Kolmanovsky. 968-979 [doi]
- Nonasymptotic regret analysis of adaptive linear quadratic control with model misspecification. Bruce D. Lee, Anders Rantzer, Nikolai Matni. 980-992 [doi]
- Error bounds, PL condition, and quadratic growth for weakly convex functions, and linear convergences of proximal point methods. Feng-Yi Liao, Lijun Ding, Yang Zheng. 993-1005 [doi]
- Parameterized fast and safe tracking (FaSTrack) using DeepReach. Hyun Joe Jeong, Zheng Gong, Somil Bansal, Sylvia L. Herbert. 1006-1017 [doi]
- Probabilistic ODE solvers for integration error-aware numerical optimal control. Amon Lahr, Filip Tronarp, Nathanael Bosch, Jonathan Schmidt, Philipp Hennig, Melanie N. Zeilinger. 1018-1032 [doi]
- Event-triggered safe Bayesian optimization on quadcopters. Antonia Holzapfel, Paul Brunzema, Sebastian Trimpe. 1033-1045 [doi]
- Finite-time complexity of incremental policy gradient methods for solving multi-task reinforcement learning. Yitao Bai, Thinh T. Doan. 1046-1057 [doi]
- Convergence guarantees for adaptive model predictive control with kinky inference. Riccardo Zuliani, Raffaele Soloperto, John Lygeros. 1058-1070 [doi]
- Convex approximations for a bi-level formulation of data-enabled predictive control. Xu Shang, Yang Zheng. 1071-1082 [doi]
- PDE control gym: A benchmark for data-driven boundary control of partial differential equations. Luke Bhan, Yuexin Bian, Miroslav Krstic, Yuanyuan Shi. 1083-1095 [doi]
- Towards bio-inspired control of aerial vehicle: Distributed aerodynamic parameters for state prediction. Yikang Wang, Adolfo Perrusquía, Dmitry I. Ignatyev. 1096-1106 [doi]
- Residual learning and context encoding for adaptive offline-to-online reinforcement learning. Mohammadreza Nakhaei, Aidan Scannell, Joni Pajarinen. 1107-1121 [doi]
- CoVO-MPC: Theoretical analysis of sampling-based MPC and optimal covariance design. Zeji Yi, Chaoyi Pan, Guanqi He, Guannan Qu, Guanya Shi. 1122-1135 [doi]
- Stable modular control via contraction theory for reinforcement learning. Bing Song, Jean-Jacques E. Slotine, Quang-Cuong Pham. 1136-1148 [doi]
- Data-driven bifurcation analysis via learning of homeomorphism. Wentao Tang. 1149-1160 [doi]
- A learning-based framework to adapt legged robots on-the-fly to unexpected disturbances. Nolan Fey, He Li, Nicholas Adrian, Patrick M. Wensing, Michael D. Lemmon. 1161-1173 [doi]
- On task-relevant loss functions in meta-reinforcement learning. Jaeuk Shin, Giho Kim, Howon Lee, Joonho Han, Insoon Yang. 1174-1186 [doi]
- State-wise safe reinforcement learning with pixel observations. Simon Sinong Zhan, Yixuan Wang, Qingyuan Wu, Ruochen Jiao, Chao Huang, Qi Zhu. 1187-1201 [doi]
- Multi-agent assignment via state augmented reinforcement learning. Leopoldo Agorio, Sean Van Alen, Miguel Calvo-Fullana, Santiago Paternain, Juan-Andrés Bazerque. 1202-1213 [doi]
- PlanNetX: Learning an efficient neural network planner from MPC for longitudinal control. Jasper Hoffmann, Diego Fernandez Clausen, Julien Brosseit, Julian Bernhard, Klemens Esterle, Moritz Werling, Michael Karg, Joschka Bödecker. 1214-1227 [doi]
- Mapping back and forth between model predictive control and neural networks. Ross Drummond, Pablo R. Baldivieso, Giorgio Valmorbida. 1228-1240 [doi]
- A multi-modal distributed learning algorithm in reproducing kernel Hilbert spaces. Aneesh Raghavan, Karl Henrik Johansson. 1241-1252 [doi]
- Towards model-free LQR control over rate-limited channels. Aritra Mitra, Lintao Ye, Vijay Gupta. 1253-1265 [doi]
- Learning true objectives: Linear algebraic characterizations of identifiability in inverse reinforcement learning. Mohamad Louai Shehab, Antoine Aspeel, Nikos Aréchiga, Andrew Best, Necmiye Ozay. 1266-1277 [doi]
- Safety filters for black-box dynamical systems by learning discriminating hyperplanes. Will Lavanakul, Jason J. Choi, Koushil Sreenath, Claire J. Tomlin. 1278-1291 [doi]
- Lagrangian inspired polynomial estimator for black-box learning and control of underactuated systems. Giulio Giacomuzzo, Riccardo Cescon, Diego Romeres, Ruggero Carli, Alberto Dalla Libera. 1292-1304 [doi]
- From raw data to safety: Reducing conservatism by set expansion. Mohammad Bajelani, Klaske van Heusden. 1305-1317 [doi]
- Dynamics harmonic analysis of robotic systems: Application in data-driven Koopman modelling. Daniel Felipe Ordoñez Apraez, Vladimir Kostic, Giulio Turrisi, Pietro Novelli, Carlos Mastalli, Claudio Semini, Massimiliano Pontil. 1318-1329 [doi]
- Recursively feasible shrinking-horizon MPC in dynamic environments with conformal prediction guarantees. Charis J. Stamouli, Lars Lindemann, George J. Pappas. 1330-1342 [doi]
- Multi-modal conformal prediction regions by optimizing convex shape templates. Renukanandan Tumu, Matthew Cleaveland, Rahul Mangharam, George J. Pappas, Lars Lindemann. 1343-1356 [doi]
- Learning locally interacting discrete dynamical systems: Towards data-efficient and scalable prediction. Beomseok Kang, Harshit Kumar, Minah Lee, Biswadeep Chakraborty, Saibal Mukhopadhyay. 1357-1369 [doi]
- How safe am I given what I see? Calibrated prediction of safety chances for image-controlled autonomy. Zhenjiang Mao, Carson Sobolewski, Ivan Ruchkin. 1370-1387 [doi]
- Convex neural network synthesis for robustness in the 1-norm. Ross Drummond, Chris Guiver, Matthew C. Turner. 1388-1399 [doi]
- Increasing information for model predictive control with semi-Markov decision processes. Rémy Hosseinkhan Boucher, Stella Douka, Onofrio Semeraro, Lionel Mathelin. 1400-1414 [doi]
- Physically consistent modeling & identification of nonlinear friction with dissipative Gaussian processes. Rui Dai, Giulio Evangelisti, Sandra Hirche. 1415-1426 [doi]
- STEMFold: Stochastic temporal manifold for multi-agent interactions in the presence of hidden agents. Hemant Kumawat, Biswadeep Chakraborty, Saibal Mukhopadhyay. 1427-1439 [doi]
- Distributed on-the-fly control of multi-agent systems with unknown dynamics: Using limited data to obtain near-optimal control. Shayan Meshkat Alsadat, Nasim Baharisangari, Zhe Xu. 1440-1451 [doi]
- CACTO-SL: Using Sobolev learning to improve continuous actor-critic with trajectory optimization. Elisa Alboni, Gianluigi Grandesso, Gastone Pietro Rosati Papini, Justin Carpentier, Andrea Del Prete. 1452-1463 [doi]
- Multi-agent coverage control with transient behavior consideration. Runyu Zhang, Haitong Ma, Na Li. 1464-1476 [doi]
- Data driven verification of positive invariant sets for discrete, nonlinear systems. Amy K. Strong, Leila Jasmine Bridgeman. 1477-1488 [doi]
- Adaptive teaching in heterogeneous agents: Balancing surprise in sparse reward scenarios. Emma Clark, Kanghyun Ryu, Negar Mehr. 1489-1501 [doi]
- Can a transformer represent a Kalman filter? Gautam Goel, Peter L. Bartlett. 1502-1512 [doi]
- Data-driven simulator for mechanical circulatory support with domain adversarial neural process. Sophia Sun, Wenyuan Chen, Zihao Zhou, Sonia Fereidooni, Elise Jortberg, Rose Yu. 1513-1525 [doi]
- DC4L: Distribution shift recovery via data-driven control for deep learning models. Vivian Lin, Kuk Jin Jang, Souradeep Dutta, Michele Caprio, Oleg Sokolsky, Insup Lee. 1526-1538 [doi]
- QCQP-Net: Reliably learning feasible alternating current optimal power flow solutions under constraints. Sihan Zeng, Youngdae Kim, Yuxuan Ren, Kibaek Kim. 1539-1551 [doi]
- A deep learning approach for distributed aggregative optimization with users' feedback. Riccardo Brumali, Guido Carnevale, Giuseppe Notarstefano. 1552-1564 [doi]
- A framework for evaluating human driver models using neuroimaging. Christopher Strong, Kaylene C. Stocking, Jingqi Li, Tianjiao Zhang, Jack L. Gallant, Claire J. Tomlin. 1565-1578 [doi]
- Deep Hankel matrices with random elements. Nathan P. Lawrence, Philip D. Loewen, Shuyuan Wang, Michael G. Forbes, R. Bhushan Gopaluni. 1579-1591 [doi]
- Robust exploration with adversary via Langevin Monte Carlo. Hao-Lun Hsu, Miroslav Pajic. 1592-1605 [doi]
- Generalized constraint for probabilistic safe reinforcement learning. Weiqin Chen, Santiago Paternain. 1606-1618 [doi]
- Neural processes with event triggers for fast adaptation to changes. Paul Brunzema, Paul Kruse, Sebastian Trimpe. 1619-1632 [doi]
- Data-driven strategy synthesis for stochastic systems with unknown nonlinear disturbances. Ibon Gracia, Dimitris Boskos, Luca Laurenti, Morteza Lahijanian. 1633-1645 [doi]
- Growing Q-networks: Solving continuous control tasks with adaptive control resolution. Tim Seyde, Peter Werner, Wilko Schwarting, Markus Wulfmeier, Daniela Rus. 1646-1661 [doi]
- Hamiltonian GAN. Christine Allen-Blanchette. 1662-1674 [doi]
- Do no harm: A counterfactual approach to safe reinforcement learning. Sean Vaskov, Wilko Schwarting, Chris L. Baker. 1675-1687 [doi]
- Wasserstein distributionally robust regret-optimal control over infinite-horizon. Taylan Kargin, Joudi Hajar, Vikrant Malik, Babak Hassibi. 1688-1701 [doi]
- Probably approximately correct stability of allocations in uncertain coalitional games with private sampling. George Pantazis, Filiberto Fele, Filippo Fabiani, Sergio Grammatico, Kostas Margellos. 1702-1714 [doi]
- Reinforcement learning-driven parametric curve fitting for snake robot gait design. Jack Naish, Jacob Rodriguez, Jenny Zhang, Bryson Jones, Guglielmo Daddi, Andrew L. Orekhov, Rob Royce, Michael Paton, Howie Choset, Masahiro Ono, Rohan Thakker. 1715-1727 [doi]
- Pontryagin neural operator for solving general-sum differential games with parametric state constraints. Lei Zhang, Mukesh Ghimire, Zhe Xu, Wenlong Zhang, Yi Ren. 1728-1740 [doi]
- Adaptive neural network based control approach for building energy control under changing environmental conditions. Lilli Frison, Simon Gölzhäuser. 1741-1752 [doi]
- Physics-constrained learning for PDE systems with uncertainty quantified port-Hamiltonian models. Kaiyuan Tan, Peilun Li, Thomas Beckers. 1753-1764 [doi]
- Proto-MPC: An encoder-prototype-decoder approach for quadrotor control in challenging winds. Yuliang Gu, Sheng Cheng, Naira Hovakimyan. 1765-1776 [doi]
- Efficient imitation learning with conservative world models. Victor Kolev, Rafael Rafailov, Kyle Hatch, Jiajun Wu, Chelsea Finn. 1777-1790 [doi]
- Restless bandits with rewards generated by a linear Gaussian dynamical system. Jonathan Gornet, Bruno Sinopoli. 1791-1802 [doi]