Combination of learning from non-optimal demonstrations and feedbacks using inverse reinforcement learning and Bayesian policy improvement

Ali Ezzeddine, Nafee Mourad, Babak Nadjar Araabi, Majid Nili Ahmadabadi. Combination of learning from non-optimal demonstrations and feedbacks using inverse reinforcement learning and Bayesian policy improvement. Expert Syst. Appl., 112:331-341, 2018.
