Prasenjit Karmakar – Kudos: Growing the influence of research

Two Time-Scale Stochastic Approximation with Controlled Markov Noise and Off-Policy Temporal-Difference Learning

Article • Mathematics of Operations Research, February 2018, INFORMS

Shalabh Bhatnagar, Prasenjit Karmakar