What is it about?
The Building Batch Reinforcement Learning (B2RL) dataset is an open-source dataset of building-domain experiences for batch reinforcement learning research.
Why is it important?
Batch reinforcement learning (BRL) is an emerging field in the reinforcement learning community. BRL methods learn exclusively from static datasets (i.e., replay buffers). Model-free BRL models can learn an optimal policy without needing accurate environment models or simulation environments as oracles. Model-based BRL methods instead learn environment dynamics models from the buffers, then use these models to predict environment responses and generate Markov Decision Process (MDP) transitions given the states and actions produced by policies. In the offline setting, existing replay experiences serve as the prior knowledge that BRL models learn from, so generating replay buffers is crucial for benchmarking BRL models. In our B2RL (Building Batch RL) dataset, we collect real-world datasets from our database, as well as buffers generated by several behavioral policies in simulation environments. To the best of our knowledge, we are the first to open-source building datasets for the purpose of batch RL learning.
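To make the idea concrete, here is a minimal sketch of the kind of static replay buffer an offline BRL method consumes: a fixed collection of (state, action, reward, next-state, done) transitions recorded by some behavioral policy, sampled in minibatches but never appended to during learning. The class name, field names, and shapes are illustrative assumptions for this sketch, not the actual B2RL schema.

```python
import numpy as np


class ReplayBuffer:
    """Static transition store: offline RL samples from it, never appends."""

    def __init__(self, states, actions, rewards, next_states, dones):
        # Each field is one column of the (s, a, r, s', done) MDP transitions.
        self.states = np.asarray(states, dtype=np.float32)
        self.actions = np.asarray(actions, dtype=np.float32)
        self.rewards = np.asarray(rewards, dtype=np.float32)
        self.next_states = np.asarray(next_states, dtype=np.float32)
        self.dones = np.asarray(dones, dtype=np.float32)

    def __len__(self):
        return len(self.states)

    def sample(self, batch_size, rng):
        """Draw a random minibatch of transitions for one offline update."""
        idx = rng.integers(0, len(self), size=batch_size)
        return (self.states[idx], self.actions[idx], self.rewards[idx],
                self.next_states[idx], self.dones[idx])


# Toy buffer: 100 transitions over a hypothetical 4-dimensional state space,
# standing in for experiences logged by a behavioral policy.
rng = np.random.default_rng(42)
buf = ReplayBuffer(
    states=rng.normal(size=(100, 4)),
    actions=rng.integers(0, 3, size=(100, 1)),
    rewards=rng.normal(size=(100,)),
    next_states=rng.normal(size=(100, 4)),
    dones=np.zeros(100),
)
s, a, r, s2, d = buf.sample(32, rng)
print(s.shape, r.shape)
```

An offline learner would repeatedly call `sample` to run policy-evaluation or policy-improvement updates against this fixed data, without ever interacting with the building or simulator.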
Read the Original
This page is a summary of: B2RL, November 2022, ACM (Association for Computing Machinery),
DOI: 10.1145/3563357.3566164.