What is it about?

This paper proposes an algorithm that combines multiple strategies for the stochastic multi-armed bandit problem. The experiments of Auer et al. (2002) show that multi-armed bandit algorithms perform differently on problems with different reward distributions. When it is not known in advance which strategy is best, our algorithm offers one solution; a sketch of the setting follows below.
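To make the setting concrete, here is a minimal Python sketch of two classical strategies studied by Auer et al. (2002), epsilon-greedy and UCB1, written against a common select/update interface. The class names and the interface are our own illustration, not code from the paper.

```python
import math
import random

class EpsilonGreedy:
    """Epsilon-greedy: explore a uniformly random arm with fixed probability epsilon."""
    def __init__(self, n_arms, epsilon=0.1):
        self.epsilon = epsilon
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms

    def select_arm(self):
        if random.random() < self.epsilon:
            return random.randrange(len(self.counts))
        return max(range(len(self.counts)), key=lambda a: self.means[a])

    def update(self, arm, reward):
        # incremental update of the empirical mean reward of the played arm
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]

class UCB1:
    """UCB1 of Auer et al. (2002): optimism via a confidence bonus."""
    def __init__(self, n_arms):
        self.counts = [0] * n_arms
        self.means = [0.0] * n_arms
        self.t = 0

    def select_arm(self):
        self.t += 1
        for a in range(len(self.counts)):
            if self.counts[a] == 0:  # play every arm once before using the bonus
                return a
        return max(range(len(self.counts)),
                   key=lambda a: self.means[a]
                   + math.sqrt(2.0 * math.log(self.t) / self.counts[a]))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.means[arm] += (reward - self.means[arm]) / self.counts[arm]
```

Neither strategy dominates the other on all reward distributions, which is why a method for combining them is useful.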

Why is it important?

Theoretically, the proposed algorithm, epsilon_t-comb, converges asymptotically to the best of the combined strategies; the precise definition of "best strategy" is given in the paper.
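The paper's exact selection rule is given there; as a hedged illustration, the sketch below assumes an epsilon_t-greedy rule at the meta level: each base strategy is treated as a meta-arm, an exploration rate epsilon_t = min(1, c/t) decays to zero, and play therefore concentrates on the strategy with the best empirical average reward. It reuses the EpsilonGreedy and UCB1 classes from the sketch above, and whether every base strategy observes every reward is also our assumption.

```python
import random

class EpsilonTComb:
    """Illustrative combiner: each base strategy is a meta-arm, and the
    exploration rate epsilon_t = min(1, c/t) decays to zero, so play
    concentrates on the empirically best strategy. The paper's exact
    rule and its definition of 'best strategy' may differ."""
    def __init__(self, strategies, c=5.0):
        self.strategies = strategies
        self.counts = [0] * len(strategies)
        self.means = [0.0] * len(strategies)
        self.t = 0
        self.c = c  # larger c means slower decay, i.e. longer exploration

    def select_arm(self):
        self.t += 1
        eps_t = min(1.0, self.c / self.t)  # epsilon_t -> 0 as t -> infinity
        if random.random() < eps_t:
            self.current = random.randrange(len(self.strategies))  # explore a strategy
        else:  # exploit the strategy with the best empirical mean reward
            self.current = max(range(len(self.strategies)),
                               key=lambda s: self.means[s])
        return self.strategies[self.current].select_arm()

    def update(self, arm, reward):
        self.counts[self.current] += 1
        self.means[self.current] += ((reward - self.means[self.current])
                                     / self.counts[self.current])
        # Assumption: every base strategy observes every (arm, reward) pair.
        for s in self.strategies:
            s.update(arm, reward)

# Usage sketch: a 5-armed Bernoulli bandit (the arm means are made up here).
arm_means = [0.1, 0.3, 0.5, 0.7, 0.9]
combiner = EpsilonTComb([EpsilonGreedy(5), UCB1(5)])
for _ in range(10000):
    a = combiner.select_arm()
    r = 1.0 if random.random() < arm_means[a] else 0.0
    combiner.update(a, r)
```

Intuitively, if epsilon_t decays to zero while every strategy is still explored infinitely often, the fraction of rounds given to suboptimal strategies vanishes, which matches the asymptotic-convergence claim.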

Read the Original

This page is a summary of: Combining Multiple Strategies for Multiarmed Bandit Problems and Asymptotic Optimality, Journal of Control Science and Engineering, January 2015, Hindawi Publishing Corporation, DOI: 10.1155/2015/264953.
