2019 TutORial: Recent Advances in Multiarmed Bandits for Sequential Decision Making

Given by Shipra Agrawal at the 2019 INFORMS Annual Meeting in Seattle, WA.

This tutorial discusses recent advances in sequential decision making models that build on the basic multi-armed bandit (MAB) setting to greatly expand its purview. Specifically, it covers progress in algorithm design and analysis techniques for three models: (a) contextual bandits, (b) combinatorial bandits, and (c) bandits with long-term constraints and non-additive rewards. Applications are drawn from several domains, including online advertising, recommendation systems, crowdsourcing, healthcare, network routing, assortment optimization, revenue management, and resource allocation.
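To make the basic MAB setting that these models build on concrete, here is a minimal illustrative sketch (not part of the tutorial materials) of Thompson Sampling for Bernoulli-reward arms; the arm means, horizon, and variable names are chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: 3 arms with unknown Bernoulli reward probabilities.
true_means = [0.3, 0.5, 0.7]
n_arms = len(true_means)
horizon = 10_000

# Beta(1, 1) priors on each arm's mean reward.
successes = np.ones(n_arms)
failures = np.ones(n_arms)

total_reward = 0
for t in range(horizon):
    # Sample a plausible mean for each arm from its posterior and play the best.
    samples = rng.beta(successes, failures)
    arm = int(np.argmax(samples))

    # Observe a Bernoulli reward and update that arm's posterior counts.
    reward = int(rng.random() < true_means[arm])
    successes[arm] += reward
    failures[arm] += 1 - reward
    total_reward += reward

regret = horizon * max(true_means) - total_reward
print(f"Total reward: {total_reward}, regret vs. always playing the best arm: {regret:.0f}")
```

The three models in the tutorial extend this loop in different ways: contextual bandits condition the arm choice on side information observed each round, combinatorial bandits select a subset or structure of arms per round, and the constrained/non-additive setting couples decisions across rounds through budgets or global objectives.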