The multi-armed bandit problem: a fixed limited set of resources must be allocated between competing choices in a way that maximizes their expected gain, when each choice's properties are only partially known, and may become better understood as time passes or by allocating resources to the choice.
Machine Learning A-Z
Thoughts and flashcards derived from Udemy’s course Machine Learning A-Z.