Exploration-Exploitation Trade-off
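The fundamental challenge in search and learning: should you exploit what you've learned (focus on nodes with high estimated reward) or explore new options (try nodes that have been visited only a few times)? UCB (Upper Confidence Bound) balances both by adding to each node's average reward an exploration bonus that shrinks as the node is visited more often.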
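One way to make the balance concrete is the UCB1 rule, sketched below in Python. This is a minimal illustration, not a definitive implementation: the function names ucb1_score and select_child, the (total_reward, visits) tuple layout, and the default exploration constant c = sqrt(2) are assumptions made for this example and do not come from the text above.

```python
import math


def ucb1_score(total_reward, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score for one node (hypothetical helper for illustration).

    The first term is the exploitation part (average reward so far);
    the second is the exploration bonus, which shrinks as `visits` grows.
    """
    if visits == 0:
        # Unvisited nodes get an infinite score so they are tried first.
        return math.inf
    exploit = total_reward / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore


def select_child(children, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    `children` is a list of (total_reward, visits) tuples; the parent's
    visit count is assumed to be the sum of the children's visits.
    """
    parent_visits = sum(visits for _, visits in children)
    scores = [ucb1_score(r, v, parent_visits, c) for r, v in children]
    return max(range(len(children)), key=scores.__getitem__)
```

With children [(9.0, 10), (0.5, 1)], the second node has the lower average reward but only one visit, so its exploration bonus is large enough for select_child to pick it. Setting c=0 removes the bonus and reduces the rule to pure exploitation, which picks the first node.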