The University of Arizona
Map Home
Loading...
Adjust height of sidebar
KMap

Grant

CIF: Small: Theory and Algorithms for Efficient and Large-Scale Monte Carlo Tree Search

Sponsored by National Science Foundation

Active
$599.2K Funding
1 People
External

Related Topics

Abstract

Monte Carlo tree search (MCTS) is a versatile online planning methodology for sequential decision-making problems such as reinforcement learning that has recently shown empirical success in real-world problems including games, chemical synthesis, materials/drug discovery, and numerical algorithms. However, there is a huge gap between existing MCTS theory and practice because (i) the de facto standard MCTS algorithm called upper confidence bound for trees (UCT) is known to be provably suboptimal, (ii) existing theories are limited to asymptotic or worst-case analyses, and (iii) the optimal performance rates of MCTS algorithms are not known. This implies that the state-of-the-art MCTS methods might still be far from realizing their full potential, and further developments are required to prepare for the next generations of much larger and more complex decision-making problems. This project focuses on bridging the gap between theory and practice in MCTS methodology by developing novel MCTS algorithms with strong mathematical performance guarantees, establishing the optimal performance rates, and evaluating them in real-world applications. This project integrates education into research by developing a course module and building interdisciplinary teams of undergraduates who will work closely with material scientists to evaluate the developed algorithms on materials discovery tasks. The project consists of three main directions: the foundations of MCTS, large-scale MCTS, and the design of experiments for MCTS. Each direction contains several main objectives: (i) for the foundations of MCTS, the focus is to improve maximum mean estimator and leverage tools from a related problem called pure exploration to develop algorithms with strong guarantees and study information-theoretic limits of MCTS; (ii) for the large-scale MCTS, the focus is to analyze and improve existing heuristics for large-scale MCTS problems such as progressive widening, incremental depth expansion, and function approximations; (iii) for the design of experiments for MCTS, the focus is to develop experimental design methods to efficiently train function approximations for MCTS with a small number of samples. In addition to theoretical and algorithmic developments, the project also aims at implementing all algorithms developed as open-source software, evaluating them using benchmark datasets, and applying them to material science tasks via the interdisciplinary teams of undergraduates as part of the educational aim. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

People