Adaptive search space pruning in complex strategic problems
Fig 7
Aggregated results for AlphaZero with different amounts of additional MCTS search, playing against pure MCTS with 1000 simulations.
A) Limiting the trained deep learning models with a shutter of size zero (blue bars) significantly improved their performance against an MCTS algorithm on all experimental board configurations (all p < 10⁻⁵). B) Adding a shutter heuristic did not improve performance on empty boards (i.e., from the initial state of a game), except when no additional MCTS search was performed. All error bars are 95% confidence intervals.