1. P. Auer, N. Cesa-Bianchi, and P. Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2–3):235–256, 2002.
2. P. Auer, N. Cesa-Bianchi, Y. Freund, and R.E. Schapire. The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32:48–77, 2002.
3. A.G. Barto, S.J. Bradtke, and S.P. Singh. Real-time learning and control using asynchronous dynamic programming. Technical Report 91-57, Computer Science Department, University of Massachusetts, 1991.
4. D. Billings, A. Davidson, J. Schaeffer, and D. Szafron. The challenge of poker. Artificial Intelligence, 134:201–240, 2002.
5. B. Bouzy and B. Helmstetter. Monte Carlo Go developments. In H.J. van den Herik, H. Iida, and E.A. Heinz, editors, Advances in Computer Games 10.
6. H.S. Chang, M. Fu, J. Hu, and S.I. Marcus. An adaptive sampling algorithm for solving Markov decision processes. Operations Research, 53(1):126–139, 2005.
7. M. Chung, M. Buro, and J. Schaeffer. Monte Carlo planning in RTS games. In CIG 2005, Colchester, UK, 2005.
8. M. Kearns, Y. Mansour, and A.Y. Ng. A sparse sampling algorithm for near-optimal planning in large Markovian decision processes. In Proceedings of IJCAI'99, pages 1324–1331, 1999.
9. T.L. Lai and H. Robbins. Asymptotically efficient adaptive allocation rules. Advances in Applied Mathematics, 6:4–22, 1985.
10. L. Peret and F. Garcia. On-line search for solving Markov decision processes via heuristic sampling. In R.L. de Mantaras and L. Saitta, editors, ECAI, pages 530–
11. B. Sheppard. World-championship-caliber Scrabble. Artificial Intelligence, 134(1–2).
12. S.J.J. Smith and D.S. Nau. An analysis of forward pruning. In AAAI.
13. G. Tesauro and G.R. Galperin. On-line policy improvement using Monte-Carlo search. In M.C. Mozer, M.I. Jordan, and T. Petsche, editors, NIPS 9, pages 1068–
14. R. Vanderbei. Optimal sailing strategies, statistics and operations research program. Princeton University, http://www.sor.princeton.edu/~rvdb/sail/sail.html.