A graphical representation of the increasing strength of computer Go programs
Graph, 1989-2017
Graph, 2008-2017
This page uses graphs to show the strength of the leading computer Go programs,
plotted against time.
The source of all the data is the page
Human-Computer Go Challenges,
which lists all the "official" human-computer Go games that I am aware of.
These are games and matches that were well publicised, or that were played,
as part of the event, between a human and the winner or winners at the end of
a computer Go tournament. Inclusion criteria are listed below.
Criteria for inclusion in the graph
A game is treated as a data point for the graph if all of these are true:
- The game was sufficiently public that I am aware of it and it is
listed here.
- It was played by a leading program.
- The human's rating is known.
- It was played on a 19×19 board.
- The handicap is known, and was less than 18 stones.
The data used to create the graphs is in this text file.
Weaknesses in the data
There are many weaknesses in the data.
- The programs use a very wide range of hardware. An event in November
2014 used 2-core laptops. AlphaGo's match in October 2015 used 1202
CPUs and 176 GPUs.
- While I have only included humans of known strength, their strengths
are given according to a variety of national and Go-server-based rating
systems. These differ considerably. However, I have treated them all in the
same way (except for Korean ratings, where I have treated "1-gup" as
professional 0-dan, and otherwise ignored amateur Korean ratings).
- The games involve "leading programs", which may be a grade or so weaker
than the strongest program of their day.
Assumptions
To make all the data comparable so that it can be included in one graph, I
have made some assumptions:
- A one-stone difference has the same meaning at all playing strengths.
- One amateur (dan or kyu) grade difference corresponds to one handicap stone.
- One professional grade difference corresponds to ⅓ of a handicap stone.
- Professional 1-dan corresponds to amateur 6⅓ dan, and so professional
9-dan corresponds to amateur 9 dan.
- An "n-stone" handicap is effectively worth n-½ stones.
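Taken together, these assumptions place every rating on a single numeric scale. As a rough illustration, here is a small Python sketch of that scale; the function names and the choice of amateur 1-kyu as the zero point are mine, not the page's.

```python
# A sketch of the strength scale implied by the assumptions above.
# Conventions here are my own: amateur 1-kyu is the zero point, and
# each unit of the scale is one handicap stone.

def amateur_dan(n):
    """Amateur n-dan: one grade per stone, 1-dan = 1.0."""
    return float(n)

def amateur_kyu(n):
    """Amateur n-kyu: 1-kyu = 0.0, 2-kyu = -1.0, and so on."""
    return 1.0 - n

def pro_dan(n):
    """Professional n-dan: pro 1-dan = amateur 6 1/3 dan, and each
    further professional grade is worth 1/3 of a stone."""
    return 6 + 1 / 3 + (n - 1) / 3

def adjusted_strength(rating, handicap_stones):
    """Effective strength of a player giving an "n-stone" handicap,
    treated as worth n - 1/2 stones."""
    return rating - (handicap_stones - 0.5)

# Consistency check from the assumptions: professional 9-dan
# comes out at amateur 9-dan.
assert abs(pro_dan(9) - amateur_dan(9)) < 1e-9
```

Note that, because of the half-stone assumption, adjusted strengths generally fall midway between whole grades: `adjusted_strength(amateur_dan(6), 11)` gives -4.5 on this scale.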
Details of the graphs
Wherever a human of known strength (taking account of the handicap used) lost to a
leading program, there is a black ribbon extending downwards from one stone weaker
than the human's adjusted strength to the bottom of the graph. The ribbon indicates
that it is slightly improbable that the program was weak enough to fall in the
strength range shown by the ribbon.
Likewise, wherever a human beat a leading program, there is a white ribbon extending
upwards from one stone stronger than the human's adjusted strength to the top of the
graph, indicating that it is slightly improbable that the program was strong enough
to fall in the strength range shown by the ribbon.
The ribbons are all partly transparent, allowing the combined effect of several
overlapping ribbons to be seen for games played on the same or close dates.
For example, at the end of 1997, three Taiwanese inseis (whom I have treated as
amateur 6-dan) played against Handtalk, then the world's leading program, all giving
11-stone handicaps. One insei won his game, the other two lost. The adjusted rating
of a 6-dan giving 11 stones is 6-kyu, so the graph shows, at the end of 1997, a black
ribbon extending down from 7-kyu and a white ribbon extending up from 5-kyu. The
black ribbon is actually two overlaid black ribbons for the two games lost by the
inseis, so is somewhat denser than the white ribbon.
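The ribbon rule described above can be sketched as a small function. The names, the axis limits, and the numeric scale (amateur 1-kyu = 0, one unit per handicap stone, so 6-kyu = -5.0) are my own illustrative choices, not the page's.

```python
# Illustrative sketch of the ribbon rule. Scale: amateur 1-kyu = 0.0,
# one unit per handicap stone (so 6-kyu = -5.0, 7-kyu = -6.0).
# GRAPH_BOTTOM and GRAPH_TOP are arbitrary axis limits for the sketch.

GRAPH_BOTTOM = -20.0
GRAPH_TOP = 10.0

def ribbon(human_adjusted, human_won):
    """Return (low, high, colour) for one game's semi-transparent ribbon.

    A human loss gives a black ribbon from one stone below the human's
    adjusted strength down to the bottom of the graph; a human win gives
    a white ribbon from one stone above it up to the top."""
    if human_won:
        return (human_adjusted + 1.0, GRAPH_TOP, "white")
    return (GRAPH_BOTTOM, human_adjusted - 1.0, "black")

# The Handtalk example: a human adjusted to 6-kyu (-5.0) who lost
# gives a black ribbon reaching up to 7-kyu (-6.0), and one who won
# gives a white ribbon starting at 5-kyu (-4.0).
assert ribbon(-5.0, False) == (-20.0, -6.0, "black")
assert ribbon(-5.0, True) == (-4.0, 10.0, "white")
```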
Graph, 1989-2017
Graph, 2008-2017
Reasons for the continuing improvement in strength
- Better programs. In the early days of computer Go, the programs were the work
of a few amateurs. But since about 1990, increasing numbers of programmers have been
working full time to make improvements. Increasingly, programs are the work of teams.
- Moore's Law. Processor power continues to double every two years. However, for
a period from about 2000 to 2006, increasing processor power had little effect on the
performance of Go-playing programs. Chess programs could use it to read ever deeper with
alpha-beta search, but Go programs could not make effective use of more power.
- Parallelisation. After the introduction of
UCT in 2006, more
processor power could once more be used effectively: a UCT search can readily be
parallelised.
- DCNN. Late in 2014, programmers started to consider the use of
Deep Convolutional Neural Nets. These have
proved very effective.
- Increased availability of massive processor power. Maybe this should just be
regarded as a consequence of Moore's Law. Graphics cards are now widely used by computer
Go programs, as they can support multiple parallel processes. The "Cloud" also helps:
it is now possible for an individual to hire a thousand processors for two hours,
or to borrow them from his employer for a weekend.
- Increased sponsorship. From early in 2016, and probably earlier but privately,
large corporations have been devoting significant resources to improving computer Go
software. Most notably, DeepMind, acquired by Google in 2014, created AlphaGo; also
Facebook created DarkForest, and Dwango supported DeepZen.
Last updated: 2016-02-08