data:image/s3,"s3://crabby-images/fb91e/fb91ea75741b0ec4c87853e8502bb56da56ae85e" alt="Setting up Elo Scores For a New Game"
Setting up Elo Scores For a New Game
The Elo rating system is a method for calculating the relative skill levels of players in zero-sum games, such as chess and Go. It is named after its creator Arpad Elo, a Hungarian-American physics professor. Here I show how to implement it for a new game, and measure the Elo scores for some pure MCTS on different Tangled game graphs.
data:image/s3,"s3://crabby-images/69287/69287c002348cbe99c9e44b639b7c66519d00149" alt="Snarks!"
Snarks!
Today I would like to invite you to come falter with me down black cobwebbed steps to meet a thing only whispered about in the carven mausolea of university math departments: snark graphs. The term "snark graph" was coined by Martin Gardner in 1976, inspired by Lewis Carroll's poem The Hunting of the Snark.
data:image/s3,"s3://crabby-images/3bb3f/3bb3f9263b16626f5d14b4aae27c2f949cdf3b74" alt="Deconstructing AlphaZero Part 5: The Joys and Horrors of Parallelization"
Deconstructing AlphaZero Part 5: The Joys and Horrors of Parallelization
I spent a month trying to parallelize AlphaZero’s self-play and competitive phases. About three weeks in I was about to give up but I gave it one more go, and … success! Once I’d figured it all out, I got about 15-20x wallclock speedup on my 3090 + 10-core home setup.
This gives a blow-by-blow of how I got this over the finish line, with advice on how to structure python multiprocessing setups.
data:image/s3,"s3://crabby-images/86ecf/86ecf998dbf6c63493a279acd15cd880230911d3" alt="Deconstructing AlphaZero Part 4: How An Agent Learns"
Deconstructing AlphaZero Part 4: How An Agent Learns
Here I want to show you how an AlphaZero agent learns about its world. As it learns, its ability to intuit what to do grows, and it gets better at taking actions that help in achieving its intrinsic goal.
data:image/s3,"s3://crabby-images/28941/289419e4be0895df4ee9de412c6d51d1f81156eb" alt="Deconstructing AlphaZero Part 3: How You Gonna Act Like That"
Deconstructing AlphaZero Part 3: How You Gonna Act Like That
The defining feature of an agent is that it makes choices about what actions to take, given its current understanding of the world.
In this post, I describe how an AlphaZero agent makes the choices that it does. I also include some video describing part of what’s going on, which I decided to include as the inaugural episode of a new podcast. Exciting!
data:image/s3,"s3://crabby-images/7a4dc/7a4dc1438c76e5b20552e0d4afc9ebc3e2cff4f5" alt="Deconstructing AlphaZero Part 2: State, Actions, and Reward, and Dreaming of Game Trees"
Deconstructing AlphaZero Part 2: State, Actions, and Reward, and Dreaming of Game Trees
There are three key things you need to define before you build any reinforcement learning agent — state, actions, and reward.
In this post I’ll guide you through how to do this, with a worked example for Tangled!
data:image/s3,"s3://crabby-images/d7ec0/d7ec09feac596981a275ceeece18fc514906ceb2" alt="Deconstructing AlphaZero Part 1: Introduction"
Deconstructing AlphaZero Part 1: Introduction
AlphaZero is a beautiful algorithm developed by the folks at DeepMind around 2016-2017. Its output is a software agent — an AI system that has agency (it makes decisions and takes actions by itself with no outside intervention).
In the next few posts, I’ll explain how AlphaZero works, and show how to use it to build superhuman agents for playing games.
data:image/s3,"s3://crabby-images/df2b9/df2b932c6408eb03c2fea3dabe4dcffe75f44b64" alt="A Fun Math Problem"
A Fun Math Problem
In the set of all possible four vertex complete graphs with one red vertex, one blue vertex, and two black vertices, and each edge colored with one of three colors, how many unique graphs are there?
The image here is what Microsoft’s image generator spat out with a prompt of “Burnside’s lemma”.
data:image/s3,"s3://crabby-images/73917/73917239e86ccf50f4cac4ddacf9de1a0476ce7a" alt="Adjudicating Tangled Terminal States on Tiny Graphs With Three Different Approaches"
Adjudicating Tangled Terminal States on Tiny Graphs With Three Different Approaches
I adjudicated all Tangled terminal states for both three- and four-vertex complete game board graphs using three different approaches — a numerical Schrodinger Equation solver, a specially tuned simulated annealing approach run on the s=1 Hamiltonian, and a D-Wave Advantage2.5 system using multiple embeddings. I found 100% agreement between all three for all terminal states for both graphs.
data:image/s3,"s3://crabby-images/f1eb4/f1eb4c061f767ac783f5c06827362288647bb5a1" alt="How I Adjudicate Tangled Terminal States Using D-Wave Hardware"
How I Adjudicate Tangled Terminal States Using D-Wave Hardware
D-Wave’s hardware is not easy to understand or use properly. I have trouble understanding how to get it to do what I want, and I have a PhD in physics and was the CTO of the company for more than a decade!
In this post, I want to show you what I did to get the system to do what I want.
data:image/s3,"s3://crabby-images/a4058/a4058819fa0a36375175266a166611adaec21bc0" alt="Some Very Useful Tools"
Some Very Useful Tools
Thought I’d do a post with links to some tools I use. Some of this is specific to blogging using Squarespace, but some of it could be more generally useful!
data:image/s3,"s3://crabby-images/e3f81/e3f815934be161fa76637d5a1d9a7294a7e7a109" alt="Is it Possible to Build a One-Person Billion Dollar Company?"
Is it Possible to Build a One-Person Billion Dollar Company?
A few weeks ago I was at an event where one of the speakers was talking about how transformative generative AI is for start-ups. Not only could you use these new tools to build products, but likely you could also use them to reduce the number of people you need in a start-up. He claimed that it may now be possible to build a billion dollar start-up with only one person!
data:image/s3,"s3://crabby-images/ac855/ac855fed82179cf1012ad97fb120ea321990b43d" alt="Parallelizing Computation Using D-Wave Processors Via Multiple Simultaneous Embeddings"
Parallelizing Computation Using D-Wave Processors Via Multiple Simultaneous Embeddings
If we want to use a processor to solve a problem that is much smaller than the size of the hardware graph, one cool technique we can use is to find multiple non-overlapping ways to fit our problem into the hardware graph. These ‘fits’ are called embeddings. If you can find a bunch, then each time you call the processor, you get all of them solved for you at once. Not only does this give you a pretty big speed up (up to about 200-350x faster for the problems I’ll show you here), but it gives a neat way to average over certain types of noise in a processor.
data:image/s3,"s3://crabby-images/603d7/603d75a52fdd61be65a836265142f70abaabdc49" alt="The Computational Cost of Using the Schrodinger Equation to Adjudicate Terminal States in Tangled"
The Computational Cost of Using the Schrodinger Equation to Adjudicate Terminal States in Tangled
Evaluating who won a Tangled game requires solving the Schrodinger equation with N qubits, giving O(2ᴺ) scaling in both required memory and compute time.
data:image/s3,"s3://crabby-images/dc546/dc5468d39eca98ea907ea3e33f867523d35c1624" alt="Building Three Tangled Agents"
Building Three Tangled Agents
A software agent is an artificial intelligence software program designed to take actions (ie have agency) in response to the state of their environment.
Let’s look at how to build game-playing agents for any Tangled game board. We can then play against these agents, and agents can play against each other!
data:image/s3,"s3://crabby-images/84d36/84d36225cdb46f036729e433780958458b031ef3" alt="Tangled on a 4-vertex graph"
Tangled on a 4-vertex graph
The three-vertex Tangled game board I introduced in the previous post is the smallest where one player can win (two vertex games always end in a tie). Let’s take a look at Tangled on a four-vertex six-edge graph. It's still stupidly small, but maybe we can learn something!
data:image/s3,"s3://crabby-images/0e5dd/0e5dd33fc0c4927ee3eb4c92c479584ee0c6c396" alt="Tangled on a 3-vertex graph"
Tangled on a 3-vertex graph
In the previous post, I introduced the Tangled game, and we analyzed a two-player two-vertex variant. It wasn’t very interesting, because there was no way for either player to win — every game had to end in a draw! Here we’ll work through the smallest game board where one player can win. This game has three vertices.
data:image/s3,"s3://crabby-images/56807/568073cfa1a777ec65325b4861f7e13c009f71de" alt="An Introduction to Tangled"
An Introduction to Tangled
In an earlier post, I introduced the idea of constructing games where evaluating terminal states requires solving a problem where a quantum supremacy claim has been made. Here I’ll make this concrete by introducing a game which I call Tangled, which has this property.
data:image/s3,"s3://crabby-images/4e0e2/4e0e24bb3c60dc1c451f646da2a7ecd6b8cffcf0" alt="Quantum Mechanics and Quantum Computers"
Quantum Mechanics and Quantum Computers
I’m going to try to explain what quantum physics really is, why you should care, and how it relates to quantum computers (it’s freaking awesome by the way, even without D&D (Deepak & Drugs), or even AD&D (Advanced Deepak & Drugs)).
data:image/s3,"s3://crabby-images/160c6/160c6629c363e84964b1411d9dd5a41da9089470" alt="Influence"