Sunday, December 21, 2008

Dr. Markov

Last week, in CS106X, I programmed Boggle.

Boggle is played with a set of cubes that have letters on each face. You shake them up, put them in a grid, and try to find words made of adjacent dice faster than the other players can.

Computer Boggle is different. It gives you time, letting you find as many words as you can, and then beats your score by an order of magnitude or so by finding all the remaining words on the board that are in the (huge) Scrabble player's dictionary. You can also play 5x5 "BigBoggle" instead of the standard 4x4, in which case the computer generally beats you even more thoroughly.

One of the cool things about CS106 is that they don't just want solutions, though. It's good style to extend them in some way, and solve some additional problems that weren't in the assignment. For Boggle, I wanted to write code that would find boards with unusually few (or many) words, using only the standard cubes.

So I tried Markov chain Monte Carlo, which I'd Wikied a long, long time ago when I was really into raytracing.

In a nutshell: the algorithm tries to find a global minimum of a function that can be sampled and which is roughly continuous, but which may be in many dimensions and may not have any other nice properties.

You start with a guess.
At each step, you find a distribution centered at the current guess and randomly sample it. If you know what the global minimum should be (e.g. zero) the variance of the distribution will vary with the cost (the value of the function at the current guess), so that if you're far away from the minimum, you're sampling over a wide area, while if you're getting close, you make more incremental guesses. Each 'probe' sampled this way is evaluated, and then you randomly decide whether to accept or reject it, depending on its cost. If the probe cost is significantly lower than the current cost, then you almost certainly accept it, and if the cost is significantly higher, then you almost always reject; in between, it's a toss-up: if you accept, then that probe becomes your current guess, and if you reject, then you keep probing.

Doing all this with Boggle boards took a bit of approximation and rule-of-thumb coding because I don't know a way to actually define a probability distribution over the set of possible Boggle boards. Instead, I defined 'swapping' as taking two cubes at random, switching their places, and randomly rotating each one. I started with a standard, randomly-generated board and, after each evaluation, did a number of swaps proportional to my cost function, with a maximum of about ten. When the guesses started getting so good that I wasn't switching any cubes, I picked single cubes and rotated them randomly to create new probes. The idea is similar: the closer you get to a global minimum, the closer your probes are to your current guess.

I made two cost functions, one preferring boards with lots of words and and the other preferring boards with few words. I already had a function that could quickly search a Boggle board and find all the words on it.

The result: it worked. I was surprised.

BigBoggle boards tend to have around 150 words on them. Here's one with no words at all:

On regular boggle boards, the computer will score ~90. Here, it scored 917:

The code is private because of academic policy, but if you want to play with the program, it's here.

Markov chain Monte Carlo rules.

No comments: