Sunday, June 1, 2008

Sudoku solver in JavaScript

I remember siting once in a bar with my friend. I remember that night very well because it was very interesting otherwise. For whatever reason quite unknown to me, at some time of the evening I decided to take a look at her new phone. Quite cute little phone she had, but I am not a gadget-maniac, no. This wouldn't keep my attention that long - if there wasn't a games section. Well, not even that - but it had Sudoku. OK, it's not a very interesting game compared to, say, Civilization, Counter Strike or such masterpieces.

However, it was interesting. What kept me to it was one very simple thing. Here, take a look at the sample Sudoku board (taken from the above article about Sudoku on Wikipedia):

Nothing special - 81 squares, some of which (30, to be exact) have digits (1-9 only), some of which are empty. Rules? Very simple: no row, column and block (block is a set of 3x3 squares, separated by thick lines on the picture) can have more then one instance of the same digit. To solve it, you need to fill all the digits in. Did I mention it's nothing special? Exactly my point - so simple, but not so simple to solve. If you look at Algorithmics of sudoku, under Exceptionally difficult Sudokus, you will see that there are ones that are even impossible to be solved by applying only logic rules (i.e. without any guessing of the digits). At least, that is how it stands right now.

The above Wikipedia pages explain pretty well the different ways you can solve many Sudoku puzzles. To use it to solve the Sudoku, you generally need two methods - using only them is pretty efficient. Here are the two methods:

  • Neighbors elimination
  • Singles extraction
This is especially true if you are a computer - in that case, even if you don't solve using the above two methods, you can significantly reduce the number of possibilities. This is when so-called guessing comes in - you can call it brute force solving.

Let me explain the above two methods in short.

Neighbors elimination


Eliminating neighbors is basically applying the first rule given above: no row, column and block can have more then one instance of the same digit. This is rather simple - if you have a square which has only one digit as the possibility, no other square in the same row, column and block can have that digit as a possibility. By removing this, we reduce the number of choices, possibly clearing some other squares to only one possibility - and that is our goal at the end.

Here is what can be eliminated

Since there can be only one 3 in the marked row and the 3 in the cell at the bottom is the only available choice, the three in the cell at the top can be freely cleared - it is not a choice if we want to solve the board.

Singles extraction


Applying the second rule gives this method. The second rule says: to solve it, you need to fill all the digits in. While this is not explicitly mentioned in the rule, is it obvious that a filled Sudoku board means that there are no empty squares. This leads us to another significant conclusion when combined with the first rule. Say there is a row, column or block containing only one square with a specific digit as a possibility. In that case, that square must be filled with that digit. Otherwise, we could not put that digit in any of the other squares and some square would, thus, be left empty, which violates the second rule.

If you take a look at the marked block, you can see that the circled 3 is the only 3 in the whole block. This means that 3 must be in the cell. Otherwise, it could not go to any other cell in the block, so the board could not be solved properly.

Solving by the computer


The above solutions are right as applied. However, they cannot lead to solution in many cases, even when applied one after another more then one time. There are other, more complex, techniques. Even the best of them cannot be used to solve some boards - "Qassim Hamza" is one of them. While these may be solved if some new method of solving is invented in the future, the key point is that using the above techniques the number of possible variants is very narrowed. After that, very few number of guesses need to be performed and the solution can be found.

The "very few" can be very much for a human. The problem is that you basically need to keep track of all the possibilities, which becomes tiresome very quickly. Luckily, our PCs are very good at not getting bored quickly. We, thus, can use them and the brute force algorithm to solve the board.

The way brute force algorithm works is rather simple. Take a look at the following board:

Applying any of the two non-guessing methods presented above yields nothing - you cannot eliminate any digit. What brute force solving does is the following:

It takes one of the circled digits in the first cell, starting from left-top corner and going right, then down, which has more then one digit in it. It then takes the first digit - 1. It assumes that that cell has only that digit. After that, you in essence have the new board with virtually fixed digit 1 at that cell. What the program needs to do is solve that board.

Now, as we already mentioned, using the previous two non-guessing methods is rather fast comparing to brute force algorithm and yields boards that are in many cases much easier to solve. Thus, brute force solver actually applies the above two methods on the new board, until they cannot be applied anymore. There are three cases:
  • The board is invalid at some time, meaning that our guess was certainly not right, so we backtrack. That means - return to the starting board and take the next digit - 5 in our case and do the same thing again,
  • The board is not invalid, but still not solved. In that case, we are at the next "milestone". We have a board for which we need to guess another digit. We again take the first cell that has at least two digits and apply the same method,
  • The board is solved.
The given solution is quite fast even in JavaScript. It solved all the boards I tried in less then half a minute on a P4 2.8 GHz. If you would like to play with it or the code (it's GPL 3), take it from here.