Last time I made a solver to test puzzles for me. Once that was done, I could make puzzles like so:

- Fill in the board with tiles.
- Pull tiles off one at a time while the solver looks at what remains. If I pull off a tile and the solver says it’s stuck, then put the tile back and remove a different one.
- Keep doing step 2 until I have a puzzle of the desired difficulty.
- Lock down all of the remaining tiles.
- Done.

This is how I produced all of the puzzles in the builds I’ve released. Once this was done I shelved the project for about a year. But now that I’m back on it, I have to say step 1 is now the major roadblock to creating new puzzles.

It takes *time* to fill in a board. Filling in the first 8 / 9^{ths} of the board is trivial, but getting the last ninth into place can be tricky. Take this example:

This is what it’s like when you get down to the end. I need to add another 1 to the board, but there aren’t any valid spaces for a 1. So I’ll swap a couple of tiles around to make the 1 fit, but now maybe I can’t get the last 2 on the board. So I fix that but then 4 is a problem, and so on. It’s trivial to fix on 4×4 puzzles like the one above, but on a 9×9 this can actually take a while to sort out.

In the past I didn’t mind. Creating a filled-in board is basically a puzzle in itself. It was fun to do. But it’s also time-consuming and often not what I want to be working on at the moment. If you played some of my test builds, you may have spotted a few puzzles like this one:

I don’t know if I had any that were quite this flagrant, but this is a good example of a lazy puzzle layout. Rather than make a properly randomized puzzle, I just added things according to a nice orderly little system. The benefit is that you can fill in a 9×9 puzzle in less than a minute. The downside is that if the player notices the repeating patterns, it trivializes the entire puzzle. Sudoku puzzles are **not** supposed to look like this. I circled a couple of repeating patterns for you, but the whole board is filled with groupings like this. I filled in the top row with the digits 1-9 in numeric order, and then changed as little as possible with each subsequent row.

I did this a year ago to quickly fill in boards for testing. I figured I’d replace them with proper puzzles if the project ever got off the ground. Then I forgot I’d done this, so when I returned to the project this year I ended up releasing these lazy boards to the public.

Whoops. Oh well. Crappy puzzles are a content problem, and generally content problems are easier to solve than technology problems.

For the record, a proper sudoku layout should look like this:

That’s a nice random layout. But it takes time to make those by hand. I need a way to fill in a board automatically.

I assumed there was some mathematical trick for doing this, but no. It turns out that board layouts are made through trial-and-error.

I know the code should be simple to write, but I have no idea how long it will take to run. My hope is that I can get it under five seconds or so. Now that I’ve released my test builds to the public, the #1 requested featureAside from a “next puzzle” button, which I’ve already done. is for a way for the game to generate random puzzles. If I can fill in the board in just a few seconds, then I’ll have everything I need to generate puzzles. But if it takes 30 seconds then I’ll just shelve the idea until later.

The rules are simple:

- Working left-to-right, top-to-bottom, pass over the board looking for empty squares. (Instead of starting with a blank board, it can handle pre-existing tiles. This means I can make a puzzle that spells out “A BADONK BLOCKADE” using the letter tileset if I want to.)
- When it finds an empty square, it randomly places a new tile there that doesn’t violate the rules.
- If there is no valid tile for that space, then the puzzle is stuck. Back up to the previously placed tile, remove it, and try something different. If THAT one gets stuck, then it will back up to the previous, and so on.
- Keep doing this until the board is full.

It takes about four minutes to fill in a blank 16×16 board. That’s way too long. Although to be fair, I’m deliberately throttling this code. Instead of having the program lock up until it finds an answer, it just works on it a little bit each frame while keeping the program running. It’s even animating stuff, so I can see tiles flying all over the screen at hyperspeed. I’m doing this because I want to watch the process work.

Sure, I could make the search run faster by having it run exclusively and halting rendering and animation until it’s over, but I’ve got a better idea…

When the auto-fill code places a tile, it hands the resulting board off to the solver code I wrote last year. The solver will reply with all of the forced moves. As in, “Based on the tiles currently in play, here are moves you’ll HAVE to make.” This doesn’t do much early in the process, but it’s a huge boost to the latter stages of filling the board. It lets the fill code spot doomed layouts far in advance, instead of needing to work through all of the permutations before giving up. It also means there will be less permutations to test later in the process, since more of the board will be filled in.

Now it takes almost exactly 30 seconds to fill in one of these 16×16 monster puzzles. A traditional 9×9 is done in under 5.

I now have the two major pieces needed to automatically generate puzzles. It can fill a board, and then pull as many pieces as possible off of it.

There’s still the matter of balancing the system. Right now it can only gauge puzzle difficulty by how many tiles are on the board at the start. I’ll need to come up with a way to appraise difficulty if I wanted to make a random puzzle generator. But that’s a problem for later in the project when I’ve got a group of playtesters to work with. In the meantime I have a bunch of annoying technology problems to solve.

#### Footnotes:

[1] Aside from a “next puzzle” button, which I’ve already done.

## Games and the Fear of Death

Why killing you might be the least scary thing a game can do.

## Borderlands Series

A look at the main Borderlands games. What works, what doesn't, and where the series can go from here.

## The Middle Ages

Would you have survived in the middle ages?

## Dead Island

A stream-of-gameplay review of Dead Island. This game is a cavalcade of bugs and bad design choices.

## Tenpenny Tower

Bethesda felt the need to jam a morality system into Fallout 3, and they blew it. Good and evil make no sense and the moral compass points sideways.

Did you consider a more algorithmic approach to the problem?

When I had to do something like this for a pet project, I tried using a backtracker like the one you describe here, and the board was so big it just got stuck. Didn’t manage to fill half the board in an hour. So I scrapped the entire idea and implemented it as an Exact Cover problem – that took some time, but it started raining solutions. For you, it would probably fill a board in less than a second.

After I implemented my solution below I took a look at your Wikipedia link. Like most Wikipedia mathematics pages, it is inscrutable to me. Is there an example of your implementation in layman’s terms anywhere that I can look at?

Here’s an Exact Cover solver in Python in only 30 lines, along with an example of what EC actually

is:http://www.cs.mcgill.ca/~aassaf9/python/algorithm_x.html

And here’s a better explanation of Exact Cover, solving it and its relation to Sudoku and puzzles in general:

http://www.ams.org/samplings/feature-column/fcarc-kanoodle

Difficulty: you know how you wrote all your solver algorithms to return ‘true’ if they did anything to contribute to the solution?

Idea 1: rate the algorithms somehow as harder or easier. Difficulty is the highest number you returned true for. Or the sum of all the difficulties used. Whatever.

Idea 2: change the true/false return to “from zero to number of times I had to use the algorithm”. Weight the harder algorithms more heavily if they had to be used more than once.

As one of the people in your last thread who very specifically called out the predictable patterns issue (including providing annotated screenshots), I’m really glad to see you tackling this head-on. It was the biggest concern I had, as it definitely betrayed the claimed difficulty. I’m wondering if you found a way around the other pattern I specifically mentioned at the time, however, as I think that pattern may in part be a property of the board’s design.

Not quite sure if this would be able to give you all possible Sudoku puzzles but it should be possible to derive a large number of valid solved puzzles from one solved puzzle, simply by permutating the contents.

1: switch all numbers 1 with 2

2: extension: symbol x becomes y, y becomes z, z becomes x …

3: some subset of the tiles can trade places. The rules for that would be similar to the ones for placing tiles in the first place … Yeah, that’s a bit foggy since I don’t know those rules myself… I’m tempted to go and work that out now but my lunchbreak is just over :)

Anyways: even if there are just a few very simple switches and swaps, it should be possible to use one “expensive” solution (as you describe in the post) to generate a large-ish number of other solutions which are systematically similar but hard for a human to identify as such, especially if the puzzle derived from them has a completely different initial state.

Not a large number, not by any stretch of the imagination. The number of unique 9×9 grids (i.e. Without relabeled grids) is of the order of 6.67E21.

That’s why there’s no danger of seeing the same grid too often!

FWIW Math with Bad Drawings uses symmetries to get 1.2 trillion different puzzles from a single 9 by 9 puzzle.

oh, that’s a few more than I thought :)

It’s nowhere near the total number of possible ones, meaning that you’d be limited to a rather small subset of theoretically possible ones.

Applying the method from Math with Bad drawings to the full solution and then using Shamus’ method to go from solution to puzzle would of course give us a few orders of magnitude more though the big question (for which I have no answer) is: Can we cover all possible puzzles this way, from just one solution? If I had to guess I’d say no, probably not.

Which means:

An efficient way of generating puzzles would probably be to randomly generate new “seed” solutions every now and then, not every time somebody wants a new random puzzle. Then, when you do have to make a new one, you randomly pick one of the random seed solutions, apply a random set of switches and transformations to it, then randomly remove stuff until you have a new puzzle. As I understand Shamus post, the getting from solution to puzzle should run pretty fast in automated mode. The nice thing about that is also that you could influence difficulty by choosing which rules the number-removal algorithm is allowed to use. So you could make a puzzle which does not require e.g. hidden twins. Or does require them but not hidden triplets etc… or to make it even easier you could leave more numbers on the board than absolutely required.

That’d seem like a decent approach to me.

… I am absolutely with Shamus’ approach, though, in terms of having the algorithm that could run automatically run within the GUI instead, with human interaction and all. That’s probably the best way to teach oneself the rules and not just the maths but to actually get a feeling for it all, and it’s a game in and of itself.

Difficulty for sudoku is usually measured by the number of missing tiles.You can use that as the default if you dont want to fuss around with it.

But as a more advanced method,you could tweak your solver to return stuff like time it took to solve the puzzle,which methods were used and how much,etc.

I didn’t take the time to read this paper, but based on the abstract and leafing through it, it could prove helpful. I like that it is based on actual people solving actual puzzles, rather than purely theoretical considerations.

https://arxiv.org/abs/1403.7373

Difficulty Rating of Sudoku Puzzles: An Overview and Evaluation

Radek Pelà¡nek

(Submitted on 28 Mar 2014)

How can we predict the difficulty of a Sudoku puzzle? We give an overview of difficulty rating metrics and evaluate them on extensive dataset on human problem solving (more then 1700 Sudoku puzzles, hundreds of solvers). The best results are obtained using a computational model of human solving activity. Using the model we show that there are two sources of the problem difficulty: complexity of individual steps (logic operations) and structure of dependency among steps. We also describe metrics based on analysis of solutions under relaxed constraints — a novel approach inspired by phase transition phenomenon in the graph coloring problem. In our discussion we focus not just on the performance of individual metrics on the Sudoku puzzle, but also on their generalizability and applicability to other problems.

It’s good you got it down to 30 seconds, but since you already have it throttling all it needs to do is run in less time than it takes to solve a puzzle. That way when the user is playing one puzzle it can generated the next one in the background, and they can keep playing forever.

That can work as an endless mode,but usually with random stuff like this the user is given the option to change the seed before the random generation begins,thus completely negating any advantage preloading can give you.And,as Shamus pointed out,he wants an option to input a phrase you wish,so he definitely wants a option of manual seeds.

I wanted to comment on the speed thing too. Seems like the only options explored so far are:

1. Halt everything else in the game while this long algorithm runs.

2. Run the algorithm in tiny pieces and do animations after every piece.

There’s another option however:

3. Run the board-generation code async. Then you can show a spinner, or just run it in the background while the player plays another puzzle, or some other appropriate place in the game where the player is doing something else / waiting.

4. Generate a whole bunch of puzzles (maybe a few thousand?) before shipping and include them with the game, so the player’s computer only needs to do the generation if all of these are exhausted.

Idea for a difficulty meter would be looking at the “order” of missing tiles.

Some missing tiles are “first order” – they can be placed simply by looking at the tiles in the starting set, without placing any other tiles. For example, if there’s only one space on the initial board that could possibly contain a “1,” then that’s a first-order tile. First order tiles are the ones you can place “first” with no additional information.

Some tiles are higher “order” – they can’t be uniquely placed until at least one of the other playable tiles has been played first. Second-order tiles can’t be uniquely played with the starting position, but can be played once all the first-order tiles are played. Example – there’s a “2” in the starting set, and there are 2 legit places on the board it can go. However, there’s a first-order “3” in the starting set that occupies once of those 2 locations. Once that’s played, there’s only one place for the 2.

Third-order can only be uniquely placed once the second-order tiles are in, etc.

The way you could in theory use this as a “difficulty” scale is to look at the ratio of low-order tiles to higher-order tiles. Simpler games would have more first-order and second-order tiles – it’s easier to place them just by counting, and you have potentially more places you can start. More difficult games might have a single first-order and a single second-order tile, and more tiles that you can’t place right away.

I would consider a tile that can be placed just by looking at the board (eg, a square that can only have a “1” due to already sharing a group with 23456789, or a square that must have a “2” because all other squares in that group share other groups with a 2) as first order.

And it would be nice to have a button that would auto-play all first order tiles, since they are just a matter of taking the time to review every cell and assess them, with no higher-order thinking required.

You could then assess a rating based on how many tiles require a higher order (hidden pairs, etc.) logic to place.

With sufficiently advanced math, all tiles are first order.

The definition presented by MrGuy is that if the only input required to determine the correct location for a piece is the location of the starting pieces = first order.

Second order tiles, by definition, require additional input information – the location of the first order tiles.

No ‘advanced math’ is going to make all tiles first order, because the definition is strictly talking about input information necessary. The definition of 2nd order is that it takes as input the location of the 1st order tiles. There’s no getting around that.

If there are two legal completed boards such that all the starting tiles appear in each board, the start conditions are illegal.

Therefore, “select FROM legal_boards WHERE tiles match given” will take forever and a big database but give you all the tiles as first order.

With similar thinking in mind it would be “relatively” easy to make the solver give a numerical value to how many tiles can be placed each pass. As well as how many tiles have multiple possibilities to be held in memory for a second, third or nth pass. Then you could use these values to assess how hard the problem becomes to solve. (A dirty version of this would be to use the time for the solver to solve the problem as the number. Using a few set problems of known challenge level to gauge the system so faster ones dont give harder problems with easier than predicted level)

By keeping track of how many tiles have multiple possibilities for how deep in the search you are you are able to create a set of values then you need the fuzzy brush of people to decide about where the break point is between Simple, easy, challenging, hard, “impossible” should be placed.

No “Continue reading Â»” tag = this home page now has more filler.

The two ways I might look at measuring difficulty would be:

1) Based on techniques needed: make a couple different versions of the solver that don’t know all the techniques of the full solver (e.g. no hidden twin) and see which solvers can solve the puzzle.

2) While solving the puzzle, look at the average number of forced moves in any given position: when there’s a lot of forced moves, puzzles are easy: when there’s really only a single number that you can fill in for a given position, that’s hard.

—

Also, if possible, you might look at generating puzzles in the background while the game is running: a 30 second delay for a puzzle to be generated is fine, but if that processing can happen in the downtime while the user is staring at a different puzzle and the CPU isn’t really doing anything, that’s even better.

Back when I was toying with a solver of my own, I ended up at this page:

http://www.sudokuwiki.org/sudoku.htm

And learned about a lot of obscurely-named strategies that people use to solve sudoku. And what interested me about a lot of them is that many of them are just individual consequences of some core rules – the rules you discussed in your last post, about your solver. But here’s the trick. The reason why people break all of these different strategies out and describe them separately, instead of just going back to the core rules, is that each of the strategies is a heuristic for _recognising_ a situation where you can apply the rules to a specific effect. Computers are better at rules; humans are better at pattern-matching. So (most of?) the strategies correspond to how most _humans_ solve a sudoku, not to anything necessary for your computerised solver.

Why should we care? Because this gives you a way to judge difficulty. If you force _a_ version of your solver to follow human strategies – like the one on that page does – then it will probably be slower than the other, strictly rule-based solver, but knowing which and how many human strategies it requires to get an answer will give you some idea how hard it will be for humans to solve.

Shamus, here’s a fill routine in C I whipped up that can fill a 16×16 grid in 3ms. There’s basically no smarts to it. It just uses a pre-computed lookup table to check things along the way. To avoid getting stuck for too long, there’s a bailout threshold. That 3ms is the average

withbailouts, and the early bailout limits the worse case to tens of ms.https://gist.github.com/skeeto/f60e7e5d65639b995f78d706a72546b1

That doesn’t solve the rest of your problem, though I may take a crack at that next.

You should never get stuck when

generatinga puzzle. Maybe whensolvingit, but, I mean, you know what the starting state is, so if it’s impossible, it should be apparent at that point.Took a look at your code, looks like it’s hard-coded to 16×16? Can it accommodate arbitrary given starting states?

Well, looks like I spoke too soon. My “solver” absolutely does produce invalid sets, and the larger the board, the more state violations there are.

Looks like I need a state-reversing system. Hmmm.

By “stuck” I mean it makes an early decision that leaves the remainder impossible to fill. It

willeventually fill the grid once it backs out that decision, but will grind away for minutes, or worse. So instead it bails out if it doesn’t fill the grid in a certain number of steps, simply starting over from scratch.I just pushed an update to make it actually generate a legal puzzle. It only takes about 300ms. However, there’s still no metric to measure the

difficultyof its generated puzzles. Here’s one (I hope it formats correctly):_ _ _ F _ B H _ M _ P K L _ _ _

J _ M K N _ _ G _ _ L _ F _ D B

H _ C _ _ _ L O _ I _ _ N E _ M

D L G _ _ C _ F B _ O E A _ H _

_ _ _ A C O B _ L _ F J _ G M E

_ _ L E _ D I _ _ B N O C F P K

_ _ F C P _ M _ _ _ _ G _ D A O

O K _ G _ _ _ H _ C _ P _ L _ _

F J _ H O _ N I G L _ _ _ _ B D

G _ A M _ _ E _ _ D I B _ P L H

L _ _ _ H J _ _ P F _ N I _ _ _

B I _ _ _ M D _ _ _ _ H _ J _ F

_ G _ N _ _ F _ E P _ A _ B O L

K _ _ _ L E _ A _ G D F _ H _ I

E _ _ _ I _ _ _ K H B L _ _ F _

_ _ H L B _ K _ _ _ _ I _ A _ G

It also does 9×9 (I included tables for both) as a compile-time option.

If this were true, there would be a lot less content on this blog.

I’ll second the idea of selecting difficultly by using crippled solvers to test, or perhaps by waiting until your solver takes between x and y time to prove the puzzle. (Although that makes things strange based on where your solver takes a long time to do something that a human gets quickly).

If you use all the letters in the alphabet minus one (Selected based on input phrase? Shouldn’t be hard), you’d have a 25×25 board with sub-blocks of 5×5.

I also feel like you’d get a performance boost if the puzzle builder cached all the empty squares in one pass. Then your first rule would start “Get the next empty square…” and you won’t spend so much time checking the squares you already filled.

Might also want to tell your solver to just check the row, column, and aligned sub-blocks of the tile you just placed, as none of the rest of the board is going to be directly affected.

Does the solver maintain solution state during puzzle generation? If not, that’s another way to speed up puzzle generation. You’ve already got a list of all the squares, and which numbers they can contain. Instead of looking for just any empty square, sort your list of tiles by the number of possible numbers that can go in them, and choose one of those numbers. When the most options you have anywhere is 1, you’re done!

I just implemented this idea in Python (code through the link), and it generates a 16×16 puzzle in under 50 ms. Just for fun, it can generate a 144×144 pseudoku puzzle in under 30 seconds, but on the down-side, you can’t watch the tiles flying around at the same time. I tried a 256 square puzzle (with 65536 tiles), but it exceeded Python’s maximum recursion depth.

Something interesting I noticed about my solver (not sure if it is true of pseudoku in general) is that the larger the puzzle is, the more tiles I have to choose before the logical necessity runs away with the solution. For a 4×4 it’s 6 or 7 tiles, just under half of the 16 total. For a 9×9, it’s 40 to 47 tiles, or just over half of the 81. By the time I get to 225×225 puzzles, I’m choosing around 46k of the 51k tiles, close to 90%.

But I haven’t experimented with removing unnecessary tiles, to make the actual puzzle with blanks and stuff. I’m curious to know if the larger puzzles require a larger fraction of starting tiles in order to guarantee that there’s only one solution.

Hi Shamus,

check out the links I put in this comment:

http://www.shamusyoung.com/twentysidedtale/?p=36970#comment-1092981

They should give you a different perspective and may help you.

John.

https://www.youtube.com/watch?v=MlyTq-xVkQE

17 clues is the minimum for a 9×9, apparently. Not sure if there’s an easy way to derive the required amounts for other sizes, or if this would even be useful at all in trying to create puzzles, but.. maybe it’s helpful in some way?

To reiterate on my suggestion for a Monte-Carlo puzzle generator in the last thread, and because I am generally a fan of using (abusing?) the power of randomness and statistics to get around things that may otherwise be much more difficult and/or time-consuming to compute:

How feasible would it be in theory to first fill a board entirely randomly, then going through all the invalid constellations (e.g. 2x the number 4 in the same row) and swapping one of the offending tile with a random tile in a different part (row, column, block) of the puzzle? Would it work by just swapping between different invalid constellations?

I suppose this depends greatly on how many such invalid constellations would appear on average in a randomly filled board, but I think there should be a threshold below which this approach could actually work. And maybe if the initial random board too far above that threshold, it is faster to scrap it and start with a fresh random board, than trying to adjust an especially terrible one.

Or:

What are the odds of a completely random board fulfilling the requirements of a valid sudoku, and how long does it take to generate a random board and then check whether it is valid? In other words, could this actually be brute-forced?

Peter Norvig also has a page about programming a sudoku solver and generator. He generates and solves a puzzle in much less than a second (usually, there are edge cases). Using python, which is typically 100x slower than C.