On Tuesday I talked about messing around with the game Threes! and making variants of the established gameplay. If you didn’t read that entry, then the short version is this: I felt the game was too random, and I experimented with ways of making outcomes more related to player skill and less about the benevolence of the random number generator. I ran some simulations but didn’t come to any interesting conclusions. Then yesterday I ran a few more, and now I think I have some useful results. But first, let’s get caught up:
It turns out that I wasn’t the only person to think of moving the game to powers of two. Someone made 2048, which is also built around joining powers of two.
One of the points I made was that there were a lot of variants of the game that you could make. Threes! is a charming little game, but the given mechanics could be altered to make dozens of different games. What if tiles moved as far as possible, instead of just one space? How do you handle multiple combines in a single move? Where are new tiles added? How far in advance should the player see upcoming tiles? What if the goal was reversed, so you wanted to fill the board in as few moves as possible? What if we added powerups or space-clearing combos, like we see in Chime, Bejeweled, or Lumines? What happens if we make the board larger? What if the player is told to achieve a win state, rather than delaying an inevitable lose state?[1] What if we give the user an undo button?
And so on.
Basically, we could make a hundred different games here. Some will be okay, some might be great, and many will be terrible, random, or boring. A big part of game design (in any genre) is in being able to figure out which mechanics will lead to stimulating play. So how do we find the right design?
Some people pointed out that I was probably focusing too much on randomness in my last post. We shouldn’t be worried about how much the outcomes diverge from each other, but how much difference we see between high-skill and low-skill outcomes. It’s fine if I score 300 one game and 7,000 another, as long as someone randomly mashing keys doesn’t best my score, and I don’t score better than someone with far more skill.
So for the purposes of this test, I’ve written five AI players:
- Circle: An idiot AI that will move up, right, down, left, over and over, until the game ends.
- Step: Another idiot AI that will move right, up, right, up, right, up, over and over, until the game ends. (To avoid getting stuck, it will move left if right is blocked, and down if up is blocked. But it makes no effort to maximize score.)
- Basic: Our first “real” AI. It looks at the possible outcomes of the next move and chooses the move that clears the most screen space and puts the most combine-able tiles beside each other.
- Improved: Same as above, but the AI looks two moves into the future instead of just one. (So, it will make a less ideal move now if it will lead to a really great outcome the turn after.)
- Advanced: Same as above, but it looks a total of 4 moves into the future.
This AI is pretty rudimentary, but I didn’t want to spend all dang day coding and testing AI. This should be good enough for the purposes of our test.
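To make the Basic / Improved / Advanced distinction concrete, here is a minimal sketch of a depth-limited lookahead player. It is not the code used for these tests: `apply_move`, `evaluate`, and `choose_move` are hypothetical names, the heuristic (empty cells plus adjacent matching pairs) is only a guess at "clears the most screen space and puts the most combine-able tiles beside each other", and the placeholder move rule ignores incoming tiles entirely.

```python
MOVES = ("left", "right", "up", "down")

def slide_row_left(row):
    """Placeholder rule (an assumption): slide fully left, merge equal neighbours once."""
    tiles = [t for t in row if t]
    out, i = [], 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return tuple(out + [0] * (len(row) - len(out)))

def apply_move(board, move):
    """Return the board (tuple of tuples, 0 = empty) after a move. No new tile is spawned."""
    if move == "left":
        return tuple(slide_row_left(r) for r in board)
    if move == "right":
        return tuple(slide_row_left(r[::-1])[::-1] for r in board)
    cols = tuple(zip(*board))  # transpose so columns can be treated as rows
    if move == "up":
        moved = tuple(slide_row_left(c) for c in cols)
    else:
        moved = tuple(slide_row_left(c[::-1])[::-1] for c in cols)
    return tuple(zip(*moved))

def evaluate(board):
    """Heuristic: number of empty cells plus number of adjacent equal pairs."""
    empty = sum(cell == 0 for row in board for cell in row)
    pairs = 0
    for grid in (board, tuple(zip(*board))):
        for row in grid:
            pairs += sum(1 for a, b in zip(row, row[1:]) if a and a == b)
    return empty + pairs

def lookahead(board, depth):
    """Best heuristic value reachable within `depth` further moves."""
    if depth == 0:
        return evaluate(board)
    values = []
    for move in MOVES:
        nxt = apply_move(board, move)
        if nxt != board:  # skip moves that change nothing
            values.append(lookahead(nxt, depth - 1))
    return max(values) if values else evaluate(board)

def choose_move(board, depth):
    """depth=1 is 'Basic', depth=2 'Improved', depth=4 'Advanced'. None if stuck."""
    best_move, best_value = None, None
    for move in MOVES:
        nxt = apply_move(board, move)
        if nxt == board:
            continue
        value = lookahead(nxt, depth - 1)
        if best_value is None or value > best_value:
            best_move, best_value = move, value
    return best_move
```

Calling `choose_move(board, 1)`, `choose_move(board, 2)`, and `choose_move(board, 4)` gives the three "real" players from the same function. The only thing that changes is how far ahead it searches, which is why deeper players can accept a worse board now in exchange for a better one later.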
So now let’s do a run of games using the original Threes! rule set. You need to combine 1 and 2 to make three. Tiles move one space. You can only see 1 upcoming tile. We’ll have each of the AIs play ten games. Note that they will play the exact same ten games, with the random number generator offering the same[2] tiles.
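One common way to give every AI "the exact same ten games" is to seed a separate random number generator per game number. The sketch below is an assumption about how that could be done, not the actual test harness; `draw_tile` and `play_draws` are made-up names, and the tile pools are illustrative. Each turn consumes exactly one random number, so two players stay in sync even on a turn where their boards allow different tiles.

```python
import random

def draw_tile(rng, pool):
    """Consume exactly one random number and map it onto the tiles currently allowed."""
    return pool[int(rng.random() * len(pool))]

def play_draws(game_number, pools):
    """Return the tiles drawn in one game, given the allowed pool on each turn."""
    rng = random.Random(game_number)  # same game number -> same stream for every AI
    return [draw_tile(rng, pool) for pool in pools]

# Two players in game 3 with identical pools see identical tiles.
basic = play_draws(3, [(1, 2, 3)] * 5)
advanced = play_draws(3, [(1, 2, 3)] * 5)
assert basic == advanced
```

If one player's board unlocks a bigger tile on some turn, only that turn can differ. The draws before and after it still line up, which is the "mostly the same" behavior described in the footnote.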
The result:
A comparison of the ten games. The Y axis is the number of moves the AI lasted before losing the game. The X axis is the game number in our series of 10.
As expected, Circle and Step are at the bottom, and one doesn’t really beat the other in a meaningful way. But as many players have noted, it’s often quite possible for random play to best intelligent play. There’s a lot of overlap between the random AI and the “good” AI. Sure, good AI beats dumb AI on average, but this chart shows the phenomenon players have described: Play one game as best you can, then beat that score by mashing the keys randomly.
(I should note that the improved and advanced AI are at a disadvantage here. They’re attempting to look 2 and 4 moves into the future. However, according to the original rules you can’t see beyond the next piece. So they’re… “guessing”. This might result in an AI that makes moves in preparation for pieces that never show up. I’ll leave it to you to decide if their numbers are useful here.)
So now let’s switch to this alternate rule set:
- Players can see the next 4 incoming pieces, as well as see where they will appear.
- Instead of needing blue and red tiles to make your first 3, players can simply join like-with-like.
- When sliding tiles, they will slide as far as they are able, instead of moving 1 space.
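The difference between the two movement rules is easiest to see on a single row. This is a rough sketch based on my reading of the rules as described here, not code from either game; `threes_row_left` and `alt_row_left` are hypothetical names. In the original rule, every tile moves at most one space and only the merge nearest the wall happens; in the alternate rule, tiles slide as far as they can and any equal neighbours join.

```python
def can_merge_threes(a, b):
    """Original rule: 1 joins with 2; anything 3 or larger joins with its twin."""
    return a + b == 3 or (a == b and a >= 3)

def threes_row_left(row):
    """Move one row left by a single space under the original Threes! rule."""
    row = list(row)
    for i in range(1, len(row)):
        left, cur = row[i - 1], row[i]
        if cur == 0:
            continue
        if left == 0:
            row[i - 1] = cur            # step into the empty cell
        elif can_merge_threes(left, cur):
            row[i - 1] = left + cur     # merge into the neighbour
        else:
            continue                    # blocked, try the next tile
        row[i:] = row[i + 1:] + [0]     # everything behind shifts one space too
        break
    return row

def alt_row_left(row):
    """Alternate rule: slide as far as possible, joining like with like."""
    tiles = [t for t in row if t]
    out, i = [], 0
    while i < len(tiles):
        if i + 1 < len(tiles) and tiles[i] == tiles[i + 1]:
            out.append(tiles[i] * 2)
            i += 2
        else:
            out.append(tiles[i])
            i += 1
    return out + [0] * (len(row) - len(out))

# The same row under each rule:
# threes_row_left([0, 3, 0, 3]) gives [3, 0, 3, 0]
# alt_row_left([0, 3, 0, 3])    gives [6, 0, 0, 0]
```

A single move does far more work under the alternate rule, which is part of why a crowded board can be recovered so quickly there.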
The result:
I think this paints a picture of a much better game. We can see a nice gap between random play and deliberate play. (Although I’d love to know what happened to the Basic AI in game 3.[3]) Advanced AI always beats the lower AI, and improved AI beat basic AI 8 out of 10 games. The really big games belonged to the Advanced AI, and the Random AIs were shoved down to the bottom of the chart where they belong.
I think this makes for a more rewarding game. This is one of the reasons people love Dark Souls: You can get better at it. On your first play-through you’ll die a ton of times. On subsequent play-throughs you’ll die less and less. Stick with it, and you can even become good enough at the game to trivialize it. There’s a nice, large delta between the performance of a newbie and a veteran, which isn’t usually the case in a AAA game designed to welcome players of all levels. (I don’t play Dark Souls. I can appreciate the skill-based gameplay, but the journey would be too frustrating for me. High-cost death makes me miserable[4] and angry and I’d likely end up smashing stuff while climbing that learning curve.)
The point is, players often enjoy mastering something and improving their performance over time, regardless of whether the skill is based on logic (Threes!), skill (Mario), or knowledge (Legend of Zelda), so that their victory feels “earned”. They just want a game where they can objectively improve. Going by this criterion, 2048 is a better game than Threes!.
Footnotes:
[1] Roughly, instead of playing to the highest score possible, your goal is to make a single tile worth x points.
[2] Since new tiles are based on what’s already on the board, there will still be differences. If A has a 16 on the board in turn ten, then they might get a 16 tile. If B just has two 8 tiles and hasn’t joined them, then B might get some other tile on turn 10. Other than this, both players should get the same series of tiles.
[3] I could add a way to step through the time-lapse games a bit at a time and painstakingly analyze the game turn-by-turn, but… I’m not going to.
[4] While learning. I actually don’t mind adding permadeath once I’ve gotten good at a game, but punishing death while learning completely enrages me.
A certain Shamus may have forgotten to do a “show more” on this post. You should probably have a talk with that guy.
Yeah, you hurt my mouse-wheel finger by 1 second of extra scrolling;
I demand a refund! :P
On Topic:
I’m really surprised at how quickly you’re able to crank out these games.
Maybe it’d be easier if I actually sat down and tried to make something like this, but it seems like programming a game to follow whatever kind of rules, would take a while to do.
I agree. If I could be bothered to put together my own “boilerplate”, maybe I might do more than think and maybe jot down some ideas and notes about potential games.
Based on your comment, I decided to find out how long it would take me to create a trivial version of a game like this. Turns out, it’s about two hours, using what I know (C# and WPF) and I think it would take about the same time using other languages/frameworks. You really don’t need OpenGL for this.
Sure, my version looks bad, doesn’t count score or detect game over, but it is playable.
So, if you want to play with some gameplay variations just for fun, or something like this, you should be able to get there in an afternoon, assuming you already know programming and something you can write the UI in.
If anyone’s interested it’s about 300 lines of code and it’s at https://github.com/svick/2048.
Screenshot: https://raw.github.com/svick/2048/master/screenshot.png
Yup, I saw 2048 the day after I read your post, thought “hey, maybe I should point this out in the comments!”, went to look again, and found this post :-)
Anyway, I’ve only played both of the web-based applications, but after some 10-20 games of each, I have to say I prefer 2048 over Threes!. However, I can imagine people preferring Threes!, it’s a little more seemingly-random, it looks more interesting. Looking for 1 and 2 to join together “feels” better than “just pairing up”.
One thing I like about 2048: there’s not only a win state (get to 2048), but also a best possible score (namely, having *only* a 2048 tile on the board, and the new tile from the last move). Any score over 16,500 or so means there’s still room for improvement – though I expect only an AI would be capable of getting there.
But the score counter (specifically, the “Best” score it remembers) works on the basis that more is always better. Theoretically, it should be possible to arrange things so that your last move creates multiple 2048 tiles, which would give a higher score and be a bigger achievement than the minimalist ending.
I agree with all your reasons for 2048 being better, plus I’ll add one.
Many people pointed out that Threes! has a tipping point when the board gets to about 70% full where it is impossible to come back.
Conversely in 2048 it is entirely possible to have the board completely full and drop it back to half full or less with just a few moves.
Not to say that your conclusion about randomness is wrong, but the randomness in the AI and player scores may correspond more to how people are playing it than to how random it actually is.
As you already said, the rules are different, and the AI that you essentially designed for the alternative ruleset may simply not be as good at the original.
Same with the players, as dealing with two “useless” pieces (1 and 2) before turning them into normal “useful” pieces may require a different mentality than with the alternative ruleset, where the 2-4-8 progression is much simpler to understand. Having played a couple games of each, I feel it is possible to play Threes much more strategically, even with percentages and everything. It’s just that the other one is much easier to understand.
That’s an interesting point. The seeming randomness of Threes! could be a result of the AI not having the right goals as much as it could be from the game being more random.
I mean, “maximize available moves” sounds like a good scoring metric on paper, but I wonder if a more nuanced metric would yield different results?
For the original Threes! I’d like to see an AI that’s focused on creating the highest number of possible combinations while blocking as many rows-columns as it can. Programming the smart AIs to constantly clear the board seems disadvantageous; the clearer the board, the less predictable the next tile’s location becomes.
If you let it see every piece in the game in advance (or a high enough number that it’s effectively all the way to the end), would the AI always play a perfect game?
Assuming Shamus programs it correctly, and you have a ginormous* amount of memory and processing power, yes the computer will always win or tie**, with perfect information.
Which is why Skynet is always a real danger in the back of my mind. ^^;
* These types of problems usually use up memory/CPU either exponentially with regards to the size of the input, or polynomially but with a big exponent on the biggest polynomial.
Been a while since I studied AI, so I can’t remember the specifics, but it’s definitely a many-branches type problem…which makes me think I shouldn’t have hedged my bet, and just said it was exponential. :P
** In a single player game like this, I guess there is no way to “tie”.
Luckily, perfect information is impossible to obtain in the real world.
Perfect, yes, but sensors, cameras, and other information-gathering devices continue to get better and cheaper.
So, “good enough to enslave humanity” is still a possibility.
Fear Guardian!
Fear Colossus!
Not really, since even the best camera today coupled with the best processor, graphics card and software still doesn’t come even close to the human brain in terms of efficiency. And that’s just the sensory part of the equation, the one we have developed the most. We are centuries away from machines posing a hint of a threat.
Which is a shame, because damn it, I want my robot overlords!
From an intelligence standpoint, that may be true (predictions being hard, especially about the future), but in terms of brute force, it could happen with today’s technology.
Bots that mine minerals, bots that make bots, bots that kill every non-bot thing that moves. It would need a bit of a head start, but then it would be hard to stop.
But that’s not really Skynet-like in any meaningful sense.
Relevant xkcd:
https://xkcd.com/1046/
Depends on what you tell it to do. That “up right down left” AI wouldn’t play any better if it knew what the next thousand tiles were going to be.
OK, yeah, I should have clarified that.
A hamster with the totality of Shakespeare is no better than a hamster with an equal amount of blank paper. ^^;
Either beats the hamster without any paper at all; at least the floor is clean.
…don’t use Shakespeare to beat hamsters.
Yes, follow Either’s example and beat your hamster bare handed. A clean cage is a happy cage, especially when it was cleaned by mopping the floor with its furry inhabitant.
I hear something in the distance… jackboots?… no… dictionaries slamming shut! Run! It’s the grammar nazis!
…that is the worst misspelling of “gramophone noises” I’ve ever seen.
What happened to Basic in game 3? Easy enough: the random number generator making the tiles was unkind. That happens sometimes. It’s not random if it doesn’t. (:
Did you make all three of those changes in one go, or did you implement them one at a time until you got to a level of randomness that you were comfortable with? It would be interesting to know which changes do the most to reduce the randomness.
Very cool results! Glad to see you still doing AI stuff!
Your last statement seems to say that the “alternate rule set” is the same as that used by 2048. Is this the case? If not, how similar is 2048 to your example improved rule-set?
“[…] Dark Souls: You can get better it.”
Assuming you meant ‘better at it’?
And interesting post (though really these programming ones have a tendency to be).
(In fact, I’ve written a program that compares the rating I’d give to each individual post, along with some other factors, and I’ve turned the results into graphs (along with some screenshots from development), that I shall post right here, and…)
Took me over an hour to finish the last paragraph and a half because of that video you linked. Speed runners always make me feel bad.
I think of speed runs as something of a curiosity though not something I would aspire to. It’s an interesting picture of cheesing the game or knowing it inside-out but I don’t find that playstyle particularly fun to perform.
What interested me about 2048 was how much better I got at it after watching the AI play it for a minute or so. It settles into a sort of left-right-left-right-join(up or down) pattern, which I found quite useful. Of course you still have to decide when to join and when to collect a few more with your left-rights, but it definitely improved my scores.
This is a reason why I liked FTL so much over some other rogue-likes. It does have a lot of randomness, but the playthroughs are relatively short (compared to, say, Dwarf Fortress or Unreal World), the graphics are both functional and pretty (instead of just functional) and skill, knowledge AND logic are all factors that improve your chances significantly on subsequent runs.
Then there are the ships as both a gameplay-style choice and a second difficulty curve. Some ships are really, really hard to get going, and you need to be very skilled with them and the game to even get past the initial sectors. A newbie will take these ships and will invariably die within a couple of encounters, while a veteran will just know what the ship can and can’t handle and try to minimize damage based on that knowledge.
I love the stealth ship thanks to the more deliberate play style you have with it. The ship’s defense is dictated by an active role of the player early in the game, and certain enemy ships WILL wreck you if you don’t damage certain rooms of theirs right away. Even if just to try and escape. I’ve found the initial rock ship to be very challenging as well in the start, since it only has missiles armaments, meaning it’ll necessarily lose resources in order to destroy an enemy ship and you’ll have to be more inclined to accept surrender, especially when they’ll save you a couple of missiles and the enemy is offering missiles (that won’t necessarily be there once you wreck them).
FTL is indeed pretty cool, but I think that a lot of gamers would be put off by the constant “failure” of death;
Modern gaming has pretty much equated death with “bad” for so long, that a game like FTL, where death is more “meh” is hard to get used to.
Shamus, I have to ask. Do the AI’s play exactly the same game? I mean, say they start playing game n. Do they all have the same start configuration of the board and do they get the same new tile on move k? I assume so, since that would be the best way to see their prowess game to game as well as the average behaviour, if the game n is random at the start.
Yes, same initial setup, same sequence of new tiles. EXCEPT:
New tiles become available based on what’s on the board. So if, on turn 10, A joins a couple of pieces and makes a new tile (say 256) then they can start seeing new, higher-numbered tiles appear. If B hasn’t managed to join two 128’s yet, then B will still be getting lower-ranked tiles.
So their sequence of new tiles is *mostly* the same.
These two posts (Experimenting with Threes!) have once again highlighted your, frankly, spectacular brain. Great.
Howdy! I’m one of the developers of Threes. I enjoyed the post here, you raise an interesting question.
Though, from what I’ve seen, everyone who is decent at Threes (can get a 384 every time, at least) usually beats 2048 in one or two tries. That’s what happened to me at least, so essentially 2048 is a solved game. I’m not saying it doesn’t have value or that makes it “better”. That is totally up to you.
Sort of like if you’re more into checkers or chess. Competitive games incorporate a “random” or hidden-information element with the addition of another player, whereas solitary games often have to hide a certain amount of information. Take Solitaire for instance: not every game of solitaire can be won. There are variations on the game that increase that number, but it seems like people still like Solitaire and with a deck of cards and so much hidden info with the facedown cards, you’ve got a lot of ??? in that game.
But during the 14 months of development on Threes, we really chased after making a challenging game that you could play for years over a long period of time. Toward the end, when we were about to submit the game, our friend Adam Saltsman texted me late one night that he’d basically solved Threes. He’d gotten a 768 and felt like he could continue forever with his corner strategy. At the time we allowed movement of a new card onto the board when there weren’t any possible moves as well as showing 3, 6, 12 cards in the next slot. This proved too powerful and we made some adjustments that were later adjusted further with the addition of the “+” on the next card. This makes for a game that you can definitely get better at (when focused, I always at least get a 384) while still feeling challenged to push into that next level (a 768, 1536 or whatever card is one past your highest). It takes a surprising amount of attention, patience and strategy to get a 1536 in Threes. I’ve only had one.
Anyhow, thanks for the post. :)
Thanks for the perspective, and for the history. From what you’ve said, it suggests that giving the player more information makes the game less random early on, at the expense of making it a dead-end for advanced players. Very interesting tradeoff.
Sadly, I am not nearly good enough to “solve” 2048. :)
Yea that’s a big part of it. Having an incredibly enthralling game for beginners AND advanced players from start to finish is a bit of a pipe dream, but shooting for that target is where that “flow” comes from. It really is a delicate balance and you’ll never make a “perfect game” but you do your best. Some say Poker is the closest, or maybe Go. Either way, good thing we have so many games to play that can scratch our various and unique itches. :)
I found 2048 quite doable. Focus on pressing down and right, pressing left when stuck. This causes high numbers to accumulate in the bottom right corner. Just keep smaller numbers always to the up and left of larger numbers, and you can consistently combine them downward and rightward for most of the game. You may have to break from this pattern at the very end, but it will get you a long way.
Since there’s a developer here, I feel it’s worth mentioning that the method you’ve used to generate the next tile doesn’t match what Threes! uses (and the developer can confirm, as also shown in their publicized devlog emails).
It seems you are simply randomly generating tiles, whereas Threes! uses a bag system like Tetris. According to experimentation, it’s 4 each of one, two, and three tiles, along with one extra tile not in the above that can be worth up to 1/8th of the highest tile on the board which shows (maybe) every other bag. If 1/8 doesn’t amount to at least 6, though, then the extra tile is not added at all.
So every twelve/thirteen tiles, the next twelve/thirteen tiles are predetermined. This means it’s possible to receive 8 ones before seeing a two (four ones at the end of one bag, and four ones at the start of the next), but highly unlikely; it’s 0.2% to even see four of the same tile in a row at the start of a bag in the first place. Even in such a case, you’d have to have some serious board mismanagement to not have already combined some number of the ones/twos already on the board, and you are guaranteed to receive four twos within the next 8/9 tiles.
Additionally, new tiles will only show a column/row where tiles have moved. If there is only one tile in that row/column and it is already against the wall, moving towards the wall will never spawn a tile in that row/column.
All that being said, these factors will have little bearing on how the AI performs unless it is tweaked to be aware of the rules and can make guesses based on the probabilities of the remaining tiles in the current bag, which WILL eventually show even if they guess wrong. Once it’s done, though, I suspect the smart AIs will actually start out-performing the dumb AIs.
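For anyone who wants to check the bag arithmetic above, here is a small sketch. It assumes the bag is exactly as described in this comment (four each of 1, 2, and 3, shuffled) and leaves out the occasional bonus tile; `make_bag` and `prob_bag_starts_with_four` are illustrative names, not anything from the game’s code.

```python
import random
from fractions import Fraction

def make_bag(rng):
    """One shuffled bag: four 1s, four 2s, four 3s (bonus tile omitted)."""
    bag = [1] * 4 + [2] * 4 + [3] * 4
    rng.shuffle(bag)
    return bag

def prob_bag_starts_with_four(copies=4, bag_size=12):
    """Exact chance that the first four tiles of a bag are all one specific value."""
    p = Fraction(1)
    for k in range(copies):
        p *= Fraction(copies - k, bag_size - k)
    return p

p = prob_bag_starts_with_four()
print(p, float(p))  # 1/495, roughly 0.002
```

That works out to 4/12 × 3/11 × 2/10 × 1/9 = 1/495, or about 0.2%, which matches the figure quoted. Eight ones in a row needs that to happen at the end of one bag and again at the start of the next.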
New interesting wrinkle: 3D version playable at http://joppi.github.io/2048-3D/ . Envision the three panes on top of each other…
Ok, if it hadn’t been for these posts on Threes, and now 2048, I would have dismissed the FB post that led me to the WONDERFUL adaptation of 2048. I give you:
The Doctor Who Edition!
Just wanted to leave this here: it’s a site with all kinds of variations of Threes! and 2048. Double boards, tetris-style falling blocks, whatnot. Enjoy :)
http://get2048.com/
I just have to share this lovely version:
http://louhuang.com/2048-numberwang/
If you still haven’t tried it, I’d recommend at least giving Dark Souls a go. I wouldn’t call dying in it a high cost; just non-zero. The only way to lose big is to die twice in a row, and early on you don’t have much to lose even in that worst case. (you only lose unspent souls, and early on you’ll probably want to spend them rather than save them anyway)