It’s been a running joke for a couple of years that half the games coming out have the word “Dead” in the title[1]. Dead Space, Dead Island, Deadlight, Left 4 Dead, etc. So it got me thinking: Just how common is the practice, really? Is the word “Dead” really as played out as it seems, or is this a case of confirmation bias run amok? Aside from “dead”, what are the top overused words in game titles? Are there any overused words that we just don’t notice?
So I’m going to find out. Since I don’t want to run through and manually enter the name of every videogame ever made, I need a way to automate this. The path of least resistance seems to be to use Steam’s library. Being a PC platform, Steam is obviously missing a ton of games. But this should be close enough for our purposes. This isn’t science, it’s trivia.
Sadly, I can’t find a clean way to extract a full list of titles from Steam. The closest I can come is this file, which looks kind of promising at first. But there’s no way of knowing how old the list is, or if all games are listed.
Worse, the list includes a lot of non-game stuff like DLC and trailers. Which means that if there was a game called Dead Shooter, then it might appear several times in our list like so:
Dead Shooter Guns Pack 1
Dead Shooter
Dead Shooter Release Trailer
Dead Shooter Launch Trailer
Dead Shooter Beta
Dead Shooter Guns Pack 2
Dead Shooter GOTY Edition
Dead Shooter Brady Guide
Ugh. I feel strongly that we should not count “Dead Shooter” EIGHT TIMES. There’s only one game called Dead Shooter so it should only count once.
In an ideal world I could just query the Steam database and filter out things like trailers, betas, and DLC. But apparently the paranoid people at Valve don’t want to allow open database access to all the random strangers on the internet? (The nerve!) So we have to settle for parsing this text file and trying to untangle it ourselves.
Some DLC has the word DLC in the title. But not all of it. Sometimes it will use the word “Pack”. Sometimes there won’t be any special descriptor at all: “Dead Shooter – More Guns”. There’s no way to know if that’s DLC, a trailer, a sequel, or a special release of the game without going to the Store page and looking at it. And I’m not going to manually inspect all 15 thousand games in this list. Sorry[2].
Our goal:
- Parse a large text file and strip out all the crap that isn’t the title of something.
- Remove everything that can be identified as DLC, beta, trailers, etc.
- Do a word frequency count on the remaining titles.
- A few words shouldn’t be tracked. Sequel numbers aren’t very interesting for this project. Neither are words like “the” and “of”.
- Find the top N most overused words and list them, along with their games.
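For illustration only, the steps above might be sketched like this in Python. This is not the script used for this post (which isn't shown); the marker list, the stopword list, and the function names are all assumptions made up for the example.

```python
import re
from collections import Counter

# Hypothetical lists for illustration; real ones would need a lot of tuning.
NON_GAME_MARKERS = {"dlc", "trailer", "beta", "pack", "guide"}
STOPWORDS = {"the", "of", "a", "an", "and", "in", "to"}


def words_in(title):
    """Lowercase a title and split it into alphabetic words (digits are dropped)."""
    return re.findall(r"[a-z]+", title.lower())


def is_probably_game(title):
    """Reject titles that contain an obvious non-game marker word."""
    return not NON_GAME_MARKERS.intersection(words_in(title))


def top_words(titles, n):
    """Count each word at most once per title and return the n most common."""
    counts = Counter()
    for title in titles:
        if not is_probably_game(title):
            continue
        # set() so a word repeated inside one title still counts once
        counts.update(set(words_in(title)) - STOPWORDS)
    return counts.most_common(n)


titles = [
    "Dead Shooter",
    "Dead Shooter Guns Pack 1",
    "Dead Shooter Release Trailer",
    "Dead Island",
    "Left 4 Dead",
]
print(top_words(titles, 1))  # [('dead', 3)]
```

Note that this only catches entries that announce themselves. Something like “Dead Shooter GOTY Edition” or “Dead Shooter – More Guns” would still slip through and be counted as a separate title.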
We need to find the best most convenient tool for this particular job.
I actually own tools like this, except my set also includes a special screwdriver that is always magically the kind I don’t need at the moment.
So we need to use a language or a script that’s good at doing complex conditional text parsing. I know some people will say Perl script is a good tool for this job, but I can’t tell the difference between valid Perl and something I typed with my face[3]. This could also be a good job for Python, but I don’t know Python at the moment. I always find myself in one of two states:
- This is probably a good job for Python, but I don’t want to stop working on this project to learn a new language. I’ll learn it later when I’m not so busy.
- Man, I really ought to learn Python. But I don’t need it for anything right now. I’ll wait until I have a project I can use it on.
So I guess I’ll use everyone’s favorite language, PHP. And to be fair, I think PHP is a good fit here. When you don’t care about stability, security, or performance, and where code maintenance isn’t a concern, then PHP can often be a decent choice.
So I write some PHP to chew through the text and give us the goods.
The result? A complete mess. As usual, it’s Activision’s fault.
The Call of Duty franchise has a ridiculous amount of DLC and trailers, none of which have “DLC” or “Trailer” in the title. So we get page after page of stuff like, “Call of Duty, Call of Duty Singleplayer, Call of Duty 2, Call of Duty 2 Singleplayer, Call of Duty: United Offensive, Call of Duty: United Offensive Singleplayer.” This shoots both “Call” and “Duty” to the top of the list. There are “only” about 12 CoD titles on the PC, but my list is showing nearly one hundred. And there’s no good way to filter this except to remove everything with “Call of Duty” in the title. And that’s fine. Without this one franchise the words “Call” and “Duty” aren’t common at all and shouldn’t appear on our list.
Other notable offenders are Company of Heroes and Total War, which pollute the list with a ridiculous flood of DLC. The last major offender is the Train Simulator games, which have a million little DLC packs that all have the word “class” in the title.
I filter all that crap out. We’re looking for overused words in game titles, not over-monetized games.
There’s one last round of filtering we need to do. We need to remove a lot of descriptive words: Gold, Steam, Online, Episode, and Game. Also sequel numbers and years. Those words are really common, but while “Dead Shooter” seems like it’s overusing the “dead” word, I don’t think anyone minds when an MMO is called “X Online” or when a re-release is called “Gold Edition”. And “Shoot Guy 2014” is arguably just as good a title as “Shoot Guy VII” and “Shoot Guy 7”. All we care about here is the “Shoot Guy” part, not the sequel identifier. These words are descriptive and helpful to the consumer and I don’t think it’s fair to count them as overused. They make game titles less confusing, while putting the word “Dead” in everything makes them more confusing.
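As a rough sketch (Python again, and again not the actual script), the sequel-identifier part of this filter could look like the following. The Roman numeral range and the helper names are assumptions; the descriptive words are the ones listed above.

```python
import re

# Descriptive words named in the paragraph above.
DESCRIPTIVE = {"gold", "steam", "online", "episode", "game"}

# Roman numerals from I to XXXIX; assuming no sequel goes higher than that.
ROMAN = re.compile(r"^(?=[ivx])x{0,3}(ix|iv|v?i{0,3})$")


def is_noise(word):
    """True for descriptive words, sequel numbers, years, and Roman numerals."""
    w = word.lower()
    return w in DESCRIPTIVE or w.isdigit() or bool(ROMAN.match(w))


def core_words(title):
    """Return the words of a title with the noise words removed."""
    return [w for w in title.split() if not is_noise(w)]


print(core_words("Shoot Guy 2014"))  # ['Shoot', 'Guy']
print(core_words("Shoot Guy VII"))   # ['Shoot', 'Guy']
print(core_words("Shoot Guy 7"))     # ['Shoot', 'Guy']
```

One known weakness: a title that uses a bare “I”, “V”, or “X” as a real word would lose it, since the pattern can't tell a numeral from a letter.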
So after filtering out as much noise as I can, here are the top 20 most overused words in game titles:
1. World – 129 titles
2. Dark – 120 titles
3. Star – 107 titles
4. Space – 98 titles
5. Quest – 89 titles
6. Battle – 89 titles
7. Dead – 79 titles
8. Magic – 78 titles
9. Black – 78 titles
10. Ghost – 76 titles
11. Wars – 72 titles
12. Simulator – 66 titles
13. City – 63 titles
14. Kings – 62 titles
15. Dungeon – 61 titles
16. Rise – 61 titles
17. Dragon – 57 titles
18. Deluxe – 56 titles
19. Maker – 54 titles
20. Evil – 54 titles
So that’s something of a surprise to me. I hadn’t ever noticed the overuse of “World” or “Space”. And “dead” – which I expected would be one of the big offenders on this list – is fighting for seventh place.
The list isn’t perfect. I think Crusader Kings DLC is propping up #14, and I noticed #20 counts all games with both “Evil” and “Devil”. “Ghosts” is propped up by Call of Duty: Ghosts and its endless flood of trailers and DLC. I could probably find other flaws in the list if I went digging for them, but this basically satisfied my curiosity. If you want to see it in more detail, here is the top 20 list, including the games.
Footnotes:
[1] Also, games with ‘half’ in the title are dead.
[2] …that you’re demanding something so unreasonable. What’s your problem, anyway? Sheesh!
[3] Is there a difference?
Time to make Dark Star the Battle Quest of Space World – Black Magic of the Dead edition
I’d like to also suggest someone make “Dark-Star World: Quest to Battle the Dead Ghost Kings of the Black Magic Space Wars – Evil Deluxe Dragon Edition”.
Obviously I wasn’t thinking big enough with my “World of Dark Star: Space Quest”.
Also it turns out that this (http://en.wikipedia.org/wiki/Darkstar:_The_Interactive_Movie) is a thing.
Dark Star *would* make for a fairly decent setting/inspiration for a Space Quest game.
Well, for certain values of ‘fairly’
Also certain values of ‘decent’.
Also I’m not sure I’d use the word ‘inspiration’ in conjunction with this project.
I am so disappointed that it is not an interactive version of *this* movie.
http://en.wikipedia.org/wiki/Dark_Star_%28film%29
I would play a game that held up to that title. But what is weird is that even though I know it’s just a hodgepodge of popular words, my brain still broke it down into a genre. Though it doesn’t even exist and I don’t even know what it could possibly be like to play.
Sci-fi RPG. For those wondering.
To be honest “Black Magic of the Dead” is a pretty metal subtitle, I would play a game that had that subtitle.
It would be Korriban from KOTOR.
The title is so out there that it definitely sounds like a conscious use and it’s been done a lot in recent years (a few off the top of my head: Deep Dungeons of Doom, The Mighty Quest for Epic Loot, Holy Avatar vs the Maidens of the Dead). It’s interesting to see this naming device still works, as I do feel it draws my attention.
“World of Dark Star: Space Quest” was what I had in mind too. The funny thing is it actually sounds like a very plausible mobile game.
Star World: Dark Space: Black Magic Battle Quest.
“Dark Space World Star” sounds pretty cool, when de-contextualized from existing titles.
Why didn’t you just use Wikipedia? It has plenty of lists.
For example, there is a list of games using Steam authentication.
And more importantly, a bunch of lists of all the games made since forever.
I can’t help but notice that “Lists of Lists” – of which the lists of video games lists is one – is a disturbingly large category. I can only hope that it doesn’t grow so large as to itself need to be subdivided…
Listception!
But, to be fair, lots of those lists overlap. What you need is just the list of Windows games (because PC master race), and then just go through all the alphabetical lists and gather them together.
There’s also the List of lists of lists.
Which, of course, includes itself.
I think they should create a list of lists that don’t include themselves.
That wouldn’t be logical.
That might be difficult.
Yes, thanks for explaining the joke.
For more fun, make a list of all the lists that do include themselves.
It’ll start an edit war over whether the list should be included in itself, and whoever edited last will be right every time!
The one thing I can say in defense of the Steam approach is the Wikipedia approach (depending on how the list is constructed) could be biased based on what is notable enough for people to think to include. A Steam list, if it’s done by database query, lists the games people might not think of. Of course, it has its own biases based on the particulars of the demographic that uses Steam (and the types of developers that would use Steam: indies and the AAAs that can afford multiplatform development).
It would be interesting to see what the list looks like for Google Play or Apple Store. I suspect it would probably be heavily influenced by copycats trying to cash in on the more successful mobile games.
Are you sure? The people at SteamDB suggest that:
They’re using SteamKit2, a .NET library accessing the Steam API.
I happen to have some experience acquiring data from Steam so here’s my crack at the problem.
1. go to Steam search and set up a filter to show only “Games” without DLC and other crap
2. we want to grab data from all pages. Unfortunately, the search results page handles page flipping through JavaScript, so we can’t just grab page sources with curl – instead we need something like this script which returns page source after JS has done its magic
3. run this in a Bash terminal:
for i in $(seq 1 184); do curl-phantom.js “$(echo “http://store.steampowered.com/search/?snr=1_4_4__12&term=#sort_by=Name_ASC&category1=998&page=$i”)” | grep ” | sed -r ‘s|.*>([^<]+)<.*|1|'; done
(nevermind, the comment form mangles it badly)
4. now we have a full list of game titles
5. some more rounds of sed, sort and uniq to get a list of most frequently used words (less sanitized than Shamus’ by choice):
190 Edition
99 War
64 World
63 Game
55 Gold
47 Wars
47 Simulator
46 Space
46 Heroes
46 Dark
44 Deluxe
37 Star
37 Dead
34 Battle
33 Super
33 Quest
29 Life
28 Lost
28 Adventure
27 Time
So World War Simulator: Gold Edition and Heroes of Dark Space Deluxe are both more played out than Super Dead Battle Quest?
Interestingly, if you do a search within Shamus’ list, there isn’t very much overlap between the most-used titles. There’s no “Dark World” or “Space Magic” or things like that.
Almost all sites that use JavaScript have some form of pure HTML fallback solution when accessed by a browser with JS disabled. I just tested this myself on the steam page you linked to, back/forward and page numbers are just regular hyperlinks.
Right, I missed that. In this case regular curl can do the job after all. However, there are pages on Steam that absolutely do require JS, such as the community market listings page, which is why I made the leap in the first place.
This is where python would be handy. Really, “war” and “wars” should both be counted the same, but that means stemming the results before counting. I don’t know if it’s easy with a bash script, but I’ve made a web-spider word-counter for a sociology paper about FOSS web pages before, and stemming was really easy there…
After de-pluralizing, lowercasing (which makes little difference) and removing “edition”, “game”, “gold” and “online” the top looks like this:
148 war
75 world
56 hero
49 dark
48 star
47 simulator
46 space
44 deluxe
38 legend
38 dead
35 dungeon
34 king
34 battle
33 super
33 shadow
33 quest
30 lost
29 life
29 adventure
27 time
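A deliberately naive Python sketch of the de-pluralizing step is below. It is not what produced the list above (those numbers came from sed, sort and uniq); the suffix rules and the function name are assumptions picked only to show the idea.

```python
from collections import Counter


def depluralize(word):
    """Naive singular form: handles -ies, -oes/-xes/-ches/-shes, and plain -s."""
    w = word.lower()
    # Leave short words and words like "class" alone.
    if len(w) <= 3 or w.endswith("ss"):
        return w
    if w.endswith("ies"):
        return w[:-3] + "y"
    if w.endswith(("oes", "xes", "ches", "shes")):
        return w[:-2]
    if w.endswith("s"):
        return w[:-1]
    return w


words = ["War", "Wars", "Heroes", "Hero", "Kings", "Class"]
print(Counter(depluralize(w) for w in words))
```

Rules this crude get plenty of words wrong (“series” and “chaos” both come out mangled), so for anything serious a proper stemmer such as the Porter stemmer in NLTK would be the safer choice.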
“World War Hero: Dark Star Simulator” 10/10 GOTY
And its first “Deluxe edition” with DLC: Space Deluxe w/ Legend of the Dead Dungeon DLC included.
More importantly, how did she get so good at writing backwards?
By climbing a trellis in the shape of a klein bottle?
“What is this, a whiteboard for ants!?”
The real whiteboard needs to be at least three times as backwards as this.
It actually isn’t that hard to write backwards if you spend a sufficient amount of time doing it. At the start you do need to check the way you write things a lot, so you can repeat the same things in reverse, but it turns automatic pretty quickly. It also isn’t a skill that fades. You pretty much just need to learn to do it once.
Yes, I do get very bored in class, thanks for asking.
And my handwriting is actually better when I write backwards. I blame it on the fact that the world is filled with right-handed assholes, who miss the obvious benefits of writing from right to left.
Yeah, I learned that as well when I got bored in class. That was a few years ago, but I can still do it.
Strangely, I find longhand to be far easier than normal non-connected writing to write backwards. It just flows together and I don’t have to constantly think about which direction this particular letter goes in.
Also, simply practice writing with your off-hand. Most people will naively write backwards with that hand, so once you get penmanship to an acceptable level you can write with either hand.
Bonus: You’ll learn to read backwards as well.
Yeah, Crusader Kings has dozens of $2 cosmetic DLC packs, which are routinely sold as a huge bundle for $10 at Steam sales. They also have a bunch of legitimate expansion packs with features like “Map now 30% larger, includes India”, so it’s difficult to weed out the false positives properly.
You actually get the larger map for free in a patch. The DLC only allows you to play as an Indian ruler.
Looks like Tom Clancy’s Ghost may have a spectral thumb on the scales! :D Some of those titles sound rather intriguing. I wonder what goes on in Happy Wars? They have Fast Food Clerics, apparently! Able to channel the healing power of tacos and such, one assumes.
I wonder if the “AAA” list and “indie” list would be markedly different? (If it were even possible to clearly delineate what counts as which, of course.)
I wonder that too. “Space”, “World”, “Battle”, and “Quest” I associate with older games (mostly pre-2000s, or maybe pre-2006) and a lot of indies seem to be targeting those retro niches underserved by AAA.
As far as perl – it’s hard to tell something you typed with your face from valid perl, but you can write good perl that doesn’t look like something you typed with your face! use strict; and use warnings; help a lot (they tell perl to be more strict about what it considers valid and emit warnings when you input something technically valid but almost certainly not what you want, respectively). Beyond that, it mostly comes down to good code standards (not allowing the crazy facesmash variables except in cases where they have explicit, obvious meanings, like @_ for arguments and $1 in regex matches).
It’s still very easy to end up writing facesmash perl (which is my new favorite name for it, I called it line noise before), which is why I prefer python for most scripting stuff, but when you’re just trying to do something with a giant pile of text, perl’s built-in handling for text is hard to beat.
Indeed. One of the reasons I like Perl so much is because it is such an expressive language that if you really want to facecode, you can (but you’ll hate yourself in the morning). If one is so inclined, you can also code in a more Modern Perl style.
I’m actually not a fan of python; its philosophy is “there is exactly one way to do it” and attempts to get everyone to write clean “pythonic” code in that paradigm, but it is entirely possible to write terrible code in any language regardless of forced whitespace. Also it doesn’t have lexicals! (or, technically I think there was progress made towards that in python 3, with a weird ‘nonlocal’ keyword, but that doesn’t help avoid the sort of bugs I would hit if I just had a ‘my’ or ‘var’ keyword for every declaration…)
Edit: Oh, and in regards to use of $1 for regexp matches, these days I avoid using them for anything but the most trivial of regexps; use the named captures introduced in 5.10, ideally in combination with /x for whitespace-heavy pretty-printed patterns. (?<areacode> ⧵d{2,4} )(?<phone> ⧵d{8} ) for example, captures to $+{areacode} and $+{phone}. Much less room for mistakes.
Edit Edit: Had to use ⧵ (U+29F5) instead of backslash to avoid the obvious in hindsight comment munging designed to prevent nasty hackers from persuading PHP to do stupid stuff.
I find it funny that whatever unicode escape you used is not supported by whatever character set this random laptop I’m viewing it from has installed. I just see a bunch of boxes.
TOR did something similar with Sci-Fi and Fantasy book titles. Theirs was limited to titles from the previous decade. The post was from 2011, so I assume it covered 2000-2010.
http://www.tor.com/blogs/2011/03/best-of-the-decade-data-common-words-in-titles
Their Top 10 was
Shadow
Dragon
War
Night
Dead
City
Dark
Blood
Magic
World
And John Scalzi went the extra mile and made http://www.tor.com/stories/2011/04/the-shadow-war-of-the-night-dragons-book-one-the-dead-city-excerpt, which got a Hugo nomination. The author’s response was, and I quote, “AH HA HA HA HA HAH HA HA HA HAH HA HA.”
I don’t know; I would’ve given him a Hugo nomination for “Although that's not actually a legend. That's really just more of an ambition.”. That was pretty genius delivery.
+1 to these comments and that story
I saw it coming and it was still beautiful.
Ah Scalzi, you magnificent bastard :)
You read his BOOK?
Is that a real book? I remember coming across that post, but I assumed it was just a joke because he posted on April 1.
It’s interesting just how large the crossover is between the two lists.
Well, a ton of videogames are fantasy or sci-fi, that probably accounts for a lot of it.
I’d almost be more interested in a bigram analysis. You can use semantic vectors *really* trivially for this job on your current data. One of my students and I then turned that into a quite lovely word cloud (take a look at the appendices in her thesis near the end) using wordcram which allows you to shape the wordcloud to a 2 bit background image. (I.e. you could fit the bigrams on a posterised call of duty cover, for maximum irony.)
Even better, you could use bigrams to create game-naming markov chains and auto-generate some excellently generic video game titles.
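A minimal sketch of such a bigram Markov chain in Python, assuming titles are split on whitespace and using made-up start and end markers. The function names are hypothetical, and this is not the babbler used elsewhere in this thread.

```python
import random
from collections import defaultdict

# Hypothetical sentinel tokens marking the start and end of a title.
START, END = "<s>", "</s>"


def build_chain(titles):
    """Map each word to the list of words seen directly after it."""
    chain = defaultdict(list)
    for title in titles:
        words = [START] + title.split() + [END]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain


def generate(chain, rng, max_words=8):
    """Walk the chain from START until END is drawn or max_words is reached."""
    word, out = START, []
    while len(out) < max_words:
        word = rng.choice(chain[word])
        if word == END:
            break
        out.append(word)
    return " ".join(out)


titles = ["Dead Island", "Dead Space", "Space Quest", "Call of Duty"]
chain = build_chain(titles)
print(generate(chain, random.Random()))
```

Because followers are stored with repeats, common word pairs are drawn more often, and with a small input the generator will frequently reproduce a real title verbatim.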
I love a good markov babbler as much as the next person, but I don’t think it would work terribly well for videogame titles, mostly because most videogame titles are so short. Four words is a fairly long game title…
Hmm…and to test my own theory, I fed Shamus’ list text (unedited) into a markov babbler I found online. Below are the first ~30 titles it spit out.
I’m surprised that most of these are actually fairly short, though obviously you have leakers like “Endless Space Marine…” (which would IMO have been a great title if it had ended after 3 words…)
There are some biases towards reality (“Battlefield 2” and “Dead Island” are both here), which you’d expect with a Markov simulation.
Personal favorites:
Elegy for a Digger Simulator
Doorways: The Deadly Intent
Dead Island Thunder
Empires: Dawn of Islam
and, of course, the classic broswimmer Call of Diving
Top 30 of my list:
Ghost Recon Phantoms – Legend Guide
Avadon: The Outcasts, Space Rangers
Space Program Manager
Fractured Space
Endless Space Marine – 62 titles A Dead – Army of the Rain-Slick Precipice of Loath Nolder
Necronomicon: The Ripper
Call of Diving
Elegy For A Digger Simulator
Agricultural Simulator 2
PAM Space QA
Kerbal Space 2
Battlefield 2
Battlefield Bad Company 2 – The Secret of Ashworld
Doorways: The Deadly Intent
Thief: Deadly Intent
Thief: Deadly 30
Dead Island
Dead Island Thunder
Tom Clancy’s Splinter Cell Blacklist Deluxe Scenario
The Dragon Helm
Spiral Knights: Iron King
Castlevania: Lords of Heroes: Going Rogue
City of Duty: Ghosts – Libro Mission
Construction Simulator 2011: Extended Edition
Blackguards Deluxe Edition
Empires: Dawn of Islam
Crusader Kings II: The Mighty Quest for Epic Quest 2
Commander: Conquest of Magic VI
Magicka: Wizard Wars
If you ever decide to take another crack at it… Here, you’re actually measuring the most *used* words in game titles. To get the most overused, grab a word frequency list from somewhere and compare.
Common choices of word frequency lists include things like the complete works of Shakespeare, which may be slightly odd for this use or may be fine.
Then, divide each prevalence in game titles ratio with the prevalence in English. That should make words like “Black” and “City” drop a bit — they’re pretty common in English, too. But it will make words like “Ghost”, “Dragon” and “Quest” pop up toward the top — they’re not as common in English, but crazy-common in game titles.
Though you’d want to filter to words that have reasonable usage levels in both game titles and English. Otherwise you’d get a bunch of stuff where if a word was used only once in all of Shakespeare but three times in game titles it’ll look crazy-overused.
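Sketched in Python, the comparison might look something like this. The function name and the minimum-count thresholds are assumptions, and the English counts below are invented toy numbers purely to exercise the code, not real corpus data.

```python
from collections import Counter


def overuse_ratios(title_counts, english_counts, min_title=3, min_english=3):
    """Rank words by (share of title words) / (share of English words)."""
    title_total = sum(title_counts.values())
    english_total = sum(english_counts.values())
    ratios = {}
    for word, t in title_counts.items():
        e = english_counts.get(word, 0)
        # Skip rare words so a handful of occurrences cannot dominate.
        if t < min_title or e < min_english:
            continue
        ratios[word] = (t / title_total) / (e / english_total)
    return sorted(ratios.items(), key=lambda kv: kv[1], reverse=True)


# Toy numbers: the title counts echo the list in the post, while the
# English counts and the word "zorp" are made up for illustration.
title_counts = Counter({"black": 78, "quest": 89, "zorp": 3})
english_counts = Counter({"black": 5000, "quest": 200, "zorp": 1, "the": 94800})

ranking = overuse_ratios(title_counts, english_counts)
print([word for word, _ in ranking])  # ['quest', 'black']
```

With these toy inputs “quest” outranks “black” even though their raw title counts are close, and “zorp” is dropped because it barely appears in the reference corpus.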
I didn’t expect a kind of Literary Statistician.
NOBODY expects the Literary Statistician!
I didn’t ask for the Literary Statistician!
Yes! This needs linguistic normalization.
Is Shakespeare’s work really representative of contemporary English?
I sense this post was written because I was complaining about Darkest Dungeon.
You should complain more then, make Shamoose stop his music thing and come back to the darkest depths of the dungeon of programming.
There are some cool sounding games hidden in Shamus’s list. I myself can’t wait for Dark Star World, or Ghost Wars Simulator.
I want to play Ghost Wars Simulator.
I want to play Ghost Simulator. Haunt a huge old mansion, cause mayhem and scare the visitors to death.
You mean like that?
Could you give us a list of actual games that use at least two items from your list? I don’t think it would be too long.
I’m seeing some big false positive spikes on the list. Notably, Darksiders II dlc is holding Dark up there, Space Marine and Space Hulk dlc are inflating Space, and Quest is also counting Conquest.
Is “Darksiders” inflating the “dark” count? I was assuming the script only picks out complete words, i.e. bounded on each side by a non-alphabetic character (space, punctuation, whatever), otherwise “Kings” wouldn’t be in the Top 20 while “King” wasn’t.
The script appears to pick out words that are part of larger words, though I’m guessing Shamus made a specific exception for Kings. Darksiders and Underworld trigger Dark and World respectively on the list.
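The difference between the two behaviours being discussed can be shown with a small Python sketch. How the post's script actually matches words isn't shown, so this only illustrates the distinction; the function names are made up.

```python
import re

titles = ["Darksiders II", "Dark Souls", "Tomb Raider: Underworld", "Devil May Cry"]


def substring_count(word, titles):
    """Count titles containing the letters anywhere, even inside other words."""
    return sum(word.lower() in t.lower() for t in titles)


def whole_word_count(word, titles):
    """Count titles containing the word bounded by non-word characters."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return sum(bool(pattern.search(t)) for t in titles)


print(substring_count("dark", titles), whole_word_count("dark", titles))    # 2 1
print(substring_count("evil", titles), whole_word_count("evil", titles))    # 1 0
print(substring_count("world", titles), whole_word_count("world", titles))  # 1 0
```

Substring matching is what would let “Devil” count for “Evil” and “Underworld” for “World”; the `\b` word-boundary version counts neither.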
It’s a tricky problem; I would argue that while plurals should obviously fold together, and clearly “quest” and “conquest” should not, you arguably do want to count word fragments, so “dark” and “darksiders”, or “craft”, “warcraft” and “minecraft” should count together. This is obviously an order of magnitude more complex.
If there’s a way to get sales data per game, you could get a list of the top-selling words to have in a game title, too.
“Craft” is in both WOW and MC. I’d bet it comes out in the top five.
Protip:
A base game on steam will always have an appId that ends in 0 (the inverse is not true, if something has 10 or more DLC packs then some of them may also end in 0).
This means you can discard anything with an appId that doesn’t end in 0.
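If that rule holds (it is this comment's claim and not verified here), the filter is a one-liner. A Python sketch, with invented appIds and a hypothetical function name:

```python
def likely_base_games(apps):
    """Keep (appid, name) pairs whose appid ends in 0."""
    return [(appid, name) for appid, name in apps if appid % 10 == 0]


# Hypothetical IDs, made up for illustration; not real Steam appIds.
apps = [
    (1000, "Dead Shooter"),
    (1001, "Dead Shooter Guns Pack 1"),
    (1002, "Dead Shooter Release Trailer"),
]
print(likely_base_games(apps))  # [(1000, 'Dead Shooter')]
```

As noted above, this would still let through the occasional DLC pack whose ID happens to end in 0.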
Of course this is probably useless knowledge now as bucaneer seems to have solved it.
In advance I want to say that I am a bit tired and did not have the head to find something that would keep the names of games together to get a nice list for reference. Maybe I will think about it tomorrow.
My suggestion for removing some of the false positives would be the following:
1. ToLower everything
2. Separate by common separators (;, :. , -)
3. Put everything in a list/dictionary/array
4. Clean out all the separator entries and some weird numbering
5. Go through the list and count each word in it
6. Write that in your output array
I know the O notation for that thing is horrid but it should do the job. Maybe there is room for more startup cleanup like ignoring sequels altogether.
Sounds like an interesting project. Have to think about that. :)
Good night to everyone!
I didn’t see “World” being on the list at all, but there it is. Out of nowhere, right under our noses.
You did pretty well leveraging PHP’s string functions, but it looks like it messed up a little on the full list page… “Tomb Raider: Underworld” (all mentions) became “Tomb Raider: Undeworld”.
For my own amusement -here’s how I’d have tackled the program just using a basic stats program (STATA if I were at work, SPSS if I were desperate.)
The best solution would be for the person who created the database to give it a meaningful ID code so we could sort based on that, rather than having to futz with text. (first 5 numbers ID publisher, next 5 ID series, next 5 ID entry in series -do games have something like Dewey Decimal IDs?)
But if we must futz with text, I’d think the best solution would be to extract a string from the titles. If there’s a standard format, you could, say, pull everything before the first semi-colon, or the first 10 characters. Then search for duplicates and drop them. that should reduce the number of cases you’d have to manually check.
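A rough Python version of that prefix idea, assuming the separator is a colon, a semi-colon, or a spaced dash (the exact set of separators and the helper names are guesses for the example):

```python
import re


def title_key(title):
    """Text before the first ':', ';' or spaced dash, lowercased."""
    head = re.split(r":|;|\s[-–]\s", title, maxsplit=1)[0]
    return head.strip().lower()


def drop_duplicates(titles):
    """Keep only the first title seen for each key, in original order."""
    seen = {}
    for title in titles:
        seen.setdefault(title_key(title), title)
    return list(seen.values())


titles = [
    "Dead Shooter",
    "Dead Shooter – More Guns",
    "Dead Shooter: GOTY Edition",
    "Dead Island",
]
print(drop_duplicates(titles))  # ['Dead Shooter', 'Dead Island']
```

The catch is that it also merges real, separate games that share a prefix: “Call of Duty: United Offensive” would collapse into “Call of Duty”, while “Call of Duty 2” would not, so some manual checking is still needed.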
If I could pull up some even more esoteric software, there are programs that can identify roots from a string of text and extract them -which would fix needing either an arbitrary number of characters or a standard format.
Key IDs should never be meaningful. Bad habit to let happen. Probably won’t matter for a throwaway database, but for anything you expect to ever change…
Given that this person is using Stata (and from my own experience) I’m guessing they work more in data analysis than in data storage or database administration.
Keeping away from meaningful IDs is all well and good when you’re designing your own database for a web page from scratch, but when you’re trying to merge the data from 3 different Census samples for a nested multi-level analysis with hundreds of thousands of samples, you thank the goddamn bloody stars if the data sets used consistent key IDs, or you curse the children of the creators if you have to make them yourself.
While it seems to only appear in four titles on Steam, I’ve always found the word “unleashed” to be over-used. This mostly comes (I suppose) from my years of reading comic books. I think every major comic book character has been “unleashed” on a cover somewhere, with Wolverine experiencing an “unleashed” at least annually.
“When Titans Clash”
And the other favorite: “Everybody DIES!”
Sequel to “Rocks Fall”, as I recall…
Makes you wonder about who’s keeping all those super heroes on a leash the rest of the time.
Stan Lee is a dirty old man, that’s for sure.
:spittake:
I would like to institute a general moratorium against the use of the word ‘of’ in game titles, or at least specifically titles in the form of X of Y.
Same should go for the HURK sign.
While I read that article with a content smile on my face, I don’t have any meaningful feedback, other than:
I feel like the picture is some Left for Dead reference, but I don’t get it? Would anyone please explain it?
It’s the second hit on Google images for “left 4 dead 2”. I presume it’s the header because the first paragraph is about games with the word “dead” in the title.
IAMA electrician. This screwdriver is always the screwdriver I need;
http://www.amazon.com/Klein-32500-Screwdriver-Driver-Cushion/dp/B0015SBILG/ref=sr_1_1?ie=UTF8&qid=1424083404&sr=8-1&keywords=klein+9+in+1
Unless of course I need a swivel screwdriver, a ratchet screwdriver, a precision screwdriver, or some kind of Torx or larger nut-driver bit.
Or if I actually need a screw gun, because you don’t want to be there all day, do you?
But apart from these special cases, and also times where I’m dealing with metric-sized hex-heads, or while I’m working a hot panel and am using my insulated screwdrivers instead, I’m using the Klein 9-in-1.
No idea how complicated it would be, but if you could get a filter to eliminate repeating two-word phrases (so Call of Duty counts for “call” and “duty” but Call of Duty 2 doesn’t count for anything) it would weed out all those DLCs and sequels. Not like any two-word phrases come up in unrelated game titles.
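One way this could be sketched in Python: remember every pair of adjacent significant words, and skip any later title that repeats one. The stopword list and function names are assumptions for the example, and the result depends on the order the titles arrive in.

```python
import re
from collections import Counter

STOPWORDS = {"the", "of", "a", "and"}  # hypothetical, minimal list


def significant_words(title):
    """Lowercase alphabetic words with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", title.lower()) if w not in STOPWORDS]


def count_skipping_repeats(titles):
    """Count words, ignoring titles that reuse an already-seen two-word phrase."""
    seen_pairs = set()
    counts = Counter()
    for title in titles:
        words = significant_words(title)
        pairs = set(zip(words, words[1:]))
        if pairs & seen_pairs:
            continue  # shares a two-word phrase with an earlier title
        seen_pairs |= pairs
        counts.update(set(words))
    return counts


titles = [
    "Call of Duty",
    "Call of Duty 2",
    "Call of Duty: United Offensive",
    "Dead Island",
    "Dead Space",
]
print(count_skipping_repeats(titles))
```

Here “call” and “duty” each count once and “dead” counts twice. One-word titles have no pairs, so this filter never removes them.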
…I’ve got to say I’m surprised Shadow didn’t make the cut.
In all fairness, I believe that there are too many games named “Call of Duty”, so maybe it should count.
I should make a No Man’s Sky style game called Dark Space: Star World Quest – you’d be traveling through the darkness of space to find the mythical Star World, of course. On a quest, you might say.
This reminds me of a FUN (in the Dwarf Fortress sense of the word) project I had.
Take 2 excel sheets (because of course we couldn’t use database files) filled with student information–one for the main college applications, and one for need-based funding applications–and merge the two, without any sort of key variable*. Oh and by the way, the same person is quite often in each data set multiple times, because the student, their parents, and their academic advisors all might have made separate applications, none of which were constrained to actually fill in the information in the same manner (there were quite a few “Bob” versus “Robert” differences). This isn’t a small set of data either–on the order of fifty-thousand records, and if you mess one up you just potentially screwed some disadvantaged kid out of their chance to go to college.
What I finally wound up with was a script that looked at name (first and last), address, and application ID**; merged the stuff that matched between the sets; and flagged all the close matches for a human to look through and deal with. Which sounds simple but was a BITCH to iron out.
I have no idea how they do it now, because I know they aren’t using my code. Apparently, the intern before me did all of this by hand. Weekly.
*:In the years before I got this mess, they used SSN as a key variable. Then they stopped collecting them for privacy concerns (reasonable) without establishing any manner of variable that could be used as a key instead (less reasonable).
**:Supposedly the students are assigned a unique application ID when they fill out the main admittance application, which is supposedly required before the student can fill out a funding application. Of course, whoever created the form didn’t deign to validate the admittance application number on the funding application, so it almost never (less than 10% of the time) had a valid number in it. For some reason, most students thought their email went in there (I’m guessing because that’s their log-in name, but I don’t know that for sure).
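For the curious, the merge-and-flag approach described above might look something like this in Python. This is a rough sketch and not the original script: the field names, the 0.85 threshold, and the use of difflib for "close" matching are all assumptions.

```python
from difflib import SequenceMatcher

def normalise(record):
    """Build a comparison key from name and address (field names are hypothetical)."""
    return " ".join(
        str(record[field]).strip().lower() for field in ("first", "last", "address")
    )

def match_records(admissions, funding, threshold=0.85):
    """Pair exact matches; flag near matches for a human to review."""
    merged, flagged = [], []
    by_key = {normalise(r): r for r in admissions}
    for f in funding:
        f_key = normalise(f)
        if f_key in by_key:
            merged.append((by_key[f_key], f))
            continue
        # No exact match: look for the most similar admissions record.
        best, best_score = None, 0.0
        for key, record in by_key.items():
            score = SequenceMatcher(None, key, f_key).ratio()
            if score > best_score:
                best, best_score = record, score
        if best is not None and best_score >= threshold:
            flagged.append((best, f))  # close, but a person should confirm it
    return merged, flagged
```

The inner loop compares every unmatched record against every admissions record, so at fifty thousand rows it would be slow without some pre-grouping. Nickname differences such as “Bob” versus “Robert” would likely need a lookup table on top of this.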
I would guess whatever genius designed the form thought “ID” would make a good label.
Shamus, if you revisit this later, you might want to make sure words are actually words.
Darksiders, for example, is counted as “Dark”, which is wrong; here the word is “Darksiders”.
You should use space (aka whitespace) as a word delimiter. There is a pattern switch for that in PHP (sorry, can’t recall what it was); it’s listed in the PHP documentation.
By the looks of it, Dark and Star get an artificially high count, as there are a lot of words with “Dark” or “Star” in them.
Then there is the question of EverQuest vs Ever-Quest vs Ever Quest.
IMO all three of those are the same: the capitalization implies a dash and thus two words, whereas if it was Everquest then it’s a single word.
Not sure how to do a pattern/rule to handle that though. You could probably do a [A-Z] rule combined with a starting with rule or sub group using () or something.
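One way to express that rule, sketched in Python rather than PHP (the function name and the punctuation list are just assumptions for illustration): split on whitespace and dashes, and treat a lowercase-to-uppercase boundary inside a word as an implied dash.

```python
import re

# A lowercase letter followed directly by an uppercase one marks a word break,
# so "EverQuest" splits but "Everquest" and "Darksiders" stay whole.
CAMEL_BOUNDARY = re.compile(r"(?<=[a-z])(?=[A-Z])")

def title_words(title):
    """Split a title into lowercase words on whitespace, dashes, and CamelCase."""
    spaced = CAMEL_BOUNDARY.sub(" ", title)
    chunks = re.split(r"[\s\-–]+", spaced)
    words = [chunk.strip(".,:;!?\"'").lower() for chunk in chunks]
    return [w for w in words if w]

print(title_words("EverQuest"))   # ['ever', 'quest']
print(title_words("Everquest"))   # ['everquest']
print(title_words("Darksiders"))  # ['darksiders']
```

Counting whole words this way means “Darksiders” no longer adds to the tally for “Dark”.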
This is also the exact same list for heavy metal songs written in the last 40 years.
Regarding database schema lady, password should be the (hash, salt) tuple returned by a decent password hasher like bcrypt.
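A minimal sketch of the idea, with a caveat: bcrypt is not in Python’s standard library, so this swaps in PBKDF2 from hashlib as a stand-in. The iteration count and function names are assumptions.

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune for your hardware

def hash_password(password):
    """Return a (hash, salt) tuple to store instead of the plain password."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return digest, salt

def verify_password(password, digest, salt):
    """Re-derive the hash with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(candidate, digest)
```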
This would also be the list presented to a marketing team when selecting the next flavor of Mountain Dew.
Hah!
This post is excellently timed after the third or fourth time I wandered into a Dead State thread thinking it was about State of Decay!
I thought it would be fun to mess around with this list, so I took the top 10 words and wrote a small program to spit out pairs of them.
Then I spit out every pair with minor criteria to reverse them in cases where that seemed to make sense.
Here’s a list of 58 hit titles for your next video game. (You’ll note most of these are already taken…)
1: Dark World
2: Star World
3: Space World
4: Quest World
5: Battle World
6: Dead World
7: Magic World
8: Black World
9: Ghost World
10: Star Dark
11: Dark Star
12: Space Dark
13: Dark Space
14: Quest Dark
15: Dark Quest
16: Battle Dark
17: Dead Dark
18: Dark Dead
19: Magic Dark
20: Dark Magic
21: Black Dark
22: Ghost Dark
23: Dark Ghost
24: Space Star
25: Quest Star
26: Battle Star
27: Dead Star
28: Magic Star
29: Black Star
30: Ghost Star
31: Quest Space (Suspiciously similar to a hit series from the 1990s. Also one of the few cases where my “Try swapping the words” algorithm missed.)
32: Battle Space
33: Dead Space
34: Magic Space
35: Black Space
36: Ghost Space
37: Battle Quest
38: Dead Quest
39: Magic Quest
40: Black Quest
41: Ghost Quest
42: Dead Battle
43: Battle Dead
44: Magic Battle
45: Battle Magic
46: Black Battle (I’d skip this one, it sounds racist.)
47: Battle Black
48: Ghost Battle
49: Battle Ghost
50: Magic Dead
51: Dead Magic
52: Black Dead
53: Ghost Dead
54: Dead Ghost
55: Black Magic
56: Ghost Magic
57: Ghost Black
58: Black Ghost
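A program along those lines could be as small as the Python sketch below. The word order is inferred from the list above, and since the swap criteria aren’t spelled out, this only produces the 45 base pairs without the 13 reversed extras.

```python
# Top 10 words, in the order implied by the list above (an inference).
TOP_WORDS = ["World", "Dark", "Star", "Space", "Quest",
             "Battle", "Dead", "Magic", "Black", "Ghost"]

def base_pairs(words):
    """Pair each word with every word that comes after it, later word first."""
    return [
        f"{later} {earlier}"
        for i, earlier in enumerate(words)
        for later in words[i + 1:]
    ]

for number, pair in enumerate(base_pairs(TOP_WORDS), start=1):
    print(f"{number}: {pair}")
```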
That table design in the picture is a little bit weird. If you want to allow for multiple products in an Order you would need to place another table in-between Orders and Products that links Orders to products in a One to Many relationship. Thus a User can be linked to many Orders and each Order can be linked to many Products. Database design is important! Especially if the front end shoehorned multiple product orders in despite the database not supporting them directly. Imagine having to unravel this thing into a proper schema after it’s been used for 5+ years. Blah.
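To make the in-between table concrete, here is a small sketch using Python’s built-in sqlite3 module. The table and column names are made up for illustration and are not taken from the picture.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users    (id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE orders   (id INTEGER PRIMARY KEY,
                       user_id INTEGER NOT NULL REFERENCES users(id));
-- The in-between table: one row per product in an order.
CREATE TABLE order_items (
    order_id   INTEGER NOT NULL REFERENCES orders(id),
    product_id INTEGER NOT NULL REFERENCES products(id),
    quantity   INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO users    VALUES (1, 'Bob');
INSERT INTO products VALUES (1, 'Dead Space'), (2, 'Dark Quest');
INSERT INTO orders   VALUES (1, 1);
INSERT INTO order_items VALUES (1, 1, 1), (1, 2, 2);
""")

# List everything in order 1, with quantities.
rows = conn.execute(
    """SELECT p.title, oi.quantity
       FROM order_items AS oi
       JOIN products AS p ON p.id = oi.product_id
       WHERE oi.order_id = ?
       ORDER BY p.title""",
    (1,),
).fetchall()
print(rows)
```

With this layout an order can hold any number of products without the front end having to fake it.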
Oh yea, cool article. Didn’t expect World to be winning, but kinda makes sense.
Shamus, please spend a day learning python. You will be so much happier.
I can tolerate a lot of weird stuff in titles, but I think I’m officially sick of seeing the word “Rise”.
I don’t know if it’s just lazy, or if I’m just having war flashbacks to The Dark Knight Rises.
That woman is awesome! She can write backwards, especially technical material!
Hmm, that seems like rather a hacky way to filter out DLC and the like; I’d probably try poking the Steam API a bit more, see if there’s a page you can plug in an appid and get some per-item data and constrain the list to actual releases that way.
Poor little DLC Quest didn’t stand a chance.