15 Sorting Algorithms in 6 Minutes

By Shamus
on Sep 18, 2013

I don’t have a new entry for Good Robot done, so I’m going to resort to the crutch of delinquent bloggers everywhere and post a YouTube video made by someone else and use it as a conversation-starter.

Computers do a lot of sorting. Alphabetizing names, sorting enemies by distance, arranging a list of polygons from far to near, putting a high score table in order, arranging files according to size or date, ordering search results by word frequency, and a million other things. There are a lot of different ways of sorting lists of values. Some clever person wrote a program to show the sorting taking place and emit audio based on the current state of the list. I find the result kind of hypnotic:

Link (YouTube)

Shamus, why are there so many different sort algorithms?

Good question, me. The reason is that there are a lot of different sort problems. The above sorts look like they’re sorting some really predictable data: A list of all the numbers from 1 to 1,000, or whatever. This is not always the case. Sometimes it is very not the case.

Back in the old days, sorting was usually presented as a tradeoff between speed and memory usage. If you’ve got lots of memory (or the data is small) then you can create space for a copy of the stuff you’re sorting and fill in the new space as you sort. If you don’t, then you have to move the entries around in place. It’s like re-arranging a room. It’s easy if you’ve got another room you can shove some items into, and a lot harder if you have to keep all the furniture in the room as you work. A lot of sorts are optimized for different levels of memory use. Maybe one is slow but only requires room for one additional item. Another is faster but requires enough memory to hold half of the list. Another is faster still but requires a complete copy of the list, thus doubling memory usage.
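
To make the furniture analogy concrete, here’s a rough Python sketch of both ends of the tradeoff: a merge sort that allocates scratch space on the order of a full copy, and an insertion sort that works in place, holding only one item aside at a time. This is illustrative, not a recommendation.

```python
def merge_sort(items):
    """O(n log n), but needs scratch space roughly the size of the list."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    merged = []  # the "spare room"
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]

def insertion_sort(items):
    """O(n^2), but in place: only one item is ever held aside."""
    for i in range(1, len(items)):
        item = items[i]  # the single extra item's worth of storage
        j = i - 1
        while j >= 0 and items[j] > item:
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = item
    return items
```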

Likewise, optimal techniques will vary based on what you’re sorting. If I’m sorting the integers from 1 to 100 and I’ve got a bunch of instances of the number 10 next to each other, I know those values are sorted relative to each other and don’t need to be fussed with. If I find another 10, I can stick it before or after these other tens and it doesn’t matter. But if I’m sorting floating-point numbers like 3.285714285714286 and 12.83333333333333 then I’m probably not going to have large blocks of same-value entries next to each other. Also, I think some sorts make guesses about where items appear in the list based on average or median values. That’s great if you can say, “These values all must fall within the range of 1-100, so I know this value of 80 will be somewhere near the top.” But if your list is (for whatever reason) all the digits from 1 to 1,000 but also there is a value of ten billion stuck in there, then assumptions about ranges or averages could really make a mess of things.
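
For example, if you really can promise that every value falls in a known range like 1-100, a counting sort exploits that promise and never compares two items at all. A minimal sketch, assuming non-negative integers inside the promised range:

```python
def counting_sort(items, lo=1, hi=100):
    """Sort integers known to lie in [lo, hi] without any comparisons."""
    counts = [0] * (hi - lo + 1)
    for v in items:
        counts[v - lo] += 1  # a stray ten-billion would crash right here
    result = []
    for offset, count in enumerate(counts):
        result.extend([lo + offset] * count)
    return result
```

The comment on the tally line is the article’s point in miniature: the algorithm is only as good as the promise you made about the data.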

Then there are concerns about how scrambled the list might be. If you begin with the assumption that the list is complete chaos at the outset, then you might use something optimized for really complex sorting. But what if the list is only chaos sometimes, but the majority of the time it’s in perfect order with just one item out of place? Going back to our furniture-moving analogy: It’s overkill to shove all the furniture into the spare room if all you want to do is move the end-table.
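
You can watch this effect directly by counting comparisons. The sketch below (illustrative Python, nothing more) runs an insertion sort, which is adaptive, over a fully scrambled list and over a nearly ordered one:

```python
import random

def insertion_sort_comparisons(items):
    """Insertion-sort `items` in place; return how many comparisons it made."""
    comparisons = 0
    for i in range(1, len(items)):
        item = items[i]
        j = i - 1
        while j >= 0:
            comparisons += 1
            if items[j] <= item:
                break
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = item
    return comparisons

chaos = random.sample(range(1000), 1000)           # complete chaos
nearly = list(range(1000))
nearly[10], nearly[500] = nearly[500], nearly[10]  # two items out of place

chaos_cost = insertion_sort_comparisons(chaos)     # roughly n^2/4 comparisons
nearly_cost = insertion_sort_comparisons(nearly)   # only a few thousand
```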

Maybe you’ve got a data set that’s too massive to fit in memory, so you need a sort that’s good at operating on isolated segments. Maybe the entries are massive and moving them takes a lot of time (maybe you need to write to disk every time you move something), so you want a system that might spend more CPU time thinking so it can do the sort in the fewest possible moves.

And let us not forget the all important cost: Programmer time. If you’re never doing more than occasionally sorting a couple hundred small items, then you should not, under any circumstances, worry, argue, or even think about what sort to use. Just use whatever sort is handy, and if you don’t have one handy just use stupid bubble sort. Programmer time is precious and your brainpower is probably needed elsewhere.

I say this knowing nobody will take my advice because getting things just right is part of the fun of this job.

Comments (92)


  1. Stephen says:

    Bitonic sort looked crazy, like it was sorting smaller piles ascendingly and descendingly at the same time.

  2. Brian says:

    A crucial thing to note about these sort algorithms when watching is the size of the data being sorted. They’re not even close to equal. (Look at the width of the lines, and the delay, of the selection sort versus the radix sort versus the stable sort.)

    Fun fact of the day that I learned from Wizard’s Bane (a lovely tale of programming magic): the shell sort is named for Donald Shell.

  3. RTBones says:

    Interesting video, though if you are wondering what the differences are between types of sort routines, you may be hard pressed from that video to figure it out.

    Not all is lost, however. In a somewhat…Hungarian?…way, here are some sorts sorted for all sorts of audiences:

    Select sort
    Quick sort
    Bubble sort
    Merge sort
    Insert sort

  4. DrMcCoy says:

    Also, sometimes you want a stable sorting algorithm, one that keeps elements with the same value in their existing order. One example would be sorting a phone book by both given and family name in two separate sorting steps.
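
In a language whose built-in sort is guaranteed stable, such as Python, that two-pass phone-book trick looks like this (the names are made up for illustration):

```python
people = [("Zimmermann", "Anna"), ("Adams", "Bob"),
          ("Zimmermann", "Carl"), ("Adams", "Alice")]

people.sort(key=lambda p: p[1])  # first pass: given name
people.sort(key=lambda p: p[0])  # second pass: family name; stability
                                 # preserves given-name order within ties
```

After the second pass, entries sharing a family name stay in given-name order: Adams/Alice, Adams/Bob, Zimmermann/Anna, Zimmermann/Carl.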

  5. Radix sort always makes me do a double take when I watch this. Sorting with no comparisons is obvious once you look at how it works, but I still find it impressive.
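
    For the curious, a least-significant-digit radix sort is only a few lines, and nowhere does it compare one element to another. A Python sketch for non-negative integers, purely illustrative:

```python
def radix_sort(items, base=10):
    """LSD radix sort for non-negative integers; no element comparisons."""
    items = list(items)
    max_val = max(items) if items else 0
    place = 1
    while place <= max_val:
        buckets = [[] for _ in range(base)]
        for v in items:
            buckets[(v // place) % base].append(v)  # route by current digit
        items = [v for bucket in buckets for v in bucket]
        place *= base
    return items
```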

  6. zob says:

    In case anyone is looking for such a thing, I believe I know the best sort of video to teach sort algorithms.

  7. Shishberg says:

    If we’re all honest, though… the reason there are so many sorting algorithms is that it’s a very convenient problem to use to teach comp sci. It’s incredibly easy to specify the problem, there are some really simple algorithms that anyone can code once they understand loops, you can get decent performance improvements with a modest bit of extra complexity, you can use it to teach speed/memory tradeoffs and using domain knowledge in algorithms and all sorts of programming abstractions… It’s a really useful microcosm of the rest of the field.

    Nobody needs fifteen sorting algorithms. They exist because it’s had more raw hours of study thrown at it than probably all other problems in computing combined.

    One thing… No to bubble sort. Nobody should ever use bubble sort. Insertion sort is the same amount of code, easier to understand, and is linear on sorted input.

    • Orophor says:

      I remember learning all about the trade offs and nuance of different sort algorithms but in my own code these days, I pretty much stick to quicksort: http://en.wikipedia.org/wiki/Quicksort

      • guy says:

        I seem to recall that quicksort has some kind of absurd failure condition where everything goes to pieces, hence why we ever use any other recursive sort. Something about picking a terrible pivot value.

        • Volfram says:

          In the “standard” implementation, your pivot value is the first value in the current sub-array. If your dataset is already sorted, that means that you’ll be making new sub-arrays for EVERY point in your dataset, instead of about half of them.

          This is fairly easy to fix by choosing a random value instead of the first value. As my Algorithms instructor said, you can still end up with a failure state, but the odds of that are “about the same as all the air molecules in the room condensing in one corner.”

          Of course, then you have another problem in that many random number generators aren’t the fastest things alive.
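
          A minimal sketch of that random-pivot quicksort (illustrative Python; this version trades the in-place property for clarity):

```python
import random

def quicksort(items):
    """Quicksort with a random pivot; worst case is still O(n^2),
    but hitting it requires spectacularly bad luck."""
    if len(items) <= 1:
        return list(items)
    pivot = random.choice(items)  # random pivot instead of items[0]
    less = [v for v in items if v < pivot]
    equal = [v for v in items if v == pivot]
    greater = [v for v in items if v > pivot]
    return quicksort(less) + equal + quicksort(greater)
```

An already-sorted input, the killer for the first-element pivot, is no worse than any other input here.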

          When I was setting up the collision system for my game engine, I wanted to be able to screen out duplicate collision events, so I implemented a binary tree which throws out any collisions that have the same value as the current leaf. I might revisit it on a future iteration(AFTER the current project is finished) to see if a Heap-style tree would be faster and provide the same benefits over a Binary Search tree.

          • bionicOnion says:

            Amusingly enough, one of the other ways around the problem of quicksorting a pre-sorted input is to shuffle everything around before you sort it. Since shuffling can be done in O(n) time, and quicksort can be done in O(n log n) time, you end up with greater efficiency overall than the worst case O(n^2), and only a slight performance hit for shuffling that’s asymptotically negligible.

            Of course, we know that Bogosort is optimal in all scenarios, so it’s a moot point anyways.

            • Volfram says:

              As a matter of fact, I’m actually doing that in my collision engine, as the binary tree search has a similar weakness if the data isn’t sufficiently “random,” and if I don’t shuffle first, the data will be very nearly in order.

              Interesting. I’d completely forgotten that I’d done that until you mentioned it.

          • kdansky says:

            One easy way around it is to use the average value of the first and the last entry in the current segment. In a pre-sorted case, that gives you near-optimal runtime as a freebie.

            That said, QS isn’t used much except for teaching. MergeSort (and its cousin External Sort) has largely replaced it, because it scales far better for big data, is just as fast for smaller sets, and it never suffers from bad pivot choice.

            And we’ve not even talked about the combined algorithms, which choose a different one depending on input size, such as TimSort.

            • Lanthanide says:

              That’s the first time I’ve ever heard that mergesort is the preferred sort algorithm, rather than quicksort. Got any literature on that?

            • Ingvar M says:

              Although anything except picking a truly (or sufficiently close to) random pivot opens you up to malicious inputs. However, that may not be one of the things you have to consider.

              And most quicksort implementations fall back to something close to insertion and/or bubble sort when they hit around 3-5 elements anyway.

      • James Schend says:

        The correct answer is: “use the sort() function provided by your library and don’t worry about the implementation, because the guys who made the library are undoubtedly smarter than you, and then you don’t have to waste neurons thinking about it.”

        • Blake says:

          Unless you’re using some architecture that has very limited executable space (in which case std::sort can mean too much templated code, and you need to look closer at your problem).

          But in the general case, std::sort works just fine.

    • WillRiker says:

      Yes, this. The vast majority of these algorithms you will *never* encounter outside of a CS classroom. A couple of non-comparison sorts like Radix Sort can be useful in very very specific circumstances, but it is almost never worth the effort of implementing it instead of just using whatever sort is available in your favorite language’s standard library.

    • Lanthanide says:

      “One thing… No to bubble sort. Nobody should ever use bubble sort. Insertion sort is the same amount of code, easier to understand, and is linear on sorted input.”

      Came here to say this. There’s no reason to implement bubble sort when you can do insertion or selection sort.

      • Erunaamo says:

        Agree. Insertion and selection sorts are simple to code, elegant to comprehend, and pretty potent for small jobs.

        Bubble sort has no particular benefits at all.

        It’s a plausible-looking mediocre sort algorithm, so it can be instructive to analyze it or use it as a baseline for comparison. Thus many programmers were taught about it early on, and it feels familiar. Sometimes, as here, we slip from that feeling of familiarity to thinking it’s a good “default” sorting algorithm, but it just ain’t so.

  8. Peter H. Coffin says:

    And every computer program used for Real Work these days comes with enough libraries and built-ins that you’re NEVER going to be called upon to actually sort anything anyway. It’s all

    key_list = ['last_name', 'first_name', 'user_key']
    my_array_handle = array_sort($my_array_handle, $key_list)

    and you’re done.
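
    In Python, say, the real thing is barely longer, with the standard library doing all the work (field names here are the hypothetical ones from above):

```python
from operator import itemgetter

people = [
    {"last_name": "Young", "first_name": "Shamus", "user_key": 1},
    {"last_name": "Adams", "first_name": "Zoe", "user_key": 2},
    {"last_name": "Adams", "first_name": "Alan", "user_key": 3},
]

# Sort by last name, then first name, then user key, in one call.
people.sort(key=itemgetter("last_name", "first_name", "user_key"))
```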

    Large amounts of data where performance would actually matter are probably in an external database system *anyway* and many highly-skilled professionals spent man-decades making that db engine sort REALLY FAST and REALLY FLEXIBLE, so you’re not going to be playing in that game anyway.

    • Anachronist says:

      Well, the standard library sort is typically implemented as a quicksort, which devolves into a slow O(N^2) sort if your data happens to be roughly in reverse order from what you want. In cases like that, it’s useful to know something about your data and what you’re doing with it. You may want to ignore the library routine and implement your own heapsort, for example (which doesn’t require auxiliary storage like quicksort but executes nearly as fast, regardless of the ordering), or an insertion sort (which is better if you need to sort small arrays repeatedly).

      Not every application runs on an internet-connected PC with access to an SQL server. Your application may be embedded in stand-alone firmware with limited memory, requiring alternatives to the standard library sort. For example, you may have data in a large file (not in a database) that must be sorted but it’s too big for memory, in which case there are algorithms for external sorts, invented back in the days when data was stored on reels of magnetic tape.

      The point is, the library sort is not always the right tool for the job.

      • Brett says:

        Good library quicksorts should randomize their input and pivot selection so they get expected N log N time regardless of what kind of data you feed it, just so that you can avoid that nasty N^2 worst-case scenario.

        • Nathon says:

          Quicksort has that nasty O(N^2) worst-case, but Python uses Timsort, which starts with merge sort. O(N log N) sorts for the win.

          • ET says:

            I’m liking this worst-case time of O(n log n), plus the O(n) space. *
            Huzzah for Python!

            * According to Wikipedia, which will soon become the real-world Hitchhiker’s Guide, just as soon as we can jam it all into a smartphone-sized disk… ^^;

      • Peter H. Coffin says:

        “Right tool”, maybe not. “Good enough tool”, almost always. And while not every application is going to be running on a machine that’s connected and has a database available, the ones that are going to be dealing with millions of things that need to be in any order other than “arrival” generally will be. Even processing data in the Antarctic is going to happen at base camp on someone’s laptop, and that’s certainly enough horsepower to run a database instance.

        • guy says:

          That is the lovely thing about modern programming; you can always pick a badly sub-optimal solution and then throw enough power at it that no one cares.

        • zob says:

          This me nitpicking.

          He said:
          “The point is, the library sort is not always the right tool for the job”

          You said:
          “‘Right tool’, maybe not. ‘Good enough tool’, almost always”

          “not always” = “almost always”
          Your rebuttal point seems to be about “right tool” vs “good enough tool”

          • burningdragoon says:

            Excuse me while I nitpick your nitpick >.>

            “not always” and “almost always” can certainly imply the same thing (and probably did in these cases), but “almost always” is still a more definitive phrase. Technically “never” can fall inside of “not always” (under a certain point of view, mayhaps).

        • Blake says:

          Sorting collisions in a game?

  9. Zak McKracken says:

    I just love how Bubblesort actually sounds like bubbles. Makes the name that much more obvious.

  10. Stuart Hacking says:

    Bogosort sounds *exactly* as it should sound.

  11. HiEv says:

    Huh… Whadda ya’ know. The sort algorithm I “invented” in college has a name: cocktail shaker sort. I did it slightly differently than the visualization, sorting both directions simultaneously instead of back and forth. Since it’s just two bubble sorts running on the same data at the same time, but from opposite directions, I called it the “double-bubble sort”.
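
    For reference, the back-and-forth version from the visualization looks roughly like this (an illustrative Python sketch, not HiEv’s simultaneous variant):

```python
def cocktail_shaker_sort(items):
    """Bubble sort that sweeps alternately left-to-right and right-to-left."""
    lo, hi = 0, len(items) - 1
    swapped = True
    while swapped:
        swapped = False
        for i in range(lo, hi):           # forward pass floats the max up
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                swapped = True
        hi -= 1
        for i in range(hi, lo, -1):       # backward pass sinks the min down
            if items[i - 1] > items[i]:
                items[i - 1], items[i] = items[i], items[i - 1]
                swapped = True
        lo += 1
    return items
```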

  12. Nathon says:

    I’m curious, Shamus, if you’ve been exposed to big-O notation. I know you weren’t formally educated in the computer sciences, but you do know what you’re doing.

    I have to admit that I have implemented bubble sort for production code (lists of length < 16 needed sorting and I couldn't use most libraries).

    • SAJ14SAJ says:

      Yes, there *are* actually situations when bubble sort is optimal, and that would be one. The other, extremely common one, is when a quicksort partition goes under a critical size, it is actually faster to finish it with bubblesort.

  13. Daemian Lucifer says:

    Well this video proves it:Quick sort is the best sounding algorithm of them all.

  14. James Schend says:

    That was one of the demo apps on my Commodore-64. Lower resolution of course. ;)

  15. Adam says:

    From Youtube comments:

    “Damn Bogo sort, get it together.”

    • DIN aDN says:

      I feel strangely empathetic towards poor old bogo sort. I mean, it looks like it’s trying so hard, and all it’s realistically accomplishing is wearing out the data storage medium it’s working on.

  16. Jeff R. says:

    I’ve not kept up with such things, but how does parallelism impact sorting? (Some of those divide-and-conquer type sorts look like they could benefit from multiple threads quite a bit…)

    • 4th Dimension says:

      Yeah, now that you mention it: I guess all divide-and-conquer sorts that are based on recursion could probably be parallelized, by going a step or two (or more) into the recursion and splitting the array that way into however many sub-arrays you need (no more than the core count).
      Although you’ll still need to merge the individually sorted sub-arrays in one thread.

      • Jeff R. says:

        Possibly two threads? I think a merge working from both ends towards the middle is do-able. Wouldn’t guarantee perfectly 2x speed, but better than a single thread.

        • 4th Dimension says:

          Now I might be wrong, but you cannot do it in two threads. At least you cannot merge the result using two threads, because you simply cannot know exactly where in the array your sub-array will fit. So you will still need to lock up the array while you are inserting your value. And during the insert the other thread cannot try to find where to put the next value, since the array is undergoing non-atomic changes at the time.

          What you can do: say you have 4 cores. You can first split the array into 4 parts for 4 threads, one for each core. When they complete, you can merge two of the parts each with 2 threads. And one thread does the final merge of the two sub-arrays. A sort of threaded recursion.

          • Jeff R. says:

            I think that as long as you know the sizes of your two subarrays (which you should), you can do it without collision, having thread one build the lower half of the new array from the lower halves of the subarrays from the bottom while thread two builds the upper half of the new array from the top. Neither thread should ever be trying to write or delete anything the other one will ever want to have.[1] There’s no logical reason why you can’t do the locking semaphores on array elements rather than the entire array.

            [1] towards the end of the process [or earlier on perverse data] the lower thread may claim the top element of one subarray or the upper thread the bottom of one subarray, but even then, the other thread will never actually want to claim and delete that item, just look at it once a cycle as long as it exists. Which can easily be managed in a semaphore.

            • 4th Dimension says:

              The problem with that approach is that you are making an assumption that the ranges of the two arrays are similar, so that the bottom halves of the sub-arrays will comprise the bottom of the master array while the top parts will make up the top. In reality things are much more random. You can have one array comprised of mostly lower values of the master array, and then its maximum is the maximum of the entire master array.

              Simply put, you can’t know that before you sort the array.

              Oh, and neither thread will ever delete anything. The problem stems from the fact that they will be inserting (not appending, since that would require the threads to communicate their current values, and for one thread to wait while the other fills in all the lower values, in the case of sorting smaller to larger) values. And that means they will be shifting already-inserted ones. So the other thread must not read during the insert, and it cannot use that time to figure out where to place its own value.

              On the other hand, I would love to see this implemented and be proven wrong ;)

              • guy says:

                If they get desynchronized, I can see one thread reading in new values when the other thread has picked two to swap and is halfway through swapping, so a value is temporarily not present in the array and another is doubled.

                Any routine that will land numbers in the right half of the array and then sort each half could be split into multiple threads at each pass, assuming that you don’t have a problem accessing the same array at different locations with multiple threads. Should be safe, but an overzealous compiler might not let you. I guess you could have an actual problem in certain architectures if the array might get components relocated, but usually that won’t happen.

                If you do have to instantiate new arrays, I don’t think you can reinsert in multiple threads, but copying between arrays is O(N) anyhow.

                • Jeff R. says:

                  I was forgetting that we were talking about in-place, but I still think it can work that way. I wasn’t planning on doing insertions, only swaps. The low thread looks at the bottoms of both arrays and either marks the absolute lower one’s as good and increments that pointer or else swaps the bottom elements. The high thread does the same with the tops. There is no possible case in which the two threads want to do a swap involving the same element (other than the debugging case where the two arrays being merged aren’t actually properly sorted.)

                  The case you’re mentioning is an avoidable edge case near the end; we can do special behavior when either array has less than 3 unmerged elements in it and keep that safe.

                  • Jeff R. says:

                    Wait, nevermind; the swap-based merge I was thinking about doesn’t work at all, parallel or no. So I’ll fall back to “merge to a new array” being doable in two threads safely. Copying the result back is only O(n) and is itself the most parallelizable task imaginable, so that’s still good.

    • guy says:

      It would be tricky. An in-place sort cannot be done in multiple threads on the first pass, because there is a non-zero chance they’ll access the same memory location simultaneously or in the wrong order. Once it gets to the point where you can break the array into two parts and are sure every number is in the right half, then you can split it (assuming it’s possible to access different locations in the same array from different threads simultaneously without causing errors, which I’m honestly not sure about) and work on each half.

      If you actually create a new array at each recursion, it would be much easier until you run out of memory. You could combine the two, creating new arrays on the first pass and then sorting them in place in separate threads, but then you have to write more code.

    • Nentuaby says:

      “Bitonic Sort” from that video is an algorithm particularly designed for massively-parallel implementation. The reason it sorts things in such a crazy fashion is that a bunch of independent comparators are doing their thing at once, not talking to each other.

  17. Jeff says:

    Shamus, I think you have too many sorts there:

    “Shamus, why are there sort many different sort algorithms?”

    That should probably be a “so”.

  18. Knut says:

    Heap sort sounds like a descent into madness

  19. Ravens Cry says:

    It was like listening to an early arcade, all the boops and bips.

  20. Brandon says:

    “Programmer time is precious […]”

    Not according to managers! They are more than happy wasting programmer time in meetings and other trivial BS.

    Or they are happy keeping us working on band-aid fixes for crappy old software instead of building its replacement.

    … sigh.

    • Richard says:

      Actually, to some extent “band-aid fixes” are pretty much the right thing to do when you have mostly-working software.

      The reason is twofold and quite simple:
      1) Your mostly-working software has the bugfixes for weird and rare cases. Those bugs took a long time to find and some time to squash.
      If you redo-from-scratch you’ve lost all that work, and you’ll make at least some of the same original errors when rebuilding.

      2) You need to sell product.
      You can continue to sell the mostly-working software and make money while improving it, and your existing customers get the improvements as you do them.
      If you redo-from-scratch then there will be a long period where you’re not improving the old software and not selling the new product either.
      Your existing customers will complain about buggy software that’s not getting fixed, and your potential new customers will think “That software is buggy and they’re not fixing it” and buy something else.

      If it’s a spaghetti mess, the right thing to do is to slowly re-architect the software while keeping it working.

  21. Jock says:

    Could you do a follow-up post that goes over what each of those sorts does, and why each might be picked over other sorts? I’m familiar with some of them, but others are new to me, and you have a way of putting this stuff in an easy to digest format.

  22. Diego says:

    I clicked on that video. An hour and a half later I was watching cellular automata videos, then some robots, neural networks… when I started closing everything I encountered this article again! “ah! that’s what I was doing”

  23. Piflik says:

    Reminds me of this XD

    I like PanicSort…

  24. Retsam says:

    I was sort of hoping to see my favorite not-so-serious sorting algorithm come up, the Stooge Sort, which is as follows:

    1) Recursively Stooge Sort greatest 2/3rds of array.
    2) Recursively Stooge Sort least 2/3rds of array.
    3) Recursively Stooge Sort greatest 2/3rds of array.

    Voila, sorted array. Wikipedia informs me that the complexity of this sort is O(n^(log 3 / log 1.5)) = O(n^2.7095…).
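
    The whole joke fits in a few lines. A Python sketch; the standard formulation recurses on the first two-thirds, last two-thirds, then first two-thirds again, which by symmetry matches the steps above:

```python
def stooge_sort(arr, i=0, j=None):
    """Stooge sort on arr[i..j], inclusive. Gloriously inefficient."""
    if j is None:
        j = len(arr) - 1
    if i < j and arr[i] > arr[j]:
        arr[i], arr[j] = arr[j], arr[i]   # fix the endpoints
    if j - i + 1 > 2:
        t = (j - i + 1) // 3
        stooge_sort(arr, i, j - t)        # first 2/3
        stooge_sort(arr, i + t, j)        # last 2/3
        stooge_sort(arr, i, j - t)        # first 2/3 again
    return arr
```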

  25. Peter says:

    A great demonstration of sorting algorithms can be seen here:


    There the number of items is similar, and the initial set has different starting cases to see how different sorts behave depending on the input.

  26. luagha says:

    Ah, but a distributed bubblesort is one of the fastest.

    I was reading an article where someone was organizing participants at a dog show. He had dogs with owners on leash, each having a number, all in a line, in no particular order.

    He said, “Everyone look to the person on your left. If your number is higher than theirs, swap places.”

    In less than five minutes, the line was sorted.
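
    That dog-show procedure is known as odd-even transposition sort. Run serially it’s still O(n²), but every pass’s swaps are independent, which is why a whole line of people can do them all at once. A sketch (illustrative Python, simulating the passes serially):

```python
def odd_even_transposition_sort(items):
    """n alternating passes of neighbor swaps; each pass's swaps are
    independent, so in a real line of dog owners they happen in parallel."""
    n = len(items)
    for step in range(n):
        start = step % 2                   # alternate which pairs compare
        for i in range(start, n - 1, 2):   # disjoint pairs: no conflicts
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
    return items
```

n parallel passes always suffice, so the line sorts itself in time proportional to its length, no matter how scrambled it starts.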

  27. Neil Harding says:

    Radix sorting was good on the C64 on my Sprite Multiplexer (you have to sort by the Y position, so that you can make the 8 sprites act as 24, by reusing them further down the screen). On a 1Mhz processor, every cycle counts :)
