15 Sorting Algorithms in 6 Minutes

By Shamus Posted Wednesday Sep 18, 2013

 Quakecon Keynote 2013 Annotated: Part 4Previous Post

I don’t have a new entry for Good Robot done, so I’m going to resort to the crutch of delinquent bloggers everywhere and post a YouTube video made by someone else and use it as a conversation-starter.

Computers do a lot of sorting. Alphabetizing names, sorting enemies by distance, arranging a list of polygons from far to near, putting a high score table in order, arranging files according to size or date, ordering search results by word frequency, and a million other things. There are a lot of different ways of sorting lists of values.Some clever person wrote a program to show the sorting take place and emit audio based on the current state of the list. I find the result to be kind of hypnotic:

Link (YouTube)

Shamus, why are there so many different sort algorithms?

Good question, me. The reason is that there are a lot of different sort problems. The above sorts look like they’re sorting some really predictable data: A list of all the numbers from 1 to 1,000, or whatever. This is not always the case. Sometimes it is very not the case.

Back in the old days sort was depicted as being a tradeoff between speed and memory usage. If you’ve got lots of memory (or the data is small) then you can create space for a copy of the stuff you’re sorting and fill in the new space as you sort. If you don’t, then you have to move the entries around in-place. It’s like re-arranging a room. It’s easy if you’ve got another room you can shove some items into, and a lot harder if you have to keep all the furniture in the room as you work. A lot of sorts are optimized for different levels of memory use. Maybe one is slow but only requires room for 1 additional item. Another is faster but requires enough memory to contain half of the list. Another is faster still but requires a complete copy of the list, thus doubling memory usage.

Likewise, optimal techniques will vary based on what you’re sorting. If I’m sorting the integers from 1 to 100 and I’ve got a bunch of instances of the number 10 next to each other, I know those values are sorted relative to each other and don’t need to be fussed with. If I find another 10, I can stick it before or after these other tens and it doesn’t matter. But if I’m sorting floating-point numbers like 3.285714285714286 and 12.83333333333333 then I’m probably not going to have large blocks of same-value entries next to each other. Also, I think some sorts make guesses about where items appear in the list based on average or median values. That’s great if you can say, “These values all must fall within the range of 1-100, so I know this value of 80 will be somewhere near the top.” But if your list is (for whatever reason) all the digits from 1 to 1,000 but also there is a value of ten billion stuck in there, then assumptions about ranges or averages could really make a mess of things.

Then there are concerns about how scrambled the list might be. If you begin with the assumption that the list is complete chaos at the outset, then you might use something optimized for really complex sorting. But what if the list is only chaos sometimes, but the majority of the time it’s in perfect order with just one item out of place? Going back to our furniture-moving analogy: It’s overkill to shove all the furniture into the spare room if all you want to do is move the end-table.

Maybe you’ve got a data set that’s too massive to fit in memory, so you need a sort that that’s good at operating on isolated segments. Maybe the entries are massive and moving them takes a lot of time (maybe you need to write to disk every time you move something) so you want a system that might spend more CPU time thinking so it can do the sort in the fewest possible moves.

And let us not forget the all important cost: Programmer time. If you’re never doing more than occasionally sorting a couple hundred small items, then you should not, under any circumstances, worry, argue, or even think about what sort to use. Just use whatever sort is handy, and if you don’t have one handy just use stupid bubble sort. Programmer time is precious and your brainpower is probably needed elsewhere.

I say this knowing nobody will take my advice because getting things just right is part of the fun of this job.

Shamus Young is a programmer, an author, and nearly a composer. He works on this site full time. If you'd like to support him, you can do so via Patreon or PayPal.

 Quakecon Keynote 2013 Annotated: Part 4Previous Post

Next PostPassage Screensaver 

From The Archives:

92 comments

92 thoughts on “15 Sorting Algorithms in 6 Minutes”

Stephen says:

Wednesday Sep 18, 2013 at 6:14 am

Bitonic sort looked crazy, like it was sorting smaller piles ascendingly and descendingly at the same time.

Reply
Brian says:

Wednesday Sep 18, 2013 at 6:24 am

A crucial thing to note about these sort algorithms when watching is the size of the contents being sorted. They’re not all even closely equal. (look at the width of the lines (and the delay) of the selection sort versus the radix sort versus the stable sort.

Fun fact of the day that I learned from Wizard’s Bane (a lovely tale of programming magic): the shell sort is named for Donald Shell.

Reply
1. Abnaxis says:
  
  Wednesday Sep 18, 2013 at 7:39 am
  
  I always wondered about that. I tried to come up with some way it makes sense to call it shell sorting, like somehow it was related to integrating by shells or something…
  
  Reply
  1. Blue Painted says:
    
    Wednesday Sep 18, 2013 at 7:49 am
    
    I knew that! Yay :-)
    
    Reply
2. ClearWater says:
  
  Friday Sep 20, 2013 at 8:46 am
  
  That’s because the less efficient algorithms, such as bubble sort, would take ages when sorting the same number of items as the faster ones. The video would be numerous minutes longer.
  
  Reply
RTBones says:

Wednesday Sep 18, 2013 at 6:43 am

Interesting video, though if you are wondering what the differences are between types of sort routines, you may be hard pressed from that video to figure it out.

Not all is lost, however. In a somewhat…Hungarian?…way, here are some sorts sorted for all sorts of audiences:

Select sort
Quick sort
Bubble sort
Merge sort
Insert sort

Reply
1. The Mich says:
  
  Thursday Sep 19, 2013 at 7:37 am
  
  That’s much more hypnotic than the video above.
  
  Reply
2. postinternetsyndrome says:
  
  Friday Sep 20, 2013 at 3:25 pm
  
  Excellent.
  
  Reply
DrMcCoy says:

Wednesday Sep 18, 2013 at 6:45 am

Also, sometimes you want a stable sorting algorithm, that keeps elements of the same value in the current order. One example would be sorting a phone book by both given and family name in two separate sorting steps.

Reply
Charnel Mouse says:

Wednesday Sep 18, 2013 at 6:50 am

Radix sort always makes me do a double take when I watch this. Sorting with no comparisons is obvious once you look at how it works, but I still find it impressive.

Reply
1. Ian says:
  
  Friday Sep 20, 2013 at 2:57 am
  
  Sadly, hidden under the linear running time is a very heavy constant. Making radixsort useless for most practical uses.
  
  Unless you can assert that you are sorting a very large list of very small numbers.
  
  Reply
  1. Charnel Mouse says:
    
    Thursday Oct 3, 2013 at 9:55 am
    
    Oh, I don’t doubt it’s useless practically. I mostly just admire it for aesthetical reasons, like several of the other algorithms in the video.
    
    Reply
zob says:

Wednesday Sep 18, 2013 at 6:51 am

In case anyone is looking for such thing, I believe I know the best sort of video to teach sort algorithms.
http://www.youtube.com/watch?v=ywWBy6J5gz8

Reply
Shishberg says:

Wednesday Sep 18, 2013 at 7:21 am

If we’re all honest, though… the reason there are so many sorting algorithms is that it’s a very convenient problem to use to teach comp sci. It’s incredibly easy to specify the problem, there are some really simple algorithms that anyone can code once they understand loops, you can get decent performance improvements with a modest bit of extra complexity, you can use it to teach speed/memory tradeoffs and using domain knowledge in algorithms and all sorts of programming abstractions… It’s a really useful microcosm of the rest of the field.

Nobody needs fifteen sorting algorithms. They exist because it’s had more raw hours of study thrown at it than probably all other problems in computing combined.

One thing… No to bubble sort. Nobody should ever use bubble sort. Insertion sort is the same amount of code, easier to understand, and is linear on sorted input.

Reply
1. Orophor says:
  
  Wednesday Sep 18, 2013 at 7:37 am
  
  I remember learning all about the trade offs and nuance of different sort algorithms but in my own code these days, I pretty much stick to quicksort: http://en.wikipedia.org/wiki/Quicksort
  
  Reply
  1. guy says:
    
    Wednesday Sep 18, 2013 at 8:54 am
    
    I seem to recall that quicksort has some kind of absurd failure condition where everything goes to pieces, hence why we ever use any other recursive sort. Something about picking a terrible pivot value.
    
    Reply
    1. Volfram says:
      
      Wednesday Sep 18, 2013 at 9:33 am
      
      In the “standard” implementation, your pivot value is the first value in the current sub-array. If your dataset is already sorted, that means that you’ll be making new sub-arrays for EVERY point in your dataset, instead of about half of them.
      
      This is fairly easy to fix by choosing a random value instead of the first value. As my Algorithms instructor said, you can still end up with a failure state, but the odds of that are “about the same as all the air molecules in the room condensing in one corner.”
      
      Of course, then you have another problem in that many random number generators aren’t the fastest things alive.
      
      When I was setting up the collision system for my game engine, I wanted to be able to screen out duplicate collision events, so I implemented a binary tree which throws out any collisions that have the same value as the current leaf. I might revisit it on a future iteration(AFTER the current project is finished) to see if a Heap-style tree would be faster and provide the same benefits over a Binary Search tree.
      
      Reply
      1. bionicOnion says:
        
        Wednesday Sep 18, 2013 at 5:31 pm
        
        Amusingly enough, one of the other ways around the problem of quicksorting a pre-sorted input is to shuffle everything around before you sort it. Since shuffling can be done in O(n) time, and quicksort can be done in O(n log n) time, you end up with greater efficiency overall than the worst case O(n^2), and only a slight performance hit for shuffling that’s asymptotically negligible.
        
        Of course, we know that Bogosort is optimal in all scenarios, so it’s a moot point anyways.
        
        Reply
        
        Volfram says:
        
        Thursday Sep 19, 2013 at 8:16 am
        
        As a matter of fact, I’m actually doing that in my collision engine, as the binary tree search has a similar weakness if the data isn’t sufficiently “random,” and if I don’t the data will be very nearly in-order.
        
        Interesting. I’d completely forgotten that I’d done that until you mentioned it.
        
        Reply
      2. kdansky says:
        
        Thursday Sep 19, 2013 at 3:38 am
        
        One easy way around it is to use the average value of the first and the last entry in the current segment. In a pre-sorted case, that gives you near-optimal runtime as a freebie.
        
        That said, QS isn’t used much except for teaching. MergeSort (and its cousin External Sort) has largely replaced it, because it scales far better for big data, is just as fast for smaller sets, and it never suffers from bad pivot choice.
        
        And we’ve not even talked about the combined algorithms, which choose a different one depending on input size, such as TimSort.
        
        Reply
        
        Lanthanide says:
        
        Thursday Sep 19, 2013 at 4:25 am
        
        That’s the first time I’ve ever heard that mergesort is the preferred sort algorithm, rather than quicksort. Got any literature on that?
        
        Reply
        
        Phantom Hoover says:
        
        Saturday Sep 21, 2013 at 1:48 pm
        
        The Proceedings of the Editors of Wikipedia mentions that mergesort can be more effectively parallelised, so presumably that’s why.
        
        Reply
        
        Ingvar M says:
        
        Thursday Sep 19, 2013 at 5:03 am
        
        Although anything except picking a truly (or sufficiently close to) random pivot opens you up to malicious inputs. However, that may not be one of the things you have to consider.
        
        And most quicksort implementations fall down to something close to insertion and/or bubble-sotr when they hit around 3-5 elements anyway.
        
        Reply
  2. James Schend says:
    
    Wednesday Sep 18, 2013 at 9:47 am
    
    The correct answer is: “use the sort() function provided by your library and don’t worry about the implementation, because the guys who made the library are undoubtedly smarter than you also then you don’t have to waste neurons thinking about it.”
    
    Reply
    1. Blake says:
      
      Wednesday Sep 18, 2013 at 6:33 pm
      
      Unless you’re using some architecture that has very limited executable space (in which case using for std::sort can mean too much templated code, in which case you need to look closer at your problem.
      
      But in the general case, std::sort works just fine.
      
      Reply
2. WillRiker says:
  
  Wednesday Sep 18, 2013 at 10:39 am
  
  Yes, this. The vast majority of these algorithms you will *never* encounter outside of a CS classroom. A couple of non-comparison sorts like Radix Sort can be useful in very very specific circumstances, but it is almost never worth the effort of implementing it instead of just using whatever sort is available in your favorite language’s standard library.
  
  Reply
3. Lanthanide says:
  
  Thursday Sep 19, 2013 at 4:21 am
  
  “One thing… No to bubble sort. Nobody should ever use bubble sort. Insertion sort is the same amount of code, easier to understand, and is linear on sorted input.”
  
  Came here to say this. There’s no reason to implement bubble sort when you can do insertion or selection sort.
  
  Reply
  1. Erunaamo says:
    
    Friday Sep 27, 2013 at 8:33 am
    
    Agree. Insertion and selection sorts are simple to code, elegant to comprehend, and pretty potent for small jobs.
    
    Bubble sort has no particular benefits at all.
    
    It’s a plausible-looking mediocre sort algorithm, so it can be instructive to analyze it or use it as a baseline for comparison. Thus many programmers were taught about it early on, and it feels familiar. Sometimes, as here, we slip from that feeling of familiarity to thinking it’s a good “default” sorting algorithm, but it just ain’t so.
    
    Reply
Peter H. Coffin says:

Wednesday Sep 18, 2013 at 7:30 am

And every computer program used for Real Work these days comes with enough libraries and built-ins that you’re NEVER going to be called upon to actually sort anything anyway. It’s all

key_list = [‘last_name’, ‘first_name’, ‘user_key’]
my_array_handle = array_sort($my_array_handle, $key_list)

and you’re done.

Large amounts of data where performance would actually matter are probably in an external database system *anyway* and many highly-skilled professionals spent man-decades making that db engine sort REALLY FAST and REALLY FLEXIBLE, so you’re not going to be playing in that game anyway.

Reply
1. Anachronist says:
  
  Wednesday Sep 18, 2013 at 7:50 am
  
  Well, the standard library sort is typically implemented as a quicksort, which devolves into a slow O(N^2) sort if your data happens to be roughly in reverse order from what you want. In cases like that, it’s useful to know something about your data and what you’re doing with it. You may want to ignore the library routine and implement your own heapsort, for example (which doesn’t require auxiliary storage like quicksort but executes nearly as fast, regardless of the ordering), or an insertion sort (which is better if you need to sort small arrays repeatedly).
  
  Not every application runs on an internet-connected PC with access to an SQL server. Your application may be embedded in stand-alone firmware with limited memory, requiring alternatives to the standard library sort. For example, you may have data in a large file (not in a database) that must be sorted but it’s too big for memory, in which case there are algorithms for external sorts, invented back in the days when data was stored on reels of magnetic tape.
  
  The point is, the library sort is not always the right tool for the job.
  
  Reply
  1. Brett says:
    
    Wednesday Sep 18, 2013 at 8:19 am
    
    Good library quicksorts should randomize their input and pivot selection so they get expected N log N time regardless of what kind of data you feed it, just so that you can avoid that nasty N^2 worst-case scenario.
    
    Reply
    1. Nathon says:
      
      Wednesday Sep 18, 2013 at 8:31 am
      
      Quicksort has that nasty O(N^2) worst-case, but Python uses Timsort, which starts with merge-sort. O(NlogN) sorts for the win.
      
      Reply
      1. ET says:
        
        Thursday Sep 19, 2013 at 9:55 pm
        
        I’m liking this worst-case time of O(n*logn), plus the O(n) space. *
        Huzzah for Python!
        
        * According to Wikipedia, which will soon become the real-world Hitchiker’s Guide, just as soon as we can jam it all into a smartphone-sized disk… ^^;
        
        Reply
  2. Peter H. Coffin says:
    
    Wednesday Sep 18, 2013 at 8:35 am
    
    “Right tool”, maybe not. “Good enough tool”, almost always. And while not every application is going to be running on a machine that’s connected and has a database available, the ones that are going to be dealing with millions of things that need to be in any order other than “arrival” generally will be. Even processing data in the Antarctic is going to happen at base camp on someone’s laptop, and that’s certainly enough horsepower to run a database instance.
    
    Reply
    1. guy says:
      
      Wednesday Sep 18, 2013 at 8:59 am
      
      That is the lovely thing about modern programming; you can always pick a badly sub-optimal solution and then throw enough power at it that no one cares.
      
      Reply
      1. Humanoid says:
        
        Wednesday Sep 18, 2013 at 8:14 pm
        
        (EDIT: I shouldn’t post just after getting out of bed and before reading)
        
        Reply
    2. zob says:
      
      Wednesday Sep 18, 2013 at 10:06 am
      
      This me nitpicking.
      
      He said:
      “The point is, the library sort is not always the right tool for the job”
      
      You said:
      “‘Right tool’, maybe not. ‘Good enough tool’, almost always”
      
      Since,
      “not always” = “almost always”
      Your rebuttal point seems to be about “right tool” vs “good enough tool”
      
      Reply
      1. burningdragoon says:
        
        Wednesday Sep 18, 2013 at 10:31 am
        
        Excuse me while I nitpick your nitpick >.>
        
        “not always” and “almost always” can certainly imply the same thing (and probably did these cases), “almost always” is still a more definitive phrase. Technically “never” can fall inside of “not always” (under a certain point of view mayhaps).
        
        Reply
    3. Blake says:
      
      Wednesday Sep 18, 2013 at 6:36 pm
      
      Sorting collisions in a game?
      
      Reply
Zak McKracken says:

Wednesday Sep 18, 2013 at 7:46 am

I just love how Bubblesort actually sounds like bubbles. Makes the name that much more obvious.

Reply
Stuart Hacking says:

Wednesday Sep 18, 2013 at 7:52 am

Bogosort sounds *exactly* as it should sound.

Reply
1. lethal_guitar says:
  
  Wednesday Sep 18, 2013 at 8:36 am
  
  I was wondering the whole time how this would sound with Bogosort while watching. And then.. :D
  
  Reply
2. krellen says:
  
  Wednesday Sep 18, 2013 at 9:04 am
  
  Is it wrong that I want to listen to more Bogosort?
  
  Reply
  1. Spammy says:
    
    Wednesday Sep 18, 2013 at 1:28 pm
    
    You know those videos where someone loops a song for an hour, or ten hours? I think I could listen to an hour of Bogosort. Maybe it would finally win if I did!
    
    Reply
3. swenson says:
  
  Wednesday Sep 18, 2013 at 10:44 am
  
  Yep. Bogosort is bestsort.
  
  Reply
4. Klay F. says:
  
  Wednesday Sep 18, 2013 at 3:17 pm
  
  What I love is that Bogo is Josh’s nickname now, and Bogosort is basically Reginald Cuftburt in music form. Its almost poetic.
  
  Reply
  1. Humanoid says:
    
    Wednesday Sep 18, 2013 at 8:28 pm
    
    I read the comments before watching the video, and I just assumed that Cuftbert was the originator rather than it being an actual thing.
    
    But nonetheless it’d accurately describe the method in which Josh searches for Dynamite in his inventory.
    
    Reply
  2. Humanoid says:
    
    Wednesday Sep 18, 2013 at 9:05 pm
    
    Also:
    
    while list does not look sorted:
    consume drugs and alcohol
    
    Eventually a sorted list will look indistinguishable from an unsorted list.
    
    Reply
    1. MrGuy says:
      
      Thursday Sep 19, 2013 at 6:24 pm
      
      If all you want is sorted output…
      
      for (i=1,i<=array.length,i++) {
      array[i] = array[i-1] + 1;
      }
      
      Reply
5. Piflik says:
  
  Thursday Sep 19, 2013 at 5:55 am
  
  I just read the Wikipedia page on BogoSort…I really like the Quantum BogoSort…’if the list is not in order, the universe is destroyed’ XD
  
  Reply
  1. Stuart Hacking says:
    
    Thursday Sep 19, 2013 at 7:52 am
    
    It’s entirely possible that this is already happening, and we just happen to inhabit the very lucky universe! ;-)
    
    Reply
    1. MrGuy says:
      
      Thursday Sep 19, 2013 at 6:44 pm
      
      Yeah, man. I’m Zaphod Beeblebrox!
      
      Reply
6. WillRiker says:
  
  Thursday Sep 19, 2013 at 10:03 am
  
  Go home Bogosort, you are drunk.
  
  Reply
HiEv says:

Wednesday Sep 18, 2013 at 8:07 am

Huh… Whadda ya’ know. The sort algorithm I “invented” in college has a name: cocktail shaker sort. I did it slightly differently than the visualization, sorting both directions simultaneously instead of back and forth. Since it’s just two bubble sorts running on the same data at the same time, but from opposite directions, I called it the “double-bubble sort”.

Reply
1. Noumenon says:
  
  Wednesday Sep 18, 2013 at 9:12 am
  
  You’ve definitely got a knack for nomenclature.
  
  Reply
2. Volfram says:
  
  Wednesday Sep 18, 2013 at 9:37 am
  
  I’m always conflicted when that happens to me. On the one hand, “Yay, I arrived at the same conclusion as a known very-smart-person.” On the other hand, “Boo, I didn’t create anything new, and probably could have saved some time by stealing their work instead.”
  
  Reply
3. newplan says:
  
  Wednesday Sep 18, 2013 at 5:05 pm
  
  Could be named better.
  
  It’s a bubble sort but it goes from both ends, right?
  
  You should call it the bubble butt sort.
  
  Reply
Nathon says:

Wednesday Sep 18, 2013 at 8:25 am

I’m curious, Shamus, if you’ve been exposed to big-O notation. I know you weren’t formally educated in the computer sciences, but you do know what you’re doing.

I have to admit that I have implemented bubble sort for production code (lists of length < 16 needed sorting and I couldn't use most libraries).

Reply
1. SAJ14SAJ says:
  
  Wednesday Sep 18, 2013 at 12:31 pm
  
  Yes, there *are* actually situations when bubble sort is optimal, and that would be one. The other, extremely common one, is when a quicksort partition goes under a critical size, it is actually faster to finish it with bubblesort.
  
  Reply
Daemian Lucifer says:

Wednesday Sep 18, 2013 at 9:08 am

Well this video proves it:Quick sort is the best ~~sounding~~ algorithm of them all.

Reply
1. krellen says:
  
  Wednesday Sep 18, 2013 at 12:59 pm
  
  I dunno, I’m pretty partial to the way least-significant digit radix sounds myself.
  
  Reply
  1. Volfram says:
    
    Thursday Sep 19, 2013 at 8:22 am
    
    During the last couple of cycles before the final sort anyway, yeah. I love the ascending chords.
    
    Reply
James Schend says:

Wednesday Sep 18, 2013 at 9:39 am

That was one of the demo apps on my Commodore-64. Lower resolution of course. ;)

Reply
Adam says:

Wednesday Sep 18, 2013 at 11:57 am

From Youtube comments:

“Damn Bogo sort, get it together.”

Reply
1. DIN aDN says:
  
  Wednesday Sep 18, 2013 at 12:58 pm
  
  I feel strangely empathetic towards poor old bogo sort. I mean, it looks like it’s trying so hard, and all it’s realistically accomplishing is wearing out the data storage medium it’s working on.
  
  Reply
Jeff R. says:

Wednesday Sep 18, 2013 at 12:04 pm

I’ve not kept up with such things, but how parallelism impact sort? (Some of those divide-and-conquer type sorts look like they could benefit from multiple threads quite a bit…)

Reply
1. 4th Dimension says:
  
  Wednesday Sep 18, 2013 at 12:52 pm
  
  Yeahm now that you mention it. I guess all divide and conquer sorts taht are based on recursion, probably could be paralleled, by going step or two (ro more into the recursion and splitting the array that way in however many arrays you need (less than core count).
  Although you’ll still need to merge the individually sorted sub arrays in one thread.
  
  Reply
  1. Jeff R. says:
    
    Wednesday Sep 18, 2013 at 1:24 pm
    
    Possibly two threads? I think a merge working from both ends towards the middle is do-able. Wouldn’t guarantee perfectly 2x speed, but better than a single thread.
    
    Reply
    1. 4th Dimension says:
      
      Wednesday Sep 18, 2013 at 4:39 pm
      
      Now I might be wrong, but you can not do it in two threads. At least you can not merge the result using two threads because simply you can not know excatly where in the array will your subarry fit. So you will still need to lock up the array while you are inserting your value. And during the insert other thread can not try to find where to put the next value since the array is undergoing non atomic changes at the time.
      
      What you can do say you have 4 cores. You can first split the array into 4 parts for 4 threads one for each core. When they complete, you can merge two of the parts each with 2 threads. And the final merge of two subarrys does one thread. Sort of threaded recursion.
      
      Reply
      1. Jeff R. says:
        
        Wednesday Sep 18, 2013 at 7:12 pm
        
        I think that as long as you know the sizes of your two subarrays (which you should), you can do it without collision, having thread one build the lower half of the new array from the lower halves of the subarrays from the bottom while thread two builds the upper half of the new array from the top. Neither thread should ever be trying to write or delete anything the other one will ever want to have.[1] There’s no logical reason why you can’t do the locking semaphores on array elements rather than the entire array.
        
        [1] towards the end of the process [or earlier on perverse data] the lower thread may claim the top element of one subarray or the upper thread the bottom of one subarray, but even then, the other thread will never actually want to claim and delete that item, just look at in once a cycle as long as it exists. Which can easily be managed in a semaphore.
        
        Reply
        
        4th Dimension says:
        
        Thursday Sep 19, 2013 at 12:57 am
        
        The problem with that way is that you are making an asumption that the ranges of two arrays are similar, so that bottom halfs of the array will comprise the bottom of the master array while the top parts will make the top of the master array. While in reality things are much more random. You can have one array comprised of mostly lower values of master array, and than as a MAX it has MAX of the entire master array.
        
        Simply put before you sort the array.
        
        Oh and neither thread will ever delete anything. The problem stems from the fact that they will be inserting (not add, since that would require them to communicate their current values and for one thread to wait while the other fills in all the lower values (in case of sorting smaller to larger)) values. And that means they will be shifting already inserted ones. So the other thread must not read during the insert. It can not use that time to figure out whee they’ll place their value.
        
        On the other hand, I would love to see this implemented and be proven wrong ;)
        
        Reply
        
        guy says:
        
        Thursday Sep 19, 2013 at 8:09 am
        
        If they get desynchronized, I can see one thread reading in new values when the other thread has picked two to swap and is halfway through swapping, so a value is temporarily not present in the array and another is doubled.
        
        Any routine that will land numbers in the right half of the array and then sort each half could be split into multiple threads at each pass, assuming that you don’t have a problem accessing the same array at different locations with multiple threads. Should be safe, but an overzealous compiler might not let you. I guess you could have an actual problem in certain architectures if the array might get components relocated, but usually that won’t happen.
        
        If you do have to instantiate new arrays, I don’t think you can reinsert in multiple threads, but copying between arrays is O(N) anyhow.
        
        Reply
        
        Jeff R. says:
        
        Thursday Sep 19, 2013 at 10:39 am
        
        I was forgetting that we were talking about in-place, but I still think it can work that way. I wasn’t planning on doing insertions, only swaps. The low thread looks at the bottoms of both arrays and either marks the absolute lower one’s as good and increments that pointer or else swaps the bottom elements. The high thread does the same with the tops. There is no possible case in which the two threads want to do a swap involving the same element (other than the debugging case where the two arrays being merged aren’t actually properly sorted.)
        
        The case you’re mentioning is an avoidable edge case near the end; we can do special behavior when either array has less than 3 unmerged elements in it and keep that safe.
        
        Reply
        
        Jeff R. says:
        
        Thursday Sep 19, 2013 at 1:00 pm
        
        Wait, nevermind; the swap-based merge I was thinking about doesn’t work at all, parallel or no. So I’ll fall back to “merge to a new array” being doable in two threads safely. Copying the result back is only O(n) and is itself the most parallelizable task imaginable, so that’s still good.
        
        Reply
2. guy says:
  
  Wednesday Sep 18, 2013 at 6:25 pm
  
  It would be tricky. An in-place sort cannot be done in multiple threads on the first pass, because there is a non-zero chance they’ll access the same memory location simultaneously or in the wrong order. Once it gets to the point where you can break the array into two parts and are sure every number is in the right half, then you can split it (assuming it’s possible to access different locations in the same array from different threads simultaneously without causing errors, which I’m honestly not sure about) and work on each half.
  
  If you actually create a new array at each recursion, it would be much easier until you run out of memory. You could combine the two, creating new arrays on the first pass and then sorting them in place in separate threads, but then you have to write more code.
  
  Reply
3. Nentuaby says:
  
  Wednesday Sep 18, 2013 at 8:19 pm
  
  “Bitonic Sort” from that video is an algorithm particularly designed for massively-parallel implementation. The reason it sorts things in such a crazy fashion is that a bunch of independant comparators are doing their thing at once, not talking to each other.
  
  Reply
Jeff says:

Wednesday Sep 18, 2013 at 1:50 pm

Shamus, I think you have too many sorts there:

“Shamus, why are there sort many different sort algorithms?””

That should probably be a “so”.

Reply
Knut says:

Wednesday Sep 18, 2013 at 1:58 pm

Heap sort sounds like a descent into madness

Reply
Ravens Cry says:

Wednesday Sep 18, 2013 at 3:32 pm

It was like listening to an early arcade, all the boobs and bips.

Reply
Brandon says:

Wednesday Sep 18, 2013 at 8:22 pm

“Programmer time is precious […]”

Not according to managers! They are more than happy wasting programmer time in meetings and other trivial BS.

Or they are happy keeping us working on band-aid fixes for crappy old software instead of building its replacement.

… sigh.

Reply
1. Richard says:
  
  Thursday Sep 19, 2013 at 3:20 am
  
  Actually, to some extent “band-aid fixes” are pretty much the right thing to do when you have mostly-working software.
  
  The reason is twofold and quite simple:
  1) Your mostly-working software has the bugfixes for weird and rare cases. Those bugs took a long time to find and some time to squash.
  If you redo-from-scratch you’ve lost all that work, and you’ll make at least some of the same original errors when rebuilding.
  
  2) You need to sell product.
  You can continue to sell the mostly-working software and make money while improving it, and your existing customers get the improvements as you do them.
  If you redo-from-scratch then there will be a long period where you’re not improving the old software and not selling the new product either.
  Your existing customers will complain about buggy software that’s not getting fixed,and your potential new customers will think “That software is buggy and they’re not fixing it” and buy something else.
  
  If it’s a spaghetti mess, the right thing to do is to slowly re-architect the software while keeping it working.
  
  Reply
Jock says:

Wednesday Sep 18, 2013 at 10:30 pm

Could you do a follow-up post that goes over what each of those sorts does, and why each might be picked over other sorts? I’m familiar with some of them, but others are new to me, and you have a way of putting this stuff in an easy to digest format.

Reply
Diego says:

Thursday Sep 19, 2013 at 1:30 am

I clicked on that video. An hour and a half later I was watching cellular automata videos, then some robots, neural networks… when I started closing everything I encountered this article again! “ah! that’s what I was doing”

Reply
1. Trix2000 says:
  
  Thursday Sep 19, 2013 at 11:14 pm
  
  You’re not alone, though I managed to cut my time to ~45 minutes.
  
  Reply
Piflik says:

Thursday Sep 19, 2013 at 5:48 am

Reminds me of this XD

I like PanicSort…

Reply
Retsam says:

Thursday Sep 19, 2013 at 11:39 am

I was sort of hoping to see my favorite not-so-serious sorting algorithm come up, the Stooge Sort, which is as follows:

1) Recursively Stooge Sort greatest 2/3rds of array.
2) Recursively Stooge Sort least 2/3rds of array.
3) Recursively Stooge Sort greatest 2/3rds of array.

Voila, sorted array. Wikipedia informs me that the complexity of this sort is: O(nlog 3 / log 1.5 ) = O(n^2.7095…).

Reply
1. Ilseroth says:
  
  Thursday Sep 19, 2013 at 7:28 pm
  
  Well then, wish granted: http://www.youtube.com/watch?v=vIDkfrSdID8
  
  Stooge sort visualization.
  
  Been a long time since programming class, all I remember is the professor going over a list of sorting algorithms and when she got to Bubble, she said “don’t use this, and if you think you need to use this, find another way to do it.”
  
  Reply
  1. Volfram says:
    
    Thursday Sep 19, 2013 at 9:24 pm
    
    That is incredibly dramatic-sounding.
    
    Reply
Peter says:

Thursday Sep 19, 2013 at 7:58 pm

A great demonstration of sorting agorithms can be seen here:

http://www.sorting-algorithms.com/

There the number of items are similar, and the initial set has different starting cases to see how different sorts behave depeneding on the input.

Reply
1. Shamus says:
  
  Thursday Sep 19, 2013 at 8:29 pm
  
  That’s a VERY cool demonstration. Thanks.
  
  Reply
luagha says:

Friday Sep 20, 2013 at 11:37 pm

Ah, but a distributed bubblesort is one of the fastest.

I was reading an article where someone was organizing participants at a dog show. He had dogs with owners on leash, each having a number, all in a line, in no particular order.

He said, “Everyone look to the person on your left. If your number is higher than theirs, swap places.”

In less than five minutes, the line was sorted.

Reply
Neil Harding says:

Saturday Dec 28, 2013 at 2:31 pm

Radix sorting was good on the C64 on my Sprite Multiplexer (you have to sort by the Y position, so that you can make the 8 sprites act as 24, by reusing them further down the screen). On a 1Mhz processor, every cycle counts :)

Reply

Thanks for joining the discussion. Be nice, don't post angry, and enjoy yourself. This is supposed to be fun. Your email address will not be published. Required fields are marked*

You can enclose spoilers in <strike> tags like so:
<strike>Darth Vader is Luke's father!</strike>

You can make things italics like this:
Can you imagine having Darth Vader as your <i>father</i>?

You can make things bold like this:
I'm <b>very</b> glad Darth Vader isn't my father.

You can make links like this:
I'm reading about <a href="http://en.wikipedia.org/wiki/Darth_Vader">Darth Vader</a> on Wikipedia!

You can quote someone like this:
Darth Vader said <blockquote>Luke, I am your father.</blockquote>

T w e n t y S i d e d