Marat BN — Marat Borisovich Nepomnyashy's home page

Why Sort Algorithms Are Named the Way They’re Named

Created on 2016-05-01

I’m writing this blog in the course of reviewing the major sorting algorithms, to better remember how they work by understanding why they are named the way they are, and to also help me memorize their names, by forming associations with related concepts.

Besides the Wikipedia entries, I wrote this blog based on the information at these URLs:

In this blog I’m going to discuss the 7 major sorting algorithms discussed at these URLs that all the other sorting algorithms are derived from. When these 7 are memorized, then the others are a lot easier to remember by associating them with the one they are derived from.

The videos below are not from me, but from the YouTube user Timo Bingmann.

Bubble Sort

Likely the simplest, and also most tedious and mundane way to sort, at least when sorting objects. The objects are swapped until they are in correct order.

People tend to use it in the physical world to manually sort very small quantities of physical objects, and to also sort nearly identical items kept on racks, such as sets of hand tools of gradual sizes. It is used by people is to arrange themselves in a line ordered by height (or by letters of their names), in which case it is the most efficient method when used even by a large number of people without any other organization; however in that case there is no one-by-one iteration as in a computer algorithm, since each person moves themselves independently and simultaneously with others.

In a software algorithm, as the list is iterated, every time the next value is less than the previous value, the order of these values is swapped. After each iteration, the list is re-iterated again and again, until all the values are in the correct order.

Average performance: O(n^2)

Called bubble sort because in a histogram representation, or when used to sort physical objects of varying height, the gaps created by the small values nested between larger values will have the appearance of bubbling to the front of the list by shifting once with each iteration.

(I would call it the “rearrange-by-height” sort, or “short people bubbling through queue” sort.)

https://www.youtube.com/watch?v=Cq7SMsQBEUw
Insertion Sort

The way most people manually sort moderate amounts of physical objects, when they are faced with that task. Items are picked at random out of the shrinking unsorted pile, and carefully arranged in a growing thus-sorted list.

In a software algorithm, the main iteration starts at the second position, as the first position is considered to be already sorted. As the main iteration advances, each new value is inserted into its correct sorted position among the previously sorted values. The correct position for each new value is determined by a nested iteration that goes through the previously sorted values each time a new value needs to be inserted.

Approximate performance: O(n^2)

Called insertion sort because each newly sorted value is inserted between the previously sorted values.

(I would call it the “physical-object-insert” or “insert card between its neighbors” sort.)

https://www.youtube.com/watch?v=8oJS1BMKE64
Selection Sort

Involves finding / selecting either the largest (or the smallest) value out of a collection of values, and moving it to append (or prepend) to a thus-sorted growing list of previously selected values. The value is not picked at random as in the insertion sort.

In a software algorithm, the list is iterated to find either the smallest or the largest value, which is then placed either to the front (for the smallest) or to the back (for the largest). To not iterate through the placed value again, the next iteration will begin at the next position (from the smallest value) or end at the previous position (from the largest value).

Software implementations of this algorithm might be considered simpler than for the insertion sort, because the previously sorted values do not need to be shifted, and while there are multiple iterations, there is no nested iteration.

Approximate performance: O(n^2)

Perhaps the closest analogy to this algorithm in the physical world would be the way a mechanic might oil bearings, applying the lubricant to the squeakiest first, and then on to the next most squeakiest, or a typical order in which organization members get promoted or demoted based on their rank, even though not all bearings always get oiled, and not all members get promoted or demoted. Another analogy would be the way people may eat certain collections of food items, eating the lesser-tastiest before the more-tastiest, to get the best after-taste at the conclusion of their meal, but in that case the tastiest rank is subjective, and the items get destroyed. So there is no clear analogy, and as finding one item among many is tedious and time-intensive, in most circumstances people prefer to sort physical objects by the insertion sort.

As this sort is inefficient for both people and computers, the only way I can envision it ever being used other than for a demonstration is by somebody condemned to program specifically in line-numbered BASIC on an Apple ][e or an 8088 IBM PC, because it is simpler to implement than the insertion sort, and also slightly more efficient than the bubble sort (because the quantity of unsorted values that must be re-iterated decreases by one with each iteration).

The selection sort is important to know because it is the foundation for the powerful heap sort that will be discussed below.

Called selection sort because the smallest or the largest value is selected from among all the other values.

(I would call it the “squeakiest bearing first” sort, “promotion/demotion by rank” sort, “berry-eating” sort, “BASIC on an Apple ][e / 8088 IBM PC condemnee” sort, or “actually not something you’d want to select” sort, and “do NOT select” sort.)

https://www.youtube.com/watch?v=92BfuxHn2XE
Quicksort / Partititon-Exchange Sort

The first advanced sorting algorithm discussed in this blog. There are no analogs to it in the physical world.

A special position called the pivot is chosen in the list, usually in the middle. This partitions the list into the left side and the right. The left side is then iterated once to find any values greater than the pivot, which are then moved to the right side. Likewise, the right side is iterated once to find any values less than the pivot, which are then moved to the left side. This way no value on the left side is greater than the pivot, and no value on the right side is less than the pivot. The algorithm is then applied again recursively to the right and left sides to sort both of them. After the recursion, and all its child, grand-child, grand-grand-child, etc. sub-recursions return, the list will be entirely sorted.

Approximate performance: O(n * log(n))

To the best of my knowledge, called quicksort because it is quicker than all the simple sorting algorithms (bubble sort, selection sort, insertion sort) described above, and also quicker than some advanced sorting algorithms depending on the algorithm and the data sorted.

Called partition-exchange sort because the list is partitioned and the values in the partitions are exchanged so that in one partition they are less, and in the other partition they are greater, than the pivot value.

https://www.youtube.com/watch?v=8hEyhs3OV1w
Merge Sort

Another recursive sort algorithm similar to quicksort. However, this time there is no pivot position, the list is just split into 2 halves, each of which is then recursively sorted. The 2 now sorted sub-lists are merged together into the newly fully-sorted full list.

The reason why it is possible to ultimately produce sorted lists just by continuously splitting them, is because ultimately numerous splits will create single-element “lists”, and such “lists” are inherently sorted. When a single-element “list” is “merged” with another single-element “list”, the resultant 2-element list will become sorted. The actual comparison of values takes place only during the merging to determine the order in which they are added onto the output list.

Approximate performance: O(n * log(n))

Called merge sort due to the merging of the recursively-sorted sub-lists.

The approximate performance is only a rough indicator, and this algorithm is generally known to be not quite as quick as the quicksort. (Whereas in quicksort the recursively-sorted sub-lists are concatenated on either side of the pivot rather than merged.)

https://www.youtube.com/watch?v=ZRPoEKHXTJg
Heapsort

Similar to the selection sort in the sense that the final sorted list is strung together by either increasingly larger values or smaller values; however, the values are not selected from the list, but extracted from a sorted binary tree structure (also known as a heap) that is initially assembled with the values from the list. The actual sorting is done in the course of assembling the heap structure.

Approximate performance: O(n * log(n))

The performance is better than for selection sort due to the efficiency inherent in building the heap data structure rather than linearly searching for the value to select. The worst-case performance is also O(n * log(n)), while the worst-case performance for quicksort is O(n^2). But requires more memory than quicksort (to build the heap).

Called heapsort because the values are sorted by assembling a heap with them, which is a sorted binary tree.

https://www.youtube.com/watch?v=_bkow6IykGM

The video of the heap sort is longer than the video of the selection sort because it plays tones corresponding to the process of building the heap, and in actuality that process takes much less time than portrayed in the video. In the videos, the number of comparisons for the heap sort is listed as 1048, while for the selection sort it is 4950, and the number of array accesses is 3279 vs 9771.
Radix sort

Unlike all the previously discussed algorithms, this one does not do any actual comparisons between the values to be sorted. The values are sorted by arranging them into “buckets” based on their digits or letters.

The number of “buckets” depends on the base of the values being sorted, which in mathematical terms is also known as their radix, hense radix sort. If the values sorted are decimal digits, then the number of “buckets” will be 10, because that’s the base / radix of decimal numbers. The buckets will be labeled 0 through 9. Conversely, if the values are all only same-case letters of the English alphabet, then the number of “buckets” will be 26, for the number of letters in the English alphabet, and the buckets will be labeled A-Z. If the values are raw 8-bit bytes, then the number of “buckets” will be 256, which is the number of distinct values that can be represented by a byte, and the labels will be 0-255.

The process is comprised of a number of iterations, with each single iteration arranging the values based only by the digit / letter at a particular location within the values, starting either from the least significant digit or the most significant letter, depending on if the values sorted are analyzed right to left such as numbers, or left to right such as words. After each such iteration, the values are collected from the buckets starting from the lowest-labeled bucket and sweeping to the highest-labeled bucket, but emptying each bucket from the bottom in FIFO order. In that sense these logical “buckets” are not like physical buckets which are emptied in LIFO order when filled with layered material. The removed values are appended onto an intermediate list, which is then fed as the input for the next iteration. The total number of iterations will be the number of digits / letters in the largest number / string.

Here’s an example: Given the following random numbers:

52, 77, 29, 48, 1, 80, 94, 9, 49, 4, 34, 17, 44, 26

First iteration in which the least-significant digit is considered yields the following 10 buckets:
```
        0: 80
        1: 1
        2: 52,
        3: empty,
        4: 94, 4, 34, 44
        5: empty
        6: 26
        7: 77, 17
        8: 48
        9: 29, 9, 49
      
```
These buckets are then emptied starting from bucket 0 and progressing to bucket 9, and that yields this new intermediate list:

80, 1, 52, 94, 4, 34, 44, 26, 77, 17, 48, 29, 9, 49

The second (and final iteration in this case because the largest value is 2 digits) yields the following 10 buckets based on the next-significant digit:
```
        0: 1, 4, 9
        1: 17
        2: 26, 29
        3: 34
        4: 44, 48, 49
        5: 52
        6: empty
        7: 77
        8: 80
        9: 94
      
```
That yields this final sorted list:

1, 4, 9, 17, 26, 29, 34, 44, 48, 49, 52, 77, 80, 94

This demonstrates that the values did not have to be compared by less / greater than.

Average performance: O(n * b) (Where b is number of digits / letters in the largest number / string.)

This algorithm is more efficient than all of the comparison-based algorithms when: O(n * b) < O(n * log(n))

Called radix sort because the number of buckets is always equal to the radix of the values being sorted.

https://www.youtube.com/watch?v=LyRWppObda4