Skip to content

The heapsort algorithm

6.4-1

Using Figure 6.4 as a model, illustrate the operation of HEAPSORT on the array $A = < 5, 13, 2, 25, 7, 17, 20, 8, 4 >$.

6.4-1

6.4-2

Argue the correctness of HEAPSORT using the following loop invariant:

At the start of each iteration of the for loop of lines 2-5, the subarray $A [ 1 : i ]$ is a max-heap containing the $i$ smallest elements of $A [ 1 : n ]$, and the subarray $A [ i + 1 : n ]$ contains the $n - i$ largest elements of $A [ 1 : n ]$, sorted.

  • Loop invariant: At the start of each iteration of the for loop of lines 2-5, the subarray $A [ 1 : i ]$ is a max-heap containing the $i$ smallest elements of $A [ 1 : n ]$, and the subarray $A [ i + 1 : n ]$ contains the $n - i$ largest elements of $A [ 1 : n ]$, sorted.
  • Initialization: $i=n$, the subarray $A [ 1 : i ]$ is a max-heap containing the $i$ smallest elements of $A [ 1 : n ]$, since BUILD-MAX-HEAP(A,n) is call prior to the for loop. The subarray $A [ n + 1 : n ]$ contains the $0$ largest elements of $A [ 1 : n ]$, sorted.
  • Maintenance: During the $i$th iteration, $A[ 1 ]$ ,which is the largest element of $A [ 1, i ]$, is exchanged with $A[ i ]$. Since $A [ i + 1 : n ]$ contains the $n-i$ largest elements of $A [ 1 : n ]$, sorted. $A [ i : n ]$ contains the $n-i+1$ largest elements of $A [ 1 : n ]$, sorted.After that, MAX-HEAPIFY(A,1) reheapify $A[ 1 : i-1 ]$, since the childs of $A [ 1 ]$ is the root of a max-heap. Decreasement of $i$ holds the Invariant.
  • Termination: The procedure terminates when $i = 1$. At that time, $A[ 1 : 1 ]$ is a max-heap containing the $1$ smallest elements of $A [ 1 : n ]$, and the subarray $A [ 2 : n ]$ contains the $n - 1$ largest elements of $A [ 1 : n ]$, sorted.

6.4-3

What is the running time of HEAPSORT on an array $A$ of length $n$ that is already sorted in increasing order? How about if the array is already sorted in decreasing order?

Whether it is in increasing order or in decreasing order, BUILD-MAX-HEAP takes $O(n)$.

Let $T(n)$ be the running time of HEAPSORT on an array $A$ of length $n$.

$$ \begin{aligned} T(n) & = O(n) + \sum_{k=1}^{n-1}\lg k\cr & = O(n) + \Theta(n\lg n)\cr & = \Theta(n\lg n)\cr \end{aligned} $$

6.4-4

Show that the worst-case running time of HEAPSORT is $\Omega (n \lg n)$.

See 6.4-3

$\star$ 6.4-5

Show that when all the elements of A are distinct, the best-case running time of HEAPSORT is $\Omega (n\lg n)$.

Heapsort appeared in 1964, but the lower bound was proved by Schaffer and Sedgewick in 1992.

Let's assume that the heap is a full binary tree with $n = 2^k - 1$. There are $2^{k - 1}$ leaves and $2^{k - 1} - 1$ inner nodes.

Let's look at sorting the first $2^{k - 1}$ elements of the heap. Let's consider their arrangement in the heap and color the leaves to be red and the inner nodes to be blue. The colored nodes are a subtree of the heap (otherwise there would be a contradiction). Since there are $2^{k - 1}$ colored nodes, at most $2^{k - 2}$ are red, which means that at least $2^{k - 2} - 1$ are blue.

While the red nodes can jump directly to the root, the blue nodes need to travel up before they get removed. Let's count the number of swaps to move the blue nodes to the root. The minimal case of swaps is when

  1. there are $2^{k - 2} - 1$ blue nodes and
  2. they are arranged in a binary tree.

If there are $d$ such blue nodes, then there would be $i = \lg d$ levels, each containing $2^i$ nodes with length $i$. Thus the number of swaps is,

$$ \sum_{i = 0}^{\lg d}i2^i = 2 + (\lg d - 2)2^{\lg d} = \Omega(d\lg d) $$

And now for a lazy (but cute) trick. We've figured out a tight bound on sorting half of the heap. We have the following recurrence:

$$ T(n) = T(n / 2) + \Omega(n\lg n) $$

Applying the master method, we get that $T(n) = \Omega(n\lg n)$.