An Introduction To Data Structures in Python: Heaps

Heaps are tree-like data structures used in many programming languages. Here we will be focusing on their implementation in Python. Heaps have many uses, including in Dijkstra’s algorithm for finding the shortest path and priority queues. An important characteristic of heaps is how fast they are. The time to view the max item in the heap is O(1) time, which is almost instant. To insert or remove the max item in the heap, it takes O(log n) time, which is also extremely fast.

Min and MaxHeaps are nearly identical, differing only in the order of increasing magnitude of their elements. In MaxHeaps, the highest value is at the top (the max), and each node is greater than its two child nodes. In MinHeaps, the order is reversed, where each parent node is smaller than either of its two child nodes. For the sake of simplicity, this blog will only cover MaxHeaps, but all concepts can be applied to MinHeaps without much variation.

Similarly to lists and queues, heaps can be implemented in Python using lists. When creating a list from a heap, we read and assign indices from left to right, top to bottom. So for the above heap, 8 would be index 1, 7 would be 2, 6 would be 3, 5 would be 4, and so on.

In Python, list indices start at zero, so to ensure our max is at index 1, we set the first value of our list to 0 and ignore it. The reason heap indices start at 1 instead of zero is so we can more easily access parent and child nodes using simple arithmetic. To find the index of a node’s parent node, we simply divide its index by 2. The index of the parent node of 5, whose index is 4, is 4/2. 7 is at index 2. Similarly, to find a node’s left child node, we multiply its index by 2. To find the right child, we multiply the index by 2, then add 1.

Heaps use the same methods as stacks and queues — push, pop, and peek — however, the implementation of these methods is more complicated. Peek is easy. We simply observe the item at list index 1.

Pushing an item to the heap is the second easiest. When we want to add an item to a heap, we first add it to the very last position in the heap. We then ‘float up’ the item, by comparing it to the nodes above it. If it’s larger than its initial parent node, we swap the positions of the two nodes. We keep doing this until its parent node is larger.

The most difficult operation is popping, in which we remove the max item. The first step in this process is to swap the max with the last item in the heap, then remove the new last item. Next, we have to “bubble down” the new max to its proper position in the heap.

The complete wrapper class for a MaxHeap is as follows:

Incoming Freshman @ NYU Stern