A heap is the data structure you reach for whenever you repeatedly need the smallest or largest element of a collection that keeps changing. A priority queue is the abstract idea — "give me the highest-priority item next" — and a binary heap is how it's almost always implemented. Recognizing the heap pattern on sight is one of the highest-leverage skills for a coding interview, because the same shape covers a whole family of "top K" and "merge" problems.
What a heap actually gives you
A binary heap keeps its minimum (min-heap) or maximum (max-heap) at the root. The trade you're buying is this: peeking the extreme element is O(1), while inserting or removing an element is O(log n). Compare that to a sorted array, where peeking is cheap but insertion is O(n), or an unsorted array, where insertion is cheap but finding the extreme is O(n). A heap is the sweet spot when both operations happen many times.
| Operation | Heap | Sorted array | Unsorted array |
|---|---|---|---|
| Peek min/max | O(1) | O(1) | O(n) |
| Insert | O(log n) | O(n) | O(1) |
| Remove extreme | O(log n) | O(1) / O(n) | O(n) |
| Build from n items | O(n) | O(n log n) | O(n) |
When to recognize this pattern
Learn these tell-tale signals and you'll know when to use a heap before you've finished reading the prompt:
- "Top K" / "K largest" / "K smallest" — the canonical top K elements pattern.
- "K closest" points, numbers, or words to a target — closeness is just a priority key.
- "Merge K sorted" lists or arrays — the heap always holds the next candidate from each list.
- "Median of a data stream" or any running order statistic — two heaps balance around the middle.
- Scheduling / "next event" — task schedulers, meeting rooms, and Dijkstra all pull the minimum next.
Min-heap vs max-heap
The single decision that trips people up in a priority queue interview is heap polarity. The rule is counter-intuitive: to keep the K largest elements, use a min-heap of size K. The smallest of your current top K sits at the root, so when a bigger element arrives you pop that root in O(log K) and push the newcomer. Symmetrically, to keep the K smallest, use a max-heap of size K.
Python's heapq is a min-heap only. To simulate a max-heap, push negated values (or tuples whose key is negated) and negate again when you read them out.
import heapq
# min-heap (default)
h = []
heapq.heappush(h, 5)
heapq.heappush(h, 1)
heapq.heappush(h, 3)
heapq.heappop(h) # -> 1 (smallest)
# max-heap: push negated values
mh = []
for x in [5, 1, 3]:
heapq.heappush(mh, -x)
-heapq.heappop(mh) # -> 5 (largest)
# build a heap from a list in O(n)
nums = [9, 4, 7, 1, 2]
heapq.heapify(nums) # nums is now a valid min-heap
Worked examples
1. Kth largest element (the top K core)
Problem: Return the K-th largest element in an array.
Keep a min-heap bounded to size K. After scanning every element, the root is the K-th largest because exactly K-1 larger elements sit above it.
import heapq
def find_kth_largest(nums, k):
heap = []
for x in nums:
heapq.heappush(heap, x)
if len(heap) > k:
heapq.heappop(heap) # drop the smallest so far
return heap[0] # K-th largest
O(n log K) time, O(K) space — better than sorting's O(n log n) whenever K is much smaller than n. Python also offers heapq.nlargest(k, nums), which uses exactly this technique under the hood.
2. K closest points to the origin
Problem: Given a list of points, return the K closest to (0, 0).
"Closest" is a priority. We want the K smallest distances, so keep a max-heap of size K (negate the distance in Python) and evict the farthest whenever the heap overflows.
import heapq
def k_closest(points, k):
heap = []
for x, y in points:
dist = x * x + y * y # no sqrt needed for ordering
heapq.heappush(heap, (-dist, x, y))
if len(heap) > k:
heapq.heappop(heap) # remove the farthest
return [[x, y] for _, x, y in heap]
O(n log K) time, O(K) space. Squaring instead of taking the square root keeps the comparison exact and cheap — a small detail interviewers notice.
3. Merge K sorted lists
Problem: Merge K sorted linked lists into one sorted list.
Seed the heap with the head of each list. Repeatedly pop the global minimum and push that node's successor. The heap never holds more than K nodes, so each pop is O(log K).
import heapq
def merge_k_lists(lists):
heap = []
for i, node in enumerate(lists):
if node:
# i is a tie-breaker so heapq never compares nodes
heapq.heappush(heap, (node.val, i, node))
dummy = tail = ListNode(0)
while heap:
val, i, node = heapq.heappop(heap)
tail.next = node
tail = node
if node.next:
heapq.heappush(heap, (node.next.val, i, node.next))
return dummy.next
O(N log K) time where N is the total number of nodes, O(K) space. The index i is essential: without a unique tie-breaker, heapq would try to compare ListNode objects when two values are equal and raise a TypeError.
4. Find median from a data stream
Problem: Support add_num(x) and find_median() over a growing stream.
Balance two heaps: a max-heap (low) for the smaller half and a min-heap (high) for the larger half. Keep their sizes within one of each other, and the median is at the root(s).
import heapq
class MedianFinder:
def __init__(self):
self.low = [] # max-heap (store negatives)
self.high = [] # min-heap
def add_num(self, x):
heapq.heappush(self.low, -x)
# move the largest of low into high
heapq.heappush(self.high, -heapq.heappop(self.low))
# rebalance so low is never smaller than high
if len(self.high) > len(self.low):
heapq.heappush(self.low, -heapq.heappop(self.high))
def find_median(self):
if len(self.low) > len(self.high):
return -self.low[0]
return (-self.low[0] + self.high[0]) / 2
Each add_num is O(log n); find_median is O(1). Two balanced heaps are the standard answer to any "running median" or "order statistic over a stream" question.
Common mistakes
- Wrong polarity. Using a max-heap for "top K largest" evicts the largest instead of the smallest. Top K largest needs a min-heap; top K smallest needs a max-heap.
- Forgetting to bound the heap. If you never pop when the heap exceeds K, you silently downgrade O(n log K) to O(n log n) and O(K) space to O(n).
- Comparing payloads. When two priority keys tie,
heapqfalls through to comparing the next tuple element. Always include a unique counter or index so it never compares un-orderable objects. - Reaching for a heap when sorting is simpler. If you need all elements in order, just sort — a heap only wins when K is small or the data arrives as a stream.
Spot the pattern instantly with live AI support
CoPilot Interview surfaces the optimal pattern and a working solution with Big-O in about 4 seconds during real Zoom and Teams calls. Free for Windows and macOS.
Download freeFAQ
When should I use a heap in a coding interview?
Whenever you repeatedly need the smallest or largest element of a changing collection. The tell-tale phrases are 'top K,' 'K closest,' 'K largest/smallest,' 'merge K sorted lists,' 'median of a stream,' and many scheduling or 'next event' problems. A heap gives you the extreme element in O(1) and insert/remove in O(log n).
Should I use a min-heap or a max-heap for top K?
For the K largest elements, keep a min-heap of size K: the smallest of your current top K sits at the root, so you can cheaply evict it when a bigger element arrives. For the K smallest, use a max-heap of size K. Python's heapq is a min-heap, so simulate a max-heap by pushing negated values.
What is the time complexity of the top K elements pattern?
Maintaining a heap of size K over n elements is O(n log K) time and O(K) space. That beats sorting the whole array, which is O(n log n), whenever K is much smaller than n.
How does Python's heapq work?
heapq turns a list into a binary min-heap in place. heappush and heappop are O(log n), heap[0] peeks the minimum in O(1), and heapify builds a heap from a list in O(n). For a max-heap, push negative values or tuples with a negated key.
What is the most common heap interview mistake?
Choosing the wrong heap polarity, so you evict the wrong end, and forgetting to bound the heap to size K, which silently turns an O(n log K) solution into O(n log n). Tie-breaking is another trap: when keys collide, push a tuple with a unique counter so heapq never compares the payload objects.