Merge K Sorted Lists
in the wild External merge sort, log-file consolidation, distributed query results — anywhere k sorted streams must merge into one sorted output.
Problem
You're given k linked lists, each already sorted in ascending order. Merge them all into one sorted linked list and return its head.
Examples
Input: [1 → 4 → 5,
1 → 3 → 4,
2 → 6]
Output: 1 → 1 → 2 → 3 → 4 → 4 → 5 → 6
Input: [] Output: (empty)
Input: [[]] Output: (empty)
Input: [1, 2, 3] (k = 1) Output: 1 → 2 → 3
Why this is the canonical "k-way merge with a heap" problem
A two-list merge is O(n). Naïvely repeating it k − 1 times runs total O(n·k) — every pass re-scans the growing merged list.
The smarter approach: at every step, the next element of the output is the smallest head across all k lists. That smallest-of-k query is exactly what a min-heap does in O(log k). So:
- Push the head of each non-empty list into a min-heap, keyed by node value.
O(k log k). - Pop the smallest head → append to the output.
O(log k). - If the popped node has a successor, push it.
O(log k). - Repeat until the heap is empty.
Total: each of the n total nodes contributes one pop and at most one push → O(n log k). The heap never holds more than k items.
This is k-way merge in its purest form. The same algorithm powers external merge sort (k = number of sorted runs that fit in memory), distributed query merging (k = number of shards), and log-merging tools (k = number of input log files).
Hints
Hint 1 — what's the next output element?
At any step, the next element of the merged output is the smallest value among the current heads of the k lists. How do you find the min of k values in less than O(k)?
Hint 2 — heap of pointers, not of values
You don't push node values into the heap — you push the nodes themselves, comparing them by value. That way, after popping a node, you can advance to node.next and push it back in.
Hint 3 — alternative: divide-and-conquer merge
You can also merge pairwise — merge list 0 with list 1, list 2 with list 3, …, then merge the resulting k/2 lists pairwise, and so on. log k rounds, each O(n) → O(n log k). Same big-O as the heap, often slightly faster in practice (better cache behavior, no priority-queue overhead).
Solution
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional
import heapq
@dataclass
class Node:
val: int
next: Optional["Node"] = None
def merge_k(lists: list[Optional[Node]]) -> Optional[Node]:
# heapq compares tuples lexicographically; second slot is a tiebreaker
# so we never compare two Node objects directly (they have no __lt__).
heap: list[tuple[int, int, Node]] = []
for i, head in enumerate(lists):
if head:
heapq.heappush(heap, (head.val, i, head))
dummy = Node(0)
tail = dummy
seq = len(lists) # monotonically increasing tiebreaker
while heap:
_, _, node = heapq.heappop(heap)
tail.next = node
tail = node
if node.next:
heapq.heappush(heap, (node.next.val, seq, node.next))
seq += 1
return dummy.next
# Helpers
def list_of(values: list[int]) -> Optional[Node]:
dummy = Node(0); tail = dummy
for v in values: tail.next = Node(v); tail = tail.next
return dummy.next
def values(head: Optional[Node]) -> list[int]:
out = []
while head: out.append(head.val); head = head.next
return out
cases = [
([[1, 4, 5], [1, 3, 4], [2, 6]], [1, 1, 2, 3, 4, 4, 5, 6]),
([], []),
([[]], []),
([[1, 2, 3]], [1, 2, 3]),
]
for lists, expected in cases:
heads = [list_of(xs) for xs in lists]
got = values(merge_k(heads))
mark = "✓" if got == expected else "✗"
print(f"{mark} merge_k({lists}) = {got}")class Node {
constructor(val, next = null) { this.val = val; this.next = next; }
}
// Min-heap of Node pointers, compared by node.val.
class MinHeap {
constructor() { this.h = []; }
size() { return this.h.length; }
push(n) {
this.h.push(n);
let i = this.h.length - 1;
while (i > 0) {
const p = (i - 1) >> 1;
if (this.h[p].val <= this.h[i].val) break;
[this.h[p], this.h[i]] = [this.h[i], this.h[p]]; i = p;
}
}
pop() {
const top = this.h[0], last = this.h.pop();
if (this.h.length === 0) return top;
this.h[0] = last;
const n = this.h.length;
for (let i = 0;;) {
let l = 2 * i + 1, r = 2 * i + 2, m = i;
if (l < n && this.h[l].val < this.h[m].val) m = l;
if (r < n && this.h[r].val < this.h[m].val) m = r;
if (m === i) break;
[this.h[m], this.h[i]] = [this.h[i], this.h[m]]; i = m;
}
return top;
}
}
function mergeK(lists) {
const h = new MinHeap();
for (const head of lists) if (head) h.push(head);
const dummy = new Node(0); let tail = dummy;
while (h.size()) {
const n = h.pop();
tail.next = n; tail = n;
if (n.next) h.push(n.next);
}
return dummy.next;
}
const listOf = vs => { const d = new Node(0); let t = d; for (const v of vs) { t.next = new Node(v); t = t.next; } return d.next; };
const values = h => { const r = []; while (h) { r.push(h.val); h = h.next; } return r; };
const cases = [
[[[1, 4, 5], [1, 3, 4], [2, 6]], [1, 1, 2, 3, 4, 4, 5, 6]],
[[], []],
[[[]], []],
[[[1, 2, 3]], [1, 2, 3]],
];
for (const [lists, expected] of cases) {
const heads = lists.map(listOf);
const got = values(mergeK(heads));
const ok = JSON.stringify(got) === JSON.stringify(expected);
console.log(`${ok ? "✓" : "✗"} mergeK(${JSON.stringify(lists)}) = ${JSON.stringify(got)}`);
}// Rust's owned linked-list comes with the usual Box<Node> dance. The merge
// algorithm is identical to other languages; the borrow checker just makes
// ownership transfer explicit.
use std::cmp::Reverse;
use std::collections::BinaryHeap;
pub struct Node {
pub val: i32,
pub next: Option<Box<Node>>,
}
pub fn merge_k(mut lists: Vec<Option<Box<Node>>>) -> Option<Box<Node>> {
// Heap of (value, seq) → list-index. We pop the smallest value, take the
// head off that list, and re-push its successor.
let mut heap: BinaryHeap<Reverse<(i32, usize, usize)>> = BinaryHeap::new();
let mut seq = 0usize;
for (i, head) in lists.iter().enumerate() {
if let Some(h) = head { heap.push(Reverse((h.val, seq, i))); seq += 1; }
}
let mut dummy = Box::new(Node { val: 0, next: None });
let mut tail: &mut Box<Node> = &mut dummy;
while let Some(Reverse((_, _, i))) = heap.pop() {
let mut node = lists[i].take().unwrap();
lists[i] = node.next.take();
if let Some(ref nxt) = lists[i] {
heap.push(Reverse((nxt.val, seq, i))); seq += 1;
}
tail.next = Some(node);
tail = tail.next.as_mut().unwrap();
}
dummy.next
}
fn list_of(vs: &[i32]) -> Option<Box<Node>> {
let mut head: Option<Box<Node>> = None;
for &v in vs.iter().rev() {
head = Some(Box::new(Node { val: v, next: head }));
}
head
}
fn values(mut h: Option<Box<Node>>) -> Vec<i32> {
let mut out = Vec::new();
while let Some(n) = h { out.push(n.val); h = n.next; }
out
}
fn main() {
let cases: Vec<(Vec<Vec<i32>>, Vec<i32>)> = vec![
(vec![vec![1, 4, 5], vec![1, 3, 4], vec![2, 6]], vec![1, 1, 2, 3, 4, 4, 5, 6]),
(vec![], vec![]),
(vec![vec![]], vec![]),
(vec![vec![1, 2, 3]], vec![1, 2, 3]),
];
for (lists, expected) in cases {
let heads: Vec<_> = lists.iter().map(|xs| list_of(xs)).collect();
let got = values(merge_k(heads));
let mark = if got == expected { "✓" } else { "✗" };
println!("{mark} merge_k(...) = {:?}", got);
}
}#include <iostream>
#include <queue>
#include <vector>
struct Node {
int val;
Node* next;
Node(int v, Node* n = nullptr) : val(v), next(n) {}
};
Node* merge_k(std::vector<Node*>& lists) {
auto cmp = [](Node* a, Node* b) { return a->val > b->val; }; // min-heap
std::priority_queue<Node*, std::vector<Node*>, decltype(cmp)> pq(cmp);
for (Node* h : lists) if (h) pq.push(h);
Node dummy(0);
Node* tail = &dummy;
while (!pq.empty()) {
Node* n = pq.top(); pq.pop();
tail->next = n;
tail = n;
if (n->next) pq.push(n->next);
}
tail->next = nullptr;
return dummy.next;
}
static Node* list_of(const std::vector<int>& xs) {
Node dummy(0); Node* t = &dummy;
for (int v : xs) { t->next = new Node(v); t = t->next; }
return dummy.next;
}
static void show(Node* h) {
std::cout << "[";
while (h) { std::cout << h->val << (h->next ? "," : ""); h = h->next; }
std::cout << "]\n";
}
int main() {
std::vector<std::vector<int>> in_a = {{1, 4, 5}, {1, 3, 4}, {2, 6}};
std::vector<Node*> a; for (auto& xs : in_a) a.push_back(list_of(xs));
std::cout << "✓ merge_k({{1,4,5},{1,3,4},{2,6}}) = "; show(merge_k(a));
std::vector<Node*> b;
std::cout << "✓ merge_k({}) = "; show(merge_k(b));
std::vector<Node*> c = { list_of({1, 2, 3}) };
std::cout << "✓ merge_k({{1,2,3}}) = "; show(merge_k(c));
}// Hand-rolled min-heap of Node* pointers, compared by ->val.
#include <stdio.h>
#include <stdlib.h>
typedef struct Node {
int val;
struct Node* next;
} Node;
// Min-heap as a flat array.
static Node* heap[1024];
static int hn = 0;
static void sift_up(int i) {
while (i > 0) {
int p = (i - 1) / 2;
if (heap[p]->val <= heap[i]->val) break;
Node* t = heap[p]; heap[p] = heap[i]; heap[i] = t;
i = p;
}
}
static void sift_down(int i) {
while (1) {
int l = 2*i+1, r = 2*i+2, m = i;
if (l < hn && heap[l]->val < heap[m]->val) m = l;
if (r < hn && heap[r]->val < heap[m]->val) m = r;
if (m == i) break;
Node* t = heap[m]; heap[m] = heap[i]; heap[i] = t;
i = m;
}
}
static void hpush(Node* n) { heap[hn++] = n; sift_up(hn - 1); }
static Node* hpop(void) {
Node* top = heap[0];
heap[0] = heap[--hn];
if (hn > 0) sift_down(0);
return top;
}
Node* merge_k(Node** lists, int k) {
hn = 0;
for (int i = 0; i < k; i++) if (lists[i]) hpush(lists[i]);
Node dummy = {0, NULL};
Node* tail = &dummy;
while (hn > 0) {
Node* n = hpop();
tail->next = n; tail = n;
if (n->next) hpush(n->next);
}
tail->next = NULL;
return dummy.next;
}
static Node* mk(int v, Node* nxt) { Node* n = malloc(sizeof(Node)); n->val = v; n->next = nxt; return n; }
static Node* list_of(const int* xs, int n) {
Node* head = NULL;
for (int i = n - 1; i >= 0; i--) head = mk(xs[i], head);
return head;
}
static void show(Node* h) {
printf("[");
while (h) { printf("%d%s", h->val, h->next ? "," : ""); h = h->next; }
printf("]\n");
}
int main(void) {
int a1[] = {1, 4, 5}, a2[] = {1, 3, 4}, a3[] = {2, 6};
Node* a[] = {list_of(a1, 3), list_of(a2, 3), list_of(a3, 2)};
printf("✓ merge_k({{1,4,5},{1,3,4},{2,6}}) = "); show(merge_k(a, 3));
Node* b[] = {NULL}; // k = 0 equivalent
printf("✓ merge_k({}) = "); show(merge_k(b, 0));
int c1[] = {1, 2, 3};
Node* c[] = {list_of(c1, 3)};
printf("✓ merge_k({{1,2,3}}) = "); show(merge_k(c, 1));
return 0;
}The trick that makes Rust hard here is that you can't borrow two list heads at once for comparison. The standard workaround: drive everything through indices into a Vec, so the heap holds (value, list_index) and the actual mutation happens via the index.
Complexity
- Time
- O(n log k) — total
nnodes; each contributes one heap pop + at most one heap push, bothO(log k). - Space
- O(k) — heap holds at most one node per list. The output list itself is allocated separately and isn't counted in "auxiliary space".
- Divide-and-conquer alternative
- Same
O(n log k), often faster in practice (no priority-queue overhead, better cache behavior).
In the wild
- External merge sort. Sort 100 GB on a 16 GB machine: produce many sorted runs that fit in memory, then k-way merge them. This algorithm is the merge step.
- Distributed sorted queries. Each shard returns its own sorted result; the coordinator merges them via k-way merge before returning to the client.
- Log file aggregation.
sort -m file1 file2 ...and tools likelogrotate+tail -Fuse this. - The
mergestage of LSM-tree compaction (Cassandra, RocksDB, LevelDB). Each level's sorted files are merged into the next level.
Variations
- Merge K sorted arrays (not linked lists). Same algorithm; heap holds (value, array index, position-in-array) triples.
- Merge K sorted iterators (the streaming / lazy version). Identical pattern; never materialize input or output in full.
- K closest elements. Push the input around a target, but emit only the closest K — combine with a max-heap cap.
- Kth smallest in K sorted lists. Variant of kth-smallest-bst: start the k-way merge, count to k, stop.