ds · reference

Pick the right structure

Walk the questions top-down. The first answer that matches your workload picks the structure. Real-world workloads usually combine structures — note the hybrids at the bottom.

Workload-first flowchart

Do you need to look up values by an arbitrary key, with no order requirement?

→ Hash table. O(1) average lookup. Mind the load factor and the DoS-resistance question (use SipHash-keyed implementations on untrusted input).
Need ordered iteration, range queries (keys between A and B), or "min greater than X"?

→ Balanced tree (AVL / Red-Black) for in-memory, or a B+-tree for on-disk. O(log n) guaranteed.
Only ever need the minimum (or maximum) and want repeated O(log n) extracts?

→ Heap / priority queue. Smaller constant factors than a balanced tree, no in-order traversal.
Order is "newest in, newest out" (last-in-first-out)?

→ Stack. Backed by an array. Used by every recursive algorithm, expression parser, and undo history.
Order is "first in, first out" (oldest waits longest)?

→ Queue (ring buffer for bounded; linked or VecDeque for unbounded). The backbone of BFS, message brokers, ROS topics.
Data is fixed in size, dense, accessed randomly by index?

→ Array. The fastest thing on a modern CPU. Sequential access is cache-prefetched and ~50–100× faster than equivalent linked structures.
Need O(1) insert/delete in the middle and you'll already have a live reference to the affected node? (LRU cache, OS process scheduler.)

→ Doubly linked list. Often paired with a hash table for O(1) lookup-by-key + O(1) splice.
Modeling relationships, paths, dependencies, or anything network-shaped?

→ Graph. Choose BFS for shortest path on unweighted, Dijkstra for non-negative weights, A* if you have a heuristic, Bellman-Ford if edges can be negative.

Cost reference — at a glance

Structure	Access	Search	Insert	Delete	Ordered?
Array	`O(1)`	`O(n)`	`O(n)` mid · `O(1)*` end	`O(n)` mid	by index
Linked list	`O(n)`	`O(n)`	`O(1)`† at known node	`O(1)`† at known node	by insertion
Stack	`O(1)` top	—	`O(1)` push	`O(1)` pop	LIFO
Queue	`O(1)` head	—	`O(1)` enqueue	`O(1)` dequeue	FIFO
Heap	`O(1)` peek	—	`O(log n)`	`O(log n)` root	by priority
BST (avg)	`O(log n)`	`O(log n)`	`O(log n)`	`O(log n)`	in-order
AVL / Red-Black	`O(log n)`	`O(log n)`	`O(log n)`	`O(log n)`	in-order
Hash table	—	`O(1)*`	`O(1)*`	`O(1)*`	no
Graph (adj-list)	—	BFS/DFS `O(V+E)`	edge `O(1)`	edge `O(deg)`	—

* amortized · † requires a live reference to the affected node (the famous LRU-cache pattern combines a hash table + doubly-linked list to get this).

The hybrids you'll actually ship

LRU cache: Hash table → doubly-linked list. O(1) lookup + O(1) move-to-front on every hit. Every CPU cache, every database query plan cache, and every web browser tab uses this shape.
Priority queue with decrease-key: Heap + hash table (id → heap index). The hash table lets you find a node in the heap by external id in O(1), then percolate it up after lowering its priority. Dijkstra and A* assume this.
Union-find / disjoint set: Array-backed forest. Used by Kruskal's MST, image segmentation, network connectivity. O(α(n)) per operation — effectively constant.
Trie (prefix tree): Tree where each edge is labeled with a character. Hash-table-of-children at each node. Powers autocomplete, IP routing tables, DNA pattern matching. Not in the core 10 lessons but built from BSTs and hash tables.
Skip list: Linked list with extra "express lanes" at each level. Probabilistic O(log n) lookup. Used by Redis sorted sets and LevelDB.
Bloom filter: Bit array + multiple hash functions. O(1) "is this key possibly in the set?" with a tunable false-positive rate. Used by CDN cache layers and biology pipelines.

Common picks people get wrong

Linked list when an array would be 50× faster. If you're iterating sequentially or doing index access, the cache locality of an array dominates. Reach for a linked list only when you genuinely need O(1) splice with a live reference.
BST when you really wanted a heap. If you only ever ask "what's the smallest?", a heap is smaller, faster, and simpler.
Hash table when you need range queries. Hash tables have no order. "Show me keys between 100 and 200" is O(n) in a hash table, O(log n + k) in a balanced tree.
Naive recursion when you should iterate with an explicit stack. Deep recursion overflows the call stack. Tree/graph traversals that might go more than a few thousand levels deep should keep their own stack array.
Adjacency matrix for a sparse graph. An n=10,000 matrix is 100M entries. Use an adjacency list unless the graph is genuinely dense.
Dijkstra on negative weights. Quietly wrong, doesn't crash. Use Bellman-Ford.