Dynamic Programming
Trade memory for time — solve once, remember forever
Dynamic programming · 1D table fill
…
Try it: Watch the 1-D table fill left to right — each cell is just the sum of the two before it. A value the naive recursion would recompute billions of times is read once from the cell next door. That swap — recompute → look up — is the whole idea.
Why it matters
A surprising number of "exponential" problems are secretly polynomial — they only look exponential because the obvious recursion solves the same subproblem millions of times. Dynamic programming is the technique of noticing the repetition and caching it. Once you see the pattern, problems that would take centuries collapse to milliseconds.
In robotics it's the foundation of value iteration, policy iteration, Bellman recursions, and the shortest-path algorithms you've already met. In strings it's the diff algorithm git uses. In bioinformatics it's how every sequence-alignment tool works.
The idea
A DP-shaped problem has two properties:
- Optimal substructure — the optimal answer to the whole problem can be built from optimal answers to smaller versions of the same problem.
- Overlapping subproblems — those smaller versions repeat. The naïve recursion would solve the same one many times.
When both hold, the trick is to solve each subproblem exactly once and store the result. Two flavors:
- Memoization (top-down): keep the recursion; cache each call's return value. The first time you hit
f(7)you compute it; thereafter you read the cache. - Tabulation (bottom-up): drop the recursion entirely. Fill an array (or 2-D table) from the smallest subproblems up to the full one. Each cell's value is a function of cells you've already filled.
Both are O(states × per-state-work). Memoization is closer to the natural recurrence; tabulation gives you finer control over space and order.
Key takeaways
- DP = recursion + a cache that ensures each subproblem is solved once.
- The hardest part is finding the state — what does
f(i, j, k…)need to know to produce its answer? - Once the recurrence is right, the time complexity is
(# of states) × (work per state). - 1D problems often collapse from
O(n)space toO(1)by keeping only the last few cells (rolling). - 2D problems often collapse from
O(m·n)toO(min(m, n))by keeping only one row at a time.
Going deeper
Recognizing a DP shape
Five questions to ask of any problem you suspect is DP:
- Can I describe the answer recursively? "The min cost to reach state X equals the min over choices Y of …"
- What is the state? A single integer? A pair
(i, j)? An index plus a flag? - What is the recurrence? How does
f(state)decompose intof(smaller-state)? - What are the base cases? The smallest states whose answer is known without recursion.
- What order should I fill the table in? Any order where every cell's dependencies are filled before it.
If you can answer all five, you can write the DP. If you can't, you don't have a DP problem yet — go re-state.
Memoization vs tabulation — when to pick which
| | Memoization | Tabulation | |---|---|---| | Mental model | Top-down: trust the recursion | Bottom-up: build piece by piece | | Implementation | Recursive function + cache | Loop filling an array | | Subproblems solved | Only those reachable | All of them | | Stack risk | Deep recursion → stack overflow | None | | Space-optimization potential | Harder | Easy (rolling rows / "last two values") |
Choose memoization when the state space is sparse — many states exist but the answer only depends on a small reachable set. Choose tabulation when you'll touch most of the state space anyway, or when you need to space-optimize.
Space optimization — the "rolling" trick
Most DP recurrences only depend on the immediately previous row or constant-many previous cells. So you don't need the full table.
- Fibonacci needs only
f(n-1)andf(n-2)→O(1)space. - LCS over
m × nonly depends on the previous row →O(min(m, n))space. - Knapsack needs only the previous capacity column →
O(W)space.
Watch the recurrence: if f(i) only uses f(i-1), f(i-2), …, f(i-k) for small k, you can keep just those k values and slide.
Robotics anchor — DP is Bellman is value iteration
The Bellman equation in reinforcement learning is literally a DP recurrence:
V*(s) = max over actions a of [ R(s, a) + γ · Σ P(s' | s, a) · V*(s') ]
The optimal value of state s equals the immediate reward plus the discounted future value of the best next state. Value iteration is just tabulation of the Bellman recurrence over the state space, iterated until convergence. Every gridworld solver, every path-planner that handles costs, every MPC controller in the textbook — they're DP.
When DP fails
DP needs optimal substructure. Some problems violate it:
- Longest simple path in a general graph (no DP — NP-hard).
- Anything where future choices depend on the full history, not just a summarizable state.
If you can't find a finite state that captures everything future steps need to know, DP doesn't apply.
In the wild
- Diff algorithms (
git diff,diff -u) compute longest common subsequence in O(m·n) — exactly the third problem below. - Robot planning under uncertainty uses value iteration / policy iteration — both pure DP on the state space.
- Sequence alignment in computational biology (Needleman-Wunsch, Smith-Waterman) is DP over a 2-D edit table.
- Speech recognition's Viterbi algorithm finds the most likely sequence of hidden states — DP on a trellis.
- Compiler register allocation uses DP on expression trees.
- Routing protocols (OSPF, BGP path selection) — Bellman-Ford is DP on graph distances.
Math details
- Time
- O(states × work-per-state) — once you've chosen a state representation, this is exact.
- Space (full table)
- O(states)
- Space (rolling)
- Often O(d) where
dis the recurrence's "lookback depth" — usually a small constant or one row. - Memoization overhead
- HashMap lookup + recursion stack — constant factor 2-3× over a tight tabulation loop. Worth it for sparse state spaces.
Implementation
The skeleton — every DP fits this template
def solve():
memo = {}
def f(state):
if state in BASE_CASES: return base_answer
if state in memo: return memo[state]
result = combine(f(smaller_state_1), f(smaller_state_2), ...)
memo[state] = result
return result
return f(initial_state)
Or, tabulated:
def solve():
dp = [base_answer for _ in range(N + 1)]
for state in range(SMALLEST + 1, LARGEST + 1):
dp[state] = combine(dp[smaller_1], dp[smaller_2], ...)
return dp[LARGEST]
Classic problem shapes
- 1D linear:
f(n) = combine(f(n-1), f(n-2))— Fibonacci, climb stairs, house robber - 1D unbounded:
f(amount) = min over coins of f(amount - coin) + 1— coin change, rod cutting - 2D:
f(i, j) = combine(f(i-1, j), f(i, j-1), f(i-1, j-1))— LCS, edit distance, grid paths - Intervals:
f(i, j) = min over splits k of f(i, k) + f(k, j) + cost(i, j, k)— matrix-chain, optimal BST - Tree DP: post-order traversal, each node's answer composed from its children's — diameter, max-path-sum
Practice
Three problems that lock in the three foundational DP shapes:
- Climb Stairs — easy · 1D Fibonacci-shape; the memoization → tabulation → O(1)-space progression
- Coin Change — medium · 1D unbounded knapsack; "min over choices" recurrence
- Longest Common Subsequence — medium · 2D table; foundation of every diff tool