Lessons · #11 of 11

Dynamic Programming

Trade memory for time — solve once, remember forever

Dynamic programming · 1D table fill

…

speed step 0 / 0

Try it: Watch the 1-D table fill left to right — each cell is just the sum of the two before it. A value the naive recursion would recompute billions of times is read once from the cell next door. That swap — recompute → look up — is the whole idea.

Why it matters

A surprising number of "exponential" problems are secretly polynomial — they only look exponential because the obvious recursion solves the same subproblem millions of times. Dynamic programming is the technique of noticing the repetition and caching it. Once you see the pattern, problems that would take centuries collapse to milliseconds.

In robotics it's the foundation of value iteration, policy iteration, Bellman recursions, and the shortest-path algorithms you've already met. In strings it's the diff algorithm git uses. In bioinformatics it's how every sequence-alignment tool works.

The idea

A DP-shaped problem has two properties:

Optimal substructure — the optimal answer to the whole problem can be built from optimal answers to smaller versions of the same problem.
Overlapping subproblems — those smaller versions repeat. The naïve recursion would solve the same one many times.

When both hold, the trick is to solve each subproblem exactly once and store the result. Two flavors:

Memoization (top-down): keep the recursion; cache each call's return value. The first time you hit f(7) you compute it; thereafter you read the cache.
Tabulation (bottom-up): drop the recursion entirely. Fill an array (or 2-D table) from the smallest subproblems up to the full one. Each cell's value is a function of cells you've already filled.

Both are O(states × per-state-work). Memoization is closer to the natural recurrence; tabulation gives you finer control over space and order.

Key takeaways

DP = recursion + a cache that ensures each subproblem is solved once.
The hardest part is finding the state — what does f(i, j, k…) need to know to produce its answer?
Once the recurrence is right, the time complexity is (# of states) × (work per state).
1D problems often collapse from O(n) space to O(1) by keeping only the last few cells (rolling).
2D problems often collapse from O(m·n) to O(min(m, n)) by keeping only one row at a time.

Going deeper

Recognizing a DP shape

Five questions to ask of any problem you suspect is DP:

Can I describe the answer recursively? "The min cost to reach state X equals the min over choices Y of …"
What is the state? A single integer? A pair (i, j)? An index plus a flag?
What is the recurrence? How does f(state) decompose into f(smaller-state)?
What are the base cases? The smallest states whose answer is known without recursion.
What order should I fill the table in? Any order where every cell's dependencies are filled before it.

If you can answer all five, you can write the DP. If you can't, you don't have a DP problem yet — go re-state.

Memoization vs tabulation — when to pick which

| | Memoization | Tabulation | |---|---|---| | Mental model | Top-down: trust the recursion | Bottom-up: build piece by piece | | Implementation | Recursive function + cache | Loop filling an array | | Subproblems solved | Only those reachable | All of them | | Stack risk | Deep recursion → stack overflow | None | | Space-optimization potential | Harder | Easy (rolling rows / "last two values") |

Choose memoization when the state space is sparse — many states exist but the answer only depends on a small reachable set. Choose tabulation when you'll touch most of the state space anyway, or when you need to space-optimize.

Space optimization — the "rolling" trick

Most DP recurrences only depend on the immediately previous row or constant-many previous cells. So you don't need the full table.

Fibonacci needs only f(n-1) and f(n-2) → O(1) space.
LCS over m × n only depends on the previous row → O(min(m, n)) space.
Knapsack needs only the previous capacity column → O(W) space.

Watch the recurrence: if f(i) only uses f(i-1), f(i-2), …, f(i-k) for small k, you can keep just those k values and slide.

Robotics anchor — DP is Bellman is value iteration

The Bellman equation in reinforcement learning is literally a DP recurrence:

V*(s) = max over actions a of [ R(s, a) + γ · Σ P(s' | s, a) · V*(s') ]

The optimal value of state s equals the immediate reward plus the discounted future value of the best next state. Value iteration is just tabulation of the Bellman recurrence over the state space, iterated until convergence. Every gridworld solver, every path-planner that handles costs, every MPC controller in the textbook — they're DP.

When DP fails

DP needs optimal substructure. Some problems violate it:

Longest simple path in a general graph (no DP — NP-hard).
Anything where future choices depend on the full history, not just a summarizable state.

If you can't find a finite state that captures everything future steps need to know, DP doesn't apply.

In the wild

Diff algorithms (git diff, diff -u) compute longest common subsequence in O(m·n) — exactly the third problem below.
Robot planning under uncertainty uses value iteration / policy iteration — both pure DP on the state space.
Sequence alignment in computational biology (Needleman-Wunsch, Smith-Waterman) is DP over a 2-D edit table.
Speech recognition's Viterbi algorithm finds the most likely sequence of hidden states — DP on a trellis.
Compiler register allocation uses DP on expression trees.
Routing protocols (OSPF, BGP path selection) — Bellman-Ford is DP on graph distances.

Math details

Time: O(states × work-per-state) — once you've chosen a state representation, this is exact.
Space (full table): O(states)
Space (rolling): Often O(d) where d is the recurrence's "lookback depth" — usually a small constant or one row.
Memoization overhead: HashMap lookup + recursion stack — constant factor 2-3× over a tight tabulation loop. Worth it for sparse state spaces.

Implementation

The skeleton — every DP fits this template

def solve():
    memo = {}
    def f(state):
        if state in BASE_CASES: return base_answer
        if state in memo: return memo[state]
        result = combine(f(smaller_state_1), f(smaller_state_2), ...)
        memo[state] = result
        return result
    return f(initial_state)

Or, tabulated:

def solve():
    dp = [base_answer for _ in range(N + 1)]
    for state in range(SMALLEST + 1, LARGEST + 1):
        dp[state] = combine(dp[smaller_1], dp[smaller_2], ...)
    return dp[LARGEST]

Classic problem shapes

1D linear: f(n) = combine(f(n-1), f(n-2)) — Fibonacci, climb stairs, house robber
1D unbounded: f(amount) = min over coins of f(amount - coin) + 1 — coin change, rod cutting
2D: f(i, j) = combine(f(i-1, j), f(i, j-1), f(i-1, j-1)) — LCS, edit distance, grid paths
Intervals: f(i, j) = min over splits k of f(i, k) + f(k, j) + cost(i, j, k) — matrix-chain, optimal BST
Tree DP: post-order traversal, each node's answer composed from its children's — diameter, max-path-sum

Practice

Three problems that lock in the three foundational DP shapes:

Climb Stairs — easy · 1D Fibonacci-shape; the memoization → tabulation → O(1)-space progression
Coin Change — medium · 1D unbounded knapsack; "min over choices" recurrence
Longest Common Subsequence — medium · 2D table; foundation of every diff tool

→ Browse all practice problems

🧮 Dynamic Programming