Lessons · #11 of 11

Dynamic Programming

Trade memory for time — solve once, remember forever

Dynamic programming · 1D table fill

step 0 / 0

Try it: Watch the 1-D table fill left to right — each cell is just the sum of the two before it. A value the naive recursion would recompute billions of times is read once from the cell next door. That swap — recompute → look up — is the whole idea.

Why it matters

A surprising number of "exponential" problems are secretly polynomial — they only look exponential because the obvious recursion solves the same subproblem millions of times. Dynamic programming is the technique of noticing the repetition and caching it. Once you see the pattern, problems that would take centuries collapse to milliseconds.

In robotics it's the foundation of value iteration, policy iteration, Bellman recursions, and the shortest-path algorithms you've already met. In strings it's the diff algorithm git uses. In bioinformatics it's how every sequence-alignment tool works.

The idea

A DP-shaped problem has two properties:

  1. Optimal substructure — the optimal answer to the whole problem can be built from optimal answers to smaller versions of the same problem.
  2. Overlapping subproblems — those smaller versions repeat. The naïve recursion would solve the same one many times.

When both hold, the trick is to solve each subproblem exactly once and store the result. Two flavors:

Both are O(states × per-state-work). Memoization is closer to the natural recurrence; tabulation gives you finer control over space and order.

Key takeaways

Going deeper

Recognizing a DP shape

Five questions to ask of any problem you suspect is DP:

  1. Can I describe the answer recursively? "The min cost to reach state X equals the min over choices Y of …"
  2. What is the state? A single integer? A pair (i, j)? An index plus a flag?
  3. What is the recurrence? How does f(state) decompose into f(smaller-state)?
  4. What are the base cases? The smallest states whose answer is known without recursion.
  5. What order should I fill the table in? Any order where every cell's dependencies are filled before it.

If you can answer all five, you can write the DP. If you can't, you don't have a DP problem yet — go re-state.

Memoization vs tabulation — when to pick which

| | Memoization | Tabulation | |---|---|---| | Mental model | Top-down: trust the recursion | Bottom-up: build piece by piece | | Implementation | Recursive function + cache | Loop filling an array | | Subproblems solved | Only those reachable | All of them | | Stack risk | Deep recursion → stack overflow | None | | Space-optimization potential | Harder | Easy (rolling rows / "last two values") |

Choose memoization when the state space is sparse — many states exist but the answer only depends on a small reachable set. Choose tabulation when you'll touch most of the state space anyway, or when you need to space-optimize.

Space optimization — the "rolling" trick

Most DP recurrences only depend on the immediately previous row or constant-many previous cells. So you don't need the full table.

Watch the recurrence: if f(i) only uses f(i-1), f(i-2), …, f(i-k) for small k, you can keep just those k values and slide.

Robotics anchor — DP is Bellman is value iteration

The Bellman equation in reinforcement learning is literally a DP recurrence:

V*(s) = max over actions a of [ R(s, a) + γ · Σ P(s' | s, a) · V*(s') ]

The optimal value of state s equals the immediate reward plus the discounted future value of the best next state. Value iteration is just tabulation of the Bellman recurrence over the state space, iterated until convergence. Every gridworld solver, every path-planner that handles costs, every MPC controller in the textbook — they're DP.

When DP fails

DP needs optimal substructure. Some problems violate it:

If you can't find a finite state that captures everything future steps need to know, DP doesn't apply.

In the wild

Math details

Time
O(states × work-per-state) — once you've chosen a state representation, this is exact.
Space (full table)
O(states)
Space (rolling)
Often O(d) where d is the recurrence's "lookback depth" — usually a small constant or one row.
Memoization overhead
HashMap lookup + recursion stack — constant factor 2-3× over a tight tabulation loop. Worth it for sparse state spaces.

Implementation

The skeleton — every DP fits this template

def solve():
    memo = {}
    def f(state):
        if state in BASE_CASES: return base_answer
        if state in memo: return memo[state]
        result = combine(f(smaller_state_1), f(smaller_state_2), ...)
        memo[state] = result
        return result
    return f(initial_state)

Or, tabulated:

def solve():
    dp = [base_answer for _ in range(N + 1)]
    for state in range(SMALLEST + 1, LARGEST + 1):
        dp[state] = combine(dp[smaller_1], dp[smaller_2], ...)
    return dp[LARGEST]

Classic problem shapes

Practice

Three problems that lock in the three foundational DP shapes:

Browse all practice problems