Big-O complexity
How fast does the work grow when the input grows?
complexity · growth race
Semi-log plot. X is linear (0 → current N); Y is log₁₀, fixed from 1 op to 10¹⁸. The Y gridlines (1k, 1M, 1B, 1T, 1P, 1E) stay put as you drag — watch each curve climb past them. The red dashed line marks the 1-second budget at 1 ns/op (≈ 1 billion ops). Anything above it doesn't finish in under a second.
Each tile is the shape of work each algorithm performs on an n×n problem. The grid is schematic — drag the slider to see how visit counts change in the other modes.
Read carefully — fill is on a log10 scale.
Each tile spans 1 op to O(n²) at N=1B in
log space. "59% vs 100%" doesn't mean 1.7× — at N=1B,
O(n²) does 16.7 million times more work than
O(n log n). The multiplier on each row reads the
real ratio.
Wall-clock at 1 ns per operation, plotted against human/cosmic time anchors. The line marks where each algorithm finishes.
drag the slider — every mode updates live
Try it: Pull the N slider from 4 up to 4k, hitting run experiment at each stop. At N=4 every bar looks the same; at N=4k the O(n²) bar dwarfs the rest. That gap is what Big-O is trying to warn you about.
Why it matters
Every data structure you'll meet in the next 10 lessons trades one cost for another. Arrays read in O(1) but insert in O(n). Hash tables look up in O(1) but have no order. Balanced trees give up some constant-factor speed to guarantee O(log n) worst case. None of those words mean anything until you've felt what the difference looks like as N grows.
That's what this lesson is for. No code yet — just the curve.
What Big-O is measuring
The number of basic operations an algorithm performs, as a function of the input size, in the worst case, ignoring constants and lower-order terms.
Three load-bearing pieces:
- As a function of the input size. It's a curve, not a number. "How many comparisons did this take?" has no answer; "How many comparisons did this take for N items?" does.
- Worst case. A linear search through 1,000 items might find the target on the first try. But if you're shipping that code to production, you have to plan for the worst — checking all 1,000.
- Ignoring constants and lower-order terms.
3n + 5andnare bothO(n). The constants get drowned out as N grows. They matter for benchmarks; they don't matter for asymptotic shape.
O, Θ, Ω — the three bounds
Engineers say "Big-O" casually but textbooks distinguish three notations. They describe different bounds on a function's growth.
- Asymptotic notations
- O(f(n)) — upper bound.
T(n) = O(f(n))meansTgrows at most as fast asf. "Costs no more than." - Ω(f(n)) — lower bound.
T(n) = Ω(f(n))meansTgrows at least as fast asf. "Costs no less than." - Θ(f(n)) — tight bound. Both
O(f(n))andΩ(f(n)). "Costs exactly proportional to." - One concrete example — merge sort
- O(n log n) — its worst case is no worse than
n log n. True but loose: it's alsoO(n²),O(n³), … any upper bound holds. - Ω(n log n) — its best case is at least
n log n(any comparison sort has this lower bound by an information-theoretic argument). - Θ(n log n) — the tight statement: merge sort is exactly on the order of
n log n, neither faster nor slower.
Why this matters when revising: when a textbook writes Θ(n) it's claiming the algorithm is both upper- and lower-bounded by n — that's a stronger statement than O(n). A binary search is O(log n) (worst case), Ω(1) (best case — match on the first probe), and not Θ(log n) overall — but Θ(log n) in the worst case. Pin down which case you're talking about before you pick a notation.
Best, worst, average
Hand-in-hand with the bounds above is which input you measure:
- Worst case — the input that maximizes work. Default when you say "Big-O" without qualification.
- Best case — the input that minimizes work. Quicksort's best case is
Θ(n log n); its worst isΘ(n²). - Average case — expected work over an input distribution. Hash table lookup is
O(1)average butO(n)worst. - Amortized — average over a sequence of operations on the same data structure. Dynamic array append is
O(1)amortized because the rareO(n)resize is paid off bynprevious cheap appends. See the arrays and heaps lessons for the proofs.
Reading the chart
Each bar is one classic operation running on the same input array of length N:
- Classes
- O(1)
lookup arr[42]— one CPU operation, regardless of N - O(log n)
binary search— halve the search space each step - O(n)
find max— touch every element once - O(n log n)
merge sort— n elements, each shuffled through log n merge levels - O(n²)
find all pairs— every (i,j) combination - What the two bars mean
- predicted the formula's value at this N (the asymptote)
- actual the real op-count from running the algorithm on a real array
The bars use a log scale on the x-axis. That's the only way O(1) = 1 op and O(n²) = 8 million ops fit on the same chart. Eyeballing distances is misleading on a log axis — read the numbers.
The small-N trap
At N = 4 the bars are visually indistinguishable. O(1) = 1, O(n²) = 6. Six is "basically nothing". This is why early-career engineers write O(n²) code that works in testing and melts in production — the test fixture had 10 rows, the prod table has 10 million.
By N = 1024:
- What 1024 elements cost
- O(1)
1op - O(log n)
~10ops - O(n)
1,024ops - O(n log n)
~10,240ops - O(n²)
523,776ops
Same algorithm, same machine — five orders of magnitude separate them.
Going deeper
A few rules of thumb you'll lean on in the rest of the curriculum:
- If you see a single loop over the input, you've got at least
O(n). - If you see nested loops both touching the input, you're at
O(n²). - If the work halves each step (recursion, binary search, balanced-tree descent), it's
O(log n). - Sorting comparison-based data is
O(n log n)— you cannot beat that in the general case (information-theoretic lower bound). - Hashing collapses the loop into
O(1)expected — at the cost of giving up order and worst-case guarantees.
Now — onto the actual data structures.