Refresher · #0 of 11

Big-O complexity

How fast does the work grow when the input grows?

complexity · growth race

input size 1,000,000 elements

drag the slider — every mode updates live

Try it: Pull the N slider from 4 up to 4k, hitting run experiment at each stop. At N=4 every bar looks the same; at N=4k the O(n²) bar dwarfs the rest. That gap is what Big-O is trying to warn you about.

Why it matters

Every data structure you'll meet in the next 10 lessons trades one cost for another. Arrays read in O(1) but insert in O(n). Hash tables look up in O(1) but have no order. Balanced trees give up some constant-factor speed to guarantee O(log n) worst case. None of those words mean anything until you've felt what the difference looks like as N grows.

That's what this lesson is for. No code yet — just the curve.

What Big-O is measuring

The number of basic operations an algorithm performs, as a function of the input size, in the worst case, ignoring constants and lower-order terms.

Three load-bearing pieces:

As a function of the input size. It's a curve, not a number. "How many comparisons did this take?" has no answer; "How many comparisons did this take for N items?" does.
Worst case. A linear search through 1,000 items might find the target on the first try. But if you're shipping that code to production, you have to plan for the worst — checking all 1,000.
Ignoring constants and lower-order terms. 3n + 5 and n are both O(n). The constants get drowned out as N grows. They matter for benchmarks; they don't matter for asymptotic shape.

O, Θ, Ω — the three bounds

Engineers say "Big-O" casually but textbooks distinguish three notations. They describe different bounds on a function's growth.

Asymptotic notations: O(f(n)) — upper bound. T(n) = O(f(n)) means T grows at most as fast as f. "Costs no more than."; Ω(f(n)) — lower bound. T(n) = Ω(f(n)) means T grows at least as fast as f. "Costs no less than."; Θ(f(n)) — tight bound. Both O(f(n)) and Ω(f(n)). "Costs exactly proportional to."
One concrete example — merge sort: O(n log n) — its worst case is no worse than n log n. True but loose: it's also O(n²), O(n³), … any upper bound holds.; Ω(n log n) — its best case is at least n log n (any comparison sort has this lower bound by an information-theoretic argument).; Θ(n log n) — the tight statement: merge sort is exactly on the order of n log n, neither faster nor slower.

Why this matters when revising: when a textbook writes Θ(n) it's claiming the algorithm is both upper- and lower-bounded by n — that's a stronger statement than O(n). A binary search is O(log n) (worst case), Ω(1) (best case — match on the first probe), and not Θ(log n) overall — but Θ(log n) in the worst case. Pin down which case you're talking about before you pick a notation.

Best, worst, average

Hand-in-hand with the bounds above is which input you measure:

Worst case — the input that maximizes work. Default when you say "Big-O" without qualification.
Best case — the input that minimizes work. Quicksort's best case is Θ(n log n); its worst is Θ(n²).
Average case — expected work over an input distribution. Hash table lookup is O(1) average but O(n) worst.
Amortized — average over a sequence of operations on the same data structure. Dynamic array append is O(1) amortized because the rare O(n) resize is paid off by n previous cheap appends. See the arrays and heaps lessons for the proofs.

Reading the chart

Each bar is one classic operation running on the same input array of length N:

Classes: O(1) lookup arr[42] — one CPU operation, regardless of N; O(log n) binary search — halve the search space each step; O(n) find max — touch every element once; O(n log n) merge sort — n elements, each shuffled through log n merge levels; O(n²) find all pairs — every (i,j) combination
What the two bars mean: predicted the formula's value at this N (the asymptote); actual the real op-count from running the algorithm on a real array

The bars use a log scale on the x-axis. That's the only way O(1) = 1 op and O(n²) = 8 million ops fit on the same chart. Eyeballing distances is misleading on a log axis — read the numbers.

The small-N trap

At N = 4 the bars are visually indistinguishable. O(1) = 1, O(n²) = 6. Six is "basically nothing". This is why early-career engineers write O(n²) code that works in testing and melts in production — the test fixture had 10 rows, the prod table has 10 million.

By N = 1024:

What 1024 elements cost: O(1) 1 op; O(log n) ~10 ops; O(n) 1,024 ops; O(n log n) ~10,240 ops; O(n²) 523,776 ops

Same algorithm, same machine — five orders of magnitude separate them.

Going deeper

A few rules of thumb you'll lean on in the rest of the curriculum:

If you see a single loop over the input, you've got at least O(n).
If you see nested loops both touching the input, you're at O(n²).
If the work halves each step (recursion, binary search, balanced-tree descent), it's O(log n).
Sorting comparison-based data is O(n log n) — you cannot beat that in the general case (information-theoretic lower bound).
Hashing collapses the loop into O(1) expected — at the cost of giving up order and worst-case guarantees.

Now — onto the actual data structures.

📈 Big-O complexity