Longest Substring Without Repeating Characters

time: O(n)
space: O(min(n, alphabet))
tags: hash-tables · sliding-window · strings

in the wild Detect the longest distinct-byte run in a packet stream; longest run of non-repeating tokens in an LLM prompt cache.

Problem

Given a string s, return the length of the longest substring (contiguous) that contains no repeated characters.

Examples

s = "abcabcbb" → 3 (the substring "abc")
s = "bbbbb" → 1 ("b")
s = "pwwkew" → 3 ("wke" — note: "pwke" is a subsequence, not a substring)
s = "" → 0

Why this is the canonical "sliding window + hash" problem

Naïve: for every starting index i, scan forward checking for the first repeat. O(n²) worst case.

The smart approach is a sliding window [l, r] plus a hash map of "most recent index of each character":

Walk r from 0 to n − 1. For each character s[r]:
If we've seen this character at index ≥ l, jump l to prev_index + 1 — we know the window can't include that character anymore.
Update the character's "most recent index" to r.
Track max(r − l + 1) over the run.

Each character is seen by r exactly once; l only moves forward; the hash op is O(1). Total: O(n).

This expand-r, contract-l-as-needed, hash-the-window-contents template is the sliding-window pattern for hash-augmented problems. Variants: "longest substring with K distinct chars", "shortest substring containing all letters of T", "all anagrams of T in S". Once you internalize the window contract, you write any of these in five minutes.

Hints

Hint 1 — why does a hash help?

Brute force is "for each starting position, scan rightward until you hit a repeat". The inner scan re-walks characters you've already seen. What if you remembered where you last saw each character?

Hint 2 — the window won't shrink unnecessarily

If you do hit a repeat, you don't have to slide l one step at a time. You can jump straight past the previous occurrence: l = max(l, prev_index + 1). The max matters — if the previous occurrence is already behind l, ignore it.

Hint 3 — alphabet-size optimization

If the alphabet is fixed (e.g. 128 ASCII chars), the hash map can be a direct-indexed array of size 128 — same big-O, much smaller constant.

Solution

def length_of_longest_substring(s: str) -> int:
    last: dict[str, int] = {}                  # char → last-seen index
    best = 0
    l    = 0
    for r, ch in enumerate(s):
        if ch in last and last[ch] >= l:
            l = last[ch] + 1                   # jump past prior occurrence
        last[ch] = r
        best = max(best, r - l + 1)
    return best

cases = [
    ("abcabcbb", 3),
    ("bbbbb",    1),
    ("pwwkew",   3),
    ("",         0),
    ("dvdf",     3),                            # tricky: 'd' first at 0, 'd' again at 2 → "vdf"
    ("abba",     2),                            # ensures the >= l guard works
]
for s, expected in cases:
    got  = length_of_longest_substring(s)
    mark = "✓" if got == expected else "✗"
    print(f"{mark} length_of_longest_substring({s!r}) = {got}  (expected {expected})")

function lengthOfLongestSubstring(s) {
  const last = new Map();                       // char → last-seen index
  let best   = 0;
  let l      = 0;
  for (let r = 0; r < s.length; r++) {
    const ch = s[r];
    if (last.has(ch) && last.get(ch) >= l) l = last.get(ch) + 1;
    last.set(ch, r);
    if (r - l + 1 > best) best = r - l + 1;
  }
  return best;
}

const cases = [
  ["abcabcbb", 3],
  ["bbbbb",    1],
  ["pwwkew",   3],
  ["",         0],
  ["dvdf",     3],
  ["abba",     2],
];
for (const [s, expected] of cases) {
  const got = lengthOfLongestSubstring(s);
  console.log(`${got === expected ? "✓" : "✗"} lengthOfLongestSubstring(${JSON.stringify(s)}) = ${got}  (expected ${expected})`);
}

use std::collections::HashMap;

pub fn length_of_longest_substring(s: &str) -> usize {
    let mut last: HashMap<char, usize> = HashMap::new();
    let mut best = 0usize;
    let mut l    = 0usize;
    for (r, ch) in s.chars().enumerate() {
        if let Some(&prev) = last.get(&ch) {
            if prev >= l { l = prev + 1; }
        }
        last.insert(ch, r);
        best = best.max(r - l + 1);
    }
    best
}

fn main() {
    let cases = [
        ("abcabcbb", 3usize),
        ("bbbbb",    1),
        ("pwwkew",   3),
        ("",         0),
        ("dvdf",     3),
        ("abba",     2),
    ];
    for (s, expected) in cases {
        let got  = length_of_longest_substring(s);
        let mark = if got == expected { "✓" } else { "✗" };
        println!("{mark} length_of_longest_substring({:?}) = {}  (expected {})", s, got, expected);
    }
}

#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>

int length_of_longest_substring(const std::string& s) {
    std::unordered_map<char, int> last;
    int best = 0, l = 0;
    for (int r = 0; r < (int)s.size(); ++r) {
        auto it = last.find(s[r]);
        if (it != last.end() && it->second >= l) l = it->second + 1;
        last[s[r]] = r;
        best = std::max(best, r - l + 1);
    }
    return best;
}

int main() {
    std::pair<std::string, int> cases[] = {
        {"abcabcbb", 3},
        {"bbbbb",    1},
        {"pwwkew",   3},
        {"",         0},
        {"dvdf",     3},
        {"abba",     2},
    };
    for (auto& [s, expected] : cases) {
        int got = length_of_longest_substring(s);
        std::cout << (got == expected ? "✓" : "✗")
                  << " length_of_longest_substring(\"" << s << "\") = " << got
                  << "  (expected " << expected << ")\n";
    }
}

// ASCII (8-bit) alphabet → direct-indexed array, no hash table needed.
#include <stdio.h>
#include <string.h>

static int imax(int a, int b) { return a > b ? a : b; }

int length_of_longest_substring(const char* s) {
    int last[256];
    for (int i = 0; i < 256; i++) last[i] = -1;
    int best = 0, l = 0, n = (int)strlen(s);
    for (int r = 0; r < n; r++) {
        unsigned c = (unsigned char)s[r];
        if (last[c] >= l) l = last[c] + 1;
        last[c] = r;
        best = imax(best, r - l + 1);
    }
    return best;
}

static void show(const char* s, int expected) {
    int got = length_of_longest_substring(s);
    printf("%s length_of_longest_substring(\"%s\") = %d  (expected %d)\n",
           got == expected ? "✓" : "✗", s, got, expected);
}

int main(void) {
    show("abcabcbb", 3);
    show("bbbbb",    1);
    show("pwwkew",   3);
    show("",         0);
    show("dvdf",     3);
    show("abba",     2);
    return 0;
}

idle

The abba case is the one that catches sloppy implementations: when r = 3 ('a'), the previous 'a' was at index 0, but l has already moved to 2 (past the duplicate 'b'). Without the prev >= l guard, you'd reset l to 1 — moving it backwards, an invariant violation. Always: l = max(l, prev + 1).

Complexity

Time: O(n) — r walks once; l only moves forward; hash ops are O(1).
Space: O(min(n, |Σ|)) where |Σ| is the alphabet size. For ASCII strings, that's effectively O(1).
vs brute force: The naive O(n²) approach re-scans the window from every starting position. The sliding-window-with-hash approach sees each character exactly twice (once by r, at most once by l).

In the wild

Packet stream analysis. Find the longest run of distinct byte values — used as a heuristic in some compression and anomaly-detection systems.
Token / prompt analysis in LLM serving: longest distinct-token run for cache hashing & deduplication.
DNA / sequence biology. Longest substring with no repeated nucleotide motif.
The general sliding-window template powers every other "longest window where X holds" / "shortest window where Y holds" interview question.

Variations

Longest substring with at most K distinct characters. Track count of each char in the window; shrink l while distinct-count > K.
Longest substring with exactly K distinct characters. at_most(K) - at_most(K - 1) trick.
Minimum window substring of T in S. Window grows on the right until valid, then shrinks on the left as far as possible.
Permutation in a string. Sliding window of fixed length = |T| with a character-frequency vector.