Longest Substring Without Repeating Characters
in the wild Detect the longest distinct-byte run in a packet stream; longest run of non-repeating tokens in an LLM prompt cache.
Problem
Given a string s, return the length of the longest substring (contiguous) that contains no repeated characters.
Examples
s = "abcabcbb"→3(the substring"abc")s = "bbbbb"→1("b")s = "pwwkew"→3("wke"— note:"pwke"is a subsequence, not a substring)s = ""→0
Why this is the canonical "sliding window + hash" problem
Naïve: for every starting index i, scan forward checking for the first repeat. O(n²) worst case.
The smart approach is a sliding window [l, r] plus a hash map of "most recent index of each character":
- Walk
rfrom0ton − 1. For each characters[r]: - If we've seen this character at index
≥ l, jumpltoprev_index + 1— we know the window can't include that character anymore. - Update the character's "most recent index" to
r. - Track
max(r − l + 1)over the run.
Each character is seen by r exactly once; l only moves forward; the hash op is O(1). Total: O(n).
This expand-r, contract-l-as-needed, hash-the-window-contents template is the sliding-window pattern for hash-augmented problems. Variants: "longest substring with K distinct chars", "shortest substring containing all letters of T", "all anagrams of T in S". Once you internalize the window contract, you write any of these in five minutes.
Hints
Hint 1 — why does a hash help?
Brute force is "for each starting position, scan rightward until you hit a repeat". The inner scan re-walks characters you've already seen. What if you remembered where you last saw each character?
Hint 2 — the window won't shrink unnecessarily
If you do hit a repeat, you don't have to slide l one step at a time. You can jump straight past the previous occurrence: l = max(l, prev_index + 1). The max matters — if the previous occurrence is already behind l, ignore it.
Hint 3 — alphabet-size optimization
If the alphabet is fixed (e.g. 128 ASCII chars), the hash map can be a direct-indexed array of size 128 — same big-O, much smaller constant.
Solution
def length_of_longest_substring(s: str) -> int:
last: dict[str, int] = {} # char → last-seen index
best = 0
l = 0
for r, ch in enumerate(s):
if ch in last and last[ch] >= l:
l = last[ch] + 1 # jump past prior occurrence
last[ch] = r
best = max(best, r - l + 1)
return best
cases = [
("abcabcbb", 3),
("bbbbb", 1),
("pwwkew", 3),
("", 0),
("dvdf", 3), # tricky: 'd' first at 0, 'd' again at 2 → "vdf"
("abba", 2), # ensures the >= l guard works
]
for s, expected in cases:
got = length_of_longest_substring(s)
mark = "✓" if got == expected else "✗"
print(f"{mark} length_of_longest_substring({s!r}) = {got} (expected {expected})")function lengthOfLongestSubstring(s) {
const last = new Map(); // char → last-seen index
let best = 0;
let l = 0;
for (let r = 0; r < s.length; r++) {
const ch = s[r];
if (last.has(ch) && last.get(ch) >= l) l = last.get(ch) + 1;
last.set(ch, r);
if (r - l + 1 > best) best = r - l + 1;
}
return best;
}
const cases = [
["abcabcbb", 3],
["bbbbb", 1],
["pwwkew", 3],
["", 0],
["dvdf", 3],
["abba", 2],
];
for (const [s, expected] of cases) {
const got = lengthOfLongestSubstring(s);
console.log(`${got === expected ? "✓" : "✗"} lengthOfLongestSubstring(${JSON.stringify(s)}) = ${got} (expected ${expected})`);
}use std::collections::HashMap;
pub fn length_of_longest_substring(s: &str) -> usize {
let mut last: HashMap<char, usize> = HashMap::new();
let mut best = 0usize;
let mut l = 0usize;
for (r, ch) in s.chars().enumerate() {
if let Some(&prev) = last.get(&ch) {
if prev >= l { l = prev + 1; }
}
last.insert(ch, r);
best = best.max(r - l + 1);
}
best
}
fn main() {
let cases = [
("abcabcbb", 3usize),
("bbbbb", 1),
("pwwkew", 3),
("", 0),
("dvdf", 3),
("abba", 2),
];
for (s, expected) in cases {
let got = length_of_longest_substring(s);
let mark = if got == expected { "✓" } else { "✗" };
println!("{mark} length_of_longest_substring({:?}) = {} (expected {})", s, got, expected);
}
}#include <algorithm>
#include <iostream>
#include <string>
#include <unordered_map>
int length_of_longest_substring(const std::string& s) {
std::unordered_map<char, int> last;
int best = 0, l = 0;
for (int r = 0; r < (int)s.size(); ++r) {
auto it = last.find(s[r]);
if (it != last.end() && it->second >= l) l = it->second + 1;
last[s[r]] = r;
best = std::max(best, r - l + 1);
}
return best;
}
int main() {
std::pair<std::string, int> cases[] = {
{"abcabcbb", 3},
{"bbbbb", 1},
{"pwwkew", 3},
{"", 0},
{"dvdf", 3},
{"abba", 2},
};
for (auto& [s, expected] : cases) {
int got = length_of_longest_substring(s);
std::cout << (got == expected ? "✓" : "✗")
<< " length_of_longest_substring(\"" << s << "\") = " << got
<< " (expected " << expected << ")\n";
}
}// ASCII (8-bit) alphabet → direct-indexed array, no hash table needed.
#include <stdio.h>
#include <string.h>
static int imax(int a, int b) { return a > b ? a : b; }
int length_of_longest_substring(const char* s) {
int last[256];
for (int i = 0; i < 256; i++) last[i] = -1;
int best = 0, l = 0, n = (int)strlen(s);
for (int r = 0; r < n; r++) {
unsigned c = (unsigned char)s[r];
if (last[c] >= l) l = last[c] + 1;
last[c] = r;
best = imax(best, r - l + 1);
}
return best;
}
static void show(const char* s, int expected) {
int got = length_of_longest_substring(s);
printf("%s length_of_longest_substring(\"%s\") = %d (expected %d)\n",
got == expected ? "✓" : "✗", s, got, expected);
}
int main(void) {
show("abcabcbb", 3);
show("bbbbb", 1);
show("pwwkew", 3);
show("", 0);
show("dvdf", 3);
show("abba", 2);
return 0;
}The abba case is the one that catches sloppy implementations: when r = 3 ('a'), the previous 'a' was at index 0, but l has already moved to 2 (past the duplicate 'b'). Without the prev >= l guard, you'd reset l to 1 — moving it backwards, an invariant violation. Always: l = max(l, prev + 1).
Complexity
- Time
- O(n) —
rwalks once;lonly moves forward; hash ops are O(1). - Space
- O(min(n, |Σ|)) where
|Σ|is the alphabet size. For ASCII strings, that's effectively O(1). - vs brute force
- The naive
O(n²)approach re-scans the window from every starting position. The sliding-window-with-hash approach sees each character exactly twice (once byr, at most once byl).
In the wild
- Packet stream analysis. Find the longest run of distinct byte values — used as a heuristic in some compression and anomaly-detection systems.
- Token / prompt analysis in LLM serving: longest distinct-token run for cache hashing & deduplication.
- DNA / sequence biology. Longest substring with no repeated nucleotide motif.
- The general sliding-window template powers every other "longest window where X holds" / "shortest window where Y holds" interview question.
Variations
- Longest substring with at most K distinct characters. Track count of each char in the window; shrink
lwhile distinct-count > K. - Longest substring with exactly K distinct characters.
at_most(K) - at_most(K - 1)trick. - Minimum window substring of T in S. Window grows on the right until valid, then shrinks on the left as far as possible.
- Permutation in a string. Sliding window of fixed length =
|T|with a character-frequency vector.