
127. Word Ladder (Hard)

Problem

Given two words beginWord and endWord, and a word list wordList, return the length of the shortest transformation sequence from beginWord to endWord, where:

  • Each step changes exactly one letter.
  • Every intermediate word must be in wordList.
  • Return 0 if there’s no valid transformation.

beginWord is not required to be in wordList; endWord must be.

Example

  • beginWord = "hit", endWord = "cog", wordList = ["hot","dot","dog","lot","log","cog"] → 5 (hit → hot → dot → dog → cog)
  • Same setup, endWord = "cot" → 0 ("cot" is not in wordList)

LeetCode 127 · Hard
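Before any code, it helps to see this as a shortest-path problem on an implicit graph: nodes are words, an edge joins two words that differ in exactly one letter, and the sequence length is the number of edges plus one. A minimal sketch of that view on the example list (the `adj` dict is illustrative only, not part of any solution below):

```python
words = ["hot", "dot", "dog", "lot", "log", "cog"]

def one_away(a, b):
    # Two same-length words are adjacent iff they differ in exactly one position.
    return sum(x != y for x, y in zip(a, b)) == 1

# Materialize the implicit graph, purely to visualize it.
adj = {w: [v for v in words if one_away(w, v)] for w in words}
print(adj["dog"])  # ['dot', 'log', 'cog']
```

All three approaches below search this graph; they differ only in how they enumerate a word's neighbors.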

Approach 1: Brute force, BFS comparing every pair

BFS from beginWord. At each frontier word, scan the whole word list for “1 letter off” candidates.

from collections import deque

def ladder_length(beginWord, endWord, wordList):
    if endWord not in wordList:                  # L1: O(N) list scan
        return 0
    words = set(wordList)                        # L2: O(N) build set
    q = deque([(beginWord, 1)])                  # L3: O(1) init queue
    visited = {beginWord}                        # L4: O(1) init visited

    def one_away(a, b):                          # L5: O(L) per call
        return sum(1 for x, y in zip(a, b) if x != y) == 1

    while q:                                     # L6: outer BFS loop, up to N iterations
        word, dist = q.popleft()                 # L7: O(1)
        if word == endWord:
            return dist
        for cand in list(words):                 # L8: O(N) scan per word
            if cand not in visited and one_away(word, cand):  # L9: O(L) per candidate
                visited.add(cand)
                q.append((cand, dist + 1))
    return 0

Where the time goes, line by line

Variables: N = len(wordList), L = length of each word (all words same length).

| Line | Per-call cost | Times executed | Contribution |
|---|---|---|---|
| L1 (endWord check) | O(N) | 1 | O(N) |
| L2 (set build) | O(N) | 1 | O(N) |
| L6 (BFS loop) | O(1) | up to N iters | O(N) |
| L8 (scan all words) | O(N) | once per dequeued word | O(N) per dequeue |
| L9 (one_away check) | O(L) | N times per dequeued word | O(N² · L) ← dominates |

Each dequeued word scans the entire word list (L8, O(N)) and calls one_away for each candidate (L9, O(L)). In the worst case every word is dequeued, giving O(N) dequeues × O(N · L) work = O(N² · L). This is why the pattern-bucket approach replaces L8+L9 with pattern-key lookups costing O(L²) per dequeued word plus amortized bucket traversal.
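To see the quadratic term concretely, here is the same brute-force BFS instrumented to count one_away calls; the counter and the `_counted` name are additions for illustration only:

```python
from collections import deque

def ladder_length_counted(beginWord, endWord, wordList):
    # Brute-force BFS from Approach 1, instrumented to count
    # one_away calls (the L9 term in the table above).
    if endWord not in wordList:
        return 0, 0
    words = set(wordList)
    calls = 0

    def one_away(a, b):
        nonlocal calls
        calls += 1
        return sum(x != y for x, y in zip(a, b)) == 1

    q = deque([(beginWord, 1)])
    visited = {beginWord}
    while q:
        word, dist = q.popleft()
        if word == endWord:
            return dist, calls
        for cand in list(words):
            if cand not in visited and one_away(word, cand):
                visited.add(cand)
                q.append((cand, dist + 1))
    return 0, calls

dist, calls = ladder_length_counted("hit", "cog",
                                    ["hot", "dot", "dog", "lot", "log", "cog"])
print(dist, calls)  # dist is 5; calls approaches N² as the list grows
```

Even on N = 6 words the call count is already a sizable fraction of N² = 36; real word lists have thousands of entries.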

Complexity

  • Time: O(N² · L), driven by L8/L9 (scanning all words and doing O(L) comparison per candidate per dequeued word).
  • Space: O(N · L).

Approach 2: Pattern buckets + BFS (optimal single-direction)

Preprocess: for each word, generate patterns like "h*t", "*ot", "ho*" (one wildcard). Two words are one-away iff they share a pattern. A dict pattern → [words] gives O(L) neighbor lookup.
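To make the bucket idea concrete, a minimal sketch of the preprocessing on three words (the `demo_words` and `buckets` names are illustrative; the implementation that follows uses the same construction):

```python
from collections import defaultdict

demo_words = ["hot", "dot", "dog"]
buckets = defaultdict(list)
for w in demo_words:
    for i in range(len(w)):
        buckets[w[:i] + "*" + w[i+1:]].append(w)

print(buckets["*ot"])  # ['hot', 'dot']  -> "hot" and "dot" are one-away
print(buckets["do*"])  # ['dot', 'dog']  -> "dot" and "dog" are one-away
# "hot" and "dog" never share a bucket: they differ in two positions.
```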

from collections import defaultdict, deque

def ladder_length(beginWord, endWord, wordList):
    if endWord not in wordList:                  # L1: O(N)
        return 0
    L = len(beginWord)                           # L2: O(1)
    patterns = defaultdict(list)                 # L3: O(1) init
    for w in wordList:                           # L4: O(N) outer
        for i in range(L):                       # L5: O(L) inner
            patterns[w[:i] + "*" + w[i+1:]].append(w)  # L6: O(L) slice + store
    q = deque([(beginWord, 1)])                  # L7: O(1)
    visited = {beginWord}                        # L8: O(1)
    while q:                                     # L9: BFS loop
        word, dist = q.popleft()                 # L10: O(1)
        if word == endWord:
            return dist
        for i in range(L):                       # L11: O(L) per word
            key = word[:i] + "*" + word[i+1:]    # L12: O(L) slice
            for nb in patterns[key]:             # L13: O(neighbors)
                if nb not in visited:
                    visited.add(nb)
                    q.append((nb, dist + 1))
            patterns[key] = []                   # L14: prune bucket
    return 0

Where the time goes, line by line

Variables: N = len(wordList), L = length of each word.

| Line | Per-call cost | Times executed | Contribution |
|---|---|---|---|
| L4-L6 (build pattern dict) | O(L) per pattern | N × L | O(N · L²) |
| L9 (BFS loop) | O(1) | up to N iters | O(N) |
| L11-L12 (generate key) | O(L) | L times per dequeued word | O(L²) per word, O(N · L²) total |
| L13 (neighbor lookup) | O(neighbors per pattern) | L buckets per word; O(N · L) bucket entries total | O(N · L²) ← dominates |
| L14 (prune bucket) | O(1) | L per dequeued word | O(N · L) |

The bottleneck is the preprocessing (L4-L6) plus the BFS neighbor lookups (L13). Each word generates L pattern keys, each of length L, so building the pattern dict costs O(N · L²). BFS visits each edge at most once (via L14 pruning), and there are at most O(N · L) edges (each word shares O(L) patterns with each neighbor), giving total BFS work O(N · L²).

Complexity

  • Time: O(N · L²), driven by L4-L6/L13 (pattern construction and BFS neighbor traversal, each O(N · L²) total).
  • Space: O(N · L²).

Approach 3: Bidirectional BFS

Start BFS from both beginWord and endWord at once, always expanding the smaller frontier; when the two frontiers meet, the current depth gives the answer. This is much faster on deep graphs: with branching factor b and shortest-path length d, two half-depth searches explore roughly 2 · b^(d/2) states instead of b^d.

def ladder_length(beginWord, endWord, wordList):
    words = set(wordList)                        # L1: O(N) build set
    if endWord not in words:                     # L2: O(1) set lookup
        return 0
    L = len(beginWord)                           # L3: O(1)
    front, back = {beginWord}, {endWord}         # L4: O(1) init frontiers
    visited = {beginWord, endWord}               # L5: O(1)
    dist = 1                                     # L6: O(1)
    while front and back:                        # L7: outer loop, O(diameter) levels
        if len(front) > len(back):               # L8: swap to keep front smaller
            front, back = back, front
        next_front = set()
        for w in front:                          # L9: iterate smaller frontier
            for i in range(L):                   # L10: O(L) positions
                for ch in "abcdefghijklmnopqrstuvwxyz":  # L11: O(26) chars
                    cand = w[:i] + ch + w[i+1:]  # L12: O(L) slice
                    if cand in back:             # L13: O(1) set lookup
                        return dist + 1
                    if cand in words and cand not in visited:  # L14: O(1)
                        visited.add(cand)
                        next_front.add(cand)
        front = next_front
        dist += 1
    return 0

Where the time goes, line by line

Variables: N = len(wordList), L = length of each word.

| Line | Per-call cost | Times executed | Contribution |
|---|---|---|---|
| L1 (build set) | O(N) | 1 | O(N) |
| L7 (BFS levels) | O(1) | O(diameter) levels | O(diameter) |
| L9-L12 (candidate generation) | O(L) per slice | 26 · L per frontier word | O(N · L²) ← dominates worst case |
| L13-L14 (candidate test) | O(1) | 26 · L per frontier word | O(N · L) |

In the worst case the frontiers still sweep O(N) words and every candidate costs an O(L) slice, so the bound stays O(N · L²). The win is depth: with branching factor b and shortest-path length d, two half-depth searches explore roughly 2 · b^(d/2) states instead of b^d, so in practice the smaller frontier stays tiny and the search terminates far earlier than unidirectional BFS.

Complexity

  • Time: O(N · L²) worst case, dominated by the 26 · L candidate slices per frontier word (L9-L12); typically much faster than unidirectional BFS.
  • Space: O(N · L).

Summary

| Approach | Time | Space | Notes |
|---|---|---|---|
| BFS with pairwise check | O(N² · L) | O(N · L) | Too slow for large inputs |
| BFS with pattern buckets | O(N · L²) | O(N · L²) | Canonical |
| Bidirectional BFS | O(N · L²) worst; much faster practically | O(N · L) | Best in practice |

Bidirectional BFS is a general technique for “minimum number of edits” problems where both source and target are known.
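As a sketch of that generality, assuming only a caller-supplied neighbor function (`bidir_bfs` and `word_neighbors` are illustrative names, not a standard API); the template returns an edge count, so the word-ladder answer is the result plus one:

```python
def bidir_bfs(source, target, neighbors):
    """Fewest edits between source and target, or -1 if unreachable.

    `neighbors(state)` yields states exactly one edit away.
    """
    if source == target:
        return 0
    front, back = {source}, {target}
    visited = {source, target}
    dist = 0
    while front and back:
        if len(front) > len(back):     # always expand the smaller frontier
            front, back = back, front
        dist += 1
        nxt = set()
        for state in front:
            for cand in neighbors(state):
                if cand in back:       # the two searches meet
                    return dist
                if cand not in visited:
                    visited.add(cand)
                    nxt.add(cand)
        front = nxt
    return -1

# Word Ladder expressed through the template: edges + 1 = ladder length.
words = {"hot", "dot", "dog", "lot", "log", "cog"}

def word_neighbors(w):
    for i in range(len(w)):
        for ch in "abcdefghijklmnopqrstuvwxyz":
            cand = w[:i] + ch + w[i+1:]
            if cand != w and cand in words:
                yield cand

print(bidir_bfs("hit", "cog", word_neighbors) + 1)  # 5
```

Swapping in a different `neighbors` generator adapts the same template to other edit-distance-style searches without touching the traversal logic.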

Test cases

# Quick smoke tests: paste into a REPL or save as test_127.py and run.
# Uses Approach 2 (pattern buckets + BFS) as the canonical implementation.
from collections import defaultdict, deque

def ladder_length(beginWord, endWord, wordList):
    if endWord not in wordList:
        return 0
    L = len(beginWord)
    patterns = defaultdict(list)
    for w in wordList:
        for i in range(L):
            patterns[w[:i] + "*" + w[i+1:]].append(w)
    q = deque([(beginWord, 1)])
    visited = {beginWord}
    while q:
        word, dist = q.popleft()
        if word == endWord:
            return dist
        for i in range(L):
            key = word[:i] + "*" + word[i+1:]
            for nb in patterns[key]:
                if nb not in visited:
                    visited.add(nb)
                    q.append((nb, dist + 1))
            patterns[key] = []
    return 0

def _run_tests():
    # Example from problem statement
    assert ladder_length("hit", "cog", ["hot","dot","dog","lot","log","cog"]) == 5
    # endWord not in the list
    assert ladder_length("hit", "cot", ["hot","dot","dog","lot","log","cog"]) == 0
    # Single step
    assert ladder_length("hot", "dot", ["dot","lot"]) == 2
    # endWord removed from the list
    assert ladder_length("hit", "cog", ["hot","dot","dog","lot","log"]) == 0
    # Two words, direct transform
    assert ladder_length("a", "c", ["a","b","c"]) == 2
    print("all tests pass")

if __name__ == "__main__":
    _run_tests()