You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
API Kernel: UnionFindConnectivityCore Mechanism: Maintain disjoint sets with efficient union and find operations using path compression and union by rank.
This document presents the canonical Union-Find templates covering connectivity queries, cycle detection, component counting, and equivalence grouping. Each implementation includes path compression and union by rank for near-constant time operations.
classUnionFind:
def__init__(self, n: int):
self.parent=list(range(n)) # Initially, each node is its own parentself.rank= [0] *n# Rank (tree depth upper bound)deffind(self, x: int) ->int:
"""Find with path compression."""ifself.parent[x] !=x:
self.parent[x] =self.find(self.parent[x])
returnself.parent[x]
defunion(self, x: int, y: int) ->bool:
"""Union by rank. Returns True if union performed (different sets)."""px, py=self.find(x), self.find(y)
ifpx==py:
returnFalse# Already in same set# Union by rankifself.rank[px] <self.rank[py]:
px, py=py, pxself.parent[py] =pxifself.rank[px] ==self.rank[py]:
self.rank[px] +=1returnTrue
1.2 Why Path Compression?
# Without path compression:# 1 After find(5): 1→2→3→4→5# / Every find traverses full chain: O(n)# 2# /# 3# \# 4# \# 5# With path compression:# 1 After find(5): All nodes point to root# / | \ \ Subsequent finds: O(1)# 2 3 4 5
1.3 Why Union by Rank?
# Without rank: Tree can become a linked list (O(n) per operation)# With rank: Tree height stays O(log n)# Union by rank: Always attach shorter tree under taller tree# This keeps tree balanced
1.4 Time Complexity Analysis
Operation
Amortized Time
Find
O(α(n)) ≈ O(1)
Union
O(α(n)) ≈ O(1)
Connected
O(α(n)) ≈ O(1)
Where α(n) is the inverse Ackermann function, which grows extremely slowly (α(n) ≤ 4 for any practical n).
Problem: Count connected components in an adjacency matrix graph.
Invariant: Each union reduces component count by 1.
Role: BASE TEMPLATE for Union-Find connectivity.
2.1 Pattern Recognition
Signal
Indicator
"connected components"
→ Union-Find or DFS
"adjacency matrix"
→ Symmetric edges, union both (i,j)
"count groups"
→ Track components during unions
2.2 Implementation
# Pattern: union_find_connected_components# See: docs/patterns/union_find/templates.md Section 1 (Base Template)classSolution:
deffindCircleNum(self, isConnected: List[List[int]]) ->int:
""" Count number of provinces (connected components). Key Insight: - Start with n components (each city is its own province) - Each successful union reduces count by 1 - Final count = number of provinces Why Union-Find over DFS? - Same complexity O(n²) for this problem - Union-Find is more natural for connectivity queries - Easier to extend for dynamic edge additions """n=len(isConnected)
parent=list(range(n))
rank= [0] *ndeffind(x: int) ->int:
ifparent[x] !=x:
parent[x] =find(parent[x])
returnparent[x]
defunion(x: int, y: int) ->bool:
px, py=find(x), find(y)
ifpx==py:
returnFalseifrank[px] <rank[py]:
px, py=py, pxparent[py] =pxifrank[px] ==rank[py]:
rank[px] +=1returnTruecomponents=nforiinrange(n):
forjinrange(i+1, n): # Only upper triangle (symmetric)ifisConnected[i][j] ==1:
ifunion(i, j):
components-=1returncomponents
Problem: Find the edge that creates a cycle in an undirected graph.
Invariant: Union returns False when connecting already-connected nodes.
Role: BASE TEMPLATE for cycle detection.
3.1 Pattern Recognition
Signal
Indicator
"creates cycle"
→ Union-Find: union returns False
"redundant edge"
→ Edge connecting same component
"remove one edge"
→ First (or last) edge creating cycle
3.2 Implementation
# Pattern: union_find_cycle_detection# See: docs/patterns/union_find/templates.md Section 2classSolution:
deffindRedundantConnection(self, edges: List[List[int]]) ->List[int]:
""" Find the edge that creates a cycle. Key Insight: - Process edges in order - Union returns False if nodes already connected - That edge is redundant (creates a cycle) Why this works: - In a tree with n nodes, there are n-1 edges - Adding one more edge creates exactly one cycle - The edge connecting already-connected nodes is redundant """n=len(edges)
parent=list(range(n+1)) # 1-indexedrank= [0] * (n+1)
deffind(x: int) ->int:
ifparent[x] !=x:
parent[x] =find(parent[x])
returnparent[x]
defunion(x: int, y: int) ->bool:
px, py=find(x), find(y)
ifpx==py:
returnFalse# Cycle detected!ifrank[px] <rank[py]:
px, py=py, pxparent[py] =pxifrank[px] ==rank[py]:
rank[px] +=1returnTrueforu, vinedges:
ifnotunion(u, v):
return [u, v]
return [] # Should never reach here
Problem says: Return the LAST edge that creates a cycle
(if multiple edges could be removed)
Input: [[1,2],[2,3],[3,4],[1,4],[1,5]]
Graph: 1-2-3-4-1 (cycle) + 1-5
Edge [1,4] creates the cycle.
If we check [3,4] first: no cycle yet
If we check [1,4] second: cycle detected!
3.6 Complexity Analysis
Aspect
Complexity
Time
O(n × α(n)) ≈ O(n)
Space
O(n) for parent/rank arrays
3.7 Related Problems
Problem
Variation
LC 685: Redundant Connection II
Directed graph (harder)
LC 261: Graph Valid Tree
Check if exactly n-1 edges
LC 1319: Network Operations
Count extra edges
4. Accounts Merge (LeetCode 721)
Problem: Merge accounts that share common emails.
Invariant: Same email in different accounts = same person.
Role: VARIANT applying Union-Find to string grouping.
4.1 Pattern Recognition
Signal
Indicator
"merge by common element"
→ Union-Find with element mapping
"group equivalent items"
→ Map items to indices
"transitive equivalence"
→ If AB and BC, then A~C
4.2 Implementation
# Pattern: union_find_equivalence_grouping# See: docs/patterns/union_find/templates.md Section 3classSolution:
defaccountsMerge(self, accounts: List[List[str]]) ->List[List[str]]:
""" Merge accounts sharing common emails. Key Insight: - Map each email to first account index that has it - If email seen before, union current account with previous account - Collect emails by component root Why Union-Find? - Handles transitive relationships naturally - A shares email with B, B shares with C → A, B, C all merge """fromcollectionsimportdefaultdictn=len(accounts)
parent=list(range(n))
rank= [0] *ndeffind(x: int) ->int:
ifparent[x] !=x:
parent[x] =find(parent[x])
returnparent[x]
defunion(x: int, y: int) ->None:
px, py=find(x), find(y)
ifpx==py:
returnifrank[px] <rank[py]:
px, py=py, pxparent[py] =pxifrank[px] ==rank[py]:
rank[px] +=1# Map email to first account indexemail_to_account: dict[str, int] = {}
fori, accountinenumerate(accounts):
foremailinaccount[1:]: # Skip nameifemailinemail_to_account:
# Union this account with account that has same emailunion(i, email_to_account[email])
else:
email_to_account[email] =i# Collect emails by rootroot_to_emails: dict[int, set[str]] =defaultdict(set)
foremail, account_idxinemail_to_account.items():
root=find(account_idx)
root_to_emails[root].add(email)
# Build resultresult: list[list[str]] = []
forroot, emailsinroot_to_emails.items():
name=accounts[root][0]
result.append([name] +sorted(emails))
returnresult
5. Number of Operations to Make Network Connected (LeetCode 1319)
Problem: Find minimum operations to connect all computers.
Invariant: Need (components - 1) edges to connect all components.
Role: VARIANT combining component counting with edge availability.
5.1 Pattern Recognition
Signal
Indicator
"make all connected"
→ Count components, need c-1 edges
"rearrange edges"
→ Count redundant edges (extras)
"minimum operations"
→ Move redundant edges to connect
5.2 Implementation
# Pattern: union_find_network_connectivity# See: docs/patterns/union_find/templates.md Section 4classSolution:
defmakeConnected(self, n: int, connections: List[List[int]]) ->int:
""" Minimum cables to move to connect all computers. Key Insight: - Need at least n-1 edges to connect n nodes - Redundant edges (forming cycles) can be moved - Count components and check if enough redundant edges Algorithm: 1. If edges < n-1: impossible (-1) 2. Count components using Union-Find 3. Need (components - 1) moves to connect all """iflen(connections) <n-1:
return-1# Not enough cablesparent=list(range(n))
rank= [0] *ndeffind(x: int) ->int:
ifparent[x] !=x:
parent[x] =find(parent[x])
returnparent[x]
defunion(x: int, y: int) ->bool:
px, py=find(x), find(y)
ifpx==py:
returnFalse# Redundant edgeifrank[px] <rank[py]:
px, py=py, pxparent[py] =pxifrank[px] ==rank[py]:
rank[px] +=1returnTruecomponents=nfora, binconnections:
ifunion(a, b):
components-=1returncomponents-1
6. Satisfiability of Equality Equations (LeetCode 990)
Problem: Check if equality/inequality equations are satisfiable.
Invariant: Equal variables must be in same component; unequal must be in different.
Role: VARIANT applying Union-Find to constraint satisfaction.
6.1 Pattern Recognition
Signal
Indicator
"equality constraints"
→ Union-Find: union equal variables
"inequality constraints"
→ Check: must be in different components
"satisfiable"
→ Process equalities first, then check inequalities
6.2 Implementation
# Pattern: union_find_constraint_satisfaction# See: docs/patterns/union_find/templates.md Section 5classSolution:
defequationsPossible(self, equations: List[str]) ->bool:
""" Check if all equations can be satisfied. Key Insight: - Process '==' first: union all equal variables - Then check '!=': must be in different components - If any '!=' has variables in same component → unsatisfiable Why two passes? - Equality is transitive (a==b, b==c → a==c) - Must build all equality relationships before checking inequality """parent=list(range(26)) # 26 lowercase lettersrank= [0] *26deffind(x: int) ->int:
ifparent[x] !=x:
parent[x] =find(parent[x])
returnparent[x]
defunion(x: int, y: int) ->None:
px, py=find(x), find(y)
ifpx==py:
returnifrank[px] <rank[py]:
px, py=py, pxparent[py] =pxifrank[px] ==rank[py]:
rank[px] +=1# Pass 1: Process all equalitiesforeqinequations:
ifeq[1] =='=': # "a==b"x=ord(eq[0]) -ord('a')
y=ord(eq[3]) -ord('a')
union(x, y)
# Pass 2: Check all inequalitiesforeqinequations:
ifeq[1] =='!': # "a!=b"x=ord(eq[0]) -ord('a')
y=ord(eq[3]) -ord('a')
iffind(x) ==find(y):
returnFalse# Contradiction!returnTrue
# ✓ USE Union-Find when:# - Multiple connectivity queries after building# - Dynamic edge additions (but not deletions!)# - Only need "are X and Y connected?" not "what's the path?"# - Detecting cycles during graph construction# ✗ DON'T USE Union-Find when:# - Need to find actual path between nodes# - Need to handle edge deletions# - Single connectivity query (DFS is simpler)# - Need shortest path (BFS is better)
START: Given connectivity/grouping problem
│
├─ "Count connected components"?
│ └─ YES → Union-Find with component counter
│ (LC 547, 1319 pattern)
│
├─ "Find cycle" or "redundant edge"?
│ └─ YES → Union-Find: union returns False = cycle
│ (LC 684 pattern)
│
├─ "Merge/group by common element"?
│ └─ YES → Map elements to indices, union on match
│ (LC 721 pattern)
│
├─ "Equality + inequality constraints"?
│ └─ YES → Two-pass: equalities first, then check inequalities
│ (LC 990 pattern)
│
├─ "Connect all with minimum operations"?
│ └─ YES → Count components, need c-1 moves
│ Check if enough extra edges
│ (LC 1319 pattern)
│
└─ "Path between nodes" or "shortest path"?
└─ NO → Union-Find can help
└─ YES → Use DFS/BFS instead
8.2 Feature Selection Guide
Need to track component sizes?
→ Add size[] array, update in union()
Need weighted relationships (like a/b = 2)?
→ Use weighted Union-Find with ratio tracking
Need to handle string keys (not just indices)?
→ Use dict for parent instead of list
→ Or map strings to indices first
Need to count redundant edges?
→ Count when union() returns False
8.3 Common Mistakes to Avoid
Mistake
Why Wrong
Correct Approach
Not using path compression
O(n) per find
Always compress path
Forgetting to update rank/size
Unbalanced trees
Update after linking
Processing order wrong
Wrong answer
Equalities before inequalities
Using 0-indexed for 1-indexed input
Off-by-one
Match problem indexing
Not checking if enough edges
Wrong answer for impossible
Check edges >= n-1
8.4 Index Mapping Patterns
# 1-indexed nodes (LC 684)parent=list(range(n+1)) # Extra slot for 1-indexing# Character indices (LC 990)x=ord(char) -ord('a') # 'a'->0, 'b'->1, etc.# String to index mapping (LC 721)string_to_idx= {}
fori, sinenumerate(strings):
ifsnotinstring_to_idx:
string_to_idx[s] =len(string_to_idx)
# Coordinate to index (grid problems)idx=row*cols+col