Tarjan SCC Visualization for Coding Interviews
DFS + low-link; each SCC popped off the stack once.
Concept
Tarjan's algorithm finds all strongly connected components (SCCs) of a directed graph in a single depth-first search pass, running in O(V+E) time. An SCC is a maximal set of vertices where every vertex can reach every other vertex via directed edges. SCCs are the condensation blocks of a directed graph: contracting each one to a single node produces a DAG, which is exactly what enables a huge class of downstream analyses such as 2-SAT, dataflow analysis, deadlock detection, web link community discovery, and dependency cycle reporting in build systems. Tarjan's approach is elegant because it emits SCCs in reverse topological order of the condensation DAG without any second pass (unlike Kosaraju's two-DFS method). The core insight is to track, for every vertex, the smallest discovery time reachable from its DFS subtree, then recognize a component's root by comparing that low-link value to the vertex's own discovery time.
How it works
We run a DFS from every unvisited vertex, assigning each vertex a discovery time disc[u] (a counter incremented on first visit) and a low-link value low[u] initialized to disc[u]. We maintain an explicit stack of vertices currently being explored along with an onStack[] boolean array, and we push each vertex onto this stack when we first visit it. When DFS relaxes an edge u to v there are three cases. If v is unvisited, we recurse, then update low[u] = min(low[u], low[v]). If v is on the stack (a back or cross edge inside the current SCC), we update low[u] = min(low[u], disc[v]). Otherwise v belongs to an already emitted component and we ignore it. After processing all of u's neighbors, if low[u] equals disc[u] then u is the root of an SCC. We pop vertices off the stack until we pop u itself; every popped vertex is part of this component and is marked off the stack. The algorithm visits every edge and vertex at most once, yielding linear time with O(V) extra space for the stack, disc, low, and onStack arrays.
Implementation
import java.util.*;
public class Tarjan {
private int timer;
private int[] disc, low;
private boolean[] onStack;
private Deque<Integer> stack;
private List<List<Integer>> adj;
private List<List<Integer>> sccs;
public List<List<Integer>> tarjanSCC(List<List<Integer>> adj, int n) {
this.adj = adj;
this.disc = new int[n];
this.low = new int[n];
this.onStack = new boolean[n];
this.stack = new ArrayDeque<>();
this.sccs = new ArrayList<>();
Arrays.fill(disc, -1);
timer = 0;
for (int v = 0; v < n; v++) {
if (disc[v] == -1) dfs(v);
}
return sccs;
}
private void dfs(int u) {
disc[u] = low[u] = timer++;
stack.push(u);
onStack[u] = true;
for (int v : adj.get(u)) {
if (disc[v] == -1) {
dfs(v);
low[u] = Math.min(low[u], low[v]);
} else if (onStack[v]) {
low[u] = Math.min(low[u], disc[v]);
}
}
if (low[u] == disc[u]) {
List<Integer> component = new ArrayList<>();
while (true) {
int w = stack.pop();
onStack[w] = false;
component.add(w);
if (w == u) break;
}
sccs.add(component);
}
}
public static void main(String[] args) {
int n = 5;
List<List<Integer>> adj = new ArrayList<>();
for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
adj.get(1).add(0);
adj.get(0).add(2);
adj.get(2).add(1);
adj.get(0).add(3);
adj.get(3).add(4);
System.out.println(new Tarjan().tarjanSCC(adj, n));
}
}
Complexity
- Time:
O(V+E) - Space:
O(V)
Common pitfalls
- Confusing disc[v] and low[v] when updating from a back edge: you must use disc[v], not low[v], for cross/back edges.
- Recursive DFS will overflow the JVM stack on deep graphs (millions of vertices); convert to an iterative explicit-frame stack for production.
- Forgetting to clear onStack[] when popping, which would cause later vertices to falsely appear in the current component.
- Mixing up the explicit SCC stack with the DFS recursion stack; they serve different purposes and cannot be merged.
Key design decisions & trade-offs
- Tarjan vs Kosaraju — Chosen: Tarjan's single-pass DFS. Avoids building the transpose graph and running DFS twice; components come out in reverse topological order for free.
- Recursive vs iterative DFS — Chosen: Recursive for clarity here. Recursion reads closer to the textbook; in production switch to an explicit frame stack to survive deep graphs.
Practice problems
Interview follow-ups
- Build the condensation DAG by mapping every vertex to its SCC id, then run a topological sort on the contracted graph.
- Solve 2-SAT by running Tarjan on the implication graph and checking that no variable and its negation share an SCC.
- Convert the recursion to an iterative frame-based DFS to handle graphs with millions of vertices safely.
Why it matters
Tarjan SCC is a core part of the Graphs toolkit. If you are preparing for a technical interview at a top-tier company or teaching a course, a working visualization is the fastest path from "I read about it" to "I understand it".