Topological Sort Visualization for Coding Interviews
Kahn's in-degree algorithm; cycle detection included.
Concept
Topological sort produces a linear ordering of vertices in a directed acyclic graph (DAG) such that for every directed edge u to v, u appears before v in the ordering. This is one of the foundational graph algorithms and shows up everywhere: build systems resolving dependency order, course prerequisite planning, task schedulers, spreadsheet cell recomputation, and package managers like npm, pip, or apt. Because the ordering only exists when the graph is acyclic, a correct topological sort routine must also act as a cycle detector. Two classic implementations exist: Kahn's algorithm, a BFS-style approach driven by in-degree counters, and a DFS post-order approach that reverses finishing times. Both run in O(V+E) time, but Kahn's variant is often preferred in production because it produces the order iteratively and signals cycles naturally when the output length falls short of the vertex count.
How it works
Kahn's algorithm maintains an array of in-degrees (number of incoming edges) for every vertex. We first scan all edges to populate this array, then seed a queue with every vertex whose in-degree is zero. These are the vertices with no prerequisites; they can appear first. We repeatedly pop a vertex from the queue, append it to the output order, and decrement the in-degree of each of its neighbors. Whenever a neighbor's in-degree drops to zero, that neighbor has had all of its prerequisites emitted, so we enqueue it. The loop terminates when the queue empties. If the resulting order contains all V vertices, it is a valid topological sort; if it contains fewer, the graph has a cycle and at least one vertex was never reachable with a zero in-degree. The DFS variant instead does a post-order traversal with three coloring states (unvisited, in-stack, done) and pushes each vertex onto a stack when its recursion finishes; reversing that stack yields the ordering. Both variants are linear in V+E because every edge is relaxed exactly once and every vertex is enqueued or visited exactly once.
Implementation
import java.util.*;
public class TopoSort {
/**
* Returns a topological ordering of n vertices [0..n-1].
* Throws IllegalStateException if the graph contains a cycle.
*/
public static int[] kahnTopoSort(int[][] edges, int n) {
List<List<Integer>> adj = new ArrayList<>();
for (int i = 0; i < n; i++) adj.add(new ArrayList<>());
int[] indeg = new int[n];
for (int[] e : edges) {
adj.get(e[0]).add(e[1]);
indeg[e[1]]++;
}
Deque<Integer> queue = new ArrayDeque<>();
for (int v = 0; v < n; v++) {
if (indeg[v] == 0) queue.offer(v);
}
int[] order = new int[n];
int idx = 0;
while (!queue.isEmpty()) {
int u = queue.poll();
order[idx++] = u;
for (int v : adj.get(u)) {
if (--indeg[v] == 0) queue.offer(v);
}
}
if (idx != n) {
throw new IllegalStateException("Cycle detected: emitted "
+ idx + " of " + n + " vertices");
}
return order;
}
public static void main(String[] args) {
int[][] edges = {{5,2},{5,0},{4,0},{4,1},{2,3},{3,1}};
int[] order = kahnTopoSort(edges, 6);
System.out.println(Arrays.toString(order));
}
}
Complexity
- Time:
O(V+E) - Space:
O(V+E)
Common pitfalls
- Forgetting to validate that the output length equals V, which silently hides cycles in the input graph.
- Reusing a single in-degree array across multiple runs without resetting it, leading to phantom dependencies.
- Building the adjacency list with int[] rather than List<Integer>, then over-allocating or running into fixed capacity issues with dense nodes.
- Assuming the topological order is unique; when multiple zero in-degree vertices exist, the result depends on queue ordering.
Key design decisions & trade-offs
- Kahn's BFS vs DFS post-order — Chosen: Kahn's BFS with in-degree counting. Iterative, avoids deep recursion stack overflow on long chains, and cycle detection falls out naturally from the final output length.
- Return order vs stream emissions — Chosen: Materialize the full int[] order. Most callers need random access to the order (e.g., indexing into level arrays); streaming saves memory only when V is huge, which is rare for DAGs built from source code or builds.
Practice problems
Interview follow-ups
- Extend to a lexicographically smallest topological order by swapping the ArrayDeque for a PriorityQueue.
- Report the actual cycle when detection fails by running a DFS from any remaining vertex with non-zero in-degree.
- Parallelize layers: vertices enqueued in the same round have no dependencies and can execute concurrently.
Why it matters
Topological Sort is a core part of the Graphs toolkit. If you are preparing for a technical interview at a top-tier company or teaching a course, a working visualization is the fastest path from "I read about it" to "I understand it".