Prefix Sum Visualization for Coding Interviews
Range-sum queries in O(1) after O(n) preprocessing.
Concept
Prefix Sum is the cheapest precomputation in competitive programming: spend O(n) up front building P[i] = a[0] + a[1] + ... + a[i-1], and every range-sum query a[l..r] collapses to P[r+1] - P[l] in O(1). It is the foundation for subarray-sum-equals-k (prefix + hashmap), count-range-sum, equilibrium index, range-average queries, 2D image-integral queries, and offline difference-array updates. Once you internalize that 'subarray sum = difference of two prefix sums', a whole class of O(n^2) brute-force problems becomes linear — you iterate once, maintain a hashmap of prefix-sum-to-frequency, and look up what complementary prefix would have produced the target. It generalizes to prefix-XOR for subarray-XOR problems and to prefix-products when divisions are allowed.
How it works
The canonical construction uses an array of length n+1 with P[0] = 0 and P[i] = P[i-1] + a[i-1], so rangeSum(l, r) = P[r+1] - P[l] regardless of whether l == 0. That sentinel avoids a branch on l == 0 and makes the invariant 'P[i] is the sum of the first i elements' uniform. For the subarray-sum-equals-k problem, you stream through the array maintaining a running prefix sum and a hashmap from prefix-sum-value to count-of-times-seen. At each index, if (running - k) has been seen c times before, then c subarrays ending at this index sum to exactly k — because each such earlier prefix, subtracted from the current one, leaves a range whose sum is k. You add 0 -> 1 to the map before starting so that a prefix equal to k itself is counted. The same trick extends to prefix-XOR (lookup running ^ k) and to 'divisible by k' queries (lookup running mod k). The difference array is the dual: to add v to every element in [l, r] offline, do diff[l] += v and diff[r+1] -= v; a single prefix-sum pass materializes the final array in O(n). The choice between inclusive and exclusive prefix conventions is a coin flip, but mixing them is the number-one source of off-by-one bugs — pick one and write it on the whiteboard.
Implementation
package dsa.prefixsum;
public final class PrefixSum {
private final long[] p; // p[i] = sum of first i elements, p[0] = 0
public PrefixSum(int[] a) {
this.p = new long[a.length + 1];
for (int i = 0; i < a.length; i++) {
p[i + 1] = p[i] + a[i];
}
}
// Sum of a[l..r] inclusive, 0-indexed.
public long rangeSum(int l, int r) {
if (l < 0 || r >= p.length - 1 || l > r) {
throw new IllegalArgumentException("bad range [" + l + "," + r + "]");
}
return p[r + 1] - p[l];
}
public long totalSum() { return p[p.length - 1]; }
}
package dsa.prefixsum;
import java.util.HashMap;
import java.util.Map;
public final class SubarraySum {
// Number of contiguous subarrays whose sum equals k. Works with negatives.
public static int subarraySumEqualsK(int[] nums, int k) {
Map<Long, Integer> freq = new HashMap<>();
freq.put(0L, 1); // empty prefix
long running = 0;
int count = 0;
for (int v : nums) {
running += v;
Integer hits = freq.get(running - k);
if (hits != null) count += hits;
freq.merge(running, 1, Integer::sum);
}
return count;
}
// Offline range-add: adds val to every index in [l, r] inclusive.
public static int[] applyRangeUpdates(int n, int[][] updates) {
long[] diff = new long[n + 1];
for (int[] u : updates) {
int l = u[0], r = u[1], val = u[2];
diff[l] += val;
diff[r + 1] -= val;
}
int[] out = new int[n];
long run = 0;
for (int i = 0; i < n; i++) {
run += diff[i];
out[i] = (int) run;
}
return out;
}
}
Complexity
- Time:
O(n) build, O(1) per range query, O(n + q) for q offline updates - Space:
O(n) for the prefix array
Common pitfalls
- Mixing inclusive and exclusive conventions across helpers — stick to one and document it
- Forgetting to seed freq.put(0, 1) in subarraySumEqualsK, which drops subarrays that start at index 0
- Using int for the running sum when values can push past Integer.MAX_VALUE — promote to long
- Off-by-one in rangeSum: P[r+1] - P[l] is inclusive of both ends when built with the n+1 convention
- Applying a difference-array update with r+1 out of bounds — size the diff array to n+1
Key design decisions & trade-offs
- Prefix convention — Chosen: Exclusive P[0]=0, length n+1 vs inclusive P[0]=a[0]. Exclusive removes the branch on l == 0 and is easier to compose; inclusive saves one slot but forces a conditional in every query.
- Subarray-sum lookup structure — Chosen: HashMap<Long,Integer> vs sorted TreeMap. HashMap is O(1) average and enough for equality queries. TreeMap is only needed when the question is 'count prefixes in a range' (count-range-sum).
- Build once vs on-the-fly — Chosen: Precompute full P array for many queries. If you only have one query, the build itself is O(n) — just sum directly. Prefix sums pay off at >= 2 queries.
Practice problems
Interview follow-ups
- Extend to 2D prefix sums for rectangle-sum queries — what's the inclusion-exclusion formula?
- Solve count-range-sum (count prefix-sum differences in [lower, upper]) in O(n log n)
- Adapt the technique for subarray-XOR-equals-k and explain why the hashmap key changes
- Combine prefix sum with a Fenwick tree to support point updates + range queries — when is it worth the jump?
- How would you make the offline difference-array technique work with online queries interleaved with updates?
Why it matters
Prefix Sum is a core part of the Arrays & Strings toolkit. If you are preparing for a technical interview at a top-tier company or teaching a course, a working visualization is the fastest path from "I read about it" to "I understand it".