← System Design Simulator

Rabin-Karp Visualization for Coding Interviews

By Rahul Kumar · Senior Software Engineer · Updated · Category: Arrays & Strings

Rolling hash pattern search; hash collisions trigger char-by-char verify.

Concept

Rabin-Karp is the pattern-matching algorithm that trades determinism for elegance: it hashes the pattern once, then slides a rolling hash across the text so each length-m window's hash is computed in O(1) from the previous window. Windows whose hash differs cannot match, so you skip them outright; windows whose hash matches are verified character-by-character to guard against collisions. Expected time is O(n + m), worst case is O(nm) when adversarial input forces every window to collide. The real superpower is multi-pattern search: once you have a rolling hash, testing a text window against k patterns is a hashset lookup, giving O(n + total_m) expected, which is why plagiarism detectors and near-duplicate-document systems lean on it instead of KMP.

Rabin-Karp — Interactive Simulator

Runs fully client-side in your browser; no sign-up. Or open full screen →

Launch the interactive Rabin-Karp visualization — animated algorithm, step-through, and live data-structure updates.

How it works

Pick a base B (typically 31 or 131 for ASCII; a prime slightly above the alphabet size) and a modulus M (a large prime, e.g. 1_000_000_007) to keep the hash inside a 64-bit long. The pattern hash is h_p = (p[0] * B^(m-1) + p[1] * B^(m-2) + ... + p[m-1]) mod M, computed via Horner's rule. The first window's hash h_0 is computed the same way. To roll from window i to window i+1 you subtract the leading character's contribution (text[i] * B^(m-1)), multiply what remains by B, and add the new trailing character — all mod M. Because modular subtraction can go negative in Java, you add M before taking the final mod. When h_window == h_p you run a character-by-character equality check to confirm; on mismatch you keep rolling. Collision probability with one good prime is roughly m/M — small enough for most uses but not zero. For adversarial inputs (hash-flood DoS) use double hashing with two independent (B, M) pairs; the combined collision rate drops to m/(M1 * M2), which is negligible. The rolling step is the whole value proposition: without it you pay O(m) per window and you might as well run naive matching.

Implementation

RabinKarp.java
package dsa.rabinkarp;

import java.util.ArrayList;
import java.util.List;

public final class RabinKarp {
    private static final long BASE = 31L;
    private static final long MOD = 1_000_000_007L;

    // Returns all starting indices where pattern occurs in text.
    public static List<Integer> rabinKarpSearch(String text, String pattern) {
        List<Integer> hits = new ArrayList<>();
        int n = text.length(), m = pattern.length();
        if (m == 0 || m > n) return hits;

        long patternHash = 0L;
        long windowHash = 0L;
        long highestPow = 1L; // BASE^(m-1) mod MOD

        for (int i = 0; i < m - 1; i++) {
            highestPow = (highestPow * BASE) % MOD;
        }

        for (int i = 0; i < m; i++) {
            patternHash = (patternHash * BASE + pattern.charAt(i)) % MOD;
            windowHash = (windowHash * BASE + text.charAt(i)) % MOD;
        }

        for (int i = 0; i <= n - m; i++) {
            if (windowHash == patternHash && equalsAt(text, pattern, i)) {
                hits.add(i);
            }
            if (i < n - m) {
                long leading = (text.charAt(i) * highestPow) % MOD;
                windowHash = (windowHash - leading + MOD) % MOD;
                windowHash = (windowHash * BASE + text.charAt(i + m)) % MOD;
            }
        }
        return hits;
    }

    private static boolean equalsAt(String text, String pattern, int start) {
        for (int j = 0; j < pattern.length(); j++) {
            if (text.charAt(start + j) != pattern.charAt(j)) return false;
        }
        return true;
    }
}

Complexity

Common pitfalls

Key design decisions & trade-offs

Practice problems

Interview follow-ups

Why it matters

Rabin-Karp is a core part of the Arrays & Strings toolkit. If you are preparing for a technical interview at a top-tier company or teaching a course, a working visualization is the fastest path from "I read about it" to "I understand it".

Related