In a word, KMP records the suffix substring which is a prefix of the string with max length at position i.
Preprocessing
KMP algorithm does preproceses pattern[] and constructs an auxiliary lps[] of size m (same as size of pattern) which is used to skip characters while matching.
name lps indicates longest proper prefix which is also suffix.
A proper prefix is prefix with whole string not allowed.
For example, prefixes of “ABC” are “”, “A”, “AB” and “ABC”. Proper prefixes are “”, “A” and “AB”. Suffixes of the string are “”, “C”, “BC” and “ABC”.
lps[i] = the longest proper prefix of pat[0..i] which is also a suffix of pat[0..i].
For the pattern “ababaca”, lps=[0, 0, 1, 2, 3, 0, 1].
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
publicint[] computeLps(String pattern) { int n = pattern.length(); int[] lps = newint[n]; for (int i = 1; i < n; i++) { // Start by assuming we're extending the previous LPS int j = lps[i - 1]; while (j > 0 && pattern.charAt(j) != pattern.charAt(i)) { j = lps[j - 1]; } if (pattern.charAt(j) == pattern.charAt(i)) { lps[i] = j + 1; } else { lps[i] = 0; } } return lps; }
Search
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18
publicintsearch(String pattern, String text){ int[] lps = computeLps(pattern); int j = 0; // Number of chars matched in pattern. for (int i = 0; i < text.length(); i++) { while (j > 0 && text.charAt(i) != pattern.charAt(j)) { // Fall back in the pattern. j = lps[j - 1]; // Strickly decreasing. } if (text.charAt(i) == pattern.charAt(j)) { // Next char matched, increment position. j++; if (j == pattern.length()) { return i - (j - 1); } } } return -1; // Not found. }