Input: βtextβ = βcxyzghxyzvjkxyzβ and βpatternβ = βxyzβ
Output: 2 7 13
Explanation: The pattern βpatternβ = βxyzβ appears at 3 positions in βtextβ.
The first line contains the string βtextβ.
The second line contains the string βpatternβ.
The first line contains an integer βcountβ, the number of occurrences.
The first line contains βcountβ integers, the positions of occurrence of βpatternβ in βtextβ in sorted order.
You need not print anything. Return the list of positions in any order. They will be sorted and printed.
Check each substring in βtextβ of length βmβ and see if they are equal to βpatternβ.
If we find a match we add the current index (1 - based) in our list βansβ.
The steps are as follows:
List stringMatch(string βtextβ, string βpatternβ)
Rabin-Karp algorithm is a pattern-matching algorithm that uses rolling hash. In the naive approach, we compare each substring of length equal to that of the pattern string. But we can improve it by comparing the hash value of the substring and then comparing the substring only if the hash value matches.
The hash function we are using calculates the hash value in the following way:
Since we are dealing only with lowercase English letters, we have at most 26 different characters. So we assign a value from 0 to 25 to each character. Let the values assigned be:
βaβ = 0
βbβ = 1
βcβ = 2
β¦
βzβ = 25
As we can see, the hash code for each character = Their ASCII code - 97 (ASCII code of βaβ)
If the string is βcodeβ, its hash value = 2 * 26^3 + 14 * 26^2 + 3 * 26^1 + 4 * 26^0 = 44698.
In the case of longer strings, hash values can be bigger, so we will use modular arithmetic.
For rolling hash function:
If our string is βcodesβ and we are considering substrings of length 4:
The hash value of:
βcodeβ = 2 * 26^3 + 14 * 26^2 + 3 * 26^1 + 4 * 26^0 = 44698
βodesβ = 14 * 26^3 + 3 * 26^2 + 4 * 26^1 + 18 * 26^0 = 248214
As we can see, the difference between them is:
So the hash value of βodesβ = (Hash value of βcodeβ - 2 * 26^3) * 26 + 18
Generalizing this idea, the hash value of the current substring = (Hash value of previous substring - Hash code of the first character in previous substring * 26^(length of substring - 1)) * 26 + Hash code of last character in the current substring
Our algorithm computes the hash value of each substring of βtextβ of length equal to that of βpatternβ and compares it to the hash value of βpatternβ. If both the hash values are equal, then it does a full match at that particular index.
The steps are as follows (please note that all arithmetic operations are modulo 10^9 + 7):
Int searchPatternRabinKarp(string βtextβ, string βpatternβ)