'Coding has over 700 languages', '67% of programming jobs aren’t in the technology industry', 'Coding is behind almost everything that is powered by electricity'
Last Updated: 3 Feb, 2021

Boyer Moore Algorithm For Pattern Searching

Moderate
Asked in companies
Goldman SachsGoogleRxLogix

Problem statement

You are given a string ‘text’ and a string ‘pattern’, your task is to find all occurrences of pattern in the string ‘text’ and return an array of indexes of all those occurrences. You may assume that the length of ‘text’ is always greater than the length of ‘pattern’.

For example :
Let text = “this is a good place to have a good start”, pattern = “good” so you have to return {10, 31} because at 10 and 31 index pattern is present in the text.
Note :
If there is no such index in the text then just return an array containing -1.
Input format :
The first line of input contains an integer ‘T’ denoting the number of test cases.

The first line of each test case contains a string ‘text’ in which the pattern will be searched for.

The second line of each test case contains a string ‘pattern’ of which you have to find all occurrences in ’text’.
Output format :
For each test case, return an array representing the indices where the pattern is found in the text. The output of each test case is printed in a separate line. 
Note :
You don’t have to print anything; it has already been taken care of. Just implement the given function. 
Constraints :
1 <= T <= 5
1 <= M < 100
1 <= N < M

Time Limit: 1 sec

Approaches

01 Approach

The idea is to find out which character mismatches in the text from the pattern that is the character of the text that doesn't match with the current character of the pattern and we call that character a bad character.

 

  • One thing here needed to be observed is that unlike other algorithms for pattern searching boyer-Moore algorithm starts matching from the last character of the pattern.
  • Therefore on finding a mismatch seen from the last, we will shift the pattern until the mismatch becomes a match and the pattern moves past the mismatched character i.e. pattern crosses the bad character. We will have two cases here as discussed-
  • While implementing this we preprocess the pattern and store the last occurrence of every possible character in an array of size equal to alphabet size.
  • In the first case when the mismatch becomes a match through shifting, We will find the position of the last occurrence of the mismatching character in the pattern and if the mismatching character exists in the pattern then we’ll shift the pattern such that it gets aligned to the mismatching character in the given text. For example ‘text’ = “ABCAFDBACA” and ‘pattern’ = “BACA”.
    • We can clearly see that we have our first mismatch at position 1 and the mismatched character is “A” so we will find the last occurrence of “A” in a pattern that is present at position 1 as we check from last.
    • Now we will shift the pattern two times so that “A” in the pattern gets aligned with the “A” in text.
  • Now for the second case where we have to move the pattern past the mismatched character because if the bad character does not exist in the pattern there is no point in searching again so we move the pattern past it like in the above example after moving the pattern two times ahead we get a mismatch at position 4 and the mismatching character “F” does not exist in the pattern before position 4 so we’ll shift pattern past to the position 4 and eventually in above example we have got a perfect match of pattern.