Boyer Moore Algorithm for Pattern Searching
Published by
sanya sanya
This algorithm efficiently finds all occurrences of a pattern within a text. It uses two heuristic rules to skip unnecessary comparisons, making it one of the fastest string matching algorithms.
Algorithm:
1. Preprocess the pattern:
- Construct two tables, "bad" and "good".
- The bad table stores the rightmost occurrence position of each character in the pattern.
- The good table helps determine the amount to shift the pattern if a mismatch occurs.
2. Search for pattern matches in the text:
- Initialize a pointer i=0.
- Iterate from 0 to n - m in the text, where n is the length of the text and m is the length of the pattern:
- Compare the pattern from right to left, starting from the last character.
- If a mismatch occurs:
- Calculate the maximum shift distance based on the pattern's rightmost instance of the mismatched character, using the bad character rule.
- Use the good suffix rule to determine an additional shift distance based on the length of the matching suffix within the pattern.
- Shift the pattern to right by the maximum of the two shift distances.
- If a match is found, shift the pattern to the right by the length of the pattern to search for further occurrences.
The preprocessing step constructs the lookup tables, which can be done in O(m) time, where m is the length of the pattern. The pattern searching step has a time complexity of O(n/m), where n is the length of the text and m is the length of the pattern.
Library
WEB DEVELOPMENT
FAANG QUESTIONS