Hello Members,
I have a question regarding the worst case running time for Knuth Morris Pratt Algorithm for String Matching:
Let's say the T = AAAAB and P = AB.
In this case, the prefix table is:
Cell# P Skip Distance
0 A 0
1 B 0
As I scan through T and compare values of P with it:
AAAAB
AB
There is a partial match at Cell#0. Since the skip distance for Cell#0 for P is 0, we move P to the next cell:
AAAAB
AB
Hence, the same scenario will repeat three times (6 comparisons) until T matches the substring , "AB" in P (2 more comparisons). In all there were 8 comparisons = Length of P * Length of T - Length of T = 5 * 2 - 2 = 8 = O(mn) - O(m) = O(mn). Obviously, this is not O(m+n), the worst case bound for the KMP algorithm.
What am I doing wrong? I feel my prefix table is incorrect, but I cannot see any other values in the prefix table if I follow the definiton of "longest suffix in P that is also a prefix in P".
I would be grateful for any help.
Thank you!