
Marshall Clow wrote:
Unless I'm sadly mistaken (which has been known to happen), we are shifting both in the pattern and the corpus. In that code, "match_start" is the place where the match might be, and "idx" is where we want to start comparing. If idx == 0, then we start comparing from the beginning of the pattern, but otherwise, from somewhere in the middle.
I recommend you to study carefully this info: http://www-igm.univ-mlv.fr/~lecroq/string/node8.html#SECTION0080 And it will also help to press the "visualization" button and experiment with the applet. Furthermore I have included bellow a typical C/C++ KMP implementation, which is a slightly modified version of the example presented in the above site. You will clearly see in this sample that the corpus is never shifted. Best regards Jim Xochellis Homepage: https://sites.google.com/site/xochellis/ //************************************* #include <cstdio> #include <cstring> #define XSIZE 256 void KMP(const char *pPattern, int patternSize , const char *pCorpus, int corpusSize); //------------------------------------------ int main(int argc, char* argv[]) { const char *pCorpus= "Knu of Knuth!"; const char *pPattern= "Knuth"; KMP(pPattern, strlen(pPattern), pCorpus, strlen(pCorpus)); getchar(); return 0; } //------------------------------------------ void preKmp(const char *pPattern, int patternSize , int kmpNext[]) { int i, j; i = 0; j = kmpNext[0] = -1; while (i < patternSize ) { while (j > -1 && pPattern[i] != pPattern[j]) j = kmpNext[j]; i++; j++; if (pPattern[i] == pPattern[j]) kmpNext[i] = kmpNext[j]; else kmpNext[i] = j; } } void KMP(const char *pPattern, int patternSize , const char *pCorpus, int corpusSize) { int patternIdx, corpusIdx, kmpNext[XSIZE]; /* Preprocessing */ preKmp(pPattern, patternSize , kmpNext); /* Searching */ patternIdx = corpusIdx = 0; while (corpusIdx < corpusSize) { while (patternIdx > -1 && pPattern[patternIdx] != pCorpus[corpusIdx]) patternIdx = kmpNext[patternIdx]; //<--- Shifting the pattern on mismatch printf("patternIdx: %d -> corpusIdx: %d\n", patternIdx, corpusIdx); patternIdx++; corpusIdx++; //<--- corpus is always increased by 1 if (patternIdx >= patternSize ) { printf("*** Success at: %u\n", corpusIdx - patternIdx); return; //sucess } } puts("*** failure"); }