lundi 29 octobre 2018

String matched for some cases and not for other using rabin karp algorithm?

// n -> length of the text
// m -> length of the pattern    
void rabin_karp_check(int n, int m){
        int h_p = hash_value(pattern, m, 0);       
        int h_t = hash_value(text, m, 0); 
        int x = 0;
        int i = 0,k;
        while(i < n - m + 1){
            if(x > 0){
                h_t = rolling_hash(h_t, m, i);  
                }
            x++;
            int j = i;
            if(h_t == h_p){   
                int match_count = 0;
                k = 0;
                while( match_count < m){        
                    if(pattern[k] == text[j]){
                        match_count++;  k++; j++;
                    }
                    else
                        break;
                }
                if(match_count == m)
                    cout<<"found at "<<i<<"\n";
            }
            i++;
        }
    }

func to calculate hash value of pattern and initial slide of size m of the text using hash_formula

//(256^m-1 * p[0] + 256^m-2 * p[1] + 256^m-3 * p[2] + .... + 256^0 * p[m-1]) % q

int hash_value(string &p, int &m, int i){

        int q = 101,k = 0;    
        long long int l = 0;
        for(int j = i;j < i + m ; j++){
            l = l + pow(256, m - 1 - k) * p[j];
            k++;
        }
        return l % q;               
    }

Function to calculate next hash value using the previous calculated

int rolling_hash(int h_t, int m, int i){   
        int x = (   ( (h_t - ((long int)pow(256, m - 1) * text[i-1])) * 256)  +  text[i + m - 1] ) % 101;
            return (x < 0) ? x + 101 : x;
    }

Output #1:

enter the text: platform for showcasing
enter the pattern: for
found at 4
found at 9

Output #2: Not detected

enter the text: karp rabin
enter the pattern: rabin

Output #3 : detected

enter the text: best practices in
enter the pattern: ic
found at 10

Output # 4: Not detected

enter the text: would you like to hear me tell a joke
enter the pattern: hear

I am unable to figure out why is this happening. I think for some cases the hash value calculated by rolling hash function from the previously stored hash value is not equal to the actual hash value for that size(m) of text. where am I doing wrong?

Thanks in advance

Aucun commentaire:

Enregistrer un commentaire