I need a little help with a particular part of my Karp-Rabin algo.
What I am trying to do is to implement both the version with fixed sliding window
and with separate append
and skip
parts. Sliding window
works perfectly fine. The problem occurs when I try to split the monolith sliding window
into append
and skip
parts. Append
seems to work fine, but skip
is the thing that have been causing me a massive headache last couple of days.
PROBLEM - I am sliding through the string that holds a few instances of a pattern subscribing in it. Sliding window
detects it, but not the other two.
The idea is that the RH
struct holds the pre-computed value of (base ^ window size) mod prime number (b2wmod
) so I can delete the leading character of a string. This value changes after all append
and skip
as window size changes. To decrease the value of b2wmod
, the multiplicative inverse is used to not be in a situation of mod deletion (inverse of base mod modulus value) . It is also pre-computed.
Below are the parts of the code I am interested in. I don't post the whole code to not make you read everything, but can upload it if necessary. Multiplicative inverse seems to be computed correctly, but I can upload the code also.
Would appreciate any help! Thank you in advance!
void
append_to_rh(RH rh)
{
uint64_t hash = rh->hash;
uint64_t base = rh->base;
uint64_t mod = rh->mod;
uint64_t b2wmodm = rh->b2wmodm;
char new = rh->new;
hash = ( hash * base + new ) % mod;
b2wmodm = ( b2wmodm * base ) % mod;
rh->hash = hash;
rh->b2wmodm = b2wmodm;
}
void
skip(RH rh)
{
uint64_t hash = rh->hash;
uint64_t base = rh->base;
uint64_t mod = rh->mod;
uint64_t b2wmodm = rh->b2wmodm;
uint64_t m_inv = rh->m_inv;
char old = rh->old;
uint64_t correction = old * mod;
b2wmodm = ( b2wmodm * (m_inv % mod) ) % mod;
hash = ( hash - old * b2wmodm + correction ) % mod;
rh->hash = hash;
rh->b2wmodm = b2wmodm;
}
void
slide_window(RH rh)
{
uint64_t base = rh->base;
uint64_t mod = rh->mod;
uint64_t hash = rh->hash;
uint64_t b2wmodm = rh->b2wmodm;
char old = rh->old;
char new = rh->new;
hash = ( hash * base - old * b2wmodm + new ) % mod;
rh->hash = hash;
}
Your append
and skip
functions work fine. Here is a sample code I used for testing
#include <string>
#include <cassert>
typedef long long ll;
ll hash, b2wmodm, base, inv_base, mod;
// fast exponentiation, for calculating inv_base
ll exp(ll a, ll b){
ll ans = 1;
while (b){
if (b&1){
ans *= a;
ans %= mod;
}
a *= a;
a %= mod;
b >>= 1;
}
return ans;
}
// calculates expected hash of the string
ll expected_hash(std::string s) {
ll result = 0;
ll multiplier = 1;
for (int i = s.length()-1; i >= 0; i--) {
result += s[i] * multiplier % mod;
result %= mod;
multiplier = multiplier * base % mod;
}
return result;
}
// same as your append and skip functions
void append_to_rh(ll newc) {
hash = (hash * base + newc) % mod;
b2wmodm = (b2wmodm * base) % mod;
}
void skip(ll old) {
ll correction = old * mod;
b2wmodm = (b2wmodm * (inv_base % mod)) % mod;
hash = (hash - old * b2wmodm + correction) % mod;
}
int main() {
base = 29;
mod = 1000000007;
hash = 0;
b2wmodm = 1;
inv_base = exp(base, mod-2);
srand(time(nullptr));
std::string s;
for (int i = 0; i < 2000; i++) {
if (i < 1000 || rand()%2) {
char newchar = rand()%26 + 'a';
s += newchar;
append_to_rh(newchar);
assert(expected_hash(s) == hash);
} else {
char oldchar = s[0];
s = s.substr(1, s.length());
skip(oldchar);
assert(expected_hash(s) == hash);
}
}
}
I guess other parts of your code cause problems. Maybe you are trying to skip an empty window, or maybe you use a non-prime mod.