c++ string-matching sse4

Using SSE4.2 instruction PCMPESTRM with small patterns


I am trying to use some SSE4.2 instructions in string matching algorithms, coded in C++.

I do not understand how to use these instructions to match smaller patterns, and was hoping somebody could help me out with that.

In the code example, I am trying to find the pattern "ant" within the packed string "i am an antelope". I would expect the result to be a mask that is all zeros except for a 1 at index 8.

This is my code so far; it includes nmmintrin.h for the SSE4.2 intrinsics:

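#include <nmmintrin.h>  // SSE4.2 intrinsics
#include <cstdint>      // uint8_t
#include <cstdio>       // printf

// Print the 16 bytes of a vector, lowest byte first
void print128_num(__m128i var)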
{
    uint8_t *val = (uint8_t*) &var;
    printf("Text: %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i %i \n", 
           val[0], val[1], val[2], val[3], val[4], val[5], 
           val[6], val[7], val[8], val[9], val[10], val[11],
           val[12], val[13], val[14], val[15]);
}

int main(){

    __m128i s = _mm_set_epi8('e','p','o','l','e','t','n','a',' ','n','a',' ','m','a',' ','i');
    __m128i p = _mm_set_epi8(0,0,0,0,0,0,0,0,0,0,0,0,0,'t','n','a');

    print128_num(s);
    print128_num(p);

    __m128i res =  _mm_cmpestrm(s, 16, p, 3, 0);
    print128_num(res);

    return 0;
}

I added all the zeros because the initializing function won't accept fewer arguments. I realize this is wrong, but I didn't know how to do it properly and made several quite desperate attempts.

Anyway this is how I compiled: g++ -g sse4test.cpp -o sse4test -std=c++11 -msse4.2

and this is my output:

Text: 105 32 97 109 32 97 110 32 97 110 116 101 108 111 112 101 
Text: 97 110 116 0 0 0 0 0 0 0 0 0 0 0 0 0 
Text: 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 

I really do not understand the last line.

Any help would be very much appreciated.


Solution

  • There are two problems with your code. First, the text and the pattern are swapped in the call to _mm_cmpestrm: for a substring search the pattern goes first and the text to search goes second. Second, you pass 0 as the last argument, which is a set of flags that selects the operating mode.

    A mode of zero selects _SIDD_CMP_EQUAL_ANY, which is described as "For each character c in A, determine whether any character in B is equal to c."

    For a substring search the mode should be specified as _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ORDERED | _SIDD_BIT_MASK.

    If you make these changes, the last line of output starts with "0 1": bit 8 of the mask is set, which means a match at the 9th character (index 8).

    BTW: you can load from strings with _mm_loadu_si128((__m128i*)(str)) instead of using _mm_set_epi8. Note that it always reads 16 bytes, so str must point to at least that much readable memory.
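
To make the difference concrete, here is a minimal sketch that runs the question's call and the corrected call side by side. It assumes GCC or Clang on x86; the names SSE42, text16, pattern3, original_mask and fixed_mask are made up for this illustration and are not part of the intrinsics API. As the 7 printed in the question suggests, the result mask has one bit per character of the second operand: with mode 0, all three characters of "ant" occur somewhere in the text, so bits 0 to 2 are set.

```cpp
#include <nmmintrin.h>  // SSE4.2 string intrinsics

// Assumption: on GCC/Clang this enables SSE4.2 per function, so the file
// also builds without -msse4.2; elsewhere the macro expands to nothing.
#if defined(__GNUC__)
#define SSE42 __attribute__((target("sse4.2")))
#else
#define SSE42
#endif

// Same data as in the question. _mm_set_epi8 lists bytes from index 15
// down to index 0, hence the reversed spelling.
SSE42 __m128i text16()
{
    return _mm_set_epi8('e','p','o','l','e','t','n','a',
                        ' ','n','a',' ','m','a',' ','i');
}

SSE42 __m128i pattern3()
{
    return _mm_set_epi8(0,0,0,0,0,0,0,0,0,0,0,0,0,'t','n','a');
}

// The call from the question: text first, mode 0 (_SIDD_CMP_EQUAL_ANY).
SSE42 unsigned original_mask()
{
    __m128i res = _mm_cmpestrm(text16(), 16, pattern3(), 3, 0);
    return static_cast<unsigned>(_mm_cvtsi128_si32(res));
}

// The corrected call: pattern first, ordered (substring) comparison.
SSE42 unsigned fixed_mask()
{
    __m128i res = _mm_cmpestrm(pattern3(), 3, text16(), 16,
        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ORDERED | _SIDD_BIT_MASK);
    return static_cast<unsigned>(_mm_cvtsi128_si32(res));
}
```

Here original_mask() returns 7 and fixed_mask() returns 0x100, so only bit 8 is set. Passing _SIDD_UNIT_MASK instead of _SIDD_BIT_MASK expands each bit to a whole byte, which would put 0xFF at byte index 8, closer to the output the question hoped for.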
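
The loading tip can be sketched as follows, again assuming GCC or Clang on x86 (SSE42, load16 and find_first are illustrative names, not library functions). Because _mm_loadu_si128 always reads 16 bytes, this version first copies the string into a zero-padded buffer. It also swaps in _mm_cmpestri, the index-returning sibling of _mm_cmpestrm, to get the match position directly instead of a mask.

```cpp
#include <nmmintrin.h>  // SSE4.2 string intrinsics
#include <cstddef>
#include <cstring>

// Assumption: GCC/Clang per-function target; expands to nothing elsewhere.
#if defined(__GNUC__)
#define SSE42 __attribute__((target("sse4.2")))
#else
#define SSE42
#endif

// Hypothetical helper: copies at most 16 bytes of a C string into a
// zero-padded buffer, so the 16-byte load never reads past the string.
SSE42 __m128i load16(const char* str, int* len)
{
    char buf[16] = {0};
    std::size_t n = std::strlen(str);
    if (n > 16) n = 16;
    std::memcpy(buf, str, n);
    *len = static_cast<int>(n);
    return _mm_loadu_si128(reinterpret_cast<const __m128i*>(buf));
}

// Returns the byte index of the first match of pat within the first
// 16 bytes of text, or 16 when there is no match.
SSE42 int find_first(const char* text, const char* pat)
{
    int text_len = 0, pat_len = 0;
    __m128i t = load16(text, &text_len);
    __m128i p = load16(pat, &pat_len);
    return _mm_cmpestri(p, pat_len, t, text_len,
        _SIDD_UBYTE_OPS | _SIDD_CMP_EQUAL_ORDERED | _SIDD_LEAST_SIGNIFICANT);
}
```

With the question's data, find_first("i am an antelope", "ant") returns 8. This sketch only looks at one 16-byte block. When the text fills the whole block, a pattern prefix at its very end is also reported as a match, so a search over longer text would have to re-check such hits against the following bytes.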