c++ string c++11 scgi

Cannot parse string with null character


I'm writing a parser for SCGI requests. I'm trying to parse the string described in the example, but for some reason I cannot find the position of the second null character, the one that separates the content length value from the next property name.

This is my test string:

string scgi_request(
    "70:CONTENT_LENGTH\027\0SCGI\01\0REQUEST_METHOD\0POST\0REQUEST_URI\0" \
    "/deepthought\0,What is the answer to life?"
   , 91);

I can find the position of the first null character, position 18. But when I try to find the one after that, the position returned looks wrong: it is off by a few characters, at position 24.

This is my algorithm:

size_t contentLengthEnd = scgi_request.find('\0');
size_t contentLengthValueEnd = scgi_request.find('\0', ++contentLengthEnd);
std::cerr << contentLengthEnd << std::endl; // 19, because I shifted this one forward 
                                            // otherwise I'd always get the same 
                                            // character
std::cerr << contentLengthValueEnd << std::endl; // 24, no clue why.

Solution

  • Your string starts:

    "70:CONTENT_LENGTH\027\0SCGI\01\0REQUEST_METHOD\0POST\0REQUEST_URI\0" 
    

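    These outputs are actually correct for the string you gave. I'm guessing you are overlooking that \027 is a single octal escape sequence (one character with value 23), and that \01 is likewise one character with value 1, not '\0' followed by digits. The characters and their indices are: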

    16: 'H'
    17: '\027'
    18: '\0'
    19: 'S'
    20: 'C'
    21: 'G'
    22: 'I'
    23: '\01'
    24: '\0'
    25: 'R'
    

    Your program finds the first two '\0' characters, at 18 and 24, but you apply ++ to the first position before outputting it, hence the output of 19 and 24.

    If you meant '\0' followed by '2' and '7', the digits must not directly follow the escape. One way is to take advantage of string literal concatenation:

    "70:CONTENT_LENGTH\0"
    "27\0" 
    "SCGI\0"
    "1\0"