php gzip inflate chunked

How to decode/inflate a chunked gzip string?


After making a request for gzip/deflate-compressed content in PHP, I receive the compressed string in chunks, each preceded by an offset, which look like the following:

Example shortened greatly to show format:

00001B4E
¾”kŒj…Øæ’ìÑ«F1ìÊ`+ƒQì¹UÜjùJƒZ\µy¡ÓUžGr‡J&=KLËÙÍ~=ÍkR
0000102F
ñÞœÞôΑüo[¾”+’Ñ8#à»0±R-4VÕ’n›êˆÍ.MCŽ…ÏÖr¿3M—èßñ°r¡\+
00000000

I'm unable to inflate that, presumably because of the chunked format. I can confirm the data is not corrupt: after manually removing the offsets with a hex editor, I can read the gzip archive. Is there a proper method to parse this chunked, gzip-compressed response into a readable string?

I might be able to split these offsets and join the data together in one string to call gzinflate, but it seems there must be an easier way.


Solution

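  • This is HTTP chunked transfer encoding: each hexadecimal line gives the length of the chunk that follows, not an offset. The proper method to decode a chunked response is roughly as follows: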

    initialise string to hold result
    for each chunk {
      read the hexadecimal chunk length; a length of zero ends the data
      check that the stated chunk length equals the string length of the chunk
      append the chunk data to the result variable
    }
    inflate the result
    

    Here's a handy PHP function to do the unchunking for you:

    function unchunk_string ($str) {
    
      // A string to hold the result
      $result = '';
    
      // Split input by CRLF
      $parts = explode("\r\n", $str);
    
      // These vars track the current chunk
      $chunkLen = 0;
      $thisChunk = '';
    
      // Loop the data
      while (($part = array_shift($parts)) !== NULL) {
        if ($chunkLen) {
          // Add the data to the string
          // Don't forget, the data might contain a literal CRLF
          $thisChunk .= $part."\r\n";
          if (strlen($thisChunk) == $chunkLen) {
            // Chunk is complete
            $result .= $thisChunk;
            $chunkLen = 0;
            $thisChunk = '';
          } else if (strlen($thisChunk) == $chunkLen + 2) {
            // Chunk is complete, remove trailing CRLF
            $result .= substr($thisChunk, 0, -2);
            $chunkLen = 0;
            $thisChunk = '';
          } else if (strlen($thisChunk) > $chunkLen) {
            // Data is malformed
            return FALSE;
          }
        } else {
          // If we are not in a chunk, get length of the new one
          if ($part === '') continue;
          if (!$chunkLen = hexdec($part)) break;
        }
      }
    
      // Return the decoded data, or FALSE if it is incomplete
      return ($chunkLen) ? FALSE : $result;
    
    }
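
As an alternative, here is a minimal sketch that walks the body using each declared chunk length instead of splitting on CRLF, and then inflates the result. The name `unchunk_by_length` is hypothetical and not part of the function above. The sketch assumes the zlib extension is loaded and that the payload is in gzip format (hence `gzdecode()`); a raw deflate stream would need `gzinflate()` instead. Trailers after the final chunk are ignored.

```php
<?php
// Hypothetical helper: decodes an HTTP chunked body by reading each
// declared length rather than splitting the whole string on CRLF.
// Returns the decoded string, or false if the body is malformed/incomplete.
function unchunk_by_length(string $body)
{
    $result = '';
    $pos = 0;
    $len = strlen($body);

    while ($pos < $len) {
        // The chunk-size line ends at the next CRLF
        $eol = strpos($body, "\r\n", $pos);
        if ($eol === false) {
            return false; // size line is not terminated
        }

        // Drop any chunk extension (";name=value") before parsing the hex size
        $sizeLine = substr($body, $pos, $eol - $pos);
        $sizeHex = trim(explode(';', $sizeLine, 2)[0]);
        if ($sizeHex === '' || !ctype_xdigit($sizeHex)) {
            return false;
        }
        $size = (int) hexdec($sizeHex);
        $pos = $eol + 2;

        if ($size === 0) {
            return $result; // zero-length chunk marks the end of the data
        }

        // The chunk data and its trailing CRLF must both be present
        if ($pos + $size + 2 > $len || substr($body, $pos + $size, 2) !== "\r\n") {
            return false;
        }
        $result .= substr($body, $pos, $size);
        $pos += $size + 2;
    }

    return false; // ran out of data before the zero-length chunk
}

// Usage: build a chunked gzip body from made-up sample data, then reverse it
$original = "hello\r\nworld";
$chunked = '';
foreach (str_split(gzencode($original), 16) as $piece) {
    $chunked .= dechex(strlen($piece)) . "\r\n" . $piece . "\r\n";
}
$chunked .= "0\r\n\r\n";

$gzipped = unchunk_by_length($chunked);
$plain = ($gzipped === false) ? false : gzdecode($gzipped);
var_dump($plain === $original);
```

Reading by length means a literal CRLF inside the compressed data needs no special handling, because the decoder never searches for delimiters inside a chunk.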