python mocking yaml

How can I extract the uncompressed text from vcrpy binary response strings?


I'm using vcrpy for automated HTTP mock testing of my application, and it works well. However, my requests accept the 'gzip' and 'deflate' encodings, so vcrpy records the response bodies in binary form. Here's an example:

interactions:
- request:
    body: null
    headers:
      Accept: ['*/*']
      Accept-Encoding: ['gzip, deflate']
      Authorization: [Basic Y2hlc3RlcjpiYWRnZXI=]
      Connection: [keep-alive]
      User-Agent: [python-requests/2.9.1]
    method: GET
    uri: http://localhost:8153/go/compare/DeployProduction/14/with/15
  response:
    body:
      string: !!binary |
        H4sIAAAAAAAAAO08bXPbNpPf8ytQ9q6yp+Y7KZGKrU7quG1mEscXO7276XQ0IAlKjCmCQ0KW9TxP
        /vstAJIiJUpxkkvS3tmTiSRisVjsG3YXIE6/e/76/Oa/ry7QbzevXk6enM7ZIkX3izQrz5Q5Y/lY
        11erlbayNVrMdNP3ff2ewygcluBo8gTB3ylLWErk93O6yHFB0FWSkzTJSImu8IwgFf1Kn5zqFeCT
        U2i6RREJlrMzJcZpSRQ0L0h8pugzquOyJKzUcZ6nSYhZQjPVCG0PE9saObHthJbjE2L79tCPhhY2
        ImxZlhEH2Hbt0PB8NyCBZ8RO6NsksoZDi1haWJYKWpAowWcKTlMFFSQ9U0q2Tkk5J4QpSJ88jKwc
        M0aKrEtf6EeBEYXG0ByFERmR2A9sO8CmPzR8LyAYEyOwwpHj48COseOaLgmsUehiHAxjOzYfRt+T
        0+9UFV3fPHtzM0Z5QRll65ygOaW3JYppgQALSrJ3JOREIVWtu1xcPn94hzIskpxtsaEswr3CCYwA
        pjUMh8Qf+iZIY2hGpu063iiyTOyDUPxRbFrRyPJHxHRBIiChKIij2LfMUWxo70plcqrLcUEK3/2R

(additional output omitted)

I have read about decompressing zlib data, but that does not cover decoding the binary YAML value.

So that I can view the original text and verify the test results, how can I convert these binary strings into their original format?


Solution

  • The body string in the YAML file is base64-encoded, but the YAML loader decodes it for you, so after loading you have the raw byte stream. You can decode the base64 by hand if you're curious, though it shouldn't be necessary.

    You know you have a gzip byte stream if the first three bytes are \x1f\x8b\x08. You could decompress it manually, but vcr has a wrapper function for this, vcr.filters.decode_response(), which also updates the affected header information after decoding the body.

    To demonstrate, I'll use the file \tests\fixtures\wild\domain_redirect.yaml. It contains two interactions, the second of which has an encoded body.

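    import yaml
    from vcr.filters import decode_response

    # SafeLoader already turns the !!binary body into raw (still compressed) bytes
    with open('domain_redirect.yaml', 'r') as f:
        doc = yaml.load(f, Loader=yaml.SafeLoader)

    # The second interaction holds the compressed response
    response = doc['interactions'][1]['response']
    # Returns a copy with the body decompressed and the headers updated
    decoded = decode_response(response)
    print(decoded['body']['string'])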
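
If you would rather not depend on vcr or PyYAML, for example to inspect a body you pasted out of a cassette, the same two steps (base64 decode, then decompress) can be sketched with the standard library alone. The helper name decode_body is made up for this illustration and is not part of vcrpy. It assumes the input is the text that follows `string: !!binary |`, and that a 'deflate' body is zlib-wrapped, which is the usual case but not guaranteed for every server.

```python
import base64
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # every gzip stream starts with these two bytes


def decode_body(b64_text):
    """Hypothetical helper: base64 text from a cassette -> uncompressed bytes."""
    # b64decode skips characters outside the base64 alphabet by default,
    # so the line breaks and indentation from the YAML block are harmless.
    raw = base64.b64decode(b64_text)
    if raw.startswith(GZIP_MAGIC):
        return gzip.decompress(raw)
    try:
        # Assumption: 'deflate' bodies carry a zlib header.
        return zlib.decompress(raw)
    except zlib.error:
        # Not zlib data; hand back the decoded bytes unchanged.
        return raw


if __name__ == "__main__":
    # Round trip on made-up data, standing in for a recorded body.
    sample = base64.encodebytes(gzip.compress(b"<html>hello</html>")).decode("ascii")
    print(decode_body(sample).decode("utf-8"))
```

The result is bytes, so call .decode() with the charset from the response's Content-Type header if you want text. Unlike decode_response(), this sketch leaves the headers alone.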