python  http  python-requests  byte  gtfs

How to read application/octet-stream in Python


Building off of this question, I'm using a Python script to call the API detailed in the link below:

https://developer.wmata.com/docs/services/gtfs/operations/5cdc51ea7a6be320cab064fe?

I use the code below to call the API:

import json

import requests

# define functions
def _prepare_url(path):
    return f'{API_URL}/{path.lstrip("/")}'


def pull_data(path, params=None, headers=None):
    url = _prepare_url(path)
    return requests.get(url, params=params, headers=headers)


# print results in cleaner format
def jprint(obj):
    # create a formatted string of the Python JSON object
    text = json.dumps(obj, sort_keys=True, indent=4)
    print(text)


API_URL = 'https://api.wmata.com'

# authenticate with your api key
headers = {
    "api_key": "myKey",
}

response = pull_data('/gtfs/bus-gtfsrt-tripupdates.pb', headers=headers)
print(response.content)
print(response.headers)
print(response.url)

But it returns what looks like a meaningless stream of bytes, along with the following headers:

Request-Context: appId=cid-v1:2833aead-1a1f-4ffd-874e-ef3a5ceb1de8
Cache-Control: public, must-revalidate, max-age=5
Date: Thu, 11 Feb 2021 22:05:31 GMT
ETag: 0x8D8CED90CC8419C
Content-Length: 625753
Content-MD5: fspEFl7LJ8QbZPgf677WqQ==
Content-Type: application/octet-stream
Expires: Thu, 11 Feb 2021 22:05:37 GMT
Last-Modified: Thu, 11 Feb 2021 22:04:49 GMT

'''b'\n\r\n\x031.0\x10\x00\x18\xd9\xef\xa5\x81\x06\x12\xee\x02\n\n1932817010\x1a\xdf\x02\n\x1a\n\n1932817010\x1a\x0820210214*\x0233\x12\x13\x08\x02\x1a\x06\x10\x9c\xff\xa5\x81\x06"\x0513752...'''

Any guidance on how to go about reading this kind of response?


Solution

  • GTFS-rt feeds are serialized with Protocol Buffers ("protobuf"), a compact binary encoding, which is why the response body looks like meaningless bytes. To decode it, your Python script needs the gtfs-realtime.proto file (which defines the expected contents of a GTFS-rt feed) together with the Google Protobuf Python package.

    The GTFS Realtime documentation has an example of reading a feed in Python: https://developers.google.com/transit/gtfs-realtime/examples/python-sample
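
To make the "protobuf" point concrete, here is a minimal stdlib-only sketch that walks the first 15 bytes of the response shown in the question by hand. It is only an illustration of the wire format, not a replacement for the generated bindings. The helper names read_varint and read_fields are made up for this example, and the field numbers are assumed from gtfs-realtime.proto (FeedMessage.header = 1; FeedHeader.gtfs_realtime_version = 1, incrementality = 2, timestamp = 3).

```python
# Stdlib-only sketch: decode the feed header from the raw protobuf bytes.
# Helper names are hypothetical; field numbers are assumed from
# gtfs-realtime.proto.

def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    value = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift   # low 7 bits carry the payload
        if not byte & 0x80:               # high bit clear means last byte
            return value, pos
        shift += 7


def read_fields(buf):
    """Yield (field_number, value) pairs for one protobuf message.

    Only wire types 0 (varint) and 2 (length-delimited) are handled,
    which is all the header bytes below use.
    """
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field_number, wire_type = key >> 3, key & 0x07
        if wire_type == 0:
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError(f"unhandled wire type {wire_type}")
        yield field_number, value


# First 15 bytes of the response body printed in the question.
sample = b'\n\r\n\x031.0\x10\x00\x18\xd9\xef\xa5\x81\x06'

# Outer message: field 1 holds the embedded header message.
outer = dict(read_fields(sample))
header = dict(read_fields(outer[1]))

print("version:", header[1].decode("ascii"))
print("incrementality:", header[2])
print("timestamp (Unix seconds):", header[3])
```

In practice the generated bindings do this for the whole feed. With the gtfs-realtime-bindings package installed, creating a gtfs_realtime_pb2.FeedMessage() and calling its ParseFromString(response.content) gives you the parsed entities, as in the linked sample.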