
Errors reading GTFS tripupdates.pb real-time data using the get() function


I want to extract the stop arrival and departure times from the list inside each entity using the code below, but I keep getting an error.

dict_obj1['entity'][0]

gives the following output (truncated):


{'id': '8800314',
 'tripUpdate': {'trip': {'tripId': '8800314',
   'startTime': '11:30:00',
   'startDate': '20240313',
   'routeId': '20832'},
  'stopTimeUpdate': [{'stopSequence': 1,
    'arrival': {'time': '1710344086'},
    'departure': {'time': '1710344086'},
    'stopId': '86900',
    'stopTimeProperties': {}},
   {'stopSequence': 2,
    'arrival': {'time': '1710343956'},
    'departure': {'time': '1710343956'},
    'stopId': '86024',
    'stopTimeProperties': {}},
   {'stopSequence': 3,
    'arrival': {'time': '1710343995'},
    'departure': {'time': '1710343995'},
    'stopId': '86560',
    'stopTimeProperties': {}},
   {'stopSequence': 4,

I want to extract the arrival and departure times:

# for trip updates
from collections import OrderedDict

import pandas as pd

collector1 = []
counter1 = 0
for block1 in dict_obj1['entity']:
    counter1 += 1
    row = OrderedDict()
    # Only the first stopTimeUpdate entry ([0]) of each entity is read here
    row['stop_AT'] = block1.get('tripUpdate').get('stopTimeUpdate')[0].get('arrival').get('time')
    row['stop_DT'] = block1.get('tripUpdate').get('stopTimeUpdate')[0].get('departure').get('time')
    collector1.append(row)
df1 = pd.DataFrame(collector1)

Error:

AttributeError: 'list' object has no attribute 'get'

Code source:

https://nbviewer.org/url/nikhilvj.co.in/files/gtfsrt/locations.ipynb#


Solution
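
The error message itself means that `.get()` was called on a list instead of a dict. Which expression triggers it depends on the actual feed, so the snippet below is only a minimal reproduction with made-up data (the variable name `stop_time_updates` is illustrative, not taken from the question):

```python
# A list of dicts, shaped like the 'stopTimeUpdate' value in the question
stop_time_updates = [{"arrival": {"time": "1710344086"}}]

try:
    stop_time_updates.get("arrival")  # lists have no .get() method
except AttributeError as exc:
    message = str(exc)
    print(message)

# Index (or loop over) the list first, then call .get() on the dict element
first_arrival = stop_time_updates[0].get("arrival", {}).get("time")
print(first_arrival)
```

So wherever a value is a list, pick an element or iterate before calling `.get()`.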

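  • `stopTimeUpdate` is a list of dictionaries, so loop over it and call `.get()` on each element. Here is an example of how you can get the arrival and departure times from this dictionary: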

    import pandas as pd

    dict_obj1 = {
        "entity": [
            {
                "id": "8800314",
                "tripUpdate": {
                    "trip": {
                        "tripId": "8800314",
                        "startTime": "11:30:00",
                        "startDate": "20240313",
                        "routeId": "20832",
                    },
                    "stopTimeUpdate": [
                        {
                            "stopSequence": 1,
                            "arrival": {"time": "1710344086"},
                            "departure": {"time": "1710344086"},
                            "stopId": "86900",
                            "stopTimeProperties": {},
                        },
                        {
                            "stopSequence": 2,
                            "arrival": {"time": "1710343956"},
                            "departure": {"time": "1710343956"},
                            "stopId": "86024",
                            "stopTimeProperties": {},
                        },
                        {
                            "stopSequence": 3,
                            "arrival": {"time": "1710343995"},
                            "departure": {"time": "1710343995"},
                            "stopId": "86560",
                            "stopTimeProperties": {},
                        },
                    ],
                },
            }
        ]
    }
    
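    out = []
    for block in dict_obj1["entity"]:
        # .get() with defaults skips entities that have no tripUpdate
        stop_time_updates = block.get("tripUpdate", {}).get("stopTimeUpdate", [])
        for d in stop_time_updates:
            # arrival/departure may be missing for a stop; the result is then None
            arrival = d.get("arrival", {}).get("time")
            departure = d.get("departure", {}).get("time")
            out.append({"arrival": arrival, "departure": departure})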
    
    df = pd.DataFrame(out)
    print(df)
    

    Prints:

          arrival   departure
    0  1710344086  1710344086
    1  1710343956  1710343956
    2  1710343995  1710343995
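
The times above are strings holding POSIX timestamps (seconds since the epoch). If readable datetimes and the trip/stop identifiers are also wanted, something along these lines may help. This is only a sketch: `to_utc` and `flatten_trip_updates` are hypothetical helper names, the sample data is abbreviated from the question, and the times are assumed to be in seconds and are converted to UTC.

```python
from datetime import datetime, timezone


def to_utc(epoch):
    """Convert a POSIX timestamp (str or int, seconds) to an aware UTC datetime."""
    if epoch is None:
        return None
    return datetime.fromtimestamp(int(epoch), tz=timezone.utc)


def flatten_trip_updates(feed):
    """Return one row (dict) per stop time update. Hypothetical helper."""
    rows = []
    for entity in feed.get("entity", []):
        trip_update = entity.get("tripUpdate")
        if not trip_update:
            continue  # skip entities without a tripUpdate
        trip_id = trip_update.get("trip", {}).get("tripId")
        for stu in trip_update.get("stopTimeUpdate", []):
            rows.append({
                "trip_id": trip_id,
                "stop_id": stu.get("stopId"),
                "stop_sequence": stu.get("stopSequence"),
                "arrival": to_utc(stu.get("arrival", {}).get("time")),
                "departure": to_utc(stu.get("departure", {}).get("time")),
            })
    return rows


# Abbreviated sample; the second stop deliberately has no departure
sample = {
    "entity": [
        {"id": "8800314",
         "tripUpdate": {
             "trip": {"tripId": "8800314"},
             "stopTimeUpdate": [
                 {"stopSequence": 1, "arrival": {"time": "1710344086"},
                  "departure": {"time": "1710344086"}, "stopId": "86900"},
                 {"stopSequence": 2, "arrival": {"time": "1710343956"},
                  "stopId": "86024"},
             ]}},
        {"id": "no-trip-update"},
    ]
}

rows = flatten_trip_updates(sample)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed straight to `pd.DataFrame(rows)`, the same way `out` is used above.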