https://github.com/ajayvd/DOUBT_SO/blob/main/doubt.csv
The above link contains the data frame and it contains 2 columns response_value & response_errors & The response_errors contains the value in list of dictionaries .
problem statement
so here i want to iterate only the provider key and store them next to it in a column in form of list
for example :
for row 1 it will be [ shipt ]
for row 2 [ shipt ]
for row 3 [ doordash , pickupnow , roadie ]
you would be required to do preprocessing , so i am attaching the snippet below .
def parse_json(webhook_data):
try:
data = json.loads(webhook_data)
except ValueError as ex:
data={'estimate':'error'}
return data
data_metering['RESPONSE_ERRORS']=data_metering['RESPONSE_ERRORS'].apply(parse_json)
custom function which i code , please modify the code and tell me the optimised way to do it.
def traverse_dsp(data_frame,column):
provider = []
#Iterate over each row in Dataframe
for index, row in data_frame.iterrows():
# Iterate over each json object in each row in DataFrame
for i in range(0,len(row[column])):
for k,v in row[column][i].items():
if k=="provider":
provider_val=v
x.loc[index,"provider_list"]=provider
x=data_metering
traverse_dsp(x,"RESPONSE_ERRORS")
IIUC, you can try this :
from ast import literal_eval
df_ = pd.read_csv("https://raw.githubusercontent.com/ajayvd/DOUBT_SO/main/doubt.csv")
ser = df_["RESPONSE_ERRORS"].apply(literal_eval).explode()
df = (df_.join(pd.DataFrame(ser.tolist(), index=ser.index)
.groupby(level=0)["provider"].agg(list)))
Output :
print(df["provider"])
0 [Shipt]
1 [Shipt]
2 [DoorDash, PickupNow, Roadie]
3 [DoorDash]
4 [DoorDash]
Name: provider, dtype: object
print(df)
RESPONSE_VALUE RESPONSE_ERRORS provider_list provider
0 ['Shipt'] [{'code': 0, 'message': "Shipt doesn't provide estimates", 'provider': 'Shipt', 'type': 'DSP_DOES_NOT_OFFER_ESTIMATES'}] shipt [Shipt]
1 ['Shipt'] [{'code': 0, 'message': "Shipt doesn't provide estimates", 'provider': 'Shipt', 'type': 'DSP_DOES_NOT_OFFER_ESTIMATES'}] NaN [Shipt]
2 ['FedEx'] [{'code': 400, 'message': 'Distance from pickup address to dropoff address exceeds 15 miles (24 kilometers): dropoff_address', 'provider': 'DoorDash', 'type': 'MAX_DISTANCE_EXCEEDED'}, {'code': 0, 'message': 'Not serviceable, Reason: Mileage', 'provider': 'PickupNow', 'type': 'UNAVAILABLE'}, {'code': 400, 'message': 'Route Sorry, distance in passenger vehicles is limited to 30 miles.', 'provider': 'Roadie', 'type': 'UNMAPPED_ERROR'}] NaN [DoorDash, PickupNow, Roadie]
3 ['PickupNow', 'FedEx', 'Roadie'] [{'code': 400, 'message': 'Distance from pickup address to dropoff address exceeds 15 miles (24 kilometers): dropoff_address', 'provider': 'DoorDash', 'type': 'MAX_DISTANCE_EXCEEDED'}] NaN [DoorDash]
4 ['Roadie'] [{'code': 400, 'message': 'Distance from pickup address to dropoff address exceeds 15 miles (24 kilometers): dropoff_address', 'provider': 'DoorDash', 'type': 'MAX_DISTANCE_EXCEEDED'}] NaN [DoorDash]