pythonjsonapidataframeincapsula

Python - Extract information from dataframe (JSON)


I'm a begginer and it's been a long time I didn't code anything :-) I'm using requests library to retrieve JSON data from the Incapsula(Cloud web security service) API to get some stats about a website. What I want in the end is to write the "type of trafic, timestamp, and number" to a file to create reports. API Response is something like this :

{
    "res": 0,
    "res_message": "OK",
    "visits_timeseries" : [
        {
            "id":"api.stats.visits_timeseries.human",
            "name":"Human visits",
            "data":[
                [1344247200000,50],
                [1344247500000,40],
                ...
            ]
        },
        {
            "id":"api.stats.visits_timeseries.bot",
            "name":"Bot visits",
            "data":[
                [1344247200000,10],
                [1344247500000,20],
                ...
            ]
        }

I'm recovering the Visit_timeseries data like this:

r = requests.post('https://my.incapsula.com/api/stats/v1', params=payload)
reply=r.json()
reply = reply['visits_timeseries']
reply = pandas.DataFrame(reply)

I recover data in that form (date in unix time, number of visit) :

print(reply[['name', 'data']].head())

name                                               data
0  Human visits  [[1500163200000, 39], [1499904000000, 73], [14...
1    Bot visits  [[1500163200000, 1891], [1499904000000, 1926],...

I don't undestand how to extract the fields I want from the dataframe to write only them into the excel. I would need modify the data field into two rows (date, value). And only the name as the top rows.

What would be great is :

        Human Visit      Bot Visit
Date       Value           Value
Date       Value           Value
Date       Value           Value

Thanks for your help!


Solution

  • Well, if it is any help, this is a hardcoded version:

    import pandas as pd
    
    reply =  {
        "res": 0,
        "res_message": "OK",
        "visits_timeseries" : [
            {
                "id":"api.stats.visits_timeseries.human",
                "name":"Human visits",
                "data":[
                    [1344247200000,50],
                    [1344247500000,40]
                ]
            },
            {
                "id":"api.stats.visits_timeseries.bot",
                "name":"Bot visits",
                "data":[
                    [1344247200000,10],
                    [1344247500000,20]
                ]
            }
            ]
            }
    
    human_data = reply['visits_timeseries'][0]['data']
    bot_data = reply['visits_timeseries'][1]['data']
    
    df_h = pd.DataFrame(human_data, columns=['Date', 'Human Visit'])
    df_b = pd.DataFrame(bot_data, columns=['Date', 'Bot Visit'])
    df = df_h.append(df_b, ignore_index=True).fillna(0)
    df = df.groupby('Date').sum()