python pandas jython-2.7

Python CSV Grouped Columns to list of dict of mixed elements


I am working on a Jython script for WebSphere that accepts a list of dicts through sys.argv for further processing.

I need help converting CSV data into a list of dictionaries, each containing a list of tuples, as shown below.

Input CSV:

cluster_name,pool_name,min,max,inactive_time,description,action
Clst1,WebContainer,25,25,60000,Revisit,modify
Clst3,WebContainer,50,50,60000,revisit,modify
Clst6,WebContainer,50,50,60000,revisit,modify
Clst1,ORB.thread.pool,,,,,delete
Clst3,ORB.thread.pool,,,,,delete

I am trying to do this with pandas by grouping the columns, but I cannot build the mixed-element dict.

I need the object below (a list of dicts with mixed elements):

[
 {cluster_name:'Clst1',
  pool_name:[
         (WebContainer,25,25,60000,Revisit,modify),
         (ORB.thread.pool,,,,,delete)]},
 {cluster_name:'Clst3',
  pool_name:[
         (WebContainer,50,50,60000,revisit,modify), 
         (ORB.thread.pool,,,,,delete)]},
 {cluster_name:'Clst6',
  pool_name:[
         (WebContainer,50,50,60000,revisit,modify)
        ]}
]

The goal is to pass this object to the Jython script through sys.argv.


Solution

  • Try:

    from io import StringIO
    import pandas as pd
    
    csvfile = StringIO("""cluster_name,pool_name,min,max,inactive_time,description,action
    Clst1,WebContainer,25,25,60000,Revisit,modify
    Clst3,WebContainer,50,50,60000,revisit,modify
    Clst6,WebContainer,50,50,60000,revisit,modify
    Clst1,ORB.thread.pool,,,,,delete
    Clst3,ORB.thread.pool,,,,,delete""")
    
    df = pd.read_csv(csvfile)
    
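    # Turn the non-key columns of each row into a tuple,
    # then collect the tuples of each cluster into a list.
    s = (
        df.set_index(['cluster_name'])
          .apply(tuple, axis=1)
          .rename('pool_name')
          .groupby(level=0)
          .agg(list)
          .reset_index()
    )
    
    # Serialise as a JSON array of records (tuples become JSON arrays).
    print(s.to_json(orient='records'))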
    

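    Output (the empty fields are read as NaN, so min, max and inactive_time become floats and the missing values are serialised as null):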

    [{"cluster_name":"Clst1","pool_name":[["WebContainer",25.0,25.0,60000.0,"Revisit","modify"],["ORB.thread.pool",null,null,null,null,"delete"]]},{"cluster_name":"Clst3","pool_name":[["WebContainer",50.0,50.0,60000.0,"revisit","modify"],["ORB.thread.pool",null,null,null,null,"delete"]]},{"cluster_name":"Clst6","pool_name":[["WebContainer",50.0,50.0,60000.0,"revisit","modify"]]}]
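
Since the question targets Jython 2.7, and pandas (as far as I know) relies on C extensions that Jython cannot load, the same grouping can also be sketched with the standard library alone. The block below is only an illustration under a few assumptions: the helper names `group_pools`, `to_argv_string` and `from_argv_string` are hypothetical, every value is kept as a string, empty fields are mapped to `None`, and the object is sent through sys.argv as a single JSON string (command-line arguments are always strings, so the structure has to be serialised somehow).

```python
import csv
import json

CSV_TEXT = """cluster_name,pool_name,min,max,inactive_time,description,action
Clst1,WebContainer,25,25,60000,Revisit,modify
Clst3,WebContainer,50,50,60000,revisit,modify
Clst6,WebContainer,50,50,60000,revisit,modify
Clst1,ORB.thread.pool,,,,,delete
Clst3,ORB.thread.pool,,,,,delete"""


def group_pools(lines):
    """Group CSV rows by the first column, keeping first-seen cluster order."""
    reader = csv.reader(lines)
    next(reader)   # skip the header row
    order = []     # cluster names in the order they first appear
    pools = {}     # cluster name -> list of tuples
    for row in reader:
        if not row:
            continue  # ignore blank lines
        cluster, rest = row[0], row[1:]
        if cluster not in pools:
            pools[cluster] = []
            order.append(cluster)
        # Assumption: empty fields become None, other values stay strings.
        pools[cluster].append(tuple(value or None for value in rest))
    return [{"cluster_name": name, "pool_name": pools[name]} for name in order]


def to_argv_string(groups):
    """Serialise the grouped data into one string usable as a single argument."""
    return json.dumps(groups)


def from_argv_string(text):
    """Rebuild the structure on the receiving side (e.g. from sys.argv[1])."""
    groups = json.loads(text)
    for entry in groups:
        # JSON has no tuple type, so the inner lists are converted back.
        entry["pool_name"] = [tuple(pool) for pool in entry["pool_name"]]
    return groups


groups = group_pools(CSV_TEXT.splitlines())
arg = to_argv_string(groups)
restored = from_argv_string(arg)
print(restored[2])
```

In the receiving script, `from_argv_string(sys.argv[1])` would rebuild the list of dicts. If the numeric columns are needed as integers, they would have to be converted explicitly, because the csv module returns every field as a string.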