I have a nested dictionary stored in the variable nested_dict_variable.
The dictionary is retrieved using the SPSS valueLabels property (Python).
type(nested_dict_variable) results in dict.
print(nested_dict_variable)
results in {0: {1.0: '1 - low', 2.0: '2', 3.0: '3', 4.0: '4', 5.0: '5 - high', 99.0: "99 - don't know"}, 1: {0.0: '0 - no', 1.0: '1 - yes'}, 2: {1.0: '1 - A', 2.0: '2 - B', 3.0: '3 - C'}}
I am trying to convert this nested dictionary into a pandas DataFrame, but I receive the following error. I don't understand why this AttributeError is raised, given that nested_dict_variable is (or seems to be) a dictionary.
AttributeError Traceback (most recent call last)
File c:\mypythonfile.py:38
36 data_list = []
37 for outer_key, inner_dict in nested_dict_variable.items():
---> 38 for inner_key, value in inner_dict.items():
39 data_list.append({'Outer Key': outer_key, 'Inner Key': inner_key, 'Value': value})
41 df = pd.DataFrame(data_list)
AttributeError: 'ValueLabel' object has no attribute 'items'
Here is my code:
# see: https://www.ibm.com/docs/en/spss-statistics/28.0.0?topic=programs-running-spss-statistics-from-external-python-process#d10392e74
import spss
# import pandas
import pandas as pd
# read spss-data
file = r"C:\SPSS-SampleData1.sav"
spss.Submit(
f"""
GET FILE='{file}'.
"""
)
var_index = []
nested_dict_variable= {}
# initialise the handling of spss commands
spss.StartDataStep()
# access active dataset (the one that was read above)
datasetObj = spss.Dataset()
# get a list of variable objects
varListObj = datasetObj.varlist
for var in varListObj:
    var_index.append(var.index)
    nested_dict_variable[var.index] = var.valueLabels
spss.EndDataStep()
##### CREATE DATAFRAMES #####
# convert nested dictionary to Pandas DataFrame
data_list = []
for outer_key, inner_dict in nested_dict_variable.items():
    for inner_key, value in inner_dict.items():
        data_list.append({'Outer Key': outer_key, 'Inner Key': inner_key, 'Value': value})
df = pd.DataFrame(data_list)
# end spss process
spss.StopSPSS()
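As user2357112 pointed out, var.valueLabels looks like a dict when printed, but it isn't one. It is a ValueLabel object (as the traceback says), and that object has no items() method. Only the outer container is a real dict, which is why type(nested_dict_variable) reports dict.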
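To see how an object can print like a dict without being one, here is a minimal sketch. FakeValueLabel is a hypothetical stand-in I made up for illustration, not the real SPSS class, and the assumption that the labels sit in a plain dict on .data is mine. The point is only that checking the type of the inner values, rather than the outer container, exposes the problem.

```python
class FakeValueLabel:
    """Hypothetical stand-in for the SPSS ValueLabel object (not the real class)."""

    def __init__(self, labels):
        # Assumption: the labels are held in a plain dict on .data
        self.data = dict(labels)

    def __repr__(self):
        # Prints exactly like a dict, which is what makes print() misleading
        return repr(self.data)


nested = {0: FakeValueLabel({0.0: "0 - no", 1.0: "1 - yes"})}

# The outer container is a real dict, so this looks like a nested dict
print(nested)

# Inspecting the inner values reveals their actual type
inner_types = {key: type(value).__name__ for key, value in nested.items()}
print(inner_types)

# The stand-in has no items() method, just like the object in the traceback
has_items = hasattr(nested[0], "items")
print(has_items)
```

Running the same inner-type check on your nested_dict_variable should show ValueLabel instead of dict.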
I had a quick look at the documentation of the Python spss package, and it says:
You can iterate through the set of value labels for a variable using the data property, as in:
varObj = datasetObj.varlist['origin']
for val, valLab in varObj.valueLabels.data.iteritems():
    print val, valLab
So you could try rewriting your code:
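data_list = []
for outer_key, inner_dict in nested_dict_variable.items():
    # The documentation example uses Python 2's iteritems(); on Python 3 use items().
    # This assumes .data is a regular dict.
    for inner_key, value in inner_dict.data.items():
        data_list.append({'Outer Key': outer_key, 'Inner Key': inner_key, 'Value': value})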
I haven't tried it though. Good luck! ;)
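If you want something you can test without SPSS installed, here is a rough sketch of the same flattening step. StandInValueLabel and flatten_value_labels are names I made up for illustration. The helper assumes that .data is a plain dict, and it falls back to the object itself when there is no .data attribute, so it also accepts ordinary nested dicts.

```python
class StandInValueLabel:
    """Hypothetical stand-in for the SPSS ValueLabel object (not the real class)."""

    def __init__(self, labels):
        # Assumption: the real object exposes its labels as a dict on .data
        self.data = dict(labels)


def flatten_value_labels(nested):
    """Turn {var_index: labels} into a list of row dicts, one per value label."""
    rows = []
    for outer_key, labels in nested.items():
        # Use .data when it exists, otherwise treat the value as a mapping
        mapping = getattr(labels, "data", labels)
        for inner_key, value in mapping.items():
            rows.append(
                {"Outer Key": outer_key, "Inner Key": inner_key, "Value": value}
            )
    return rows


sample = {
    1: StandInValueLabel({0.0: "0 - no", 1.0: "1 - yes"}),
    2: {1.0: "1 - A", 2.0: "2 - B", 3.0: "3 - C"},  # a plain dict works too
}

rows = flatten_value_labels(sample)
for row in rows:
    print(row)
```

The resulting list of dicts can be passed to pd.DataFrame(rows) exactly like data_list in your code.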