I've a Database.csv file with one column and 3 rows, those are data exported from salesforce with simple-salesforce, I try to get the 'Name' Value from the OrderedDict cell data('Name', 'Demand').
Dataframe
Type__c
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')])
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')])
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/123430000004C93AAE')])), ('Name', 'Stand')])
with below code i read the csv file and for each row i am assigning the value that contains "OrderedDict([....) to name, and i want to print the "Name" field value,
import pandas as pd
from collections import OrderedDict
df = pd.read_csv('Database.csv')
for index, row in df.iterrows():
name = df._get_value(index, 'Type__c')
base=OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234500000001erAAA')])), ('Name', 'Private')])
print("\n",name['Name'])
print(base['Name'])
Results:
.
print("\n",name['Name'])
TypeError: string indices must be integers
when i change print("\n",name['Name']) to print("\n",name) to test the manual added OrderedDict, i can see it works
import pandas as pd
from collections import OrderedDict
df = pd.read_csv('Database.csv')
for index, row in df.iterrows():
name = df._get_value(index, 'Type__c')
test=OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234500000001erAAA')])), ('Name', 'Private')])
***print("\n",name)***
print(test['Name'])
Result:
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')])
Private
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')])
Private
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/123430000004C93AAE')])), ('Name', 'Stand')])
Private
I've a file with thousands of rows and there must be way to create a new column named "Name" and add the text data
at the End i want to achieve FROM
Type__c
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')])
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')])
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/123430000004C93AAE')])), ('Name', 'Stand')])
TO
Name
Demand
Demand
Stand
Welcome to the SO-community Pamuk!
Rather than iterating through the rows of a dataframe, it is much more efficient to "apply" a particular function to an entire column (or even subset of a dataframe). That way, pandas will handle performance for you, and it is usually more readable (since you don't have to add any iterating logic).
Here is how you can obtain that "Name" column with apply and a quick lambda function (only the last line is relevant for you, the rest is for replicating your sample dataframe):
from collections import OrderedDict
import pandas as pd
data = [
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')]),
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/1234400000001ddAAA')])), ('Name', 'Demand')]),
OrderedDict([('attributes', OrderedDict([('type', 'Type__c'), ('url', '/services/data/v42.0/sobjects/Type__c/123430000004C93AAE')])), ('Name', 'Stand')]),
]
df = pd.DataFrame({"Type__c": data})
df["Name"] = df["Type__c"].apply(lambda x: x["Name"])
See Pandas Series.apply
since df["Type__c"]
docs for more examples and details.
For more complicated row-wise operations that involve using more columns at once, you can check the docs for DataFrame.apply
and apply a function to the dataframe itself (or a subset of it).
For a more readable part of the docs that is very relevant to what you are doing, check the Pandas User Guide section on Function Application.