pythonpandassalesforcesimple-salesforce

How to format a table extracted from salesforce into python?


I was able to extract some fields from salesforce using python.

I used the following code block :

!pip install simple_salesforce 

from simple_salesforce import Salesforce
import pandas as pd

sf = Salesforce(
username='', 
password='', 
security_token='')

sf_data = sf.query_all("SELECT Brand_Name__c,Name FROM AuthorisedProduct__c")

sf_df = pd.DataFrame(sf_data)

sf_df.head()

This process puts all items in one 'record' field.

records total size
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fmAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-B')]) 14000
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1fnAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-1.2-TL-PLUS-SB')]) 14000
OrderedDict([('attributes', OrderedDict([('type', 'AuthorisedProduct__c'), ('url', '/services/data/v42.0/sobjects/AuthorisedProduct__c/a020o00000xC1foAAC')])), ('Brand_Name__c', 'ABB'), ('Name', 'UNO-DM-2.0-TL-PLUS-B')]) 14000

Please note there are 14000 values under records. I wanted to have only two fields in a simple dataframe. A Table with 'Brand_Name__c' and 'Name' fields.

Brand_Name__C Name
ABB UNO-DM-2.0-TL-PLUS-B
ABB UNO-DM-1.2-TL-PLUS-SB

and we will have a matrix of 14000 by 2.

Please advise how to achieve that?

And also, How to reverse that process?

Thank you all so much.


Solution

  • You have to be aware of the actual shape of the JSON response sent by Salesforce, which includes a top-level "records" key under which all of your data is contained. Additionally, each record entry contains an "attributes" key, besides the data for the fields you actually requested. You cannot change the shape of the JSON response.

    There is an example provided in the simple_salesforce documentation showing how to digest this API response for Pandas:

    Generate Pandas Dataframe from SFDC API Query (ex.query,query_all)

    import pandas as pd
    
    sf.query("SELECT Id, Email FROM Contact")
    
    df = pd.DataFrame(data['records']).drop(['attributes'],axis=1)