python python-requests simple-salesforce

Get text body from Salesforce report


I am trying to download a Salesforce report from Python using the requests library. I can log in successfully, but the response I get back does not contain the report text, only an HTML page. How can I extract the report text from the response?


from simple_salesforce import Salesforce
import requests
import pandas as pd

sf = Salesforce(username='', 
                password='',
                security_token='')



export_url = 'https://gkg-mfsa.lightning.force.com/lightning/r/Report/00O9N000000JwK2UAK/?export=1&enc=UTF-8&cf=csv'

session = requests.Session()
response = session.get(export_url, 
                       headers=sf.headers, 
                       cookies={'sid': sf.session_id})
download_report = response.content.decode('utf-8')
print(download_report)

Output:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
    <meta HTTP-EQUIV="PRAGMA" CONTENT="NO-CACHE">





<script>
function redirectOnLoad() {
if (this.SfdcApp && this.SfdcApp.projectOneNavigator) { SfdcApp.projectOneNavigator.handleRedirect('https://gkg-mfsa.my.salesforce.com?ec=302&startURL=%2Fvisualforce%2Fsession%3Furl%3Dhttps%253A%252F%252Fgkg-mfsa.lightning.force.com%252Flightning%252Fr%252FReport%252F00O9N000000JwK2UAK%252F%253Fexport%253D1%2526enc%253DUTF-8%2526cf%253Dcsv'); }  else 
if (window.location.replace){ 
window.location.replace('https://gkg-mfsa.my.salesforce.com?ec=302&startURL=%2Fvisualforce%2Fsession%3Furl%3Dhttps%253A%252F%252Fgkg-mfsa.lightning.force.com%252Flightning%252Fr%252FReport%252F00O9N000000JwK2UAK%252F%253Fexport%253D1%2526enc%253DUTF-8%2526cf%253Dcsv');
} else {
window.location.href ='https://gkg-mfsa.my.salesforce.com?ec=302&startURL=%2Fvisualforce%2Fsession%3Furl%3Dhttps%253A%252F%252Fgkg-mfsa.lightning.force.com%252Flightning%252Fr%252FReport%252F00O9N000000JwK2UAK%252F%253Fexport%253D1%2526enc%253DUTF-8%2526cf%253Dcsv';
} 
} 
redirectOnLoad();
</script>

</head>


</html>





<!-- Body events -->
<script type="text/javascript">function bodyOnLoad(){if(window.PreferenceBits){window.PreferenceBits.prototype.csrfToken="null";};}function bodyOnBeforeUnload(){}function bodyOnFocus(){}function bodyOnUnload(){}</script>
            
</body>
</html>


<!--
...................................................................................................
...................................................................................................
...................................................................................................
...................................................................................................
-->



How can I get the text part from the output?

Edit:

I have also tried the simple_salesforce approach, specifically using sf.restful to get the report. That gives me an invalid session id error; I have posted a separate question about that.


Solution

  • This works for me (my org doesn't have "enhanced domains" enabled yet). Substitute your own report ID:

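    from io import StringIO

    import pandas as pd
    import requests
    from simple_salesforce import Salesforce

    # Log in; simple_salesforce exposes the instance host and the session id.
    sf = Salesforce(username='secret@example.com',
                    password='hunter2',
                    security_token='')

    # Debug output: the host the export request will be sent to, and the session id.
    print(sf.sf_instance)
    print(sf.session_id)

    # Export URL on the instance host (not the lightning.force.com URL); replace the report id.
    export_url = 'https://' + sf.sf_instance + '/' + '00O5J000000y5LVUAY?isdtp=p1&export=1&enc=UTF-8&xf=csv'

    session = requests.Session()
    response = session.get(export_url,
                           headers=sf.headers,
                           cookies={'sid': sf.session_id})
    download_report = response.content.decode('utf-8')
    # print(download_report)

    # Parse the CSV text into a DataFrame.
    data = StringIO(download_report)
    df = pd.read_csv(data)
    df  # in a notebook this displays the DataFrame; use print(df) in a script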
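
Compared with the question, this code sends the request to the sf.sf_instance host with the bare report ID as the path, rather than to the lightning.force.com URL. If the session is still not accepted, the response will presumably be the same HTML redirect page shown in the question, and parsing it as CSV would give a confusing result. The following is a minimal sketch of two helpers for that situation; build_export_url and parse_report_csv are hypothetical names introduced here for illustration, and the check assumes a failed export always comes back as an HTML document. It uses only the standard library.

```python
import csv
import io
from urllib.parse import urlencode


def build_export_url(instance: str, report_id: str) -> str:
    """Build the export URL in the same shape as the answer above (hypothetical helper)."""
    query = urlencode({"isdtp": "p1", "export": "1", "enc": "UTF-8", "xf": "csv"})
    return f"https://{instance}/{report_id}?{query}"


def parse_report_csv(text: str) -> list:
    """Parse exported CSV text into a list of row dicts (hypothetical helper).

    Assumption: when the session is rejected, the server returns an HTML
    page like the one in the question, so an HTML prefix is treated as failure.
    """
    head = text.lstrip().lower()
    if head.startswith("<!doctype html") or head.startswith("<html"):
        raise ValueError("received an HTML page instead of CSV; "
                         "the session was probably not accepted")
    return list(csv.DictReader(io.StringIO(text)))
```

With the answer's variables, this would be used as export_url = build_export_url(sf.sf_instance, '00O5J000000y5LVUAY') and rows = parse_report_csv(download_report); the rows can then be passed to pd.DataFrame(rows) if a DataFrame is wanted.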