I am trying to load JSON data from a file into a DataFrame, filter out a few records, and write the result back to a file. The file contains one JSON record per line, and each record has a URL in it. This is sample data from the input file:
{"site_code":"111","site_url":"https://www.site111.com"}
{"site_code":"222","site_url":"https://www.site333.com"}
{"site_code":"333","site_url":"https://www.site333.com"}
This is the code I used:
import pandas as pd
sites = pd.read_json('sites.json', lines=True)
modified_sites = sites[sites['site_code']!=222]
modified_sites.to_json('modified_sites.json',orient='records',lines=True)
But the generated file contains escaped forward slashes:
{"site_code":111,"site_url":"https:\/\/www.site111.com"}
{"site_code":333,"site_url":"https:\/\/www.site333.com"}
How can I avoid it and get the following data in the generated file?
{"site_code":111,"site_url":"https://www.site111.com"}
{"site_code":333,"site_url":"https://www.site333.com"}
Note: I referred to these, but they were not helpful in my case.
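Note that `\/` is a valid JSON escape for `/`, so the generated file is still correct JSON and any JSON parser will read the URLs back unchanged. If you need the plain form anyway, you can serialize to a string, replace the escaped slashes yourself, and then save the result to a file: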
import pandas as pd

sites = pd.read_json('sites.json', lines=True)
modified_sites = sites[sites['site_code'] != 222]

# Serialize to a string first, then undo the "\/" escaping that to_json applies
formatted_json = modified_sites.to_json(orient='records', lines=True).replace('\\/', '/')

# Write the cleaned string; the with block closes the file when done
with open('modified_sites.json', 'w') as f:
    f.write(formatted_json)
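Another option, if you would rather not post-process the string, is to serialize the records with Python's standard `json` module, which leaves forward slashes unescaped. The sketch below is only an illustration: `records_to_json_lines` is a hypothetical helper name, and the hard-coded `records` list stands in for what `modified_sites.to_dict(orient='records')` would return for the sample data. It assumes every value is something `json.dumps` can handle (plain strings and numbers, as here).

```python
import json


def records_to_json_lines(records):
    """Serialize an iterable of dicts to JSON Lines text, one record per line."""
    # json.dumps does not escape "/"; compact separators avoid extra spaces
    return "".join(
        json.dumps(rec, separators=(",", ":")) + "\n" for rec in records
    )


# Hypothetical stand-in for modified_sites.to_dict(orient='records')
records = [
    {"site_code": 111, "site_url": "https://www.site111.com"},
    {"site_code": 333, "site_url": "https://www.site333.com"},
]

text = records_to_json_lines(records)
print(text, end="")
```

To write the file, pass `text` to `f.write(...)` inside a `with open('modified_sites.json', 'w') as f:` block. Columns holding timestamps or missing values may need converting first, since `json.dumps` does not handle them the way `to_json` does.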