pythoncsvtojson

Converting CSV to JSON with specific conditions


I want to dump the values of a CSV file to JSON under specific conditions. Here is the code I have written:

import json
import csv

with open("student_data.csv", 'r', newline='') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    # Write one JSON file per CSV row
    for row in csv_reader:
        # The sample CSV has a 'no' column (there is no 'roll_number')
        row['no'] = int(row['no'])
        out = json.dumps(row, indent=2)
        # Open, write, and close the output file inside the loop
        with open(row['name'] + "_" + str(row['no']) + '.json', 'w') as jsonoutput:
            jsonoutput.write(out)

This creates a JSON file for each row, going from a CSV like this:

no,name,link
1,pal,image.png
2,nina,page.html
3,ashi,image.jpg

to this expected output:

pal_1.json

{
    "no": "1",
    "name": "pal",
    "link": {"photo": [{"abc": "image.png"}]}
}

nina_2.json

{
    "no": "2",
    "name": "nina",
    "link": {"webpage": [{"abc": "page.html"}]}
}

ashi_3.json

{
    "no": "3",
    "name": "ashi",
    "link": {"photo": [{"abc": "image.jpg"}]}
}

I tried creating a dict like this, but it didn't work because images can have different extensions such as jpg, png, or jpeg.

extension = {'.png': 'image'}

If the link is an image, it should be written as "link": {"photo": [{"abc": "image.png"}]}; if it is HTML, it should be written as "link": {"webpage": [{"abc": "page.html"}]}.

HTML is easy, but for photos I'm a bit confused because I can't list every possible extension directly.


Solution

  • This will help you generate the output you want from the CSV file.

    I have imported the mimetypes module, which guesses a file's type from its name. I am using Python 3.11 (latest release version). You can get more details here.

    import mimetypes
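
As a quick illustration of why mimetypes helps here: guess_type maps a filename to a (type, encoding) pair based on its extension, so the part before the slash is enough to tell images apart from HTML pages without listing every extension. This is only a minimal sketch; the helper name main_type is made up for this example and is not part of the module.

```python
import mimetypes

def main_type(filename):
    """Return the part of the guessed MIME type before the slash, or None."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # guess_type returns (None, None) when it cannot determine the type
    return guessed.split('/')[0] if guessed else None

# Different image extensions all share the same main type
for name in ["image.png", "image.jpg", "photo.jpeg", "page.html"]:
    print(name, "->", main_type(name))
```

The None check matters: calling .split on the result for an unknown extension would otherwise raise an AttributeError.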

    The CSV format and sample data used.

    no,name,link
    1 ,pal ,image.png
    2 ,nina ,page.html
    3 ,ashi ,image.jpg
    4 ,test ,webpage.html
    

    Python script:

    import json
    import csv
    import mimetypes

    with open("student_data.csv", 'r', newline='') as csv_file:
        csv_reader = csv.DictReader(csv_file)
        for row in csv_reader:
            row['no'] = int(row['no'])
            out = json.dumps(row, indent=2)
            # guess_type returns (type, encoding), e.g. ('image/png', None)
            guessed = mimetypes.guess_type(row['link'])
            print(guessed)
            # The type is None for unknown extensions, so fall back to ''
            linkFileType = (guessed[0] or '').split('/')
            if 'image' in linkFileType:
                # Build "link": {"photo": [{"abc": "image.png"}]} here
                print(linkFileType[0] + ' is an Image type')
            else:
                # Build "link": {"webpage": [{"abc": "page.html"}]} here
                print(linkFileType[0] + ' is not an Image type')

            # Write one JSON file per row; the with block closes it each time
            with open(row['name'] + "_" + str(row['no']) + '.json', 'w') as jsonoutput:
                jsonoutput.write(out)
            print(out)
    

    The output from the above script:

    ('image/png', None)
    image is an Image type
    {
      "no": 1,
      "name": "pal ",
      "link": "image.png"
    }
    ('text/html', None)
    text is not an Image type
    {
      "no": 2,
      "name": "nina ",
      "link": "page.html"
    }
    ('image/jpeg', None)
    image is an Image type
    {
      "no": 3,
      "name": "ashi ",
      "link": "image.jpg"
    }
    ('text/html', None)
    text is not an Image type
    {
      "no": 4,
      "name": "test ",
      "link": "webpage.html"
    }
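
The script above only detects the type and still writes the link as a plain string. One possible way to finish the job is sketched below, under a few assumptions: every row has all three columns, anything that is not an image is treated as a webpage (the question only mentions images and HTML), and "no" stays a string as in the expected output. The function names build_link and convert are hypothetical and chosen only for this example.

```python
import csv
import json
import mimetypes

def build_link(filename):
    """Wrap a filename in the nested structure from the question."""
    guessed, _encoding = mimetypes.guess_type(filename)
    # Assumption: any image/* type is a photo, everything else a webpage
    key = "photo" if guessed and guessed.startswith("image/") else "webpage"
    return {key: [{"abc": filename}]}

def convert(csv_path):
    """Write one <name>_<no>.json file per CSV row; return the file names."""
    written = []
    with open(csv_path, newline="") as csv_file:
        for row in csv.DictReader(csv_file):
            # Strip stray spaces around values such as "pal "
            # (assumes no row is missing a column)
            row = {key: value.strip() for key, value in row.items()}
            row["link"] = build_link(row["link"])
            out_name = f"{row['name']}_{row['no']}.json"
            with open(out_name, "w") as json_file:
                json.dump(row, json_file, indent=4)
            written.append(out_name)
    return written
```

Calling convert("student_data.csv") should then write one file per row, such as pal_1.json. Note that json.dump with indent=4 spreads the nested "link" value over several lines, so the files hold the same data as the expected output but are laid out differently.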