
Process JSON data from mediainfo output


I am running the "mediainfo" command through subprocess as:

cmd = ["mediainfo", "--output=JSON", file_loc]

The output I get is:

{
"media": {
"@ref": "/home/mediaworker/divergent.jpg",
"track": [
{
"@type": "General",
"ImageCount": "1",
"FileExtension": "jpg",
"Format": "JPEG",
"FileSize": "84227",
"StreamSize": "0",
"File_Modified_Date": "UTC 2019-07-16 05:36:32",
"File_Modified_Date_Local": "2019-07-16 11:06:32"
},
{
"@type": "Image",
"Format": "JPEG",
"Width": "612",
"Height": "612",
"ColorSpace": "YUV",
"ChromaSubsampling": "4:4:4",
"BitDepth": "8",
"Compression_Mode": "Lossy",
"StreamSize": "84227"
}
]
}
}

I am trying to reformat this data as shown below, basically as a single list of dictionaries:

[
{
"desc":"ImageCount",
"val" : "1"
},
{
"desc":"FileExtension",
"val" : "jpg"
},
{
"desc":"Format",
"val" : "JPEG"
},
{
"desc":"FileSize",
"val" : "84227"
},
{
"desc":"StreamSize",
"val" : "0"
},
{
"desc":"File_Modified_Date",
"val" : "UTC 2019-07-16 05:36:32"
},
{
"desc":"File_Modified_Date_Local",
"val" : "2019-07-16 11:06:32"
},
{
"desc":"Width",
"val" : "612"
},
{
"desc":"Height",
"val" : "612"
},
{
"desc":"ColorSpace",
"val" : "YUV"
},
{
"desc":"ChromaSubsampling",
"val" : "4:4:4"
},
{
"desc":"BitDepth",
"val" : "8"
},
{
"desc":"Compression_Mode",
"val" : "Lossy"
},
{
"desc":"StreamSize",
"val" : "1"
}
]

Is there any way I can achieve this without looping multiple times with for loops? I also need to remove the unwanted and duplicate keys.


Solution

  • Try:

    dct = {
        "media": {
            "@ref": "/home/mediaworker/divergent.jpg",
            "track": [
                {
                    "@type": "General",
                    "ImageCount": "1",
                    "FileExtension": "jpg",
                    "Format": "JPEG",
                    "FileSize": "84227",
                    "StreamSize": "0",
                    "File_Modified_Date": "UTC 2019-07-16 05:36:32",
                    "File_Modified_Date_Local": "2019-07-16 11:06:32",
                },
                {
                    "@type": "Image",
                    "Format": "JPEG",
                    "Width": "612",
                    "Height": "612",
                    "ColorSpace": "YUV",
                    "ChromaSubsampling": "4:4:4",
                    "BitDepth": "8",
                    "Compression_Mode": "Lossy",
                    "StreamSize": "84227",
                },
            ],
        }
    }
    
    out = []
    for d in dct["media"]["track"]:
        for k, v in d.items():
            if not k.startswith("@"):
                out.append({"desc": k, "val": v})
    
    print(out)
    

    Prints:

    [
        {"desc": "ImageCount", "val": "1"},
        {"desc": "FileExtension", "val": "jpg"},
        {"desc": "Format", "val": "JPEG"},
        {"desc": "FileSize", "val": "84227"},
        {"desc": "StreamSize", "val": "0"},
        {"desc": "File_Modified_Date", "val": "UTC 2019-07-16 05:36:32"},
        {"desc": "File_Modified_Date_Local", "val": "2019-07-16 11:06:32"},
        {"desc": "Format", "val": "JPEG"},
        {"desc": "Width", "val": "612"},
        {"desc": "Height", "val": "612"},
        {"desc": "ColorSpace", "val": "YUV"},
        {"desc": "ChromaSubsampling", "val": "4:4:4"},
        {"desc": "BitDepth", "val": "8"},
        {"desc": "Compression_Mode", "val": "Lossy"},
        {"desc": "StreamSize", "val": "84227"},
    ]
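
    The loop above still emits "Format" twice, while the question also asks for duplicates to be removed. A minimal sketch of one way to handle that follows, assuming "duplicate" means the same key with the same value (so both "StreamSize" entries survive because their values differ). The names flatten_tracks and load_tracks are hypothetical helpers, not part of MediaInfo, and load_tracks assumes the mediainfo binary is on PATH and Python 3.7+ for capture_output.

```python
import json
import subprocess


def flatten_tracks(tracks):
    """Flatten MediaInfo tracks into [{"desc": ..., "val": ...}, ...].

    Keys starting with "@" are skipped, and an entry is dropped when the
    same (key, value) pair has already been emitted.
    """
    seen = set()
    out = []
    for track in tracks:
        for key, value in track.items():
            if key.startswith("@"):
                continue
            # Serialise the value so nested dicts/lists can be stored in a set.
            marker = (key, json.dumps(value, sort_keys=True))
            if marker in seen:
                continue
            seen.add(marker)
            out.append({"desc": key, "val": value})
    return out


def load_tracks(file_loc):
    """Hypothetical helper: run mediainfo and return its list of tracks."""
    cmd = ["mediainfo", "--output=JSON", file_loc]
    # check=True raises CalledProcessError if mediainfo exits with an error.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)["media"]["track"]
```

    Calling flatten_tracks(dct["media"]["track"]) on the dictionary above should give 14 entries: the second "Format" is dropped and both "StreamSize" entries are kept. The loops are still nested, but each key/value pair is visited only once, so the data is not traversed multiple times.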