python, json

How can I separate a .json file by keys stored in a different .json file?


I have two files, one containing this type of data:

{"v_uqiMw7tQ1Cc": {"duration": 55.15, "timestamps": [[0.28, 55.15], [13.79, 54.32]]}, "v_bXdq2zI1Ms0": {"duration": 73.1, "timestamps": [[0, 10.23], [10.6, 39.84], [38.01, 73.1]]}}
    

and the other file contains the IDs that belong to each split:

{"train_set": ["v_uqiMw7tQ1Cc", "v_uqiMfergQ1Cc"], "test_set": ["v_bXdq2zI1Ms0", "v_bXdfreht2Ms0"]}

How do I update the first file so that each entry records the subset it belongs to?

{"v_uqiMw7tQ1Cc": {"duration": 55.15, "subset": "train", "timestamps": [[0.28, 55.15], [13.79, 54.32]]}, "v_bXdq2zI1Ms0": {"duration": 73.1, "subset": "test", "timestamps": [[0, 10.23], [10.6, 39.84], [38.01, 73.1]]}}

Solution

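  • Try the code below. It loads both files, then for each entry in the first file it looks for the entry's key in each split's ID list and stores the matching split name under 'subset'. Note that the value stored is the split name exactly as it appears in the second file ('train_set', 'test_set').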

    import json
    
    with open('json_file/json_set1.json') as f1:
        file1 = json.load(f1)
    with open('json_file/json_set2.json') as f2:
        file2 = json.load(f2)
    
    print(file1) 
    

    {'v_uqiMw7tQ1Cc': {'duration': 55.15, 'timestamps': [[0.28, 55.15], [13.79, 54.32]]}, 'v_bXdq2zI1Ms0': {'duration': 73.1, 'timestamps': [[0, 10.23], [10.6, 39.84], [38.01, 73.1]]}}

    print(file2)
    

    {'train_set': ['v_uqiMw7tQ1Cc', 'v_uqiMfergQ1Cc'], 'test_set': ['v_bXdq2zI1Ms0', 'v_bXdfreht2Ms0']}

    # Tag each entry in file1 with the name of the split whose ID list contains its key.
    for key1 in file1:
        for key2, ids in file2.items():
            if key1 in ids:
                file1[key1]['subset'] = key2
    
    print(file1)
    

    {'v_uqiMw7tQ1Cc': {'duration': 55.15, 'timestamps': [[0.28, 55.15], [13.79, 54.32]], 'subset': 'train_set'},

    'v_bXdq2zI1Ms0': {'duration': 73.1, 'timestamps': [[0, 10.23], [10.6, 39.84], [38.01, 73.1]], 'subset': 'test_set'}}
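The nested loop above scans every split list for every entry and stores 'train_set' rather than the 'train' asked for in the question. A possible variant, sketched under a few assumptions: it inverts the splits into an ID-to-subset lookup once, strips a trailing '_set' from the split name, and leaves entries that appear in no split without a 'subset' field. The function name label_subsets and the suffix parameter are hypothetical, introduced only for illustration.

```python
import json


def label_subsets(videos, splits, suffix="_set"):
    """Return a copy of `videos` with a 'subset' field added where known.

    `label_subsets` and `suffix` are illustrative names, not part of the
    original answer.
    """
    # Invert the splits once: video id -> subset label.
    id_to_subset = {}
    for split_name, ids in splits.items():
        # "train_set" -> "train"; names without the suffix are kept as-is.
        if suffix and split_name.endswith(suffix):
            label = split_name[: -len(suffix)]
        else:
            label = split_name
        for video_id in ids:
            id_to_subset[video_id] = label

    labelled = {}
    for video_id, info in videos.items():
        entry = dict(info)  # shallow copy so the input dict is not modified
        if video_id in id_to_subset:
            entry["subset"] = id_to_subset[video_id]
        labelled[video_id] = entry
    return labelled


# Sample data taken from the question, with the splits written as JSON lists.
videos = {
    "v_uqiMw7tQ1Cc": {"duration": 55.15,
                      "timestamps": [[0.28, 55.15], [13.79, 54.32]]},
    "v_bXdq2zI1Ms0": {"duration": 73.1,
                      "timestamps": [[0, 10.23], [10.6, 39.84], [38.01, 73.1]]},
}
splits = {
    "train_set": ["v_uqiMw7tQ1Cc", "v_uqiMfergQ1Cc"],
    "test_set": ["v_bXdq2zI1Ms0", "v_bXdfreht2Ms0"],
}

labelled = label_subsets(videos, splits)
print(json.dumps(labelled, indent=2))
```

To save the result, open an output file for writing and pass labelled to json.dump. If an ID appears in more than one split, this sketch keeps the label of the split that comes last in the splits file.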