pythonjsonpathjsonpath-ng

JSONPath issues with Python and jsonpath_ng (Parse error near token ?)


I'm trying to work with jsonpath_ng Python library. For most of the JSONPath filters I usually use it works. However, I'm struggling with a simple filter clause. It can be summarized in 2 lines.

from jsonpath_ng.ext import parse
jsonpath_expression = parse(f"$.jobs.*.jobSummary.[?(@.storagePolicy.storagePolicyName=='{SPname}')].sizeOfApplication")

My JSON payload is this one:

{
"processinginstructioninfo": {
    "attributes": [
        {
            "name": "WebServer",
            "value": "IDM-COMMSERVE"
        }
    ]
},
"totalRecordsWithoutPaging": 161,
"jobs": [
    {
        "jobSummary": {
            "sizeOfApplication": 65552265428,
            "vsaParentJobID": 28329591,
            "commcellId": 2,
            "backupSetName": "defaultBackupSet",
            "opType": 59,
            "totalFailedFolders": 0,
            "totalFailedFiles": 0,
            "alertColorLevel": 0,
            "jobAttributes": 288232025419153408,
            "jobAttributesEx": 67108864,
            "isVisible": true,
            "localizedStatus": "Completed",
            "isAged": false,
            "totalNumOfFiles": 0,
            "jobId": 28329592,
            "jobSubmitErrorCode": 0,
            "sizeOfMediaOnDisk": 34199,
            "currentPhase": 0,
            "status": "Completed",
            "lastUpdateTime": 1661877467,
            "percentSavings": 99.99995,
            "localizedOperationName": "Snap Backup",
            "statusColor": "black",
            "pendingReason": "",
            "errorType": 0,
            "backupLevel": 2,
            "jobElapsedTime": 59,
            "jobStartTime": 1661877408,
            "currentPhaseName": "",
            "jobType": "Snap Backup",
            "isPreemptable": 0,
            "backupLevelName": "Incremental",
            "attemptStartTime": 0,
            "pendingReasonErrorCode": "",
            "appTypeName": "Virtual Server",
            "percentComplete": 100,
            "averageThroughput": 27472.637,
            "localizedBackupLevelName": "Incremental",
            "currentThroughput": 0,
            "subclientName": "default",
            "destClientName": "desktop-1058kvf",
            "jobEndTime": 1661877467,
            "dataSource": {
                "dataSourceId": 0
            },
            "subclient": {
                "clientName": "desktop-1058kvf",
                "instanceName": "VMInstance",
                "backupsetId": 161,
                "commCellName": "idm-commserve",
                "instanceId": 2,
                "subclientId": 235,
                "clientId": 71,
                "appName": "Virtual Server",
                "backupsetName": "defaultBackupSet",
                "applicationId": 106,
                "subclientName": "default"
            },
            "storagePolicy": {
                "storagePolicyName": "IDM-Metallic-Replica_ReplicationPlan",
                "storagePolicyId": 78
            },
            "destinationClient": {
                "clientId": 71,
                "clientName": "desktop-1058kvf",
                "displayName": "idm-laptop1"
            },
            "userName": {
                "userName": "admin",
                "userId": 1
            },
            "clientGroups": [
                {
                    "clientGroupId": 4,
                    "clientGroupName": "Laptop Clients"
                },
                {
                    "clientGroupId": 46,
                    "clientGroupName": "Clients For Commserv LiveSync"
                },
                {
                    "clientGroupId": 47,
                    "clientGroupName": "idm-vcsa"
                },
                {
                    "clientGroupId": 55,
                    "clientGroupName": "Laptop plan test clients"
                }
            ]
        }
    }
]

}

I need to get just the "sizeOfApplication" parameter for every object with a particular "storagePolicyName". That's it. Say, in this case, that the "storagePolicyName" I'm looking values for is "IDM-Metallic-Replica_ReplicationPlan" as an example. I usually go to My favourite JSONPath site to test the JSONpath I use, and this one "$.jobs.*.jobSummary.[?(@.storagePolicy.storagePolicyName=='IDM-Metallic-Replica_ReplicationPlan')].sizeOfApplication" works. But, on Python side, I keep getting "jsonpath_ng.exceptions.JsonPathParserError: Parse error at 1:21 near token ? (?)" errors.

What am I doing wrong? Thank you!

Mattia


Solution

  • I think the problem here is that jsonpath_ng is being stricter to the JSONPath proposal than the other parsers you have tried.

    The first problem is that there shouldn't be a . immediately before a filter condition [?(...)]. So the first step is to remove the . after jobSummary in jobSummary.[?(@storagePolicy....

    I made that change to your JSONPath expression, and used jsonpath_ng to run it on your sample data. The parser error had gone, but it returned no matches. So it's still not right.

    From reading the JSONPath proposal, it's not clear if you can use a filter operator such as [?(...)] on an object, or only on an array. When used on an array it would return all elements of the array that match the filter. If a JSONPath parser does support a filter on an object, then it seems it returns the object if the filter matches and an empty list of matches otherwise.

    I would guess that jsonpath_ng only permits filters on arrays. So let's modify your JSONPath expression to only use filters on arrays. Your JSON has an array in $.jobs, and within each element of this array you want to look within the jobSummary object for a storagePolicy with storagePolicyName=={SPname}. So the following JSONPath expression should return the matching job:

    $.jobs[?(@.jobSummary.storagePolicy.storagePolicyName=='{SPname}')]
    

    What we then want to do is to get the value of the sizeOfApplication property within the jobSummary object within each matching job. Note that the matches returned by the above JSONPath expression are elements of the jobs array, not jobSummary objects. We can't just select sizeOfApplication because we're one level further up than we were before. We need to go back into the jobSummary object to get the sizeOfApplication:

    $.jobs[?(@.jobSummary.storagePolicy.storagePolicyName=='{SPname}')].jobSummary.sizeOfApplication
    

    I used jsonpath_ng to run this JSONPath expression on your sample data and it gave me the output [65552265428], which seems to be the expected output.