amazon-web-servicesamazon-redshift

How to Resolve 'Invalid JSONPath Format' Error When Importing Nested JSON into Amazon Redshift?


I'm trying to insert a nested JSON list into Amazon Redshift but I'm encountering an error: Invalid JSONPath format: Member is not an object. This may be due to incorrect usage of JSONPaths for accessing nested objects.

Here is the JSON I am using:

{
    "UserId": 78910,
    "UserName": "johndoe123",
    "AccountDetails": {
        "AccountType": "Premium",
        "CreatedDate": "2023-01-15T08:30:00"
    },
    "RecentActivities": [
        {
            "Activity": "Logged In",
            "Timestamp": "2023-10-10T14:20:00+00:00"
        },
        {
            "Activity": "Updated Profile",
            "Timestamp": "2023-10-11T11:45:00+00:00"
        },
        {
            "Activity": "Purchased Item",
            "ItemId": 45678,
            "Timestamp": "2023-10-12T09:00:00+00:00"
        }
    ],
    "Preferences": {
        "Language": "en-US",
        "NotificationsEnabled": true
    }
}

The JSONPaths I used are:

{
    "jsonpaths": [
        "$.UserId",
        "$.UserName",
        "$.AccountDetails.AccountType",
        "$.AccountDetails.CreatedDate",
        "$.RecentActivities[*].Activity",
        "$.RecentActivities[*].Timestamp"
    ]
}

I aim to achieve the following table output:

| UserId | UserName   | AccountType | CreatedDate         | Activity         | Timestamp               |
|--------|------------|-------------|---------------------|------------------|-------------------------|
| 78910  | johndoe123 | Premium     | 2023-01-15T08:30:00 | Logged In        | 2023-10-10T14:20:00+00:00 |
| 78910  | johndoe123 | Premium     | 2023-01-15T08:30:00 | Updated Profile  | 2023-10-11T11:45:00+00:00 |
| 78910  | johndoe123 | Premium     | 2023-01-15T08:30:00 | Purchased Item   | 2023-10-12T09:00:00+00:00 |

Solution

  • You cannot wildcard the array index. Copy is a row for row mapping so you could put the 0th entry in your table I expect you want to unroll the array into many rows. To do this you will need to import the array and then unnest it utilizing a staging table.

    There are a number of ways to do this but I expect the easiest is to import the array as a SUPER and then unnest the array. See: https://docs.aws.amazon.com/redshift/latest/dg/query-super.html