jsonazureazure-data-lakeu-sql

U-SQL - Extract data from complex json object


So I have a lot of json files structured like this:

{
    "Id": "2551faee-20e5-41e4-a7e6-57bd20b02a22",
    "Timestamp": "2016-12-06T08:09:57.5541438+01:00",
    "EventEntry": {
        "EventId": 1,
        "Payload": [
            "1a3e0c9e-ef69-4c6a-ac8c-9b2de2fbc701",
            "DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork.FetchVisionModelsForClientOnReferenceDateAsync(System.Int64 clientId, System.DateTime referenceDate, System.Threading.CancellationToken cancellationToken)",
            25,
            "DHS.PlanCare.Business.BusinessLogic.VisionModels.VisionModelServiceWithoutUnitOfWork+<FetchVisionModelsForClientOnReferenceDateAsync>d__11.MoveNext\r\nDHS.PlanCare.Core.Extensions.IQueryableExtensions+<ExecuteAndThrowTaskCancelledWhenRequestedAsync>d__16`1.MoveNext\r\n",
            false,
            "2197, 6-12-2016 0:00:00, System.Threading.CancellationToken"
        ],
        "EventName": "Duration",
        "KeyWordsDescription": "Duration",
        "PayloadSchema": [
            "instanceSessionId",
            "member",
            "durationInMilliseconds",
            "minimalStacktrace",
            "hasFailed",
            "parameters"
        ]
    },
    "Session": {
        "SessionId": "0016e54b-6c4a-48bd-9813-39bb040f7736",
        "EnvironmentId": "C15E535B8D0BD9EF63E39045F1859C98FEDD47F2",
        "OrganisationId": "AC6752D4-883D-42EE-9FEA-F9AE26978E54"
    }
}

How can I create an u-sql query that outputs the

Id, 
Timestamp, 
EventEntry.EventId and 
EventEntry.Payload[2] (value 25 in the example below)

I can't figure out how to extend my query

@extract =
     EXTRACT 
         Timestamp DateTime
     FROM @"wasb://xxx/2016/12/06/0016e54b-6c4a-48bd-9813-39bb040f7736/yyy/{*}/{*}.json"
     USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();

@res =
    SELECT Timestamp
    FROM @extract;

OUTPUT @res TO "/output/result.csv" USING Outputters.Csv(); 

I have seen some examples like:

U- SQL Unable to extract data from JSON file => this only queries one level of the document, I need data from multiple levels.

U-SQL - Extract data from json-array => this only queries one level of the document, I need data from multiple levels.


Solution

  • JSONTuple supports multiple JSONPaths in one go.

    @extract =
         EXTRACT
             Id String,
             Timestamp DateTime,
             EventEntry String
         FROM @"..."
         USING new Microsoft.Analytics.Samples.Formats.Json.JsonExtractor();
    
    @res =
        SELECT Id, Timestamp, EventEntry,
        Microsoft.Analytics.Samples.Formats.Json.JsonFunctions.JsonTuple(EventEntry,
            "EventId", "Payload[2]") AS Event
        FROM @extract;
    
    @res =
        SELECT Id,
        Timestamp,
        Event["EventId"] AS EventId,
        Event["Payload[2]"] AS Something
        FROM @res;