My data is saved on a daily basis in the following path: "/Data/{year}/{month}/{day}/mydata.json" So, e.g. "/Data/2018/10/1/mydata.json" , "/Data/2018/10/2/mydata.json", "/Data/2018/11/1/mydata.json", "/Data/2018/12/5/mydata.json", etc.
I would like to combine all the months and days in one file using USQL. Is it possible to do it in an easy way without mentioning each path separately (otherwise it's crazy to do it for all the days of the year)?
At the moment I use this:
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
USING Microsoft.Analytics.Samples.Formats.Json;
@a =
EXTRACT EventCategory string
, EventAction string
, EventLabel string
FROM "/Data/2018/10/2/mydata.json"
USING new JsonExtractor()
UNION ALL
EXTRACT EventCategory string
, EventAction string
, EventLabel string
FROM "/Data/2018/11/2/mydata.json"
USING new JsonExtractor();
OUTPUT @a
TO "/Output/mydata.Csv"
USING Outputters.Csv(outputHeader:true);
I would like to combine all the months and days in one file using USQL. Is it possible to do it in an easy way without mentioning each path separately (otherwise it's crazy to do it for all the days of the year)?
Yes! You can do this using patterns, a basic example:
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
DECLARE @input string = "/Data/2018/{*}/2/mydata.json";
USING Microsoft.Analytics.Samples.Formats.Json;
@a =
EXTRACT EventCategory string
, EventAction string
, EventLabel string
FROM @input
USING new JsonExtractor()
OUTPUT @a
TO "/Output/mydata.Csv"
USING Outputters.Csv(outputHeader:true);
this will load all data of the second day of the month.
Other variations:
DECLARE @input string = "/Data/2018/{*}/{*}/mydata.json";
will process all files of 2018
DECLARE @input string = "/Data/{*}/12/{*}/mydata.json";
will process all files generated in the 12th month of all years
If you want to retrieve the file parts to get the actual date parts you can do:
REFERENCE ASSEMBLY [Newtonsoft.Json];
REFERENCE ASSEMBLY [Microsoft.Analytics.Samples.Formats];
USING Microsoft.Analytics.Samples.Formats.Json;
@a =
EXTRACT EventCategory string
, EventAction string
, EventLabel string
, date DateTime
FROM "/Data/{date:yyyy}/{date:MM}/{date:dd}/mydata.json"
USING new JsonExtractor()
OUTPUT @a
TO "/Output/mydata.Csv"
USING Outputters.Csv(outputHeader:true);
As you can see there is now an additional column date
of type DateTime
that can be used in the query and/or included in the output.