I have a set of telemetry data stored in a table in the below format. Lets call this table as RawTelemetryData
device_id | TIME | ABC | DEF | GHI | LMN |
---|---|---|---|---|---|
123 | 2021-04-20 00:00:00.0000000 | 1 | 2 | 3 | 4 |
121 | 2021-04-20 00:00:00.0000000 | 1 | 2 | 3 | 4 |
The above table consist of telemetry data which is coming from several IoT device every 10 seconds. I have a client program which needs different columns to be selected based on device_id.
For example. For device_id 123, client program needs ABC,GHI column. Where as for device_id 121 client program expects to have only ABC,DEF.
Initially I was thinking of writing a function which accepts device_id and the columns to be selected. like the below. Where the first argumet is the device id and second argument is the columns to be selected. Say I need to select ABC,GHI of device id 123 ill call the function like this.
Approach 1
getDataByDeviceId(123,"ABC,GHI") This function should only project only ABC and GHI
.create-or-alter function with (folder = "getData", skipvalidation = "true") getDataByDeviceId(device_id:int,columns:string) {
}
But I couldn't get this working.
Approach 2
In this approach I preconfigured columns for each device_id in a separate table. In this case I tried to write a function which accepts only device id, and project columns which are configured in the below table.
DEVICE_COLUMN_MAPPING_TABLE
device_id | columns |
---|---|
123 | ABC |
123 | DEF |
123 | GHI |
123 | LMN |
121 | ABC |
121 | DEF |
121 | GHI |
.create-or-alter function with (folder = "getData", skipvalidation = "true") getDataByDeviceId(device_id:int) {
//program should look at DEVICE_COLUMN_MAPPING_TABLE for colums for this particular device id and project only that.
}
But unfortunately I couldn't get this also working. :(
I want to understand is it possible to get any of the above approach working. If not i have only 1 solution which I can think of , that is construct the query with proper project statement from the source program(A java program, which calls kusto function) based on device_id and invoke to get the results. I was trying to avoid doing this and make all my logic on ADX itself. Any other approach to solve the problem also welcomed :)
Here is a solution based on your second approach:
let device_mappings = datatable(device_id:long, columns_output:string)[
123, "ABC",
123, "GHI",
121, "ABC",
121, "DEF",
122, "LMN"];
let dt = datatable(device_id:long,TIME:datetime,ABC:long,DEF:long,GHI:long,LMN:long)
[123,datetime(2021-04-20 00:00:00.0000000),1,2,3,4,
123,datetime(2021-04-21 00:00:00.0000000),10,20,30,40,
121,datetime(2021-04-20 00:00:00.0000000),1,2,3,4,
122,datetime(2021-04-20 00:00:00.0000000),1,2,3,4];
let func = view(id:long) {
let idKeys = toscalar(device_mappings | where device_id == id | summarize make_set(columns_output));
let otherKeys = toscalar(device_mappings | where device_id != id | summarize make_set(columns_output));
let keysToRemove = set_difference(otherKeys,idKeys);
dt
| where device_id == id
| extend rec = pack_all()
| project filteredRec = bag_remove_keys(rec, keysToRemove)
| evaluate bag_unpack(filteredRec)
};
func(123)