I have the following stream analytics input:
{ "ID":"DEV-001-Test",
"TMSMUTC":"2021-10-14T14:00:00.000",
"MSGTYP":"TELEMETRY",
"THING":[
{
"TMSDUTC":"2021-10-14T13:00:00.000",
"DATA":[
{
"TAGID":"TAGB",
"VALUE":30
},
{
"TAGID":"TAGX",
"VALUE":[30.34,245.65,30.34,245.65,245.65,30.34]
}
]
}
]
}
in which the array of values for the "TAGX" is representing a value recorded from a sensor every 10 mins for one hour from the timestamp "TMSDUTC":"2021-10-14T13:00:00.000". I was wondering how could make a query that would give me a similar output:
my main doubts are how to create the sequence of 10 mins from the timestamp and cross apply the values to it.
That's a good one! Note that I highly recommend you use VSCode and the ASA extension when working on these queries. The developer experience is much nicer than in the portal thanks to local testing, and you can also unit test your query via the npm package.
I took the following assumptions:
With that, here is the query. It's split in multiple code blocks to explain the flow, but I also pasted it whole in the last code block.
First we bring all the required fields to the first level. It makes things easier to read, but not only. GetArrayElements needs an array to CROSS APPLY, but GetArrayElement (singular) doesn't return the type at compile time. Using an intermediary query step solves that.
Then we expand
DATA:Then we create a subset for records that have a
VALUEof type array (TAGX in your example). Here I avoid hard-coding per tag by detecting the type at runtime. These records will need another round of array processing in the following step.Now we can focus on expanding
VALUEfor those records. Note that we could not do that in a single pass (filter on arrays above and CROSS APPLY below), as GetArrayElements checks types before filtering is done.[edited] To increment
TMSDUTC, we use DATEADD on the index of each item in its array (ArrayIndex/ArrayValue are both returned from the array expansion, see doc below).We union back everything together:
And finally insert everything into the destination:
[edited] Please note that the only order guaranteed on a result set is the one defined by the timestamp. If multiple records occur on the same timestamp, no order is guaranteed by default. Here, at the end of the query, all of the newly created events are still timestamped on the timestamp of the original event. If you now need to apply time logic on the newly generated TMSDUTC, you will need to output these records to Event Hub, and load them in another job using TIMESTAMP BY TMSDUTC. Currently the timestamp can only be changed directly at the very first step of a query.
What is used here :
The entire thing for easier copy/pasting: