I have this dataset of active subjects during specified time-periods.
   start    end  name
0  00:00  00:10     a
1  00:10  00:20     b
2  00:00  00:20     c
3  00:00  00:10     d
4  00:10  00:15     e
5  00:15  00:20     a
The intervals are inclusive on the left (start) side and exclusive on the right (end) side.
There are always exactly three subjects active at any time. I want to increase the granularity of the data so that I have the three active subjects for each second. Each second has three unique values.
This would be the desired result for the test case.
slot1 slot2 slot3
0 a c d
1 a c d
2 a c d
3 a c d
4 a c d
5 a c d
6 a c d
7 a c d
8 a c d
9 a c d
10 b c e
11 b c e
12 b c e
13 b c e
14 b c e
15 b c a
16 b c a
17 b c a
18 b c a
19 b c a
The order of the subjects inside the slots is irrelevant for now. Subjects can reappear in the data, like "a" from 00:00 to 00:10 and then again from 00:15 to 00:20. Intervals can start and end at any second.
Route 1: One (costly but easy) way is to explode the data to the seconds, then merge 3 times:
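A minimal sketch of this route, assuming the times are mm:ss strings and that the frame is called df (both are assumptions, as is the helper to_seconds). It explodes each interval into one row per second, joins three copies of the exploded frame on the second, and keeps a single ordering of the names per second:

```python
import pandas as pd

# Sample data from the question; times are assumed to be mm:ss strings.
df = pd.DataFrame({
    "start": ["00:00", "00:10", "00:00", "00:00", "00:10", "00:15"],
    "end":   ["00:10", "00:20", "00:20", "00:10", "00:15", "00:20"],
    "name":  ["a", "b", "c", "d", "e", "a"],
})

def to_seconds(col):
    """Hypothetical helper: convert 'mm:ss' strings to whole seconds."""
    return pd.to_timedelta("00:" + col).dt.total_seconds().astype(int)

start, end = to_seconds(df["start"]), to_seconds(df["end"])

# One row per (second, name); range(s, e) keeps the end exclusive.
per_second = (
    df.assign(second=[list(range(s, e)) for s, e in zip(start, end)])
      .explode("second")
      .astype({"second": "int64"})
      [["second", "name"]]
)

# Self-join on the second so each row carries three names.
slots = per_second.rename(columns={"name": "slot"})
triples = (
    slots.merge(slots, on="second", suffixes=("1", "2"))
         .merge(slots.rename(columns={"slot": "slot3"}), on="second")
)

# Keep one ordering per second (slot1 < slot2 < slot3) to drop permutations.
keep = (triples["slot1"] < triples["slot2"]) & (triples["slot2"] < triples["slot3"])
result = triples[keep].set_index("second").sort_index()
print(result)
```

The joins produce 27 candidate rows per second before filtering, which is why this route gets expensive on long time ranges. The names in each row come out in alphabetical order, which is fine as long as slot order does not matter.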
Output:
Route 2: Another way is to cross merge then query the overlapping intervals:
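A minimal sketch of this route, again assuming mm:ss strings and a frame called df (to_seconds is a hypothetical helper). It cross joins three copies of the intervals, computes the common overlap of each triple, and keeps the triples whose overlap is non-empty. Note that how="cross" needs pandas 1.2 or later:

```python
import pandas as pd

# Sample data from the question; times are assumed to be mm:ss strings.
df = pd.DataFrame({
    "start": ["00:00", "00:10", "00:00", "00:00", "00:10", "00:15"],
    "end":   ["00:10", "00:20", "00:20", "00:10", "00:15", "00:20"],
    "name":  ["a", "b", "c", "d", "e", "a"],
})

def to_seconds(col):
    """Hypothetical helper: convert 'mm:ss' strings to whole seconds."""
    return pd.to_timedelta("00:" + col).dt.total_seconds().astype(int)

# Same intervals, with start/end as integer seconds.
iv = df.assign(start=to_seconds(df["start"]), end=to_seconds(df["end"]))

# Every combination of three intervals (6 * 6 * 6 rows for the sample).
x = (
    iv.merge(iv, how="cross", suffixes=("1", "2"))
      .merge(iv.add_suffix("3"), how="cross")
)

# The common overlap of a triple runs from the latest start to the earliest end.
x["start"] = x[["start1", "start2", "start3"]].max(axis=1)
x["end"] = x[["end1", "end2", "end3"]].min(axis=1)

# Keep non-empty overlaps and one ordering of the names per triple.
mask = (
    (x["start"] < x["end"])
    & (x["name1"] < x["name2"])
    & (x["name2"] < x["name3"])
)
out = (
    x.loc[mask, ["start", "end", "name1", "name2", "name3"]]
     .sort_values("start")
     .reset_index(drop=True)
)
print(out)
```

The cross join grows with the cube of the number of intervals rather than with the length of the time range, so it tends to suit data with few, long intervals.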
Output:
As you can see, this output is the same as the other, but in the original form. Depending on your data, one route might be better than the other.
Update: Since your data has exactly 3 slots at any time, you can easily do this with pivot. This is the best solution.
Output:
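A minimal sketch of the pivot approach, assuming mm:ss strings and a frame called df (to_seconds is a hypothetical helper). Each interval is exploded into one row per second, the rows within each second are numbered 1 to 3, and that number becomes the column:

```python
import pandas as pd

# Sample data from the question; times are assumed to be mm:ss strings.
df = pd.DataFrame({
    "start": ["00:00", "00:10", "00:00", "00:00", "00:10", "00:15"],
    "end":   ["00:10", "00:20", "00:20", "00:10", "00:15", "00:20"],
    "name":  ["a", "b", "c", "d", "e", "a"],
})

def to_seconds(col):
    """Hypothetical helper: convert 'mm:ss' strings to whole seconds."""
    return pd.to_timedelta("00:" + col).dt.total_seconds().astype(int)

start, end = to_seconds(df["start"]), to_seconds(df["end"])

# One row per (second, name); range(s, e) keeps the end exclusive.
per_second = (
    df.assign(second=[list(range(s, e)) for s, e in zip(start, end)])
      .explode("second")
      .astype({"second": "int64"})
      [["second", "name"]]
      .reset_index(drop=True)
)

# Number the active subjects within each second: 1, 2, 3.
per_second = per_second.assign(slot=per_second.groupby("second").cumcount() + 1)

# One row per second, one column per slot.
result = (
    per_second.pivot(index="second", columns="slot", values="name")
              .add_prefix("slot")
)
print(result)
```

For the sample data this gives a, c, d for seconds 0 to 9, b, c, e for seconds 10 to 14, and b, c, a for seconds 15 to 19, matching the desired result. It relies on exactly three subjects being active in every second; a second with fewer would leave NaN in a slot, and one with more would create extra columns.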