I have a array of string called tags in my hive table. I want to join each element from this array with a ',' delimiter and form a string. However, while joining I don't want to join the string if it doesn't include ':' character in it.
The reason I want to do this is, once string is joined, it should become, a:1,b:2,c:3. Then I want to apply str_to_map to make it a dictionary. Right now my str_to_map is failing if input doesn't include ':' in even of the elements.
Minimum input to fail: ["abc-xyz-1"]
SELECT
CAST(SPLIT(hostname, '-')[1] AS BIGINT) AS host_id,
str_to_map(concat_ws(',', tags)) AS tags,
stack_trace
FROM test_events;
I would suggest "exploding" the array data into separate rows, which allows use of an IF on each element inside the concat_ws. Like this (untested code):
refs: lateral view & explode
I have to admit, not entirely sure if the group by is needed, but I have assumed it is.
I suggest experimentation with this snippet:
Should avoid the tags without the ":", and if you change the where clause you could locate the tags that are non-compliant.