Categorize data based on multiple criteria using dplyr

Question

Categorize data based on multiple criteria using dplyr

43 views Asked by andrsn At 25 June 2023 at 17:11

I need to categorize a very long df using a long list of criteria. Here is a simplified version of the criteria as a df:

crit <- data.frame(grp = c("g1", "g1", "g1", "g2", "g2", "g2"),
                   class = c("A", "B", "C", "A", "B", "C"),
                   min = c(1, 3, 5, 8, 10, 12),
                   max = c(3, 5, 8, 10, 12, 14)
                   )

A second df would receive a column containing "class" based on whether the value is linked to "grp" (part 1 of the procedure) and falls within the specified ranges (min, max) (part 2 of the procedure). Also, if a value is below the lowest or above the highest value in a range, it will be categorized as belonging to the lowest/highest "class." For example:

grp	val	class
g1	0	A
g1	1	A
g2	7	A
g2	11	B

df <- data.frame(grp = c("g1", "g1", "g2", "g2"),
                 val = c(0, 1, 7, 11)
                )

Do you have any suggestions on how to do this using dplyr? Any help is very much appreciated.

Original Q&A

There are 1 answers

**Mark** · Accepted Answer · 2023-06-26T01:46:25+00:00

One option is something like this:

df %>%
    left_join(crit, by = "grp", relationship = "many-to-many") %>%
    filter(val >= min & val <= max) %>%
    select(-min, -max)

Essentially, it peforms a kind-of crossjoin, then filters to find the ones that match the criteria.

Another option is this:

# group everything by `grp`, so we just have one row for each `grp`, and a list of classes, mins and maxes
crit <- crit %>%
    mutate(class = list(class), min = list(min), max = list(max), .by = "grp") %>%
    distinct()

df %>%
    left_join(crit, by = "grp") %>%
    mutate(class = pmap(list(val, class, min, max), ~..2[..3 <= ..1 & ..1 <= ..4])) %>% # parallel map
    select(-min, -max) %>%
    unnest(class)

TechQA.

Categorize data based on multiple criteria using dplyr

There are 1 answers

Related Questions in DPLYR

Related Questions in CATEGORIZATION

Popular Questions

Trending Questions