I need to categorize a very long df using a long list of criteria. Here is a simplified version of the criteria as a df:
crit <- data.frame(grp = c("g1", "g1", "g1", "g2", "g2", "g2"),
class = c("A", "B", "C", "A", "B", "C"),
min = c(1, 3, 5, 8, 10, 12),
max = c(3, 5, 8, 10, 12, 14)
)
A second df would receive a column containing "class" based on whether the value is linked to "grp" (part 1 of the procedure) and falls within the specified ranges (min, max) (part 2 of the procedure). Also, if a value is below the lowest or above the highest value in a range, it will be categorized as belonging to the lowest/highest "class." For example:
| grp | val | class |
|---|---|---|
| g1 | 0 | A |
| g1 | 1 | A |
| g2 | 7 | A |
| g2 | 11 | B |
df <- data.frame(grp = c("g1", "g1", "g2", "g2"),
val = c(0, 1, 7, 11)
)
Do you have any suggestions on how to do this using dplyr? Any help is very much appreciated.
One option is something like this:
Essentially, it peforms a kind-of crossjoin, then filters to find the ones that match the criteria.
Another option is this: