how to partially match between 2 columns in R

34 views Asked by At

I'm having trouble finding out if values partially match between 2 columns in different tables so i could update values accordingly. The problem is trying to match a longer value with smaller value and having multiple prefix values. I don't want to use any loops as my dataset is in the 1m range.

df1:

PRODUCT_8 VAT
72093234 0,2
30023000 0,2
12093236 0,2
30066012 0,2

df2:

KN
30022000
30023000
300660
72

Desired output:

PRODUCT_8 VAT
72093234 0,9
30023000 0,9
12093236 0,2
30066012 0,9

I have tried this:

df1[is.na(pmatch(df1$PRODUCT_8, df2$KN, duplicates.ok = TRUE)) == FALSE, 
                 "VAT"] <- 0.09

but pmatch doesn't understand if I want to match longer numbers with smaller. It would be perfect if I could just use startswith(df1$PRODUCT_8, df2$KN), but sadly you can't have multiple values in the prefix argument. sqldf is not complying with my demands either, don't know what's the problem with that.

0

There are 0 answers