Identify text pattern in R dataframe

Question

32 views · Asked by Marta López on 15 May 2023 at 10:50

I have identifiers in two columns of a dataframe, but the two columns use different formats. It looks like this:

  Description1                Description2
1  A0A2H1CVW1_FASHEprotein1   tr|A0A2H1CVW1|A0A2H1CVW1_FASHEprotein1 
2  A0A4E0RAA2_FASHEprotein2   tr|A0A2H1BSG1|A0A2H1BSG1_FASHEprotein3
3  A0A2H1CFJ4_FASHEprotein4   tr|A0A2H1CFJ4|A0A2H1CFJ4_FASHEprotein4

How can I identify the rows where the identifiers differ between the two columns, as in row 2?

There is 1 answer

**Allan Cameron** · Answer 1 · 2023-05-15T11:00:53+00:00

You could use str_detect from the stringr package to check whether each Description1 value occurs within the corresponding Description2 value. Rows that return FALSE are the ones where the identifiers differ:

library(stringr)

str_detect(df$Description2, df$Description1)
#> [1]  TRUE FALSE  TRUE

Data in reproducible format

df <- structure(list(Description1 = c("A0A2H1CVW1_FASHEprotein1",  
                                      "A0A4E0RAA2_FASHEprotein2", 
                                      "A0A2H1CFJ4_FASHEprotein4"), 
                     Description2 = c("tr|A0A2H1CVW1|A0A2H1CVW1_FASHEprotein1", 
                                      "tr|A0A2H1BSG1|A0A2H1BSG1_FASHEprotein3",
                                      "tr|A0A2H1CFJ4|A0A2H1CFJ4_FASHEprotein4"
                )), class = "data.frame", row.names = c("1", "2", "3"))
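
If you also want the row numbers and the differing values themselves, one possible base R sketch is to strip the prefix from Description2 and compare the two columns directly. This is not part of the answer above: the column names id2 and differs and the variable mismatch_rows are made up for illustration, and the approach assumes the identifier is always the last "|"-separated field in Description2.

```r
# Same data as above, rebuilt so the snippet runs on its own
df <- data.frame(
  Description1 = c("A0A2H1CVW1_FASHEprotein1",
                   "A0A4E0RAA2_FASHEprotein2",
                   "A0A2H1CFJ4_FASHEprotein4"),
  Description2 = c("tr|A0A2H1CVW1|A0A2H1CVW1_FASHEprotein1",
                   "tr|A0A2H1BSG1|A0A2H1BSG1_FASHEprotein3",
                   "tr|A0A2H1CFJ4|A0A2H1CFJ4_FASHEprotein4")
)

# Drop everything up to and including the last "|" in Description2
# (assumption: the identifier is always the final "|"-separated field)
df$id2 <- sub("^.*\\|", "", df$Description2)

# TRUE where the two identifiers are not identical
df$differs <- df$Description1 != df$id2

# Row numbers of the mismatches
mismatch_rows <- which(df$differs)

# Show the differing pairs side by side
print(df[mismatch_rows, c("Description1", "id2")])
```

With the sample data, mismatch_rows contains only row 2. Unlike substring detection, this comparison requires an exact match, so it would also flag a row where Description1 is only a partial match of the identifier in Description2.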
