Dataframe1 has two columns: num_movies and userId. Dataframe2 has two columns: No_movies and userId. Dataframe2 has 2106 rows, while Dataframe1 has only 1679. For each userId in Dataframe1, I want to subtract the matching No_movies value in Dataframe2 from num_movies. I have written the following line:
df1$num_movies = df1$num_movies - df2$No_movies[df1$userId %in% df2$userId]
and I get the following error:
Error in `$<-.data.frame`(`*tmp*`, "num_movies", value = c(2, 9, 743, :
replacement has 2106 rows, data has 1679
In addition: Warning message:
In df1$num_movies - df2$No_movies[df1$userId %in% :
longer object length is not a multiple of shorter object length
Elsewhere it has been proposed that I upgrade from 3.0.2 to 3.1.2 to solve this problem, but I still get the same error after the upgrade. What I have written seems logical to me: I intend to pick only 1679 userIds out of the 2106. Why is it selecting all of them? How do I get around this error?
You can use the match function to find the corresponding row from Dataframe2 for each row in Dataframe1.
Data:
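As a minimal sketch, the toy data below (hypothetical values, not the asker's real data) shows why the logical index from %in% ends up selecting every row of the longer data frame, and how match lines the rows up instead. The names idx_lgl and pos are illustrative only.

```r
# Hypothetical toy data standing in for the real data frames
df1 <- data.frame(userId = c(1, 3, 5), num_movies = c(10, 20, 30))
df2 <- data.frame(userId = c(5, 1, 2, 3, 4), No_movies = c(3, 1, 9, 2, 7))

# %in% returns one TRUE/FALSE per row of df1 (length 3 here).
# Used to subset the longer df2$No_movies, that logical vector is
# recycled, so every element of df2$No_movies is selected.
idx_lgl <- df1$userId %in% df2$userId
length(df2$No_movies[idx_lgl])   # 5, not 3

# match() returns, for each df1$userId, the position of that id in df2
pos <- match(df1$userId, df2$userId)   # 2 4 1

# Subtract the matched counts; the result has one value per row of df1
df1$num_movies <- df1$num_movies - df2$No_movies[pos]
df1$num_movies                   # 9 18 27
```

If a userId in df1 does not occur in df2, match returns NA for that row, so the subtraction yields NA there and may need separate handling.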