Fill values from one dataframe with another dataframe

Question

Fill values from one dataframe with another dataframe

50 views Asked by nickes At 19 January 2024 at 15:43

I have some problems with some processing that I must do in an R environment in Databricks. I have two dataframes.

df1: contains the rows and columns (index1 and index2) and the value to be assigned

index1	index2	value
142342315	1423423341	A
142342315	14234235325	A
1423423166	14234235325	C
1423423166	1423423341	C
...	...	...

df2: generated by an array with all combinations of index

rowsname	142342315	1423423166	1423423341	14234235325
142342315	NA	NA	NA	NA
1423423166	NA	NA	NA	NA
1423423341	NA	NA	NA	NA
14234235325	NA	NA	NA	NA

The first thing that came to mind is to make a loop to assign the value to each location in the dataframe. Something like that:

for (i in 1:nrow(df1)) {
    df2[df1$index_1[i], df1$index_2[i]] <- df1$temp[i]
    df2[df1$index_2[i], df1$index_1[i]] <- df1$temp[i]
}

But, the problem is that it involves a large number of combinations, several hours of execution.

I tried using sparseMatrix but my indices cannot be converted to integers in this case. Can you think of a way to do it optimally?

I tried for loops, it's so slowly. I tried using sparseMatrix but my indices cannot be converted to integers in this case.

Original Q&A

There are 1 answers

**BigFinger** · Answer 1 · 2024-01-19T23:01:06+00:00

If I understand it correctly, you are looking to produce a square matrix with all names across both dimensions. I think your initial approach should work, if you preallocate the output matrix and replace the for loop with some vectorized version. How about the following?

df1 = data.frame(
    index1=c("142342315", "142342315", "1423423166", "1423423166"),
    index2=c("1423423341", "14234235325", "14234235325", "1423423341"),
    value=c("A","A","C","C")
)

rowsname = unique(c(df1$index1,df1$index2))
res = matrix(nrow=length(rowsname),ncol=length(rowsname),dimnames=list(rowsname,rowsname))

lapply(1:nrow(df1), function(r) {
    res[df1$index1[r], df1$index2[r]] <<- df1$value[r]
    res[df1$index2[r], df1$index1[r]] <<- df1$value[r]
})

Content of variable "res":

            142342315 1423423166 1423423341 14234235325
142342315   NA        NA         "A"        "A"        
1423423166  NA        NA         "C"        "C"        
1423423341  "A"       "C"        NA         NA         
14234235325 "A"       "C"        NA         NA

TechQA.

Fill values from one dataframe with another dataframe

There are 1 answers

Related Questions in R

Related Questions in SPARKR

Popular Questions

Trending Questions

Fill values ​from one dataframe with another dataframe

There are 1 answers

Related Questions in R

Related Questions in SPARKR

Popular Questions

Trending Questions

Fill values from one dataframe with another dataframe