I have to classify a lot of crops based on three conditions calculated in a grid of 1e6 points. I'm trying to optimize the function below (hopefully without moving to C or Rust). Any ideas?
It's possible to reformat the input data if necessary. I already tried with data.table but the performance was worse.
This is my best shot:
condtion1 <- letters[1:8]
condtion2 <- letters[9:15]
condtion3 <- letters[16:24]
crop <- sample(0:1, 24, replace = TRUE)
names(crop) <- letters[1:24]
n <- 1e6
condtions1 <- sample(condtion1, n, replace = TRUE)
condtions2 <- sample(condtion2, n, replace = TRUE)
condtions3 <- sample(condtion3, n, replace = TRUE)
get_suitability <- function(){
  result <- character(n)
  for (i in seq_along(result)) {
    if (crop[[condtions1[[i]]]] == 0 | crop[[condtions2[[i]]]] == 0) result[[i]] <- "not suitable"
    else if(crop[[condtions1[[i]]]] == 1 & crop[[condtions2[[i]]]] == 1 & crop[[condtions3[[i]]]] == 1) result[[i]] <- "suitable"
    else if(crop[[condtions1[[i]]]] == 1 & crop[[condtions2[[i]]]] == 1 & crop[[condtions3[[i]]]] == 0) result[[i]] <- "suitable with irrigation"
  }
  result
}
microbenchmark::microbenchmark(
  get_suitability(),
  times = 5
)
#> Unit: seconds
#>               expr      min       lq     mean   median       uq      max neval
#>  get_suitability() 2.402434 2.408322 2.568981 2.641211 2.667943 2.724993     5
Created on 2024-03-24 with reprex v2.1.0
Vectorise over the condtions, getting rid of for/if. The logical indices take care of both for and if. In a comment to the question I write:
Notes:
get_suitability2 is my idea in comment to the question, a bad idea as it turned out;
get_suitability3b is a simplified version of get_suitability3 and the fastest of all;
get_suitability4 is user2554330's last function and faster than the original question code.
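The vectorisation described in this answer can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not necessarily the code of any of the functions named in the notes: the name get_suitability_vec is hypothetical, set.seed() is added here only for reproducibility, and the data setup is copied from the question. The idea is to look up all crop values at once by name and to assign the three labels through logical indices instead of a loop.

```r
set.seed(2024)  # assumption: seed added only so the sketch is reproducible

# Data setup copied from the question
condtion1 <- letters[1:8]
condtion2 <- letters[9:15]
condtion3 <- letters[16:24]
crop <- sample(0:1, 24, replace = TRUE)
names(crop) <- letters[1:24]
n <- 1e6
condtions1 <- sample(condtion1, n, replace = TRUE)
condtions2 <- sample(condtion2, n, replace = TRUE)
condtions3 <- sample(condtion3, n, replace = TRUE)

# Hypothetical name; a sketch of the logical-index approach
get_suitability_vec <- function() {
  # Look up every grid point's crop value in one step (no per-element loop)
  ok12 <- unname(crop[condtions1] == 1 & crop[condtions2] == 1)
  ok3  <- unname(crop[condtions3] == 1)

  # Default label covers the case where condition 1 or condition 2 is 0
  result <- rep("not suitable", n)
  # Logical indices play the role of both the for loop and the if/else chain
  result[ok12 & ok3]  <- "suitable"
  result[ok12 & !ok3] <- "suitable with irrigation"
  result
}

res <- get_suitability_vec()
```

Because crop only holds 0 and 1, the three cases are exhaustive, so starting from "not suitable" and overwriting the other two cases should give the same labels as the loop. It can be timed against the original with microbenchmark::microbenchmark(get_suitability(), get_suitability_vec(), times = 5).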