I have many long Excel files that are too hard to handle manually. I'm reading them in R to identify the cells highlighted in yellow, as in the image
The objective is to loop over the days and the hours in the file in order to construct a data frame that indicates the option according to the hour as
I am following these answers (answer1, answer2, answer3) to do the job using the libraries xlsx, openxlsx and tidyr:
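library(xlsx)
library(openxlsx)  # attached after xlsx, so it masks xlsx::loadWorkbook
library(tidyr)
# Call the xlsx version explicitly; getSheets() needs an xlsx workbook object
wb <- xlsx::loadWorkbook("active.xlsx") # the table is saved in the file active.xlsx
sheet1 <- getSheets(wb)[[1]]
rows <- getRows(sheet1)
cells <- getCells(rows)
styles <- sapply(cells, getCellStyle)
# Return the fill colour of a cell style as a hex string ("" if there is none)
cellColor <- function(style)
{
  fg <- style$getFillForegroundXSSFColor()
  rgb <- tryCatch(fg$getRgb(), error = function(e) NULL)
  rgb <- paste(rgb, collapse = "")
  return(rgb)
}
mycolor <- c(yellow = "ffff00")
# m is 1 for yellow cells and NA otherwise
m <- match(sapply(styles, cellColor), mycolor)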
But the data is neither read nor processed correctly, and the code does not yield the needed result. I am not even close!
Could you guide me and link a tutorial or an R package that I can use to detect the highlighted cells and construct the required data frame?

Check out the free online book 'Spreadsheet Munging Strategies', which is built around the tidyxl package:
https://nacnudus.github.io/spreadsheet-munging-strategies/
For colored cells in particular, see:
https://nacnudus.github.io/spreadsheet-munging-strategies/tidy-formatted-cells.html
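
As a rough sketch of the approach described in that chapter: tidyxl reads every cell with a format id, and a separate lookup gives the fill colour for each id. The helper name cells_with_fill is made up for illustration, and the colour code "FFFFFF00" assumes the highlight is plain yellow stored as an ARGB value; check the actual values in your file, since a theme or custom yellow may differ. The file name active.xlsx is the one from the question.

```r
# Hypothetical helper: keep only the cells whose fill colour matches `colour`.
# `cells` needs a local_format_id column (as returned by tidyxl::xlsx_cells);
# `fill_rgb` is the vector of fill colours indexed by that id.
cells_with_fill <- function(cells, fill_rgb, colour = "FFFFFF00") {
  fill <- fill_rgb[cells$local_format_id]   # fill colour of each cell (NA if none)
  cells[!is.na(fill) & toupper(fill) == toupper(colour), , drop = FALSE]
}

path <- "active.xlsx"  # file name from the question

# Only runs when the file and the tidyxl package are available
if (file.exists(path) && requireNamespace("tidyxl", quietly = TRUE)) {
  cells    <- tidyxl::xlsx_cells(path)     # one row per cell
  formats  <- tidyxl::xlsx_formats(path)   # formatting lookup tables
  fill_rgb <- formats$local$fill$patternFill$fgColor$rgb

  # Assumption: the highlight is stored as "FFFFFF00"; inspect unique(fill_rgb) to confirm
  yellow <- cells_with_fill(cells, fill_rgb)
  print(yellow[, c("sheet", "address", "row", "col", "data_type")])
}
```

The row and col columns of the result tell you where each highlighted cell sits, so you can join them back to the day and hour headers of your sheet to build the data frame you need.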