Classifying excel workbooks based on "string name" at specific location of excel sheet

51 views Asked by At

I am working with ~30 excel workbooks containing data with different set of designs. However, excel sheets within workbook are well formatted.

# There is a pattern, wherein if we have string "System setting" at 
# i.e., 
df[4,4] == "System setting"  #(then it is single core setup)

# Similarly,
df[4,6] == "System setting" # (then it is double core setup)
df[4,9] == "System setting" # (then it is triple core setup)

Now, I want to classify all of workbooks based on this information; whether their system setting is of single core, double core or triple core.

I tried following.

# Creating list of file names within folder
file_list <- list.files(pattern = '*.xlsx')
file_list

# Extracting sheet of interest as list of dataframes
Setup <- sapply(file_list, function(i){
  x = read_excel(i, sheet = "Setup")
  x
}, simplify = F, USE.NAMES = T)

# Now looking for specific information at cell location, this is for specifically identify sheets with single core 
Check_single_core <- for (i in Setup){
                        x = i[4,4]
                        x
                     }

There are 11 out of 30 files have string, "System setting" at this location (i.e., i[4,4]) However, output over here is null.

I also tried this with lapply

Check_single_core_2 <- lapply(Setup, function(j){
  y = j[4,4]
  y
})

However, this generates list of 30 tibbles. All of them are blank.

I would like to understand how I can resolve this and further classify all of them under correct category.

1

There are 1 answers

0
Emmanuel Hamel On

If the information is in a specific cell in your excel files, you could use the following approach :

library(RDCOMClient)

xlApp <- COMCreate("Excel.Application")
xlApp[["DisplayAlerts"]] <- FALSE
xlApp[["Visible"]] <- FALSE

path_To_Excel_File <- "D:/excel_File.xlsx"
xlWbk <- xlApp$Workbooks()$Open(path_To_Excel_File)

xlWbk$Sheets(1)$Range("B2")$Value()

You could also consider the following approach to read a specific cell :

library(readxl)
read_excel(path_To_Excel_File, range = "B2")

Once you have the information you can use it to classify the excel files.