I have dozens of excel files which i would like to import into R for futher processing and analysis. These are files for each month. The source files are with filters which were to include data not needed. I wanted to use read_xlsx function from readxl package. However all rows are imported into R including the ones not needed. I know i can import all files and then do the filtering thing or just copy paste original excel files. What i want to know is if i can keep the filters in the files during the read_xlsx step which is the best option for me at the moment.
Thanks in advance,
Felix
I struggled with same issue some time ago. And after some research and AFAIK the short answer is (still) that this can't be achieved via
readxl. Moreover, AFAIK there is also (still) no out-of-the-box option in packages likeopenxlsxor ....However, as
openxlsxcould read an excel workbook object it allows to retrieve the information about the set filters which could then be used to apply the filters on the imported dataset.The following code is an example of how this could be achieved.
Note: The code worked fine for my use-case but it is not meant as a general out-of-the-box solution so you probably have to adjust the code to fit your needs.
First, let's create an example excel file for which I use the
mtcarsdataset:Second, in MS Excel I have set two filters on the table, i.e. I set a filter on the
cylcolumn and a second on thecarbcolumn:Prepared with the example data the following code uses a custom function ...
which after loading the data and the workbook could be used to extract the information on the set filters and parses them ...
... and finally applies the filters on the imported dataset for which I use
rlang::parse_exprsTwo more notes:
In my case the data was stored in tables and the custom function only works on tables.
In the example I have only one sheet and one table. In my use-case the excel file contained multiple sheets with one table per sheet. In that case you could or have to use
lapplyorpurrr::mapto iterate over the tables and the sheets.