R can't find missing / blank rows in a char column


My dataset has a character column named start_station_name with some missing values, and I'm trying to remove every row where that column is blank or NA.

I know that more than 7000 rows have no value in start_station_name. However, when I try to remove the blank rows, R doesn't find them and only removes 50 rows:

SD_cleaned <- drop_na(SD) 

Here is a sample of the dataset:

ride_id bike_type start_station_name
273C6C2B99EBAC32 electric_bike
7AB7965997435172 electric_bike Rush St & Superior St
D6C2BC6711446FB5 electric_bike
C2433C9CF5941BBF electric_bike Rush St & Superior St
... ... ...

I've also tried with na.omit() and is.na(), but I got the same result.

Thanks for any feedback.

Answer by akrun

drop_na() only removes rows that contain NA. An empty string ("") is an ordinary character value, not NA, so convert the blanks to NA before calling drop_na():

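library(dplyr)
library(tidyr)  # drop_na() comes from tidyr

SD_cleaned <- SD %>%
   # na_if() is applied column by column; recent dplyr versions do not accept a whole data frame here
   mutate(across(where(is.character), ~ na_if(.x, ""))) %>%
   drop_na()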
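
Before cleaning, it can help to check which kind of "missing" the column actually holds. The following is a minimal base R sketch on toy data modelled on the sample in the question; the empty, whitespace-only, and NA values are assumptions for illustration, not the real contents of SD.

```r
# Toy data modelled on the question's sample; the "", "  " and NA values
# are assumed here purely to illustrate the three kinds of "missing".
SD <- data.frame(
  ride_id = c("273C6C2B99EBAC32", "7AB7965997435172",
              "D6C2BC6711446FB5", "C2433C9CF5941BBF"),
  bike_type = "electric_bike",
  start_station_name = c("", "Rush St & Superior St", "  ", NA),
  stringsAsFactors = FALSE
)

x <- SD$start_station_name

# Count each kind of "missing" separately.
missing_counts <- c(
  true_na    = sum(is.na(x)),                              # real NA values
  empty      = sum(!is.na(x) & x == ""),                   # zero-length strings
  whitespace = sum(!is.na(x) & x != "" & trimws(x) == "")  # spaces only
)
print(missing_counts)
```

If true_na is small while empty or whitespace is large, that would explain why drop_na(), na.omit() and is.na() only catch a handful of rows.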

If some values contain only spaces, apply trimws() to each character column before converting the blanks to NA:

SD_cleaned <- SD %>%
     # trim whitespace first, then turn the resulting empty strings into NA
     mutate(across(where(is.character), ~ na_if(trimws(.x), ""))) %>%
     drop_na()
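
Note that drop_na() called without column arguments drops a row if any column is NA. If only start_station_name should decide which rows are removed, a base R sketch along these lines may be closer to what the question asks for. The data frame is toy data modelled on the question's sample, with assumed blank, whitespace-only, and NA values.

```r
# Toy data modelled on the question's sample; missing values are assumed.
SD <- data.frame(
  ride_id = c("273C6C2B99EBAC32", "7AB7965997435172",
              "D6C2BC6711446FB5", "C2433C9CF5941BBF"),
  bike_type = "electric_bike",
  start_station_name = c("", "Rush St & Superior St", "  ", NA),
  stringsAsFactors = FALSE
)

# Trim whitespace so that "  " is treated the same as "".
station <- trimws(SD$start_station_name)

# Keep only rows whose station name is neither NA nor empty.
keep <- !is.na(station) & station != ""
SD_cleaned <- SD[keep, , drop = FALSE]

print(SD_cleaned)
```

This leaves the other columns untouched, so a row with a valid station name is kept even if some unrelated column contains NA.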