data.table operations by column name fail when the name contains spaces


Reproducible example

# Use the iris data set
library(data.table)
colnames(iris)[3] <- "Petal Length"
iris <- as.data.table(iris)

Accessing a column whose name has no space works fine:

iris[,Petal.Width]

However, accessing a column whose name contains a space doesn't work:

iris[,Petal Length]   # syntax error: unexpected symbol
iris[,'Petal Length']

The only solution seems to be:

iris[,iris$'Petal Length']

Comments: I'm new to data.table. I understand there are a lot of quirks in data.table; is this one of them? I could rename my variables to get rid of the spaces, but I'd prefer not to if I don't need to. I also read a previous question about selecting by column name, and I understand that updates in the two years since then have made it possible, as shown by how easy it is when the column name has no spaces.


There is 1 answer

grrgrrbla On BEST ANSWER

Update 2020-04-22

data.table has evolved and now iris[ , 'Petal.Length'] will return a one-column table (i.e., character and integer literal vectors in j can be used for column selection). There have also been ample updates in extending .SDcols for common use cases to do column filtration (subsetting by pattern on name, subsetting by logical aggregation); see the NEWS for more details.
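As a rough illustration of the .SDcols extensions mentioned above, here is a minimal sketch. It assumes a reasonably recent data.table release; the names dt, petal, num and means are made up for this example and are not from the original answer.

```r
library(data.table)

# Work on a copy named dt (hypothetical name) so the built-in iris stays untouched
dt <- as.data.table(datasets::iris)
setnames(dt, "Petal.Length", "Petal Length")

# Select columns by a regular expression on their names
petal <- dt[, .SD, .SDcols = patterns("^Petal")]

# Select columns by a predicate function applied to each column
num <- dt[, .SD, .SDcols = is.numeric]

# The same selection combines with grouping: mean of each Petal column per species
means <- dt[, lapply(.SD, mean), by = Species, .SDcols = patterns("^Petal")]
```

Neither form requires typing the column name, so the space in "Petal Length" does not get in the way.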

Leaving the below for posterity.


Just use with = FALSE as explained under data.table FAQ points 1.1-1.3 and 2.17:

iris[ ,'Petal Length', with = FALSE]

and make sure to read the excellent introduction to data.table PDF vignette and the new HTML vignettes.


In this case, for what you expect (a vector), using [[ is more appropriate:

iris[['Petal Length']]

Alternatively, you can also refer to column names as if they were variables in j:

iris[, `Petal Length`] # note the backticks.
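
To compare the options side by side, here is a small sketch that runs each of them on a copy of the data and keeps the result. The names dt, v1, v2, d1 and d2 are made up for this example; the calls themselves are the ones discussed in this answer, plus a plain .SDcols selection.

```r
library(data.table)

# Work on a copy named dt (hypothetical name) so the built-in iris stays untouched
dt <- as.data.table(datasets::iris)
setnames(dt, "Petal.Length", "Petal Length")

# Backticks: refer to the column as a variable in j; returns a vector
v1 <- dt[, `Petal Length`]

# [[ : list-style extraction; also returns a vector
v2 <- dt[["Petal Length"]]

# with = FALSE: treat j as a column name; returns a one-column data.table
d1 <- dt[, "Petal Length", with = FALSE]

# .SDcols: select by name through .SD; also returns a one-column data.table
d2 <- dt[, .SD, .SDcols = "Petal Length"]
```

Which one to pick mostly depends on whether you want a vector (the first two) or a data.table (the last two).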