I am learning R and trying to write functions that take column names as arguments. I am using rlang, and I need help understanding why I get unexpected results when I combine the curly-curly operator ({{ }}) with group_by().
I don't understand why the second code chunk, which uses curly-curly, doesn't work as I expect. Any insight would be much appreciated. I am sorry if I made a mistake in presenting the question or if it is quite simple; I am still learning.
Thank you so much.
This piece of code gives the desired result, a count of rows for each distinct value found in the column "HLA_allele":
group_function <- function(.data, x_column) {
  .data %>%
    group_by(.data[[x_column]]) %>%
    summarise(count = n())
}
Calling the function:
group_function(selectedA, "HLA_allele")
Output (expected):
# A tibble: 19 × 2
   HLA_allele count
   <chr>      <int>
 1 01            43
 2 02           113
 3 03            53
 4 11            31
 5 23            19
 6 24            53
 7 25            12
 8 26            18
 9 29            55
10 30            27
11 31            11
12 32            20
13 33            10
14 34             3
15 36             1
16 66             1
17 68            14
18 69             3
19 80             1
However, when I use the curly-curly operator instead, I get a single row containing the total count, as shown:
group_function <- function(.data, x_column) {
  .data %>%
    group_by({{ x_column }}) %>%
    summarise(count = n())
}
Calling the function:
group_function(selectedA, "HLA_allele")
Output (unexpected):
# A tibble: 1 × 2
  `"HLA_allele"` count
  <chr>          <int>
1 HLA_allele       488
I've used ensym() as suggested here, and it works, but I don't really understand what is going on or why curly-curly doesn't behave as I expected in this case.
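For reference, the ensym() variant looks roughly like the sketch below. It assumes dplyr and rlang are installed; the `toy` tibble is a made-up stand-in for selectedA (not my real data), and the intermediate name `col` is only there for illustration.

```r
library(dplyr)
library(rlang)

group_function <- function(.data, x_column) {
  # ensym() captures the argument and, when it is a string such as
  # "HLA_allele", converts it to a symbol (a bare name is accepted too).
  col <- ensym(x_column)
  .data %>%
    group_by(!!col) %>%   # !! injects the symbol, so grouping is by that column
    summarise(count = n())
}

# Hypothetical toy data standing in for selectedA
toy <- tibble(HLA_allele = c("01", "02", "02", "03"))

result <- group_function(toy, "HLA_allele")
print(result)
```

With this version the function can be called with either a quoted name, group_function(toy, "HLA_allele"), or a bare name, group_function(toy, HLA_allele).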