I am trying to create a raincloud plot to show scores by sex. However, it is subgrouping each point based on its score.
I want it to look like this image, where petal.length is grouped by species and the individual lengths are not treated as separate groups.
I have code that has been working with other data sets, but I am not sure what the issue is here.
I have also checked whether the score scale is continuous or discrete, and it is continuous.
Here is the code I am using in R:
```r
dplyr::group_by(sex) %>%
  dplyr::mutate(
    mean = mean(score),
    se = sd(score) / sqrt(length(score)),
    sex_y = paste0(sex, "\n(", n(), ")")
  ) %>%
  ungroup() %>%
  ggplot(aes(x = NIH_score, y = sex_y)) +
  stat_slab(aes(fill = sex)) +
  geom_point(
    aes(color = sex),
    shape = 16,
    position = ggpp::position_jitternudge(
      height = 0.125, width = 0,
      y = -0.125,
      nudge.from = "jittered"
    )
  ) +
  scale_fill_brewer(palette = "Set1", aesthetics = c("fill", "color")) +
  geom_errorbar(aes(
    xmin = mean - 1.96 * se,
    xmax = mean + 1.96 * se
  ), width = 0.2) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 3.0) +
  theme_bw(base_size = 10) +
  theme(legend.position = "top") +
  labs(title = "Raincloud plot with ggdist", x = "score")
```
It's not that your data is being grouped by x axis value. It's just that the bandwidth of the kernel density estimator is too small.
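To see why a small bandwidth looks like "subgrouping", here is a minimal base R sketch. The data, the variable names, and the `count_peaks` helper are made up for illustration, and the `adjust` values are chosen only to exaggerate the effect; `adjust` in `density()` multiplies the default bandwidth.

```r
set.seed(42)
# Hypothetical continuous scores; not the asker's data
scores <- rnorm(200, mean = 20, sd = 4)

# Same data, two bandwidth multipliers
d_narrow <- density(scores, adjust = 0.05)  # very small bandwidth
d_wide   <- density(scores, adjust = 2)     # larger bandwidth

# Count local maxima in an estimated density curve
count_peaks <- function(d) sum(diff(sign(diff(d$y))) == -2)

peaks_narrow <- count_peaks(d_narrow)
peaks_wide   <- count_peaks(d_wide)
```

With the narrow bandwidth the curve spikes around individual observations, so it has many more peaks than the smoothed version, which is what makes each point look like its own group.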
Let's recreate your issue with essentially the same code but some made-up data:
But if we increase the bandwidth multiplier to, say, 2 inside `stat_slab` using the `adjust` parameter, we get:

It's not clear what it is about your settings or data that is giving such a narrow bandwidth (since neither is in your question), but you should be able to get the result you need by increasing `adjust`.
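As a rough sketch of that change, assuming a made-up data frame `df` with `sex` and `score` columns (both hypothetical stand-ins for your data), the only difference from a default slab is the `adjust` argument:

```r
library(ggplot2)
library(ggdist)

set.seed(1)
# Made-up data standing in for the real scores
df <- data.frame(
  sex   = rep(c("F", "M"), each = 100),
  score = c(rnorm(100, mean = 20, sd = 4), rnorm(100, mean = 24, sd = 4))
)

p <- ggplot(df, aes(x = score, y = sex)) +
  # adjust multiplies the density bandwidth; 2 is an arbitrary starting point
  stat_slab(aes(fill = sex), adjust = 2) +
  theme_bw(base_size = 10)

p
```

The value 2 is only a starting point; try a few values until the slab is smooth without hiding real structure in the scores.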