Adding a regression line to a dotplot, or stacking overlapping points horizontally without dotplot


I'm trying to create a graph that shows the regression lines for this data and also meaningfully shows the individual points. If I keep the x-axis data as numeric and use a scatter plot, the graph looks like this:

ggplot(data = plotting, aes(x = x, y = y, color = separator, fill = separator)) +
  geom_point() +
  scale_color_manual(values=c("palevioletred3", "deepskyblue3")) +
  geom_smooth(data = subset(plotting, separator == 0), method = "lm", se = FALSE, color = "violetred3") +
  geom_smooth(data = subset(plotting, separator == 1), method = "lm", se = FALSE, color = "dodgerblue3") +
  labs(title = "Learning and Learner Enjoyment",
       x = "Learner Enjoyment",
       y = "Learning")


But the overlapping points make this graph less than ideal. If I change geom_point to geom_dotplot and leave the x variable as numeric, I get this:

ggplot(data = plotting, aes(x = x, y = y, color = separator, fill = separator)) +
  geom_dotplot(binaxis = "y", dotsize = 0.5, stackgroups = TRUE, stackdir = "center", binpositions = "all") +
  scale_color_manual(values=c("palevioletred3", "deepskyblue3")) +
  geom_smooth(data = subset(plotting, separator == 0), method = "lm", se = FALSE, color = "violetred3") +
  geom_smooth(data = subset(plotting, separator == 1), method = "lm", se = FALSE, color = "dodgerblue3") +
  labs(title = "Sample",
       x = "X",
       y = "Y")


This is less than ideal for obvious reasons! (Also, why does it change the legend key like that?) To make the dotplot work properly, I changed the x variable to a factor, and the resulting plot is this:

ggplot(data = plotting, aes(x = factor_x, y = y, color = separator, fill = separator)) +
  geom_dotplot(binaxis = "y", dotsize = 0.5, stackgroups = TRUE, stackdir = "center", binpositions = "all") +
  scale_color_manual(values=c("palevioletred3", "deepskyblue3")) +
  geom_smooth(data = subset(plotting, separator == 0), method = "lm", se = FALSE, color = "violetred3") +
  geom_smooth(data = subset(plotting, separator == 1), method = "lm", se = FALSE, color = "dodgerblue3") +
  labs(title = "Sample",
       x = "X",
       y = "Y")


That is nearly perfect. I just want to add the regression lines, which I now can't do because the x variable is a factor. Is there a way to draw the dotplot with the x axis left as numeric, or a way to add the lines I want to the factor-based dotplot? I know this is an unconventional use of a dotplot, but I can't find a better way to stack the overlapping points of a scatter plot. I've tried jitter and dodge, but I still end up with a lot of overlapping points.
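
One possible direction, offered as a rough sketch rather than a confirmed fix: keep the factor on the x axis for geom_dotplot, and give geom_smooth a numeric x (the factor's level positions) plus an explicit group. The data frame below is made up to stand in for `plotting`, and separator is assumed to be a factor with levels 0 and 1. The lines are fitted on the level positions 1 to 6, so this only matches a fit on the real values when the levels are evenly spaced and none is missing.

```r
library(ggplot2)

# Hypothetical data standing in for the question's `plotting` data frame.
set.seed(1)
plotting <- data.frame(
  x = sample(5:10, 120, replace = TRUE),
  separator = factor(sample(0:1, 120, replace = TRUE))
)
# Rounded y values so that many points overlap, as in the question.
plotting$y <- round(0.5 * plotting$x + rnorm(120))
plotting$factor_x <- factor(plotting$x)

p <- ggplot(plotting, aes(x = factor_x, y = y,
                          color = separator, fill = separator)) +
  # Same dotplot layer as in the question, drawn on the discrete x axis.
  geom_dotplot(binaxis = "y", dotsize = 0.5, stackgroups = TRUE,
               stackdir = "center", binpositions = "all") +
  # Numeric x for this layer only: as.numeric() gives the level positions
  # (1 to 6), which is where the discrete scale places each level.
  geom_smooth(aes(x = as.numeric(factor_x), group = separator),
              method = "lm", formula = y ~ x, se = FALSE) +
  scale_color_manual(values = c("palevioletred3", "deepskyblue3")) +
  labs(title = "Sample", x = "X", y = "Y")

p
```

Mapping group = separator yields one fitted line per separator value, so the two subset() calls are not needed in this variant; the line colours then follow the colour scale instead of being set by hand.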

Also, weirdly enough, when I converted x to a factor, its values changed from what they were (5-10) to the numbers 1-6, which is infuriating.
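
A likely explanation for the 1-6 values, assuming the factor was converted back with as.numeric(): a factor stores integer level codes, and as.numeric() returns those codes rather than the labels. The small example below uses made-up values in the 5-10 range, not the actual data.

```r
# Hypothetical values standing in for the question's x column (5 to 10).
x <- c(5, 7, 10, 6, 8, 9, 5, 10)
factor_x <- factor(x)

# as.numeric() on a factor returns the internal level codes, not the labels.
codes <- as.numeric(factor_x)

# Converting through the character labels recovers the original values.
recovered <- as.numeric(as.character(factor_x))

print(codes)
print(recovered)
```

Here codes holds positions among the sorted levels, while recovered holds the original numbers again.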
