I have a problem with spacing data points in a boxplot. I use the following code.

DF1 <- data.frame(x = c(1, 2, 3, 4, 7, 11, 20, 23, 24, 25, 30), y = c(3, 6, 12, 13, 17, 22, NA, NA, NA, NA, NA))
library(ggplot2)
library(tidyverse)
n <- 11
DF1 <- as.data.frame(DF1)
DF1 <- reshape2::melt(DF1)
DF1 %>%
  group_by(variable) %>%
  arrange(value) %>%
  mutate(xcoord = seq(-0.25, 0.25, length.out = n())) %>%
  ggplot(aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))

This results in the following:

[Image: boxplots of x and y with the data points overlaid]

For x, the data points are evenly spread from left to right. For y, which has fewer non-missing values, the points are not evenly spread: they bunch up on one side of the box. How can the above code be modified so that the points for y are evenly spaced too? I would appreciate any suggestions.

I found a somewhat similar post here, but it did not help me.

Thank you.
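
One possible approach, offered as a sketch rather than a confirmed answer: the NA values in y are still counted by n(), so the six real y values only receive the first six of eleven offsets. Dropping the missing rows before computing the offsets should spread them over the full width. The use of tidyr::pivot_longer() in place of reshape2::melt(), and the name long, are choices made for this example and are not part of the original code.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

DF1 <- data.frame(
  x = c(1, 2, 3, 4, 7, 11, 20, 23, 24, 25, 30),
  y = c(3, 6, 12, 13, 17, 22, NA, NA, NA, NA, NA)
)

long <- DF1 %>%
  # Reshape to one row per (variable, value) pair
  pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
  # Remove missing values so n() counts only the points that are drawn
  filter(!is.na(value)) %>%
  mutate(variable = factor(variable)) %>%
  group_by(variable) %>%
  arrange(value, .by_group = TRUE) %>%
  # Offsets now span the full -0.25..0.25 range within each group
  mutate(xcoord = seq(-0.25, 0.25, length.out = n())) %>%
  ungroup()

p <- ggplot(long, aes(x = variable, y = value, group = variable)) +
  geom_boxplot() +
  geom_point(aes(x = xcoord + as.integer(variable)))
```

Printing p draws the plot. Because the boxplot ignores missing values anyway, filtering them out first should not change the boxes themselves.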

I have a boxplot generated with the following code:

library(ggplot2)
library(reshape2)  # provides melt()

b.males <- c(6, 7, 8, 8, 8, 9, 10, 10, 11, 11, 12, 12, 12, 12, 13, 14, 15)
b.females <- c(14, 13, 12, 12, 11, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 7, 7, 7, 7)
b.total <- c(b.males, b.females)

b.m <- data.frame(b.males)
b.f <- data.frame(b.females)
b.t <- data.frame(b.total)

# melt() stacks the three data frames into one long data frame
myList <- list(b.m, b.f, b.t)
df <- melt(myList)

colnames(df) <- c("class", "count")
plt <- ggplot(df, aes(x = class, y = count)) + geom_boxplot()
plt + geom_point(aes(x = as.numeric(class) + 0, colour = class))

What I'd like to do is, for any given y-axis point, show all individual points in a row. For example, for b.males, I'd like to see 3 dots at 8, with the middle dot exactly in the center and the other two dots right next to it on either side.

I attempted:

plt + geom_point(aes(x = as.numeric(class) + 0, colour=class)) +
      geom_jitter(position=position_jitter(width=.1, height=0))

But this did not keep the points close together. Additionally, in some cases it would put multiple points to the right or left of the middle of the box, not distributing them evenly as I'd like.
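
A hedged sketch of one way to get centered rows of points: geom_dotplot() with binaxis = "y" and stackdir = "center" stacks equal values side by side around the middle of each box. The data frame below is built directly with data.frame() instead of melt(), and the binwidth and dotsize values are guesses chosen for this data, not settings from the original post.

```r
library(ggplot2)

b.males <- c(6, 7, 8, 8, 8, 9, 10, 10, 11, 11, 12, 12, 12, 12, 13, 14, 15)
b.females <- c(14, 13, 12, 12, 11, 10, 10, 9, 9, 9, 9, 9, 8, 8, 8, 7, 7, 7, 7)
b.total <- c(b.males, b.females)

# Long format: one row per observation, labelled by group
groups <- c("b.males", "b.females", "b.total")
df <- data.frame(
  class = factor(
    rep(groups, times = c(length(b.males), length(b.females), length(b.total))),
    levels = groups
  ),
  count = c(b.males, b.females, b.total)
)

plt <- ggplot(df, aes(x = class, y = count)) +
  geom_boxplot() +
  # Bin along y and stack dots symmetrically around each box centre
  geom_dotplot(
    aes(fill = class),
    binaxis = "y",
    stackdir = "center",
    binwidth = 1,   # counts are integers, so one bin per value
    dotsize = 0.3   # assumed size; tune so neighbouring dots just touch
  )
```

With an odd number of equal values, such as the three 8s in b.males, the middle dot should sit on the centre line with the others on either side. Unlike geom_jitter(), the placement is deterministic.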

Similar Question 2 : Boxplot in R using ggplot2

I'm new to R and have been trying to make a boxplot. Part of the data I'm using is shown below.

            h1          h2          h3          h4          h5          h6          h7          h8          h9         h10
1  0.003719430 0.002975544 0.003049933 0.003421876 0.003421876 0.003347487 0.003645042 0.003496264 0.007364472 0.009075410
2  0.003400540 0.002749373 0.003038781 0.003328188 0.003328188 0.003400540 0.003472892 0.003400540 0.007741656 0.009333398
3  0.003741387 0.002918282 0.003142765 0.003367248 0.003367248 0.003367248 0.003666559 0.003516904 0.008081396 0.008156223
4  0.003870634 0.002884002 0.003187581 0.003339370 0.003567055 0.003415265 0.003794739 0.003491160 0.008348426 0.007741268
5  0.003782963 0.002950711 0.003177689 0.003480326 0.003404667 0.003404667 0.003707304 0.003631645 0.008927793 0.007414608
6  0.003643736 0.002884624 0.003264180 0.003416002 0.003491913 0.003416002 0.003871469 0.003795558 0.009033428 0.007135649
7  0.003718600 0.003035592 0.003111482 0.003339151 0.003566821 0.003566821 0.003642710 0.003870380 0.008120209 0.008044319
8  0.003819313 0.002979064 0.003284609 0.003360995 0.003590154 0.003437382 0.003895699 0.003590154 0.008326102 0.007791398
9  0.003899334 0.002981844 0.003211216 0.003364131 0.003669961 0.003440589 0.003746419 0.003669961 0.008410328 0.007569295
10 0.003828488 0.002986220 0.003292499 0.003445639 0.003522209 0.003522209 0.003598778 0.003598778 0.008422673 0.007810115

When I use the default boxplot command, this is what I get:

boxplot(df)

[Image: base R boxplot of columns h1 to h10]

I have been trying to generate the boxplot for the same data using ggplot2, but it gives an error that I am unable to resolve. Here's what I tried.

library(ggplot2)
df <- readRDS('data.Rda')
ggplot(df) + geom_boxplot()

Here's the error

Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous
Error: Aesthetics must either be length one, or the same length as the dataProblems:df[, 6:15]

I looked at the ggplot2 docs for geom_boxplot and realized (from the example) that I need to rearrange my data like this:

col1        col2
h1   0.003719430
h1   0.003400540
h1   0.003741387
h1   0.003870634
h1   0.003782963
h1   0.003643736
h2   0.002975544
h2   0.002749373
h2   0.002918282
h2   0.002884002
h2   0.002950711
h2   0.002884624
...

and use something like

ggplot(df, aes(factor(col1), col2)) + geom_boxplot()

But doing that by hand is a lot of work. I believe there must be some way to do this automatically, but I have not been able to find it. Any help is appreciated.
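
The rearrangement described above can be done in one call. This is a minimal sketch, assuming the tidyr package is available. The small wide data frame holds the first three values of h1, h2 and h9 from the table above so that the example is self-contained, and the column names col1 and col2 follow the layout sketched in the question.

```r
library(ggplot2)
library(tidyr)

# A small excerpt of the wide data: one column per group
wide <- data.frame(
  h1 = c(0.003719430, 0.003400540, 0.003741387),
  h2 = c(0.002975544, 0.002749373, 0.002918282),
  h9 = c(0.007364472, 0.007741656, 0.008081396)
)

# Stack all columns into (col1 = column name, col2 = value)
long <- pivot_longer(
  wide,
  cols = everything(),
  names_to = "col1",
  values_to = "col2"
)

# Keep the original column order on the x axis
long$col1 <- factor(long$col1, levels = names(wide))

p <- ggplot(long, aes(x = col1, y = col2)) + geom_boxplot()
```

reshape2::melt(wide) is an older alternative that produces columns named variable and value. In both cases the full data frame read from data.Rda would take the place of wide.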

I am making a boxplot conditioned by a factor similar to this example:

p <- ggplot(mtcars, aes(factor(cyl), mpg))
p + geom_boxplot(aes(fill = factor(am)))

There are few points in the data set, and I'd like to express this visually by overlaying the data points. I want to overlay the points colored by the same factor "am" which I try to do like this:

p + geom_boxplot(aes(fill = factor(am))) + geom_jitter(aes(colour = factor(am)))

The points are colored by the factor "am", but they are not positioned over the box plots they belong to. Instead they mix and cover both boxes. Does anyone know how to condition geom_jitter so that the points line up with the boxes for the factor "am"?
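
One option that may help, shown as a sketch: position_jitterdodge() applies the same dodging that geom_boxplot() uses and then jitters the points within each dodged box. The jitter.width and dodge.width values below are assumptions (0.75 matches the default box width), and hiding the boxplot outliers with outlier.shape = NA is an extra choice so that those points are not drawn twice.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(factor(cyl), mpg)) +
  # Hide boxplot outliers; every observation is drawn by geom_point below
  geom_boxplot(aes(fill = factor(am)), outlier.shape = NA) +
  geom_point(
    aes(colour = factor(am)),
    # Dodge by the "am" grouping, then jitter inside each dodged box
    position = position_jitterdodge(jitter.width = 0.2, dodge.width = 0.75)
  )
```

geom_point() is used here instead of geom_jitter() because the position is supplied explicitly. The jitter is random, so the exact point placement differs between runs.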

Similar Question 4 (1 solution) : R ggplot2 grouped boxplot of TCGA expression data

Similar Question 5 (1 solution) : Boxplot of CSV data with ggplot2

Similar Question 7 (2 solutions) : How to make a base R style boxplot using ggplot2?

Similar Question 8 (1 solution) : Transparency in boxplot legend keys using R and ggplot2
