
I can't find a way to make ggplot2 show an empty level in a boxplot without imputing my data frame with actual missing values. Here is reproducible code:

library(ggplot2)

# fake data: 50 measurements at each of 10 time points
dftest <- expand.grid(time=1:10,measure=1:50)
dftest$value <- rnorm(nrow(dftest),3+0.1*dftest$time,1)

# and let's suppose we didn't observe anything at time 2

# doesn't work even when forcing with factor(..., levels=...)
p <- ggplot(data=dftest[dftest$time!=2,],aes(x=factor(time,levels=1:10),y=value))
p + geom_boxplot()

# only way seems to have at least one actual missing value in the dataframe
dftest2 <- dftest
dftest2[dftest2$time==2,"value"] <- NA
p <- ggplot(data=dftest2,aes(x=factor(time),y=value))
p + geom_boxplot()

So I guess I'm missing something. This is not a problem in a balanced experiment, where the missing data can be made explicit in the data frame. But with observational data, from a cohort for example, it means imputing the data with missing values for every unobserved combination. Thanks for your help.
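One possible approach, sketched here under the assumption that the unobserved level only needs to appear on the axis: keep `time` as a factor with all ten levels and ask the discrete x scale not to drop unused ones via `scale_x_discrete(drop = FALSE)`. The name `observed` and the `set.seed()` call are illustrative additions, not part of the original code.

```r
library(ggplot2)

set.seed(1)  # illustrative seed so the fake data is reproducible
dftest <- expand.grid(time = 1:10, measure = 1:50)
dftest$value <- rnorm(nrow(dftest), 3 + 0.1 * dftest$time, 1)

# Nothing was observed at time 2, but the factor still declares all 10 levels
observed <- dftest[dftest$time != 2, ]
observed$time <- factor(observed$time, levels = 1:10)

# drop = FALSE keeps unused factor levels on the discrete axis,
# leaving an empty slot at time 2 without adding NA rows to the data
p <- ggplot(observed, aes(x = time, y = value)) +
  geom_boxplot() +
  scale_x_discrete(drop = FALSE)
p
```

No rows with missing values are added to the data frame; only the scale changes.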

This question follows from this other one. I was unable to implement answers there.

Define:

df2 <- data.frame(variable=rep(c("vnu.shr","vph.shr"),each=10),
        value=1:20)

Plot:

require(ggplot2)
qplot(variable,value, data=df2,geom="boxplot")+
geom_jitter(position=position_jitter(w=0.1,h=0.1))

I would like to have the boxplots in the reverse order (i.e. the one on the right moved to the left, and so on).

I have tried various ways of reordering the factors using levels, ordered, relevel, rev and so on, but I simply cannot seem to get the syntax right.
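A minimal sketch of one way to do this, assuming it is acceptable to modify the `variable` column itself: turn it into a factor whose levels are listed in the desired plotting order, since ggplot2 lays out a discrete axis in level order. The example uses `ggplot()` rather than `qplot()`, which is a substitution on my part.

```r
library(ggplot2)

df2 <- data.frame(variable = rep(c("vnu.shr", "vph.shr"), each = 10),
                  value = 1:20)

# Reverse the level order; the axis follows the order of the factor levels
df2$variable <- factor(df2$variable, levels = rev(c("vnu.shr", "vph.shr")))

p <- ggplot(df2, aes(x = variable, y = value)) +
  geom_boxplot() +
  geom_jitter(position = position_jitter(width = 0.1, height = 0.1))
p
```

If the data should stay untouched, adding `scale_x_discrete(limits = c("vph.shr", "vnu.shr"))` to the plot is an alternative that sets the order on the scale instead.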

I would like to create boxplots of multiple variables for groups of a continuous x-variable. The boxplots should be arranged next to each other for each group of x.

The data looks like this:

require (ggplot2)
require (plyr)
library(reshape2)

set.seed(1234)
x   <- rnorm(100)
y.1 <- rnorm(100)
y.2 <- rnorm(100)
y.3 <- rnorm(100)
y.4 <- rnorm(100)

df <- data.frame(x,y.1,y.2,y.3,y.4)

which I then melted

dfmelt <- melt(df, measure.vars=2:5)    

Using facet_wrap as shown in this solution (Multiple plots by factor in ggplot (facets)) gives me each variable in an individual plot, but I would like to have the boxplots of each variable next to each other for each bin of x in one diagram.

ggplot(dfmelt, aes(value, x, group = round_any(x, 0.5), fill=variable))+
geom_boxplot() + 
geom_jitter() + 
facet_wrap(~variable)

fig1

This shows the y-variables next to each other but does not bin x.

ggplot(dfmelt) +
geom_boxplot(aes(x=x,y=value,fill=variable))+
facet_grid(~variable)

fig2

Now I would like to produce such a plot for each bin of x.

What has to be changed or added?
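One way this might be done, as a sketch: compute the bin as its own factor column, map it to the x axis, and map `variable` to `fill` so that ggplot2 dodges the boxes within each bin. The column name `bin` is an illustrative choice, and `round(x / 0.5) * 0.5` is used as a base R stand-in for `round_any(x, 0.5)`.

```r
library(ggplot2)
library(reshape2)

set.seed(1234)
df <- data.frame(x = rnorm(100), y.1 = rnorm(100), y.2 = rnorm(100),
                 y.3 = rnorm(100), y.4 = rnorm(100))
dfmelt <- melt(df, measure.vars = 2:5)

# Round x to the nearest 0.5 and store the bin as a factor
# (base R equivalent of plyr::round_any(x, 0.5))
dfmelt$bin <- factor(round(dfmelt$x / 0.5) * 0.5)

# A discrete x plus a fill aesthetic gives one box per variable,
# placed side by side within each bin of x
p <- ggplot(dfmelt, aes(x = bin, y = value, fill = variable)) +
  geom_boxplot()
p
```

No faceting is needed here, because the grouping comes from the combination of `bin` and `variable`.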

I want to produce a grouped boxplot, so first I modified a piece of code I found on the internet (http://www.r-bloggers.com/ggplot2-multiple-boxplots-with-metadata/) to generate a dataframe of test values:

Y <- data.frame(
  values = c(rnorm(mean=20, sd=4, n=3), rnorm(mean=10, sd=2, n=3),
             rnorm(mean=50, sd=10, n=3), rnorm(mean=60, sd=12, n=3)),
  factor1 = rep(c('oil1', 'oil2'), each = 3),
  factor2 = rep(c('product1', 'product2'), each = 6)
)

      values factor1  factor2
1  13.527314    oil1 product1
2  23.495898    oil1 product1
3  14.881210    oil1 product1
4   9.110103    oil2 product1
5   9.330372    oil2 product1
6  10.846560    oil2 product1
7  40.786020    oil1 product2
8  43.157393    oil1 product2
9  43.050182    oil1 product2
10 39.588651    oil2 product2
11 65.963630    oil2 product2
12 63.425253    oil2 product2

Then, the code:

ggplot(Y, aes(x = factor2, y = values, fill = factor1)) +
  geom_boxplot()

produces the boxplot I want.

My real data are in this other data frame, created reading a .csv file:

   values factor1  factor2
1     0.2    oil1 product1
2     1.7    oil1 product1
3     3.2    oil1 product1
4    27.8    oil2 product1
5    29.8    oil2 product1
6    31.8    oil2 product1
7       0    oil1 product2
8       1    oil1 product2
9     2.5    oil1 product2
10   29.3    oil2 product2
11   31.3    oil2 product2
12   33.3    oil2 product2

Yet when I try to create a boxplot from this data frame using the code above, the plot contains horizontal segments at each y=value instead of boxes.

How can I resolve this problem?
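One thing worth checking, offered as a guess rather than a diagnosis: if the `values` column came out of the .csv as a factor or character vector instead of numbers, ggplot2 treats y as discrete, which can produce flat segments like these. The sketch below rebuilds the data with `values` deliberately stored as text (an assumption, and `real` is a made-up name), inspects it with `str()`, and converts it before plotting.

```r
library(ggplot2)

# Hypothetical reconstruction of the imported data, with values stored as
# text turned into a factor (assumed cause, not confirmed by the question)
real <- data.frame(
  values = c("0.2", "1.7", "3.2", "27.8", "29.8", "31.8",
             "0", "1", "2.5", "29.3", "31.3", "33.3"),
  factor1 = rep(rep(c("oil1", "oil2"), each = 3), times = 2),
  factor2 = rep(c("product1", "product2"), each = 6),
  stringsAsFactors = TRUE
)

str(real)  # values shows up as a Factor here, not num

# Go through as.character() first; as.numeric() on a factor alone
# would return the internal level codes instead of the numbers
real$values <- as.numeric(as.character(real$values))

p <- ggplot(real, aes(x = factor2, y = values, fill = factor1)) +
  geom_boxplot()
p
```

If `str()` on the real data frame already reports `num` for `values`, this is not the cause and the problem lies elsewhere.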
