
I have a CSV file of weights recorded every day for six months (August 2016 - January 2017). I would like to draw a boxplot for each month that essentially shows the summary() of that month's data. I would like to use ggplot2 for it, since its output looks much nicer. I've searched for a solution and found many, but none that seems to do what I want.

The head and summary of the data:

> wts <- read.csv('weights.csv', header=T, sep=',')
> head(wts)
  August.2016 September.2016 October.2016 November.2016 December.2016 January.2016
1       254.2          250.0        248.2         245.8         245.6        244.4
2       252.6          249.2        248.6         246.4         246.0        245.0
3       251.8          250.6        249.2         248.0         246.4        244.3
4       253.2          252.4        249.8         247.5         246.0        243.6
5       252.2          250.6        248.8         247.0         246.0        242.6
6       254.0          251.0        247.8         247.6         246.0        242.0
> summary(wts)
  August.2016    September.2016   October.2016   November.2016   December.2016    January.2016  
 Min.   :249.6   Min.   :245.6   Min.   :245.4   Min.   :244.2   Min.   :243.4   Min.   :241.6  
 1st Qu.:252.2   1st Qu.:248.3   1st Qu.:246.7   1st Qu.:246.2   1st Qu.:244.8   1st Qu.:242.9  
 Median :252.8   Median :249.2   Median :247.8   Median :246.6   Median :245.6   Median :243.6  
 Mean   :252.7   Mean   :249.1   Mean   :247.6   Mean   :246.7   Mean   :245.3   Mean   :243.5  
 3rd Qu.:253.6   3rd Qu.:250.0   3rd Qu.:248.2   3rd Qu.:247.2   3rd Qu.:246.0   3rd Qu.:244.3  
 Max.   :255.2   Max.   :252.4   Max.   :249.8   Max.   :248.6   Max.   :247.0   Max.   :245.0  
                 NA's   :1                       NA's   :1                       NA's   :1  

From what I've gathered, I need to reshape the data into a form that ggplot likes, but I'm not sure how to do it. I would also like to highlight the mean (with the actual number) on the boxplot, if that is possible. Could I get an idea of how to do it?

Thanks
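
One possible approach, sketched here under a few assumptions: reshape the wide table (one column per month) into long form (one row per measurement), then overlay the monthly means as points and text labels. The sketch uses only the six rows shown by head(wts) in place of the real CSV, keeps the column names exactly as printed, and the names `long`, `weight`, `month`, and `means` are illustrative choices, not anything from the original data.

```r
library(ggplot2)

# First six rows from head(wts) above; the real data would come from read.csv()
wts <- data.frame(
  August.2016    = c(254.2, 252.6, 251.8, 253.2, 252.2, 254.0),
  September.2016 = c(250.0, 249.2, 250.6, 252.4, 250.6, 251.0),
  October.2016   = c(248.2, 248.6, 249.2, 249.8, 248.8, 247.8),
  November.2016  = c(245.8, 246.4, 248.0, 247.5, 247.0, 247.6),
  December.2016  = c(245.6, 246.0, 246.4, 246.0, 246.0, 246.0),
  January.2016   = c(244.4, 245.0, 244.3, 243.6, 242.6, 242.0)
)

# Wide -> long: stack() returns the values plus a factor of column names,
# with factor levels in the original column order
long <- stack(wts)
names(long) <- c("weight", "month")
long <- long[!is.na(long$weight), ]  # drop the NA padding of shorter months

# Monthly means, used for the highlighted points and their labels
means <- aggregate(weight ~ month, data = long, FUN = mean)

p <- ggplot(long, aes(x = month, y = weight)) +
  geom_boxplot() +
  geom_point(data = means, shape = 18, size = 3, colour = "red") +
  geom_text(data = means, aes(label = round(weight, 1)), vjust = -1)
print(p)
```

Because stack() keeps the months in column order, the boxes should appear chronologically rather than alphabetically. tidyr's pivot_longer() is another common way to do the same reshape.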

Do you have any idea how to apply jittering only to the outliers of a boxplot? This is the code:

ggplot(data = a, aes(x = "", y = V8)) +                    # refer to the column by name inside aes(), not a$V8
  geom_boxplot(outlier.size = 0.5) +
  geom_point(data = a[54, ], colour = "red", size = 3) +   # highlight observation 54
  theme_bw() +
  coord_flip()

thank you!!
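
A minimal sketch of one way to do this, assuming the usual 1.5 * IQR rule for outliers: hide the boxplot's own outlier points and draw a separate jittered layer from a subset that contains only the outliers. The data frame below is a made-up stand-in for `a`, and `out_vals` and `outliers` are illustrative names.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the asker's data frame `a` with a numeric column V8
a <- data.frame(V8 = c(rnorm(100), 6, 6.2, -5, -5.5))

# Values lying more than 1.5 * IQR beyond the hinges, per boxplot.stats()
out_vals <- boxplot.stats(a$V8)$out
outliers <- a[a$V8 %in% out_vals, , drop = FALSE]

p <- ggplot(a, aes(x = "", y = V8)) +
  geom_boxplot(outlier.shape = NA) +  # suppress the default outlier points
  geom_jitter(data = outliers, width = 0.2, height = 0, size = 0.5) +
  theme_bw() +
  coord_flip()
print(p)
```

Setting height = 0 keeps the jitter perpendicular to the value axis, so the plotted values themselves are not changed. Note that boxplot.stats() and geom_boxplot() compute the hinges slightly differently, so borderline points may be classified differently in small samples.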

Similar Question 2 : ggplot2 boxplot

I am trying to draw a boxplot using ggplot2. The sample data looks like this:

> sampe

count genotype
71       mt
50       mt
71       mt
95       wt
60       mt
63       mt
75       mt
82       wt
93       wt
87       wt
61       mt
102       wt
60       mt
78       wt
78       wt
87       wt
84       wt
104       wt
81       wt
85       mt


> qplot(factor(genotype),count,data=sampe,geom="boxplot")

The above command produces a plot like this: (image not preserved)

What's wrong here? Why is it plotting like this? Even the code below produces the same output.

ggplot(sampe,aes(x=factor(genotype),y=count))+geom_boxplot()
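
Since the plot image is not available, the cause can only be guessed at. One common culprit, assumed here purely for illustration, is that `count` was imported as a factor or character column rather than a number, which str() would reveal. The sketch below reproduces that situation with the first ten rows of the sample data and converts the column back.

```r
library(ggplot2)

# Hypothetical reproduction: count stored as text and turned into a factor,
# as can happen after a faulty import
sampe <- data.frame(
  count    = c("71", "50", "71", "95", "60", "63", "75", "82", "93", "87"),
  genotype = c("mt", "mt", "mt", "wt", "mt", "mt", "mt", "wt", "wt", "wt"),
  stringsAsFactors = TRUE
)
str(sampe)  # count is listed as a Factor, not num or int

# Convert via as.character() so the original values are recovered
# rather than the internal factor codes
sampe$count <- as.numeric(as.character(sampe$count))

p <- ggplot(sampe, aes(x = genotype, y = count)) +
  geom_boxplot()
print(p)
```

If str(sampe) already reports `count` as numeric in the real data, this assumption does not apply and the cause lies elsewhere.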

Similar Question 3 : Boxplot with ggplot2

I am working on a boxplot of forecasts and observations from quite a long dataset. I am providing a sample in the same format here.

forecasts <- data.frame(f_type = c(rep("A", 9), rep("B", 9)), 
                        Date = c(rep(as.Date("2007-01-31"),3), rep(as.Date("2007-02-28"), 3), rep(as.Date("2007-03-31"), 3), rep(as.Date("2007-01-31"), 3), rep(as.Date("2007-02-28"), 3), rep(as.Date("2007-03-31"), 3)), 
                        value = c(10, 50, 60, 05, 90, 20, 30, 46, 39, 69, 82, 48, 65, 99, 75, 15 ,49, 27))

observation <- data.frame(Dt = c(as.Date("2007-01-31"), as.Date("2007-02-28"), as.Date("2007-03-31")), 
                          obs = c(30,49,57))

So far I have:

ggplot() + 
    geom_boxplot(data = forecasts,
                 aes(x = as.factor(Date), y = value, 
                     group = interaction(Date, f_type), fill = f_type)) +  
    geom_line(data = observation,
              aes(x = as.factor(Dt), y = obs, group = 1), 
              size = 2)

With this, the box and whisker extents are set by the defaults. I want to set these values myself so that I know exactly how far the whiskers extend. I have tried passing a function to stat_summary, like this:

f <- function(x) {
    r <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95))
    names(r) <- c("ymin", "lower", "middle", "upper", "ymax")
    r
}

o <- function(x) {
    subset(x, x < quantile(x,probs = 0.05) | quantile(x,probs = 0.95) < x)
}

ggplot(forecasts, aes(x = as.factor(Date), y = value)) + 
    stat_summary(fun.data = f, geom = "boxplot", aes(group = interaction(Date, f_type), fill = f_type)) +
    stat_summary(fun.y = o, geom = "point") 

But with this the groups are messed up: the boxes are drawn stacked on top of each other. Does anyone know how to accomplish this?
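
A possible explanation, offered as a sketch rather than a definitive diagnosis: geom_boxplot() dodges grouped boxes by default, whereas stat_summary() uses position = "identity", so the boxes for A and B at the same date are drawn at the same x position. Asking for dodging explicitly should separate them. The helper `f_df` below is a variant of `f` from the question that returns a one-row data frame; the name and the dodge width of 0.9 are assumptions. The outlier layer is left out to keep the example short.

```r
library(ggplot2)

# Same sample data as in the question
dates <- as.Date(c("2007-01-31", "2007-02-28", "2007-03-31"))
forecasts <- data.frame(
  f_type = rep(c("A", "B"), each = 9),
  Date   = rep(rep(dates, each = 3), times = 2),
  value  = c(10, 50, 60, 5, 90, 20, 30, 46, 39, 69, 82, 48, 65, 99, 75, 15, 49, 27)
)

# Whiskers at the 5th and 95th percentiles, box at the quartiles
f_df <- function(x) {
  q <- quantile(x, probs = c(0.05, 0.25, 0.5, 0.75, 0.95), names = FALSE)
  data.frame(ymin = q[1], lower = q[2], middle = q[3], upper = q[4], ymax = q[5])
}

# fill = f_type in the top-level aes() makes each Date / f_type pair its own group
p <- ggplot(forecasts, aes(x = as.factor(Date), y = value, fill = f_type)) +
  stat_summary(fun.data = f_df, geom = "boxplot",
               position = position_dodge(width = 0.9))
print(p)
```

An outlier point layer added with another stat_summary() call would need the same grouping and the same position_dodge() so that the points line up with their boxes.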

Similar Question 4 (3 solutions) : annotate boxplot in ggplot2

Similar Question 5 (1 solutions) : ggplot2 width of boxplot

Similar Question 7 (1 solutions) : R ggplot2 grouped boxplot of TCGA expression data

Similar Question 8 (1 solutions) : Boxplot of table using ggplot2

Similar Question 9 (1 solutions) : Sorting a ggplot2 boxplot [duplicate]