
Data_analysis_exmp


toothgrowth_report

HUNG HUO-SU

November 21, 2015

1. Introduction

The ToothGrowth data set records tooth length in groups of 10 guinea pigs given various doses of vitamin C (0.5, 1 and 2 mg) by one of two delivery methods (orange juice, coded OJ, or ascorbic acid, coded VC). In this report we analyze how the delivery method affects tooth growth. We use t-tests to check whether the data support our hypothesis.

data("ToothGrowth")
library(ggplot2)

# Different delivery methods ("supp") appear to affect tooth length differently,
# so we split the data by "supp".
tooth_vc <- ToothGrowth[ToothGrowth$supp == "VC", ]
tooth_oj <- ToothGrowth[ToothGrowth$supp == "OJ", ]

# Plot 'len' against 'dose' for the two delivery methods.
# Red points are 'OJ' and green points are 'VC'.
ggplot(ToothGrowth, aes(x = dose, y = len)) +
  geom_point(data = tooth_oj, colour = 'red', size = 3) +
  geom_point(data = tooth_vc, colour = 'green', size = 3)


The plot shows a clear difference between the two delivery methods at dose = 0.5 and dose = 1, but no clear difference at dose = 2.0.
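The visual impression from the plot can also be checked numerically. The following is a minimal sketch, not part of the original report, that tabulates the mean tooth length for each delivery method and dose using only base R; the names `mean_table` and `oj_minus_vc` are illustrative.

```r
data("ToothGrowth")

# Mean tooth length per delivery method (rows) and dose (columns)
mean_table <- with(ToothGrowth, tapply(len, list(supp = supp, dose = dose), mean))
print(mean_table)

# Difference in means (OJ minus VC) at each dose
oj_minus_vc <- mean_table["OJ", ] - mean_table["VC", ]
print(oj_minus_vc)
```

A large positive entry in `oj_minus_vc` corresponds to a dose where the red (OJ) points sit visibly above the green (VC) points in the plot.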

2. Analysis

Our hypothesis is that the difference in tooth growth between the two delivery methods is significant at a low dose but not at a high dose. We therefore run a separate t-test at each of these doses, with the null hypothesis H0: there is no difference in mean tooth length between the two delivery methods.

t <- t.test(len ~ supp, data = subset(ToothGrowth, dose == 0.5))
t

##
##  Welch Two Sample t-test
##
## data:  len by supp
## t = 3.1697, df = 14.969, p-value = 0.006359
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  1.719057 8.780943
## sample estimates:
## mean in group OJ mean in group VC
##            13.23             7.98

For dose = 0.5, the p-value is 0.006359, which is below 0.05, so we can reject H0. There is a difference between the two delivery methods at dose = 0.5, and OJ is better than VC (mean length 13.23 versus 7.98). Next, we run the same t-test for dose = 2.0:

t <- t.test(len ~ supp, data = subset(ToothGrowth, dose == 2.0))
t

##
##  Welch Two Sample t-test
##
## data:  len by supp
## t = -0.046136, df = 14.04, p-value = 0.9639
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -3.79807  3.63807
## sample estimates:
## mean in group OJ mean in group VC
##            26.06            26.14

For dose = 2.0, the 95% confidence interval is [-3.79807, 3.63807], which contains 0, and the p-value is 0.9639. We cannot reject H0.
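To show where the numbers printed by `t.test` come from, the dose = 0.5 test can be recomputed by hand. The sketch below is an illustration rather than part of the original analysis: it applies the standard Welch formulas (unequal variances, Welch-Satterthwaite degrees of freedom), and the variable names are my own.

```r
data("ToothGrowth")

low <- ToothGrowth[ToothGrowth$dose == 0.5, ]
oj <- low$len[low$supp == "OJ"]
vc <- low$len[low$supp == "VC"]

# Squared standard error of each group mean (variances are not pooled)
se2_oj <- var(oj) / length(oj)
se2_vc <- var(vc) / length(vc)
se_diff <- sqrt(se2_oj + se2_vc)

# Welch t statistic for the difference in means (OJ minus VC)
mean_diff <- mean(oj) - mean(vc)
t_stat <- mean_diff / se_diff

# Welch-Satterthwaite approximation of the degrees of freedom
df <- (se2_oj + se2_vc)^2 /
  (se2_oj^2 / (length(oj) - 1) + se2_vc^2 / (length(vc) - 1))

# Two-sided p-value and 95% confidence interval
p_value <- 2 * pt(-abs(t_stat), df)
ci <- mean_diff + c(-1, 1) * qt(0.975, df) * se_diff

print(c(t = t_stat, df = df, p_value = p_value))
print(ci)
```

The confidence interval and the p-value are two views of the same test: the 95% interval excludes 0 exactly when the two-sided p-value is below 0.05, which is why the report can argue from the p-value at dose = 0.5 and from the interval at dose = 2.0.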

3. Summary

After the analysis, it is reasonable to say that a difference between the two delivery methods exists only when the dose is low, where OJ is better. The difference is not evident when the dose is high.
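The analysis tests only dose = 0.5 and dose = 2.0. As a possible cross-check, the same Welch t-test could be run at every dose level in one pass, including dose = 1, which is not tested above. This is a hedged sketch; the `results` data frame and its column names are illustrative and do not appear in the original report.

```r
data("ToothGrowth")

doses <- sort(unique(ToothGrowth$dose))

# Run the Welch two-sample t-test (len by supp) separately for each dose
results <- do.call(rbind, lapply(doses, function(d) {
  dose_data <- ToothGrowth[ToothGrowth$dose == d, ]
  tt <- t.test(len ~ supp, data = dose_data)
  data.frame(
    dose    = d,
    t       = unname(tt$statistic),
    df      = unname(tt$parameter),
    p_value = tt$p.value,
    ci_low  = tt$conf.int[1],
    ci_high = tt$conf.int[2]
  )
}))

print(results)
```

Rows whose interval `[ci_low, ci_high]` excludes 0 are the doses at which H0 would be rejected at the 5% level.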