Statistical Inference Course Project - Part 2: Basic Inferential Data Analysis Instructions
Omer Shechter
October 13, 2018

Overview

Analyze the ToothGrowth data in the R datasets package.

ToothGrowth {datasets}: The Effect of Vitamin C on Tooth Growth in Guinea Pigs

Description

The response is the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).

A data frame with 60 observations on 3 variables.

    [,1]  len   numeric  Tooth length
    [,2]  supp  factor   Supplement type (VC or OJ)
    [,3]  dose  numeric  Dose in milligrams/day

Data Analysis

This part covers loading the data and the initial data analysis.

Load the required libraries.

    library(ggplot2)
    library(datasets)
    library(UsingR)

    ## Loading required package: MASS
    ## Loading required package: HistData
    ## Loading required package: Hmisc
    ## Loading required package: lattice
    ## Loading required package: survival
    ## Loading required package: Formula
    ##
    ## Attaching package: 'Hmisc'
    ## The following objects are masked from 'package:base':
    ##
    ##     format.pval, units
    ##
    ## Attaching package: 'UsingR'
    ## The following object is masked from 'package:survival':
    ##
    ##     cancer

    library(kableExtra)

A preliminary review of the data.

    dim(ToothGrowth)

    ## [1] 60  3

    head(ToothGrowth)

    ##    len supp dose
    ## 1  4.2   VC  0.5
    ## 2 11.5   VC  0.5
    ## 3  7.3   VC  0.5
    ## 4  5.8   VC  0.5
    ## 5  6.4   VC  0.5
    ## 6 10.0   VC  0.5

Check how many samples there are for each group.

    table(ToothGrowth$supp)

    ##
    ## OJ VC
    ## 30 30

    table(ToothGrowth$dose)

    ##
    ## 0.5   1   2
    ##  20  20  20

The delivery methods and the dose levels are split evenly across the 60 guinea pigs: half received the vitamin via orange juice and half via ascorbic acid, and each of the three dose levels was given to 20 animals.
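The two one-way tables can also be combined to check the balance for every combination of delivery method and dose. The following is a minimal sketch and is not part of the original analysis; the names counts and cell_means are illustrative only. Each of the six cells should contain 10 animals.

```r
library(datasets)

# Cross-tabulate delivery method against dose to check that the design is balanced.
# (counts and cell_means are illustrative names, not from the original report.)
counts <- table(ToothGrowth$supp, ToothGrowth$dose)
print(counts)

# Mean tooth length for each supplement/dose combination,
# as a numeric companion to the box plots.
cell_means <- aggregate(len ~ supp + dose, data = ToothGrowth, FUN = mean)
print(cell_means)
```

The cross-tabulation works whether dose is still numeric or has already been converted to a factor.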
Plot some basic graphs to get a first view of the data.

Plot len against dose.

    ToothGrowth$dose <- as.factor(ToothGrowth$dose)
    theme_update(plot.title = element_text(hjust = 0.5))
    ggplot(ToothGrowth, aes(x = dose, y = len, group = dose)) +
      geom_boxplot(aes(fill = dose)) +
      ggtitle(" Len of Odontoblasts ~ Dose")

[Figure: box plots of len (y-axis) by dose (x-axis: 0.5, 1, 2), filled by dose. Title: "Len of Odontoblasts ~ Dose".]

Plot len against dose, split by delivery method.

    ggplot(ToothGrowth, aes(x = dose, y = len, group = dose)) +
      geom_boxplot(aes(fill = dose)) +
      ggtitle(" Len of Odontoblasts ~ Dose \n Partitioned by delivery methods ") +
      facet_grid(. ~ supp)

[Figure: box plots of len (y-axis) by dose (x-axis: 0.5, 1, 2), filled by dose, in two panels, OJ and VC. Title: "Len of Odontoblasts ~ Dose, Partitioned by delivery methods".]

Hypothesis and Confidence Interval

This section contains several hypothesis tests, each illustrated with a confidence interval.

Hypothesis I

Null hypothesis: the supplement type (VC or OJ) does not affect tooth length.

    H0: mean length for VC = mean length for OJ
    H1: mean length for VC != mean length for OJ

    t.test(ToothGrowth$len[ToothGrowth$supp == "OJ"],
           ToothGrowth$len[ToothGrowth$supp == "VC"],
           mu = 0, var.equal = FALSE, alternative = c("two.sided"))

    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  ToothGrowth$len[ToothGrowth$supp == "OJ"] and ToothGrowth$len[ToothGrowth$supp == "VC"]
    ## t = 1.9153, df = 55.309, p-value = 0.06063
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -0.1710156  7.5710156
    ## sample estimates:
    ## mean of x mean of y
    ##  20.66333  16.96333

The p-value of 0.06063 is greater than 0.05, and 0 lies inside the 95% confidence interval (-0.1710156, 7.5710156). The null hypothesis therefore cannot be rejected at the 5% level: the data do not show a significant difference between the two supplement types in their effect on tooth length.

Hypothesis II

Check the effect of the dose on tooth length.
The null hypothesis is that increasing the dose does not affect tooth length.

Compare doses 0.5 and 1.

    res <- t.test(ToothGrowth$len[ToothGrowth$dose == .5],
                  ToothGrowth$len[ToothGrowth$dose == 1],
                  mu = 0, var.equal = FALSE, alternative = c("two.sided"))
    P_Values <- res$p.value
    Conf_Intervals_Low <- res$conf.int[1]
    Conf_Intervals_High <- res$conf.int[2]
    res

    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  ToothGrowth$len[ToothGrowth$dose == 0.5] and ToothGrowth$len[ToothGrowth$dose == 1]
    ## t = -6.4766, df = 37.986, p-value = 1.268e-07
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -11.983781  -6.276219
    ## sample estimates:
    ## mean of x mean of y
    ##    10.605    19.735

Compare doses 1 and 2.

    res <- t.test(ToothGrowth$len[ToothGrowth$dose == 1],
                  ToothGrowth$len[ToothGrowth$dose == 2],
                  mu = 0, var.equal = FALSE, alternative = c("two.sided"))
    P_Values <- c(P_Values, res$p.value)
    Conf_Intervals_Low <- c(Conf_Intervals_Low, res$conf.int[1])
    Conf_Intervals_High <- c(Conf_Intervals_High, res$conf.int[2])
    res

    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  ToothGrowth$len[ToothGrowth$dose == 1] and ToothGrowth$len[ToothGrowth$dose == 2]
    ## t = -4.9005, df = 37.101, p-value = 1.906e-05
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -8.996481 -3.733519
    ## sample estimates:
    ## mean of x mean of y
    ##    19.735    26.100

Compare doses 0.5 and 2.

    res <- t.test(ToothGrowth$len[ToothGrowth$dose == .5],
                  ToothGrowth$len[ToothGrowth$dose == 2],
                  mu = 0, var.equal = FALSE, alternative = c("two.sided"))
    P_Values <- c(P_Values, res$p.value)
    Conf_Intervals_Low <- c(Conf_Intervals_Low, res$conf.int[1])
    Conf_Intervals_High <- c(Conf_Intervals_High, res$conf.int[2])
    res

    ##
    ##  Welch Two Sample t-test
    ##
    ## data:  ToothGrowth$len[ToothGrowth$dose == 0.5] and ToothGrowth$len[ToothGrowth$dose == 2]
    ## t = -11.799, df = 36.883, p-value = 4.398e-14
    ## alternative hypothesis: true difference in means is not equal to 0
    ## 95 percent confidence interval:
    ##  -18.15617 -12.83383
    ## sample estimates:
    ## mean of x mean of y
    ##    10.605    26.100

Present the results in a table.

    Dose_Comparison_Values <- c("0.5<->1.0", "1.0<->2.0", "0.5<->2")
    df <- data.frame(Dose_Comparison_Values)
    df <- cbind(df, P_Values)
    df <- cbind(df, Conf_Intervals_Low)
    df <- cbind(df, Conf_Intervals_High)
    kable(df) %>%
      kable_styling(bootstrap_options = "striped", full_width = F, position = "left")

    Dose_Comparison_Values   P_Values   Conf_Intervals_Low   Conf_Intervals_High
    0.5<->1.0                1.00e-07           -11.983781             -6.276219
    1.0<->2.0                1.91e-05            -8.996481             -3.733519
    0.5<->2                  0.00e+00           -18.156167            -12.833834

As the table shows, all three p-values are far below 0.05 and none of the confidence intervals contains 0, so the null hypothesis is rejected: increasing the dose affects tooth length. The p-value for 0.5<->2 appears as 0.00e+00 only because of rounding in the table; the test output reports 4.398e-14. The intervals are negative because each test subtracts the mean of the higher dose from the mean of the lower dose, so the higher dose gives the longer teeth.

Conclusions

1. No statistically significant difference was found between the two supplement types (VC or OJ) at the 5% level, so the data do not point to a preferred delivery method for tooth length.
2. The dose has a significant effect on tooth length: the mean length rises from 10.605 at 0.5 mg/day to 19.735 at 1 mg/day and 26.100 at 2 mg/day.
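The three dose comparisons repeat the same steps, so they could also be collected in one pass. The following is a minimal sketch under stated assumptions, not the report's own code: compare_doses, dose_pairs and dose_table are hypothetical names, and the pairs come out in the order produced by combn (0.5 vs 1, 0.5 vs 2, 1 vs 2) rather than the order used in the table. It uses the same Welch two-sample t-test settings as the calls shown earlier.

```r
library(datasets)

# All pairs of dose levels, one pair per column: (0.5, 1), (0.5, 2), (1, 2).
dose_pairs <- combn(c(0.5, 1, 2), 2)

# Hypothetical helper: run a Welch two-sample t-test for one pair of doses
# and return one row of the summary table.
compare_doses <- function(pair) {
  # as.character() makes the match work whether dose is numeric or a factor.
  dose_chr <- as.character(ToothGrowth$dose)
  x <- ToothGrowth$len[dose_chr == as.character(pair[1])]
  y <- ToothGrowth$len[dose_chr == as.character(pair[2])]
  res <- t.test(x, y, mu = 0, var.equal = FALSE, alternative = "two.sided")
  data.frame(
    comparison = paste(pair[1], "vs", pair[2]),
    p_value    = res$p.value,
    conf_low   = res$conf.int[1],
    conf_high  = res$conf.int[2]
  )
}

# Apply the helper to each column of dose_pairs and stack the rows.
rows <- lapply(seq_len(ncol(dose_pairs)),
               function(i) compare_doses(dose_pairs[, i]))
dose_table <- do.call(rbind, rows)

# Extra text column in scientific notation, so that very small p-values
# are not displayed as 0 when the table is printed.
dose_table$p_value_sci <- format(dose_table$p_value, scientific = TRUE, digits = 3)
print(dose_table)
```

Keeping the numeric p_value column next to the formatted one leaves the values usable for further computation while the printed table stays readable.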