1.10 - Practice problems
Load the HELPrct data set from the mosaicData package. The HELP study was a clinical trial for adult inpatients recruited from a detoxification unit. Patients with no primary care physician were randomly assigned to receive either a multidisciplinary assessment and a brief motivational intervention or usual care, and various outcomes were observed. Two of the variables in the data set are sex, a factor with levels male and female, and daysanysub, the time (in days) to first use of any substance post-detox. We are interested in the difference in the mean number of days to first use of any substance post-detox between males and females. There are some missing responses. The following code produces favstats output that reports the missing values and then creates a data set of complete observations by applying the na.omit function, which removes any observations with missing values.
library(mosaic) #provides favstats
library(mosaicData) #provides the HELPrct data set
HELPrct2 <- HELPrct[, c("daysanysub", "sex")] #Just focus on two variables
HELPrct3 <- na.omit(HELPrct2) #Removes subjects with missing values
favstats(daysanysub ~ sex, data = HELPrct2)
favstats(daysanysub ~ sex, data = HELPrct3)
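As a cross-check on the missing-value counts that favstats reports, the same tally can be sketched in base R. The helper name count_missing_by_group is hypothetical (it is not part of mosaic), and the usage lines run only if mosaicData is installed.

```r
# Hypothetical helper (not from mosaic): number of NA responses in y
# for each level of the grouping variable g.
count_missing_by_group <- function(y, g) {
  tapply(is.na(y), g, sum)
}

# Usage on the original data, skipped when mosaicData is unavailable.
if (requireNamespace("mosaicData", quietly = TRUE)) {
  data("HELPrct", package = "mosaicData")
  print(count_missing_by_group(HELPrct$daysanysub, HELPrct$sex))
}
```

The counts should agree with the missing column in the favstats output for HELPrct2.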
1.1. Based on the results provided, how many observations were missing for males and for females? Missing values here likely mean that the subjects did not use any substances post-detox during the study. This is called censoring. What is the problem with the numerical summaries if the missing responses were all larger than the largest observation?
1.2. Make a beanplot and a boxplot of daysanysub ~ sex using the HELPrct3 data set created above. Compare the distributions, recommending parametric or nonparametric inferences.
1.3. Generate the permutation results and write out the 6+ steps of the hypothesis test, making sure to note the numerical value of the observed test statistic you are using. Include the scope of inference.
1.4. Interpret the p-value for these results.
1.5. Generate the parametric t.test results, report the test statistic and its distribution under the null hypothesis, and compare the p-value to the one obtained using the permutation approach.
1.6. Make and interpret a 95% bootstrap confidence interval for the difference in the means.
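The mechanics behind the permutation test and the bootstrap interval in the problems above can be sketched in base R. This is a minimal illustration under stated assumptions, not the required solution: the function names diff_means, perm_test, and boot_ci are hypothetical, the difference is taken as the second factor level minus the first, the bootstrap resamples within each group, and B = 1000 and the seed are arbitrary choices.

```r
# Hypothetical helper: mean of the second level of g minus mean of the first.
diff_means <- function(y, g) {
  g <- factor(g)
  m <- tapply(y, g, mean)
  unname(m[2] - m[1])
}

# Two-sided permutation test: shuffle the group labels B times and
# find the proportion of shuffled statistics at least as extreme as observed.
perm_test <- function(y, g, B = 1000) {
  t_obs <- diff_means(y, g)
  t_star <- replicate(B, diff_means(y, sample(g)))
  list(t_obs = t_obs, p_value = mean(abs(t_star) >= abs(t_obs)))
}

# Percentile bootstrap interval, resampling with replacement within each
# group (an assumption; resampling whole rows is another common choice).
boot_ci <- function(y, g, B = 1000, level = 0.95) {
  g <- factor(g)
  idx <- split(seq_along(y), g)
  t_star <- replicate(B, {
    i <- unlist(lapply(idx, function(ix) ix[sample.int(length(ix), replace = TRUE)]))
    diff_means(y[i], g[i])
  })
  alpha <- 1 - level
  quantile(t_star, c(alpha / 2, 1 - alpha / 2))
}

# Usage on the complete cases, skipped when mosaicData is unavailable.
if (requireNamespace("mosaicData", quietly = TRUE)) {
  data("HELPrct", package = "mosaicData")
  d <- na.omit(HELPrct[, c("daysanysub", "sex")])
  set.seed(1)  # arbitrary seed so the run is repeatable
  print(perm_test(d$daysanysub, d$sex))
  print(boot_ci(d$daysanysub, d$sex))
}
```

Results vary slightly from run to run unless a seed is set, and increasing B reduces that simulation variability at the cost of run time.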