2.6 - Pair-wise comparisons for Mock Jury data

by Mark Greenwood and Katharine Banner

In our previous work with the Mock Jury data, the overall ANOVA test provided only marginal evidence of some difference in the true means across the three groups, with a p-value of 0.067. Tukey's HSD does not require a small p-value from the overall F-test before you use it, but if you apply it in situations where the p-value is larger than your a priori significance level, you are unlikely to find any pairs that are detected as being different. Some statisticians suggest that you shouldn't employ follow-up tests such as Tukey's HSD when there is not sufficient evidence to reject the overall null hypothesis. For the sake of completeness, we can find the pair-wise comparison results at our typical 95% family-wise confidence level in this situation, with the three confidence intervals displayed in Figure 2-20.

> require(heplots)

> require(mosaic)

> data(MockJury)

> lm2=lm(Years~Attr,data=MockJury)

> require(multcomp)

> Tm2 <- glht(lm2, linfct = mcp(Attr = "Tukey"))

> confint(Tm2)

Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts

Fit: lm(formula = Years ~ Attr, data = MockJury)

Quantile = 2.3749

95% family-wise confidence level

Linear Hypotheses:
	Estimate	lwr	upr
Average - Beautiful == 0	-0.3596	-2.2968	1.5775
Unattractive - Beautiful == 0	1.4775	-0.4729	3.4278
Unattractive - Average == 0	1.8371	-0.1257	3.7999

> old.par <- par(mai=c(1.5,2.5,1,1)) #Makes room on the plot for the group names

> plot(Tm2)

> cld(Tm2)

Beautiful	Average	Unattractive
"a"	"a"	"a"

*Figure 2-20: Tukey's HSD confidence interval results at the 95% family-wise confidence level.*
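As a rough check on the printed output, the half-width of each interval and the standard error it implies can be recovered with base R. This is only a sketch that retypes the rounded numbers from the `confint(Tm2)` output above; the names `ci95` and `quantile95` are made up for illustration, and the results carry rounding error.

```r
# Estimates and 95% family-wise limits retyped from the confint(Tm2) output
ci95 <- data.frame(
  comparison = c("Average - Beautiful", "Unattractive - Beautiful",
                 "Unattractive - Average"),
  estimate = c(-0.3596, 1.4775, 1.8371),
  lwr = c(-2.2968, -0.4729, -0.1257),
  upr = c(1.5775, 3.4278, 3.7999)
)
quantile95 <- 2.3749  # "Quantile" line of the printed output

# Each interval is estimate -/+ quantile * SE, so half-width / quantile
# recovers the standard error of each difference (up to rounding)
ci95$half_width <- (ci95$upr - ci95$lwr) / 2
ci95$implied_se <- ci95$half_width / quantile95

# An interval that contains 0 means the pair is not detected as different
ci95$contains_zero <- ci95$lwr < 0 & ci95$upr > 0
print(ci95)
```

All three intervals contain 0, which agrees with every group receiving the same letter from `cld`.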

At the family-wise 5% significance level, there are no pairs that are detectably different, so they all get the same letter, "a". Now we will produce results for the reader who thought a 10% significance level was suitable for this application before seeing any of the results. We just need to change the confidence level or significance level used inside the functions that produce the CIs or tests. Note that the level option means different things in the two functions: in confint it is the family-wise confidence level (0.9 here), while in cld it is the family-wise significance level (0.1 here).

> confint(Tm2,level=0.9)

Simultaneous Confidence Intervals

Multiple Comparisons of Means: Tukey Contrasts

90% family-wise confidence level

	Estimate	lwr	upr
Average - Beautiful == 0	-0.3596	-2.0511	1.3318
Unattractive - Beautiful == 0	1.4775	-0.2255	3.1804
Unattractive - Average == 0	1.8371	0.1233	3.5510

> par(mai=c(1.5,2.5,1,1)) #Makes room on the plot for the group names; old.par already holds the original settings

> plot(confint(Tm2,level=.9))

> cld(Tm2,level=0.1)

Beautiful	Average	Unattractive
"ab"	"a"	"b"

*Figure 2-21: Tukey's HSD 90% family-wise confidence intervals.*
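The link between the two sets of intervals can be checked by hand. The sketch below retypes the rounded limits from both `confint` outputs (all variable names are illustrative, not part of multcomp) and assumes, as is the case for these simultaneous intervals, that each interval is the estimate plus or minus a multiplier times the same standard error. Under that assumption, the ratio of interval widths gives the ratio of multipliers, which recovers an approximate 90% quantile that the abbreviated output above does not show.

```r
# Limits retyped from the two confint() outputs (rounded, so approximate)
comparison <- c("Average - Beautiful", "Unattractive - Beautiful",
                "Unattractive - Average")
lwr95 <- c(-2.2968, -0.4729, -0.1257)
upr95 <- c(1.5775, 3.4278, 3.7999)
lwr90 <- c(-2.0511, -0.2255, 0.1233)
upr90 <- c(1.3318, 3.1804, 3.5510)

# A pair is detected as different when its interval excludes 0
detected90 <- lwr90 > 0 | upr90 < 0
names(detected90) <- comparison
print(detected90)

# The standard errors are the same in both tables, so the ratio of widths
# equals the ratio of the multipliers used to build the intervals
width_ratio <- (upr90 - lwr90) / (upr95 - lwr95)
implied_quantile90 <- width_ratio * 2.3749  # 2.3749 is the printed 95% quantile
print(round(implied_quantile90, 3))
```

Only the Unattractive - Average interval excludes 0, matching the letters reported by `cld(Tm2,level=0.1)`.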

With a family-wise 10% significance level and 90% confidence level, the Unattractive and Average picture groups are detected as being different, but the Average group is not detected as different from Beautiful, and Beautiful is not detected as different from Unattractive. This produces the "overlap" across sets of groups that was noted earlier: the Beautiful level is not detectably different from the levels in two different sets, so it gets two different letters.
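To see where the letters come from, the following base R sketch rebuilds them from the pattern of detected differences. It is a brute-force illustration, not the algorithm that `cld` in multcomp uses: it lists every set of groups in which no pair is detected as different, keeps the largest such sets, and gives each one a letter. The names `differ`, `compatible`, and `letters_by_group` are hypothetical.

```r
groups <- c("Beautiful", "Average", "Unattractive")

# TRUE marks a pair detected as different at the 10% family-wise level
differ <- matrix(FALSE, 3, 3, dimnames = list(groups, groups))
differ["Average", "Unattractive"] <- TRUE
differ["Unattractive", "Average"] <- TRUE

# Every non-empty subset of the groups, as a flat list of character vectors
subsets <- unlist(
  lapply(seq_along(groups), function(k) combn(groups, k, simplify = FALSE)),
  recursive = FALSE
)

# Keep subsets in which no pair of members is detected as different
compatible <- function(s) !any(differ[s, s, drop = FALSE])
ok <- Filter(compatible, subsets)

# Keep only the maximal sets (those not contained in a larger kept set)
is_maximal <- function(s) {
  !any(vapply(ok, function(t) length(t) > length(s) && all(s %in% t),
              logical(1)))
}
maximal <- Filter(is_maximal, ok)

# One letter per maximal set; a group collects the letters of its sets
letters_by_group <- vapply(groups, function(g) {
  member <- vapply(maximal, function(s) g %in% s, logical(1))
  paste0(letters[which(member)], collapse = "")
}, character(1))
print(letters_by_group)
```

With only three groups this enumeration is trivial; for many groups the number of subsets grows quickly, which is one reason dedicated algorithms are used in practice.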

The means in the beanplot (Figure 2-22) help to clarify some of the reasons for this set of results. The difference between Average and Unattractive is just barely detected, and the mean for Beautiful is between the other two, so it ends up not being detectably different from either one. This sort of overlap is actually a fairly common occurrence in these sorts of situations, so be prepared for a mixed set of letters for some levels.

> require(beanplot)

> beanplot(Years~Attr,data=MockJury,log="",col="white",method="jitter")

> text(c(1),c(5),"ab",col="blue",cex=2)

> text(c(2),c(4.8),"a",col="green",cex=2)

> text(c(3),c(6.5),"b",col="red",cex=2)

*Figure 2-22: Beanplot of sentences with compact letter display results from 10% family-wise significance level Tukey's HSD.*
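The ordering of the group means can also be read off the estimated differences without the raw data. This small sketch retypes two of the printed estimates and places each mean relative to the Beautiful mean, which is set to 0 purely for illustration; the variable names are made up.

```r
# Estimated differences retyped from the confint() output
avg_minus_beau <- -0.3596
unatt_minus_beau <- 1.4775

# Position of each mean relative to the Beautiful mean (arbitrarily set to 0)
relative_mean <- c(Beautiful = 0,
                   Average = avg_minus_beau,
                   Unattractive = unatt_minus_beau)
print(sort(relative_mean))

# Consistency check: this should reproduce the third printed estimate
unatt_minus_avg <- unname(relative_mean["Unattractive"] - relative_mean["Average"])
print(unatt_minus_avg)
```

Sorting puts Beautiful between Average and Unattractive, which is why it can share a letter with each of them while they differ from one another.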
