Caveats first. Using appropriate statistics is not always easy. Please do not blame me when a reviewer caustically refers to the 56 uncorrected t-tests that you performed on your data as the work of a moron. The techniques explained here will probably be adequate for univariate experiments and confirmatory tests of group equality. This is not a statistics book.
Using the infert data set and the brkdn() function, let's look at the mean age of cases and non-cases.
> brkdn(age~case,infert)
                 0        1
Mean      31.49091 31.53012
Variance  27.60510 27.86189
n        165.00000 83.00000
attr(,"class")
[1] "dstat"
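For readers who do not have brkdn() to hand, roughly the same table can be put together with base R alone. This is only a sketch of an equivalent summary, not the brkdn() function itself; the name age_summary is made up for the example.

```r
# Base-R summary of age by case status (0 = control, 1 = case).
# infert ships with R's built-in datasets package.
age_summary <- sapply(
  split(infert$age, infert$case),          # one vector of ages per group
  function(x) c(Mean = mean(x), Variance = var(x), n = length(x))
)
print(age_summary)
```

The result is a plain matrix with one column per group, so individual values can be picked out by name, for example age_summary["Mean", "1"].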
It looks as though the two groups have been age-matched. Try a t-test to see if there is a difference.
> t.test(subset(infert$age,infert$case == 0),
+ subset(infert$age,infert$case == 1))

        Welch Two Sample t-test

data:  subset(infert$age, infert$case == 0) and subset(infert$age, infert$case == 1)
t = -0.0553, df = 163.766, p-value = 0.956
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -1.439600  1.361177
sample estimates:
mean of x mean of y
 31.49091  31.53012
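The same comparison can be written more compactly with the formula interface that t.test() provides, which avoids the two subset() calls. A minimal sketch, assuming the built-in infert data; the variable name res is arbitrary.

```r
# Welch two-sample t-test of age by case status, using a formula.
# case has exactly two values (0 and 1), so it can serve as the grouping variable.
res <- t.test(age ~ case, data = infert)
print(res)

# Individual pieces of the result are available by name.
res$statistic   # the t value
res$p.value     # the p-value
```

The test is the same Welch test as above; only the labels in the printed output differ, because the groups are named after the levels of case rather than "x" and "y".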
Bronwyn wants to know if those women in the sample who completed high school were significantly younger than those who did not.
> brkdn(age~education,infert)
           0-5yrs   6-11yrs   12+ yrs
Mean     35.25000  32.85000  29.72414
Variance 40.02273  28.66639  19.19280
n        12.00000 120.00000 116.00000
attr(,"class")
[1] "dstat"
She may have a case here, but first let's do something about the painful typing of every subsetting operation. Have a look at the function
By calling this function as follows:
we can specify a grouping factor of high school completion versus everything else. This function also allows us to test two specified groups against one another.
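A wrapper of this kind might look like the following sketch. The name two_group_t_test and its arguments are hypothetical, chosen for illustration, and this is not the function referred to above. It covers only the first use described, one level against everything else.

```r
# Hypothetical helper: the name and arguments are assumptions for illustration.
two_group_t_test <- function(y, group, level, ...) {
  # Label each observation as belonging to `level` or to everything else.
  other <- paste("not", level)
  g <- factor(ifelse(group == level, level, other), levels = c(other, level))
  # Extra arguments in ... are handed straight to t.test().
  t.test(y ~ g, ...)
}

# High school completers ("12+ yrs") versus all other education levels.
res <- two_group_t_test(infert$age, infert$education, "12+ yrs")
print(res)
```

Setting the factor levels explicitly fixes which group comes first, so the sign of the t statistic does not depend on how the labels happen to sort.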
Before we leave t.test(), note that the ellipsis (...) at the end of the function's arguments means that you can pass additional arguments through to the test itself. For example, you might want the 99% confidence interval displayed rather than the default 95% one.
        Welch Two Sample t-test

data:  age by as.factor(ifelse(infert$education != "12+ yrs", "<12 yrs", "12+ yrs"))
t = 5.3423, df = 243.997, p-value = 2.103e-07
alternative hypothesis: true difference in means is not equal to 0
99 percent confidence interval:
 1.718973 4.969115
sample estimates:
mean in group <12 yrs mean in group 12+ yrs
             33.06818              29.72414
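One way to get a 99% interval straight from t.test() is its conf.level argument. The sketch below rebuilds the two-group factor shown in the output above; the names high_school and res99 are made up for the example, and the call is not necessarily the one that produced that output.

```r
# Two groups: fewer than 12 years of education versus 12 or more.
# Levels are given explicitly so the group order does not depend on locale.
high_school <- factor(
  ifelse(infert$education != "12+ yrs", "<12 yrs", "12+ yrs"),
  levels = c("<12 yrs", "12+ yrs")
)

# conf.level sets the width of the reported confidence interval.
res99 <- t.test(infert$age ~ high_school, conf.level = 0.99)
print(res99)
res99$conf.int   # lower and upper bounds of the 99% interval
```

Only the interval changes with conf.level; the t statistic, degrees of freedom and p-value are the same as for the default 95% level.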
Looks like Bronwyn was right.
For a much more detailed treatment of ANOVAs and other methods, get the VR package or "Notes on the use of R..." in the Contributed documentation page.