"Student's" t Test – Interactive tutorial

Choosing a t test – type

It really should not be difficult to select a type of t test, but many students are confused about the criteria nonetheless. There are several different kinds of t test, but the types offered by Excel are commonly used.

Is a t test valid for these data?

In a two-sample test, each of the two populations being compared should follow a normal distribution. For a paired test, it is the differences between the paired values that should be normally distributed. You can check this with a normality test, such as the Shapiro-Wilk or Kolmogorov-Smirnov test, or graphically using a normal quantile plot. Such tests and how to use them should be available online. I will note here that many investigators do not bother to test for normally distributed data, but merely assume such a distribution. The same is true for equal variances (see below). This practice is probably okay if the investigator tests the same kind of data taken from the same kinds of experiments, and if the assumptions were demonstrated to be true at some time in the past.
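As a rough illustration of the graphical check, the sketch below computes the points of a normal quantile plot using only the Python standard library. The function name, the sample values, and the plotting position (i - 0.5)/n are assumptions made for this example; other plotting positions are also in common use. If the data are approximately normal, the points should fall close to a straight line when plotted.

```python
from statistics import NormalDist

def normal_quantile_points(data):
    """Return (theoretical quantile, observed value) pairs for a normal quantile plot."""
    ordered = sorted(data)
    n = len(ordered)
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    # Assumed plotting position: (i - 0.5) / n for the i-th smallest value.
    return [
        (std_normal.inv_cdf((i - 0.5) / n), value)
        for i, value in enumerate(ordered, start=1)
    ]

# Hypothetical measurements, made up purely for illustration.
sample = [3.1, 2.4, 2.9, 3.6, 2.7]
points = normal_quantile_points(sample)
for quantile, value in points:
    print(f"{quantile:7.3f}  {value}")
```

Plotting the observed values against the theoretical quantiles and looking for systematic curvature is the visual counterpart of a formal normality test.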

Paired t test

We use a paired t test when we measure two responses on the same individual. For example, suppose that you measure blood pressure in 20 hypertensive patients before and after a treatment. If the treatment is effective, then we expect blood pressure to come down. In this example you will have two data points from each individual "statistical unit." Each data point from the "before" set can be paired with one and only one data point from the "after" set, namely the data point associated with the same patient.
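The arithmetic behind a paired test can be sketched as follows: take the difference within each pair, then compare the mean difference with its standard error. This is a minimal illustration, not the tutorial's own procedure; the function name and the blood pressure values are hypothetical, and the sketch stops at the t statistic and degrees of freedom rather than computing a p value.

```python
import math
import statistics

def paired_t(before, after):
    """Return (t statistic, degrees of freedom) for a paired t test."""
    if len(before) != len(after):
        raise ValueError("paired data sets must contain the same number of points")
    # One difference per individual: after minus before.
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1 denominator)
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1

# Hypothetical systolic pressures (mmHg) for five patients, made up for illustration.
before = [150, 160, 155, 170, 165]
after = [140, 152, 150, 160, 158]
t, df = paired_t(before, after)
print(f"t = {t:.3f}, df = {df}")
```

Because only the within-patient differences enter the calculation, variation between patients does not inflate the standard error.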

If you collect data from a control group of randomly selected individuals, such as cells, trees, people, lab mice, etc., and collect data from a different treatment group of randomly selected individuals, there is no basis for a paired test. As a simple rule, if the numbers of data points in the two data sets are unequal or could be unequal, then you cannot use a paired t test. Equal numbers alone do not justify pairing, however.

Pairing of data is very helpful because it can factor out variations from one individual to the next. However, it makes no sense to pair up data when there is no basis for it. We should also test whether the data meet the assumptions of a parametric test before publishing the results of any t test.

Two-sample t tests

The example used in this tutorial employed a two-sample equal variance t test. It is a two-sample test because we took data from two different populations. There is no unique relationship between any data point in set one and a data point in set two. We also refer to such samples as independent samples and the two-sample test as a test for independent data. This is the kind of t test you are most likely to run. If you collect a data point from a number of cultures, specimens, cells, mice, people, or any other group of individuals and then do not collect data from those same individuals again, then you have collected an independent sample.
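For comparison with the paired case, here is a minimal sketch of the two-sample equal variance calculation, which pools the variances of the two independent samples. The function name and the data are hypothetical, and the sketch returns only the t statistic and degrees of freedom. Note that the two samples need not be the same size.

```python
import math
import statistics

def pooled_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a two-sample equal variance t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(sample_b)
    df = n_a + n_b - 2
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / std_error
    return t, df

# Hypothetical measurements from two independent groups, made up for illustration.
control = [10, 12, 14]
treated = [13, 15, 17, 19]
t, df = pooled_t(control, treated)
print(f"t = {t:.3f}, df = {df}")
```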

Should you use the test for equal or unequal variances? If you have equal numbers of data points, or the numbers are nearly the same, then you should be able to safely use the two-sample test for equal variances. Otherwise, check whether the variances are equal using Levene's test, Bartlett's test, or the Brown-Forsythe test, or graphically by comparing the slopes of the normal quantile plots for the two data sets. If the variances differ, use the test for unequal variances.
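The unequal variance version is commonly known as Welch's t test. The sketch below shows how it differs from the pooled calculation: each sample keeps its own variance, and the degrees of freedom are approximated with the Welch-Satterthwaite formula, so they are usually not a whole number. The function name and data are hypothetical, and this is an illustration under those assumptions rather than a description of what any particular spreadsheet does internally.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard error contributed by each sample, kept separate (not pooled).
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df

# Hypothetical measurements from two independent groups, made up for illustration.
control = [10, 12, 14]
treated = [13, 15, 17, 19]
t, df = welch_t(control, treated)
print(f"t = {t:.3f}, df = {df:.2f}")
```

When the two sample variances and sample sizes are similar, the result is close to that of the equal variance test.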

Question

What type of t test will you run if you have limited time, so you measure heart rate in a group of 20 untreated animals and another group of 20 animals treated with the drug?

Reference
http://en.wikipedia.org/wiki/Student%27s_t-test