Statistical Methods
E. coli concentration, a continuous response variable, was compared
between the two levels of a categorical independent variable (fenced vs.
unfenced treatment) using a version of Student's t test. A separate
t test was run for each year: 2004, 2005, 2006, and 2007.
Assumptions
The use of a t test requires that the data, at each treatment level, meet several assumptions. The data need to have:
- A normal distribution
- Equal variances
- Equal sample sizes
Prior to beginning the analysis, the data were visually examined to check
whether they met these assumptions. The assumption of equal sample size
was met (n=12 for each treatment, each year). Histograms, however,
showed non-normal distributions of E. coli concentrations
(Figures 9 & 10). Plots of residuals by treatment and year also
suggested unequal variances between treatments (Figure 11), although
Bartlett's test for homogeneity of variances did not reject equal
variances (p=0.6791).
Figure 11. Plot of residuals of E. coli concentrations by treatment and year.
Differences in variance appear as differences in the vertical spread
of the points at each treatment*year. Outliers create unequal variances
between treatments in two of the treatment*years. The large upper limit
of the y axis compresses the data points into clusters, but variances
still appear to differ somewhat between treatments, as indicated by the
red line.
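The homogeneity check described above can be sketched in a few lines. The following is a minimal illustration of Bartlett's test restricted to two groups, written with the Python standard library only; it is not the software used for the original analysis. The function name and the concentration values are invented for illustration. SciPy's scipy.stats.bartlett offers a general version for any number of groups.

```python
import math
import statistics

def bartlett_two_groups(a, b):
    """Bartlett's test for equal variances, restricted to two groups.

    Returns (statistic, p_value). With two groups the statistic is
    referred to a chi-square distribution with 1 degree of freedom.
    """
    groups = [a, b]
    k = len(groups)
    sizes = [len(g) for g in groups]
    n_total = sum(sizes)
    variances = [statistics.variance(g) for g in groups]  # sample variances (n - 1)
    pooled = sum((n - 1) * v for n, v in zip(sizes, variances)) / (n_total - k)
    numerator = (n_total - k) * math.log(pooled) - sum(
        (n - 1) * math.log(v) for n, v in zip(sizes, variances)
    )
    correction = 1 + (sum(1 / (n - 1) for n in sizes) - 1 / (n_total - k)) / (3 * (k - 1))
    stat = max(numerator / correction, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function for 1 degree of freedom: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical E. coli concentrations (n=12 per treatment); NOT the study data.
fenced = [35, 120, 88, 410, 56, 230, 75, 1600, 140, 60, 300, 95]
unfenced = [90, 310, 150, 2400, 500, 260, 130, 820, 4100, 180, 640, 220]

stat, p = bartlett_two_groups(fenced, unfenced)
print(f"Bartlett statistic = {stat:.3f}, p = {p:.4f}")
```

Bartlett's test is itself sensitive to departures from normality, so in practice it is usually read alongside residual plots such as Figure 11 rather than on its own.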
To allow the use of a parametric statistical test despite the known or
potential presence of unequal variances, I selected Welch's two-sample
t test, a version of the t test that handles unequal variances by
adjusting the degrees of freedom (5). Data distributions were brought
closer to normal using a log10 transformation of E. coli concentrations
(Figures 12 & 13). Pre- and post-transformation data are compared
in the two probability plots below (Figure 14).
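To make the degrees-of-freedom adjustment concrete, here is a minimal sketch of Welch's t statistic and the Welch-Satterthwaite degrees of freedom, applied to log10-transformed values. It uses only the Python standard library and stops short of a p-value, which requires the t distribution. The function name and the data are hypothetical and do not come from the study. SciPy's scipy.stats.ttest_ind with equal_var=False is one library route that also returns a p-value.

```python
import math
import statistics

def welch_t(a, b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    na, nb = len(a), len(b)
    # Squared standard error of each group mean, from its own sample variance
    se2_a = statistics.variance(a) / na
    se2_b = statistics.variance(b) / nb
    t = (statistics.mean(a) - statistics.mean(b)) / math.sqrt(se2_a + se2_b)
    # Adjusted degrees of freedom; equals na + nb - 2 only when variances and sizes match
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df

# Hypothetical E. coli concentrations for one year (n=12 per treatment); NOT the study data.
fenced = [35, 120, 88, 410, 56, 230, 75, 1600, 140, 60, 300, 95]
unfenced = [90, 310, 150, 2400, 500, 260, 130, 820, 4100, 180, 640, 220]

# log10 transformation; assumes every concentration is strictly positive
log_fenced = [math.log10(x) for x in fenced]
log_unfenced = [math.log10(x) for x in unfenced]

t, df = welch_t(log_fenced, log_unfenced)
print(f"Welch t = {t:.3f}, adjusted df = {df:.2f}")
```

With n=12 per treatment, the adjusted degrees of freedom fall between 11 and 22, moving toward 11 as the two variances diverge.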
Figure 14. Probability plots generated for untransformed and
log-transformed E. coli concentration data. With a normal distribution,
the plotted points should lie on a straight line of unit slope through
the origin; the log-transformed data on the left come closer to this line.
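The idea behind a probability plot can be sketched as follows: standardized, sorted observations are paired with the standard normal quantiles expected at the same ranks, and the closer the pairs lie to a straight line, the closer the data are to normal. This is an illustration using the Python standard library; the function names, the plotting positions (i - 0.5)/n, and the data are assumptions of the sketch, not details taken from Figure 14. SciPy's scipy.stats.probplot provides a ready-made version.

```python
import math
from statistics import NormalDist

def probability_plot_points(values):
    """Pair expected standard-normal quantiles with standardized, sorted observations."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    observed = sorted((v - mean) / sd for v in values)
    # Plotting positions (i - 0.5) / n are one common convention among several
    expected = [NormalDist().inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return expected, observed

def straightness(values):
    """Pearson correlation of the plotted points; values near 1 indicate a straight line."""
    x, y = probability_plot_points(values)
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical E. coli concentrations; NOT the study data.
concentrations = [35, 120, 88, 410, 56, 230, 75, 1600, 140, 60, 300, 95]

r_raw = straightness(concentrations)
r_log = straightness([math.log10(c) for c in concentrations])  # requires positive values
print(f"untransformed r = {r_raw:.3f}, log10-transformed r = {r_log:.3f}")
```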
References cited:
(5) Ruxton, GD. 2006. The unequal variance t-test is an underused
alternative to Student's t-test and the Mann-Whitney U test.
Behavioral Ecology 17: 688-690.