SPSS Output 6.25
Calculating the Effect Size for Two-Way Repeated Measures ANOVA
In two-way repeated measures ANOVA each of the three effects (lighting, alcohol and the interaction) has its own error term. To calculate the effect size we need to look at SPSS Output 6.23 and get the value of the mean square for each effect (MSM), and the mean square for its associated error term (MSR). We can then place these values into the equation for the effect size on page 181.
For the effect of lighting, we look at SPSS Output 6.23 and find that the mean square for the experimental effect is 1993.92, and the mean square for the error term is 85.13.
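That equation isn't reproduced here, but assuming it is the mean-squares form consistent with the r values reported in the write-up below (with n = 26 participants, since the error degrees of freedom for lighting are 25), the calculation for lighting works out as:

$$r = \sqrt{\frac{MS_M - MS_R}{MS_M + (n - 1)MS_R}} = \sqrt{\frac{1993.92 - 85.13}{1993.92 + 25 \times 85.13}} = \sqrt{\frac{1908.79}{4122.17}} = .68$$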
Using the benchmarks for effect sizes, this value of r = .68 represents a large effect (it is above the threshold of .5). Therefore, the change in attractiveness of the mate due to lighting is a substantive finding.
Using the benchmarks for effect sizes, the effect of alcohol was very large, r = .89 (well above the threshold of .5, and close to 1). Therefore, the change in attractiveness of the mate due to alcohol is a substantive finding.
The interaction also has a large effect size, r = .67, so the combined effect of lighting and alcohol on the attractiveness of the selected mate was very substantial.
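Since F = MS_M/MS_R, dividing the top and bottom of that equation by MS_R gives an equivalent form that needs only the F-ratio (handy when the mean squares aren't to hand). As a check, it reproduces the values reported in the write-up below:

$$r = \sqrt{\frac{F - 1}{F + (n - 1)}}, \qquad r_{\text{alcohol}} = \sqrt{\frac{104.39 - 1}{104.39 + 25}} = .89, \qquad r_{\text{interaction}} = \sqrt{\frac{22.22 - 1}{22.22 + 25}} = .67$$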
Writing the Result for Two-Way Repeated Measures ANOVA
The write-up of a two-way repeated measures ANOVA is much the same as for the other two-way ANOVAs that we’ve come across: you have to report the details of the F-ratio and the degrees of freedom from which it was calculated for each of the three effects. The main difference is that you’ll need to look up different error degrees of freedom for each effect, because each effect has its own error term. We can report the three effects from this analysis as follows (check back to SPSS Output 6.23 to see where I got the degrees of freedom from):
The results show that the attractiveness of the mates selected was significantly lower when the lighting in the club was dim compared to when the lighting was bright, F(1, 25) = 23.42, p < .001, r = .68.
The main effect of alcohol on the attractiveness of mates selected was significant, F(3, 75) = 104.39, p < .001, r = .89. This indicated that when the lighting in the club was ignored, the attractiveness of the mates selected differed according to how much alcohol was drunk before the selection was made. Specifically, post hoc tests revealed that compared to a baseline of when no alcohol had been consumed, the attractiveness of selected mates was not different after two pints (p > .05), but was significantly lower after four and six pints (both ps < .001). The mean attractiveness after two pints was also significantly higher than after four pints and six pints (both ps < .001), and the mean attractiveness after four pints was significantly higher than after six pints (p < .001). To sum up, the beer-goggles effect seems to take effect after two pints have been consumed and has an increasing impact until six pints are consumed.
The lighting × alcohol interaction was significant, F(3, 75) = 22.22, p < .001, r = .67, indicating that the effect of alcohol on the attractiveness of the mates selected differed when lighting was dim compared to when it was bright. Contrasts on this interaction term revealed that when the difference in attractiveness ratings between dim and bright clubs was compared after no alcohol and after two pints had been drunk, there was no significant difference, F(1, 25) < 1. However, when the difference between dim and bright clubs after two pints was compared with the difference after four pints, a significant difference emerged, F(1, 25) = 24.75, p < .001. A final contrast revealed that the difference between dim and bright clubs after four pints compared to after six pints was not significant, F(1, 25) = 2.16, ns. To sum up, there was a significant interaction between the amount of alcohol drunk and the lighting in the club: the decline in the attractiveness of the selected mate seen after four pints (compared to after two) was significantly more pronounced when the lights were dim.
6.11 Analysis of Covariance (ANCOVA)
Often when we conduct an experiment, we know (from previous research) that some factors already have an influence over our dependent variable. For example, in any memory experiment we might want to be aware of the fact that memory gets worse as you get older: age could confound the results of our experiment. Variables such as these, which are not part of the main experimental manipulation but have an influence on the dependent variable, are known as covariates and they can be included in an ANOVA (see Field, 2000, Chapter 8). For example, in the previous example in which we looked at the effect of alcohol and lighting on the beer-goggles effect, we know that the effect of alcohol (i.e. how drunk you get) will depend on factors such as weight (big people can generally drink more before they get drunk, whereas little people like me only have to sniff a glass of wine and they’re staggering around saying ‘you’re my best friend in the whole world’ to a chair) and tolerance (people who drink regularly require more alcohol to get drunk). Therefore, to assess the true effect we might want to take account of these factors. Likewise, in our text message example (see page 201), we had two groups: one was encouraged to use text messages and the other was discouraged. However, there would have been individual differences in the number of text messages sent in the six-month period: people encouraged to use text messages will differ in the degree to which they use them, and those discouraged probably still sent some, and again would show individual differences. We would expect the number of text messages sent to relate to the drop in grammatical ability, and so we might want to control for this variable.

If these variables (covariates) are measured, then it is possible to control for the influence they have on the dependent variable by including them in the analysis. What, in effect, happens is a two-stage process: (1) we work out how much variance in the outcome can be explained by the covariate (so, if you imagine the total variance is a cake, then we remove a slice that represents the effect of the covariate); then (2) we look at what’s left over (we can’t look at the variance explained by the covariate any more because that slice of cake has been eaten and we certainly don’t want to vomit it back up!), and see how much of what’s left over is explained by the experimental manipulation. So, we start with 100% of variance; say the covariate can explain 20% of this, then we become interested in how much of the remaining 80% the experimental manipulation can explain. In short, we end up seeing the effect an independent variable has after removing the effect of the covariate – we control for (or partial out) the effect of the covariate (see Field, 2000, Chapters 7 and 8 for details; there’s also a short sketch of this idea after the list below). The purpose of including covariates in ANOVA is two-fold:
To reduce the error variance: The F-ratio in ANOVA compares the amount of variability explained by the experimental manipulation (MSM) against the variability that it cannot explain (MSR). What we hope is that the covariate explains different variance to the experiment, and so it explains some of the variance that was previously unexplained. This has the effect of reducing the unexplained variance (MSR becomes smaller), and so our F-ratio gets bigger. In real terms this means we get a much more sensitive measure of our experimental effect.
Elimination of confounds: As I’ve explained, in any experiment there may be unmeasured variables that confound the results (i.e. variables that vary systematically with the experimental manipulation). By measuring these variables and controlling for them in the analysis, we remove the bias they introduce.
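To make the slice-of-cake idea above concrete, here is a minimal sketch in Python rather than SPSS (the data and variable names are made up, not the chapter’s analysis). Entering the covariate first in a sequential ANOVA table means the group effect is assessed only on the variance the covariate left unexplained:

```python
# Minimal sketch of the ANCOVA idea using made-up data (not the book's
# SPSS analysis). Requires pandas and statsmodels.
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

# Hypothetical data: 'group' is the manipulation, 'cov' the covariate,
# 'y' the outcome.
df = pd.DataFrame({
    "group": ["a"] * 4 + ["b"] * 4,
    "cov":   [2, 4, 6, 8, 1, 3, 5, 7],
    "y":     [5, 7, 9, 11, 8, 10, 13, 15],
})

# ANOVA: the manipulation on its own.
anova = ols("y ~ C(group)", data=df).fit()
print(sm.stats.anova_lm(anova))

# ANCOVA: the covariate is entered first, so the sequential table assesses
# the group effect only on the variance the covariate left over.
ancova = ols("y ~ cov + C(group)", data=df).fit()
print(sm.stats.anova_lm(ancova))
```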
When we include covariates in ANOVA it becomes known as Analysis of Covariance (or ANCOVA for short). We can include covariates in any of the situations we’ve come across (independent t-test, dependent t-test, one-way independent, one-way repeated, two-way independent, two-way repeated and two-way mixed), and we can include one covariate or several in an analysis. For this example we’ll keep things simple and look at a two-group example (different participants) with only one covariate.
ANCOVA has the same basic assumptions as all of the parametric tests, but it has an additional one as well: the assumption of homogeneity of regression slopes. Put simply, in ANCOVA we assume that our covariate has some relationship with our outcome variable (in fact we assume they are correlated, so as scores on the covariate change, scores on the outcome change by a similar amount). Obviously, we hope that the effect the covariate has on the outcome is the same for all of the groups we test. This is the assumption of homogeneity of regression slopes: we assume that the relationship between the covariate and the outcome variable is the same in all of the groups we test. How this is tested is more complex than just ticking a button in SPSS, and so I’ll simply refer you to Field (2000, Section 8.1.4) if you want to know more. For now, just be aware that this assumption exists.
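For the curious, one common way to check this assumption outside SPSS (a hedged sketch with hypothetical data and column names) is to fit the group-by-covariate interaction and see whether it is significant:

```python
# Sketch of a homogeneity-of-regression-slopes check (hypothetical data;
# Field, 2000, Section 8.1.4 describes the proper SPSS route).
import pandas as pd
import statsmodels.api as sm
from statsmodels.formula.api import ols

df = pd.DataFrame({
    "group": ["a"] * 4 + ["b"] * 4,        # experimental groups
    "cov":   [2, 4, 6, 8, 1, 3, 5, 7],     # covariate
    "y":     [5, 7, 9, 11, 8, 10, 13, 15], # outcome
})

# 'C(group) * cov' expands to group, cov and their interaction; the
# sequential table tests the interaction last. A significant C(group):cov
# term means the covariate's slope differs across groups, i.e. the
# assumption is violated.
model = ols("y ~ C(group) * cov", data=df).fit()
print(sm.stats.anova_lm(model))
```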
Example: Stalking is a very disruptive and upsetting (for the person being stalked) experience in which someone (the stalker) constantly harasses or obsesses about another person. It can take many forms, from sending intensely disturbing letters threatening to boil your cat if you don’t reciprocate the stalker’s undeniable love for you, to literally following you around your local area in a desperate attempt to see which CD you buy on a Saturday (as if it would be anything other than Fugazi!). A psychologist, who’d had enough of being stalked by people, decided to try two different therapies on different groups of stalkers (25 stalkers in each group). He gave the first group of stalkers what he termed ‘cruel to be kind therapy’. This therapy was based on punishment for stalking behaviours; in short, every time the stalker followed him around, or sent him a letter, the psychologist attacked them with a cattle prod until they stopped their stalking behaviour. It was hoped that the stalkers would learn an aversive reaction to anything resembling stalking. The second therapy was ‘psychodyshamic therapy’, which was a recent development on Freud’s psychodynamic therapy that acknowledges what a sham this kind of treatment is (so, you could say it’s based on Fraudian theory!). The stalkers were hypnotized and regressed into their childhood, and the therapist would also discuss their penis (unless it was a woman, in which case they discussed her lack of penis), the penis of their father, their dog’s penis, the penis of the cat down the road, and anyone else’s penis that sprang to mind. At the end of therapy, the psychologist measured the number of hours in the week that the stalker spent stalking their prey. Now, the therapist believed that the success of therapy might well depend on how bad the problem was to begin with, so before therapy he measured the number of hours that the patient spent stalking (as an indicator of how much of a stalker the person was).
SPSS Output for ANCOVA
SPSS Output 6.26 shows (for illustrative purposes) the ANOVA table when the covariate is not included. It is clear from the significance value that there is no difference in the hours spent stalking after therapy for the two therapy groups (p is .074, which is greater than .05). You should note that the total amount of variation to be explained (SST) was 9118.00, of which the experimental manipulation accounted for 591.68 units (SSM), whilst 8526.32 were unexplained (SSR).
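As a quick sanity check on that table (with two groups of 25 stalkers, the degrees of freedom are dfM = 1 and dfR = 48):

$$SS_T = SS_M + SS_R = 591.68 + 8526.32 = 9118.00,$$
$$F = \frac{MS_M}{MS_R} = \frac{591.68/1}{8526.32/48} = \frac{591.68}{177.63} = 3.33$$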
Figure 6.14 shows a bar chart of the mean number of hours spent stalking after therapy. The normal means are shown, as well as the same means when the data are adjusted for the effect of the covariate. In this case the adjusted and unadjusted means are relatively similar; however, sometimes the covariate will have a more pronounced effect on the group means. The more noticeable difference that the covariate makes for these data is that it reduces the standard error of the group means (the error bars sticking out of the top are shorter for the covariate-adjusted means).
SPSS Output 6.26
SPSS Output 6.27 shows the unadjusted means (i.e. the normal means if we ignore the effect of the covariate). These are the same values plotted on the left-hand side of Figure 6.14. These results show that the time spent stalking after therapy was lower after cruel to be kind therapy. However, we know from SPSS Output 6.26 that this difference is non-significant. So, what happens when we consider the effect of the covariate (in this case the extent of the stalker’s problem before therapy)?
Before we get too carried away, we need to check the homogeneity of variance assumption (see page 159). SPSS Output 6.28 shows the results of Levene’s test, which is significant because the significance value is .01 (less than .05). This finding tells us that the variances across groups are different and the assumption has been broken. We could try to transform our data to rectify this problem (see page 176) or do a non-parametric test instead (see Chapter 7). If these options don’t work then we must report the significance of Levene’s test when we report the results – we might also be cautious about how we interpret them because the final ANCOVA will lack accuracy.
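Incidentally, if you ever need this check outside SPSS, Levene’s test is a one-liner in SciPy (a sketch with made-up scores, not the chapter’s data):

```python
# Levene's test for homogeneity of variance (made-up data for illustration).
from scipy import stats

cruel = [10, 12, 8, 15, 9, 11]    # hypothetical post-therapy hours, group 1
psycho = [20, 35, 5, 28, 40, 14]  # hypothetical post-therapy hours, group 2

W, p = stats.levene(cruel, psycho)
print(f"Levene's W = {W:.2f}, p = {p:.3f}")  # p < .05 suggests unequal variances
```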
Figure 6.14 Mean number of hours spent stalking after the two types of therapy (the normal group means are shown but also the same means adjusted for the effect of the covariate)
SPSS Output 6.27
SPSS Output 6.28
The format of the ANCOVA table (SPSS Output 6.29) is largely the same as without the covariate, except that there is an additional row of information about the covariate (the hours spent stalking before therapy commenced). Looking first at the significance values, it is clear that the covariate significantly predicts the dependent variable, so the hours spent stalking after therapy depend on the extent of the initial problem (i.e. the hours spent stalking before therapy). More interesting is that when the effect of initial stalking behaviour is removed, the effect of therapy becomes significant (p has gone down from .074 to .023, which is less than .05). The amount of variation accounted for by the model (SSM) has gone down to 480.27 units, which means that initial stalking behaviour actually explains some of the same variance that therapy accounted for in the initial ANOVA; however, it also explains a lot of the variance that was previously unexplained (the error variance SSR has decreased to 4111.72 – nearly half its original value). Notice that the total amount of variation (9118.00) has not changed; all that has changed is how that total variation is explained. Because the covariate explains some of the error variance, the mean square error (MSR) has gone down too (from 177.63 to 87.48), and the resulting F-ratio is therefore bigger than before. In fact it has gone up from 3.33 to 5.49, which explains why it is now significant.
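Those numbers hang together: the covariate uses up one error degree of freedom, so dfR drops from 48 to 47, and

$$MS_R = \frac{SS_R}{df_R} = \frac{4111.72}{47} = 87.48, \qquad F = \frac{MS_M}{MS_R} = \frac{480.27}{87.48} = 5.49$$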
SPSS Output 6.29
This example illustrates how ANCOVA can help us to exert stricter experimental control by taking account of confounding variables to give us a ‘purer’ measure of the effect of the experimental manipulation. Without taking account of the initial level of stalking behaviour we would have concluded that the different therapies had no effect on stalking behaviour, yet clearly they do. To interpret the main effect of therapy we need to look at adjusted means. Adjusted means are the group means adjusted for whatever effect the covariate has had. SPSS Output 6.30 shows the adjusted means for these data; note that these means are different from the unadjusted ones in SPSS Output 6.27 (both sets of means are compared in Figure 6.14). Although in this example the adjusted means are fairly similar to the unadjusted ones, this will not always be the case, so be sure to check the adjusted ones when you make your interpretation. There are only two groups being compared in this example, so we can conclude that the therapies had a significantly different effect on stalking behaviour; specifically, stalking behaviour was lower after the therapy involving the cattle prod than after psychodyshamic therapy.
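SPSS computes these for you, but conceptually each adjusted mean is the group mean slid along the pooled regression line to where it would fall if that group had scored the overall mean on the covariate (the notation here is mine, not the chapter’s):

$$\bar{Y}_g^{\text{adj}} = \bar{Y}_g - b\,(\bar{X}_g - \bar{X})$$

where b is the pooled regression slope of the outcome on the covariate, \(\bar{X}_g\) is a group’s mean covariate score and \(\bar{X}\) is the grand mean of the covariate.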
SPSS Output 6.30
In addition to the main effect, we need to interpret the covariate. In SPSS Output 6.29 we can see that the covariate has its own F-ratio, and this is significant (the significance value is given as .000 in the output, indicating a highly significant result, p < .001). To interpret this result you need to think about what would happen if we plotted the values of the dependent variable against the values of the covariate. Figure 6.15 shows such a graph for the time spent stalking after therapy (dependent variable) and the initial level of stalking (covariate). This graph shows that there is a positive relationship between the two variables: high scores on one variable correspond with high scores on the other, whereas low scores on one variable correspond with low scores on the other (see Field, 2000, Chapter 3). If we were to compute a correlation between these two variables we would find it significant. The F-ratio is just telling us the same thing as the correlation would: there is a significant relationship between initial levels of stalking and levels of stalking after therapy. The graph tells us that this relationship is positive.
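And if you wanted that correlation outside SPSS, a one-line check would do it (a sketch with made-up before and after hours, not the chapter’s data):

```python
# Pearson correlation between the covariate and the dependent variable
# (made-up data; the chapter's data live in SPSS).
from scipy import stats

before = [20, 35, 10, 28, 40, 15]  # hypothetical hours stalking pre-therapy
after = [12, 30, 8, 25, 33, 10]    # hypothetical hours stalking post-therapy

r, p = stats.pearsonr(before, after)
print(f"r = {r:.2f}, p = {p:.3f}")  # a positive r mirrors Figure 6.15
```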