|
The "Other F-Tests" option is very powerful, but you have to know what you are doing in order to use it. It lets you do power analyses for any test based on the F-distribution that is not covered by the F-Test (ANOVA) item or the F-Test (MCR) item. Of course, you can also do power analyses for standard ANOVAs and MCRs using the "Other F-Tests" option, but it is usually much more convenient (and less error-prone) to use the options we provide for these standard cases directly. "Other F-Tests" is similar to the "Other t-Tests" item in that you can (in fact: must) specify the sample size and the degrees of freedom (both numerator and denominator) independently. Although this is important for a number of F-based tests, we think the two most important classes are multivariate analyses of variance (MANOVAs) and repeated measures designs.
In this section, we briefly sketch how you can use G*Power to perform power analyses for these types of tests. Before we begin, please note that, as with "Other t-Tests," you cannot do a priori power analyses directly, but you can of course do repeated post-hoc power analyses, adjusting N and (simultaneously!) the df's until you arrive at the power value you desire. |
|
For reasons given in Bredenkamp and Erdfelder (1985), Olson (1976) and Stevens (1979), we prefer the Pillai-Bartlett V criterion as a multivariate test statistic. It is well known that under H0 the transformed V statistic

F = [(V(h)/s(h)) / df1] / [(1 - V(h)/s(h)) / df2]

is approximately F(df1, df2) distributed, where V(h) is the Pillai-Bartlett V for the effect to be tested. V(h)/s(h) varies between 0 and 1 and can be regarded as a multivariate R2 or eta2.
A convenient measure for the multivariate effect size in the underlying population is

f2(mult) = (V(h)/s(h)) / (1 - V(h)/s(h)) = V(h) / (s(h) - V(h)),

where V(h) denotes the Pillai-Bartlett V in the underlying population, not in a particular sample. Pillai and Jayachandran (1967) published exact power tables for small values of f2(mult) and small values of p. Stevens (1980) reported computer simulation results for a larger range of effect sizes and p values. We have compared these power tables to the power values computed by means of G*Power's "Other F-Tests" option, assuming that, under H1, the F transformation of the V statistic is approximately noncentral F(df1, df2, lambda) distributed with numerator df1 = p * n(h) and denominator df2 = s(h) * (N-k-p+s(h)).
In general, we found quite good agreement, with perhaps a slight tendency to overestimate the power when using the proposed approximation. Nevertheless, the approximation may often be sufficiently precise. Note that the relation between sample size, effect size, and the noncentrality parameter lambda for MANOVAs is different from that for ANOVAs, where lambda = f2 * N.
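The effect size definition above, f2(mult) = V(h) / (s(h) - V(h)), can be turned around to see which population V(h) a given f2(mult) corresponds to. The following is only a small sketch; the function names are made up for this illustration and are not part of G*Power.

```python
def f2_mult_from_pillai(v_h: float, s_h: int) -> float:
    """f2(mult) = V(h) / (s(h) - V(h)), with V(h) the population Pillai-Bartlett V."""
    if not 0 <= v_h < s_h:
        raise ValueError("V(h) must satisfy 0 <= V(h) < s(h)")
    return v_h / (s_h - v_h)


def pillai_from_f2_mult(f2_mult: float, s_h: int) -> float:
    """Inverse relation, obtained by solving the formula above for V(h)."""
    if f2_mult < 0:
        raise ValueError("f2(mult) must be non-negative")
    return s_h * f2_mult / (1.0 + f2_mult)


# Hypothetical illustration: which V(h) corresponds to f2(mult) = .15 if s(h) = 2?
v = pillai_from_f2_mult(0.15, 2)
print("V(h) =", round(v, 4), " V(h)/s(h) =", round(v / 2, 4))
print("back-converted f2(mult) =", round(f2_mult_from_pillai(v, 2), 4))
```

Because V(h)/s(h) behaves like a multivariate R2, the ratio printed in the first line can be read as the proportion of explained multivariate variance implied by the chosen effect size.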
For a global MANOVA test we find that n(h) = k-1, and for special MANOVA tests we find that n(h) = the number of predictors of the effect to be tested. For instance, in a MANOVA based on an A x B design, A having a levels and B having b levels, we find n(h) = a-1 for the main effect of A, n(h) = b-1 for the main effect of B, and n(h) = (a-1)*(b-1) for the A x B interaction.
|
Assume that we have a k=3 group MANOVA design with a total sample size of 3 * 20 = 60 subjects, p = 2 dependent variables, and our effect size is f2(mult) = .15. This is how we calculate the power for this test: |
|
Select:
Type of Power Analysis: Post-hoc
Type of Test: Other F-Tests

Input:
Alpha: .05
Effect size f2: 0.1500
N: 120 (Note that we enter N = 2 * total sample size, not simply the plain total sample size, because of the way lambda is computed.)
Numerator DF: 4 (p * n(h) = 2 * 2 = 4)
Denominator DF: 114 (s(h) * (N-k-p+s(h)))

Result:
Power (1-beta): 0.9330
Critical F: F(4,114) = 2.4513
Lambda: 18.0000
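The entries in the table above can be reproduced with a few lines of arithmetic. This sketch rests on two assumptions that the text does not spell out: that s(h) = min(p, n(h)), and that the N entered into G*Power is s(h) times the total sample size (which matches "2 * total sample size" in this example). The function name and the returned keys are hypothetical.

```python
def manova_other_f_inputs(n_total: int, k: int, p: int, n_h: int, f2_mult: float) -> dict:
    """Values to enter under "Other F-Tests" for a MANOVA effect (sketch)."""
    s_h = min(p, n_h)                    # ASSUMPTION: s(h) = min(p, n(h))
    df1 = p * n_h                        # numerator df = p * n(h)
    df2 = s_h * (n_total - k - p + s_h)  # denominator df = s(h) * (N-k-p+s(h))
    n_entered = s_h * n_total            # ASSUMPTION: enter s(h) * total sample size as N
    lam = n_entered * f2_mult            # G*Power computes lambda = N * f2
    return {"N": n_entered, "df1": df1, "df2": df2, "lambda": lam}


# Global test in the k = 3 group design with 60 subjects and p = 2 variables:
# n(h) = k - 1 = 2 and f2(mult) = .15.
print(manova_other_f_inputs(n_total=60, k=3, p=2, n_h=2, f2_mult=0.15))
```

Note that the denominator df uses the plain total sample size (60), whereas the N slot receives the inflated value (120).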
|
To illustrate power analyses for the so-called univariate approach to repeated measures designs, we use an A x B design in which A is a between-subjects factor and B is a within-subject factor. Factors A and B have a and b levels, respectively. |
|
Assume that we have a = 2 levels of Factor A, b = 4 levels of Factor B, and 10 subjects in each group, so that N = 2 * 10 = 20.

Between-Subjects Effect

The test for the between-subjects main effect of Factor A has

numerator df = a - 1 = 2 - 1 = 1, and
denominator df = N - a = 20 - 2 = 18.

The power of the between-subjects effect depends on the number of repeated measures in our design, and on the correlation between the levels of the repeated measures. This can be seen when looking at the noncentrality parameter lambda for this case:

lambda = N * (m/(1+(m-1)*rho)) * f2

where N is the total number of subjects, m is the number of repeated measures (here m = b = 4), and rho is the correlation between the levels of the repeated measures. Obviously, if there is no repeated measures factor (i.e., m = 1), then the above equation reduces to

lambda = N * f2

which is just the noncentrality parameter G*Power uses in F-Tests (ANOVA).

Let us assume that we want to detect a "medium" effect according to Cohen's effect size conventions for ANOVA F-tests. Thus, f = .25 and therefore f2 = 0.0625. Next we assume that the correlation between the levels of the repeated measures Factor B is .75. As a consequence of the so-called sphericity assumption, we must assume that the correlation between all possible pairs of repeated measurements is identical. If sphericity is not given in our data, then we have a problem. We will deal with the sphericity problem below.

Given the above assumptions, the noncentrality parameter for our design is

lambda = N * (m/(1+(m-1)*rho)) * f2 = 20 * (4/(1+(4-1)*.75)) * 0.0625 = 20 * 0.0769 = 1.538

A technical point to be aware of is that G*Power computes the noncentrality parameter lambda as

lambda = N * f2

where f2 is the label of the effect size slot when you select "Other F-Tests". Therefore, we need to enter (m/(1+(m-1)*rho)) * f2 = 0.0769 as the effect size term to be used in our computations.
|
Select:
Type of Power Analysis: Post-hoc
Type of Test: Other F-Tests

Input:
Alpha: .05
Effect size f2: 0.0769
N: 20 (2 * 10 = 20)
Numerator DF: 1 (a - 1 = 2 - 1 = 1)
Denominator DF: 18 (N - a = 20 - 2 = 18)

Result:
Power (1-beta): 0.2170
Critical F: F(1,18) = 4.4139
Lambda: 1.5380

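The effect size term for the between-subjects test is easy to get wrong by hand, so here is a minimal sketch of the arithmetic. The function name is hypothetical; the formula is the one given above.

```python
def between_effect_term(f2: float, m: int, rho: float) -> float:
    """Effect size term for the between-subjects test: (m / (1 + (m-1)*rho)) * f2."""
    return (m / (1 + (m - 1) * rho)) * f2


N = 20  # total number of subjects (2 groups of 10)
term = between_effect_term(f2=0.0625, m=4, rho=0.75)

print("effect size term (unrounded):", term)
print("effect size term entered into G*Power:", round(term, 4))
print("lambda as G*Power computes it (N * entered term):", round(N * round(term, 4), 4))

# With m = 1 the term reduces to f2 itself, i.e. lambda = N * f2 as in F-Tests (ANOVA).
print("m = 1 check:", between_effect_term(0.0625, 1, 0.75))
```

The lambda shown in the table is based on the rounded term 0.0769; using the unrounded term changes lambda only in the fourth decimal place.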
Within-Subject Effect

The test for the within-subject main effect of Factor B has

numerator df = b - 1 = 4 - 1 = 3, and
denominator df = (N - a) * (b - 1) = 18 * 3 = 54.

The power of the within-subject effect depends on the correlation between the levels of the repeated measures. This can be seen when looking at the noncentrality parameter lambda for this case:

lambda = N * m * f2 /(1-rho)

where N is the total number of subjects, m is the number of repeated measures, and rho is the correlation between the levels of the repeated measures.

Let us assume that we want to detect an effect of the same size as before (i.e., f2 = 0.0625). The correlation between the levels of the repeated measures Factor B is .75 (see above). As a consequence of the so-called sphericity assumption, we must again assume that the correlation between all possible pairs of repeated measurements is identical. If it is not, then we have a problem. We will deal with this problem further on.

Given the above assumptions, the noncentrality parameter for our design is

lambda = N * m * f2 /(1-rho) = 20 * 1 = 20

As before, the technical point to be aware of is that G*Power computes the noncentrality parameter lambda as

lambda = N * f2

where f2 is the label of the effect size slot when you select "Other F-Tests". Therefore, we need to enter m * f2 /(1-rho) = 1 as the effect size term to be used in our computations. We can now proceed as before.
|
Select:
Type of Power Analysis: Post-hoc
Type of Test: Other F-Tests

Input:
Alpha: .05
Effect size f2: 1
N: 20 (2 * 10 = 20)
Numerator DF: 3 (b - 1 = 4 - 1 = 3)
Denominator DF: 54 ((N - a) * (b - 1))

Result:
Power (1-beta): 0.9646
Critical F: F(3,54) = 2.7758
Lambda: 20.0000

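The inputs for the within-subject test follow the same pattern. The sketch below collects them in one place; the function name and the returned keys are hypothetical, and the formulas are those of the within-subject example above.

```python
def within_subject_inputs(n_total: int, a: int, b: int, f2: float, rho: float) -> dict:
    """Values to enter under "Other F-Tests" for the within-subject main effect."""
    m = b                            # number of repeated measures
    term = m * f2 / (1 - rho)        # effect size term: m * f2 / (1 - rho)
    df1 = b - 1                      # numerator df
    df2 = (n_total - a) * (b - 1)    # denominator df
    return {"f2_term": term, "N": n_total, "df1": df1, "df2": df2,
            "lambda": n_total * term}  # G*Power computes lambda = N * f2


print(within_subject_inputs(n_total=20, a=2, b=4, f2=0.0625, rho=0.75))

# The same f2 with a lower correlation yields a much smaller lambda:
print(within_subject_inputs(n_total=20, a=2, b=4, f2=0.0625, rho=0.25)["lambda"])
```

The second call illustrates why the assumed correlation matters: the higher rho is, the larger the effect size term and hence the power of the within-subject test.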
Interaction of Between-Subjects and Within-Subject Effect

The procedure for the within-between interaction test is basically identical to the procedure for within-subject effects. The formulae for the degrees of freedom in this case are

numerator df = (a - 1) * (b - 1) = (2 - 1) * (4 - 1) = 3, and
denominator df = (N - a) * (b - 1) = 18 * 3 = 54.

The power of the interaction effect also depends on the correlation between the levels of the repeated measures. The noncentrality parameter lambda for this case is:

lambda = N * m * f2 /(1-rho)

where N is the total number of subjects, m is the number of repeated measures, and rho is the correlation between the levels of the repeated measures.

As before, the technical point to be aware of is that G*Power computes the noncentrality parameter lambda as

lambda = N * f2

where f2 is the label of the effect size slot when you select "Other F-Tests". Therefore, we need to enter m * f2 /(1-rho) as the effect size term to be used in our computations. We can now proceed as before.
Problems Resulting from the Sphericity Assumption

In the so-called univariate approach, we must assume that all repeated measures have equal variances and are correlated equally with each other. This is often referred to as the sphericity assumption. If sphericity is met, then analytic results for the power calculations of univariate repeated measures tests such as those illustrated above are available. Unfortunately, sphericity is a very strong assumption which is very likely violated in many situations (see O'Brien & Kaiser, 1985). For instance, if five levels of a repeated measures factor represent successive points in time, then it is almost certain that the correlation of the measures taken at the first and the second level is larger than the correlation between the first and the fifth level.

If sphericity is not met, then the tests of main effects and interactions involving the within-subject factors occur at an artificially increased Type I error rate because the resulting F values are artificially inflated. One way to react to this problem is to apply the corrected univariate tests, in which the Geisser-Greenhouse or the Huynh-Feldt estimate of epsilon is used to provide improved Type I error rates. Epsilon is 1 if sphericity is met, whereas without sphericity, we find that 1/n <= epsilon <= 1 (where n represents the size of the associated residual covariance matrix, e.g., n = k-1 for a within-subject main effect with k levels). In order to take violations of sphericity into account, both the numerator and the denominator degrees of freedom of the F test must be multiplied by epsilon, and the significance of the F ratio must be evaluated with the new degrees of freedom. The Geisser-Greenhouse epsilon tends to be relatively conservative, which is a property the Huynh-Feldt epsilon tries to correct.

How can we assess the power of corrected univariate tests?
Muller and Barton (1989) have proposed an approximation to the power of the Geisser-Greenhouse or Huynh-Feldt-corrected test. Following their approach, we compute

numerator df(c) = (numerator df) * (estimate of epsilon),
denominator df(c) = (denominator df) * (estimate of epsilon), and
lambda(c) = lambda * (estimate of epsilon).

Assume that sphericity is violated and we find that the estimate of epsilon = .6. What is the effect of this violation on the power of our within-subject test? Using the within-subject example above, we compute

numerator df(c) = 3 * .6 = 1.8,
denominator df(c) = 54 * .6 = 32.4, and
lambda(c) = 20 * .6 = 12.

We can now reevaluate the power of this test. Note that G*Power expects df values to be integers, which is why we need to enter numerator df = 2 and denominator df = 33. We also need to adjust what we enter as the effect size index in order to ensure proper calculation of lambda. More precisely, we enter 1 * 0.6 = 0.6 as the effect size index for our within-subject effect. In that way, we arrive at lambda = 20 * 0.6 = 12.
|
Select:
Type of Power Analysis: Post-hoc
Type of Test: Other F-Tests

Input:
Alpha: .05
Effect size f2: 0.6
N: 20 (2 * 10)
Numerator DF: 2 (3 * .6 = 1.8, rounded to 2)
Denominator DF: 33 (54 * .6 = 32.4, entered as 33)

Result:
Power (1-beta): 0.8506
Critical F: F(2,33) = 3.2849
Lambda: 12.0000

Thus, the power of the corrected test is clearly less than the power of the uncorrected test in which the sphericity problem is simply ignored.
In essence, if we insist on using the so-called univariate approach to repeated measures analyses, then we face a choice between two unattractive alternatives: Either we ignore the (non)sphericity problem (and accept that we commit an error by testing at an artificially increased Type I error rate), or we accept a reduction of the power of our statistical tests.
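The epsilon adjustment used in the corrected example above amounts to scaling three quantities by the same factor. A minimal sketch, with a hypothetical function name, assuming that the effect size term is scaled by epsilon so that G*Power's lambda = N * f2 comes out as lambda * epsilon:

```python
def epsilon_corrected_inputs(df1: float, df2: float, f2_term: float, epsilon: float) -> dict:
    """Scale both dfs and the effect size term by epsilon (Muller-Barton style approximation)."""
    if not 0 < epsilon <= 1:
        raise ValueError("epsilon must lie in (0, 1]")
    return {"df1_c": df1 * epsilon, "df2_c": df2 * epsilon, "f2_term_c": f2_term * epsilon}


N = 20
corrected = epsilon_corrected_inputs(df1=3, df2=54, f2_term=1.0, epsilon=0.6)

print("corrected numerator df:", round(corrected["df1_c"], 4))
print("corrected denominator df:", round(corrected["df2_c"], 4))
print("corrected effect size term:", corrected["f2_term_c"])
print("corrected lambda:", round(N * corrected["f2_term_c"], 4))
```

Since G*Power expects integer dfs, the corrected values still have to be rounded before they are entered. Rounding 32.4 to the nearest integer gives 32, whereas the worked example above enters 33; with dfs of this size the difference has little effect on the resulting power.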
|
Repeated measures designs may also be analyzed using a multivariate approach. One advantage of this approach is that MANOVAs do not require the sphericity assumption to be met (which appears to be violated quite often, see O'Brien & Kaiser, 1985). Using the MANOVA approach, we treat the levels of the within-subject factor as different dependent variables. The univariate A x B design discussed above thus is regarded as a multivariate design with between-subjects factor A and p = b dependent variables. Let us consider the same design as above, but from a multivariate perspective. |
Between-Subjects Effect

First, the result for the between-subjects effect is identical to the result we obtained for the univariate approach. We can therefore proceed quickly to the within-subject effect.

Within-Subject Effect

The F-test for the within-subject Factor B has

numerator df = b - 1 = 4 - 1 = 3, and
denominator df = s(h) * (N-k-p+s(h)) = 16,

where N is the number of participants.

The noncentrality parameter lambda for this case is identical to the one used in the univariate approach to repeated measures analyses (see Davidson, 1972):

lambda = N * m * f2 /(1-rho)

where N is the total number of subjects, m is the number of repeated measures, and rho is the correlation between the levels of the repeated measures.

We consider again an effect size of f2 = 0.0625 in the following example. Given rho = .75 (as before), we need to enter m * f2/(1-rho) = 1 as the effect size term to be used in our computations.
|
Select:
Type of Power Analysis: Post-hoc
Type of Test: Other F-Tests
Accuracy

Input:
Alpha: .05
Effect size f2: 1
N: 20 (2 * 10 = 20)
Numerator DF: 3 (b - 1 = 4 - 1 = 3)
Denominator DF: 16 (s(h) * (N-k-p+s(h)))

Result:
Power (1-beta): 0.9270
Critical F: F(3,16) = 3.2389
Lambda: 20.0000
|
Thus, the power for the multivariate approach (0.9270) is slightly smaller than that for the univariate approach (0.9646). However, this small advantage of the univariate approach is present if and only if the sphericity assumption is met. If not, the multivariate approach usually has more power (see O'Brien & Kaiser, 1985). In our example, the power of the corrected univariate test was 0.8506. As with the so-called univariate approach, interactions of within-subject and between-subjects factors are treated just like within-subject effects. |
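The power values compared above can be cross-checked outside G*Power. The following sketch evaluates the noncentral F distribution as a Poisson-weighted sum of incomplete beta functions, using only the Python standard library. It is not G*Power's own algorithm; the function names are hypothetical, and the series truncation and bisection bounds are pragmatic choices that should be adequate for the small lambdas used in this section.

```python
import math


def _betacf(a, b, x, max_iter=300, eps=3e-14, tiny=1e-300):
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (m * (b - m) * x / ((qam + m2) * (a + m2)),
                   -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            if abs(c) < tiny:
                c = tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def betainc_reg(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    bt = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                  + a * math.log(x) + b * math.log(1.0 - x))
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_cdf(f, df1, df2, lam=0.0, max_terms=1000):
    """CDF of the (non)central F distribution; lam = 0 gives the central F."""
    if f <= 0.0:
        return 0.0
    x = df1 * f / (df1 * f + df2)
    half = lam / 2.0
    weight = math.exp(-half)              # Poisson weight for j = 0
    total = 0.0
    for j in range(max_terms):
        total += weight * betainc_reg(df1 / 2.0 + j, df2 / 2.0, x)
        weight *= half / (j + 1)
        if weight < 1e-17 and j > half:   # remaining terms are negligible
            break
    return total


def f_critical(alpha, df1, df2):
    """Critical F value found by bisection on the central F CDF."""
    lo, hi = 0.0, 1000.0
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, df1, df2) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def f_power(alpha, df1, df2, lam):
    """Post-hoc power: P(F' > critical F) under noncentral F(df1, df2, lam)."""
    return 1.0 - f_cdf(f_critical(alpha, df1, df2), df1, df2, lam)


# (label, numerator df, denominator df, lambda) as entered in the examples above
for label, df1, df2, lam in [("univariate, within", 3, 54, 20.0),
                             ("univariate, corrected", 2, 33, 12.0),
                             ("multivariate, within", 3, 16, 20.0)]:
    print(label, "critical F:", round(f_critical(0.05, df1, df2), 4),
          "power:", round(f_power(0.05, df1, df2, lam), 4))
```

The printed critical F and power values can be compared with the Result entries of the corresponding tables. Small discrepancies in the last decimal place would not be surprising, given that different numerical methods are involved.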
|
|
|
Please report suggestions for improvements to Axel Buchner, Franz Faul, or Edgar Erdfelder.