Other F-Tests

The "Other F-Tests" option is very powerful, but you have to know what you are doing in order to use it.

It is provided to enable you to do power analyses for any test based on the F-distribution which is not covered by the F-Test (ANOVA) item and the F-Test (MCR) item. Of course, you can do power analyses for standard ANOVAs and MCRs using the "Other F-Tests" option, but it is usually much more convenient (and less error-prone) to use the options we provided for these standard cases directly.

"Other F-Tests" is similar to the "Other t-Tests" item in that you can (in fact: must) specify the sample size and the degrees of freedom (both numerator and denominator) independently. Although this is important for a number of F-based tests, we think the two most important classes are MANOVAs and repeated measures designs.

In this section, we briefly sketch how you can use G*Power to perform power analyses for these types of tests.

Before we begin, please note that, as with "Other t-Tests," you cannot do a priori power analyses directly, but you can of course do repeated post-hoc power analyses, adjusting N and (simultaneously!) the df's until you arrive at the power value you desire.

MANOVAs

For reasons given in Bredenkamp and Erdfelder (1985), Olson (1976) and Stevens (1979), we prefer the Pillai-Bartlett V criterion as a multivariate test statistic. It is well known that under H0 the transformed V statistic

F = [ (V(h)/s(h)) / df1 ] / [ (1 - V(h)/s(h)) / df2 ]

 

is approximately F(df1, df2) distributed, where

V(h) is the Pillai-Bartlett V for the effect to be tested,

s(h) = min(p, n(h)),

p = the number of dependent variables,

n(h) = the number of predictors for the effect to be tested,

df1 = p * n(h) (numerator degrees of freedom),

df2 = s(h) * (N-k-p+s(h)), and

N is the total number of subjects summed across all k groups of the design (see Pillai & Mijares, 1959; Olson, 1976).

V(h)/s(h) varies between 0 and 1 and can be regarded as a multivariate R2 or eta2.
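The definitions above amount to a few lines of arithmetic. The following Python sketch is not part of G*Power; the function name pillai_df and the example values are purely illustrative.

```python
def pillai_df(p, n_h, N, k):
    """Return (s_h, df1, df2) for the F transformation of Pillai-Bartlett V.

    p   : number of dependent variables
    n_h : number of predictors for the effect to be tested
    N   : total number of subjects across all groups
    k   : number of groups
    (Hypothetical helper, named here only for illustration.)
    """
    s_h = min(p, n_h)
    df1 = p * n_h
    df2 = s_h * (N - k - p + s_h)
    return s_h, df1, df2


# Illustrative values: p = 2, n(h) = 2, N = 60, k = 3
s_h, df1, df2 = pillai_df(p=2, n_h=2, N=60, k=3)
print("s(h) =", s_h, " df1 =", df1, " df2 =", df2)
```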

 

A convenient measure for the multivariate effect size in the underlying population is

 
f2(mult) = (V(h)/s(h)) / (1 - V(h)/s(h)) = V(h) / (s(h) - V(h)),

where V(h) denotes the Pillai-Bartlett V in the underlying population, not in a particular sample.
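As a quick check that the two forms of f2(mult) agree, the following Python sketch evaluates both. The function names and the value V(h) = 0.4 are assumptions chosen only for illustration.

```python
def f2_mult(V, s_h):
    """f2(mult) written as (V/s) / (1 - V/s)."""
    r = V / s_h              # V(h)/s(h), the multivariate R2 analogue
    return r / (1.0 - r)


def f2_mult_alt(V, s_h):
    """Algebraically equivalent form V / (s - V)."""
    return V / (s_h - V)


# Hypothetical population value: V(h) = 0.4 with s(h) = 2
print(f2_mult(0.4, 2), f2_mult_alt(0.4, 2))
```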

Pillai and Jayachandran (1967) published exact power tables for small values of f2(mult) and small values of p. Stevens (1980) reported computer simulation results for a larger range of effect sizes and p values. We have compared these power tables to the power values computed by means of G*Power's "Other F-Tests" option assuming that, under H1, the F transformation of the V statistic is approximately noncentral F(df1, df2, lambda) distributed with

numerator df1 = p * n(h),

denominator df2 = s(h) * (N-k-p+s(h)), and

the noncentrality parameter lambda = s(h) * N * f2(mult).

 

In general, we found quite good agreement, with perhaps a slight tendency to overestimate the power using the proposed approximation. Nevertheless, the approximation may often be sufficiently precise.

Note that the relation between sample size, effect size, and the noncentrality parameter lambda for MANOVAs is different from that for ANOVAs where lambda = f2 * N.
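The difference between the two noncentrality parameters can be sketched in Python as follows. The function names are hypothetical, and the values (s(h) = 2, N = 60, f2 = 0.15) are used only for illustration.

```python
def manova_lambda(s_h, N, f2_mult):
    """Noncentrality parameter for the Pillai-Bartlett approximation."""
    return s_h * N * f2_mult


def anova_lambda(N, f2):
    """Noncentrality parameter for a standard ANOVA F-test."""
    return N * f2


# Illustrative values: s(h) = 2, N = 60, f2 = 0.15
lam_manova = manova_lambda(2, 60, 0.15)
lam_anova = anova_lambda(60, 0.15)
print(lam_manova, lam_anova)  # the MANOVA value is s(h) times the ANOVA value
```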

 

For a global MANOVA test we find that n(h) = k-1, and for special MANOVA tests we find that n(h) = the number of predictors of the effect to be tested. For instance, in a MANOVA based on an AxB design, A having a levels and B having b levels, we find

n(h) = a-1 for the main effect of A,
n(h) = b-1 for the main effect of B, and
n(h) = (a-1)(b-1) for the multivariate interaction.
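For an A x B design, the three values of n(h) can be tabulated with a small helper. This is a Python sketch under the rules just listed; the function name and the example design (a = 3, b = 2) are assumptions.

```python
def predictors_axb(a, b):
    """Number of predictors n(h) for each effect in an A x B MANOVA."""
    return {
        "A": a - 1,
        "B": b - 1,
        "AxB": (a - 1) * (b - 1),
    }


# Hypothetical design with a = 3 and b = 2 levels
for effect, n_h in predictors_axb(3, 2).items():
    print(effect, "n(h) =", n_h)
```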

Example

Assume that we have a k=3 group MANOVA design with a total sample size of 3 * 20 = 60 subjects, p = 2 dependent variables, and our effect size is f2(mult) = .15.

This is how we calculate the power for this test:

Select:

Type of Power Analysis:

Post-hoc

Type of Test:

Other F-Tests

Accuracy mode calculation

Input:

Alpha:

.05

Effect size "f2":

0.1500

N:

120

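Note that we enter N = 2 * 60 = 120 (s(h) times the total sample size) rather than the plain total sample size: "Other F-Tests" computes lambda as N * f2, and with s(h) = min(p, n(h)) = min(2, 2) = 2 the required value is lambda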
= s(h) * N * f2
= 2 * 60 * 0.15
= 18.

Numerator DF:

4

p * n(h) = 2 * 2 = 4

Denominator DF:

114

s(h) * (N-k-p+s(h))
= 2 * (60-3-2+2)
= 114

Result:

Power (1-beta):

0.9330

Critical F:

F(4,114) = 2.4513

Lambda:

18.0000
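For readers who want to cross-check such a result outside G*Power, the following Python sketch computes post-hoc power from the noncentral F distribution using only the standard library. It is not G*Power's algorithm: the function names are hypothetical, the noncentral F CDF is evaluated as a Poisson mixture of incomplete beta functions, and the critical value is found by bisection.

```python
import math


def _betacf(a, b, x, max_iter=500, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # even step, then odd step of the recurrence
        for aa in (m * (b - m) * x / ((a - 1.0 + m2) * (a + m2)),
                   -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2))):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    ln_bt = (math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
             + a * math.log(x) + b * math.log1p(-x))
    bt = math.exp(ln_bt)
    if x < (a + 1.0) / (a + b + 2.0):
        return bt * _betacf(a, b, x) / a
    return 1.0 - bt * _betacf(b, a, 1.0 - x) / b


def f_cdf(x, df1, df2, lam=0.0):
    """CDF of the (non)central F distribution; lam = 0 gives the central case."""
    if x <= 0.0:
        return 0.0
    y = df1 * x / (df1 * x + df2)
    half = lam / 2.0
    weight = math.exp(-half)          # Poisson weight for j = 0
    total = 0.0
    for j in range(1000):
        total += weight * betainc(df1 / 2.0 + j, df2 / 2.0, y)
        weight *= half / (j + 1)      # next Poisson weight
        if weight < 1e-16 and j > half:
            break
    return total


def critical_f(alpha, df1, df2):
    """Upper-alpha critical value of the central F distribution (bisection)."""
    lo, hi = 0.0, 1e4
    for _ in range(200):
        mid = (lo + hi) / 2.0
        if f_cdf(mid, df1, df2) < 1.0 - alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def power_other_f(alpha, f2, n, df1, df2):
    """Post-hoc power with lambda = n * f2, mirroring the inputs listed above."""
    lam = n * f2
    crit = critical_f(alpha, df1, df2)
    return 1.0 - f_cdf(crit, df1, df2, lam), crit, lam


power, crit, lam = power_other_f(alpha=0.05, f2=0.15, n=120, df1=4, df2=114)
print("power =", round(power, 4), " critical F =", round(crit, 4), " lambda =", lam)
```

The sketch is meant for moderate values of lambda such as those in this section; for very large lambda the first Poisson weight underflows and a more careful summation would be needed.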

 

Repeated Measures Designs, So-Called Univariate Approach

To illustrate power analyses for the so-called univariate approach to repeated measures designs, we use an A x B design in which A is a between-subjects factor and B is a within-subject factor. Factors A and B have a and b levels, respectively.

Example

Assume that we have

a = 2 levels of Factor A,
b = 4 levels of Factor B, and
N = 2 * 10 = 20.

Between-Subjects Effect

The test for the between-subjects main effect of Factor A has

numerator df = a - 1 = 2 - 1 = 1, and
denominator df = N - a = 20 - 2 = 18.

The power of the between-subjects effect depends on the number of repeated measures in our design, and on the correlation between the levels of the repeated measures. This can be seen when looking at the noncentrality parameter lambda for this case:

lambda = N * (m/(1+(m-1)*rho)) * f2

where

N is the total number of subjects,

m is the number of levels of the repeated measures factor,

rho is the population correlation between the individual levels of the repeated measures factor, and

f2 is just the effect size for between-subject designs as used by Cohen (1977, 1988), that is, the ratio of effect variance to the error variance within cells.

Obviously, if there is no repeated measures factor (i.e., m = 1), then the above equation reduces to

lambda = N * f2

which is just the noncentrality parameter G*Power uses in F-Tests (ANOVA).

Let us assume that we want to detect a "medium" effect according to Cohen's effect size conventions for ANOVA F-tests. Thus,

f = .25 and therefore f2 = 0.0625.

Next we assume that the correlation between the levels of the repeated measures Factor B is .75. As a consequence of the so-called sphericity assumption, we must assume that the correlation between all possible pairs of repeated measurements is identical. If sphericity is not given in our data, then we have a problem. We will deal with the sphericity problem below.

Given the above assumptions, the noncentrality parameter for our design is

lambda = N * (m/(1+(m-1)*rho)) * f2
= 20 * (4/(1+(4-1)*.75)) * 0.0625

= 20 * 0.0769

= 1.538.

A technical point to be aware of is that G*Power computes the noncentrality parameter lambda as

lambda = N*f2

where f2 is the label of the effect size slot when you select "Other F-Tests". Therefore, we need to enter

(m/(1+(m-1)*rho)) * f2 = 0.0769

as the effect size term to be used in our computations.
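The arithmetic for the between-subjects effect can be reproduced with a short Python sketch. The function names are hypothetical; the inputs are the values assumed in this example (N = 20, m = 4, rho = .75, f2 = 0.0625).

```python
def between_effect_to_enter(m, rho, f2):
    """Value to type into the "f2" slot for the between-subjects effect."""
    return (m / (1.0 + (m - 1) * rho)) * f2


def between_lambda(N, m, rho, f2):
    """Noncentrality parameter for the between-subjects effect."""
    return N * between_effect_to_enter(m, rho, f2)


effect = between_effect_to_enter(m=4, rho=0.75, f2=0.0625)
lam = between_lambda(N=20, m=4, rho=0.75, f2=0.0625)
print("effect size to enter =", round(effect, 4), " lambda =", round(lam, 3))

# With m = 1 the expression reduces to the ordinary ANOVA case lambda = N * f2
print(between_lambda(N=20, m=1, rho=0.75, f2=0.0625))
```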

Select:

Type of Power Analysis:

Post-hoc

Type of Test:

Other F-Tests

Accuracy mode calculation

Input:

Alpha:

.05

Effect size "f2":

0.0769

N:

20

2 * 10 = 20

Numerator DF:

1

a - 1 = 2 - 1 = 1

Denominator DF:

18

N - a = 20 - 2 = 18

Result:

Power (1-beta):

0.2170

Critical F:

F(1,18) = 4.4139

Lambda:

1.5380

 

Within-Subject Effect

The test for the within-subject main effect of Factor B has

numerator df = b - 1 = 4 - 1 = 3, and
denominator df = (N - a) * (b - 1) = 18 * 3 = 54.

The power of the within-subject effect depends on the correlation between the levels of the repeated measures. This can be seen when looking at the noncentrality parameter lambda for this case:

lambda = N * m* f2 /(1-rho)

where

N is the total number of subjects,

m is the number of levels of the repeated measures factor,

rho is the population correlation between the individual levels of the repeated measures effect, and

f2 is just the effect size for between-subject designs as used by Cohen (1977, 1988), that is, the ratio of effect variance to the error variance within cells.

Let us assume that we want to detect an effect of the same size as before (i.e., f2 = 0.0625). The correlation between the levels of the repeated measures Factor B is .75 (see above). As a consequence of the so-called sphericity assumption, we must again assume that the correlation between all possible pairs of repeated measurements is identical. If it is not, then we have a problem. We will deal with this problem further on.

Given the above assumptions, the noncentrality parameter for our design is

lambda = N * m * f2 /(1-rho)
= 20 * 4 * 0.0625 / (1 - .75)

= 20 * 1

= 20.

As before, the technical point to be aware of is that G*Power computes the noncentrality parameter lambda as

lambda = N * f2

where f2 is the label of the effect size slot when you select "Other F-Tests". Therefore, we need to enter

m * f2 /(1-rho) = 1

as the effect size term to be used in our computations.
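The corresponding arithmetic for the within-subject effect, again as a Python sketch with hypothetical function names and the values assumed in this example:

```python
def within_effect_to_enter(m, rho, f2):
    """Value to type into the "f2" slot for the within-subject effect."""
    return m * f2 / (1.0 - rho)


def within_lambda(N, m, rho, f2):
    """Noncentrality parameter for the within-subject effect."""
    return N * within_effect_to_enter(m, rho, f2)


effect = within_effect_to_enter(m=4, rho=0.75, f2=0.0625)
lam = within_lambda(N=20, m=4, rho=0.75, f2=0.0625)
print("effect size to enter =", effect, " lambda =", lam)
```

Note how strongly the result depends on rho: the closer the correlation is to 1, the larger the value that has to be entered for the same f2.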

We can now proceed as before.

Select:

Type of Power Analysis:

Post-hoc

Type of Test:

Other F-Tests

Accuracy mode calculation

Input:

Alpha:

.05

Effect size "f2":

1

N:

20

2 * 10 = 20

Numerator DF:

3

b - 1 = 4 - 1 = 3

Denominator DF:

54

(N - a) * (b - 1)
= 18 * 3
= 54

Result:

Power (1-beta):

0.9646

Critical F:

F(3,54) = 2.7758

Lambda:

20.0000

 

Interaction of Between-Subjects and Within-Subject Effect

The procedure for the within-between interaction test is basically identical to the procedure for within-subject effects. The formulae for the degrees of freedom in this case are

numerator df = (a - 1) * (b - 1) = (2 - 1) * (4 - 1) = 3, and
denominator df = (N - a) * (b - 1) = 18 * 3 = 54.

The power of the interaction effect also depends on the correlation between the levels of the repeated measures. The noncentrality parameter lambda for this case is:

lambda = N * m* f2 /(1-rho)

where

N is the total number of subjects,

m is the number of levels of the repeated measures factor,

rho is the population correlation between the individual levels of the repeated measures effect, and

f2 is just the effect size for between-subject designs as used by Cohen (1977, 1988), that is, the ratio of effect variance to the error variance within cells.

As before, the technical point to be aware of is that G*Power computes the noncentrality parameter lambda as

lambda = N * f2

where f2 is the label of the effect size slot when you select "Other F-Tests". Therefore, we need to enter

m * f2 /(1-rho)

as the effect size term to be used in our computations.
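The inputs for the interaction test can be collected as follows. This Python sketch uses a hypothetical function name; the design values are those of the running example, while f2 = 0.0625 and rho = .75 are merely assumed here for illustration, since the text gives no numerical example for the interaction.

```python
def interaction_inputs(a, b, N, rho, f2):
    """Degrees of freedom and "f2" slot value for the A x B interaction.

    The repeated measures factor B has m = b levels.
    """
    df1 = (a - 1) * (b - 1)
    df2 = (N - a) * (b - 1)
    effect = b * f2 / (1.0 - rho)   # m * f2 / (1 - rho) with m = b
    return df1, df2, effect


# Assumed values: a = 2, b = 4, N = 20, rho = .75, f2 = 0.0625
df1, df2, effect = interaction_inputs(a=2, b=4, N=20, rho=0.75, f2=0.0625)
print("numerator df =", df1, " denominator df =", df2, " effect size to enter =", effect)
```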

We can now proceed as before.

 

Problems Resulting from the Sphericity Assumption

In the so-called univariate approach, we must assume that all repeated measures have equal variances and are correlated equally with each other. This is often referred to as the sphericity assumption.

If sphericity is met, then analytic results for the power calculations of univariate repeated measures tests such as those illustrated above are available.

Unfortunately, sphericity is a very strong assumption which is very likely violated in many situations (see O'Brien & Kaiser, 1985). For instance, if five levels of a repeated measures factor represent successive points in time, then it is almost certain that the correlation of the measures taken at the first and the second level is larger than the correlation between the first and the fifth level.

If sphericity is not met, then the tests of main effects and interactions involving the within-subject factors occur at an artificially increased Type I error rate because the resulting F values are artificially inflated.

One way to react to this problem is to apply the corrected univariate tests, in which the Geisser-Greenhouse or the Huynh-Feldt estimate of epsilon is used to provide improved Type I error rates.

Epsilon is 1 if sphericity is met, whereas without sphericity, we find that 1/n <= epsilon <= 1 (where n represents the size of the associated residual covariance matrix, e.g., n = k-1 for a within-subject main effect with k levels).

In order to take violations of sphericity into account, both the numerator and the denominator degrees of freedom of the F test must be multiplied by epsilon, and the significance of the F ratio must be evaluated with the new degrees of freedom. The Geisser-Greenhouse epsilon tends to be relatively conservative, which is a property the Huynh-Feldt epsilon tries to correct.

How can we assess the power of corrected univariate tests?

Muller and Barton (1989) have proposed an approximation to the power of the Geisser-Greenhouse or Huynh-Feldt-corrected test. Following their approach, we compute

numerator df(c) = (numerator df)*(estimate of epsilon),

denominator df(c) = (denominator df)*(estimate of epsilon), and

lambda(c) = lambda*(estimate of epsilon).

Assume that sphericity is violated and we find that the estimate of epsilon = .6. What is the effect of this violation on the power of our within-subject test? Using the within-subject example above, we compute

numerator df(c) = 3 * .6 = 1.8

denominator df(c) = 54 * .6 = 32.4, and

lambda(c) = 20 * 0.6 = 12.

We can now reevaluate the power of this test. Note that G*Power expects df values to be integers, which is why we need to enter numerator df = 2 and denominator df = 33. We also need to make adjustments to what we enter as the effect size index in order to ensure proper calculation of lambda. More precisely, we enter 1 * 0.6 = 0.6 as the effect size index for our within-subject effect. In that way, we arrive at lambda = 20 * 0.6 = 12.
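The epsilon adjustment can be sketched in Python as follows. The function name is hypothetical. Rounding the df values up to the next integer is an assumption made here because it reproduces the values 2 and 33 used in this example; plain rounding of 32.4 would give 32.

```python
import math


def epsilon_corrected(df1, df2, lam, epsilon):
    """Multiply both df values and lambda by the estimate of epsilon."""
    return df1 * epsilon, df2 * epsilon, lam * epsilon


N = 20
df1_c, df2_c, lam_c = epsilon_corrected(df1=3, df2=54, lam=20.0, epsilon=0.6)

# Integer df values (assumption: rounded up)
df1_int = math.ceil(df1_c)
df2_int = math.ceil(df2_c)

# The "f2" slot value that makes N * f2 equal the corrected lambda
effect = lam_c / N

print("corrected df:", round(df1_c, 1), round(df2_c, 1), " corrected lambda:", round(lam_c, 1))
print("enter df:", df1_int, df2_int, " effect size:", round(effect, 2))
```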

Select:

Type of Power Analysis:

Post-hoc

Type of Test:

Other F-Tests

Accuracy mode calculation

Input:

Alpha:

.05

Effect size "f2":

0.6

N:

20

2 * 10

Numerator DF:

2

round(3 * .6) = 2

Denominator DF:

33

54 * .6 = 32.4, rounded up to 33

Result:

Power (1-beta):

0.8506

Critical F:

F(2,33) = 3.2849

Lambda:

12.0000

Thus, the power of the corrected test is clearly less than the power of the uncorrected test in which the sphericity problem is simply ignored.

In essence, if we insist on using the so-called univariate approach to repeated measures analyses, then we face a choice between two unattractive alternatives: Either we ignore the (non)sphericity problem (and accept that we commit an error by testing at an artificially increased Type I error rate), or we accept a reduction of the power of our statistical tests.

 

Repeated Measures Designs, Multivariate Approach

Repeated measures designs may also be analyzed using a multivariate approach. One advantage of this approach is that MANOVAs do not require the sphericity assumption to be met (which appears to be violated quite often, see O'Brien & Kaiser, 1985).

Using the MANOVA approach, we treat the levels of the within-subject factor as different dependent variables. The univariate A x B design discussed above thus is regarded as a multivariate design with between-subjects factor A and p = b dependent variables. Let us consider the same design as above, but from a multivariate perspective.

Example

Between-Subjects Effect

First, the result for the between-subjects effect is identical to the result we obtained for the univariate approach. We can therefore proceed directly to the within-subject effect.

Within-Subjects Effect

The F-test for the within-subject Factor B has

numerator df = b - 1 = 4 - 1 = 3, and

denominator df = s(h) * (N-k-p+s(h)) = 1 * (20-2-3+1) = 16,

where

N is the number of participants,

k is the number of groups in the design (Factor A has 2 levels),

p is the number of dependent variables (the 4 levels of the within-subject Factor B are recoded into 4 - 1 = 3 dependent variables using appropriate contrast variables; the recoded variables may then represent, for instance, linear, quadratic, and cubic trends in the repeated measurement; see O'Brien & Kaiser, 1985, for details).
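The degrees of freedom for the multivariate within-subject test follow from the same formula as for MANOVAs. In the Python sketch below, the function name is hypothetical and s(h) = 1 is taken from the computation shown above.

```python
def multivariate_within_df(b, N, k, s_h=1):
    """Degrees of freedom for the multivariate test of a within-subject factor.

    b   : number of levels of the within-subject factor
    N   : number of participants
    k   : number of groups
    s_h : taken as 1, as in the computation in the text
    """
    p = b - 1                       # contrast variables used as dependent variables
    df1 = b - 1
    df2 = s_h * (N - k - p + s_h)
    return p, df1, df2


p, df1, df2 = multivariate_within_df(b=4, N=20, k=2)
print("p =", p, " numerator df =", df1, " denominator df =", df2)
```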

The noncentrality parameter lambda for this case is identical to the one used in the univariate approach to repeated measures analyses (see Davidson, 1972):

lambda = N * m* f2 /(1-rho)

where

N is the total number of subjects,

m is the number of levels of the repeated measures factor,

rho is the population correlation between the individual levels of the repeated measures effect, and

f2 is just the effect size for between-subject designs as used by Cohen (1977, 1988), that is, the ratio of effect variance to the error variance within cells.

We consider again an effect size of f2 = 0.0625 in the following example. Given rho = .75 (as before), we need to enter

m * f2/(1-rho) = 1

as the effect size term to be used in our computations.

 

Select:

Type of Power Analysis:

Post-hoc

Type of Test:

Other F-Tests

Accuracy mode calculation

Input:

Alpha:

.05

Effect size "f2":

1

N:

20

2 * 10 = 20

Numerator DF:

3

b - 1 = 4 - 1 = 3

Denominator DF:

16

s(h) * (N-k-p+s(h))
= 1 * (20-2-3+1)
= 16

 

Result:

Power (1-beta):

0.9270

Critical F:

F(3,16) = 3.2389

Lambda:

20.0000

Thus, the power for the multivariate approach (0.9270) is slightly smaller than that for the univariate approach (0.9646). However, this small advantage of the univariate approach is present if and only if the sphericity assumption is met. If not, the multivariate approach usually has more power (see O'Brien & Kaiser, 1985). In our example, the power of the corrected univariate test was 0.8506.

As with the so-called univariate approach, interactions of within-subject and between-subjects factors are treated just like within-subject effects.




Please report suggestions for improvements to
Axel Buchner, Franz Faul, or Edgar Erdfelder.