Means: Wilcoxon-Mann-Whitney test (two groups)
Note: This option will be available in the next version of G*Power.
The Wilcoxon-Mann-Whitney test, or U test, is a nonparametric alternative to the two-groups t test. Its use is mainly motivated by uncertainty concerning the assumption of normality made in the t test.
It refers to a general two sample model, in which F and G characterize response distributions under two different conditions. The null effects hypothesis states F = G, while the alternative is F ≠ G.
The power routines implemented in G*Power refer to the important special case of a "shift model", which states that G is obtained by shifting F by an amount ∆: G(x) = F(x − ∆) for all x. The shift model expresses the assumption that the treatment adds a certain amount ∆ to the response x (additivity).
The Wilcoxon-Mann-Whitney test is based on ranks. Assume m subjects are randomly assigned to control group A and n other subjects to treatment group B. After the experiment, all N = n + m subjects together are ranked according to some response measure, i.e., each subject is assigned a unique number between 1 and N.
The general idea of the test is to calculate the sum of the ranks assigned to subjects in either group and to reject the “no effect hypothesis” if the rank sums in the two groups are clearly different. The actual procedure is as follows: Since the m ranks of control group A are known if those of treatment group B are given, it suffices to consider the n ranks of group B. They can be specified by an n-tuple (S1, . . . , Sn).
There are N!/(n! · (N − n)!) possible n-tuples. Therefore, if the null hypothesis is true, the probability of observing a particular n-tuple is: P(S1 = s1, . . . , Sn = sn) = 1/(N!/(n! · (N − n)!)). To calculate the probability of observing, under H0, a particular rank sum Ws = S1 + . . . + Sn we just need to count the number k of all tuples with rank sum Ws and to add their probabilities. Thus P(Ws = w) = k/(N!/(n! · (N − n)!)). Repeating this for all possible Ws between the minimal value n(n + 1)/2, corresponding to the tuple (1, 2, . . . , n), and the maximal value n(1 + 2m + n)/2, corresponding to the tuple (N − n + 1, N − n + 2, . . . , N), gives the discrete probability distribution of Ws under H0. This distribution is symmetric about n(N + 1)/2. Referring to this probability distribution we choose the smallest critical value c with P(Ws ≥ c) ≤ α and reject the null hypothesis if a rank sum Ws ≥ c is observed. With increasing sample sizes the exact distribution converges rapidly to the normal distribution with mean E(Ws) = n(N + 1)/2 and variance Var(Ws) = mn(N + 1)/12.
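The counting argument above can be sketched in a few lines of Python for small samples. This is only an illustrative brute-force enumeration, not the routine used by G*Power; the function names and the example sizes m = 4, n = 3 are assumptions chosen for the illustration.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def rank_sum_null_distribution(m, n):
    """Exact H0 distribution of Ws for m control and n treatment subjects."""
    N = m + n
    total = comb(N, n)  # number of equally likely rank tuples under H0
    counts = {}
    for ranks in combinations(range(1, N + 1), n):
        w = sum(ranks)
        counts[w] = counts.get(w, 0) + 1
    return {w: Fraction(k, total) for w, k in sorted(counts.items())}


def critical_value(dist, alpha):
    """Smallest c with P(Ws >= c) <= alpha, or None if no such c exists."""
    tail = Fraction(0)
    c = None
    for w in sorted(dist, reverse=True):
        tail += dist[w]
        if tail > alpha:
            break
        c = w
    return c


m, n = 4, 3  # hypothetical group sizes
dist = rank_sum_null_distribution(m, n)
print(min(dist), max(dist))                    # 6 18
print(critical_value(dist, Fraction(1, 20)))   # 18
```

For m = 4 and n = 3 there are only 35 tuples, so the attainable significance levels are coarse: the largest rank sum 18 has probability 1/35, and it is the only value in the rejection region at α = 0.05.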
A drawback of using Ws is that the minimal value of Ws depends on n. It is often more convenient to subtract the minimal value and use WXY = Ws − n(n + 1)/2 instead. The statistic WXY (also known as the Mann-Whitney statistic) can also be derived from a slightly different perspective: If X1, . . . , Xm and Y1, . . . , Yn denote the observations in the control and treatment group, respectively, then WXY is the number of pairs (Xi, Yj) with Xi < Yj. The approximating normal distribution of WXY has mean E(WXY) = mn/2 and variance Var(WXY) = Var(Ws) = mn(N + 1)/12.
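The equivalence of the two definitions of WXY can be checked on a small data set. The following sketch uses made-up observations without ties; the function names and the data are hypothetical.

```python
def rank_sum(x, y):
    """Rank sum Ws of the treatment observations y in the pooled sample (no ties)."""
    pooled = sorted(x + y)
    rank = {value: i + 1 for i, value in enumerate(pooled)}
    return sum(rank[value] for value in y)


def mann_whitney_count(x, y):
    """WXY: number of pairs (Xi, Yj) with Xi < Yj."""
    return sum(1 for xi in x for yj in y if xi < yj)


x = [1.1, 2.4, 3.0, 5.2]  # hypothetical control observations (m = 4)
y = [2.9, 4.7, 6.3]       # hypothetical treatment observations (n = 3)
n = len(y)

ws = rank_sum(x, y)
wxy = mann_whitney_count(x, y)
print(ws, wxy, ws - n * (n + 1) // 2)  # 15 9 9
```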
Power of the Wilcoxon-Mann-Whitney test
The Wilcoxon-Mann-Whitney test as described above is distribution free in the sense that its validity does not depend on the specific form of the response distribution F. This distribution independence no longer holds, however, if one wants to estimate numerical values for the power of the test. The reason is that the effect of a certain shift ∆ on the rank distribution depends on the specific form of F (and G). For power calculations it is therefore necessary to specify the response distribution F. G*Power provides three predefined continuous and symmetric response functions that differ with respect to kurtosis, that is, the "peakedness" of the distribution.
Normal distribution N(µ, σ²): p(x) = 1/(σ√(2π)) · exp(−(x − µ)²/(2σ²))

Laplace or Double Exponential distribution: p(x) = (1/2) · exp(−|x|)

Logistic distribution: p(x) = exp(−x)/(1 + exp(−x))²

Scaled and/or shifted versions of the Laplace and Logistic densities, which can be calculated by applying the transformation (1/a) · p((x − b)/a), a > 0, are again probability densities and are referred to by the same name.
Approaches to the power analysis
G*Power implements two different methods to estimate the power for the Wilcoxon-Mann-Whitney test:
- the asymptotic relative efficiency (A.R.E.) method, which defines power relative to the two-groups t test, and
- a normal approximation to the power proposed by Lehmann (1975, pp. 164-166).
A.R.E. method
The A.R.E. method assumes the shift model described in the introduction. It relates normal approximations to the power of the two-groups t test (Lehmann, 1975, Eq. (2.42), p. 78) and the Wilcoxon-Mann-Whitney test for a specified distribution F (Lehmann, 1975, Eq. (2.29), p. 72). If, for a model with fixed F and ∆, the sample sizes n = m are required to achieve a specified power for the Wilcoxon-Mann-Whitney test and sample sizes n' = m' are required in the t test to achieve the same power, then the ratio n'/n is called the efficiency of the Wilcoxon-Mann-Whitney test relative to the two-groups t test. The limiting efficiency as sample sizes n = m increase towards infinity is called the asymptotic relative efficiency (A.R.E. or Pitman efficiency) of the Wilcoxon-Mann-Whitney test relative to the t test. It is given by (Hettmansperger, 1984, p. 71)
A.R.E.(F) = 12 σ_F² [∫ f(x)² dx]²
where f denotes the density of F, σ_F² its variance, and the integral extends over the whole real line.
If F is a normal distribution, then the A.R.E. is 3/π ≈ 0.955. This shows that the efficiency of the Wilcoxon-Mann-Whitney test relative to the t test is rather high even if the assumption of normality made in the t test is true. It can be shown that the minimal A.R.E. (for F with finite variance) is 0.864. For non-normal distributions the Wilcoxon-Mann-Whitney test can be much more efficient than the t test. The A.R.E.s for some specific distributions are given in the implementation notes. To estimate the power of the Wilcoxon-Mann-Whitney test for a given F with the A.R.E. method, one basically scales the sample size with the corresponding A.R.E. value and then performs the procedure for the t test for two independent means.
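As a plausibility check, the A.R.E. values for the three predefined distributions can be reproduced numerically. The sketch below assumes the standard expression for the Pitman efficiency of the Wilcoxon-Mann-Whitney test relative to the t test, 12 σ² (∫ f(x)² dx)², with f the density and σ² the variance of F. The Simpson integration routine, the integration limits, and all names are choices made for this illustration only.

```python
from math import exp, pi, sqrt


def simpson(func, lower, upper, intervals):
    """Composite Simpson rule; `intervals` must be even."""
    h = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3


def are(density, variance, half_width=40.0, intervals=40000):
    """12 * variance * (integral of density squared)^2, integrated numerically."""
    integral = simpson(lambda x: density(x) ** 2, -half_width, half_width, intervals)
    return 12 * variance * integral ** 2


def normal_pdf(x):
    return exp(-x * x / 2) / sqrt(2 * pi)      # variance 1


def laplace_pdf(x):
    return 0.5 * exp(-abs(x))                  # variance 2


def logistic_pdf(x):
    return exp(-x) / (1 + exp(-x)) ** 2        # variance pi^2 / 3


results = {
    "normal": are(normal_pdf, 1.0),
    "laplace": are(laplace_pdf, 2.0),
    "logistic": are(logistic_pdf, pi ** 2 / 3),
}
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

The numerical values should agree closely with 3/π, 3/2, and π²/9, respectively. Because the A.R.E. is invariant under shifting and scaling of F, the standard forms of the densities suffice.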
Lehmann method
The computation of the power requires the distribution of WXY for the non-null case, that is, for F ≠ G. The Lehmann method uses the fact that (WXY − E(WXY))/√(Var(WXY)) tends towards the standard normal distribution as n and m approach infinity for any fixed distributions F and G for X and Y for which 0 < P(X < Y) < 1. The problem is then to compute the expectation and variance of WXY. These values depend on three "moments" p1, p2, p3, which are defined as:
- p1 = P(X < Y).
- p2 = P(X < Y and X < Y').
- p3 = P(X < Y and X' < Y).
Here X, X' and Y, Y' denote independent observations from F and G, respectively. Note that p2 = p3 for symmetric F in the shift model. The expectation and variance are given as
E(WXY) = nmp1
Var(WXY) = nmp1(1 − p1) + nm(n − 1)(p2 − p1²) + nm(m − 1)(p3 − p1²).
The value p1 is easy to interpret: If the response distribution G is shifted to larger values, then p1 is the probability to observe a lower value in the control condition than in the treatment condition. For a null shift (no treatment effect, ∆ = 0, i.e. F = G) we get p1 = 1/2.
If c denotes the critical value of a level-α test, and Φ denotes the CDF of the standard normal distribution, then the normal approximation of the power of the (one-sided) test is given by
Π(F, G) ≈ 1 − Φ[(c − a − nmp1)/√(Var(WXY))]
where a = 0.5 if a continuity correction is applied, and a = 0 otherwise. The formulae for p1, p2, and p3 for the predefined distributions are given in the implementation section below.
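The normal approximation can be sketched for a normal shift model as follows. Several points are assumptions made for this illustration and are not taken from G*Power: F is taken as N(0, 1) so that the shift equals d = ∆/σ, p2 and p3 are obtained by numerical integration of p2 = ∫ f(x)[1 − G(x)]² dx and p3 = ∫ g(y) F(y)² dy rather than from closed-form expressions, and the critical value c is derived from the normal approximation under H0 instead of the exact distribution.

```python
from math import sqrt
from statistics import NormalDist

STD = NormalDist()  # standard normal: F = N(0, 1)


def simpson(func, lower, upper, intervals=4000):
    """Composite Simpson rule; `intervals` must be even."""
    h = (upper - lower) / intervals
    total = func(lower) + func(upper)
    for i in range(1, intervals):
        total += (4 if i % 2 else 2) * func(lower + i * h)
    return total * h / 3


def moments_normal_shift(d):
    """p1, p2, p3 for X ~ N(0, 1) and Y ~ N(d, 1)."""
    lo, hi = -10.0 - abs(d), 10.0 + abs(d)
    p1 = STD.cdf(d / sqrt(2))  # Y - X ~ N(d, 2)
    # p2 = P(X < Y and X < Y') = E[(1 - G(X))^2]
    p2 = simpson(lambda x: STD.pdf(x) * (1 - STD.cdf(x - d)) ** 2, lo, hi)
    # p3 = P(X < Y and X' < Y) = E[F(Y)^2]
    p3 = simpson(lambda y: STD.pdf(y - d) * STD.cdf(y) ** 2, lo, hi)
    return p1, p2, p3


def lehmann_power(m, n, d, alpha=0.05, continuity=False):
    """One-sided power of the test based on WXY (normal approximation)."""
    p1, p2, p3 = moments_normal_shift(d)
    N = m + n
    mean_h1 = n * m * p1
    var_h1 = (n * m * p1 * (1 - p1)
              + n * m * (n - 1) * (p2 - p1 ** 2)
              + n * m * (m - 1) * (p3 - p1 ** 2))
    # Assumption: critical value from the H0 normal approximation of WXY.
    c = m * n / 2 + STD.inv_cdf(1 - alpha) * sqrt(m * n * (N + 1) / 12)
    a = 0.5 if continuity else 0.0
    return 1 - STD.cdf((c - a - mean_h1) / sqrt(var_h1))


for d in (0.0, 0.2, 0.5, 0.8):
    print(d, round(lehmann_power(20, 20, d), 3))
```

For d = 0 the moments reduce to p1 = 1/2 and p2 = p3 = 1/3, the variance reduces to mn(N + 1)/12, and the approximate power without continuity correction equals α, which serves as a simple consistency check.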
Effect size index
A.R.E. method
In the A.R.E. method the effect size d is defined as d = (µ1 − µ2)/σ = ∆/σ, where µ1, µ2 are the means of the response distributions F and G and σ is the standard deviation of the response distribution F.
In addition to the effect size, you have to specify the A.R.E. for F. If you want to use the predefined distributions, you may choose the Normal, Logistic, or Laplace distribution or the minimal A.R.E. From this selection G*Power determines the A.R.E. automatically. Alternatively, you can choose the option to determine the A.R.E. value by hand. In this case you must first calculate the A.R.E. for your F using the formula given above.
Lehmann method
The conventional values proposed by Cohen (1969, p. 38) for the t test are applicable. He defined the following conventional values for d:
small d = 0.2
medium d = 0.5
large d = 0.8
Pressing the button Determine on the left side of the effect size label opens the effect size dialog. You can use this dialog to calculate d from the means and a common standard deviation in the two populations.

If the sample sizes are equal (n1 = n2) a mean σ' may be used as the common within-population σ (Cohen, 1969, p.42):
σ' = √((σ1² + σ2²)/2)
where σ1² and σ2² are the variances in populations 1 and 2, respectively. This is the formula used by G*Power when you select the n1 = n2 option in the effect size drawer.
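The calculation of d from two means and two standard deviations under the n1 = n2 option can be sketched as follows. The function name and the numerical values are hypothetical and serve only to illustrate the formula.

```python
from math import sqrt


def cohen_d_equal_n(mean1, mean2, sd1, sd2):
    """Effect size d = (mean1 - mean2) / sigma', with sigma' the root mean of the two variances."""
    sigma_prime = sqrt((sd1 ** 2 + sd2 ** 2) / 2)
    return (mean1 - mean2) / sigma_prime


# Hypothetical population values: means 10 and 8, standard deviations 3 and 5.
d = cohen_d_equal_n(10.0, 8.0, 3.0, 5.0)
print(round(d, 3))  # 0.485
```

With these values σ' = √17 ≈ 4.123, so d is close to Cohen's "medium" effect of 0.5.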
In the case of substantially different sample sizes the n1 = n2 option should not be used because it may lead to power values that differ greatly from the true values (Cohen, 1969, p.42).
If you have unequal sample sizes and unequal variances in the populations from which the samples were or are to be drawn, then it is very reasonable to bring the samples to equal sizes.
Options
This test has no options.
Examples
Related tests
Implementation notes
The H0 distribution is the central Student t distribution t(N k − 2, 0). The H1 distribution is the noncentral Student t distribution t(N k − 2, δ), where the noncentrality parameter δ is given by: δ = d √((N1 N2 k)/(N1 + N2)). Parameter k represents the asymptotic relative efficiency relative to the corresponding t tests (Lehmann, 1975, p. 371ff) and depends in the following way on the parent distribution:
Uniform parent distribution: k = 1.0
Normal parent distribution: k = 3/π
Logistic parent distribution: k = π²/9
Laplace parent distribution: k = 3/2
Min ARE = 0.864; this is a limiting case that gives a theoretical minimum of the power for the Wilcoxon-Mann-Whitney test.
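The parameters of the two t distributions can be computed directly from these definitions. The sketch below assumes N = N1 + N2 and uses hypothetical function names. The last function is only a rough one-sided normal approximation added for orientation; it is not the noncentral t computation described above.

```python
from math import pi, sqrt
from statistics import NormalDist

# A.R.E. values k for the parent distributions listed above.
ARE = {
    "uniform": 1.0,
    "normal": 3 / pi,
    "logistic": pi ** 2 / 9,
    "laplace": 3 / 2,
    "min": 0.864,
}


def are_parameters(d, n1, n2, parent):
    """Degrees of freedom N k - 2 and noncentrality delta, assuming N = n1 + n2."""
    k = ARE[parent]
    df = (n1 + n2) * k - 2
    delta = d * sqrt(n1 * n2 * k / (n1 + n2))
    return df, delta


def rough_power(delta, alpha=0.05):
    """Crude one-sided normal approximation; NOT the noncentral t power."""
    z = NormalDist()
    return z.cdf(delta - z.inv_cdf(1 - alpha))


for parent in ARE:
    df, delta = are_parameters(0.5, 30, 30, parent)
    print(f"{parent}: df = {df:.2f}, delta = {delta:.3f}, "
          f"approx. power = {rough_power(delta):.3f}")
```

Because δ grows with √k, heavier-tailed parent distributions such as the Laplace distribution yield higher power for the same d and sample sizes, while the minimal A.R.E. gives the lowest value.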
Validation
The results were checked against the values produced by PASS (Hintze, 2006) and those produced by unifyPow (O’Brien, 1998). There was complete correspondence with the values given in O’Brien, while there were slight differences from those produced by PASS. The reason for these differences seems to be that PASS truncates the weighted sample sizes to integer values.
References
Hettmansperger, T. P. (1984). Statistical inference based on ranks. New York: Wiley.
Hintze, J. (2006). NCSS, PASS, and GESS. Kaysville, Utah: NCSS.
Lehmann, E. L. (1975). Nonparametrics. Statistical methods based on ranks. San Francisco, CA: Holden-Day.
O’Brien, R. (2002). Sample size analysis in study planning (using unifypow.sas). (available on the WWW: http://www.bio.ri.ccf.org/UnifyPow.all/UnifyPowNotes020811.pdf)
Last modified: 05.06.2009, 12:17

