## Heinrich-Heine-Universität - Institut für experimentelle Psychologie

http://www.psycho.uni-duesseldorf.de/abteilungen/aap/gpower3/user-guide-by-distribution/z/correlations_two_dependent_pearson_rs

# Correlations: Two dependent Pearson r's

This procedure provides power analyses for tests of the hypothesis that two dependent Pearson correlation coefficients ρa,b and ρc,d are identical. The corresponding test statistics Z1⋆ and Z2⋆ were proposed by Dunn and Clark (1969) and are described in Equations (11) and (12) in Steiger (1980) (for more details on these tests, see the implementation notes below).

Two correlation coefficients ρa,b and ρc,d are dependent if at least one of the four possible correlation coefficients ρa,c, ρa,d, ρb,c, and ρb,d between other pairs of the four data sets a, b, c, and d is non-zero.

Thus, in the general case where a, b, c, and d are different data sets, we have to consider not only the two correlations under scrutiny but also four additional correlations.

In the special case in which two of the data sets are identical, the two correlations are obviously always dependent, because at least one of the four additional correlations mentioned above is exactly 1. Two of the remaining additional correlations are identical to the two correlations under test. Thus, only one additional correlation can be freely specified. In this special case we denote the H0 correlation ρa,b, the H1 correlation ρa,c, and the additional correlation ρb,c.

It is convenient to describe the general case and the special case by two corresponding correlation matrices (which may be sub-matrices of a larger correlation matrix): a 4 × 4 matrix C1 in the general case of four different data sets ('no common index'),

C1 = [1, ρa,b, ρa,c, ρa,d; ρa,b, 1, ρb,c, ρb,d; ρa,c, ρb,c, 1, ρc,d; ρa,d, ρb,d, ρc,d, 1],

and a 3 × 3 matrix C2 in the special case in which one of the data sets is identical in both correlations ('common index'):

C2 = [1, ρa,b, ρa,c; ρa,b, 1, ρb,c; ρa,c, ρb,c, 1]

(rows are separated by semicolons).

Note:
The values for ρx,y in matrices C1 and C2 cannot be chosen arbitrarily between -1 and 1. This can be seen by considering matrix C2: we cannot, for instance, choose ρa,b = −ρa,c and ρb,c = 1.0, because the latter choice implies that the other two correlations are identical. In general, however, it is not easy to decide whether a given matrix is a valid correlation matrix. In more complex cases the following formal criterion can be used: a given symmetric matrix is a valid correlation matrix if and only if the matrix is positive semi-definite, that is, if all eigenvalues of the matrix are non-negative.
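This eigenvalue criterion is easy to check numerically. The following minimal sketch (the function name is ours) tests positive semi-definiteness with NumPy, using the invalid choice discussed above (ρa,b = −ρa,c = 0.5 together with ρb,c = 1.0) as an example:

```python
import numpy as np

def is_valid_correlation_matrix(C, tol=1e-10):
    """A symmetric matrix with unit diagonal is a valid correlation
    matrix iff it is positive semi-definite, i.e. all of its
    eigenvalues are non-negative."""
    C = np.asarray(C, dtype=float)
    if not np.allclose(C, C.T) or not np.allclose(np.diag(C), 1.0):
        return False
    # eigvalsh is the eigensolver for symmetric (Hermitian) matrices
    return bool(np.linalg.eigvalsh(C).min() >= -tol)

# The invalid choice from the text: rho_ab = -rho_ac = 0.5 with rho_bc = 1.0
C2_bad = [[ 1.0, 0.5, -0.5],
          [ 0.5, 1.0,  1.0],
          [-0.5, 1.0,  1.0]]
print(is_valid_correlation_matrix(C2_bad))  # False
```

A small tolerance is used because eigenvalues of borderline matrices may come out as tiny negative numbers due to floating-point rounding.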

The null hypothesis in the general case with 'no common index' states that ρa,b = ρc,d. The (two-sided) alternative hypothesis is that these correlation coefficients are different: ρa,b ≠ ρc,d:

H0 : ρa,b − ρc,d = 0
H1 : ρa,b − ρc,d ≠ 0.
Here, G*Power refers to the test Z2⋆ described in Equation (12) in Steiger (1980).

The null hypothesis in the special case of a 'common index' states that ρa,b = ρa,c. The (two-sided) alternative hypothesis is that these correlation coefficients are different: ρa,b ≠ ρa,c:

H0 : ρa,b − ρa,c = 0
H1 : ρa,b − ρa,c ≠ 0.

Here, G*Power refers to the test Z1⋆ described in Equation (11) in Steiger (1980).

If the direction of the deviation ρa,b − ρc,d (in the 'no common index' case) or ρa,b − ρa,c (in the 'common index' case) cannot be predicted a priori, a two-sided ('two-tailed') test should be used. Otherwise a one-sided test is adequate.

## Effect size index

In this procedure the correlation coefficient assumed under H1 is used as the effect size, that is, ρc,d in the general case of 'no common index' and ρa,c in the special case of a 'common index'.

To fully specify the effect size, the following additional inputs are required:

• ρa,b, the correlation coefficient assumed under H0, and
• all other relevant correlation coefficients that specify the dependency between the correlations assumed under H0 and H1:
  • ρa,c, ρa,d, ρb,c, ρb,d in the general case of 'no common index', and
  • ρb,c in the 'common index' case.

G*Power requires the correlations assumed under H0 and H1 to lie within the interval [−0.999999, 0.999999]. The additional correlations must lie within the interval [−1, 1]. In a priori analyses, zero effect sizes are not allowed, because they would imply an infinite sample size. Accordingly, the additional restriction |ρa,b − ρc,d| > 0.000001 (or |ρa,b − ρa,c| > 0.000001) holds.

Why do we not use q, the effect size proposed by Cohen (1988) for the case of two independent correlations? The effect size q is defined as the difference between two 'Fisher z'-transformed correlation coefficients: q = z1 − z2, with z1 = ln((1 + ρ1)/(1 − ρ1))/2 and z2 = ln((1 + ρ2)/(1 − ρ2))/2. The choice of q as effect size is sensible for tests of independent correlations because in this case the power of the test does not depend on the absolute value of the correlation coefficient assumed under H0; the power depends only on the difference q between the transformed correlations under H0 and H1. This is no longer true for dependent correlations, and we therefore use the effect size described above. (See the implementation notes for a more thorough discussion of these issues.)

Although the power is not strictly independent of the value of the correlation coefficient under H0, the deviations are usually relatively small, and it may therefore be convenient to use the definition of q to specify the correlation under H1 for a given correlation under H0. In this way, one can relate to the effect size conventions for q defined by Cohen (1969, p. 109ff) for independent correlations:
• small: q = 0.1
• medium: q = 0.3
• large: q = 0.5
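The conversion this involves is straightforward: apply the Fisher transform to the H0 correlation, add q, and transform back. A minimal sketch (helper names are ours; the inverse of the Fisher transform is tanh):

```python
import math

def fisher_z(r):
    # Fisher r-to-z transform: z = ln((1 + r)/(1 - r)) / 2
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def cohen_q(r1, r2):
    # Effect size q: difference of the Fisher-transformed correlations
    return fisher_z(r1) - fisher_z(r2)

# Example: which H1 correlation corresponds to a 'medium' q = 0.3
# when the H0 correlation is 0.4?
rho_h1 = math.tanh(fisher_z(0.4) + 0.3)
print(round(rho_h1, 4))  # about 0.62
```

Note that equal steps in q correspond to smaller steps in r as r approaches ±1, which is the point of the transform.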

The effect size drawer, which can be opened by pressing the Determine button on the left side of the effect size label, can be used to do this calculation.

## Options

This test has no options.

## Examples

We assume the following correlation matrix Cp in the population of the four data sets 1, 2, 3, and 4 (rows are separated by semicolons):

Cp = [1, 0.5, 0.4, 0.1; 0.5, 1, 0.2, −0.4; 0.4, 0.2, 1, 0.8; 0.1, −0.4, 0.8, 1]

### General case: No common index

We want to perform an a priori analysis for a one-sided test of whether ρ1,4 = ρ2,3 or whether ρ1,4 < ρ2,3 holds.

With respect to the notation used in G*Power we have the following identities: a = 1, b = 4, c = 2, and d = 3. Thus we get:

H0 correlation:
ρa,b = ρ1,4 = 0.1
H1 correlation:
ρc,d = ρ2,3 = 0.2,
ρa,c = ρ1,2 = 0.5,
ρa,d = ρ1,3 = 0.4,
ρb,c = ρ4,2 = −0.4, and
ρb,d = ρ4,3 = 0.8.

We want to know how large our samples need to be in order to achieve the error levels α = 0.05 and β = 0.2. We choose the procedure 'Correlations: Two dependent Pearson r's (no common index)' and set:

#### Select

Type of power analysis: A priori

#### Input

Tail(s): one
H1 corr ρ_cd: 0.2
α err prob: 0.05
Power (1-β err prob): 0.8
H0 Corr ρ_ab: 0.1
Corr ρ_ac: 0.5
Corr ρ_ad: 0.4
Corr ρ_bc: -0.4
Corr ρ_bd: 0.8

#### Output

Critical z : 1.644854
Sample Size: 886
Actual Power: 0.800093

We find that the sample size in each group needs to be N = 886. How large would N be if we instead assumed that ρ1,4 and ρ2,3 were independent, that is, that ρa,c = ρa,d = ρb,c = ρb,d = 0? To calculate this value, we may set the corresponding input fields to 0 or, alternatively, use the procedure for independent correlations with an Allocation ratio N2/N1 = 1. In either case, we find a considerably larger sample size of N = 1183 per data set (i.e. we correlate data vectors of length N = 1183). This shows that the power of the test increases considerably if we take dependencies between correlation coefficients into account.
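The 'Actual Power' value above can be reproduced (up to rounding) from the asymptotic theory sketched in the implementation notes below, using the covariance of two dependent correlations from Eq. (2) in Steiger (1980). The sketch below assumes, consistent with the output above, that the H0 covariance c0 is evaluated with both correlations set to the H0 value ρa,b; function names are ours:

```python
import math

def norm_cdf(x):
    # Standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def fisher_z(r):
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def cov_no_common_index(ab, cd, ac, ad, bc, bd):
    # Asymptotic covariance of two dependent correlations (Steiger, 1980,
    # Eq. 2), rescaled to the covariance of the Fisher-z values.
    psi = (0.5 * ab * cd * (ac**2 + ad**2 + bc**2 + bd**2)
           + ac * bd + ad * bc
           - (ab * ac * ad + ab * bc * bd + cd * ac * bc + cd * ad * bd))
    return psi / ((1.0 - ab**2) * (1.0 - cd**2))

def power_no_common_index(ab, cd, ac, ad, bc, bd, n, z_crit=-1.644854):
    # One-sided test of H0: rho_ab = rho_cd against H1: rho_ab < rho_cd.
    c0 = cov_no_common_index(ab, ab, ac, ad, bc, bd)  # assumption: H0 at rho_ab
    c1 = cov_no_common_index(ab, cd, ac, ad, bc, bd)  # H1
    s0 = math.sqrt((2.0 - 2.0 * c0) / (n - 3.0))
    s1 = math.sqrt((2.0 - 2.0 * c1) / (n - 3.0)) / s0
    m1 = (fisher_z(ab) - fisher_z(cd)) / s0
    return norm_cdf((z_crit - m1) / s1)

# Example values from above; result is close to the reported 0.800093
print(power_no_common_index(0.1, 0.2, 0.5, 0.4, -0.4, 0.8, 886))
```

Setting the four additional correlations to 0 in this sketch reproduces the independent-correlations behaviour discussed above (lower power at the same N).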

If we try to change the correlation ρb,d from 0.8 to 0.9, G*Power shows an error message stating: 'The correlation matrix is not valid, that is, not positive semi-definite.' This indicates that the matrix Cp with ρ3,4 changed to 0.9 is not a possible correlation matrix.

### Special case: Common index

Assuming again the population correlation matrix Cp shown above, we want to perform an a priori analysis for the test of whether ρ1,3 = ρ2,3 or whether ρ1,3 > ρ2,3 holds. With respect to the notation used in G*Power we have the following identities: a = 3 (the common index), b = 1 (the index of the second data set entering the correlation assumed under H0, here ρa,b = ρ3,1), and c = 2 (the index of the remaining data set).

Thus, we get:

H0 correlation:
ρa,b = ρ3,1 = 0.4
H1 correlation
ρa,c = ρ3,2 = 0.2, and
ρb,c = ρ1,2 = 0.5.

For this effect size we want to calculate how large our sample size needs to be in order to achieve error levels of α = 0.05 and β = 0.2. We choose the procedure 'Correlations: Two dependent Pearson r's (common index)' and set:

#### Select

Type of power analysis: A priori

#### Input

Tail(s): one
H1 corr ρ_ac: 0.2
α err prob: 0.05
Power (1-β err prob): 0.8
H0 Corr ρ_ab: 0.4
Corr ρ_bc: 0.5

#### Output

Critical z : 1.644854
Sample Size: 144
Actual Power: 0.801161

The answer is that we need sample sizes of 144 in each group (i.e. the correlations are calculated between data vectors of length N = 144).
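As in the general case, this result can be reproduced (up to rounding) from the asymptotic theory in the implementation notes, here using the simplified common-index covariance from Eq. (3) in Steiger (1980). Again we assume that the H0 covariance is evaluated with both correlations set to the H0 value ρa,b; function names are ours:

```python
import math

def norm_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def fisher_z(r):
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def cov_common_index(ab, ac, bc):
    # Common-index covariance (Steiger, 1980, Eq. 3), rescaled to the
    # covariance of the Fisher-z values.
    psi = (bc * (1.0 - ab**2 - ac**2)
           - 0.5 * ab * ac * (1.0 - ab**2 - ac**2 - bc**2))
    return psi / ((1.0 - ab**2) * (1.0 - ac**2))

def power_common_index(ab, ac, bc, n, z_crit=1.644854):
    # One-sided test of H0: rho_ab = rho_ac against H1: rho_ab > rho_ac.
    c0 = cov_common_index(ab, ab, bc)  # assumption: H0 at rho_ac = rho_ab
    c1 = cov_common_index(ab, ac, bc)  # H1
    s0 = math.sqrt((2.0 - 2.0 * c0) / (n - 3.0))
    s1 = math.sqrt((2.0 - 2.0 * c1) / (n - 3.0)) / s0
    m1 = (fisher_z(ab) - fisher_z(ac)) / s0
    return 1.0 - norm_cdf((z_crit - m1) / s1)

# Example values from above; result is close to the reported 0.801161
print(power_common_index(0.4, 0.2, 0.5, 144))
```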

### Sensitivity analyses

We now assume a scenario that is identical to that described above, with the exception that ρb,c = −0.6. We want to know the minimum H1 correlation ρa,c that we can detect with α = 0.05 and β = 0.2 given a sample size N = 144. In this sensitivity analysis, we have in general two possible solutions, namely one for ρa,c ≤ ρa,b and one for ρa,c ≥ ρa,b. The relevant settings for the former case are:

#### Select

Type of power analysis: Sensitivity

#### Input

Tail(s): one
Effect direction: ρ_ac ≤ ρ_ab
α err prob: 0.05
Power (1-β err prob): 0.8
Sample Size: 144
H0 Corr ρ_ab: 0.4
Corr ρ_bc: -0.6

#### Output

Critical z : -1.644854
H1 corr ρ_ac: 0.047702

The result is that the error levels are as requested or lower if the H1 correlation ρa,c is equal to or lower than 0.047702.

We now try to find the corresponding H1 correlation that is larger than ρa,b = 0.4. To this end, we change the effect size direction in the settings shown above, that is, we choose ρa,c ≥ ρa,b. In this case, however, G*Power shows an error message indicating that no solution was found. The reason is that there is no H1 correlation ρa,c ≥ ρa,b that leads to a valid (i.e. positive semi-definite) correlation matrix and simultaneously ensures the requested error levels. To indicate a missing result, the output for the H1 correlation is set to the nonsensical value 2.

In both the general case with 'no common index' and the special 'common index' case, G*Power checks whether the correlation matrix is valid and shows an H1 correlation of 2 if no solution is found. This also holds in the X-Y plot if the H1 correlation is the dependent variable: combinations of input values for which no solution can be found show up with the nonsensical value 2.

## Related tests

Correlation: Bivariate normal model
Correlations: Point biserial model
Correlations: Two independent Pearson r 's

## Implementation notes

### Background

Let X1, . . ., XK denote multinormally distributed random variables with mean vector µ and covariance matrix C. A sample of size N from this K-dimensional distribution leads to an N × K data matrix, and pair-wise correlation of all columns to a K × K correlation matrix. By drawing M samples of size N one can compute M such correlation matrices, and one can determine the variances σ²a,b of the sample of M correlation coefficients ra,b, and the covariances σa,b;c,d between samples of size M of two different correlations ra,b, rc,d. For M → ∞, the elements of Ψ, which denotes the asymptotic variance-covariance matrix of the correlations times N, are given by [see Eqns (1) and (2) in Steiger (1980)]:

Ψa,b;a,b = (1 − ρ²a,b)²,
Ψa,b;c,d = ½ ρa,b ρc,d (ρ²a,c + ρ²a,d + ρ²b,c + ρ²b,d) + ρa,c ρb,d + ρa,d ρb,c − (ρa,b ρa,c ρa,d + ρa,b ρb,c ρb,d + ρc,d ρa,c ρb,c + ρc,d ρa,d ρb,d).

When two correlations have an index in common, the expression given for Ψa,b;c,d simplifies to [see Eq (3) in Steiger (1980)]:

Ψa,b;a,c = ρb,c (1 − ρ²a,b − ρ²a,c) − ½ ρa,b ρa,c (1 − ρ²a,b − ρ²a,c − ρ²b,c).

If the raw sample correlations ra,b are transformed by the Fisher r-to-z transform to za,b = 0.5 ln((1 + ra,b)/(1 − ra,b)), then the elements of the variance-covariance matrix times (N − 3) of the transformed raw correlations are [see Eqs. (9)-(11) in Steiger (1980)]:

ca,b;a,b = 1,
ca,b;c,d = Ψa,b;c,d / [(1 − ρ²a,b)(1 − ρ²c,d)],
ca,b;a,c = Ψa,b;a,c / [(1 − ρ²a,b)(1 − ρ²a,c)].

### Test statistics

The test statistics proposed by Dunn and Clark (1969) are [see Eqns (11) and (12) in Steiger (1980)]:

Z1⋆ = (za,b − za,c) √(N − 3) / √(2 − 2 sa,b;a,c),
Z2⋆ = (za,b − zc,d) √(N − 3) / √(2 − 2 sa,b;c,d),

where sa,b;a,c and sa,b;c,d denote sample estimates of the covariances ca,b;a,c and ca,b;c,d between the transformed correlations, respectively.

Note:
1. The SD of the difference za,b − za,c given in the denominator of the formula for Z1⋆ depends on the values of ρa,b and ρa,c; the same holds analogously for Z2⋆.
2. The only difference between Z2⋆ and the z-statistic used for independent correlations is that in the latter the covariance sa,b;c,d is assumed to be zero.
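Note 2 can be illustrated directly. A sketch of Z2⋆ (the function name is ours), where the covariance estimate s is obtained by plugging sample correlations into the asymptotic formulas; with s = 0 the statistic reduces to the familiar z test for two independent correlations with equal group sizes:

```python
import math

def fisher_z(r):
    return 0.5 * math.log((1.0 + r) / (1.0 - r))

def z2_star(r_ab, r_cd, s_ab_cd, n):
    # Dunn & Clark (1969) statistic for H0: rho_ab = rho_cd;
    # s_ab_cd is the sample estimate of the covariance of z_ab and z_cd.
    return ((fisher_z(r_ab) - fisher_z(r_cd)) * math.sqrt(n - 3.0)
            / math.sqrt(2.0 - 2.0 * s_ab_cd))

# With s_ab_cd = 0 this is the independent-correlations z statistic
print(round(z2_star(0.3, 0.1, 0.0, 103), 4))
```

A positive covariance shrinks the denominator and therefore increases the statistic, which is why taking dependencies into account can yield higher power.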

### Central and noncentral distributions in power calculations

In the general case 'without a common index' the H0 distribution is the standard normal distribution N(0, 1). The H1 distribution is the normal distribution N(m1, s1), with

s0 = √((2 − 2c0)/(N − 3)), with c0 = ca,b;c,d evaluated under H0, i.e. with ρa,b = ρc,d,
s1 = √((2 − 2c1)/(N − 3))/s0, with c1 = ca,b;c,d evaluated under H1, and
m1 = (za,b − zc,d)/s0.

For the special case with a 'common index' the H0 distribution is the standard normal distribution N(0, 1). The H1 distribution is the normal distribution N(m1 , s1), with

s0 = √((2 − 2c0)/(N − 3)), with c0 = ca,b;a,c evaluated under H0, i.e. with ρa,b = ρa,c,
s1 = √((2 − 2c1)/(N − 3))/s0, with c1 = ca,b;a,c evaluated under H1, and
m1 = (za,b − za,c)/s0.

## Validation

The results were checked against Monte-Carlo simulations.

## References

Dunn, O. J., & Clark, V. (1969). Correlation coefficients measured on the same individuals. Journal of the American Statistical Association, 64, 366-377.

Steiger, J. H. (1980). Tests for comparing elements of a correlation matrix. Psychological Bulletin, 87, 245-251.