Linear multiple regression: Random model
In multiple regression analyses, we are interested in the relation of a dependent variable Y to m independent factors X = (X1, ..., Xm). The present procedure refers to the so-called unconditional or random factors model of multiple regression (Gatsonis & Sampson, 1989; Sampson, 1974), that is, it is assumed that Y and X1, ..., Xm are random variables, where (Y, X1, ..., Xm) have a joint multivariate normal distribution with a positive definite covariance matrix:
    | σ²_Y   Σ′_YX |
    | Σ_YX   Σ_X   |
The population squared multiple correlation coefficient between Y and X is:
ρ²_YX = Σ′_YX Σ_X⁻¹ Σ_YX / σ²_Y
The regression coefficient vector γ, the analog of β in the fixed model, is given by γ = Σ_X⁻¹ Σ_YX. The maximum likelihood estimates of the regression coefficient vector and the residuals are the same under both the fixed and the random model (see Theorem 1 in Sampson, 1974); the models differ, however, with respect to power.
The present procedure allows power analyses for the test that the population squared multiple correlation coefficient ρ²_YX has the value ρ²_0. The null and alternative hypotheses are:
H0: ρ²_YX = ρ²_0
H1: ρ²_YX ≠ ρ²_0.
An important special case is ρ_0 = 0 (corresponding to the assumption that Σ_YX = 0). A commonly used test statistic for this case is F = [(N − m − 1)/m] R²_YX / (1 − R²_YX), which has a central F distribution with df1 = m and df2 = N − m − 1 under H0. This is the same test statistic as that used in the fixed model. The power differs, however.
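As a small illustration of the test statistic for the special case ρ_0 = 0, the following Python sketch converts an observed R² into the F statistic and back. The function names are hypothetical and not part of G*Power, and the input values (R² = 0.3, N = 50, m = 5) are assumed purely for illustration.

```python
def f_statistic(r2: float, n: int, m: int) -> float:
    """F = [(N - m - 1)/m] * R^2 / (1 - R^2) for m predictors and N cases."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    if n - m - 1 <= 0:
        raise ValueError("need N > m + 1")
    return (n - m - 1) / m * r2 / (1.0 - r2)


def r2_from_f(f: float, n: int, m: int) -> float:
    """Inverse mapping: R^2 = m F / (m F + N - m - 1)."""
    return m * f / (m * f + (n - m - 1))


if __name__ == "__main__":
    n, m = 50, 5                      # assumed illustrative values
    f = f_statistic(0.3, n=n, m=m)
    print(f"F = {f:.4f} with df1 = {m}, df2 = {n - m - 1}")
    print(f"back-transformed R^2 = {r2_from_f(f, n=n, m=m):.4f}")
```

The inverse mapping is handy for translating a critical F value into a critical R², which is the form in which the critical values are reported in the output of this procedure.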
Effect size index
The effect size is the population squared correlation coefficient H1 ρ² under the alternative hypothesis. To fully specify the effect size, you also need to specify the population squared correlation coefficient H0 ρ² under the null hypothesis. Pressing the Determine button on the left of the effect size label in the main window opens the effect size drawer, which may be used to calculate ρ² either from the confidence interval for the population ρ²_YX given an observed squared multiple correlation R², or from predictor correlations.
Effect size from C.I.
The From confidence interval option determines H1 ρ² from the confidence interval computed for an observed R². You have to specify the total sample size, the number of predictors, the observed R² and the confidence level (1 − α) of the confidence interval. In the remaining input field, a relative position inside the confidence interval can be specified that determines the H1 ρ² value. This value can range from 0 to 1, where 0, 0.5 and 1 correspond to the left, central and right positions inside the interval, respectively.
The C.I. lower ρ² and C.I. upper ρ² output fields contain the left and right border of the two-sided 100(1 − α) percent confidence interval for ρ². The Statistical lower bound and Statistical upper bound output fields show the one-sided (0, R) and (L, 1) intervals, respectively.
Effect size from predictor correlations
By choosing the From predictor correlation matrix option one may compute ρ² from the matrix of correlations among the predictor variables and the correlations between predictors and the dependent variable Y. Pressing the Insert/edit matrix button opens a window in which one can specify
- the row vector u containing the correlations between each of the m predictors Xi and the dependent (or outcome) variable Y and
- the m × m matrix B of correlations among the predictors (see below).


Relation of ρ² to effect size f²
The relation between ρ² and the effect size f² used in the fixed factors model is:
f² = ρ²/(1 − ρ²)
and conversely:
ρ² = f²/(1 + f²)
Cohen (1988, p. 412) defines the following conventional values for the effect size f²:
small f² = 0.02
medium f² = 0.15
large f² = 0.35
which translate into the following values for ρ²:
small ρ² = 0.02
medium ρ² = 0.13
large ρ² = 0.26
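The two conversion formulas can be checked with a few lines of Python. This is only a sketch; the function and variable names are hypothetical. Rounded to two decimals, the loop reproduces the ρ² values listed above.

```python
def f2_to_rho2(f2: float) -> float:
    """rho^2 = f^2 / (1 + f^2)."""
    if f2 < 0.0:
        raise ValueError("f^2 must be non-negative")
    return f2 / (1.0 + f2)


def rho2_to_f2(rho2: float) -> float:
    """f^2 = rho^2 / (1 - rho^2)."""
    if not 0.0 <= rho2 < 1.0:
        raise ValueError("rho^2 must lie in [0, 1)")
    return rho2 / (1.0 - rho2)


# Conventional f^2 values quoted in the text
COHEN_F2 = {"small": 0.02, "medium": 0.15, "large": 0.35}

if __name__ == "__main__":
    for label, f2 in COHEN_F2.items():
        print(f"{label:6s} f2 = {f2:.2f} -> rho2 = {f2_to_rho2(f2):.2f}")
```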
Options
You can switch between an exact procedure for the calculation of the distribution of the sample squared multiple correlation coefficient R² and a three-moment F approximation suggested by Lee (1971, p. 123). The latter is slightly faster and may be used to check the results of the exact routine.
Examples
Power and sample size: Example 1
We replicate an example given for the procedure for the fixed model, but now under the assumption that the predictors are not fixed but random samples: We assume that a dependent variable Y is predicted by a set B of 5 predictors and that ρ²_YX is 0.10, that is, that the 5 predictors account for 10% of the variance of Y. The sample size is N = 95 subjects. What is the power of the F test that ρ²_YX = 0 at α = 0.05?
We choose the following settings in G*Power:
Select
Type of power analysis: Post hoc
Input parameters
Tail(s): One
H1 ρ²: 0.1
H0 ρ²: 0.0
α err prob: 0.05
Total sample size: 95
Number of predictors: 5
Output
Lower critical R²: 0.115170
Upper critical R²: 0.115170
Power (1 − β): 0.662627
The output shows that the power of this test is about 0.663, which is slightly lower than the power of 0.674 found in the fixed model. This observation holds in general: The power in the random model is never larger than that found for the same scenario in the fixed model.
Power and sample size: Example 2
We now replicate the test of the hypotheses H0: ρ² ≤ 0.3 versus H1: ρ² > 0.3 given in Shieh and Kung (2007, p. 733), for N = 100, α = 0.05, and m = 5 predictors. We assume that H1 ρ² = 0.4. The settings and output in this case are:

Select
Type of power analysis: Post hoc
Input parameters
Tail(s): One
H1 ρ²: 0.4
H0 ρ²: 0.3
α err prob: 0.05
Total sample size: 100
Number of predictors: 5
Output
Lower critical R²: 0.456625
Upper critical R²: 0.456625
Power (1 − β): 0.346482
The results show that H0 should be rejected if the observed R² is larger than 0.457. The power of the test is about 0.346.
Assume that we observed R² = 0.5. To calculate the associated p-value we may use the G*Power calculator. The syntax for obtaining the CDF of the squared sample multiple correlation coefficient is mr2cdf(R², ρ², m+1, N). Thus for the present case we insert 1-mr2cdf(0.5,0.3,6,100) in the calculator. Hitting Calculate gives 0.01278. These values replicate those given in Shieh and Kung (2007).
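For readers who want to see what such a CDF computation might look like outside G*Power, here is a minimal Python sketch. It assumes the standard series representation of the distribution of R² under joint normality: a negative binomial mixture of beta distributions with weights Γ(r+k)/(k! Γ(r)) ρ^(2k) (1 − ρ²)^r, r = (N − 1)/2, and beta parameters m/2 + k and (N − m − 1)/2. It is not G*Power's implementation. The names r2_cdf and reg_inc_beta are hypothetical, the series is summed naively from k = 0, and r2_cdf takes the number of predictors m directly rather than m + 1 as mr2cdf does.

```python
import math


def _guard(v: float, tiny: float = 1e-300) -> float:
    """Keep denominators in the Lentz recursion away from zero."""
    return v if abs(v) > tiny else tiny


def _betacf(a: float, b: float, x: float, max_iter: int = 1000, eps: float = 1e-14) -> float:
    """Continued fraction for the incomplete beta function (modified Lentz method)."""
    c = 1.0
    d = 1.0 / _guard(1.0 - (a + b) * x / (a + 1.0))
    h = d
    for i in range(1, max_iter + 1):
        i2 = 2 * i
        # even-numbered term of the continued fraction
        t = i * (b - i) * x / ((a - 1.0 + i2) * (a + i2))
        d = 1.0 / _guard(1.0 + t * d)
        c = _guard(1.0 + t / c)
        h *= d * c
        # odd-numbered term
        t = -(a + i) * (a + b + i) * x / ((a + i2) * (a + 1.0 + i2))
        d = 1.0 / _guard(1.0 + t * d)
        c = _guard(1.0 + t / c)
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a: float, b: float, x: float) -> float:
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
                     + a * math.log(x) + b * math.log1p(-x))
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b


def r2_cdf(r2: float, rho2: float, m: int, n: int,
           tol: float = 1e-10, max_terms: int = 100000) -> float:
    """P(R^2 <= r2) for m random predictors, sample size n, population rho^2."""
    a, b, r = m / 2.0, (n - m - 1) / 2.0, (n - 1) / 2.0
    if rho2 == 0.0:                      # mixture collapses to a single beta
        return reg_inc_beta(a, b, r2)
    total = weight_sum = 0.0
    for k in range(max_terms):
        # negative binomial weight, assembled on the log scale
        w = math.exp(math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
                     + k * math.log(rho2) + r * math.log1p(-rho2))
        total += w * reg_inc_beta(a + k, b, r2)
        weight_sum += w
        if 1.0 - weight_sum < tol:       # remaining terms cannot add more than tol
            break
    return total


if __name__ == "__main__":
    # Example 2: N = 100, m = 5, H0 rho^2 = 0.3, observed R^2 = 0.5
    p_value = 1.0 - r2_cdf(0.5, 0.3, m=5, n=100)
    # Power at H1 rho^2 = 0.4, using the critical R^2 from the output above
    power = 1.0 - r2_cdf(0.456625, 0.4, m=5, n=100)
    print(f"p-value = {p_value:.5f}  (text reports 0.01278)")
    print(f"power   = {power:.6f}  (text reports 0.346482)")
```

Summing from k = 0 is the simplest choice and is adequate for moderate N; for very large samples an algorithm that starts near the largest weight would be more economical.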

Power and sample size: Example 3
We now ask for the minimum sample size required for testing the hypothesis H0: ρ² ≥ 0.2 vs. the specific alternative hypothesis H1: ρ² = 0.05 with 5 predictors to achieve a power of 1 − β = 0.9 at α = 0.05 (Example 2 in Shieh & Kung, 2007). The inputs and outputs are:
Select
Type of power analysis: A priori
Input parameters
Tail(s): One
H1 ρ²: 0.05
H0 ρ²: 0.2
α err prob: 0.05
Power (1 − β): 0.9
Number of predictors: 5
Output
Lower critical R²: 0.132309
Upper critical R²: 0.132309
Total sample size: 153
Actual power: 0.901051
The results show that N should not be less than 153. This confirms the results in Shieh and Kung (2007).
Using confidence intervals to determine the effect size
Let us assume that in a regression analysis with 5 predictors and N = 50 we observed a squared multiple correlation coefficient R² = 0.3. Let us assume further that we want to use the lower boundary of the 95% confidence interval for ρ² as H1 ρ².
Pressing the Determine button next to the effect size field in the main window opens the effect size drawer. After selecting From confidence interval we specify
Total sample size: 50
Number of predictors: 5
Observed R²: 0.3
Confidence level: 0.95
Finally, we set Rel C.I. pos to use (0=left, 1=right) to 0 to select the left interval border. Pressing Calculate computes the 95% two-sided confidence interval [0.0337, 0.4606] as well as the lower and upper bounds [0, 0.4245] and [0.0589, 1]. The left boundary of the two-sided interval (0.0337) is transferred to the H1 ρ² field.
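The role of the relative position setting can be sketched in a few lines of Python. The sketch assumes that positions between the two borders are obtained by linear interpolation. The text only fixes the meaning of 0, 0.5 and 1 (left, central, right), so the linear rule and the function name are assumptions, not documented G*Power behavior.

```python
def h1_rho2_from_interval(lower: float, upper: float, rel_pos: float) -> float:
    """Pick H1 rho^2 inside [lower, upper]; rel_pos = 0 is the left border, 1 the right.

    Linear interpolation between the borders is an assumption of this sketch.
    """
    if not 0.0 <= rel_pos <= 1.0:
        raise ValueError("relative position must lie in [0, 1]")
    if lower > upper:
        raise ValueError("lower border must not exceed upper border")
    return lower + rel_pos * (upper - lower)


if __name__ == "__main__":
    lower, upper = 0.0337, 0.4606   # two-sided 95% interval from the example
    for pos in (0.0, 0.5, 1.0):
        print(f"rel. pos {pos:.1f} -> H1 rho2 = {h1_rho2_from_interval(lower, upper, pos):.4f}")
```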
Using predictor correlations to determine the effect size
We may use assumptions about the (m × m) correlation matrix between a set of m predictors, and the m correlations between predictor variables and the dependent variable Y, to determine ρ². Pressing the Determine button next to the effect size field in the main window opens the effect size drawer.
After selecting From predictor correlations, we insert the number of predictors in the corresponding field and press Insert/edit matrix. This opens an input dialog (see above).
Suppose that we have 4 predictors and that the 4 correlations between Xi and Y are u = (0.3, 0.1, −0.2, 0.2). We select Corr between predictors and outcome and then insert these values.
Assume further that the correlations between X1 and X3 and between X2 and X4 are 0.5 and 0.2, respectively, whereas all other predictor pairs are uncorrelated. We select Corr between predictors and insert the correlation matrix
    | 1    0    0.5  0   |
B = | 0    1    0    0.2 |
    | 0.5  0    1    0   |
    | 0    0.2  0    1   |
Pressing the Calc ρ² button computes ρ² = u B⁻¹ u′ = 0.297083, which also confirms that B is positive-definite and thus a valid correlation matrix.
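The computation ρ² = u B⁻¹ u′ can be reproduced with a short Python sketch. It uses a Cholesky factorization, which fails exactly when B is not positive definite, so the same routine doubles as the validity check mentioned above. The function names are hypothetical. The matrix B is built from the correlations described in the example (0.5 between X1 and X3, 0.2 between X2 and X4, zero elsewhere).

```python
import math
from typing import List, Sequence


def cholesky(b: Sequence[Sequence[float]]) -> List[List[float]]:
    """Lower-triangular L with B = L L'; raises ValueError if B is not positive definite."""
    n = len(b)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                pivot = b[i][i] - s
                if pivot <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][j] = math.sqrt(pivot)
            else:
                low[i][j] = (b[i][j] - s) / low[j][j]
    return low


def rho2_from_correlations(u: Sequence[float], b: Sequence[Sequence[float]]) -> float:
    """rho^2 = u B^-1 u', computed as |z|^2 where L z = u' (forward substitution)."""
    low = cholesky(b)
    z: List[float] = []
    for i, u_i in enumerate(u):
        s = sum(low[i][k] * z[k] for k in range(i))
        z.append((u_i - s) / low[i][i])
    return sum(v * v for v in z)


if __name__ == "__main__":
    u = [0.3, 0.1, -0.2, 0.2]            # correlations between each Xi and Y
    B = [[1.0, 0.0, 0.5, 0.0],           # correlations among the predictors
         [0.0, 1.0, 0.0, 0.2],
         [0.5, 0.0, 1.0, 0.0],
         [0.0, 0.2, 0.0, 1.0]]
    print(f"rho^2 = {rho2_from_correlations(u, B):.6f}")  # 0.297083, as in the text
```

Solving L z = u′ instead of inverting B avoids forming B⁻¹ explicitly, which is the usual choice for small symmetric systems like this one.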
Related tests
Linear Multiple Regression: Fixed model, deviation of R² from zero
Linear Multiple Regression: Fixed model, increase of R²
Implementation notes
The procedure uses the exact sampling distribution of the squared multiple correlation coefficient (MRC distribution; Lee, 1971, 1972). The parameters of this distribution are the population squared multiple correlation coefficient ρ², the number of predictors m, and the sample size N. The only difference between the H0 and H1 distribution is that the population squared multiple correlation coefficient is set to H0 ρ² in the former and to H1 ρ² in the latter case.
Several algorithms for the computation of the exact or approximate CDF of the sampling distribution have been proposed (Benton & Krishnamoorthy, 2003; Ding, 1996; Ding & Bargmann, 1991; Lee, 1971, 1972). Benton and Krishnamoorthy (2003) have shown that the implementation proposed by Ding and Bargmann (1991) (that is used in Dunlap, Xin, & Myers, 2004) may produce grossly false results in some cases. The implementation of Ding (1996) has the disadvantage that it overflows for large sample sizes, because factorials occurring in ratios are explicitly evaluated. This can easily be avoided by using the log of the gamma function in the computation instead.
In G*Power we use the procedure of Benton and Krishnamoorthy (2003) to compute the exact CDF and a modified version of the procedure given in Ding (1996) to compute the exact PDF of the distribution. Optionally, one can choose to use the 3-moment noncentral F approximation proposed by Lee (1971) to compute the CDF. The latter procedure has also been used by Steiger and Fouladi (1992) in their R2 program, which provides similar functionality.
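The overflow problem mentioned in these notes, and the log-gamma remedy, can be illustrated with a generic Python sketch. It evaluates a ratio of gamma functions of the kind that appears in series representations of the distribution, here the negative binomial weight Γ(r+k)/(k! Γ(r)) ρ^(2k) (1 − ρ²)^r with r = (N − 1)/2, once directly and once on the log scale. This is not the code of Ding (1996) or of G*Power; the function names and the values N = 2000 and ρ² = 0.3 are assumptions chosen for illustration.

```python
import math


def nb_weight_naive(k: int, n: int, rho2: float) -> float:
    """Weight evaluated directly; the gamma functions overflow for large n."""
    r = (n - 1) / 2.0
    return (math.gamma(r + k) / (math.factorial(k) * math.gamma(r))
            * rho2 ** k * (1.0 - rho2) ** r)


def nb_weight_log(k: int, n: int, rho2: float) -> float:
    """Same weight, assembled on the log scale with math.lgamma."""
    r = (n - 1) / 2.0
    log_w = (math.lgamma(r + k) - math.lgamma(k + 1) - math.lgamma(r)
             + k * math.log(rho2) + r * math.log1p(-rho2))
    return math.exp(log_w)


if __name__ == "__main__":
    n, rho2 = 2000, 0.3                  # assumed illustrative values
    try:
        nb_weight_naive(400, n, rho2)
    except OverflowError as exc:
        print("direct evaluation failed:", exc)
    total = sum(nb_weight_log(k, n, rho2) for k in range(3000))
    print(f"log-gamma weights sum to {total:.10f}")
```

For small samples both functions agree; the log-scale version simply keeps working when the individual gamma values no longer fit into a double.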
Validation
The power and sample size results were checked against the values produced by R2 (Steiger & Fouladi, 1992), the tables in Gatsonis and Sampson (1989), and results reported in Dunlap et al. (2004) and in Shieh and Kung (2007). Slight deviations from the values computed with R2 were found. These deviations are due to the approximation used in R2, whereas complete correspondence was found in all other tests. The confidence intervals were checked against values computed in R2, the results reported in Shieh and Kung (2007), and the tables given in Mendoza and Stafford (2001).
References
Benton, D., & Krishnamoorthy, K. (2003). Computing discrete mixtures of continuous distributions: noncentral chisquare, noncentral t and the distribution of the square of the sample multiple correlation coefficient. Computational Statistics & Data Analysis, 43, 249-267.
Ding, C. G. (1996). On the computation of the distribution of the square of the sample multiple correlation coefficient. Computational Statistics & Data Analysis, 22, 345-350.
Ding, C. G., & Bargmann, R. E. (1991). Algorithm AS 260: Evaluation of the distribution of the square of the sample multiple correlation coefficient. Applied Statistics, 40, 195-198.
Dunlap, W. P., Xin, X., & Myers, L. (2004). Computing aspects of power for multiple regression. Behavior Research Methods, Instruments & Computers, 36, 695-701.
Gatsonis, C., & Sampson, A. R. (1989). Multiple correlation: Exact power and sample size calculations. Psychological Bulletin, 106, 516-524.
Lee, Y. (1971). Some results on the sampling distribution of the multiple correlation coefficient. Journal of the Royal Statistical Society. Series B (Methodological), 33, 117-130.
Lee, Y. (1972). Tables of the upper percentage points of the multiple correlation coefficient. Biometrika, 59, 179-189.
Mendoza, J., & Stafford, K. (2001). Confidence interval, power calculation, and sample size estimation for the squared multiple correlation coefficient under the fixed and random regression models: A computer program and useful standard tables. Educational & Psychological Measurement, 61, 650-667.
Sampson, A. R. (1974). A tale of two regressions. Journal of the American Statistical Association, 69, 682-689.
Shieh, G., & Kung, C.-F. (2007). Methodological and computational considerations for multiple correlation analysis. Behavior Research Methods, 39, 731-734.
Steiger, J. H., & Fouladi, R. T. (1992). R2: A computer program for interval estimation, power calculations, sample size estimation, and hypothesis testing in multiple regression. Behavior Research Methods, Instruments, & Computers, 24, 581-582.
Last modified: 06.12.2009, 22:16

