Correlation: Point biserial model
The point biserial correlation is a measure of association between a continuous variable X and a binary variable Y, the latter of which takes on the values 0 and 1. It is assumed that the continuous variable X is normally distributed at Y = 0 and at Y = 1, with means μ0 and μ1 and equal standard deviation σ. If π is the proportion of values with Y = 1, then the point biserial correlation coefficient is defined as:
ρ = ((μ1 − μ0) √(π(1 − π)))/σx
where σx² = σ² + π(1 − π)(μ1 − μ0)² is the variance of X across both groups; for equally sized groups (π = 1/2) this reduces to σx² = σ² + (μ1 − μ0)²/4.
The point biserial correlation is identical to a Pearson correlation between two vectors x and y, where xi contains a value from X at Y = j, and yi = j codes the group from which the X was taken.
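The equivalence with the Pearson correlation can be checked numerically. The following is a minimal sketch using only the Python standard library; the data values and the function names `pearson_r` and `point_biserial_r` are hypothetical and chosen for illustration. The sample version of the definition is used here: the difference of the group means, scaled by √(p(1 − p)) and divided by the standard deviation of all x values (computed with divisor n).

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation computed from sums of squares and cross products."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def point_biserial_r(x, y):
    """Sample point biserial correlation; y must contain only 0 and 1."""
    x0 = [a for a, g in zip(x, y) if g == 0]
    x1 = [a for a, g in zip(x, y) if g == 1]
    p = len(x1) / len(x)          # proportion of observations with y = 1
    s_x = statistics.pstdev(x)    # standard deviation of all x (divisor n)
    mean_diff = statistics.fmean(x1) - statistics.fmean(x0)
    return mean_diff * math.sqrt(p * (1 - p)) / s_x


# Hypothetical illustration data: x is continuous, y codes the group.
x = [4.1, 5.3, 3.8, 6.0, 5.5, 7.2, 6.8, 4.9]
y = [0, 0, 0, 1, 0, 1, 1, 1]

r_pearson = pearson_r(x, y)
r_pb = point_biserial_r(x, y)
print(r_pearson, r_pb)  # the two values agree up to rounding error
```

Both functions return the same value for any data set in which y takes only the values 0 and 1 and both groups are non-empty.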
The statistical model is the same as that underlying a test for a difference in means μ0 and μ1 in two independent groups. The relation between the effect size d = (μ1 − μ0 )/σ used in that test and the point biserial correlation ρ considered here is given by:
ρ = d/√(d² + N²/(n1 n2))
where n1 and n2 denote the sizes of the two groups and N = n1 + n2. The power procedure refers to a t test used to evaluate the null hypothesis that there is no (point-biserial) correlation in the population (ρ = 0). The alternative hypothesis is that the correlation coefficient has a non-zero value r.
H0 : ρ = 0
H1 : ρ = r.
The two-sided ("two tailed") test should be used if there is no restriction on the sign of ρ under the alternative hypothesis. Otherwise you should use the one-sided ("one tailed") test.
Effect size index
The effect size index |ρ| is the absolute value of the correlation coefficient in the population as postulated in the alternative hypothesis. From this definition it follows that 0 ≤ |ρ| < 1. Cohen (1969, p. 79) defined the following effect size conventions for |ρ|:
small ρ = 0.1
medium ρ = 0.3
large ρ = 0.5
Pressing the Determine button on the left side of the effect size label opens the effect size drawer. You can use this drawer to calculate |ρ| from the coefficient of determination, r².

Options
This test has no options.
Examples
We want to know how many subjects it takes to detect ρ = 0.25 in the population, given α = β = 0.05. Thus, H0: ρ = 0, H1: ρ = 0.25.
Select
Type of power analysis: A priori
Input
Tail(s): One
Effect size |ρ|: 0.25
α err prob: 0.05
Power (1-β err prob): 0.95
Output
Noncentrality parameter δ: 3.306559
Critical t: 1.654314
df: 162
Total sample size: 164
Actual power: 0.950308
The results indicate that we need at least N = 164 subjects to ensure a power > 0.95. The actual power achieved with this N (0.950308) is slightly higher than the requested power.
To illustrate the connection to the two-group t test, we calculate the corresponding effect size d for equal sample sizes n1 = n2 = 82:
d = (N · ρ)/√(n1 · n2 · (1 − ρ²)) = (164 · 0.25)/√(82 · 82 · (1 − 0.25²)) = 0.51639778
Performing a power analysis for the one-sided two group t test with d = 0.51639778, n1 = n2 = 82, and α = 0.05 leads to exactly the same power as in the example above. If we assumed unequal sample sizes in both groups, for example, n1 = 64, n2 = 100, then we would compute a different value for d:
d = (N · ρ)/√(n1 · n2 · (1 − ρ²)) = (164 · 0.25)/√(100 · 64 · (1 − 0.25²)) = 0.52930772
but we would again arrive at the same power. It thus poses no restriction of generality that we only input the total sample size and not the individual group sizes in the t test for correlation procedure.
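The two conversions above can be reproduced with a short sketch. The function names are hypothetical. The noncentrality parameter of the two-group t test, δ = d · √(n1 · n2/(n1 + n2)), is not stated in this section and is used here as an assumption to show why both group allocations lead to the same power: the degrees of freedom (N − 2) and δ are identical in both cases.

```python
import math


def d_from_rho(rho, n1, n2):
    """Effect size d of the two-group t test implied by rho and the group sizes."""
    n = n1 + n2
    return n * rho / math.sqrt(n1 * n2 * (1 - rho ** 2))


def rho_from_d(d, n1, n2):
    """Inverse conversion: point biserial rho implied by d and the group sizes."""
    n = n1 + n2
    return d / math.sqrt(d ** 2 + n ** 2 / (n1 * n2))


def delta_two_group(d, n1, n2):
    """Noncentrality parameter of the two-group t test (assumed formula, see lead-in)."""
    return d * math.sqrt(n1 * n2 / (n1 + n2))


rho = 0.25
d_equal = d_from_rho(rho, 82, 82)      # equal group sizes
d_unequal = d_from_rho(rho, 64, 100)   # unequal group sizes

delta_equal = delta_two_group(d_equal, 82, 82)
delta_unequal = delta_two_group(d_unequal, 64, 100)
delta_corr = math.sqrt(rho ** 2 * 164 / (1 - rho ** 2))  # correlation test, N = 164

print(d_equal, d_unequal)
print(delta_equal, delta_unequal, delta_corr)
```

The values of d differ between the two allocations, but the three noncentrality parameters coincide, which is why only the total sample size matters for the correlation procedure.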
Related tests
Correlation: Bivariate normal model
Correlations: Two dependent Pearson r's
Correlations: Two independent Pearson r's
Implementation notes
The H0 distribution is the central t distribution with df = N − 2. The H1 distribution is the noncentral t distribution with df = N − 2 and noncentrality parameter δ, where
δ = √((|ρ|² · N)/(1 − |ρ|²)).
N represents the total sample size and |ρ| represents the effect size index as defined above.
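The power computation described in these notes can be sketched in plain Python. This is an illustrative approximation, not the program's actual implementation: the tail probability of the (non)central t distribution is obtained by Simpson integration over the scaled chi variable S = √(χ²/df), and the critical value by bisection. The function names, the integration range and the step count are assumptions of this sketch; a statistics library with a noncentral t distribution would normally be used instead. The sketch assumes a one-sided test, df ≥ 2 and α < 0.5.

```python
import math


def t_upper_tail(t, df, delta=0.0, steps=2000, s_max=4.0):
    """P(T > t) for T = (Z + delta) / S with Z ~ N(0, 1) and S = sqrt(chi2_df / df)."""
    half = df / 2.0
    log_norm = half * math.log(2.0) + math.lgamma(half)

    def integrand(s):
        if s <= 0.0:
            return 0.0  # the density of S vanishes at 0 for df >= 2
        v = df * s * s  # chi-square value that corresponds to S = s
        log_density = (half - 1.0) * math.log(v) - v / 2.0 - log_norm
        density_s = math.exp(log_density) * 2.0 * df * s  # density of S at s
        # P(Z + delta > t * s), written with the complementary error function
        tail = 0.5 * math.erfc((t * s - delta) / math.sqrt(2.0))
        return tail * density_s

    h = s_max / steps  # steps must be even for Simpson's rule
    total = integrand(0.0) + integrand(s_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3.0


def critical_t(alpha, df):
    """One-sided critical value: bisection on the central t upper tail."""
    lo, hi = 0.0, 50.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if t_upper_tail(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def power_one_sided(rho, n, alpha):
    """Return (delta, critical t, power) for the one-sided test of rho = 0."""
    df = n - 2
    delta = math.sqrt(rho ** 2 * n / (1 - rho ** 2))
    t_crit = critical_t(alpha, df)
    return delta, t_crit, t_upper_tail(t_crit, df, delta)


delta, t_crit, power_164 = power_one_sided(0.25, 164, 0.05)
_, _, power_163 = power_one_sided(0.25, 163, 0.05)
print(delta, t_crit, power_164, power_163)
```

With the inputs of the example above (|ρ| = 0.25, α = 0.05, N = 164), the printed values can be compared with the noncentrality parameter, critical t and actual power listed in the Output; N = 163 is included to check that it falls short of the requested power.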
Validation
The results were checked against the values produced by GPower 2.0.
References
Cohen, J. (1969). Statistical power analysis for the behavioral sciences. New York, NY: Academic Press.
Last modified: 12.05.2009, 16:23

