# Poisson regression

A Poisson regression model describes the relationship between a Poisson distributed response variable Y (a count) and one or more independent variables (covariates) Xi, which are themselves random variables with probability density fX(x).

The probability of y events during a ﬁxed 'exposure time' t is:

Pr(Y = y|λ, t) = (e−λt t)y)/y!

It is assumed that the parameter λ of the Poisson distribution, the mean incidence rate of an event during exposure time t, is a function of the Xi's. In the Poisson regression model considered here, λ depends on the covariates Xi in the following way:

λ = exp(β0 + β1 X1 + β2 X2 + ... + βmXm)
where β0 ,..., βm denote regression coefﬁcients that are estimated from the data (Frome, 1986).

In a simple Poisson regression with just one covariate X1, the procedure implemented in G*Power estimates the power of the score test which is applied to decide whether covariate X1 has an inﬂuence on the event rate or not. The null and alternative hypothesis for a two-sided test are:

H0 : β1 = 0
H1 : β1 ≠ 0.
The standard normally distributed test statistic of the score test is:

z = βˆ1/√[var(βˆ1)] = βˆ1/SE(βˆ1)
where βˆ1 is the maximum likelihood estimator for parameter β1 and var(βˆ1) is the variance of this estimate.

In a multiple Poisson regression model: λ = exp(β0 + β1 X1 + ... + βmXm), m > 1, the effect of a speciﬁc covariate in the presence of other covariates is tested. In this case the null and alternative hypotheses are:

H0 : [β1 , β2 , . . . , βm] = [0, β2 , . . . , βm]
H1 : [β1 , β2 , . . . , βm] = [β1*, β2 , . . . , βm]
where β1* > 0.

## Effect size index

The effect size is speciﬁed by the ratio R of λ under H1 to λ under H0:

R = exp(β0 + β1 X1)/exp(β0) = exp β1 X1
The following inputs are needed to fully specify the effect size:

### Exp(β1)

This is the value of the λ ratio R deﬁned above for X = 1, that is, the relative increase of the event rate over the base event rate exp(β0) assumed under H0 if X is increased one unit. If, for instance, a 10% increase over the base rate is assumed if X is increased by one unit, this value is set to (100+10)/100 = 1.1. An input of exp(β1) = 1 corresponds to "no effect" and must not be used in a priori calculations.

### Base rate exp(β0).

This is the mean event rate assumed under H0. It must be greater than 0.

### Mean exposure.

This is the time unit during which the events are counted. It must be greater than 0.

### R2 other X.

In models with more than one covariate, the inﬂuence of the other covariates X2, . . . , Xp on the power of the test can be taken into account by using a correction factor. This factor depends on the proportion R2 = ρ21·23...p of the variance of X1 explained by the regression relationship with X2, . . . , Xp. If N is the sample size considering X1 alone, then the sample size in a setting with additional covariates is: N' = N/(1 − R2). This correction for the inﬂuence of other covariates is identical to that proposed by Hsieh et al. (1998) for the logistic regression. In line with the interpretation of R2 as squared correlation it must lie in the interval [0, 1].

### X distribution:

There a 7 options for the distribution of the Xi .
1. Binomial [P(k) = n!/(k! · (n - k)!) πk (1 − π)N −k, where k is the number of successes (X = 1) in N trials of a Bernoulli process with probability of success π, 0 < π < 1 ]
2. Exponential [f(x) = (1/λ)e−1/λ , exponential distribution with parameter λ > 0]
3. Lognormal [f(x) = 1/(xσ√(2π)) exp[−(ln x − µ)2/(2σ2)], lognormal distribution with parameters µ and σ > 0.]
4. Normal [f(x)= 1/(σ√(2π)) exp[−(x − µ)2/(2σ2)], normal distribution with parameters µ and σ > 0)
5. Poisson (P(X = k) = (λk/k!)e−λ, Poisson distribution with parameter λ > 0)
6. Uniform (f(x) = 1/(ba) for axb, f(x) = 0 otherwise, continuous uniform distribution in the interval [a, b], a < b)
7. Manual (allows to specify manually the variance of βˆ under H0 and H1)
G*Power provides two different types of procedure to calculate power: An enumeration procedure and large sample approximations. The Manual mode is available only in the large sample procedures.

## Options

G*Power provides two different types of procedure to estimate power. An enumeration procedure proposed by Lyles, Lin, and Williamson (2007) and large sample approximations.

The enumeration procedure seems to provide reasonably accurate results over a wide range of situations, but it can be rather slow and may need large amounts of computer memory. The large sample approximations are much faster.

Results of Monte-Carlo simulations indicate that the accuracy of the procedures proposed by Demidenko (2007) and Hsieh et al. (1998) are comparable to that of the enumeration procedure for N > 200. The procedure base on the work of Demidenko (2007) is more general and slightly more accurate than that proposed by Hsieh et al. (1998). We thus recommend to use the procedure proposed by Demidenko (2007) as the standard procedure.

The enumeration procedure of Lyles et al. (2007) may be used to validate the results (if the sample size is not too large). It must also be used if one wants to compute the power for likelihood ratio tests. The procedure of Signorini (1991) is problematic and should not be used; it is only included to allow checks of published results referring to this widespread procedure.

The enumeration procedure provides power analyses for the Wald test and the Likelihood ratio test. The general idea is to construct an exemplary data set with weights that represent response probabilities given the assumed values of the parameters of the X distribution. Then a ﬁt procedure for the generalized linear model is used to estimate the variance of the regression weights (for Wald tests) or the likelihood ratio under H0 and H1 (for likelihood ratio tests).

The size of the exemplary data set increases with N and the enumeration procedure may thus be rather slow (and may need large amounts of computer memory) for large sample sizes. The procedure is especially slow for analysis types other then Post hoc power analyses, which internally call the power routine several times.
By specifying a threshold sample size N you can restrict the use of the enumeration procedure to sample sizes < N. For sample sizes ≥ N the large sample approximation selected in the Options dialog is used.

Note: If a computation takes too long you can abort it by pressing the ESC key.

G*Power provides two different large sample approximations for a Wald-type test. Both rely on the asymptotic normal distribution of the maximum likelihood estimator for parameter β1 and are related to the method described by Whittemore (1981). The accuracy of these approximations increases with sample size, but the deviation from the true power may be quite noticeable for small and moderate sample sizes. This is especially true for X distributions that are not symmetric about the mean, i.e. the lognormal, exponential, and poisson distribution, and the binomial distribution with π ≠ 1/2.The approach of Hsieh et al. (1998) is restricted to binary covariates and covariates with a standard normal distribution. The approach based on Demidenko (2007) is more general and usually more accurate and is recommended as standard procedure. For this test, a variance correction option can be selected that compensates for variance distortions that may occur in skewed X distributions (see implementation notes). With the Hsieh procedure selected, the program automatically switches to the procedure of Demidenko if a distribution other than the standard normal or the binomial distribution is selected.

## Examples

We replicate the example given on page 449 in Signorini (1991). The number of infections Y of swimmers (X = 1) versus non-swimmers (X = 0) during a swimming season (exposure time = 1) is tested. The infection rate is modeled as a Poisson distributed random variable. X is assumed to be binomially distributed with π = 0.5 (equal numbers of swimmers and non-swimmers are sampled). The base rate, that is, the infection rate in non-swimmers, is estimated to be 0.85. The signiﬁcance level is α = 0.05. We want to know the sample size needed to detect a 30% or greater increase in infection rate with a power of 0.95. A 30% increase implies a relative rate of 1.3 ([100%+30%]/100%).

### Select

Type of power analysis: A priori

### Input

Tail(s): One
Exp(β1): 1.3
α err prob: 0.05
Power (1-β err prob): 0.95
Base rate exp(β ): 0.85
Mean exposure: 1.0
Over-dispersion: 1.0
R2 other X: 0
X distribution: Binomial
X parm π: 0.5

### Output

Critical z: 1.644854
Total sample size: 697
Actual power: 0.950121

The required sample size of N = 697 replicates the value given in Signorini (1991). The procedure of Demidenko (2007) with variance correction as well as that of and Lyles et al. (2007) for Likelihood-Ratio tests yield N = 649. The procedure of Demidenko (2007) without variance correction and that of Lyles et al. (2007) for Wald tests both yield N = 655.

In Monte-Carlo simulations of the Wald test for this scenario with 150000 independent cases per test a mean power of 0.94997, 0.95183, and 0.96207 was found for sample sizes of 649, 655, and 697, respectively. This simulation results suggest that in this speciﬁc case the procedure based on Demidenko (2007) with variance correction gives the most accurate result.

We now assume that we have additional covariates and estimate the squared multiple correlation with these others covariates to be R2 = 0.1. All other conditions are identical. The only change we need to make is to set the input ﬁeld R2 other X to 0.1. Under this condition the sample size increases to a value of N = 774 (when we use the procedure of Signorini, 1991) and N = 722 (when we use the procedure of Demidenko, 2007, with variance correction.

## Comparison between procedures

To compare the accuracy of the Wald test procedures we replicated a number of test cases presented inTable III in Lyles et al. (2007) and conducted several additional tests for X distributions not considered in Lyles et al. (2007). In all cases a two-sided test with N = 200, β0 = 0.5, and α = 0.05 is assumed.

Using the procedure of Demidenko (2007) the complete G*Power input and output for the test in the ﬁrst row of the table displayed below below would be:

### Select

Type of power analysis:  Post hoc

### Input

Tail(s): Two
Exp(β1):  =exp(-0.1)
α err prob: 0.05
Total sample size: 200
Base rate exp(β ):  =exp(0.5)
Mean exposure: 1.0
R2 other X: 0
X distribution: Normal
X parm π: 0.5

### Output

Critical z:  -1.959964
Power (1-β err prob): 0.444593
When using the enumeration procedure, the input is exactly the same, but the test is based on the χ2 distribution, which leads to a different output:
Noncentrality parameter λ: 3.254068
Critical χ2 : 3.841459
Df: 1
Power (1-β err prob): 0.438076

The rows in the table displayed above show power values for different test scenarios.
• The Dist. X column indicates the distribution of the predictor X and the parameters used.
• The β1 column contains the values of the regression weight β1 under H1
• The Sim LLW column contains the simulation results reported by Lyles et al. (2007) for the test (where available).
• The Sim column contains the results of a simulation done by us with considerably more cases (150000 instead of 2000).
• The following columns contain the results of different procedures
• LLW = Lyles et al. (2007)
• Dem = Demidenko (2007)
• Demi(c) = Demidenko (2007) with variance correction, and
• Signorini = Signorini (1991).

## Related tests

Logistic regression

## Implementation notes

### Enumeration procedure

The procedures for the Wald and Likelihood ratio tests are implemented exactly as described in Lyles et al. (2007).

### Large sample approximations

The large sample procedures for the univariate case use the general approach of Signorini (1991), which is based on the approximate method for the logistic regression described in Whittemore (1981). They get more accurate for larger sample sizes. The correction for additional covariates has been proposed by Hsieh et al. (1998).

The H0 distribution is the standard normal distribution N(0, 1). The H1 distribution is the normal distribution N(m1, s1) with

in the Signorini (1991) procedure, and

in the procedure based on Demidenko (2007). In these equations t denotes the mean exposure time, N the sample size, R2 the squared multiple correlation coefﬁcient of the covariate of interest on the other covariates, and v0 and v1 the variance of βˆ1 under H0 and H1, respectively. Without variance correction s* = 1, with variance correction s* = √((av*0 + (1 − a)v1)/v1), where v*0 is the variance under H0 for β*0 = log(µ), with µ = ∫ƒX(x) exp(β0 + β1 x)dx. For the lognormal distribution a = 0.75; in all other cases a = 1. With variance correction, the value of s* is often close to 1, but deviates from 1 for X distributions that are not symmetric about the mean. Simulations showed that this compensates for variance inﬂation/deﬂation in the distribution of βˆ1 under H1 that occurs with such distributions with ﬁnite sample sizes.

The large sample approximations use the result that the (m + 1) maximum likelihood estimators β0, βi, . . . , βm asymptotically follow a (multivariate) normal distribution, where the variance-covariance matrix is given by the inverse of the (m + 1) × (m + 1) Fisher information matrix I. The (i, j)th element of I is given by

Thus, in the case of one continuous predictor I is a 4 × 4 matrix with elements:

where ƒ(x) is the PDF of the X distribution (for discrete predictors the integrals must be replaced by corresponding sums). The element M11 of the inverse of this matrix (M = I −1), that is, the variance of β1, is given by M11 = Var(β1) = I00/(I00 I11I201). In G*Power, numerical integration is used to compute these integrals.

To estimate the variance of βˆ1 under H1, the parameter β0 and β1 in the equations for Iij are chosen as speciﬁed in the input. To estimate the variance under H0, one chooses β1 = 0 and β0 = β*. In the procedures proposed by Signorini (1991) β* = β0. In the procedure with variance correction, β* is chosen as deﬁned above. The integrals given above happen to be proportional to derivatives of the moment generating function g(x) := E(etX ) of the X distribution, which is often known in closed form. Given g(x) and its derivatives, the variance of β is calculated as:

With this deﬁnition, the variance of βˆ1 under H0 is v0 = lim(β→0) Var(β) and the variance of βˆ1 under H1 is v1 = Var(β1).

The moment generating function for the lognormal distribution does not exist. For the other ﬁve predeﬁned distributions this leads to:

1. Binomial distribution with parameter π

2. Exponential distribution with parameter λ

3. Normal distribution with parameters (µ, σ)

4. Poisson distribution with parameter λ

5. Uniform distribution with parameters (u, v) corresponding to the interval borders.

In the manual mode you have to calculate the variances v0 and v1 by yourself. To illustrate the necessary steps, we show the calculation for the exponential distribution (given above) in more detail:

Note, however, that sensitivity analyses in which the effect size is estimated are not possible in the manual mode. This is so because in this case the argument β of the function Var(β) is not constant under H1 but the target of the search.

## Validation

The procedure proposed by Signorini (1991) was checked against the corresponding routine in PASS (Hintze, 2006) and we found perfect agreement in one-sided tests. (The results of PASS 06 are wrong for normally distributed X with σ ≠ 1; this error has been corrected in PASS 2008).

In two-sided tests we found small deviations from the results calculated with PASS. The reason for these deviations is that the (usually small) contribution to the power from the distant tail is ignored in PASS but not in G*Power. Given the generally low accuracy of the Signorini (1991) procedure, these small deviations are of no practical consequence.

The results of the other two procedures were compared to the results of Monte-Carlo simulations and with each other. The results given in the table shown above are quite representative of these more extensive test. These results indicate that the deviation of the computed power values may deviate from the true values by ±0.05.

## References

Demidenko, E. (2007). Sample size determination for logistic regression revisited. Statistics in Medicine, 26, 3385-3397.

Demidenko, E. (2008). Sample size and optimal design for logistic regression with binary interaction. Statistics in Medicine, 27, 36-46.

Hintze, J. (2006). NCSS, PASS, and GESS. Kaysville, Utah: NCSS.

Hsieh, F. Y., Bloch, D. A., & Larsen, M. D. (1998). A simple method of sample size calculation for linear and logistic regression. Statistics in Medicine, 17, 1623-1634.

Lyles, R. H., Lin, H.-M., & Williamson, J. M. (2007). A practical approach to computing power for generalized linear models with nominal, count, or ordinal responses. Statistics in Medicine, 26, 1632-48.

Signorini, D. F. (1991). Sample size for Poisson regression. Biometrika, 78, 446-450.

Whittemore, A. S. (1981). Sample size for logistic regression with small response probability. Journal of the American Statistical Association, 76, 27-32.

Dienstag, 21. 05. 2013