Power and Sample Size for Testing Means and Proportion

April 10, 2017 | Author: Cecilia Riley | Category: N/A


Power and Sample Size Estimation

Type I & Type II Error

Type I Error: reject the null hypothesis when it is true. The probability of a Type I Error is denoted by α.

Type II Error: accept the null hypothesis when it is false and the alternative hypothesis is true. The probability of a Type II Error is denoted by β.

Decision            Null hypothesis, H0, is True    Null hypothesis, H0, is False
Reject H0           Type I Error (α)                Correct Decision (1 − β)
Do Not Reject H0    Correct Decision (1 − α)        Type II Error (β)

Power of the Test (Power = 1 − β)

The power of a statistical test is the ability of the study, as designed, to distinguish between the hypothesized value and some specific alternative value. For a statistician, the power of a test is the probability that the test will reject the hypothesis tested when a specific alternative hypothesis is true. To calculate the power of a given test, it is necessary to specify α (the probability that the test will lead to the rejection of the hypothesis tested when that hypothesis is true) and to specify a specific alternative hypothesis Ha.

Stated another way, Power = P(reject H0 | H0 is false) or P(reject H0 | Ha is true).

For example, suppose a medication that claims to lower blood pressure is about to be marketed, and suppose the medication lowers the subject's blood pressure by 5 mm Hg. The power of the test is the probability that the null hypothesis of no change in blood pressure will be rejected, based on the results of the sample study, given that the real reduction is 5 mm Hg.

The power of the test is affected by four factors:
1. The level of significance. As α decreases, the power decreases.
2. The sample size. As n increases, the power increases.
3. The numerical difference between the null hypothesis value and the specific alternative value. As this difference increases, the power increases. (It is very difficult to reject a null hypothesis of no difference in favor of an alternative that there is a difference if that difference is very small.)
4. The variability of the data. As the standard deviation σ decreases, the power increases.
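The direction of each effect can be checked numerically. The following is a minimal sketch, assuming an upper-tailed one-sample z-test with known σ; the function name and the baseline numbers (shift of 5, σ = 10, n = 20) are illustrative assumptions and do not come from the handout. It also varies σ, which enters the same calculation.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_upper_tailed_z(delta, sigma, n, alpha):
    """Power of an upper-tailed one-sample z-test with known sigma.

    delta is the assumed true shift (alternative mean minus null mean, > 0).
    """
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)      # critical value on the z scale
    shift = delta * sqrt(n) / sigma              # how far Ha moves the test statistic
    return 1 - STD_NORMAL.cdf(z_alpha - shift)   # P(reject H0 | Ha is true)


# Hypothetical baseline (assumed values): shift 5, sigma 10, n = 20, alpha = 0.05.
baseline = power_upper_tailed_z(delta=5, sigma=10, n=20, alpha=0.05)
smaller_alpha = power_upper_tailed_z(delta=5, sigma=10, n=20, alpha=0.01)
larger_n = power_upper_tailed_z(delta=5, sigma=10, n=40, alpha=0.05)
larger_delta = power_upper_tailed_z(delta=8, sigma=10, n=20, alpha=0.05)
smaller_sigma = power_upper_tailed_z(delta=5, sigma=6, n=20, alpha=0.05)

for label, value in [
    ("baseline", baseline),
    ("alpha 0.05 -> 0.01", smaller_alpha),
    ("n 20 -> 40", larger_n),
    ("difference 5 -> 8", larger_delta),
    ("sigma 10 -> 6", smaller_sigma),
]:
    print(f"{label:<22}{value:.3f}")
```

Changing one input at a time keeps the comparison with the baseline easy to read: only the smaller α lowers the power.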


Example: Suppose we know that the serum cholesterol levels of all 20- to 24-year-old males in the United States are normally distributed with a mean of 180 mg/100ml and a standard deviation of 46 mg/100ml. We expect the mean cholesterol level of a special diet group in this population to be higher than 180 mg/100ml. (We assume that the cholesterol levels in this group are normally distributed with the same standard deviation, i.e., 46 mg/100ml.) To test whether the mean cholesterol level of this special diet group is higher than 180 mg/100ml, we would conduct a one-sided test with the hypotheses:

H0: µ = 180 mg/100ml (or µ ≤ 180 mg/100ml)
Ha: µ > 180 mg/100ml (H1 is sometimes used instead of Ha)

Assuming that we use a sample of size 25 and test the hypothesis at the level of significance α = 0.05, what would be β and the power of the test, i.e. 1 − β?

Figure 1: A decision rule on the z scale. The rejection region is the right tail beyond z = 1.645; the right-tail area is 0.05.

Since the standard deviation is known and the sample is from a normal population, a z-test can be used. If we take the critical value approach, the critical value is 1.645: at the level of significance α = 0.05, H0 is rejected if z ≥ 1.645. If a sample of size 25 is selected, what would be the critical value in terms of the mean cholesterol level?

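$$180 + 1.645 \cdot \frac{46}{\sqrt{25}} = 195.1$$

Figure 2: A decision rule in the original scale. The sampling distribution of the sample mean is centered at 180, and the rejection region is the right tail beyond 195.1.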

Let us assume that the alternative hypothesis is Ha: µ = µa = 211 mg/100ml. If Ha is true, what is the probability of accepting H0, i.e. β, and what would be the power of the test? In other words, how well can this test detect a 31 mg/100ml increase in average cholesterol level?

H0: µ = µ0 = 180
Ha: µ = µa = 211

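Figure 3: Sampling distributions of the sample mean under H0 (centered at µ0 = 180) and under Ha (centered at µa = 211), with the critical value 195.1 marked between them. (µa is sometimes denoted as µ1.)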

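β is the probability that the sample mean falls below the critical value 195.1 when the true mean is 211:

$$\beta = P(\bar{x} < 195.1 \mid \mu = 211) = P\left(Z < \frac{195.1 - 211}{46/\sqrt{25}}\right) = P(Z < -1.73) = 0.042$$

Power of the test when the sample size is 25 would be:

Power = 1 − β = 1 − 0.042 = 0.958.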
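The n = 25 calculation can be reproduced step by step. This is a minimal sketch using only the Python standard library; the variable names are illustrative, and the unrounded normal quantile is used, so intermediate values may differ slightly from the rounded figures in the text.

```python
from math import sqrt
from statistics import NormalDist

mu0, mua = 180.0, 211.0            # null and alternative means (mg/100ml)
sigma, n, alpha = 46.0, 25, 0.05   # known sd, sample size, significance level

std_err = sigma / sqrt(n)                    # standard error of the sample mean
z_alpha = NormalDist().inv_cdf(1 - alpha)    # upper-tail critical z, about 1.645
critical_mean = mu0 + z_alpha * std_err      # critical value on the cholesterol scale

# beta: probability that the sample mean falls below the critical value
# when the true mean is mua (so H0 is not rejected).
beta = NormalDist(mu=mua, sigma=std_err).cdf(critical_mean)
power = 1 - beta

print(f"critical value = {critical_mean:.1f}")
print(f"beta           = {beta:.3f}")
print(f"power          = {power:.3f}")
```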


Assuming that we use a sample of size 100 and test the hypothesis at the level of significance α = 0.05, what would be β and the power of the test, i.e. 1 − β?

Figure 4: The decision rule on the z scale. The rejection region is the right tail beyond z = 1.645; the right-tail area is 0.05.

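If a sample of size 100 is selected, what would be the critical value in terms of the mean cholesterol level?

$$\text{c.v.} = 180 + 1.645 \cdot \frac{46}{\sqrt{100}} = 187.6$$

Figure 5: The decision rule in the original scale. The sampling distribution of the sample mean is centered at 180, and the rejection region is the right tail beyond 187.6.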

If Ha: µ = µa = 211 mg/100ml is true, what is the probability of accepting H0, i.e. β?

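Figure 6: Sampling distributions of the sample mean under H0 (centered at 180, with α = .05 to the right of the critical value 187.6) and under Ha (centered at 211). The area under the Ha curve to the left of 187.6 is β ≈ 0.00.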

Power of the test when sample size is 100 would be:

Power = 1 – β ≈ 1 – 0.0 = 1.0.
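The value β ≈ 0.00 can be obtained by standardizing the critical value 187.6 under the alternative mean 211. A worked step, using the rounded critical value from the text:

```latex
\beta = P(\bar{x} < 187.6 \mid \mu = 211)
      = P\!\left(Z < \frac{187.6 - 211}{46/\sqrt{100}}\right)
      = P(Z < -5.09) \approx 0
```

A z-value this far in the lower tail has a probability on the order of 10^-7, which is why the power is reported as 1.0 after rounding.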

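Figure: Comparison of the sampling distributions of the sample mean for n = 100 and n = 25, centered at 180 under H0 and at 211 under Ha, with critical values 187.6 (n = 100) and 195.1 (n = 25).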

Sample Size Estimation

Suppose we wish to test the hypothesis H0: µ = 180 mg/100ml (or µ ≤ 180 mg/100ml) vs. Ha: µ > 180 mg/100ml at the 0.01 level of significance. If we are willing to risk a 5% chance of failing to reject the null hypothesis (that is, we want the power of the test to be 0.95) when the true mean is as large as 211 mg/100ml, how large a sample do we need? (A power of 0.95 means we want a 95% chance of rejecting the null hypothesis if the true mean is as large as 211 mg/100ml.) Assume the standard deviation is 46.

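Figure 7: Sampling distributions of the sample mean under H0 (centered at µ0 = 180, with upper-tail area α = .01) and under Ha (centered at µa = 211, with lower-tail area β = 0.05), sharing a common critical value.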

If α = .01, then the critical value as a z-score would be 2.32 (more precisely, 2.326). The critical value in cholesterol level would be

$$\bar{x} = 180 + 2.32 \cdot \frac{46}{\sqrt{n}}.$$

If the true mean is 211 mg/100ml, then with a power of 0.95, i.e. β = .05, we would fail to reject the null hypothesis when the sample average is less than

$$\bar{x} = 211 - 1.645 \cdot \frac{46}{\sqrt{n}}.$$

Set the two equations equal to each other:

$$180 + 2.32 \cdot \frac{46}{\sqrt{n}} = 211 - 1.645 \cdot \frac{46}{\sqrt{n}}$$

Solve for n:

$$n = \left[\frac{(2.32 + 1.645)(46)}{211 - 180}\right]^2 = 34.6 \approx 35$$

Formula: The estimated sample size for a one-sided test for a mean with level of significance α and power 1 − β is

$$n = \left[\frac{(z_{\alpha} + z_{\beta}) \cdot \sigma}{\mu_a - \mu_0}\right]^2$$

For a two-sided test,

$$n = \left[\frac{(z_{\alpha/2} + z_{\beta}) \cdot \sigma}{\mu_a - \mu_0}\right]^2$$
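The sample size formula for a mean can be wrapped in a small helper. This is a sketch under stated assumptions: the function name and parameters are illustrative, and it uses unrounded normal quantiles (2.326 rather than 2.32), so the unrounded n differs slightly from 34.6 while the rounded-up sample size is the same.

```python
from math import ceil
from statistics import NormalDist


def sample_size_mean(mu0, mua, sigma, alpha, power, two_sided=False):
    """Sample size for a z-test of one mean with known sigma.

    Returns the unrounded n and n rounded up to the next whole subject.
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)                      # upper-tail value for beta = 1 - power
    n_exact = ((z_alpha + z_beta) * sigma / (mua - mu0)) ** 2
    return n_exact, ceil(n_exact)


# Cholesterol example: one-sided test, alpha = 0.01, power = 0.95.
n_exact, n_needed = sample_size_mean(180, 211, 46, alpha=0.01, power=0.95)
print(f"unrounded n = {n_exact:.1f}, required n = {n_needed}")
```

Rounding up rather than to the nearest integer keeps the achieved power at or above the target.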

* Check the following web page for power calculation:

http://calculators.stat.ucla.edu/powercalc/


Try this problem using the formula above:

For testing the hypothesis H0: µ = µ0 = 70 vs. Ha: µ ≠ 70 (two-sided test) at the level of significance α = .05, find the sample size that gives a power of 1 − β = .90 to reject the null hypothesis if the actual mean is µa = 80. (Another way to say it: if the actual mean is 80, you want the test to have a 90% chance of rejecting the null hypothesis and concluding that the mean is significantly different from 70.) The standard deviation, σ, is approximately equal to 15.

Solution:
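One way to work the exercise is to substitute into the two-sided formula, using the rounded critical values z(α/2) = 1.96 for α = .05 and z(β) = 1.28 for β = .10. The handout leaves the solution blank, so the following is a sketch of the calculation and not the author's own worked answer:

```latex
n = \left[\frac{(z_{\alpha/2} + z_{\beta})\,\sigma}{\mu_a - \mu_0}\right]^2
  = \left[\frac{(1.96 + 1.28)(15)}{80 - 70}\right]^2
  = (4.86)^2 \approx 23.6
```

Rounding up gives a sample size of 24.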

Testing one proportion:

As in the case of the mean, the specification of α for p0 and β for p1 leads to the following two equations, ignoring the continuity correction (here p1 > p0 and zβ is taken as a positive, upper-tail value):

$$z_{\alpha} = \frac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}, \qquad z_{\beta} = \frac{p_1 - \hat{p}}{\sqrt{p_1(1 - p_1)/n}}$$

Solving for n, eliminating p̂, we get

$$n = \left[\frac{z_{\alpha}\sqrt{p_0(1 - p_0)} + z_{\beta}\sqrt{p_1(1 - p_1)}}{p_1 - p_0}\right]^2$$

Replace zα by zα/2 for a two-sided test.

Example: Consider the planning of a survey to find out how smoking behavior changed while students were in college. A comprehensive survey four years ago found that 30% of freshmen smoked. The investigator wants to know how many seniors should be sampled. He wants to perform a two-tailed test at the five percent level. The null hypothesis is p0 = 0.3. This indicates zα/2 = 1.96. He also states that if the proportion has changed by as much as 5 percentage points, he is willing to risk a 10% chance of failing to reject the null hypothesis. This indicates p1 = .35 and zβ = 1.28 (as mentioned earlier, a one-tailed z value is used for zβ). Then the required sample size is

$$n = \left[\frac{1.96\sqrt{(.3)(.7)} + 1.28\sqrt{(.35)(.65)}}{.35 - .3}\right]^2 = 910.48 \approx 911$$


Testing the difference of two means:

We can determine the sample size for testing the difference of two independent means in the same manner. We assume equal variances in the two groups ($\sigma_1^2 = \sigma_2^2$) and equal division of the sample between the two groups ($n_1 = n_2$). Specifying the Type I error for the null hypothesis ($\mu_1 - \mu_2 = \Delta_0$), we have

$$z_{\alpha} = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} = \frac{(\bar{x}_1 - \bar{x}_2) - \Delta_0}{\sqrt{\dfrac{2\sigma_1^2}{n_1}}}.$$

Specifying the Type II error for the alternative hypothesis ($\mu_1 - \mu_2 = \Delta_1$), with zβ taken as a positive, upper-tail value, we have

$$z_{\beta} = \frac{\Delta_1 - (\bar{x}_1 - \bar{x}_2)}{\sqrt{\dfrac{2\sigma_1^2}{n_1}}}.$$

By solving these two equations for n1, eliminating $(\bar{x}_1 - \bar{x}_2)$, we get

$$n_1 = 2\left[\frac{(z_{\alpha} + z_{\beta})\,\sigma_1}{\Delta_1 - \Delta_0}\right]^2.$$

Since ∆0 is zero in most applications, the denominator is usually ∆1, the value specified in the alternative hypothesis. Note that n1 is the sample size in each group and the total sample size for the study is 2n1.

Let us consider the design of a clinical nutritional study of a special diet regimen to lower blood pressure among hypertensive adult males (diastolic blood pressure over 90 mmHg). The investigator expects to demonstrate that the new diet would reduce diastolic blood pressure by 4 mmHg in three months. He is willing to risk a Type I error of 5% and a Type II error of 10% for a one-sided test. The NHANES data show that the standard deviation of diastolic blood pressure among hypertensive males is around 5.6 mmHg. The required sample size in each group can be calculated by

$$n_1 = 2\left[\frac{5.6\,(1.645 + 1.28)}{4}\right]^2 = 33.5 \approx 34.$$

The proposed study would need a total of 68 subjects, allocated randomly and equally between the treatment and the control group.

Formula: The estimated sample size per group (n1 = n2 = n) for a one-sided test with level of significance α and power 1 − β, where ∆ = |µ1 − µ2| is often taken to be a difference that is considered practically significant, is

$$n = 2\left[\frac{(z_{\alpha} + z_{\beta}) \cdot \sigma}{\Delta}\right]^2$$

For a two-sided test,

$$n = 2\left[\frac{(z_{\alpha/2} + z_{\beta}) \cdot \sigma}{\Delta}\right]^2$$
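The two-group formula can be sketched in the same way. The function name and parameters below are illustrative assumptions; unrounded normal quantiles are used, which changes the unrounded value only slightly and leaves the rounded-up group size for the diet study unchanged.

```python
from math import ceil
from statistics import NormalDist


def sample_size_two_means(delta, sigma, alpha, power, two_sided=False):
    """Per-group sample size for comparing two independent means.

    Assumes equal variances and equal group sizes; delta is the difference
    in means to be detected (delta_1 - delta_0 in the derivation).
    """
    z = NormalDist().inv_cdf
    z_alpha = z(1 - alpha / 2) if two_sided else z(1 - alpha)
    z_beta = z(power)
    n_exact = 2 * ((z_alpha + z_beta) * sigma / delta) ** 2
    return n_exact, ceil(n_exact)


# Diet study: detect a 4 mmHg reduction, sd 5.6 mmHg, one-sided alpha 5%, beta 10%.
n_exact, n_per_group = sample_size_two_means(delta=4, sigma=5.6, alpha=0.05, power=0.90)
print(f"per group: {n_per_group}, total: {2 * n_per_group}")
```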
