STATISTICS - CH 10, LECTURE
10.1 Tests of Hypotheses
Be able to identify a null hypothesis and an alternate hypothesis
Be able to identify a type 1 error and a type 2 error
Note that the probability of a type 1 error is the significance level α
In hypothesis testing, we should minimize the probability of a type 1 error; α is usually set at .01 or .05
10.2 Significance Tests
One-sided tests vs. two-sided tests
α is the total tail area; so for a two-sided test, each tail has an area of α/2
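A minimal sketch of how the tail area turns into a critical value of z, using Python's standard-library `statistics.NormalDist`. The helper name `critical_z` is an illustrative assumption, not something from the text.

```python
from statistics import NormalDist

def critical_z(alpha, two_sided):
    """Return the positive critical z for significance level alpha."""
    # A two-sided test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    one_sided = critical_z(alpha, two_sided=False)
    two_sided = critical_z(alpha, two_sided=True)
    print(f"alpha={alpha}: one-sided z={one_sided:.3f}, two-sided z={two_sided:.3f}")
```

For a left-tailed test the critical value is the negative of the one-sided value.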
Power of a test:
FINDING N TO GET THE POWER OF A TEST TO BE .8 OR ABOVE
(spreadsheet row 4, columns A to I)
| H0: p | HA: p | α (rh tail) | zα | n | p hat | z | β | power |
| 0.5 | 0.65 | 0.05 | 1.64485363 | 67 | 0.600475427 | -0.849899242 | 0.197690553 | 0.802309 |
FORMULAS:
| zα | NORMSINV(1-C4) |
| p hat | A4+D4*SQRT(A4*(1-A4)/E4) |
| z | (F4-B4)/SQRT(B4*(1-B4)/E4) |
| β | NORMSDIST(G4) |
| power | 1-H4 |
Find n by trial and error: the goal is the minimum n that will make the power .8 or above.
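The trial-and-error search can also be automated. The sketch below follows the same steps as the spreadsheet row above (critical z, cutoff p hat, z under the alternative, β, power) for a right-tailed test. The function names `power_of_test` and `min_n_for_power` are assumptions made for illustration.

```python
from math import sqrt
from statistics import NormalDist

def power_of_test(p0, p1, alpha, n):
    """Power of a right-tailed test of H0: p = p0 when the true proportion is p1."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha)
    # Smallest sample proportion that leads to rejecting H0.
    p_hat_cutoff = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)
    # Standardize that cutoff under the alternative value p1.
    z = (p_hat_cutoff - p1) / sqrt(p1 * (1 - p1) / n)
    beta = std_normal.cdf(z)  # probability of a type 2 error
    return 1 - beta

def min_n_for_power(p0, p1, alpha, target=0.8):
    """Try n = 1, 2, 3, ... until the power reaches the target."""
    n = 1
    while power_of_test(p0, p1, alpha, n) < target:
        n += 1
    return n

n = min_n_for_power(0.5, 0.65, 0.05)
power = power_of_test(0.5, 0.65, 0.05, n)
print(n, round(power, 6))
```

With the values from the table (H0: p = 0.5, alternative p = 0.65, α = 0.05), the search stops at n = 67, where the power is about 0.8023.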
10.3-4 Tests Concerning Means
5-step process
1. Identify hypotheses:
H0: μ = claimed value
HA: μ > or < or ≠ claimed value
> or <: one-tailed tests
≠: two-tailed test
2. Set the level of significance
α = .01 corresponds to 99% confidence
α = .05 corresponds to 95% confidence
3. Determine the criterion for rejection of the null hypothesis
Find critical value of the test statistic (z for n≥30, t for n<30)
t or z: based on α
One-tailed test with z: table area = .5−α; z could be pos or neg.
One-tailed test with t: subscript for t is α
Two-tailed test with z: table area = .5−α/2; z could be pos or neg
Two-tailed test with t: subscript for t is α/2
p: critical value is α
Write an inequality statement, such as: Reject H0 if z > 1.96
For a p-value test: "reject H0 if p < α"
***You should do all this before collecting your data.
It is fundamentally dishonest to decide on your criteria after you know survey results.
4. Calculations: calculate the test statistic based on the sample data
t or z: use the formula from the central limit theorem, (x-bar - μ)/(s/sqrt(n))
p: Extra step: translate t or z to the total tail area (i.e. both tails for a 2-tail test)
5. Decision: Compare the test statistic based on the data (step 4) with the critical value (step 3)
Do we reject the null hypothesis, or do we "reserve judgment?"
Note: It does not make sense, based on hypothesis testing, to accept the null hypothesis. By setting the significance level to minimize the probability of a type 1 error, we have a relatively high probability of a type 2 error. Remember that hypothesis testing is done to challenge an existing claim about a mean.
If what we really want to do is validate the mean, we should not be hypothesis testing. Instead, we should find a 95% or 99% confidence interval for the mean.
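The five steps can be sketched in code for the large-sample (z) case. This is only an illustration: the function name `z_test_mean` and its `tail` argument are assumptions. The numbers are those of the large-sample two-tail problem (μ = 12.8, n = 60, x-bar = 11.2, s = 3.5, α = .01) worked in the Excel example in these notes.

```python
from math import sqrt
from statistics import NormalDist

def z_test_mean(mu0, xbar, s, n, alpha, tail):
    """Large-sample test of H0: mu = mu0. tail is 'left', 'right', or 'two'."""
    std_normal = NormalDist()
    # Step 3: the rejection criterion depends only on alpha and the tail.
    if tail == "two":
        crit = std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        crit = std_normal.inv_cdf(1 - alpha)
    else:
        crit = std_normal.inv_cdf(alpha)  # negative for a left-tailed test
    # Step 4: test statistic from the sample data.
    z = (xbar - mu0) / (s / sqrt(n))
    # Step 5: compare the statistic with the critical value.
    if tail == "two":
        reject = abs(z) > crit
    elif tail == "right":
        reject = z > crit
    else:
        reject = z < crit
    return z, crit, reject

z, crit, reject = z_test_mean(mu0=12.8, xbar=11.2, s=3.5, n=60, alpha=0.01, tail="two")
print(round(z, 4), round(crit, 4), "reject H0" if reject else "reserve judgment")
```

Here z is about -3.54, which lies beyond the critical value of about 2.576 in absolute value, so H0 is rejected.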
CAN YOU USE A COMPUTER (without buying a special statistics package)?
It is possible, if not very user-friendly, to use Excel for hypothesis testing
The example below includes two solved problems from the text, a 2-tail test and a 1-tail test; and it examines them both as large samples and as small samples, and it also shows the use of the p value.
You have to enter the formula for the test statistic yourself, but functions do exist in Excel for calculating the critical value of t or z, as well as the p-value of the data.
Notice, in the formulas for the critical values (in the Criteria column), that:
The inverse standard normal formula assumes a 1-tail test; divide α by 2 for a 2-tail test.
The inverse t formula assumes a 2-tail test; multiply α by 2 for a 1-tail test.
p333, X10.11 (2-tail test)
H0: μ = 12.8; HA: μ ≠ 12.8; LS = 0.01
Criteria: reject H0 if z (or t) is below the negative critical value or above the positive critical value; or reject H0 if p < LS of .01
| Sample | Critical value | n | x-bar | s | z or t | p | Concl. |
| Large, n=60 | z: -2.575829304 | 60 | 11.2 | 3.5 | -3.541013345 | 0.000785564 | reject |
| Small, n=20 | t: 2.860934604 | 20 | 11.2 | 3.5 | -2.044405008 | 0.05502263 | don't rej |
Formulas:
| Sample | Critical value | z or t | p |
| Large | NORMSINV(.005) | (xbar - μ)/(s/sqrt(n)) | TDIST(abs(z),df,tails) |
| Small | TINV(.01,19) | (xbar - μ)/(s/sqrt(n)) | TDIST(abs(t),df,tails) |
p335, X10.21 (1-tail test)
H0: μ = 9; HA: μ < 9; LS = .05
Criteria: reject H0 if z (or t) is below the negative critical value; or reject H0 if p < LS of .05
| Sample | Critical value | n | x-bar | s | z or t | p | Concl. |
| Large, n=60 | z: -1.644853627 | 60 | 7.2 | 1.8 | -7.745966692 | 7.36365E-11 | reject |
| Small, n=12 | t: 1.795884814 | 12 | 7.2 | 1.8 | -3.464101615 | 0.002647366 | reject |
Formulas:
| Sample | Critical value | z or t | p |
| Large | NORMSINV(.05) | (xbar - μ)/(s/sqrt(n)) | TDIST(abs(z),df,tails) |
| Small | TINV(0.1,11) | (xbar - μ)/(s/sqrt(n)) | TDIST(abs(t),df,tails) |
10.5 Differences Between Means (Large Samples)
Uses the statistic z = (x-bar1 - x-bar2) / sqrt(s1²/n1 + s2²/n2)
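A short sketch of the calculation, assuming the usual large-sample statistic z = (x-bar1 - x-bar2) / sqrt(s1²/n1 + s2²/n2). The function name `two_sample_z` and the summary figures passed to it are hypothetical, chosen only so the arithmetic is easy to follow; they are not from the text.

```python
from math import sqrt

def two_sample_z(xbar1, s1, n1, xbar2, s2, n2):
    """z statistic for the difference between two large-sample means."""
    standard_error = sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / standard_error

# Hypothetical summary statistics, for illustration only.
z = two_sample_z(xbar1=52.0, s1=6.0, n1=36, xbar2=49.0, s2=8.0, n2=64)
print(round(z, 4))
```

The resulting z is then compared with the critical value chosen in step 3, exactly as in a one-sample test.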
10.6 Differences Between Means (Small Samples)
Omit: We will use 10.9 or 13.6 for this
10.7
Use for before-and-after comparisons.
Use a t- or z-statistic based on the signed differences between the before and after values
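As a sketch of the idea: take the signed difference for each pair, then treat those differences as a single sample and test whether their mean is zero. The function name `paired_statistic` and the before/after values are hypothetical, for illustration only.

```python
from math import sqrt
from statistics import mean, stdev

def paired_statistic(before, after):
    """t (or z) statistic for H0: mean difference = 0, from paired data."""
    diffs = [a - b for b, a in zip(before, after)]  # signed differences
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n))

# Hypothetical before/after measurements on the same five subjects.
before = [10, 12, 9, 11, 13]
after = [12, 15, 10, 14, 14]
t = paired_statistic(before, after)
print(round(t, 4))
```

With fewer than 30 pairs, the statistic is compared with a critical t for n - 1 degrees of freedom.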
10.8 Differences Among Several Means
Concept: use the F-statistic, which is (Variation among the samples) / (Variation within the samples)
Definition: F = (n times variance of means) / (mean of variances), where n is the size of each sample
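A minimal sketch of this definition for samples of equal size. The data are the three samples used in the Excel ANOVA example in these notes; the function name `f_equal_sizes` is an assumption.

```python
from statistics import mean, variance

def f_equal_sizes(samples):
    """F = (n * variance of the sample means) / (mean of the sample variances)."""
    n = len(samples[0])  # assumes every sample has the same size
    sample_means = [mean(s) for s in samples]
    sample_variances = [variance(s) for s in samples]
    return n * variance(sample_means) / mean(sample_variances)

samples = [[65, 69, 71, 75], [80, 84, 86, 90], [72, 76, 77, 79]]
F = f_equal_sizes(samples)
print(round(F, 4))
```

The sample means are 70, 85, and 76, so the numerator is 4 × 57 = 228; the mean of the variances is about 14.44, giving F of about 15.78.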
10.9 Analysis of Variance (ANOVA)
Collect, for each sample, n, Σx, and Σx². Summarize with the following totals:
total 1: Σn = N
total 2: ΣΣx
total 3: ΣΣx²
total 4: Σ[(Σx)²/n]
Based on the totals, calculate the quantities below, where
k = number of treatments (or data sets)
n = sample size of each data set
N = Σn = total 1
SST = ΣΣx² - (ΣΣx)² / N = total 3 - (total 2)² / total 1
SS(Tr) = Σ[(Σx)²/n] - (ΣΣx)² / N = total 4 - (total 2)² / total 1
SSE = SST - SS(Tr)
MS(Tr) = SS(Tr) / (k-1)
MSE = SSE / (N-k)
F = MS(Tr) / MSE
When calculating the critical value of F,
DF (numerator) = DF (Treatments) = k - 1, and
DF (denominator) = DF (Error) = N - k.
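The calculation from the four totals can be sketched as follows. The function name `anova_from_totals` is an assumption; the data are the three samples used in the Excel example in these notes.

```python
def anova_from_totals(samples):
    """One-way ANOVA quantities computed from the four summary totals."""
    k = len(samples)                                   # number of treatments
    N = sum(len(s) for s in samples)                   # total 1
    sum_x = sum(sum(s) for s in samples)               # total 2
    sum_x2 = sum(x * x for s in samples for x in s)    # total 3
    sum_sq_over_n = sum(sum(s) ** 2 / len(s) for s in samples)  # total 4

    correction = sum_x ** 2 / N
    sst = sum_x2 - correction
    ss_tr = sum_sq_over_n - correction
    sse = sst - ss_tr
    ms_tr = ss_tr / (k - 1)
    mse = sse / (N - k)
    return {"SST": sst, "SS(Tr)": ss_tr, "SSE": sse,
            "MS(Tr)": ms_tr, "MSE": mse, "F": ms_tr / mse}

samples = [[65, 69, 71, 75], [80, 84, 86, 90], [72, 76, 77, 79]]
result = anova_from_totals(samples)
for name, value in result.items():
    print(name, round(value, 4))
```

For these data SST = 586, SS(Tr) = 456, and SSE = 130, so F = 228 / 14.44, or about 15.78, with 2 and 9 degrees of freedom.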
Minitab example for ANOVA (One-way analysis of variance), based on the data on pg 346:
Tests whether the difference among several means is significant
All data goes into column 1; column 2 tells which data set the entry comes from.
In this example, the first 4 items are from set 1, the next 4 from set 2, and the last 4 from set 3.


Here is the same calculation, with Excel. The data from each sample is in a different column.
To run this test, call up the Analysis Tool-Pack, and ask to do a Single-Factor ANOVA:
Data (one sample per column):
| 65 | 80 | 72 |
| 69 | 84 | 76 |
| 71 | 86 | 77 |
| 75 | 90 | 79 |
Anova: Single Factor
SUMMARY
| Groups | Count | Sum | Average | Variance |
| Column 1 | 4 | 280 | 70 | 17.33333333 |
| Column 2 | 4 | 340 | 85 | 17.33333333 |
| Column 3 | 4 | 304 | 76 | 8.666666667 |
ANOVA
| Source of Variation | SS | df | MS | F | P-value | F crit |
| Between Groups | 456 | 2 | 228 | 15.78461538 | 0.001141 | 4.256495 |
| Within Groups | 130 | 9 | 14.44444444 | | | |
| Total | 586 | 11 | | | | |
13.6 Differences Among Samples: The H-Test (Kruskal-Wallis Test)
This is a "non-parametric test." Advantages of non-parametric tests:
Don't require the same conditions as many previously discussed tests:
that the population have roughly the shape of a normal distribution, or
that variations of samples be the same, or
that samples be independent
Easily computed, typically
H-Test (Kruskal-Wallis Test) is a test of the differences among means
It is a "rank-sum" test, based on
1. Arranging the data values in order
2. Assigning a rank to each value
3. Adding all the ranks for a set of values. Call the sums R1, R2, R3...
1. H0: populations are identical; HA: populations are not identical
2. α = .05 or .01
3. Criterion: Reject H0 if H > critical value = χ² for df = k-1
4. Calculation of H:
H-statistic is H = [12 / (n(n+1))] Σ(Ri²/ni) - 3(n+1)
where
k = number of data sets
n = total number of observations in all data sets combined
ni = number of observations in data set i. ni should be at least 5 for each data set
Ri = sum of ranks for data set i
5. Decision: Compare the calculated H with χ² for df = k-1 and apply the rejection criterion
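The ranking and the H calculation can be sketched as below, assuming the usual form H = [12 / (n(n+1))] Σ(Ri²/ni) - 3(n+1) with df = (number of data sets) - 1. The function name `kruskal_wallis_h` is an assumption. Tied values receive the average of their ranks, and no further tie correction is applied. The data are the three samples from the ANOVA example in these notes, reused here only for illustration; with 4 values each they fall short of the recommended minimum of 5 per data set.

```python
def kruskal_wallis_h(samples):
    """Return (H, rank sums) for a list of samples."""
    # Pool all values, remembering which data set each one came from.
    pooled = sorted((x, i) for i, s in enumerate(samples) for x in s)
    n = len(pooled)
    rank_sums = [0.0] * len(samples)
    pos = 0
    while pos < n:
        # Find the run of tied values that starts at pos.
        end = pos
        while end + 1 < n and pooled[end + 1][0] == pooled[pos][0]:
            end += 1
        avg_rank = (pos + 1 + end + 1) / 2  # ranks are 1-based
        for j in range(pos, end + 1):
            rank_sums[pooled[j][1]] += avg_rank
        pos = end + 1
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    h = 12 / (n * (n + 1)) * weighted - 3 * (n + 1)
    return h, rank_sums

samples = [[65, 69, 71, 75], [80, 84, 86, 90], [72, 76, 77, 79]]
h, rank_sums = kruskal_wallis_h(samples)
chi2_crit = 5.991  # chi-square table value for df = 2, alpha = .05
print(rank_sums, round(h, 4), "reject H0" if h > chi2_crit else "reserve judgment")
```

Here the rank sums are 11, 42, and 25, and H is about 9.27, which exceeds 5.991.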
Minitab example for:
Kruskal-Wallis non-parametric test
Tests whether the difference among several means is significant
All data goes into column 1; column 2 tells which data set the entry comes from.
In this example, the first 4 items are from set 1, the next 4 from set 2, and the last 4 from set 3.


Merced College; updated 12/05/08 by Don Power