Merced College; Don Power

 

STATISTICS - CH 10, LECTURE

 

10.1     Tests of Hypotheses

 

Be able to identify a null hypothesis and an alternate hypothesis

Be able to identify a type 1 error and a type 2 error

Note that the probability of a type 1 error is the significance level α

In hypothesis testing, we should minimize the probability of a type 1 error; α is usually set at .01 or .05

 

10.2     Significance Tests

 

One-sided tests vs. two-sided tests

α is the total tail area; so for a two-sided test, each tail has an area of α/2
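
As a rough illustration of how α is split between the tails, the sketch below computes the positive critical z for one-sided and two-sided tests. It uses Python's standard-library NormalDist; the function name critical_z is an assumption made for this example, not something from the text.

```python
from statistics import NormalDist

def critical_z(alpha, two_sided):
    """Positive critical z when alpha is the TOTAL tail area."""
    # Two-sided test: each tail gets alpha/2; one-sided test: one tail gets all of alpha.
    tail_area = alpha / 2 if two_sided else alpha
    return NormalDist().inv_cdf(1 - tail_area)

for alpha in (0.05, 0.01):
    one = critical_z(alpha, two_sided=False)
    two = critical_z(alpha, two_sided=True)
    print(f"alpha={alpha}: one-sided z={one:.3f}, two-sided z={two:.3f}")
```

For α = .05 this gives the familiar table values 1.645 (one-sided) and 1.960 (two-sided).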

 

Power of a test:

FINDING N TO GET THE POWER OF A TEST TO BE .8 OR ABOVE

 

 

 

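Spreadsheet layout (row 4), right-hand tail test:

            Column   Quantity                         Value           Formula
            A        p under H0                       0.5             (entered)
            B        p under the alternative          0.65            (entered)
            C        α (right-hand tail)              0.05            (entered)
            D        zα                               1.64485363      NORMSINV(1-C4)
            E        n                                67              (entered)
            F        p hat (critical value)           0.600475427     A4+D4*SQRT(A4*(1-A4)/E4)
            G        z                                -0.849899242    (F4-B4)/SQRT(B4*(1-B4)/E4)
            H        β                                0.197690553     NORMSDIST(G4)
            I        power                            0.802309        1-H4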

Find n by trial and error:  the goal is the minimum n that will make the power .8 or above.
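
The trial-and-error search can also be automated. The sketch below repeats the spreadsheet calculation (critical p hat under H0, then z, β, and power under the alternative) for n = 1, 2, 3, ... and stops at the first n whose power reaches the target. It assumes a right-hand tail test and the normal approximation used in the spreadsheet; the function names are made up for this example.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()

def power_right_tail(p0, p1, alpha, n):
    """Power of a right-tail test of H0: p = p0 when the true proportion is p1."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)              # column D
    p_hat_crit = p0 + z_alpha * sqrt(p0 * (1 - p0) / n)  # column F
    z = (p_hat_crit - p1) / sqrt(p1 * (1 - p1) / n)      # column G
    beta = STD_NORMAL.cdf(z)                             # column H
    return 1 - beta                                      # column I

def smallest_n(p0, p1, alpha, target=0.8, n_max=10_000):
    """Smallest n with power >= target, or None if n_max is not enough."""
    for n in range(1, n_max + 1):
        if power_right_tail(p0, p1, alpha, n) >= target:
            return n
    return None

n = smallest_n(0.5, 0.65, 0.05)
print(n, power_right_tail(0.5, 0.65, 0.05, n))
```

With the values in the spreadsheet row (H0: p = 0.5, alternative p = 0.65, α = 0.05), the search stops at n = 67, the same n shown there.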

 

 

 

10.3-4  Tests Concerning Means

 

5-step process

            1.  Identify hypotheses:

                        H0:  μ = claimed value

                        HA:  μ > or < or ≠ claimed value

                                    > or <:  one-tailed tests

                                    ≠:  two-tailed test

            2.  Set the level of significance

                        α = .01 corresponds to 99% confidence

                        α = .05 corresponds to 95% confidence

            3.  Determine the criterion for rejection of the null hypothesis

                        Find critical value of the test statistic (z for n≥30, t for n<30)

                                    t or z:  based on α

                                                One-tailed test with z:  table area = .5−α; z could be pos or neg.

                                                One-tailed test with t:  subscript for t is α

                                                Two-tailed test with z:  table area = .5−α/2; z could be pos or neg

                                                Two-tailed test with t:  subscript for t is α/2

                                    p:  critical value is α

                        Write an inequality statement, such as:  Reject H0 if z < −1.96 or z > 1.96

                                    For a p-value test:  "Reject H0 if p < α"

 

            ***You should do all this before collecting your data.

                        It is fundamentally dishonest to decide on your criteria after you know survey results.

 

            4.  Calculations:  calculate the test statistic based on the sample data

                                    t or z:  use the formula based on the central limit theorem:  (x-bar − μ) / (s/√n)

                                    p:  Extra step:  translate t or z to the total tail area (i.e. both tails for a 2-tail test)

            5.  Decision:  Compare the test statistic based on the data (step 4) with the critical value (step 3)

                        Do we reject the null hypothesis, or do we "reserve judgment"?
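
The 5-step process can be sketched in Python for the large-sample (z) case. This is a minimal illustration, not the textbook's method: the function name z_test and its tail argument are assumptions, and the p-value here comes from the normal distribution, so it will not match a p-value computed from the t distribution (as Excel's TDIST does).

```python
from math import sqrt
from statistics import NormalDist

def z_test(mu0, xbar, s, n, alpha, tail="two"):
    """Large-sample test of H0: mu = mu0; tail is 'two', 'left', or 'right'."""
    std = NormalDist()
    z = (xbar - mu0) / (s / sqrt(n))      # step 4: test statistic
    if tail == "two":                      # HA: mu != mu0
        crit = std.inv_cdf(1 - alpha / 2)  # step 3: alpha/2 in each tail
        reject = abs(z) > crit
        p = 2 * std.cdf(-abs(z))           # total area of both tails
    elif tail == "left":                   # HA: mu < mu0
        crit = std.inv_cdf(alpha)
        reject = z < crit
        p = std.cdf(z)
    else:                                  # HA: mu > mu0
        crit = std.inv_cdf(1 - alpha)
        reject = z > crit
        p = 1 - std.cdf(z)
    return z, crit, p, reject              # step 5: decision is in `reject`

# Figures from the large-sample version of problem X10.11 (H0: mu = 12.8, alpha = .01)
z, crit, p, reject = z_test(12.8, 11.2, 3.5, 60, 0.01)
print(z, crit, p, "reject H0" if reject else "reserve judgment")
```

Note that the criterion (tail and α) is passed in before the data are used, in keeping with the advice to fix the criterion first.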

 

Note:  In hypothesis testing, it does not make sense to accept the null hypothesis.  Because we set the significance level to keep the probability of a type 1 error small, the probability of a type 2 error can be relatively high.  Remember that hypothesis testing is done to challenge an existing claim about a mean.

 

If what we really want to do is validate the mean, we should not be hypothesis testing.  Instead, we should find a 95% or 99% confidence interval for the mean.
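
A large-sample confidence interval for the mean can be sketched as below. The function name is an assumption for this example, and the sample figures (x-bar = 11.2, s = 3.5, n = 60) are borrowed from problem X10.11 in the Excel example on this page.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(xbar, s, n, confidence=0.95):
    """Large-sample (z) confidence interval for a population mean."""
    # Put half of the leftover area (1 - confidence) in each tail.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z * s / sqrt(n)
    return xbar - margin, xbar + margin

low, high = mean_confidence_interval(11.2, 3.5, 60, confidence=0.99)
print(round(low, 3), round(high, 3))
```

The claimed value 12.8 falls outside this 99% interval, which agrees with rejecting H0 at α = .01 in the two-tail z test.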

 

CAN YOU USE A COMPUTER (without buying a special statistics package)?

It is possible, if not very user-friendly, to use Excel for hypothesis testing.

The example below includes two solved problems from the text, a 2-tail test and a 1-tail test.  Each is worked both as a large sample and as a small sample, and each also shows the use of the p-value.

You have to enter the formula for the test statistic yourself, but Excel has functions for calculating the critical value of t or z, as well as the p-value of the data.

Notice, in the formulas for the critical values (in the Criteria column), that:

            The inverse standard normal formula assumes a 1-tail test; divide α by 2 for a 2-tail test.

            The inverse t formula assumes a 2-tail test; multiply α by 2 for a 1-tail test.

 

p333, X10.11:   H0: μ = 12.8     HA: μ ≠ 12.8     LS = 0.01

                        Criteria           Data                                                       Concl.
                        critical value     n     x-bar    s      z or t            p
    Large (n=60)        -2.575829304       60    11.2     3.5    -3.541013345      0.000785564        reject
    Small (n=20)        2.860934604        20    11.2     3.5    -2.044405008      0.05502263         don't rej

    Criteria:
        Large:  reject H0 if z < -2.575829304 or z > 2.575829304; with the p-value, reject H0 if p < LS of .01
        Small:  reject H0 if t < -2.860934604 or t > 2.860934604; with the p-value, reject H0 if p < LS of .01

    Formulas:
        Critical value:  NORMSINV(.005) (large);  TINV(.01,19) (small)
        z or t:          (xbar - μ)/(s/sqrt(n))
        p:               TDIST(abs(z),df,tails) (large);  TDIST(abs(t),df,tails) (small)


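p335, X10.21:   H0: μ = 9     HA: μ < 9     LS = .05

                        Criteria           Data                                                       Concl.
                        critical value     n     x-bar    s      z or t            p
    Large (n=60)        -1.644853627       60    7.2      1.8    -7.745966692      7.36365E-11        reject
    Small (n=12)        1.795884814        12    7.2      1.8    -3.464101615      0.002647366        reject

    Criteria:
        Large:  reject H0 if z < -1.644853627; with the p-value, reject H0 if p < LS of .05
        Small:  reject H0 if t < -1.795884814 (TINV returns the positive value); with the p-value, reject H0 if p < LS of .05

    Formulas:
        Critical value:  NORMSINV(.05) (large);  TINV(0.1,11) (small)
        z or t:          (xbar - μ)/(s/sqrt(n))
        p:               TDIST(abs(z),df,tails) (large);  TDIST(abs(t),df,tails) (small)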

 

 

 

10.5     Differences Between Means (Large Samples)

 

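Uses the statistic z = (x-bar1 − x-bar2) / sqrt(s1²/n1 + s2²/n2), where x-bar, s, and n are the mean, standard deviation, and size of each of the two samples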

 

10.6     Differences Between Means (Small Samples)

 

Omit:  We will use 10.9 or 13.6 for this

 

10.7

 

Use for before-and-after comparisons.

Use a t- or z-statistic based on the signed differences between the before and after values.
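
A minimal sketch of the statistic built from signed differences follows. The before/after numbers are hypothetical, invented only to show the arithmetic, and the function name is an assumption. The test here is of H0: mean difference = 0.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after measurements for five subjects (not from the text).
before = [12, 15, 11, 14, 13]
after = [10, 14, 11, 11, 12]

def paired_statistic(before, after):
    """t (or z) statistic for H0: mean of (after - before) is zero."""
    d = [a - b for b, a in zip(before, after)]  # signed differences
    return mean(d) / (stdev(d) / sqrt(len(d)))

t = paired_statistic(before, after)
print(t)  # compare with the critical t for df = n - 1 = 4
```

With only five pairs this is a small sample, so the result would be compared with a critical value of t rather than z.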

 

10.8     Differences Among Several Means

 

Concept:  use the F-statistic, which is (Variation among the samples) / (Variation within the samples)

Definition:  (n times variance of means) / (mean of variances)
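
The definition can be checked directly with a few lines of Python. The sketch below uses the three samples of four values each from the ANOVA example later on this page; it assumes equal sample sizes, which is what the "n times variance of means" form requires.

```python
from statistics import mean, variance

# The three samples from the ANOVA example on this page.
samples = [[65, 69, 71, 75], [80, 84, 86, 90], [72, 76, 77, 79]]

n = len(samples[0])                                # common sample size
among = n * variance([mean(s) for s in samples])   # n times variance of means
within = mean([variance(s) for s in samples])      # mean of variances
F = among / within
print(among, within, F)
```

The result agrees with the F of 15.78461538 in the Excel ANOVA output for the same data.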

 

10.9     Analysis of Variance (ANOVA)

 

Collect, for each sample, the statistics n, Σx, and Σx².  Summarize as follows:

            Summary statistics (each sample):     n          Σx         Σx²        (Σx)²/n

            Totals (all samples):                 N          ΣΣx        ΣΣx²       Σ[(Σx)²/n]

                                                  total 1    total 2    total 3    total 4

Calculations (based on the totals):

 

Where

            k = number of treatments (or data sets)

            n = sample size of each data set

            N = Σn                                  = total 1

            SST = ΣΣx² − (ΣΣx)² / N                 = total 3 − (total 2)² / total 1

            SS(Tr) = Σ[(Σx)²/n] − (ΣΣx)² / N        = total 4 − (total 2)² / total 1

            SSE = SST − SS(Tr)

            MS(Tr) = SS(Tr) / (k-1)

            MSE = SSE / (N-k)

            F = MS(Tr) / MSE

 

When calculating the critical value of F,

            DF (numerator) = DF (Treatments) = k - 1, and

            DF (denominator) = DF (Error) = N - k.
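
The calculation from the four totals can be sketched as follows, using the three samples from the Excel example on this page. The critical value of F is not computed here; the constant F_CRIT is simply copied from the "F crit" entry of that Excel output (α = .05, df = 2 and 9). Variable names are assumptions for this example.

```python
# The three samples from the Excel ANOVA example on this page.
samples = [[65, 69, 71, 75], [80, 84, 86, 90], [72, 76, 77, 79]]

k = len(samples)                                            # number of treatments
N = sum(len(s) for s in samples)                            # total 1
sum_x = sum(sum(s) for s in samples)                        # total 2
sum_x2 = sum(x * x for s in samples for x in s)             # total 3
sum_sq_over_n = sum(sum(s) ** 2 / len(s) for s in samples)  # total 4

correction = sum_x ** 2 / N        # (total 2) squared / total 1
SST = sum_x2 - correction
SSTr = sum_sq_over_n - correction
SSE = SST - SSTr
MSTr = SSTr / (k - 1)              # DF (numerator) = k - 1
MSE = SSE / (N - k)                # DF (denominator) = N - k
F = MSTr / MSE

F_CRIT = 4.256495                  # copied from the Excel output, not computed
print(SST, SSTr, SSE, F, "reject H0" if F > F_CRIT else "reserve judgment")
```

The sums of squares come out to 586, 456, and 130, matching the Total, Between Groups, and Within Groups rows of the Excel output.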

 

 

 

Minitab example for ANOVA (One-way analysis of variance), based on the data on pg 346:

            Tests whether the difference among several means is significant

 

All data goes into column 1; column 2 tells which data set the entry comes from.

            In this example, the first 4 items are from set 1, the next 4 from set 2, and the last 4 from set 3.

 

 

 

Here is the same calculation, with Excel.  The data from each sample is in a different column.

To run this test, call up the Analysis ToolPak and choose Anova: Single Factor:

 

Data (one sample per column):

            65      80      72
            69      84      76
            71      86      77
            75      90      79

Anova: Single Factor

SUMMARY

            Groups        Count     Sum     Average     Variance
            Column 1      4         280     70          17.33333333
            Column 2      4         340     85          17.33333333
            Column 3      4         304     76          8.666666667

ANOVA

            Source of Variation     SS      df      MS              F               P-value      F crit
            Between Groups          456     2       228             15.78461538     0.001141     4.256495
            Within Groups           130     9       14.44444444
            Total                   586     11

 

13.6  Differences Among Samples:  The H-Test (Kruskal-Wallis Test)

 

This is a "non-parametric test."  Advantages of non-parametric tests:

            Don't require the same conditions as many previously discussed tests:

                        that the population have roughly the shape of a normal distribution, or

                        that variations of samples be the same, or

                        that samples be independent

            Easily computed, typically

           

 H-Test (Kruskal-Wallis Test) is a non-parametric alternative to ANOVA for testing the differences among several samples

 

It is a "rank-sum" test, based on

            1.         Arranging the data values in order

            2.         Assigning a rank to each value

            3.         Adding all the ranks for a set of values.  Call the sums R1, R2, R3...

 

1.         H0: populations are identical; HA: populations are not identical

2.         α =  .05 or .01

3.         Criterion:  Reject H0 if H > critical value = χ2 for df = k−1, where k is the number of data sets

4.         Calculation of H:

H-statistic is

            H = [12 / (n(n+1))] · Σ(Ri² / ni) − 3(n+1)

where

                        n = total number of observations in all data sets combined

                        ni = number of observations in data set i. ni should be at least 5 for each data set

                        Ri = sum of ranks for data set i

 

5.         Decision:  Compare the calculated H with χ2 for df = k−1 (k = number of data sets) and apply the rejection criterion
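
The rank-sum calculation can be sketched in Python. This assumes the standard Kruskal-Wallis formula H = 12/(n(n+1)) · Σ(Ri²/ni) − 3(n+1), with the chi-square critical value taken at df = (number of data sets) − 1. The data are the three samples from the ANOVA example on this page, reused here purely as an illustration (they have only 4 values each, fewer than the recommended 5). Tied values get the average of their ranks; no further tie correction is applied. Function names are assumptions.

```python
def average_ranks(values):
    """Rank values 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(samples):
    """H statistic from the rank sums R1, R2, ... of the pooled data."""
    pooled = [x for s in samples for x in s]
    ranks = average_ranks(pooled)
    n = len(pooled)
    total, start = 0.0, 0
    for s in samples:
        r_i = sum(ranks[start:start + len(s)])  # rank sum Ri for this data set
        total += r_i ** 2 / len(s)
        start += len(s)
    return 12 / (n * (n + 1)) * total - 3 * (n + 1)

samples = [[65, 69, 71, 75], [80, 84, 86, 90], [72, 76, 77, 79]]
H = kruskal_wallis_h(samples)
CHI2_CRIT = 5.991  # chi-square table value for df = 2, alpha = .05
print(H, "reject H0" if H > CHI2_CRIT else "reserve judgment")
```

For these data the rank sums are 11, 42, and 25, and H exceeds the critical value, so the decision agrees with the ANOVA result for the same samples.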

 

Minitab example for:

 

Kruskal-Wallis non-parametric test

Tests whether the difference among several means is significant

 

All data goes into column 1; column 2 tells which data set the entry comes from.

            In this example, the first 4 items are from set 1, the next 4 from set 2, and the last 4 from set 3.

 

 

 

 

 

 

Updated 12/05/08 by Don Power