Merced College; Don Power

 

Lecture Notes:  Triola, Chapter 1:  Introduction to statistics

 

Know these definitions:

 

population

parameter    A numerical measurement describing some characteristic of an entire population

sample

statistic        A numerical measurement describing some characteristic of a sample

 

            discrete data                 Can you count it?  How many students....

            continuous data Can you get fractional measurements?  Average speeds of ..

 

            voluntary response sample

            random sample

            simple random sample
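
A minimal sketch of the parameter/statistic distinction. The population here (ages of 1,000 members of an imaginary club) is made up for illustration and does not come from the text.

```python
import random
import statistics

# Hypothetical population: ages of all 1,000 members of an imaginary club.
rng = random.Random(0)  # fixed seed so the sketch is repeatable
population = [rng.randint(18, 80) for _ in range(1000)]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
sample = rng.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two numbers are usually close but not equal; the statistic changes from sample to sample, while the parameter is fixed.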

 

1.2 Types of Data

 

The point:  Don't use methods/computations that are inappropriate for your type of data.

 

Know the four types of data (the first two are qualitative, the last two are quantitative):

            Nominal - Categories based on names or labels; no order is implied

                        e.g. male, female

                        or,  far west, midwest, southern, mid-atlantic, northeast

 

            Ordinal - The categories can be put in order, but the differences between them can't be quantified

                       

                        e.g. large, medium, small

                        or, child, adult, senior

 

            Interval - Numerical scales with no natural zero (like temperatures); differences are meaningful, ratios are not

                        e.g.  songs from the 50's, 60's, 70's, 80's, ...

 

            Ratio - Numerical scales with a natural zero; ratios can be formed and are meaningful

(Test:  if you double one number, are you also doubling the quantity you are measuring? Is 60 degrees twice as warm as 30 degrees?)
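
A short worked check of the doubling test. It assumes the "degrees" are Fahrenheit (the notes do not say) and uses the Kelvin scale, which has a true zero, to compare the actual amounts of heat.

```python
def fahrenheit_to_kelvin(deg_f: float) -> float:
    """Convert Fahrenheit to Kelvin, a temperature scale with a true zero."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

warm, cool = 60.0, 30.0  # assumed to be degrees Fahrenheit

# Ratio of the raw Fahrenheit readings (interval scale): looks like "twice".
naive_ratio = warm / cool

# Ratio of the same two temperatures on the Kelvin scale (ratio scale).
kelvin_ratio = fahrenheit_to_kelvin(warm) / fahrenheit_to_kelvin(cool)

print(f"Fahrenheit ratio: {naive_ratio:.2f}")   # 2.00
print(f"Kelvin ratio:     {kelvin_ratio:.2f}")  # 1.06
```

Doubling the Fahrenheit number does not double the quantity being measured, so Fahrenheit temperature fails the test and is interval data, not ratio data.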

 

Question:  What type is appropriate for the numbers on football jerseys? (The bigger the player, the bigger the number?)

 

 

1.3  Critical Thinking

 

Per the text, the main sources of error in statistical work:

            Dishonesty (on the part of the researcher or the people providing the data)

            Unintentional errors (Are political polls always accurate?)

 

Good sampling techniques are crucial.  Potential errors could come from:

 

            Voluntary response sample (or self-selected sample)

            Samples that are too small

            Loaded questions:  questions worded in a way that affects the response

            Order of the questions, or the order of the phrases in the questions

            Nonresponse

            Missing data

            Self-interest study

           

Other errors relate to presentation of the results

            Misleading graphs or pictographs

            Errors interpreting percentages -- Page 15 has a good summary of % calculations

            Confusing correlation with causality

            Overly precise numbers (textbook itself fell into this trap on page 4)

            Partial pictures

            Deliberate distortions
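
A sketch of the basic percentage calculations that are most often misread. These are the standard formulas, not a copy of the textbook's summary; the function names and numbers are made up for illustration.

```python
def percent_of(percent: float, amount: float) -> float:
    """Return percent% of amount."""
    return percent / 100.0 * amount

def fraction_to_percent(numerator: float, denominator: float) -> float:
    """Express numerator/denominator as a percentage."""
    return numerator / denominator * 100.0

def percent_change(old: float, new: float) -> float:
    """Percentage change going from old to new, relative to old."""
    return (new - old) / old * 100.0

# A common trap: a 50% increase followed by a 50% decrease
# does not bring you back to where you started.
start = 100.0
after_increase = start + percent_of(50, start)
after_decrease = after_increase - percent_of(50, after_increase)

print(f"start: {start:.1f}, up 50%: {after_increase:.1f}, then down 50%: {after_decrease:.1f}")
print(f"3 out of 4 is {fraction_to_percent(3, 4):.1f}%")
print(f"net change: {percent_change(start, after_decrease):.1f}%")
```

The base of the percentage changes between the two steps, which is why the increase and the decrease do not cancel.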

 

Discuss the census controversy referred to on page 4, middle paragraph

Discuss the statement, "90% of all statistics are made up on the spot."

 

1.4  Design of Experiments

 

Observational studies vs Experiments

            Observational study -- we observe and measure, but don't modify the subjects

                        retrospective -- past data

                        cross-sectional -- current status (one point in time)

                        prospective (longitudinal) -- future data

            Experiment -- we apply some treatment, then observe the effects

               Avoid confounding:  when several possible causes are mixed together, you can't tell which one produced the effect you're measuring

 

See the decision tree on pg 22.  Triola has a number of these; they are helpful for complex decisions.

 

Controlling the effects of variables

 

            Blinding -- Hawthorne effect (subjects respond differently simply because they know they are part of an experiment)

            Blocks  -- Grouping strategy:  put subjects with similar characteristics in the same block

 

            Fig 1.4:

            Randomized block design

                        Form blocks with similar characteristics

                        Randomly assign treatments to the subjects in each block

            Completely randomized Experimental Design

                        No blocks

                        Assign to groups randomly

            Rigorously controlled design

                        Careful selection of subjects:  [more thorough consideration of blocks]
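
A rough sketch of how the completely randomized and randomized block designs differ in practice. The subjects, the age-group blocking variable, and the function names are all hypothetical; this is one simple way to do the assignment, not the textbook's procedure.

```python
import random
from collections import defaultdict

def completely_randomized(subjects, treatments, rng):
    """No blocks: shuffle all subjects, then deal them out to treatments in turn."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

def randomized_block(subjects, block_of, treatments, rng):
    """Form blocks of similar subjects, then randomize within each block."""
    blocks = defaultdict(list)
    for s in subjects:
        blocks[block_of(s)].append(s)
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment

rng = random.Random(1)  # fixed seed so the sketch is repeatable
# Hypothetical subjects: (id, age group); age group is the blocking variable.
subjects = [(f"S{i}", "under 40" if i < 6 else "40 and over") for i in range(12)]
treatments = ["treatment", "placebo"]

crd = completely_randomized(subjects, treatments, rng)
plan = randomized_block(subjects, lambda s: s[1], treatments, rng)

for subject, arm in sorted(plan.items()):
    print(subject, "->", arm)
```

With blocking, each age group is split evenly between treatment and placebo, so age cannot be confounded with the treatment. The completely randomized version balances only the overall group sizes.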

"Repetition of an experiment on sufficiently large groups of subjects is called replication"

The idea is to be able to apply the scientific method:  if you repeat the experiment, you should get a consistent result.

 

Sampling strategies

 

            Random sample -- Each individual member has an equal chance of being selected

 

            Simple random sample -- Every possible sample of size n has the same chance of being selected (with n = 1 this is the same requirement as a random sample; for larger n it is stricter)

 

            Probability sample -- Each member has a known (not necessarily the same) probability of being selected
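
The three definitions above can be sketched with Python's standard random module. The six-member population and the selection probabilities are invented for illustration.

```python
import math
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable
population = ["A", "B", "C", "D", "E", "F"]  # hypothetical members
n = 3

# Simple random sample: sample() draws n members without replacement,
# so every possible group of size n is equally likely.
srs = rng.sample(population, n)
possible_groups = math.comb(len(population), n)  # 20 possible groups of 3

# Random but NOT simple: split into two fixed halves and pick one half
# with a fair coin flip. Each member still has a 1/2 chance of selection,
# but only 2 of the 20 possible groups can ever occur.
halves = [population[:3], population[3:]]
random_not_simple = rng.choice(halves)

# Probability sample (one draw): each member has a known,
# not necessarily equal, chance of being selected.
weights = [0.30, 0.30, 0.10, 0.10, 0.10, 0.10]  # assumed probabilities
prob_pick = rng.choices(population, weights=weights, k=1)[0]

print("simple random sample:", srs)
print("random, not simple:  ", random_not_simple)
print("probability sample:  ", prob_pick)
```

The halves example shows why the simple random sample condition is stricter for n greater than 1: equal chances for each individual do not guarantee equal chances for each group.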

 

 

Multistage sample design:  combination of the sampling techniques in

            Fig 1-5 (pg 28)

 

 

Updated 08/21/08 by Don Power