Merced College; Don Power

 

Lecture Notes:  Triola, Chapter 1:  Introduction to statistics

 

1.2 Statistical Thinking

 

Know these definitions:

 

population

parameter    A numerical measurement describing some characteristic of an entire population

sample

statistic        A numerical measurement describing some characteristic of a sample

 

            Census vs sample

 

            discrete data                Can you count it?  How many students....

            continuous data           Can you get fractional measurements?  Average speeds of ..

 

            voluntary response sample – Why would such samples cause studies to go wrong?
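
A minimal sketch of the parameter/statistic distinction, assuming a made-up population of 1,000 student ages (the data, the seed, and the sample size of 50 are illustrative assumptions, not from the text):

```python
import random
import statistics

# Hypothetical population: ages (whole years) of all 1,000 students at a made-up school.
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.randint(17, 60) for _ in range(1000)]

# Census: every member is measured, so this mean is a parameter.
parameter_mean = statistics.mean(population)

# Sample: only 50 randomly chosen members are measured, so this mean is a statistic.
sample = rng.sample(population, 50)
statistic_mean = statistics.mean(sample)

print(f"parameter (population mean): {parameter_mean:.2f}")
print(f"statistic (sample mean):     {statistic_mean:.2f}")
```

The two means will usually be close but not equal; the statistic changes with each new sample, while the parameter stays fixed.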

 

1.3 Types of Data

 

The point:  Don't use methods/computations that are inappropriate for your type of data.

 

Know the four types of data:             First 2:  Qualitative,    Last 2: Quantitative

 

            Nominal - Based on names                

                        e.g. male, female

                        or,  far west, midwest, southern, mid-atlantic, northeast

 

            Ordinal - Some order is possible, even if it can't be quantified

                       

                        e.g. large, medium, small

                        or, child, adult, senior

 

            Interval - Numerical scales (like temperatures) where differences are meaningful but there is no natural zero, so ratios are not valid

                        e.g.  songs from the 50's, 60's, 70's, 80's, ...

 

            Ratio - Numerical ratios can be formed and are valid

(Test:  if you double one number, are you also doubling the quantity you are measuring? Is 60 degrees twice as warm as 30 degrees?  Is 60 pounds twice as heavy as 30 pounds?)
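
The doubling test can be checked with a little arithmetic. This sketch assumes the temperatures are in degrees Fahrenheit (the notes do not say) and converts them to Kelvin, a scale with a true zero; the function names are made up for illustration:

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert degrees Fahrenheit to Kelvin (a temperature scale with a true zero)."""
    return (deg_f - 32) * 5 / 9 + 273.15


def pounds_to_kilograms(lb):
    """Convert pounds to kilograms (both scales start at a true zero)."""
    return lb * 0.45359237


# Interval data: doubling the number does not double the quantity.
temp_ratio = fahrenheit_to_kelvin(60) / fahrenheit_to_kelvin(30)

# Ratio data: doubling the number doubles the quantity, in any unit.
weight_ratio = pounds_to_kilograms(60) / pounds_to_kilograms(30)

print(f"60 F vs 30 F, measured in Kelvin: ratio = {temp_ratio:.3f}")
print(f"60 lb vs 30 lb, measured in kg:   ratio = {weight_ratio:.3f}")
```

The temperature ratio comes out near 1.06, not 2, while the weight ratio stays at 2 after the unit change.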

 

Question:  What type is appropriate for the numbers on football jerseys? (The bigger the player, the bigger the number?)

 

 

1.4  Critical Thinking

 

Per text, main sources of errors in statistical work:

            Dishonesty (on the part of the researcher or the people providing the data)

            Unintentional errors (Are political polls always accurate?)

                        Did you get a valid sample?

                        Did you draw a valid conclusion from your sample?

 

Good sampling techniques are crucial.  Potential errors could come from:

 

            Voluntary response sample (or self-selected sample)

            Samples that are too small

            Loaded questions:  questions worded in a way that affects the response

            Order of the questions, or the order of the phrases in the questions

            Nonresponse

            Missing data

            Self-interest study

           

Other errors relate to presentation of the results

 

            Misleading graphs or pictographs

            Errors interpreting percentages – Text has a good summary of % calculations

            Confusing correlation with causality

            Overly precise numbers

            Partial pictures

            Deliberate distortions
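
For the percentage item in the list above, a minimal sketch of the usual conversions. The function names and the numbers are made up for illustration; see the text for its own summary:

```python
def percent_of(percent, amount):
    # "6% of 1200" means 6/100 times 1200.
    return percent * amount / 100


def fraction_to_percent(numerator, denominator):
    # Divide, then multiply by 100.
    return numerator / denominator * 100


def percent_to_decimal(percent):
    # Drop the % sign and divide by 100.
    return percent / 100


print(percent_of(6, 1200))        # 72.0
print(fraction_to_percent(3, 4))  # 75.0
print(percent_to_decimal(85))     # 0.85
```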

 

What was the problem with the Literary Digest poll (chapter problem, pg 3)?

Discuss the statement, "90% of all statistics are made up on the spot."

 

1.5  Collecting Sample Data:  “Design of Experiments”

 

Observational studies vs Experiments

            Observational study -- We observe and measure, but don't modify the subjects

                        retrospective (Past data)

                        cross-sectional (i.e. current status)

                        prospective (longitudinal)  Future data

            Experiment -- we apply some treatment, then observe the effects

               Avoid confounding:  multiple causes could produce the effect you're measuring

 

Be able to distinguish among Sampling strategies (fig 1-2, pg 29)

 

            Random sample -- Each individual member has an equal chance of being selected

 

Simple random sample -- Every possible sample of size n has the same chance of being selected (with n = 1 this reduces to the definition of a random sample)

 

Systematic sample – After some random starting point, select every kth member.

 

Stratified sampling – Subdivide population into subgroups sharing common characteristics (e.g. Democrat, Republican, Independent), and get a random sample within each group

 

Cluster sampling – Divide population into groups/clusters/classes, select random groups, survey all the members of the selected groups.
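
The four strategies above can be sketched in a few lines. This is only an illustration under stated assumptions: the population of 60 voters, the party and precinct tags, and all function names are made up, not taken from the text.

```python
import random

rng = random.Random(1)  # fixed seed for a repeatable illustration

# Hypothetical population: 60 voters, each tagged with a party and a precinct.
parties = ["Democrat", "Republican", "Independent"]
population = [
    {"id": i, "party": parties[i % 3], "precinct": i // 10}
    for i in range(60)
]


def simple_random_sample(pop, n):
    # Every possible group of n members is equally likely.
    return rng.sample(pop, n)


def systematic_sample(pop, k):
    # Random starting point among the first k members, then every kth member.
    start = rng.randrange(k)
    return pop[start::k]


def stratified_sample(pop, key, n_per_group):
    # Split into subgroups (strata) that share a characteristic,
    # then draw a random sample inside each subgroup.
    strata = {}
    for member in pop:
        strata.setdefault(member[key], []).append(member)
    chosen = []
    for group in strata.values():
        chosen.extend(rng.sample(group, n_per_group))
    return chosen


def cluster_sample(pop, key, n_clusters):
    # Split into clusters, pick whole clusters at random,
    # then keep every member of the chosen clusters.
    cluster_ids = sorted({member[key] for member in pop})
    picked = set(rng.sample(cluster_ids, n_clusters))
    return [member for member in pop if member[key] in picked]


srs = simple_random_sample(population, 6)
systematic = systematic_sample(population, 10)
stratified = stratified_sample(population, "party", 2)
clusters = cluster_sample(population, "precinct", 2)

print("simple random:", [m["id"] for m in srs])
print("systematic:   ", [m["id"] for m in systematic])
print("stratified:   ", [(m["id"], m["party"]) for m in stratified])
print("cluster:      ", [m["id"] for m in clusters])
```

Note the difference between the last two: stratified sampling takes a few members from every group, while cluster sampling takes every member from a few groups.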

 

           

 

Controlling the effects of variables

 

            Blinding -- Hawthorne effect

            Blocks  -- Groups of subjects that are similar in ways that might affect the outcome.

 

            Fig 1.4:

            Randomized block design

                        Form blocks with similar characteristics

                        Randomly assign treatments to the subjects in each group

            Completely randomized Experimental Design

                        No blocks

                        Assign to groups randomly

            Rigorously controlled design

                        Careful selection of subjects:  [more thorough consideration of blocks]
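
A minimal sketch contrasting the completely randomized design with the randomized block design. The 12 subjects, the choice of sex as the blocking characteristic, the two treatment labels, and the function names are all assumptions made for illustration:

```python
import random

rng = random.Random(2)  # fixed seed for a repeatable illustration

# Hypothetical subjects: 12 people, with sex as the blocking characteristic.
subjects = [{"id": i, "sex": "F" if i < 6 else "M"} for i in range(12)]
treatments = ["treatment", "placebo"]


def completely_randomized(subjs):
    # No blocks: shuffle everyone, then deal the treatments out in turn.
    shuffled = subjs[:]
    rng.shuffle(shuffled)
    return {
        s["id"]: treatments[i % len(treatments)]
        for i, s in enumerate(shuffled)
    }


def randomized_block(subjs, key):
    # Form blocks of similar subjects, then randomize within each block.
    blocks = {}
    for s in subjs:
        blocks.setdefault(s[key], []).append(s)
    assignment = {}
    for block in blocks.values():
        assignment.update(completely_randomized(block))
    return assignment


cr_assignment = completely_randomized(subjects)
block_assignment = randomized_block(subjects, "sex")

print("completely randomized:", cr_assignment)
print("randomized block:     ", block_assignment)
```

With blocks, each sex is guaranteed an even split between treatment and placebo; without blocks, the split within each sex is left to chance.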

 

"Repetition of an experiment on sufficiently large groups of subjects is called replication"

The idea is to be able to apply the scientific method:  if you repeat the experiment you should get the same result

 

Multistage sample design:  combination of sampling techniques

 

Be able to distinguish among (based on when the data occurred):

           

Retrospective study

Cross-sectional study

Prospective (longitudinal) study

 

See decision tree on pg 31.  Triola has a number of these; they are helpful for complex decisions.

 

 

 

Updated 08/18/09 by Don Power