Lecture Notes:
Triola, Chapter 1: Introduction to Statistics
Know these definitions:
population
parameter -- A numerical measurement describing an entire population is called a parameter
sample
statistic -- A numerical measurement computed from a sample is called a statistic
discrete data -- Can you count it? How many students....
continuous data -- Can you get fractional measurements? Average speeds of ..
voluntary response sample
random sample
simple random sample
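A quick sketch of the parameter vs. statistic distinction from the definitions above. The ages are made-up numbers for a hypothetical 10-member club (not data from the text), and the fixed seed is only there so the run is repeatable.

```python
import random
import statistics

# Hypothetical population: made-up ages of all 10 members of a small club.
population = [19, 22, 25, 31, 34, 38, 41, 47, 52, 61]

# Parameter: computed from every member of the population.
population_mean = statistics.mean(population)

# Statistic: computed from a sample drawn from that population.
rng = random.Random(0)  # fixed seed only so the sketch is repeatable
sample = rng.sample(population, 4)
sample_mean = statistics.mean(sample)

print("parameter (population mean):", population_mean)
print("statistic (sample mean):", sample_mean)
```

The parameter is a fixed number; the statistic changes from sample to sample, which is why the sampling method matters so much.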
1.2 Types of Data
The point: Don't use methods/computations that are inappropriate for your type of data.
Know the four types of data (first two: qualitative; last two: quantitative):
Nominal - Based on names
e.g. male, female
or, far west, midwest, southern, mid-atlantic, northeast
Ordinal - Some order is possible, even if it can't be quantified
e.g. large, medium, small
or, child, adult, senior
Interval - Numerical scales (like temperatures) where differences are meaningful but there is no natural zero, so ratios are not valid
e.g. songs from the 50's, 60's, 70's, 80's, ...
Ratio - Numerical scales with a natural zero, so ratios can be formed and are valid
(Test: if you double one number, are you also doubling the quantity you are measuring? Is 60 degrees twice as warm as 30 degrees?)
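The doubling test can be checked with a little arithmetic. This sketch assumes the "degrees" are Fahrenheit (the notes don't say) and converts both readings to Kelvin, a scale with a true zero, to see whether the 2-to-1 ratio survives.

```python
def fahrenheit_to_kelvin(deg_f):
    """Convert Fahrenheit to Kelvin, a scale with a true zero (absolute zero)."""
    return (deg_f - 32.0) * 5.0 / 9.0 + 273.15

# Assumption: the "degrees" in the notes are Fahrenheit.
warm, cool = 60.0, 30.0

naive_ratio = warm / cool
kelvin_ratio = fahrenheit_to_kelvin(warm) / fahrenheit_to_kelvin(cool)

print(f"ratio of the Fahrenheit readings: {naive_ratio:.2f}")
print(f"ratio of the absolute temperatures: {kelvin_ratio:.2f}")
```

The readings have a ratio of 2, but the absolute temperatures differ by only about 6%, so "twice as warm" is not a valid statement for interval data.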
Question: What type is appropriate for the numbers on football jerseys? (The bigger the player, the bigger the number?)
1.3 Critical Thinking
Per text, main sources of errors in statistical work:
Dishonesty (on the part of the researcher or the people providing the data)
Unintentional errors (Are political polls always accurate?)
Good sampling techniques are crucial. Potential errors could come from:
Voluntary response sample (or self-selected sample)
Samples that are too small
Loaded questions: questions worded in a way that affects the response
Order of the questions, or the order of the phrases in the questions
Nonresponse
Missing data
Self-interest study
Other errors relate to presentation of the results
Misleading graphs or pictographs
Errors interpreting percentages -- Page 15 has a good summary of % calculations
Confusing correlation with causality
Overly precise numbers (textbook itself fell into this trap on page 4)
Partial pictures
Deliberate distortions
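For the percentage errors mentioned in the list above, a minimal sketch of three common calculations. The function names and the numbers are illustrative assumptions, not taken from the textbook's summary.

```python
def percent_of(percent, amount):
    """'p% of x': drop the % sign, divide by 100, and multiply."""
    return percent / 100 * amount

def fraction_to_percent(numerator, denominator):
    """Divide to get a decimal, then multiply by 100."""
    return numerator / denominator * 100

def percent_change(old, new):
    """Change relative to the starting value, in percent."""
    return (new - old) / old * 100

print(percent_of(6, 1200))        # 6% of 1200
print(fraction_to_percent(3, 4))  # 3/4 as a percentage
print(percent_change(200, 0))     # dropping all the way to zero
```

Dropping to zero is a 100% decrease, so a claimed reduction of more than 100% in a quantity that can't go negative should raise a flag.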
Discuss the census controversy referred to on page 4, middle paragraph
Discuss the statement, "90% of all statistics are made up on the spot."
1.4 Design of Experiments
Observational studies vs Experiments
Observational study -- We observe and measure, but don't modify the subjects
retrospective (past data)
cross-sectional (data from one point in time, i.e. current status)
prospective, or longitudinal (future data)
Experiment -- we apply some treatment, then observe the effects
Avoid confounding: multiple causes could produce the effect you're measuring
See decision tree on pg 22: Triola has a number of these; they are helpful for complex decisions
Controlling the effects of variables
Blinding -- Hawthorne effect
Blocks -- Sampling strategy.
Fig 1-4:
Randomized block design
Form blocks with similar characteristics
Randomly assign treatments to the subjects in each group
Completely randomized experimental design
No blocks
Assign to groups randomly
Rigorously controlled design
Careful selection of subjects: [more thorough consideration of blocks]
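A rough sketch of the difference between the completely randomized and randomized block designs listed above. The subject labels, the age-group blocks, and the function names are all hypothetical; this is one simple way to do the assignment, not the textbook's procedure.

```python
import random

def completely_randomized(subjects, treatments, rng):
    """Shuffle all subjects, then deal them out to treatments in turn."""
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    return {s: treatments[i % len(treatments)] for i, s in enumerate(shuffled)}

def randomized_block(blocks, treatments, rng):
    """Randomize separately inside each block of similar subjects."""
    assignment = {}
    for members in blocks.values():
        assignment.update(completely_randomized(members, treatments, rng))
    return assignment

# Hypothetical subjects, blocked by a made-up age-group characteristic.
blocks = {
    "under_40": ["A", "B", "C", "D"],
    "40_plus": ["E", "F", "G", "H"],
}
treatments = ["treatment", "placebo"]
rng = random.Random(1)  # fixed seed only so the sketch is repeatable

all_subjects = [s for members in blocks.values() for s in members]
print(completely_randomized(all_subjects, treatments, rng))
print(randomized_block(blocks, treatments, rng))
```

With blocking, each age group is guaranteed an even split between treatment and placebo; without it, the split inside a group is left to chance.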
"Repetition of an experiment on sufficiently large groups of subjects is called replication"
The idea is to be able to apply the scientific method: if you repeat the experiment you should get the same result
Sampling strategies
Random sample -- Each individual member has an equal chance of being selected
Simple random sample -- Every possible sample of size n has the same chance of being selected (when n=1, this is the same condition as a random sample)
Probability sample -- Each member has a known (not necessarily the same) probability of being selected
Multistage sample design: a combination of the sampling techniques in Fig 1-5 (pg 28)
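The random vs. simple random distinction is easy to blur, so here is a small sketch with a hypothetical population of six. The coin-flip scheme is an invented example: each member has a 1/2 chance of selection, but only two of the possible samples of size 3 can ever come up.

```python
import random
from itertools import combinations

members = ["A", "B", "C", "D", "E", "F"]  # hypothetical population of six
n = 3

def simple_random_sample(rng):
    """Every 3-member subset is equally likely."""
    return frozenset(rng.sample(members, n))

def coin_flip_sample(rng):
    """Pick one of two fixed halves: each member has chance 1/2,
    but only two of the possible samples can ever occur."""
    return frozenset(rng.choice([members[:3], members[3:]]))

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
possible = len(list(combinations(members, n)))
srs_seen = {simple_random_sample(rng) for _ in range(2000)}
flip_seen = {coin_flip_sample(rng) for _ in range(2000)}

print("possible samples of size 3:", possible)
print("distinct samples seen, simple random:", len(srs_seen))
print("distinct samples seen, coin flip:", len(flip_seen))
```

Both schemes give every individual the same chance, so both are random samples, but only the first is a simple random sample.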
Updated 08/21/08 by Don Power