Lecture - Ch 3 Triola
3.2 Measures of Center
Mean
Median
Mode
Midrange
Weighted Mean / Mean from frequency distribution / GPA
Meaning of Greek capital Sigma: Σ
Pg 88: round-off suggestion
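Example: the measures of center above can be sketched in Python. The data values, grade points, and credit hours are hypothetical, chosen only for illustration; the functions come from Python's standard statistics module.

```python
from statistics import mean, median, multimode

# Hypothetical sample data, chosen only for illustration.
data = [2, 3, 3, 5, 7, 10]

center_mean = mean(data)                       # sum of values / number of values
center_median = median(data)                   # middle of the sorted data (average of the two middle values here)
center_modes = multimode(data)                 # list of the most frequent value(s)
center_midrange = (max(data) + min(data)) / 2  # halfway between Max and Min

# Weighted mean: sum(w * x) / sum(w). Here a GPA, with credit hours as weights.
grade_points = [4.0, 3.0, 2.0]  # hypothetical grades: A, B, C
credits = [3, 4, 3]             # hypothetical credit hours for each course
gpa = sum(w * x for w, x in zip(credits, grade_points)) / sum(credits)

print(center_mean, center_median, center_modes, center_midrange, gpa)
```

The same weighted-mean line works for a mean from a frequency distribution: use the class midpoints as x and the frequencies as the weights.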
3.3 Measures of Variation: How spread out is the data?
Possible measures:
Range = Max - Min. Doesn't consider all data.
Mean deviation from mean
Σ (x - x-bar) / N. Always 0.
Mean absolute deviation from mean
Σ |x - x-bar| / N. Usable, but awkward due to the absolute value.
RMS (Root mean square)
sqrt [ Σ x^2 / N ]. Used for electricity, with alternating current.
Sqrt of mean of squares of deviations from mean [Standard Deviation]
sqrt [ Σ (x - x-bar)^2 / N ]. Most frequently used measure.
Variance = (Std Dev)^2
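Example: a minimal sketch that computes each candidate measure for one small data set, treated as a population (divide by N). The data values are hypothetical. It shows why the plain mean deviation is useless: it comes out to 0.

```python
from math import sqrt

# Hypothetical population data, chosen so the arithmetic comes out evenly.
data = [2, 4, 4, 4, 5, 5, 7, 9]
N = len(data)
x_bar = sum(data) / N

data_range = max(data) - min(data)                     # uses only two of the values
mean_dev = sum(x - x_bar for x in data) / N            # deviations cancel out
mean_abs_dev = sum(abs(x - x_bar) for x in data) / N   # absolute values prevent cancelling
rms = sqrt(sum(x ** 2 for x in data) / N)              # root mean square of the raw values
variance = sum((x - x_bar) ** 2 for x in data) / N     # mean of squared deviations
std_dev = sqrt(variance)                               # square root of the variance

print(data_range, mean_dev, mean_abs_dev, rms, variance, std_dev)
```

Note that RMS squares the raw values, while the standard deviation squares the deviations from the mean.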
Issues with StDev
1. Sample StDev vs Pop StDev
2. Computing Formula to avoid having to calculate mean first
Computing formula
Definition: s = sqrt [ Σ (x - x-bar)^2 / (n - 1) ]
Computing formula: s = sqrt [ (Σ x^2 - n (x-bar)^2) / (n - 1) ], where x-bar = Σ x / n
Note: Formula in book is slightly different, and has the advantage that if the data consists of integers, both the numerator and denominator under the radical will also be integers.
Definition vs Computing formula: Prove that they are the same
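We need to show Σ (x - x-bar)^2 = Σ x^2 - n (x-bar)^2
Note: Σ 1 means "add 1 once for each data value", so Σ 1 = 1 + 1 +...+1 (n terms) = n
Proof:
Σ (x - x-bar)^2
Expand binomial square
= Σ [ x^2 - 2 x (x-bar) + (x-bar)^2 ]
Linearity: Σ (a + b) = Σ a + Σ b
= Σ x^2 - Σ 2 x (x-bar) + Σ (x-bar)^2
Factor out constants: 2 and x-bar
= Σ x^2 - 2 (x-bar) Σ x + (x-bar)^2 Σ 1
Def. of x-bar is Σ x / n, so Σ x = n (x-bar); Σ 1 = n
= Σ x^2 - 2 n (x-bar)^2 + n (x-bar)^2
Simplify algebraically
= Σ x^2 - n (x-bar)^2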
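Example: a numerical check that the definition and the computing formula agree. This sketch assumes the identity being proved is Σ (x - x-bar)^2 = Σ x^2 - n (x-bar)^2 and that s divides by n - 1. The third function is an assumed version of the integer-only form that the note about the book describes (numerator and denominator both multiplied by n). The function names and the data are hypothetical.

```python
from math import sqrt, isclose

def s_definition(data):
    """Sample standard deviation straight from the definition."""
    n = len(data)
    x_bar = sum(data) / n
    return sqrt(sum((x - x_bar) ** 2 for x in data) / (n - 1))

def s_computing(data):
    """Same quantity, with sum((x - x_bar)^2) replaced by sum(x^2) - n * x_bar^2."""
    n = len(data)
    x_bar = sum(data) / n
    return sqrt((sum(x * x for x in data) - n * x_bar ** 2) / (n - 1))

def s_integer_form(data):
    """Numerator and denominator multiplied by n: integer data gives integers under the radical."""
    n = len(data)
    sum_x = sum(data)
    sum_x2 = sum(x * x for x in data)
    return sqrt((n * sum_x2 - sum_x ** 2) / (n * (n - 1)))

data = [1, 2, 3, 4, 5]  # hypothetical sample
results = (s_definition(data), s_computing(data), s_integer_form(data))
print(results)
```

A numerical check on one data set is not a proof; it only confirms that the algebra was carried out consistently.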
Range Rule of Thumb: s ≈ range / 4
Empirical Rule – For Normal data sets: What % of data is within _______ SD of mean?
Chebyshev’s Theorem – For general data sets: Same issue, but data cannot be expected to be as close to the mean.
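Example: a short sketch comparing the two rules. Chebyshev's Theorem guarantees that at least 1 - 1/k^2 of any data set lies within k standard deviations of the mean (for k > 1). The Empirical Rule percentages are the usual approximate values for normal data. The function and constant names are hypothetical.

```python
def chebyshev_min_fraction(k):
    """Minimum fraction of any data set within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("Chebyshev's theorem says nothing useful unless k > 1")
    return 1 - 1 / k ** 2

# Approximate Empirical Rule fractions for normal (bell-shaped) data.
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

for k in (2, 3):
    print(k, "SD: at least", chebyshev_min_fraction(k),
          "for any data; about", EMPIRICAL_RULE[k], "for normal data")
```

For every k, the Chebyshev figure is a guaranteed minimum and is lower than the Empirical Rule figure, which matches the point that general data cannot be expected to sit as close to the mean.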
Coefficient of variation – For comparing variation in different data sets:
CV = s / x-bar, converted to a percent.
The data set with the highest CV is the most spread out relative to its mean
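Example: a minimal sketch of comparing variation in two data sets that are measured in different units. Both data sets are hypothetical. stdev from the standard statistics module is the sample standard deviation (divides by n - 1).

```python
from statistics import mean, stdev

def coefficient_of_variation(data):
    """CV = s / x-bar, expressed as a percent."""
    return stdev(data) / mean(data) * 100

# Hypothetical samples in different units.
heights_cm = [160, 165, 170, 175, 180]
weights_kg = [50, 60, 70, 80, 90]

cv_heights = coefficient_of_variation(heights_cm)
cv_weights = coefficient_of_variation(weights_kg)
print(round(cv_heights, 1), round(cv_weights, 1))
```

Because CV has no units, the two percents can be compared directly even though cm and kg cannot.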
3.4 Measures of Relative Standing and Boxplots
z-scores
Formula: z = (x – x-bar) / s, or the equivalent for populations: z = (x – μ) / σ
Round-off rule: to 2 decimal places (like table A-2)
Ex 1: Which score is relatively more extreme?
What constitutes an “unusual” value? Interpret using the empirical rule; Chebyshev’s Thm
Negative z-scores: How do they happen? What do they tell us?
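Example: a sketch of the "which score is relatively more extreme?" comparison. The scores, means, and standard deviations are hypothetical. The cutoff of 2 standard deviations for "unusual" is an assumption here, based on the Empirical Rule (about 95% of normal data lies within 2 SD of the mean).

```python
def z_score(x, mean, sd):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return round((x - mean) / sd, 2)  # round-off rule: 2 decimal places

def is_unusual(z, cutoff=2):
    """Assumed convention: unusual means more than `cutoff` SD from the mean."""
    return abs(z) > cutoff

# Hypothetical scores on two different tests.
z_a = z_score(85, mean=70, sd=10)  # above the mean, so z is positive
z_b = z_score(60, mean=75, sd=5)   # below the mean, so z is negative

more_extreme = "A" if abs(z_a) > abs(z_b) else "B"
print(z_a, z_b, more_extreme, is_unusual(z_a), is_unusual(z_b))
```

A negative z-score only says the value is below the mean; how extreme it is depends on the absolute value.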
Percentiles – See definition, pg 116: they are locations that divide the data into 100 groups
Two basic problems
Given a value (data item), find its percentile
Formula: percentile of x = (number of values less than x) / (total number of values), converted to a percent
Given a percentile, find the location (data value)
Sort the data and compute L = (k/100)*n, where k is the percentile and n is the number of values, then
If L is not an integer, round L up (always up), then take the Lth item
If L is an integer, average the Lth item and the next item
Relate percentiles to the median and to quartiles
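Example: the two percentile problems, sketched with the rules given above. The function names and the data are hypothetical, and the second function assumes 0 < k < 100. Software packages often use other percentile conventions, so results may differ slightly from this rule.

```python
from math import ceil
from statistics import median

def percentile_of_value(data, x):
    """Percentile of x: (number of values less than x) / (total number of values), as a percent."""
    return sum(1 for v in data if v < x) / len(data) * 100

def value_at_percentile(data, k):
    """Data value at the k-th percentile, using L = (k/100)*n on the sorted data."""
    ordered = sorted(data)
    n = len(ordered)
    L = k * n / 100  # multiply before dividing so a whole-number L comes out exact
    if L.is_integer():
        i = int(L)
        return (ordered[i - 1] + ordered[i]) / 2  # average the Lth item and the next item
    return ordered[ceil(L) - 1]                   # round L up, then take the Lth item

data = [3, 5, 7, 8, 9, 11, 13, 15, 18, 20]  # hypothetical data

p_of_9 = percentile_of_value(data, 9)
q1 = value_at_percentile(data, 25)   # first quartile = 25th percentile
q2 = value_at_percentile(data, 50)   # second quartile = 50th percentile
q3 = value_at_percentile(data, 75)   # third quartile = 75th percentile
print(p_of_9, q1, q2, q3, median(data))
```

The 50th percentile found this way equals the median, and the 25th and 75th percentiles are the quartiles Q1 and Q3.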
Boxplots:
Plot the 5-number summary: Min, Q1, Median, Q3, Max
For modified boxplot, you can show outliers as dots outside the main graph
Outliers, for this purpose, are items that lie below Q1 or above Q3 by more than 1.5 * Interquartile range (Q3 – Q1)
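Example: a sketch of the numbers behind a modified boxplot, namely the 5-number summary and the outlier test. The data and the helper name percentile_value are hypothetical. Quartiles are found as the 25th, 50th, and 75th percentiles with the L = (k/100)*n rule; other software may compute quartiles slightly differently.

```python
from math import ceil

def percentile_value(ordered, k):
    """k-th percentile of already-sorted data, using the L = (k/100)*n rule."""
    n = len(ordered)
    L = k * n / 100
    if L.is_integer():
        i = int(L)
        return (ordered[i - 1] + ordered[i]) / 2  # L is an integer: average Lth and next item
    return ordered[ceil(L) - 1]                   # otherwise round L up and take that item

data = sorted([3, 5, 7, 8, 9, 11, 13, 15, 18, 40])  # hypothetical data

q1, med, q3 = (percentile_value(data, k) for k in (25, 50, 75))
five_number_summary = (data[0], q1, med, q3, data[-1])  # Min, Q1, Median, Q3, Max

iqr = q3 - q1                # interquartile range
low_fence = q1 - 1.5 * iqr   # anything below this is an outlier
high_fence = q3 + 1.5 * iqr  # anything above this is an outlier
outliers = [x for x in data if x < low_fence or x > high_fence]

print(five_number_summary, iqr, outliers)
```

In a modified boxplot, the values in outliers would be drawn as separate dots beyond the main graph.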