

 * Measures of Spread in Discrete Distributions **

In Year 9, we worked with the ** range ** (largest minus smallest) and the Inter-Quartile Range (Upper Quartile minus Lower Quartile) as ways of measuring the spread of the data.

In higher levels of maths we use ** variance ** and ** standard deviation ** because they give us a finer measurement. This is because they are based on all the data, not just the endpoints (or the quartiles).

In all measures of spread, a larger value means the data is more spread out, and a smaller value means the data is clumped more closely together.


 * Variance **

Notation:

math . \qquad \textbf{Var}(X) \textit{ or } \sigma^2 \qquad \{ \sigma \text{ is the lower case sigma } \} \qquad. math

The rule for the Variance of the Variable X is:

math \\ . \qquad \sigma^2 = E\left[(X - \mu)^2\right] \qquad \{ \mu \text{ is the mean or Expected Value of X } \} \qquad. \\ . \\ . \qquad \sigma^2 = \sum \, (x - \mu)^2 \times p(x) math

Example 1a (note shortcut below)

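In the distribution below, calculate Var(X).

math . \qquad \begin{array}{c|cccc} x & 0 & 1 & 2 & 3 \\ \hline p(x) & 0.25 & 0.35 & 0.2 & 0.2 \end{array} \qquad. math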


 * __Solution__**


 * First calculate the mean (Expected Value) **

math \\ . \qquad \mu = \Sigma \; x \times p(x) \\. \\ . \qquad \mu = 0\times 0.25 + 1 \times 0.35 + 2 \times 0.2 + 3 \times 0.2 \qquad. \\ . \\ . \qquad \mu = 1.35 math


 * Then fill in the table as shown: **


 * Hence the Variance Var(X) is: **

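math \\ . \qquad \sigma^2 = \sum \, (x - \mu)^2 \times p(x) \qquad. \\ . \\ . \qquad \sigma^2 = (0 - 1.35)^2 \times 0.25 + (1 - 1.35)^2 \times 0.35 + (2 - 1.35)^2 \times 0.2 + (3 - 1.35)^2 \times 0.2 \qquad. \\ . \\ . \qquad \sigma^2 = 1.1275 math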
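The calculation in Example 1a can also be checked with a few lines of Python. This is only a sketch: the function names `mean` and `variance_by_definition` are made up for illustration, and the values and probabilities are the ones used in the mean calculation above.

```python
# Distribution from Example 1a: values of X and their probabilities.
xs = [0, 1, 2, 3]
ps = [0.25, 0.35, 0.2, 0.2]


def mean(xs, ps):
    """Expected value: the sum of x * p(x)."""
    return sum(x * p for x, p in zip(xs, ps))


def variance_by_definition(xs, ps):
    """Variance from the definition: the sum of (x - mu)^2 * p(x)."""
    mu = mean(xs, ps)
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))


mu = mean(xs, ps)
var = variance_by_definition(xs, ps)

# Round to 4 decimal places to hide tiny floating-point errors.
print(round(mu, 4), round(var, 4))
```

The rounded results should agree with the hand calculation (mean 1.35, variance 1.1275).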

{This is a lot of calculation. There is an easier way!}


 * Alternate Rule for Variance **

(This rule is obtained on page 410)

math . \qquad \sigma^2 = E(X^2) - \mu^2 \qquad. math

Recall that:

math \\ . \qquad E(X) = \sum \, x \, p(x) \\. \\ . \textit{similarly} \\. \\ . \qquad E(X^2) = \sum \, x^2 \, p(x) \qquad. math

So the alternate rule for Variance can be expressed as:

math . \qquad \sigma^2 = \sum \, x^2 \, p(x) - \mu^2 \qquad. math


 * Example 1b **

Using the same distribution as in Example 1a, calculate Var(X) with the alternate rule.


 * __Solution__**


 * First calculate the mean (Expected Value) **

math . \qquad \mu = 1.35 \qquad. math


 * Then fill in the table as shown **

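math \\ . \qquad E(X^2) = \sum \, x^2 \, p(x) \qquad. \\ . \\ . \qquad E(X^2) = 0^2 \times 0.25 + 1^2 \times 0.35 + 2^2 \times 0.2 + 3^2 \times 0.2 = 2.95 \qquad. \\ . \\ .\text{So} \\. \\ . \qquad \sigma^2 = E(X^2) - \mu^2 \\. \\ . \qquad \sigma^2 = 2.95 - 1.35^2 \\. \\ . \qquad \sigma^2 = 1.1275 math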
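As a rough check that the alternate rule gives the same answer as the definition, the sketch below computes the variance both ways for the Example 1 distribution. The variable names are illustrative only.

```python
# Distribution from Example 1: values of X and their probabilities.
xs = [0, 1, 2, 3]
ps = [0.25, 0.35, 0.2, 0.2]

# Mean and E(X^2), each a single pass over the table.
mu = sum(x * p for x, p in zip(xs, ps))
e_x2 = sum(x ** 2 * p for x, p in zip(xs, ps))

# Alternate rule: Var(X) = E(X^2) - mu^2.
var_shortcut = e_x2 - mu ** 2

# Definition: Var(X) = sum of (x - mu)^2 * p(x).
var_definition = sum((x - mu) ** 2 * p for x, p in zip(xs, ps))

print(round(e_x2, 4), round(var_shortcut, 4), round(var_definition, 4))
```

The shortcut needs only one extra column (x squared times p(x)) instead of subtracting the mean from every value, which is why it is quicker by hand.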


 * Variance Theorems **

... ... Var(aX) = a² Var(X)

... ... Var(X + b) = Var(X)


 * {The second result holds because adding a constant b shifts every value, and so the mean, by b, but the spread of the data is not affected} **

... ... Eg: Var(3X + 2) = Var(3X) = 3² Var(X) = 9 Var(X)
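The two theorems can be illustrated numerically. The sketch below reuses the Example 1 distribution (an assumption made just for the illustration) and a hypothetical helper called `variance`; it transforms the values of X and leaves the probabilities unchanged.

```python
def variance(xs, ps):
    """Variance from the definition: the sum of (x - mu)^2 * p(x)."""
    mu = sum(x * p for x, p in zip(xs, ps))
    return sum((x - mu) ** 2 * p for x, p in zip(xs, ps))


# Distribution from Example 1.
xs = [0, 1, 2, 3]
ps = [0.25, 0.35, 0.2, 0.2]

var_x = variance(xs, ps)

# Each value x becomes 3x + 2; the probabilities stay the same.
var_3x_plus_2 = variance([3 * x + 2 for x in xs], ps)

# Adding a constant only shifts the values, so the spread is unchanged.
var_x_plus_5 = variance([x + 5 for x in xs], ps)

print(round(var_x, 4), round(var_3x_plus_2, 4), round(var_x_plus_5, 4))
```

Here Var(3X + 2) should come out as 9 times Var(X), and Var(X + 5) should equal Var(X).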


 * Standard Deviation **

The ** Standard Deviation ** of X is the **__square root__** of the ** Variance **.

Hence the Standard Deviation is simply σ (sigma).

math . \qquad \sigma = \sqrt{Var(X)} \qquad. math

The Standard Deviation is useful because taking the square root undoes the squaring (the x²) in the calculation for Variance. Therefore the Standard Deviation is measured in the same units as the original data.


 * Example 1c **

In the above example, the variance was:

math \\ . \qquad Var(X) = \sigma^2 = 1.1275 \qquad. \\ . \\ . \textit{hence} \\. \\ . \qquad SD(X) = \sigma = \sqrt{1.1275} \approx 1.0618 math
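A short sketch of the same step in Python, assuming the variance of 1.1275 found in Example 1:

```python
import math

# Variance of X from Example 1.
variance = 1.1275

# The standard deviation is the square root of the variance.
sd = math.sqrt(variance)

print(round(sd, 4))
```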

Mean and Standard Deviation on the CAS Calculator

From the **Main Menu** select the **Statistics** package.

Enter **x** into **list1** and **Pr(X=x)** into **list2**.

Go to the ** Calc ** menu and select ** One-Variable **.

Set **Xlist** to **list1** and set **Freq** to **list2**.

math \\ . \qquad \text{Mean will appear as } \bar{x} \\. \\ . \qquad \text{Standard Deviation will appear as } x\sigma_n \text{ or } \sigma_x \qquad. math

For the Variance you will need to square the Standard Deviation.


 * Confidence Intervals **

In many distributions, __approximately__ 95% of the data will lie within two standard deviations of the mean.

This means that __approximately__ 95% of the data will lie between

math . \qquad \mu - 2\sigma \text{ and } \mu + 2\sigma \qquad. math

OR

math . \qquad Pr(\mu - 2\sigma \leqslant X \leqslant \mu + 2\sigma) \approx 0.95 \qquad. math

This is known as the 95% confidence interval.

(Not in the course. Why not is a very deep mystery, since they are mentioned in continuous distributions.)
 * Other Confidence Intervals **


 * 68% confidence interval ** (1 standard deviation from the mean)

math . \qquad Pr(\mu - \sigma \leqslant X \leqslant \mu + \sigma) \approx 0.68 \qquad. math


 * 99.7% confidence interval ** (3 standard deviations from the mean)

math . \qquad Pr(\mu - 3\sigma \leqslant X \leqslant \mu + 3\sigma) \approx 0.997 \qquad. math


 * Example 2 **

The probability distribution of X is given by

math . \qquad p(x) = \dfrac{x^2}{54} \qquad \text { where } \; x \in \{ 2, 3, 4, 5 \} \qquad. math

... ... a) Find the mean, variance and standard deviation, correct to 4 decimal places.

... ... b) Hence find the probability that X is within 2 standard deviations of the mean, correct to 4 decimal places.

__**Solution**__
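 * First fill out the probability distribution table **

math . \qquad \begin{array}{c|cccc} x & 2 & 3 & 4 & 5 \\ \hline p(x) & \dfrac{4}{54} & \dfrac{9}{54} & \dfrac{16}{54} & \dfrac{25}{54} \end{array} \qquad. math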


 * Mean (Expected Value)[[image:bhs-methods34/10Dcalc2.gif width="262" height="356" align="right"]] **

math \\ . \qquad \mu = E(X) = \Sigma \; xp(x) \\. \\ . \qquad \mu = 2 \times \dfrac{4}{54} + 3 \times \dfrac{9}{54} + 4 \times \dfrac{16}{54} + 5 \times \dfrac{25}{54} \qquad. \\ . \\ . \qquad \mu = 4\dfrac{8}{54} \\. \\ . \qquad \mu \approx 4.1481 math


 * Variance **

math \\ . \qquad E(X^2) = 4 \times \dfrac{4}{54} + 9 \times \dfrac{9}{54} + 16 \times \dfrac{16}{54} + 25 \times \dfrac{25}{54} \qquad .\\. \\ . \qquad E(X^2) = 18\dfrac{6}{54} \\. \\ . \qquad E(X^2) \approx 18.1111 \\. \\ . \\ . \qquad \sigma^2 = E(X^2) - \mu^2 \\. \\ . \qquad \sigma^2 = 18\dfrac{6}{54} - \left(4\dfrac{8}{54}\right)^2 \\. \\ . \qquad \sigma^2 \approx 0.9040 math


 * Standard Deviation **

math . \qquad \sigma = \sqrt{0.9040} \qquad. \\ . \\ . \qquad \sigma \approx 0.9508 math


 * Confidence Interval **

math \\ . \qquad \mu-2\sigma \approx 2.2466 \qquad. \\ . \\ . \qquad \mu+2\sigma \approx 6.0497 math

But x ∈ {2, 3, 4, 5}, so

math \\ . \qquad Pr(\mu-2\sigma \leqslant X \leqslant \mu+2\sigma) \qquad. \\ . \\ . \qquad = Pr(2.2466 \leqslant X \leqslant 6.0497) \\. \\ . \qquad = Pr(3 \leqslant X \leqslant 5) \\. \\ . \qquad = \dfrac{9}{54} + \dfrac{16}{54} + \dfrac{25}{54} \\. \\ . \qquad = \dfrac{50}{54} \\. \\ . \qquad \approx 0.9259 \qquad (92.59\%) math

We expected the probability for this confidence interval to be approximately 95%. The result of 92.59% is sufficiently close.
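The whole of Example 2 can be checked in Python. The sketch below is one possible approach, not the method from the course: it uses the standard library's `fractions.Fraction` so that the mean, variance and final probability stay exact, and only the standard deviation is converted to a decimal. The variable names are illustrative.

```python
from fractions import Fraction
import math

# Distribution from Example 2: p(x) = x^2 / 54 for x in {2, 3, 4, 5}.
xs = [2, 3, 4, 5]
ps = [Fraction(x * x, 54) for x in xs]

# A valid probability distribution must sum to 1.
assert sum(ps) == 1

# Mean, E(X^2) and variance, kept as exact fractions.
mu = sum(x * p for x, p in zip(xs, ps))
e_x2 = sum(x * x * p for x, p in zip(xs, ps))
var = e_x2 - mu ** 2

# The square root is not a fraction in general, so switch to a float here.
sd = math.sqrt(var)

# Endpoints of the interval two standard deviations either side of the mean.
lower = float(mu) - 2 * sd
upper = float(mu) + 2 * sd

# Add the probabilities of the values of X that fall inside the interval.
prob = sum(p for x, p in zip(xs, ps) if lower <= x <= upper)

print(round(float(mu), 4), round(float(var), 4), round(sd, 4))
print(round(lower, 4), round(upper, 4), round(float(prob), 4))
```

Keeping the intermediate values exact avoids the small differences that appear when rounded values such as 4.1481 and 0.9508 are reused in later steps.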