Lesson Plan - Get It!
- Have you heard the joke about the statistician who got soaking wet crossing the river?
He thought he could cross because it was only a foot and a half deep on average.
Clearly, knowing the average depth of the river didn't give that statistician enough information.
- What other information might the statistician want to know before crossing the river?
Maybe there is a sharp drop-off in the middle of the river and then a very shallow sandbar. Consider these two scenarios:
- The statistician knows the depth of the river at 5 distinct points. These depths are 0.5, 3.5, 2.5, 0.75, and 0.25 feet.
The average depth of the river is 1.5 feet.
However, with a depth of 3.5 feet at one point in the river, the statistician will get pretty wet. That's because the data are fairly spread out.
- The statistician knows the depth of the river at 5 distinct points. These depths are 1, 1.5, 2, 1.75, and 1.25 feet.
Again, the average depth of the river is 1.5 feet.
However, he won't get as wet because there is much less spread in the data.
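To see the two scenarios side by side, here is a minimal Python sketch. The variable names are just illustrative choices. It shows that both sets of depths share the same mean even though their deepest points differ.

```python
from statistics import mean

# Depths in feet at five points, copied from the two scenarios above.
scenario_1 = [0.5, 3.5, 2.5, 0.75, 0.25]
scenario_2 = [1, 1.5, 2, 1.75, 1.25]

for name, depths in [("Scenario 1", scenario_1), ("Scenario 2", scenario_2)]:
    # The mean is identical, but the deepest and shallowest points are not.
    print(name, "mean:", mean(depths),
          "deepest:", max(depths), "shallowest:", min(depths))
```

The mean alone cannot tell these two rivers apart; the deepest and shallowest points hint at the spread.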
We need more than just the average, called the mean in statistics, to have enough information about a set of data.
This is where variance and standard deviation come in. They are two measures of variation, closely tied together as you will see, that give information about the spread of data around the mean.
Begin with the idea of variation in data as shown in Variability (Statistics) from Lydia Flynn:
With that introduction to the idea of variability, let's take it a step further.
- Remember how she talked about the deviations?
We can talk about variability by talking about the deviations of each data point, but for a large data set, that's a lot of numbers.
What would be most helpful is a single number that puts all of those deviations together. We have one! It's called the standard deviation.
Simply put, the standard deviation represents a kind of average of all the deviations within the data set.
Here are the steps to calculating a standard deviation from a data set:
- Find the mean. You will use the mean to calculate all of the deviations.
- Calculate all of the deviations. To do this, take each data point and subtract the mean from it. Some of your deviations will be positive, and some of your deviations will be negative.
- Square all of the deviations. Now none of the numbers are negative, so they can't cancel each other out when you add them.
- Add up all of these squared deviations.
- Divide the sum by one less than the number of data points (if you have n data points, divide by n - 1). The result is the sample variance.
- Take the square root of the variance and you have the standard deviation.
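The steps above can be sketched in Python as follows. This is one possible way to write them out, not the only one; the function name `sample_standard_deviation` is a made-up name for illustration, and the data are the depths from the second river scenario.

```python
import math

def sample_standard_deviation(data):
    """Follow the six steps above; return (variance, standard deviation)."""
    n = len(data)                             # needs at least 2 data points
    mean = sum(data) / n                      # Step 1: find the mean
    deviations = [x - mean for x in data]     # Step 2: data point minus mean
    squared = [d ** 2 for d in deviations]    # Step 3: square each deviation
    total = sum(squared)                      # Step 4: add the squares
    variance = total / (n - 1)                # Step 5: divide by n - 1
    std_dev = math.sqrt(variance)             # Step 6: take the square root
    return variance, std_dev

# Depths in feet from the second river scenario.
variance, std_dev = sample_standard_deviation([1, 1.5, 2, 1.75, 1.25])
print("variance:", variance)                  # 0.15625
print("standard deviation:", round(std_dev, 2))  # 0.4
```

Each line of the function matches one step in the list, so you can compare it with your hand calculation.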
Make sure you have those steps written in your notes, and then watch Finding a Sample Standard Deviation from Parabola Magic:
Let's go back to our statistician crossing the river.
In the first scenario, where the depths were much more spread out, the standard deviation is about 1.43 feet and the variance is about 2.03 square feet.
In the second scenario, the standard deviation is about 0.40 feet and the variance is about 0.16 square feet.
If he had known the standard deviation and variance of the data, the statistician would have seen how far the depths can stray from the average and been better prepared for unexpected depths.
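If you want to check figures like these yourself, Python's built-in `statistics` module provides `variance` and `stdev`, which both use the sample (n - 1) formula. The sketch below assumes the five depths from each scenario; the variable names are illustrative.

```python
from statistics import stdev, variance

# Depths in feet from the two river scenarios.
scenario_1 = [0.5, 3.5, 2.5, 0.75, 0.25]
scenario_2 = [1, 1.5, 2, 1.75, 1.25]

for name, depths in [("Scenario 1", scenario_1), ("Scenario 2", scenario_2)]:
    # variance() and stdev() divide by n - 1, matching the steps in this lesson.
    print(f"{name}: variance = {variance(depths):.2f} square feet, "
          f"standard deviation = {stdev(depths):.2f} feet")
```

Note that the variance is measured in square feet because the deviations were squared, while the standard deviation is back in feet, the same unit as the depths.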
Now that you know what a standard deviation is and how to calculate one, go to the Got It! section to practice your new skills.