Formulas for Standard Deviation
Standard deviation measures how spread out the values in a set of data are. We use the Greek letter \(\sigma\) (sigma) as the symbol for standard deviation, and calculate it using the following formula:
\[\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}\]
Now, that formula looks pretty scary, doesn't it? Don't worry, we're going to work through what it actually says. To calculate the standard deviation of a list of numbers, we:
- Find the mean (average) of the values. The formula calls this \(\mu\).
- Subtract the mean from each value and square the result to give the squared difference.
- Find the average of the squared differences. That's the variance. This is why we're dividing by \(N\), which refers to the number of elements in our list.
- Take the square root to give the standard deviation.
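The four steps above can be sketched in Python. This is only a minimal illustration: the function name `population_std_dev` and the short list of numbers are made up for the example and are not part of the chapter's data.

```python
import math

def population_std_dev(values):
    """Population standard deviation, following the four steps above."""
    n = len(values)
    # Step 1: the mean (mu in the formula).
    mean = sum(values) / n
    # Step 2: squared difference of each value from the mean.
    squared_diffs = [(x - mean) ** 2 for x in values]
    # Step 3: average of the squared differences (the variance).
    variance = sum(squared_diffs) / n
    # Step 4: square root of the variance.
    return math.sqrt(variance)

# Small made-up list, used only to illustrate the function.
print(population_std_dev([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

For this list the mean is 5, the squared differences add up to 32, the variance is 32 / 8 = 4, and the square root of 4 is 2.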
The formula actually tells you to take all of those steps. Let's see how. We'll work through it with an example.
A Worked Example
Lucy, our Cavalier King Charles Spaniel puppy, has lots of brothers and sisters. Her mum has had 16 puppies altogether. Jasmin has decided to work out the standard deviation of the birth weights of all of these puppies.
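The birth weights of the 16 puppies (in grams) are:
113, 128, 233, 212, 241, 135, 119, 237, 240, 156, 162, 171, 182, 164, 168, 174
Step 1: Find the mean of the values. Add up all 16 weights and divide by 16:
\[\mu = \frac{113 + 128 + \dots + 174}{16} = \frac{2835}{16} = 177.1875\]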
Step 2: Subtract the mean from each number in your data set and square the result to give the squared difference. The part of the formula that says \((x_i - \mu)^2\) tells us to do this.
The \(x_i\)s range through all the individual data values: 113, 128, 233, 241, etc., one at a time. We choose one to call \(x_1\), say 113, one to call \(x_2\), say 128, and so on.
Our squared differences (correct to 3 decimal places) are:
- \((113 - 177.1875)^2 = 4120.035\)
- \((128 - 177.1875)^2 = 2419.410\)
- \((233 - 177.1875)^2 = 3115.035\)
- \((212 - 177.1875)^2 = 1211.910\)
- \((241 - 177.1875)^2 = 4072.035\)
- \((135 - 177.1875)^2 = 1779.785\)
- \((119 - 177.1875)^2 = 3385.785\)
- \((237 - 177.1875)^2 = 3577.535\)
- \((240 - 177.1875)^2 = 3945.410\)
- \((156 - 177.1875)^2 = 448.910\)
- \((162 - 177.1875)^2 = 230.660\)
- \((171 - 177.1875)^2 = 38.285\)
- \((182 - 177.1875)^2 = 23.160\)
- \((164 - 177.1875)^2 = 173.910\)
- \((168 - 177.1875)^2 = 84.410\)
- \((174 - 177.1875)^2 = 10.160\)
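The squared differences can be checked with a short Python sketch. The variable names are chosen just for this illustration, and the `weights` list is assumed to be the 16 birth weights as they appear in the squared-difference lines.

```python
# The 16 birth weights in grams (assumed from the squared-difference list).
weights = [113, 128, 233, 212, 241, 135, 119, 237,
           240, 156, 162, 171, 182, 164, 168, 174]

# Step 1: the mean of the weights.
mean = sum(weights) / len(weights)

# Step 2: subtract the mean from each weight and square the result.
squared_diffs = [(x - mean) ** 2 for x in weights]

# Print each squared difference correct to 3 decimal places.
for x, d in zip(weights, squared_diffs):
    print(f"({x} - {mean})^2 = {d:.3f}")
```

Each printed line corresponds to one \(x_i\) in the formula.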
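Step 3: Find the average of the squared differences. Add them all up and divide by \(N = 16\), the number of data values:
\[\frac{4120.035 + 2419.410 + \dots + 10.160}{16} \approx \frac{28636.438}{16} \approx 1789.777\]
This is the variance, and it is the reason for the funny notation in the formula:
\[\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2\]
The \(\sum\) (capital sigma) says to add up the squared differences for every value from \(x_1\) to \(x_N\), and the \(\frac{1}{N}\) says to divide that total by the number of values.
Step 4: Finally, take the square root:
\[\sigma = \sqrt{1789.777} \approx 42.31\]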
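As a cross-check, the whole worked example can be run through Python's standard `statistics` module, whose `pvariance` and `pstdev` functions divide by \(N\). This is a sketch only: the `weights` list is assumed to be the 16 birth weights used in the steps above.

```python
import math
import statistics

# The 16 birth weights in grams (assumed from the worked example).
weights = [113, 128, 233, 212, 241, 135, 119, 237,
           240, 156, 162, 171, 182, 164, 168, 174]

mu = statistics.mean(weights)             # Step 1: the mean
variance = statistics.pvariance(weights)  # Steps 2 and 3: population variance
sigma = statistics.pstdev(weights)        # Step 4: population standard deviation

print(mu)                  # 177.1875
print(round(variance, 3))  # 1789.777
print(round(sigma, 2))     # 42.31
```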
Standard Deviation of a Sample
Sometimes our data set only provides a sample of the entire population. This might be because it is impossible to collect data for the whole population, or because doing so would be too expensive or time-consuming.
The idea is that we use data from a small subset of the population to predict the values for the whole population.
Using a sample can often give us a good idea of what's going on with the whole population, but it does introduce errors into our values, called sampling errors.
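To get a feel for these errors, here is a rough Python sketch that looks at every possible sample of 5 values from a population of 16 and shows how much the sample mean can vary. The `population` list is assumed to be the 16 puppy birth weights from the worked example, and the sample size of 5 is picked only for illustration.

```python
from itertools import combinations

# Population: the 16 birth weights in grams (assumed from the worked example).
population = [113, 128, 233, 212, 241, 135, 119, 237,
              240, 156, 162, 171, 182, 164, 168, 174]
mu = sum(population) / len(population)

# Mean of every possible 5-puppy sample (order does not matter).
sample_means = [sum(c) / 5 for c in combinations(population, 5)]

print(len(sample_means))                      # 4368 possible samples
print(min(sample_means), max(sample_means))   # smallest and largest sample mean
print(sum(sample_means) / len(sample_means))  # average of all the sample means
print(mu)                                     # the population mean
```

Individual samples can land well above or below the population mean, even though the sample means average out to it.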
The formula for the standard deviation of a sample is slightly different to the one for the whole population: instead of dividing by \(N\) (the size of the whole population), we divide by \(N - 1\) (the size of the sample minus 1). There are other slight differences in the notation in the formula. We call the sample standard deviation \(s\), and the sample mean \(\overline{x}\). So the formula looks like this, where \(N\) is now the number of values in the sample:
\[s = \sqrt{\frac{1}{N - 1}\sum_{i=1}^{N}(x_i - \overline{x})^2}\]
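The only change from the population calculation is the divisor, which a short Python sketch makes visible. The function name `sample_std_dev` and the example list are made up for this illustration.

```python
import math
import statistics

def sample_std_dev(sample):
    """Sample standard deviation: divide by (n - 1) instead of n."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two values to divide by n - 1")
    x_bar = sum(sample) / n                       # the sample mean
    squared_diffs = [(x - x_bar) ** 2 for x in sample]
    sample_variance = sum(squared_diffs) / (n - 1)
    return math.sqrt(sample_variance)

# Small made-up list, used only to illustrate the function.
data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_std_dev(data))    # about 2.138
print(statistics.stdev(data))  # the standard library also divides by n - 1
```

Dividing by \(N - 1\) makes the result a little larger than the population formula would give for the same numbers (2.0 for this list).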
Our Puppy Example with a Sample
Suppose we only know the weights of 5 of the 16 puppies: 113 g, 240 g, 171 g, 182 g and 174 g.
Then our population is all 16 puppies.
Our sample is the 5 puppies that we know the weights for.
We can use the formula for sample standard deviation to estimate the standard deviation of the entire population. Let's complete the steps to calculate the value:
Step 1: Find the mean of the sample values.
\[\overline{x} = \frac{113 + 240 + 171 + 182 + 174}{5} = \frac{880}{5} = 176\]
Step 2: Subtract the mean from each weight and square the result.
- \((113 - 176)^2 = 3969\)
- \((240 - 176)^2 = 4096\)
- \((171 - 176)^2 = 25\)
- \((182 - 176)^2 = 36\)
- \((174 - 176)^2 = 4\)
Step 3: Find the "mean" of the squared differences. Don't forget to divide by \(N - 1 = 4\) as we're working with a sample. This gives the sample variance:
\[\frac{3969 + 4096 + 25 + 36 + 4}{5 - 1} = \frac{8130}{4} = 2032.5\]
Step 4: Take the square root of the sample variance. This is the sample standard deviation:
\[s = \sqrt{2032.5} \approx 45.083\]
Comparison
When we used the whole population, the mean was \(177.1875\), and the standard deviation was \(42.31\).
When we used the sample, the mean was \(176\), and the standard deviation was \(45.083\).
The sample mean was wrong by less than \(1\%\) and the sample standard deviation was wrong by about \(6.6\%\).
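The comparison can be recomputed with Python's standard `statistics` module, where `pstdev` divides by \(N\) and `stdev` divides by \(N - 1\). This is a sketch under the assumption that `population` holds the 16 birth weights from the worked example; the variable names are chosen just for this illustration.

```python
import statistics

# All 16 birth weights in grams (assumed from the worked example).
population = [113, 128, 233, 212, 241, 135, 119, 237,
              240, 156, 162, 171, 182, 164, 168, 174]
# The 5 puppies whose weights we know.
sample = [113, 240, 171, 182, 174]

mu = statistics.mean(population)
sigma = statistics.pstdev(population)  # population formula: divide by N
x_bar = statistics.mean(sample)
s = statistics.stdev(sample)           # sample formula: divide by N - 1

# Percentage errors of the sample estimates relative to the population values.
mean_error = abs(x_bar - mu) / mu * 100
sd_error = abs(s - sigma) / sigma * 100

print(f"population: mean = {mu}, standard deviation = {sigma:.3f}")
print(f"sample:     mean = {x_bar}, standard deviation = {s:.3f}")
print(f"errors:     mean {mean_error:.2f}%, standard deviation {sd_error:.2f}%")
```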
Summary
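The population standard deviation is given by the formula
\[\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}\]
The sample standard deviation is given by the formula
\[s = \sqrt{\frac{1}{N - 1}\sum_{i=1}^{N}(x_i - \overline{x})^2}\]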
Description
This chapter series is on Data and is suitable for Year 10 or higher students. Topics include:
- Accuracy and Precision
- Calculating Means From Frequency Tables
- Correlation
- Cumulative Tables and Graphs
- Discrete and Continuous Data
- Finding the Mean
- Finding the Median
- Finding the Mode
- Formulas for Standard Deviation
- Grouped Frequency Distribution
- Normal Distribution
- Outliers
- Quartiles
- Quincunx
- Quincunx Explained
- Range (Statistics)
- Skewed Data
- Standard Deviation and Variance
- Standard Normal Table
- Univariate and Bivariate Data
- What is Data
Audience
Year 10 or higher students; some chapters are suitable for students in Year 8 or higher
Learning Objectives
Learn about topics related to "Data"
Author: Subject Coach
Added on: 28th Sep 2018