Edexcel International AS Maths: Statistics 1

Revision Notes

1.2.5 Outliers

Outliers

What are outliers?

  • Outliers are extreme data values that do not fit with the general pattern of the data
  • They can arise from genuinely extreme events or from mistakes in the data collection
  • Outliers will affect some statistics that are calculated from the data
    • They can have a big effect on the mean, but have little effect on the median and usually none on the mode
    • The range can be changed completely by a single outlier, but the interquartile range is usually unaffected because it depends only on the middle 50% of the data
    • When calculating the mean or the range it is important to decide whether the outlier(s) should be included in the calculations
      • The question will tell you whether to include the outliers or not
      • You may have to decide which value is the outlier to be removed
      • In general outliers are included if they are a valid piece of data and excluded if it is likely that they are erroneous
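
The effect described above can be checked numerically. The following is a minimal sketch in Python; the data values and the helper `data_range` are hypothetical and chosen only for illustration.

```python
from statistics import mean, median


def data_range(values):
    """Range = largest value minus smallest value."""
    return max(values) - min(values)


# Hypothetical data set, then the same data with one extreme value added.
typical = [3, 4, 5, 5, 6, 7, 8]
with_outlier = typical + [40]

for label, values in (("without outlier", typical), ("with outlier", with_outlier)):
    print(label, "mean:", round(mean(values), 2),
          "median:", median(values), "range:", data_range(values))
```

In this made-up example the single extreme value moves the mean and the range a long way, while the median barely changes.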

How are outliers calculated?

  • Most of the time within this syllabus, outliers will be values that lie more than a particular distance below the lower quartile or above the upper quartile
    • The most common way to identify an outlier is to use these boundaries:
      • A value that is less than $Q_1 - k \times (\text{interquartile range})$
      • A value that is greater than $Q_3 + k \times (\text{interquartile range})$
      • $k$ is a constant that will be given to you in the exam, commonly $k = 1.5$
  • Outliers could also be defined as values that lie more than a given number of standard deviations away from the mean
    • The most common way to identify an outlier is to use these boundaries:
      • A value that is less than $\bar{x} - k\sigma$
      • A value that is greater than $\bar{x} + k\sigma$
      • $k$ is a constant that will be given to you in the exam, commonly $k = 2$
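
Both pairs of boundaries can be sketched as short Python functions. This is an illustration only: the function names and the sample numbers are hypothetical, and the standard deviation is assumed to be the population form (divisor n), computed here with `statistics.pstdev`.

```python
from statistics import mean, pstdev


def iqr_boundaries(q1, q3, k=1.5):
    """Boundaries Q1 - k*IQR and Q3 + k*IQR for given quartiles."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def sd_boundaries(values, k=2):
    """Boundaries mean - k*sigma and mean + k*sigma (population sigma assumed)."""
    centre = mean(values)
    sigma = pstdev(values)
    return centre - k * sigma, centre + k * sigma


def is_outlier(value, boundaries):
    """A value is an outlier only if it lies strictly outside the boundaries."""
    lower, upper = boundaries
    return value < lower or value > upper


# Hypothetical quartiles and data, for illustration only.
print(iqr_boundaries(10, 20))                    # (-5.0, 35.0)
print(sd_boundaries([2, 4, 4, 4, 5, 5, 7, 9]))   # (1.0, 9.0)
```

A value exactly on a boundary is not counted as an outlier here, matching the "less than" and "greater than" wording of the definitions.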

How are outliers represented on box plots?

  • On a box plot an outlier is represented as a cross beyond the end of the whisker
  • If the maximum or minimum value is found to be an outlier, a new end point for that whisker will need to be found
    • If the most extreme data value that is not an outlier is known, this becomes the end of the whisker
    • If that data value is not known, the end of the whisker is drawn at the outlier boundary
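
When the full data set is known, the whisker ends and the crosses can be found as in the sketch below. The function name `box_plot_ends`, the data and the boundaries are all hypothetical.

```python
def box_plot_ends(values, lower_boundary, upper_boundary):
    """Return (whisker_min, whisker_max, outliers) for a box plot.

    The whiskers end at the most extreme values that are not outliers;
    values strictly outside the boundaries are plotted as crosses.
    """
    inside = [v for v in values if lower_boundary <= v <= upper_boundary]
    outliers = sorted(v for v in values if v < lower_boundary or v > upper_boundary)
    return min(inside), max(inside), outliers


# Hypothetical data and boundaries, for illustration only.
print(box_plot_ends([1, 10, 12, 13, 15, 30], 5, 20))   # (10, 15, [1, 30])
```

If only summary statistics are available, so that the next value in from the outlier is unknown, the outlier boundary itself is used as the end of the whisker instead.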

Worked example

The ages, in years, of a number of children attending a birthday party are given below:

2, 7, 5, 4, 8, 4, 6, 5, 5, 29, 2, 5, 13

An outlier is defined as an observation that falls more than $1.5 \times$ the interquartile range above the upper quartile or below the lower quartile.

(i) Identify any outliers within the data set.

(ii) Decide which values (if any) should be removed. Justify your answer.

[Image: worked example solution]
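
Part (i) can be checked with a short Python sketch. This is not necessarily the method shown in the original solution image. It assumes a common quartile convention for listed data: work out n/4 (or 3n/4); if it is a whole number, average that value and the next one, otherwise round up to the next position. The helper `quartile` is hypothetical.

```python
def quartile(sorted_values, quarter):
    """Return Q1 (quarter=1) or Q3 (quarter=3) under the assumed convention."""
    n = len(sorted_values)
    whole, remainder = divmod(quarter * n, 4)
    if remainder == 0:
        # Whole-number position: average that value and the next one.
        return (sorted_values[whole - 1] + sorted_values[whole]) / 2
    # Otherwise round up: position whole + 1 (counting from 1) is index whole.
    return sorted_values[whole]


ages = [2, 7, 5, 4, 8, 4, 6, 5, 5, 29, 2, 5, 13]
ordered = sorted(ages)

q1 = quartile(ordered, 1)
q3 = quartile(ordered, 3)
iqr = q3 - q1
lower_boundary = q1 - 1.5 * iqr
upper_boundary = q3 + 1.5 * iqr
outliers = [age for age in ordered if age < lower_boundary or age > upper_boundary]

print(q1, q3, iqr)                      # 4 7 3
print(lower_boundary, upper_boundary)   # -0.5 11.5
print(outliers)                         # [13, 29]
```

Under this convention both 13 and 29 are flagged. Part (ii) is a judgement that code cannot make: one reasonable argument is that 13 is a plausible age for a child at a party and so is valid data, whereas 29 is not a child's age and is more likely to be an error or an adult.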

Exam Tip

  • Read the question carefully to determine which type of outlier you should be finding and to make sure you are using the correct method.

Author: Dan

Dan graduated from the University of Oxford with a First class degree in mathematics. As well as teaching maths for over 8 years, Dan has marked a range of exams for Edexcel, tutored students and taught A Level Accounting. Dan has a keen interest in statistics and probability and their real-life applications.