Linear Interpolation (Edexcel GCSE Statistics)

Revision Note

Author: Roger

Expertise: Maths

Median from Grouped Data

How do I find the median for grouped data?

  • Grouped data doesn’t contain the individual data values

    • So we can’t find the exact median in the usual way

  • We can estimate the median for grouped data using linear interpolation

    • This assumes the data is evenly spread across the class containing the median

  • STEP 1
    Identify the class interval containing the median

    • This is the class containing the $\frac{n}{2}$th data value

      • Divide the total number of values, n, by 2

    • e.g. if there are 80 data values, $\frac{n}{2} = \frac{80}{2} = 40$

      • Consider cumulative frequencies until you find the class interval containing the 40th value

  • STEP 2
    Find ‘how far into’ that class interval the $\frac{n}{2}$th data value is

    • e.g. if the interval with the median contains the 36th through 43rd data values

      • then the 40th value is ‘5 values in’ (36, 37, 38, 39, 40)

      • and there are 8 values in the interval

      • so the 40th value is $\frac{5}{8}$ of the way into the interval

  • STEP 3
    Multiply the class width of the class interval containing the median by the fraction found in Step 2

    • e.g. if the interval with the median is $50 \le x \le 70$

      • the class width is $70 - 50 = 20$

      • $20 \times \frac{5}{8} = 12.5$

  • STEP 4
    Add the result from Step 3 to the lower boundary of the class interval containing the median

    • The result is the estimated median

    • e.g. the lower boundary of $50 \le x \le 70$ is 50

      • The estimated median is $50 + 12.5 = 62.5$

  • The estimated median can also be found using the following formula:

    • $\text{estimated median} = L + \frac{\frac{n}{2} - C}{f} \times w$

      • L is the lower boundary of the class interval containing the median

      • n is the total number of data values

      • C is the cumulative frequency of all the class intervals before the one containing the median

      • f is the frequency of (i.e. the number of values in) the class interval containing the median

      • w is the width of the class interval containing the median (upper boundary minus lower boundary)

    • This formula combines all four steps of the process

      • but it is not on the exam formula sheet

      • So if you want to use it you’ll need to remember it
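The four steps and the formula can be sketched in a few lines of Python. This is only an illustration: the function name `estimated_median` and its parameter names are assumptions made here, chosen to mirror the symbols L, n, C, f and w defined above.

```python
def estimated_median(lower, n, cum_before, freq, width):
    """Estimate the median by linear interpolation inside the median class.

    lower      -- L, lower boundary of the class containing the median
    n          -- total number of data values
    cum_before -- C, cumulative frequency of all classes before that class
    freq       -- f, frequency of the class containing the median
    width      -- w, width of the class containing the median
    """
    fraction_into_class = (n / 2 - cum_before) / freq  # Step 2
    distance_into_class = fraction_into_class * width  # Step 3
    return lower + distance_into_class                 # Step 4


# The running example from the steps above: 80 values, with the median class
# 50 <= x <= 70 holding the 36th to 43rd values (so C = 35 and f = 8).
print(estimated_median(lower=50, n=80, cum_before=35, freq=8, width=20))  # 62.5
```

Step 1 (identifying the class containing the median) still has to be done by hand before calling the function, because it needs the cumulative frequencies.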

Exam Tip

  • The formula can be tricky to remember correctly

    • It’s better to understand how the method works

    • Then you don’t need to remember the formula

  • Remember that the median found this way is an estimate

    • You can’t find the exact median without knowing all the data values

Worked Example

A student collected data about the length of time (x hours) students in his school spent listening to music in a given week. He collected data from 50 students in total. The following table summarises the data:

| Time spent, x (hours) | Number of students |
|-----------------------|--------------------|
| 0 ≤ x ≤ 10            | 3                  |
| 10 < x ≤ 20           | 19                 |
| 20 < x ≤ 30           | 12                 |
| 30 < x ≤ 40           | 10                 |
| 40 < x ≤ 50           | 5                  |
| 50 < x ≤ 60           | 1                  |

Work out an estimate for the median amount of time spent listening to music by the students.

STEP 1: Identify the class interval containing the median

Divide the total number of values, n, by 2

Here n = 50

$\frac{50}{2} = 25$

So we’re looking for the interval with the 25th value

Note that this is different from finding the median from a set of data values

In that case we would be looking for the value halfway between the 25th and 26th values

With linear interpolation we don’t have to worry about that!

Add a cumulative frequency column to the table and work out the cumulative frequencies

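| Time spent, x (hours) | Number of students | Cumulative Frequency |
|-----------------------|--------------------|----------------------|
| 0 ≤ x ≤ 10            | 3                  | 3                    |
| 10 < x ≤ 20           | 19                 | 22                   |
| 20 < x ≤ 30           | 12                 | 34                   |
| 30 < x ≤ 40           | 10                 | 44                   |
| 40 < x ≤ 50           | 5                  | 49                   |
| 50 < x ≤ 60           | 1                  | 50                   |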

The second class interval goes up to the 22nd data value and the third class interval goes up to the 34th data value

So the median is in the third class interval

The median is in the $20 < x \le 30$ class interval

STEP 2: Find how far into the class interval the $\frac{n}{2}$th data value is

The interval with the median contains the 23rd through 34th data values

The 25th data value is ‘3 values in’ to the interval (23, 24, 25)

And there are 12 data values in the interval

$\frac{3}{12} = \frac{1}{4}$

So the median is $\frac{1}{4}$ of the way into the interval

STEP 3: Multiply the class width by the fraction found in Step 2

Subtract the lower boundary from the upper boundary to find the class width of the $20 < x \le 30$ interval

$30 - 20 = 10$

Multiply by the fraction in Step 2

$10 \times \frac{1}{4} = 2.5$

STEP 4: Add the result from Step 3 to the lower boundary of the class interval

20 + 2.5 = 22.5

Don’t forget the units in your answer!

Estimated median = 22.5 hours
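As a check on the working above, the whole method can be run on the table from this example. The sketch below is an illustration only: the function name `median_from_grouped` and the `(lower boundary, upper boundary, frequency)` layout of each class are assumptions made here.

```python
from itertools import accumulate

# Each class is (lower boundary, upper boundary, frequency), as in the table.
classes = [
    (0, 10, 3),
    (10, 20, 19),
    (20, 30, 12),
    (30, 40, 10),
    (40, 50, 5),
    (50, 60, 1),
]


def median_from_grouped(classes):
    """Estimate the median of grouped data by linear interpolation."""
    cumulative = list(accumulate(freq for _, _, freq in classes))
    if not cumulative or cumulative[-1] == 0:
        raise ValueError("the table must contain at least one data value")
    target = cumulative[-1] / 2  # n/2, which is 25 for this table

    # Step 1: the median class is the first class whose cumulative
    # frequency reaches n/2.
    for i, (lower, upper, freq) in enumerate(classes):
        if cumulative[i] >= target:
            before = cumulative[i - 1] if i > 0 else 0
            fraction = (target - before) / freq        # Step 2
            return lower + fraction * (upper - lower)  # Steps 3 and 4


print(median_from_grouped(classes))  # 22.5
```

The result agrees with the estimate of 22.5 hours found by hand.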

How do I find the median for data on a histogram?

  • A histogram is a way of representing grouped data as a diagram

    • See the 'Histograms & Frequency Polygons' revision note for full details

  • The connection between the frequency density shown on the histogram and the frequency that would be shown in a grouped data table is given by the formula

    • $\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$

  • If you are asked to estimate a median for data in a histogram there are two options:

    • You can recreate the grouped data table using the frequency density formula, and then follow the method given above

    • Or you can work out the estimated median directly from the histogram

      • See the following Worked Example for how to do this
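The first option, recreating the grouped data table, comes down to multiplying each bar's frequency density by its class width. A minimal sketch follows. The function name `frequencies_from_histogram` is an assumption, and the bars are made-up values for illustration, not data from this note.

```python
def frequencies_from_histogram(bars):
    """Turn (lower, upper, frequency density) bars into (lower, upper, frequency)."""
    # frequency = frequency density x class width
    return [(lower, upper, density * (upper - lower))
            for lower, upper, density in bars]


# Hypothetical histogram bars, invented purely for illustration.
bars = [(0, 5, 2), (5, 10, 4), (10, 20, 1.5)]
print(frequencies_from_histogram(bars))
# [(0, 5, 10), (5, 10, 20), (10, 20, 15.0)]
```

The resulting table can then be used with the four-step method for grouped data.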

Worked Example

The histogram shows the weight, in kg, of 60 newborn bottlenose dolphins.

A histogram for the weights of newborn bottlenose dolphins

Find an estimate for the median weight of the dolphins in the sample, giving your answer correct to two decimal places.
 

$\frac{60}{2} = 30$, so to estimate the median we need to find the weight of the 30th dolphin

To find the frequencies represented by the different bars, rearrange $\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$ as $\text{frequency} = \text{frequency density} \times \text{class width}$

 
4-8 kg: $\text{frequency} = 1 \times 4 = 4$
8-10 kg: $\text{frequency} = 8 \times 2 = 16$
10-12 kg: $\text{frequency} = 9.5 \times 2 = 19$
 

The first two classes have a cumulative frequency of $4 + 16 = 20$
So the median is going to be '10 dolphins into' the 10-12 kg class

The height (frequency density) of the 10-12 kg bar is 9.5
We need to find what width would give a frequency of 10
Use $\text{frequency density} = \frac{\text{frequency}}{\text{class width}}$ and solve for width

$9.5 = \frac{10}{\text{width}}$

$9.5 \times \text{width} = 10$

$\text{width} = \frac{10}{9.5} = \frac{20}{19} = 1.0526\ldots = 1.05 \text{ (2 d.p.)}$

 
That means that the median lies 1.05 kg into the 10-12 kg class interval

$10 + 1.05 = 11.05$

A histogram showing the part of the 10-12 kg class interval calculated in the answer

 
Estimated median = 11.05 kg (2 d.p.)
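The direct method can also be sketched in Python: move along the bars, adding up the frequencies, until the bar containing the $\frac{n}{2}$th value is reached, then find the width needed inside that bar. The function name `median_from_histogram` is an assumption made here. Only the three bars quoted in the working are listed, so the total n = 60 is passed in separately rather than computed from the bars.

```python
def median_from_histogram(bars, n):
    """bars: (lower, upper, frequency density) tuples, left to right.
    n: total frequency shown by the whole histogram."""
    target = n / 2
    seen = 0  # cumulative frequency of the bars before the current one
    for lower, upper, density in bars:
        freq = density * (upper - lower)
        if seen + freq >= target:
            # Width into this bar such that density x width equals the
            # frequency still needed to reach the n/2th value.
            return lower + (target - seen) / density
        seen += freq
    raise ValueError("the bars given do not reach the n/2th value")


# The three bars used in the working above; the histogram's other bars
# are not needed because the median lies in the 10-12 kg class.
bars = [(4, 8, 1), (8, 10, 8), (10, 12, 9.5)]
print(round(median_from_histogram(bars, 60), 2))  # 11.05
```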



Author: Roger

Roger's teaching experience stretches all the way back to 1992, and in that time he has taught students at all levels between Year 7 and university undergraduate. Having conducted and published postgraduate research into the mathematical theory behind quantum computing, he is more than confident in dealing with mathematics at any level the exam boards might throw at you.