AQA AS Maths: Statistics

Topic Questions

2.3 Working with Data

1a
3 marks

In a conkers competition, the number of strikes required to smash an opponent’s conker (and thus win a match) is recorded for 15 matches. The results are given below.

6 2 9 10 9 12 5  
8 7 5 11 9 17 8 9

Find the median, the upper and lower quartiles, and the interquartile range for the number of strikes required to smash a conker.

1b
2 marks

An outlier is defined as any data value that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

Identify any outliers.
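
The fence rule in part (b) can be sketched in Python. This is a minimal illustration rather than a mark-scheme method: the function names `iqr_fences` and `find_outliers` and the example values are hypothetical, and the quartiles are assumed to have been found already.

```python
def iqr_fences(q1, q3, k=1.5):
    """Return (lower_fence, upper_fence) for the k x IQR outlier rule."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def find_outliers(values, q1, q3, k=1.5):
    """Return the values lying strictly outside the fences."""
    lower, upper = iqr_fences(q1, q3, k)
    return [v for v in values if v < lower or v > upper]


# Hypothetical quartiles and data, not taken from the question above.
lower, upper = iqr_fences(10, 20)
print(lower, upper)  # -5.0 35.0
print(find_outliers([3, 12, 18, 40], 10, 20))  # [40]
```

Passing the quartiles in as arguments keeps the sketch independent of whichever quartile convention is being used.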

2a
2 marks

A hotel manager recorded the number of towels that went missing at the end of each day for 12 days.  The results are below.

2 4 1 0 3 4
3.2 9 3 2 4 5

The data value 3.2 is not an outlier but is an error.
Explain why 3.2 is an error and why it should be removed from the data set.

2b
3 marks

With the data value 3.2 removed, find the mean and the standard deviation for the number of towels missing at the end of each day.
You may use the summary statistics $n = 11$, $\sum x = 37$ and $\sum x^2 = 181$ with the formulae $\bar{x} = \frac{\sum x}{n}$ and $\sigma = \sqrt{\frac{\sum x^2}{n} - \bar{x}^2}$.
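
The two formulae quoted in part (b) translate directly into a few lines of Python. The sketch below is illustrative only: the function name `mean_sd_from_sums` and the summary statistics used in the example are hypothetical, not the values from the question.

```python
import math


def mean_sd_from_sums(n, sum_x, sum_x2):
    """Mean and standard deviation from n, the sum of x and the sum of x squared."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return mean, math.sqrt(variance)


# Hypothetical summary statistics: n = 4, sum of x = 20, sum of x squared = 120.
mean, sd = mean_sd_from_sums(4, 20, 120)
print(mean)          # 5.0
print(round(sd, 3))  # 2.236
```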

2c
2 marks

An outlier is defined as any data value lying more than 2 standard deviations from the mean.  Find any outliers in the data (still excluding 3.2) and justify whether these should be removed from the data set or not.

3a
1 mark

Joe counts the number of different species of bird visiting his garden each day for a week. The results are given below.

7 8 5 12 9 7 3


Calculate the mean number of different species of bird visiting Joe’s garden.

3b
3 marks

Joe continues to record the number of different species of bird visiting his garden each day for the rest of the month and calculates the mean number of different species is 9.25 for the remaining 24 days.

Joe says, using the data from the whole month, he would expect to see 9 different species every day. Explain whether Joe is correct. You must support your answer with clear working.
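
One way to combine two means is to weight each by the number of days it covers. The symbols $n_1$, $\bar{x}_1$, $n_2$ and $\bar{x}_2$ are introduced here for illustration and do not appear in the question:

$$\bar{x}_{\text{combined}} = \frac{n_1 \bar{x}_1 + n_2 \bar{x}_2}{n_1 + n_2}$$

Here $n_1 \bar{x}_1$ and $n_2 \bar{x}_2$ recover the two totals, so the numerator is the total for the whole month.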

3c
2 marks

Later, Joe notices that one of the values in his data is 8.8.  Explain why this must be an error and justify whether you think this value should be removed from the data set or not.

4a
3 marks

The cumulative frequency diagram below shows the lengths of 100 phone calls, in minutes, made to a computer help centre during one morning.

[Figure: cumulative frequency diagram of call lengths]

(i)
Use the cumulative frequency graph to estimate the 10th and 90th percentiles.

(ii)
Find the 10th to 90th interpercentile range.

4b
3 marks

In the afternoon, on the same day, the lengths of another 100 phone calls to the computer help centre were recorded.  The median length of these calls was 15 minutes and the 10th to 90th interpercentile range was 18 minutes.

Compare the location (median) and spread (interpercentile range) of the calls in the morning and the afternoon.

5a
3 marks

Two geologists are measuring the size of rocks found on a beach in front of a cliff.
The geologists record the greatest length, in millimetres, of each rock they find at distances of 5 m and 25 m from the base of the cliff.  They randomly choose 20 rocks at each distance.  Their results are summarised in the table below.

| Distance from cliff base | 5 m | 25 m |
|---|---|---|
| Number of rocks, $n$ | 20 | 20 |
| $\sum x$ | 3885 | 2220 |
| $S_{xx}$ | 369 513.75 | 287 580 |

Using the formulae $\bar{x} = \frac{\sum x}{n}$ and $\sigma = \sqrt{\frac{S_{xx}}{n}}$, find the mean and standard deviation for the size of rocks at both 5 m and 25 m from the base of the cliff.
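
For reference, $S_{xx}$ is the sum of squared deviations from the mean, so it relates to the raw sums by a standard identity (stated here as a reminder, not as part of the question):

$$S_{xx} = \sum (x - \bar{x})^2 = \sum x^2 - \frac{\left(\sum x\right)^2}{n}, \qquad \sigma = \sqrt{\frac{S_{xx}}{n}}$$

Dividing the identity through by $n$ gives $\frac{\sum x^2}{n} - \bar{x}^2$, which is the variance in its summary-statistics form.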

5b
2 marks

Compare the location (mean) and spread (standard deviation) of the size of rocks at 5 m and 25 m from the base of the cliff.

5c
2 marks

In this instance, an outlier is determined to be any data value that lies outside one standard deviation of the mean ($\bar{x} \pm \sigma$).

(i)
Find the smallest rock that is not an outlier at 5 m from the base of the cliff.

(ii)
Briefly explain why there cannot be any small rock outliers at 25 m from the base of the cliff.

6a
1 mark

The incomplete box plot below shows data from the large data set regarding cloud cover between May and October 2015 in Cambourne.  Cloud cover is measured in Oktas on a scale from 0 (no cloud cover) to 8 (full cloud cover).

[Figure: incomplete box plot of cloud cover]

Find the interquartile range.

6b
2 marks

An outlier is defined as any data value that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

Find the boundaries (fences) at which outliers are defined.

6c
3 marks

The random sample of 34 cars has one outlier, a CO2 reading of 250 g/km.
Complete the box plot given that the maximum and minimum values should be located at the boundaries (fences) at which outliers are defined.
Any outliers should be indicated with a cross (×).

1a
4 marks

As part of an experiment, 15 maths teachers are asked to solve a riddle and their times, in minutes, are recorded:

8 12 19 20 20
21 22 23 23 23
25 26 27 37 39

An outlier is an observation which lies more than 2 standard deviations away from the mean.

Show that there is exactly one outlier.
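
When the raw data are available, the 2 standard deviation rule can be checked directly. The sketch below is a hedged illustration: the function name `two_sd_outliers` and the sample are hypothetical (they are not the riddle times), and it assumes the population form of the standard deviation (dividing by $n$).

```python
import statistics


def two_sd_outliers(values, k=2):
    """Return values more than k population standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)  # population SD: divides by n, not n - 1
    return [v for v in values if abs(v - mean) > k * sd]


# Hypothetical data, not the riddle times above.
sample = [4, 5, 5, 6, 6, 6, 7, 7, 8, 20]
print(two_sd_outliers(sample))  # [20]
```

Using `statistics.stdev` instead would divide by $n - 1$ and give slightly wider boundaries, so the choice of divisor should match the formula the course specifies.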

1b
2 marks

State, with a reason, whether the mean or the median would be the more suitable measure of central tendency for these data.

1c
2 marks

15 history teachers also completed the riddle; their times are shown below in the box plot:

[Figure: box plot of the history teachers' times]

Explain what the cross (×) represents on the box plot above. Interpret this in context.

1d
4 marks

By comparing the distributions of times taken to complete the riddle, decide which set of teachers were faster at solving the riddle.

2a
3 marks

Hugo, a newly appointed HR administrator for a company, has been asked to investigate the number of absences within the IT department. The department contains 23 employees, and the box plot below summarises the data for the number of days that individual employees were absent during the previous quarter.

[Figure: box plot of the number of days absent]

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

Show that these data have an outlier, and state its value.

2b
4 marks

For the 23 employees within the department, Hugo has the summary statistics:

$\sum x = 286$ and $\sum x^2 = 4238$

Hugo investigates the employee corresponding to the outlier value found in part (a) and discovers that this employee had a long-term illness.  Hugo decides not to include that value in the data for the department.

Assuming that there are no other outliers, calculate the mean and standard deviation of the number of days absent for the remaining employees.
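
Removing one observation from a set of summary statistics only needs the totals to be adjusted before the usual formulae are applied. The sketch below illustrates the idea with hypothetical figures (the values 2, 4, 6, 8, 10 with the 10 removed); the function names `remove_value` and `mean_sd` are illustrative assumptions.

```python
import math


def remove_value(n, sum_x, sum_x2, value):
    """Summary statistics after deleting one observation equal to `value`."""
    return n - 1, sum_x - value, sum_x2 - value ** 2


def mean_sd(n, sum_x, sum_x2):
    """Mean and population standard deviation from summary statistics."""
    mean = sum_x / n
    return mean, math.sqrt(sum_x2 / n - mean ** 2)


# Hypothetical data 2, 4, 6, 8, 10: n = 5, sum of x = 30, sum of x squared = 220.
n, sum_x, sum_x2 = remove_value(5, 30, 220, 10)
print(n, sum_x, sum_x2)  # 4 20 120
mean, sd = mean_sd(n, sum_x, sum_x2)
print(mean, round(sd, 3))  # 5.0 2.236
```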

3a
4 marks

Sam, a zoologist, is a member of a group researching the masses of gentoo penguins.  The research group takes a sample of 100 male and 100 female penguins and records their masses.

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

Given that values are outliers if they are less than 4.2 kg or more than 8.5 kg, calculate the upper and lower quartiles for the masses of the 200 gentoo penguins.
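
Writing $L$ and $U$ for the lower and upper outlier boundaries (symbols introduced here for illustration), the outlier definition gives two simultaneous equations in $Q_1$ and $Q_3$:

$$Q_1 - 1.5\,(Q_3 - Q_1) = L, \qquad Q_3 + 1.5\,(Q_3 - Q_1) = U$$

Subtracting the first equation from the second gives $4\,(Q_3 - Q_1) = U - L$, so the interquartile range is $\frac{U - L}{4}$. The quartiles then follow from $Q_1 = L + 1.5 \times \text{IQR}$ and $Q_3 = U - 1.5 \times \text{IQR}$.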

3b
5 marks

Casey is another member of Sam’s research group.  She believes that the masses of male and female gentoo penguins follow different distributions.  The cumulative frequency graphs below show the masses of the male and female gentoo penguins in the sample.

[Figure: cumulative frequency graphs of male and female penguin masses]

By calculating a measure of central tendency and a measure of variation, compare the two distributions.

4a
4 marks

Ms Chew is an accountant who is examining the length of time it takes her to complete jobs for her clients.  Ms Chew looks at her spreadsheet and lists the number of hours it took her to complete her last 12 jobs:

9 2 - 6 5 2 - 6 21 5 4 8

‘-’ represents a job for which the length of time taken was not recorded.

An outlier is an observation which lies more than 2 standard deviations away from the mean.

By first cleaning the data, show that 21 is the only outlier.

4b
3 marks

Ms Chew looks at her handwritten records and finds that the value 21 was typed into the spreadsheet incorrectly.  It should have been 12.

Without further calculations, explain the effect this would have on the:

(i)
mean

(ii)
standard deviation

(iii)
median.

5a
2 marks

Bartholomew is a fan of BMW cars and is using the large data set to compare the masses of BMW cars in 2002 to 2016.

Using your knowledge of the large data set, explain why Bartholomew may not be able to include all the 2002 BMW cars from the large data set in his investigation.

5b
5 marks

Bartholomew cleans the data and calculates the following summary statistics:

| Year | Min | Max | Number of cars | $\sum x$ | $\sum x^2$ |
|---|---|---|---|---|---|
| 2002 | 1355 | 2180 | 126 | 199490 | 320295950 |
| 2016 | 1375 | 2350 | 410 | 668330 | 1109748800 |

Bartholomew uses the definition that an outlier is more than 2 standard deviations away from the mean.

Using this definition show that both years have outliers.

5c
3 marks

Compare the masses of BMW cars in 2002 and 2016.

6a
4 marks

Kristoff plays a game on his mobile phone where a player must match items together. Upon completion of the game, Kristoff receives a score out of 100. Kristoff plays the game 200 times and attempts to draw a box plot showing the distribution of his scores. However, Kristoff suspects that some of his scores may be outliers, so he defines an outlier as a score that is either more than 1.5 × (interquartile range) above the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.

Kristoff's three worst scores were 12, 25 and 30 and his three best scores were 85, 93 and 96.

Complete the box plot of Kristoff's scores. Indicate any outliers by using a cross (×).

[Figure: incomplete box plot of Kristoff's scores]

6b
2 marks

Comment on the skewness of the distribution. Give a reason for your answer.

6c
2 marks

The game gives the player a gold medal if the player achieves a score of at least 70. After playing the game 200 times, Kristoff claims that he was awarded a gold medal 60 times.

Comment on the validity of Kristoff's claim.

1a
2 marks

The lengths of unicorn horns are measured in cm.  For a group of adult unicorns, the lower quartile was 87 cm and the upper quartile was 123 cm.  For a group of adolescent unicorns, the lower quartile was 33 cm and the upper quartile was 55 cm.

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

Which of the following adult unicorn horn lengths would be considered outliers?

32 cm 96 cm 123 cm 188 cm

1b
2 marks

Which of the following adolescent unicorn horn lengths would be considered outliers?

12 cm 52 cm 86 cm 108 cm

1c
2 marks

(i)
State the smallest length an adult unicorn horn can be without being considered an outlier.

(ii)
State the smallest length an adolescent unicorn horn can be without being considered an outlier.

2a
4 marks

The cumulative frequency diagram below shows completion times for 100 competitors at the 2019 Rubik’s cube championships.  The quickest completion time was 9.8 seconds and the slowest time was 52.4 seconds.

[Figure: cumulative frequency diagram of 2019 completion times]

The grid below shows a box plot of the 2020 championship data.  Draw a box plot on the grid to represent the 2019 championship data.

[Figure: grid showing a box plot of the 2020 championship data]

2b
3 marks

(i)
Compare the distribution of completion times for the 2019 and 2020 championships.

(ii)
Given that the 2020 championships happened after the global pandemic, during which many competitors spent months at home, interpret your findings from part (b)(i).

3
7 marks

Students at two Karate Schools, Miyagi Dojo and Cobra Kicks, measured the force of a particular style of hit.  Summary statistics for the force, in newtons, with which the students could hit are shown in the table below:

| School | $n$ | $\sum x$ | $\sum x^2$ |
|---|---|---|---|
| Miyagi Dojo | 12 | 21873 | 41532545 |
| Cobra Kicks | 17 | 29520 | 52330890 |

(i)
Calculate the mean and standard deviation for the forces with which the students could hit.

(ii)
Compare the distributions for the two Karate Schools.

4a
4 marks

The heights, in metres, of a flock of 20 flamingos are recorded and shown below:

0.4 0.9 1.0 1.0 1.2 1.2 1.2 1.2 1.2 1.2
1.3 1.3 1.3 1.4 1.4 1.4 1.4 1.5 1.5 1.6


An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

(i)
Find the values of Q1, Q2 and Q3.

(ii)
Find the interquartile range.

(iii)
Identify any outliers.

4b
3 marks

Using your answers to part (a), draw a box plot for the data.

[Figure: blank grid for the box plot]

5a
3 marks

The numbers of daily Covid-19 vaccinations reported by one vaccination centre over a 14-day period are given below:

237 264 308 313 319 352 378
378 405 421 428 450 465 583


Given that $\sum x = 5301$ and $\sum x^2 = 2\,113\,195$, calculate the mean and standard deviation for the number of daily vaccinations.

5b
2 marks

An outlier is an observation which lies more than 2 standard deviations away from the mean.

Identify any outliers for this data.

5c
3 marks

By removing any outliers identified in part (b), clean the data and recalculate the mean and standard deviation.

6
7 marks

The cumulative frequency diagram below shows the distribution of income of 120 managers across a supermarket chain.

[Figure: cumulative frequency diagram of managers' incomes, drawn on a grid]

The incomes of a sample of 120 other employees across the supermarket chain are recorded in the table below.

| Income, $I$ (£ thousand) | Frequency |
|---|---|
| $0 \le I < 20$ | 34 |
| $20 \le I < 40$ | 28 |
| $40 \le I < 60$ | 27 |
| $60 \le I < 80$ | 17 |
| $80 \le I < 100$ | 10 |
| $100 \le I < 120$ | 4 |


On the grid above, draw a cumulative frequency graph to show the data for the other employees and compare the income of managers and other employees.
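
The running totals needed for a cumulative frequency graph can be sketched as follows. The frequencies and class upper bounds come from the table above; the variable names are illustrative, and the sketch assumes the usual convention of plotting each cumulative frequency against the upper bound of its class.

```python
from itertools import accumulate

# Class upper bounds (in thousands of pounds) and frequencies from the table above.
upper_bounds = [20, 40, 60, 80, 100, 120]
frequencies = [34, 28, 27, 17, 10, 4]

# Running total of the frequencies, paired with each class upper bound.
cumulative = list(accumulate(frequencies))
points = list(zip(upper_bounds, cumulative))
print(points)
# [(20, 34), (40, 62), (60, 89), (80, 106), (100, 116), (120, 120)]
```

The final cumulative frequency equals the sample size of 120, which is a quick check that no class has been missed.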

7a
2 marks

An extract of data from the large data set is given below.

| Make | Engine size (cm³) |
|---|---|
| BMW | 1995 |
| FORD | 1499 |
| VAUXHALL | 1398 |
| VOLKSWAGEN | 2967 |
| TOYOTA | 1798 |
| BMW | 2171 |
| VAUXHALL | 1598 |
| TOYOTA | 998 |
| FORD | 1388 |
| VOLKSWAGEN | 1896 |


Calculate the mean and standard deviation of the engine sizes.

7b
2 marks

Any value more than two standard deviations from the mean can be identified as an outlier.

Using this definition of an outlier, show that one of the values from the sample of engine sizes is an outlier. Fully justify your answer.

8a
4 marks

Mr Wiltshire, a maths teacher, starts every Year 7 lesson by setting his 27 students the task of completing a randomised multiplication grid. Mr Wiltshire suspects that the time of day affects the speed at which his students complete the task. To test his suspicion, he records the times it takes the students to complete the multiplication grid in a morning lesson and then again in an afternoon lesson.

Below are the 27 times, in minutes, it took the students to complete the multiplication grid in the morning.

1.2 1.5 2.4 2.4 2.5 2.7 3.2 3.3 3.3
3.3 3.4 3.6 3.8 3.9 3.9 4.0 4.1 4.3
4.8 5.1 5.3 5.5 5.7 5.8 5.8 6.1 7.2

The grid below shows a box plot of the times in the afternoon lesson. Draw a box plot on the grid to represent the times in the morning lesson.

[Figure: grid showing a box plot of the afternoon times]

8b
3 marks

Explain what is represented by the cross (×) on the box plot for the afternoon lesson.

Comment on the skewness of the distribution of the afternoon times. Give a reason for your answer.

8c
3 marks

Compare the two distributions of times taken by the 27 students to complete a randomised multiplication grid in the morning and in the afternoon.

1a
2 marks

Marya is consistently late for work. David, Marya’s boss, records the number of minutes that she is late during the next six days. David calculates the mean is 18 minutes and the variance is 210 minutes². On one of the six days, Marya was 50 minutes late.

Show that 50 is an outlier, using the definition that outliers are more than 2 standard deviations away from the mean.

1b
2 marks

Marya states that the 50 minutes should not be included as it is an outlier.

(i)
Give a reason why Marya wants the 50 minutes to be excluded from the data set.

(ii)
Give a reason why David wants the 50 minutes to be included in the data set.

1c
5 marks

Marya tells David that she was 50 minutes late that day due to a road accident, and she shows David the traffic report as evidence.

David agrees to remove the 50 from the data set. Calculate the new mean and standard deviation for the remaining values.

2
8 marks

For each scenario state, with a reason, whether the identified outlier should be included or excluded in the data set.

(i)
Alice is collecting the ages of children in a school classroom. The outlier is the age of 29.

(ii)
Benji records the times taken for some athletes to run a mile. The outlier is the time of 7 seconds.

(iii)
Carlos is collecting data on the number of hours of sunlight per day for the city of Burrow, located in the north of North America. The outlier is the value of 23.4 hours.

(iv)
Daisy is collecting data on the heights of cows; the median height is 161 cm. The outlier is the height 189 cm.

3a
3 marks

The cumulative frequency graph below shows information about the lengths of time taken by 80 students to run a lap of the sports hall.

[Figure: cumulative frequency graph of lap times]

Complete the table below:

| Time ($t$ seconds) | $20 < t \le 40$ | $40 < t \le 60$ | $60 < t \le 80$ | $80 < t \le 100$ |
|---|---|---|---|---|
| Frequency | 8 | | | |

3b
3 marks

Hence estimate the mean and the standard deviation of the times.
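
Estimating a mean and standard deviation from grouped data usually means treating every value in a class as if it sat at the class midpoint. The sketch below illustrates that approach: the class boundaries follow the table in part (a), but the frequencies and the function name `grouped_mean_sd` are hypothetical.

```python
import math


def grouped_mean_sd(classes, frequencies):
    """Estimate mean and standard deviation from grouped data via class midpoints."""
    midpoints = [(low + high) / 2 for low, high in classes]
    n = sum(frequencies)
    sum_fx = sum(f * m for f, m in zip(frequencies, midpoints))
    sum_fx2 = sum(f * m ** 2 for f, m in zip(frequencies, midpoints))
    mean = sum_fx / n
    return mean, math.sqrt(sum_fx2 / n - mean ** 2)


# Class boundaries follow the table in part (a); the frequencies are hypothetical.
classes = [(20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [1, 2, 2, 1]
mean, sd = grouped_mean_sd(classes, frequencies)
print(mean)          # 60.0
print(round(sd, 2))  # 19.15
```

Because the midpoints stand in for the unknown raw values, both results are estimates rather than exact statistics.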

3c
3 marks

Given that the fastest time was 21 seconds and the slowest time was 100 seconds, show that these values are outliers using the definition that an outlier is more than 2 standard deviations away from the mean.

4a
3 marks

Tim has just moved to a new town and is trying to choose a doctor’s surgery to join, HealthHut or FitFirst. He wants to register with the one where patients get seen faster. He takes a sample of 150 patients from HealthHut and calculates the range of waiting times as 45 minutes and the variance as 121 minutes².

An outlier is defined as a value which is more than 2 standard deviations away from the mean.

Prove that the sample contains an outlier.

4b
2 marks

Tim finds out that the outlier is a valid piece of data and decides to keep the value in his sample.

Which pair of statistical measures would be more appropriate to use when using the sample to compare the doctor’s surgeries: the mean and standard deviation or the median and interquartile range? Give a reason for your answer.

4c
1 mark

The box plots below show the waiting times for the two surgeries.

[Figure: box plots of waiting times for HealthHut and FitFirst]

Given that there is only one outlier for HealthHut, label it on the box plot with a cross (×).

4d
4 marks

Compare the two distributions of waiting times in context.

5a
2 marks

Georgina is investigating the change in carbon monoxide emissions from cars between 2002 and 2016. She uses the large data set to select a random sample of 100 cars from each year.

Using your knowledge of the large data set, explain why Georgina must clean the data before taking a sample.

5b
4 marks

Georgina cleans the data, forms two samples and calculates the following summary statistics:

| Year | Min | Lower quartile | Range | Interquartile range |
|---|---|---|---|---|
| 2002 | 0.053 | 0.144 | 0.560 | 0.069 |
| 2016 | 0.035 | 0.166 | 0.694 | 0.297 |

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or less than 1.5 × (interquartile range) below the lower quartile.

Show that only one of the samples contains outliers.

5c
3 marks

Georgina removes the outliers from the relevant sample. Describe the effect this has on the range and interquartile range.
