CIE AS Maths: Probability & Statistics 1

Topic Questions

1.3 Working with Data

1a
3 marks

In a conkers competition, the number of strikes required to smash an opponent’s conker (and thus win a match) is recorded for 15 matches. The results are given below.

6 2 9 10 9 12 5  
8 7 5 11 9 17 8 9

Find the median, the upper and lower quartiles, and the interquartile range for the number of strikes required to smash a conker.

1b
2 marks

An outlier is defined as any data value that falls either more than 1.5 × (interquartile range) above the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.

Identify any outliers.
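
A minimal Python sketch of the 1.5 × (interquartile range) rule applied to the strike data. The helper names `quartiles` and `iqr_outliers` are illustrative, and the quartiles are assumed to be the medians of the lower and upper halves of the ordered data (excluding the middle value when n is odd); other quartile conventions can give slightly different values.

```python
def quartiles(values):
    """Return (Q1, median, Q3) using the median-of-halves convention (an assumption)."""
    data = sorted(values)
    n = len(data)

    def median(xs):
        m = len(xs)
        mid = m // 2
        return xs[mid] if m % 2 else (xs[mid - 1] + xs[mid]) / 2

    lower = data[: n // 2]         # values below the median position
    upper = data[(n + 1) // 2:]    # values above the median position
    return median(lower), median(data), median(upper)


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR below Q1 or above Q3."""
    q1, _, q3 = quartiles(values)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    return [x for x in values if x < low_fence or x > high_fence]


strikes = [6, 2, 9, 10, 9, 12, 5, 8, 7, 5, 11, 9, 17, 8, 9]
print(quartiles(strikes))     # (6, 9, 10)
print(iqr_outliers(strikes))  # [17]
```

With these quartiles the fences sit at 0 and 16 strikes, so only the value 17 falls outside them.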

2a
3 marks

A hotel manager recorded the number of towels that went missing at the end of each day for 12 days. The results are below.

2 4 1 0 3 4
3 9 3 2 4 5

Find the mean and the standard deviation for the number of towels missing at the end of each day.

You may use the summary statistics $n = 12$, $\sum x = 40$, $\sum x^2 = 190$ with the formulae $\bar{x} = \frac{\sum x}{n}$ and $\sigma = \sqrt{\frac{\sum x^2}{n} - (\bar{x})^2}$.
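
As a rough check on the summary statistics, the sketch below recomputes them from the twelve listed values and then applies the two formulae. The function name `mean_sd_from_sums` is illustrative, not part of the question.

```python
from math import sqrt


def mean_sd_from_sums(n, sum_x, sum_x2):
    """Mean and standard deviation from n, sum of x and sum of x squared."""
    mean = sum_x / n
    variance = sum_x2 / n - mean ** 2
    return mean, sqrt(variance)


towels = [2, 4, 1, 0, 3, 4, 3, 9, 3, 2, 4, 5]
n = len(towels)
sum_x = sum(towels)
sum_x2 = sum(x * x for x in towels)

mean, sd = mean_sd_from_sums(n, sum_x, sum_x2)
print(n, sum_x, sum_x2)              # 12 40 190
print(round(mean, 2), round(sd, 2))  # 3.33 2.17
```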

2b
2 marks

An outlier is defined as any data value lying more than 2 standard deviations from the mean. Find any outliers in the data.

3a
1 mark

Joe counts the number of different species of bird visiting his garden each day for a week. The results are given below.

7 8 5 12 9 7 3


Calculate the mean number of different species of bird visiting Joe’s garden.

3b
3 marks

Joe continues to record the number of different species of bird visiting his garden each day for the rest of the month and calculates the mean number of different species is 9.25 for the remaining 24 days.

Joe says, using the data from the whole month, he would expect to see 9 different species every day. Explain whether Joe is correct. You must support your answer with clear working.
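
One way to combine the two periods is to rebuild each total from its mean before dividing by the total number of days. The sketch below does this for Joe’s data; the variable names are illustrative.

```python
week_one = [7, 8, 5, 12, 9, 7, 3]   # species counts for the first 7 days
rest_days, rest_mean = 24, 9.25     # remaining days and their mean

# Total = (sum of first week) + (days x mean) for the rest of the month
total = sum(week_one) + rest_days * rest_mean
days = len(week_one) + rest_days
combined_mean = total / days

print(days, total, round(combined_mean, 2))  # 31 273.0 8.81
```

The combined mean is weighted by the number of days in each period, so it is not simply the average of the two separate means.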

3c
1 mark

Joe decides he will repeat his investigation into species of bird visiting his garden three months later. That month he finds the mean number of different species visiting his garden per day is 4.5. Joe is concerned that this indicates some species of bird are dying out. Suggest a reason why this may not be the case.

4a
3 marks

The cumulative frequency diagram below shows the lengths of 100 phone calls, in minutes, made to a computer help centre one morning.

[Figure: cumulative frequency diagram of call lengths]

(i)
Use the cumulative frequency graph to estimate the upper and lower quartiles.
(ii)
Hence find the interquartile range.

4b
3 marks

In the afternoon, on the same day, the lengths of another 100 phone calls to the computer help centre were recorded. The median length of these calls was 15 minutes; the interquartile range was 9 minutes.

Compare the location (median) and spread (interquartile range) of the calls in the morning and the afternoon.

5a
5 marks

Shara is practising the long jump, recording her distances to the nearest 10 cm. During one practice session, Shara attempts the long jump 23 times and the distances she achieved are listed below.

 

3.4 3.1 3.5 3.8 4.1 2.8 3.2 3.0
3.2 3.6 3.1 2.9 3.9 3.1 2.7 3.4
3.1 3.2 3.5 3.3 3.6 2.8 4.0

 

(i)
Draw an ordered, stem-and-leaf diagram to illustrate Shara’s long jump distances.
(ii)
Find the median and mode distances.
5b
3 marks
(i)
Find the mean of Shara’s long jump distances.
(ii)
By considering the relative sizes of the mean, median and mode, comment on the skewness of Shara’s long jump distances.

6a
3 marks

Two geologists are measuring the size of rocks found on a beach in front of a cliff.
The geologists record the greatest length, in millimetres, of each rock they find at distances of 5 m and 25 m from the base of the cliff. They randomly choose 20 rocks at each distance.  Their results are summarised in the table below.

Distance from cliff base    5 m           25 m
Number of rocks, $n$        20            20
$\sum x$                    3885          2220
$\sum x^2$                  369 513.75    287 580

Using the formulae $\bar{x} = \frac{\sum x}{n}$ and $\sigma = \sqrt{\frac{\sum x^2}{n} - \left(\frac{\sum x}{n}\right)^2}$, find the mean and standard deviation for the size of rocks at both 5 m and 25 m from the base of the cliff.

6b
2 marks

Compare the location (mean) and spread (standard deviation) of the size of rocks at 5 m and 25 m from the base of the cliff.

6c
2 marks

In this instance, an outlier is determined to be any data value that lies outside one standard deviation of the mean ($\bar{x} \pm \sigma$).

(i)
Find the smallest rock that is not an outlier at 5 m from the base of the cliff.

(ii)
Briefly explain why there cannot be any small rock outliers at 25 m from the base of the cliff.

7a
3 marks

A second-hand car business specialises in dealing with cars valued under £5000.
For one day the manager records the sale price of the 35 cars the business sells.

The manager codes the sale prices using the formula $Y = \frac{X - 2500}{100}$, where £$X$ is the sale price of a car.

The coded data for $Y$ is summarised in the box-and-whisker diagram below.

[Figure: box-and-whisker diagram of the coded sale prices $Y$]

Use the box-and-whisker plot to find: 

    (i)     The median sale price of a car on this day.

    (ii)    The sale price of the least expensive car sold on this day.

7b
4 marks

In the last five minutes of business two more cars are sold – one for £3400 and one for £3600.  

(i)
Find the Y values for these cars.
(ii)
Describe and justify the effect these two late sales have on the median sale price of a car for this day.

 

1a
4 marks

As part of an experiment, 15 maths teachers are asked to solve a riddle and their times, in minutes, are recorded:

8 12 19 20 20
21 22 23 23 23
25 26 27 37 39

An outlier is an observation which lies more than 2 standard deviations away from the mean.

Show that there is exactly one outlier.

1b
2 marks

State, with a reason, whether the mean or the median would be the more suitable measure of central tendency for these data.

1c
2 marks

15 history teachers also completed the riddle; their times are shown in the box plot below.

[Figure: box plot of the history teachers’ times]

Explain what the cross (×) represents on the box plot above. Interpret this in context.

1d
4 marks

By comparing the distributions of times taken to complete the riddle, decide which set of teachers were faster at solving the riddle.

2a
3 marks

Hugo, a newly appointed HR administrator for a company, has been asked to investigate the number of absences within the IT department. The department contains 23 employees, and the box plot below summarises the data for the number of days that individual employees were absent during the previous quarter.

[Figure: box plot of days absent per employee]

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.

Show that these data have an outlier, and state its value.

2b
4 marks

For the 23 employees within the department, Hugo has the summary statistics:

$\sum x = 286$ and $\sum x^2 = 4238$

Hugo investigates the employee corresponding to the outlier value found in part (a) and discovers that this employee had a long-term illness.  Hugo decides not to include that value in the data for the department.

Assuming that there are no other outliers, calculate the mean and standard deviation of the number of days absent for the remaining employees.

3a
4 marks

Sam, a zoologist, is a member of a group researching the masses of gentoo penguins.  The research group takes a sample of 100 male and 100 female penguins and records their masses.

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.

Given that values are outliers if they are less than 4.2 kg or more than 8.5 kg, calculate the upper and lower quartiles for the mass of the 200 gentoo penguins.

3b
5 marks

Casey is another member of Sam’s research group. She believes that the masses of male and female gentoo penguins follow different distributions. The cumulative frequency graphs below show the masses of the male and female gentoo penguins in the sample.

[Figure: cumulative frequency graphs of male and female gentoo penguin masses]

By calculating a measure of central tendency and a measure of variation, compare the two distributions.

4a
4 marks

Ms Chew is an accountant who is examining the length of time it takes her to complete jobs for her clients.  Ms Chew looks at her spreadsheet and lists the number of hours it took her to complete her last 12 jobs:

9 2 - 6 5 2 - 6 21 5 4 8

An outlier is an observation which lies more than  2 standard deviations away from the mean.

Show that 21 is the only outlier.

4b
3 marks

Ms Chew looks at her handwritten records and finds that the value 21 was typed into the spreadsheet incorrectly.  It should have been 12.

Without further calculations, explain the effect this would have on the:

(i)
mean
(ii)
standard deviation
(iii)
median.

5a
2 marks

Jan is concerned about the availability of toilet rolls following news of supply issues due to a lack of lorry drivers.  Jan visits 19 shops, starting at the local village store and visiting numerous shops in the local town centre, before travelling to some of the larger out of town supermarkets.  At each shop, Jan counts the number of standard 4 packs of toilet rolls available for customers to buy. 

Jan records the results in a stem-and-leaf diagram, shown below. 

n = 19

Key: 2 | 3 represents 23 standard 4 packs of toilet rolls

0 | 2 7
1 | 3 5 6
2 | 4 7 7 8 9 9
3 | 5 5 6 7
4 | 1 3 3
5 | 0

 

Show that the median is 29 standard 4 packs and find the lower quartile.

5b
4 marks

Jan later visits another shop and again counts the number of standard 4 packs of toilet rolls available. Jan did not have anywhere to record the number and later forgets, only recalling that it was between 10 and 19.
   (i)     Explain the effect this new value would have on the median.
   (ii)    Explain the effects this new value could have on the lower quartile.

 

5c
1 mark

By considering the range of the number of standard 4 packs of toilet rolls available at the shops Jan visited, suggest a problem with Jan’s method of collecting the data.

6a
3 marks

A road safety team are investigating how fast vehicles are travelling along a village road with a speed limit of 40 mph. The team record the speeds of 120 vehicles travelling along the road one day during the busy morning rush hour period. 

The histogram below shows the speeds of the 120 vehicles.

[Figure: histogram of vehicle speeds]

Determine: 

(i)   the modal class 

(ii)   the class containing the median

6b
3 marks

The road safety team decide they will recommend anti-speeding measures if 30% or more of the recorded speeds are higher than 37.5 mph (2.5 mph below the speed limit).

Determine whether the team will be recommending anti-speeding measures.

7a
2 marks

Andrew is investigating the life expectancy at birth for countries in Europe and Asia.

He takes a random sample of 11 countries from each continent, and using data from a reliable online source he notes the life expectancies at birth from the year 2010 for each of the countries in his samples. 

Andrew codes the data for Asia, using the formula $Y = a(X - b)$, where $X$ is the life expectancy and $a$ and $b$ are positive integers.

(i)
Given that the interquartile ranges for $X$ and $Y$ are 5.23 and 523 respectively, write down the value of $a$.
(ii)
Given further that the medians for $X$ and $Y$ are 74.15 and $-85$ respectively, find the value of $b$.

7b
3 marks

Andrew uses the same coding formula for the 11 European countries and calculates the coded median and interquartile range to be and 551 respectively.
Compare the central tendency and variation of the life expectancies at birth in 2010 for the European and Asian countries in Andrew’s samples.

1a
2 marks

The lengths of unicorn horns are measured in cm.  For a group of adult unicorns, the lower quartile was 87 cm and the upper quartile was 123 cm. For a group of adolescent unicorns, the lower quartile was 33 cm and the upper quartile was 55 cm.

An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.

Which of the following adult unicorn horn lengths would be considered outliers?

32 cm 96 cm 123 cm 188 cm
1b
2 marks

Which of the following adolescent unicorn horn lengths would be considered outliers?

12 cm 52 cm 86 cm 108 cm
1c
2 marks
(i)
State the smallest length an adult unicorn horn can be without being considered an outlier.

(ii)
State the smallest length an adolescent unicorn horn can be without being considered an outlier.

2a
4 marks

The cumulative frequency diagram below shows completion times for 100 competitors at the 2019 Rubik’s cube championships. The quickest completion time was 9.8 seconds and the slowest time was 52.4 seconds.

[Figure: cumulative frequency diagram of 2019 completion times]

The grid below shows a box plot of the 2020 championship data. Draw a box plot on the grid to represent the 2019 championship data.

[Figure: grid showing the 2020 box plot]

2b
3 marks
(i)
Compare the distribution of completion times for the 2019 and 2020 championships.

(ii)
Given that the 2020 championships happened after the global pandemic, during which many competitors spent months at home, interpret your findings from part (b)(i).

3
7 marks

Students at two Karate Schools, Miyagi Dojo and Cobra Kicks, measured the force of a particular style of hit.  Summary statistics for the force, in newtons, with which the students could hit are shown in the table below:

               $n$    $\sum x$    $\sum x^2$
Miyagi Dojo    12     21873       41532545
Cobra Kicks    17     29520       52330890

(i)
Calculate the mean and standard deviation for the forces with which the students could hit.

(ii)
Compare the distributions for the two Karate Schools.

4a
4 marks

The heights, in metres, of a flock of 20 flamingos are recorded and shown below:

0.9 0.9 1.0 1.0 1.2 1.2 1.2 1.2 1.2 1.2
1.3 1.3 1.3 1.4 1.4 1.4 1.4 1.5 1.5 1.6


An outlier is an observation that falls either more than 1.5 × (interquartile range) above the upper quartile or more than 1.5 × (interquartile range) below the lower quartile.

(i)
Find the values of Q1, Q2 and Q3.
(ii)
Find the interquartile range.
(iii)
Identify any outliers.
4b
3 marks

Using your answers to part (a), draw a box plot for the data.

5a
3 marks

The numbers of daily Covid-19 vaccinations reported by one vaccination centre over a 14-day period are given below:

237 264 308 313 319 352 378
378 405 421 428 450 465 583


Given that $\sum x = 5301$ and $\sum x^2 = 2\,113\,195$, calculate the mean and standard deviation for the number of daily vaccinations.

5b
2 marks

An outlier is an observation which lies more than 2 standard deviations away from the mean.

Identify any outliers for this data.
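
The 2 standard deviation rule can be sketched in Python as below. The function name `two_sd_outliers` is illustrative, and the standard deviation is assumed to use the divisor n, matching the formula given elsewhere in these questions.

```python
from math import sqrt


def two_sd_outliers(values, k=2):
    """Return the (low, high) limits and any values more than k sd from the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum(x * x for x in values) / n - mean ** 2)  # divisor n (assumption)
    low, high = mean - k * sd, mean + k * sd
    return (low, high), [x for x in values if x < low or x > high]


vaccinations = [237, 264, 308, 313, 319, 352, 378,
                378, 405, 421, 428, 450, 465, 583]
(low, high), outliers = two_sd_outliers(vaccinations)
print(round(low, 1), round(high, 1))  # 204.6 552.7
print(outliers)                       # [583]
```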

5c
2 marks

Outliers are to be removed from the data, and the mean and standard deviation recalculated. Without making any further calculations, state the effect that removing the outliers would have on the values of the mean and the standard deviation.

6
7 marks

The cumulative frequency diagram below shows the distribution of income of 120 managers across a supermarket chain.

[Figure: cumulative frequency diagram of managers’ income]

The incomes of a sample of 120 other employees across the supermarket chain are recorded in the table below.

Income I (£ thousand)    Frequency
0 ≤ I < 20               34
20 ≤ I < 40              28
40 ≤ I < 60              27
60 ≤ I < 80              17
80 ≤ I < 100             10
100 ≤ I < 120            4


On the grid above, draw a cumulative frequency graph to show the data for the other employees and compare the income of managers and other employees.
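
The points for the cumulative frequency graph come from running totals of the frequencies, plotted against the upper class boundaries. A short sketch, with illustrative variable names:

```python
from itertools import accumulate

upper_bounds = [20, 40, 60, 80, 100, 120]   # upper class boundaries (£ thousand)
frequencies = [34, 28, 27, 17, 10, 4]

cumulative = list(accumulate(frequencies))
# Start the curve at the lowest class boundary with cumulative frequency 0
points = [(0, 0)] + list(zip(upper_bounds, cumulative))

print(cumulative)  # [34, 62, 89, 106, 116, 120]
```

The final running total equals the sample size of 120, which is a quick check that no class has been missed.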

7a
2 marks

The stem-and-leaf diagram below shows the number of attempts 27 gamers took to complete the last level of a computer game.

n = 27

Key: 2 | 8 represents 28 attempts to complete the last level

0 | 5 5 6                      (3)
1 | 1 4 4 7 8 8 9              (7)
2 | 0 3 4 4 5 7 8 8 8 8 9      (11)
3 | 1 4 8 9                    (4)
4 | 1 2                        (2)

 

(i)
How can you tell, without any calculations, that the modal class for the number of attempts was 20 to 29?
(ii)
Explain why it is not necessarily the case that the gamer who took the least number of attempts was the fastest to complete the level.
7b
3 marks

Find the median, the lower quartile and the upper quartile.

7c
1 mark

The computer game developer decided that the last level was not difficult enough so added an extra, even harder, level. Briefly describe what you would expect to happen to the median if the number of attempts taken by the same 27 gamers to complete the new last level were recorded.

8a
3 marks

A disgruntled chocaholic is complaining that their favourite tub of chocolates seems to have very few of their favourite ‘toffee deluxe’ sweets in, compared to the other four available. In a bid to show they have a valid complaint, the chocaholic bought 10 tubs of the sweets and counted how many of each of the five sweets are contained in each tub. The results are summarised in the box and whisker diagrams below.

[Figure: box-and-whisker diagrams for the five sweets]

(i)
Write down the median number of ‘toffee deluxe’ sweets in a tub.
(ii)

Work out the interquartile range for ‘toffee deluxe’.

8b
2 marks

What value is indicated by the cross (×) on the ‘toffee deluxe’ box plot and why has this been plotted individually?

8c
3 marks
(i)
Briefly compare the central tendency and variation of ‘toffee deluxe’ with the other four sweets.
(ii)
Do you think the disgruntled chocaholic has a valid complaint regarding the number of  ‘toffee deluxe’ sweets in a tub? Fully justify your answer.

9a
5 marks

A café owner is analysing data about the number of customers they serve on weekdays during the busy lunchtime period between 12pm and 2pm.  The data below shows the number of customers who were served between 12pm and 2pm each weekday over a three-week period. 

52 64 58 49 52
71 63 52 56 58
53 52 47 68 56

(i)
Find the mode and median number of customers per weekday during the lunchtime period.
(ii)
Given the summary statistics $\sum x = 851$ and $\sum x^2 = 48\,965$, find the mean and the standard deviation.

9b
3 marks
(i)
By considering the relative values of the mean, median and mode, state whether there is skewness in the data, giving a reason for your answer.
(ii)

A measure of skewness can be found by calculating

$\frac{3 \times (\text{mean} - \text{median})}{\text{standard deviation}}$

Find the value of this measure of skewness for the number of customers the café served during the lunchtime period over the three weeks.
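
A short sketch of this skewness measure applied to the café data, again assuming the standard deviation uses the divisor n. The variable names are illustrative.

```python
from math import sqrt
from statistics import median

customers = [52, 64, 58, 49, 52,
             71, 63, 52, 56, 58,
             53, 52, 47, 68, 56]

n = len(customers)
mean = sum(customers) / n
sd = sqrt(sum(x * x for x in customers) / n - mean ** 2)  # divisor n (assumption)
med = median(customers)

skewness = 3 * (mean - med) / sd
print(round(mean, 2), med, round(sd, 2))  # 56.73 56 6.76
print(round(skewness, 2))                 # 0.33
```

A positive value is consistent with the mean lying above the median, which points towards positive skew.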

10a
2 marks

David buys and sells small antique items as a hobby, aiming to make a few hundred pounds profit each month. Over the last two years David has kept a record of how much profit he has made each month (£$X$). He wants to analyse the figures by calculating his mean monthly profit and, to account for the variation in the antiques market, the standard deviation of his monthly profit.

David codes the data using the relationship $Y = 0.1X - 50$.
For one of the months, the $Y$ value is $-6$. Work out the profit David made in this month.

10b
2 marks

Given that the mean of Y is 2.3 and the standard deviation of Y is 0.8, find the mean amount of profit David made and the standard deviation for the two-year period.
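
Undoing a linear code of the form Y = 0.1X − 50 means adding 50 and then dividing by 0.1 for individual values and means, while a standard deviation is only affected by the scale factor. The sketch below illustrates this; the function names `decode_value` and `decode_sd` are hypothetical helpers, not part of the question.

```python
def decode_value(y, scale=0.1, shift=50):
    """Invert Y = scale * X - shift for a single value or a mean."""
    return (y + shift) / scale


def decode_sd(sd_y, scale=0.1):
    """A standard deviation is unaffected by the shift; only the scale matters."""
    return sd_y / abs(scale)


profit_for_month = decode_value(-6)   # the month with Y = -6
mean_profit = decode_value(2.3)       # mean of X from the mean of Y
sd_profit = decode_sd(0.8)            # sd of X from the sd of Y

print(round(profit_for_month, 2), round(mean_profit, 2), round(sd_profit, 2))
# 440.0 523.0 8.0
```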

1a
2 marks

Marya is consistently late for work. David, Marya’s boss, records the number of minutes that she is late during the next six days. David calculates the mean is 18 minutes and the variance is 210 minutes². On one of the six days, Marya was 50 minutes late.

 Marya states that the 50 minutes should not be included as it is an outlier. 

(i)
Give a reason why Marya wants the 50 minutes to be excluded from the data set.
(ii)

Give a reason why David wants the 50 minutes to be included in the data set.

1b
5 marks

Marya tells David that she was 50 minutes late that day due to a road accident, and she shows David the traffic report as evidence.

David agrees to remove the 50 from the data set. Calculate the mean and standard deviation for the remaining values.
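
One approach is to recover the sums from the given mean and variance, remove the contribution of the 50, and recalculate. The sketch below assumes the variance of 210 was calculated with the divisor n; the variable names are illustrative.

```python
from math import sqrt

n, mean, variance = 6, 18, 210
removed = 50

# Recover the sums: variance = sum_x2 / n - mean^2 (divisor n assumed)
sum_x = n * mean                     # 108
sum_x2 = n * (variance + mean ** 2)  # 3204

# Remove the 50-minute value and recompute
n_new = n - 1
sum_x_new = sum_x - removed          # 58
sum_x2_new = sum_x2 - removed ** 2   # 704

mean_new = sum_x_new / n_new
sd_new = sqrt(sum_x2_new / n_new - mean_new ** 2)
print(round(mean_new, 2), round(sd_new, 2))  # 11.6 2.5
```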

2a
6 marks

The ages of 27 children invited to a birthday party are summarised in the table below. 

Age ($a$ years)    Frequency
0-2                6
2-5                9
5-6                10
6+                 2

 

Given that the eldest child invited to the party was 7 years old, estimate the mean and standard deviation of the ages of the children.
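
A rough sketch of the midpoint method for grouped data. It rests on two assumptions that are not stated in the question: each class is treated as a continuous interval between the printed limits, and the open class 6+ is closed at 7 because the eldest child is 7. A different reading of the class boundaries would change the estimates.

```python
from math import sqrt

# ((lower, upper), frequency); boundaries are assumptions, see the note above
classes = [((0, 2), 6), ((2, 5), 9), ((5, 6), 10), ((6, 7), 2)]

n = sum(f for _, f in classes)
sum_fx = sum(f * (lo + hi) / 2 for (lo, hi), f in classes)
sum_fx2 = sum(f * ((lo + hi) / 2) ** 2 for (lo, hi), f in classes)

mean = sum_fx / n
sd = sqrt(sum_fx2 / n - mean ** 2)
print(n, sum_fx, sum_fx2)            # 27 105.5 503.25
print(round(mean, 2), round(sd, 2))  # 3.91 1.84
```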

2b
2 marks

Briefly describe any problems you encountered in answering part (a) and how you overcame them.

2c
1 mark

State one disadvantage of grouping data before analysing it.

3a
3 marks

The cumulative frequency graph below shows the information about the lengths of time taken for 80 students to run a lap of the sports hall.

[Figure: cumulative frequency graph of lap times]

Complete the table below:

Time (t seconds)    20 < t ≤ 40    40 < t ≤ 60    60 < t ≤ 80    80 < t ≤ 100
Frequency           8

3b
3 marks

Estimate the mean and the interquartile range.

3c
3 marks

Any times under 32 seconds or over 90 seconds are classed as outliers.
Estimate the percentage of the students’ times that are outliers.

4a
1 mark

Tim has just moved to a new town and is trying to choose a doctor’s surgery to join, HealthHut or FitFirst. He wants to register with the one where patients get seen faster. 

He takes a sample of 150 patients from both surgeries and finds their waiting times during their last visit.  Tim illustrates the data for each surgery using box-and-whisker plots as shown below.

[Figure: box-and-whisker plots of waiting times for HealthHut and FitFirst]

Give a reason as to why box-and-whisker plots are suitable for Tim’s purposes.

4b
1 mark

Briefly describe why there is an individual data value for FitFirst denoted with a cross (×) on its box-and-whisker plot.

4c
4 marks

Compare the two distributions of waiting times in context.

5a
4 marks

The captain of a ferry that transports vehicles is investigating the masses of cars in the UK. They collect data on the masses ($m$ kg) of 30 randomly selected cars.

The data can be summarised by $\sum (m - a) = 90$ and $\sum (m - a)^2 = 2\,182\,818$, where $a$ is a constant.

Given that the mean mass of the 30 cars is $\bar{m} = 1423$ kg, find the value of the constant $a$ and the standard deviation of the masses of the 30 cars.

5b
3 marks

Given that an outlier is defined as any value $m$ satisfying $|m - \bar{m}| > 2\sigma$, find the lowest and highest masses (to the nearest whole kilogram) that are not considered outliers.
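
Subtracting a constant from every value shifts the mean by that constant but leaves the standard deviation unchanged, so the shifted sums can be used directly. A minimal sketch under that reasoning, with illustrative variable names and the divisor n assumed for the standard deviation:

```python
from math import sqrt, ceil, floor

n = 30
sum_d = 90            # sum of (m - a)
sum_d2 = 2_182_818    # sum of (m - a)^2
mean_m = 1423         # given mean mass in kg

# mean of (m - a) = mean_m - a, so a = mean_m - sum_d / n
a = mean_m - sum_d / n

# The shift by a does not change the spread
sd = sqrt(sum_d2 / n - (sum_d / n) ** 2)

low, high = mean_m - 2 * sd, mean_m + 2 * sd
lowest_ok, highest_ok = ceil(low), floor(high)  # whole kilograms inside the limits

print(a, round(sd, 2))        # 1420.0 269.73
print(lowest_ok, highest_ok)  # 884 1962
```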

 

6a
3 marks

In a safety test, firemen carry out an experiment where a sofa in a typical living room setup is set on fire and allowed to burn.  After 10 minutes the fire is extinguished and a fire damage expert assesses and records the percentage of the room destroyed.

The firemen repeat this experiment with identical equipment except that the sofa is treated with a fire-resistant chemical. 

This process is repeated several times and the percentages recorded are listed below. 

Without fire-resistant chemical (%):
85 76 83 48 91
80 67 82 79 85
76 72 84 76 69

With fire-resistant chemical (%):
40 51 52 48 60
48 77 46 40 51
54 51 46 45 49

 

Draw an ordered, back-to-back stem and leaf diagram to illustrate the data.

6b
2 marks

Find the median destruction percentages for both “without chemical” and “with chemical”.

6c
2 marks

It is later discovered that two of the values were muddled up. The reading of 48% for “without chemical” should have been in the “with chemical” data whilst the 77% for “with chemical” should have been in the “without chemical” data.
Explain how correcting these errors will affect your answers to part (b).
