Central Limit Theorem (Edexcel A Level Further Maths: Further Statistics 1)

Revision Note

Author: Mark

Expertise: Maths

Sample Means

How do I take samples of random variables?

  • It's easier to explain this with an example
  • Let $X \sim \mathrm{B}\left(10, \tfrac{1}{2}\right)$, where $X$ is the number of heads in 10 flips of a fair coin
    • $\mathrm{E}(X) = np = 5$ heads
    • $\mathrm{Var}(X) = np(1-p) = 2.5$
  • Imagine doing this experiment each day for 7 days
    • $X_1 \sim \mathrm{B}\left(10, \tfrac{1}{2}\right)$ is the number of heads on day 1
    • $X_2 \sim \mathrm{B}\left(10, \tfrac{1}{2}\right)$ is the number of heads on day 2
    • ...
    • $X_7 \sim \mathrm{B}\left(10, \tfrac{1}{2}\right)$ is the number of heads on day 7
  • $X_1$ to $X_7$ are independent random variables, each with identical $\mathrm{B}\left(10, \tfrac{1}{2}\right)$ distributions
    • Identical distributions don't make the number of heads on each day identical
  • $X \sim \mathrm{B}\left(10, \tfrac{1}{2}\right)$ is called the population distribution
    • With population mean $\mu = 5$
    • And population variance $\sigma^2 = 2.5$
  • $X_1, \dots, X_7$ is called the random sample of size 7 taken from the population distribution $X$
    • Where the $X_i$ are independent and each has the same distribution as $X$

What is the sample mean?

  • Let $X_1, X_2, \dots, X_n$ be a random sample of size $n$ taken from the population distribution $X$
  • The sample mean is given by $\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$
    • It is not a fixed number
    • Different samples of size n have different sample means
  • From the example before
    • Each day there's a different number of heads
      • Here's one sample: 4, 6, 5, 5, 3, 5, 6
      • This particular sample mean is 4.857...
      • This is close to the population mean of $\mu = 5$ heads
    • A second sample of 7 days would give a different sample mean
    • As would a third sample of 7 days 
      • Generating lots of samples like this will give a distribution of sample means
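
The sample mean quoted above can be checked, and the idea of a distribution of sample means illustrated, with a short simulation. This is only a minimal sketch using Python's standard library; the function name `simulate_sample_mean`, the seed and the choice of 10,000 repeated samples are assumptions made for illustration and are not part of the original note.

```python
import random
import statistics

# The sample quoted in the note: number of heads on each of 7 days
sample = [4, 6, 5, 5, 3, 5, 6]
sample_mean = sum(sample) / len(sample)
print(round(sample_mean, 3))  # 4.857

def simulate_sample_mean(rng, days=7, flips=10, p=0.5):
    """Return the mean number of heads over `days` days of `flips` coin flips."""
    heads_per_day = [
        sum(rng.random() < p for _ in range(flips)) for _ in range(days)
    ]
    return sum(heads_per_day) / days

rng = random.Random(1)  # arbitrary fixed seed so the run is repeatable
sample_means = [simulate_sample_mean(rng) for _ in range(10_000)]

# The simulated sample means should centre near mu = 5,
# with variance near sigma^2 / n = 2.5 / 7
print(statistics.mean(sample_means))
print(statistics.pvariance(sample_means))
```

Each entry of `sample_means` plays the role of one week's sample mean, so the list as a whole approximates the distribution of the sample mean for samples of size 7.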

What is the distribution of the sample mean?

  • If a random sample of size $n$, $X_1, X_2, \dots, X_n$, is taken from a normal population distribution, $X \sim \mathrm{N}(\mu, \sigma^2)$
    • Then the distribution of the sample mean is $\bar{X} \sim \mathrm{N}\left(\mu, \frac{\sigma^2}{n}\right)$
      • Where $\bar{X} = \frac{X_1 + X_2 + \dots + X_n}{n}$
      • And $X_1, \dots, X_n$ are independent
    • The mean of the sample mean distribution is the same as the population mean
    • The variance of the sample mean distribution is the population variance divided by $n$
      • So larger samples are better, as their sample means are closer to $\mu$ (less spread out)
  • This exact normal distribution for the sample mean only holds when the population is normal
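
The mean and variance of the sample mean follow from the standard rules for expectation and variance of sums of independent random variables. A short derivation, using only the symbols defined above, might look like this:

```latex
\begin{align*}
\mathrm{E}(\bar{X}) &= \frac{1}{n}\sum_{i=1}^{n}\mathrm{E}(X_i)
                     = \frac{1}{n}(n\mu) = \mu \\[4pt]
\mathrm{Var}(\bar{X}) &= \frac{1}{n^2}\sum_{i=1}^{n}\mathrm{Var}(X_i)
                       = \frac{1}{n^2}(n\sigma^2) = \frac{\sigma^2}{n}
\end{align*}
```

The variance line relies on $X_1, \dots, X_n$ being independent. Neither line uses the shape of the population distribution; it is the normality of the population that makes $\bar{X}$ exactly normal.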

Central Limit Theorem

What is the Central Limit Theorem (CLT)?

  • The Central Limit Theorem (CLT) says that
    • If $X_1, X_2, \dots, X_n$ is a random sample of $n$ independent observations
    • Taken from any population distribution $X$
    • With population mean $\mu$ and population variance $\sigma^2$
    • Then, provided $n$ is large
    • The sample mean has the approximate normal distribution $\bar{X} \mathrel{\approx\sim} \mathrm{N}\left(\mu, \frac{\sigma^2}{n}\right)$
      • $\approx\sim$ means "is approximately modelled by"
  • This works for any population distribution
    • For example, $X_1$ to $X_{50}$ could be a random sample of size 50 taken from a $\mathrm{Po}(\lambda)$ distribution ($\mu = \lambda$ and $\sigma^2 = \lambda$)
      • The CLT says $\bar{X} \mathrel{\approx\sim} \mathrm{N}\left(\mu, \frac{\sigma^2}{n}\right)$
      • $n = 50$ is a large sample
    • The sample mean has an approximate normal distribution, despite the population distribution being Poisson
  • In the special case when the population distribution is itself normal
    • Then the CLT is exact
      • $\bar{X} \sim \mathrm{N}\left(\mu, \frac{\sigma^2}{n}\right)$
      • No approximation symbol
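
The Poisson example can be explored numerically. The sketch below is illustrative only: the value $\lambda = 3$, the seed, the 5,000 repetitions and the helper name `poisson_draw` (a simple multiplication-based sampler written here because Python's standard library has no Poisson generator) are all assumptions, not part of the original note.

```python
import math
import random
import statistics

def poisson_draw(rng, lam):
    """Draw one Po(lam) value by multiplying uniforms until the product drops below e^(-lam)."""
    threshold = math.exp(-lam)
    k, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return k
        k += 1

lam, n, repeats = 3, 50, 5_000   # lam and repeats are illustrative choices
rng = random.Random(2)           # arbitrary fixed seed

# Each entry is the mean of one random sample of size n = 50 from Po(lam)
sample_means = [
    sum(poisson_draw(rng, lam) for _ in range(n)) / n for _ in range(repeats)
]

clt_sd = math.sqrt(lam / n)      # CLT: standard deviation sqrt(sigma^2 / n)
within_one_sd = sum(abs(m - lam) <= clt_sd for m in sample_means) / repeats

print(statistics.mean(sample_means))       # should be near mu = lam
print(statistics.pvariance(sample_means))  # should be near lam / n
print(within_one_sd)                       # roughly 0.68 for a normal shape
```

If the CLT approximation is reasonable, about 68% of the simulated sample means should lie within one standard deviation of $\mu$, as they would for a normal distribution, even though each individual observation is Poisson.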

How do I find the population mean and variance?

  • To use $\bar{X} \mathrel{\approx\sim} \mathrm{N}\left(\mu, \frac{\sigma^2}{n}\right)$ you need to find $\mu$ and $\sigma^2$ from the question
    • The population mean and variance
  • If the population, $X$, is a discrete random variable, then
    • $\mu = \mathrm{E}(X)$
    • $\sigma^2 = \mathrm{Var}(X)$
  • Use known formulae for standard discrete models
    • For example, if $X \sim \mathrm{Geo}(p)$
      • Then $\mu = \frac{1}{p}$ and $\sigma^2 = \frac{1-p}{p^2}$
      • So $\frac{\sigma^2}{n}$ is $\frac{1-p}{np^2}$
  • Don't forget to divide the population variance by $n$
  • Don't confuse $n$ for sample size here with the "$n$" from a binomial distribution
    • Use a different letter, for example
      • $X \sim \mathrm{B}(m, p)$
      • So $\frac{\sigma^2}{n}$ is $\frac{mp(1-p)}{n}$
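
The two standard models mentioned above can be turned into a small calculation. This is a hedged sketch: the function names `clt_params`, `geometric_clt` and `binomial_clt` and the example values of $p$, $m$ and $n$ are made up for illustration, and exact fractions are used only to keep the arithmetic easy to check by hand.

```python
from fractions import Fraction

def clt_params(mu, sigma2, n):
    """Return (mean, variance) of the approximate normal distribution of the sample mean."""
    return mu, sigma2 / n

def geometric_clt(p, n):
    # X ~ Geo(p): mu = 1/p and sigma^2 = (1 - p) / p^2
    return clt_params(1 / p, (1 - p) / p**2, n)

def binomial_clt(m, p, n):
    # X ~ B(m, p): mu = mp and sigma^2 = mp(1 - p)
    # m is the number of trials; n is the sample size
    return clt_params(m * p, m * p * (1 - p), n)

# Geo(1/4) with a sample of size 50: mean 4, variance 12/50 = 6/25
print(geometric_clt(Fraction(1, 4), 50))

# B(10, 1/2) with a sample of size 7: mean 5, variance 2.5/7 = 5/14
print(binomial_clt(10, Fraction(1, 2), 7))
```

Keeping `m` and `n` as separate arguments in `binomial_clt` mirrors the advice above about not reusing the letter $n$ for two different things.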

How do I know if it's a CLT question?

  • Key phrases to look out for are
    • "Estimate the probability that the mean number of ... is greater than ..."
    • "A random sample $X_1, \dots, X_{50}$ is taken"
  • Look out for samples being large

Exam Tip

  • Don't forget that n must be large for the Central Limit Theorem to hold!

Worked example

Let X represent the value shown when a four-sided spinner is spun, given by the distribution below.

$x$                    0     2     4     8
$\mathrm{P}(X = x)$    0.5   0.2   0.1   0.2

The spinner will be spun 60 times and the values shown on each spin will be recorded.

Estimate the probability that the mean of the values recorded is less than 2.8.

[Worked solution images (clt-1, clt-2) not reproduced]
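
Since the original worked solution is not available as text, here is one possible way to carry out the calculation. It is a sketch of the standard CLT approach (find $\mu$ and $\sigma^2$ from the table, then use $\bar{X} \mathrel{\approx\sim} \mathrm{N}\left(\mu, \frac{\sigma^2}{60}\right)$), not necessarily the author's own working; the variable names are illustrative.

```python
from statistics import NormalDist

values = [0, 2, 4, 8]
probs = [0.5, 0.2, 0.1, 0.2]
n = 60  # number of spins, i.e. the sample size

mu = sum(x * p for x, p in zip(values, probs))        # E(X) = 2.4
e_x2 = sum(x**2 * p for x, p in zip(values, probs))   # E(X^2) = 15.2
sigma2 = e_x2 - mu**2                                 # Var(X) = 9.44

# CLT: the sample mean is approximately N(mu, sigma^2 / n);
# NormalDist takes the standard deviation, not the variance
sample_mean_dist = NormalDist(mu, (sigma2 / n) ** 0.5)
prob = sample_mean_dist.cdf(2.8)                      # P(sample mean < 2.8)

print(round(mu, 2), round(sigma2, 2), round(prob, 3))  # 2.4 9.44 0.843
```

Under this approach the estimated probability that the mean of the 60 recorded values is less than 2.8 comes out at about 0.843.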

Author: Mark

Mark graduated twice from the University of Oxford: once in 2009 with a First in Mathematics, then again in 2013 with a PhD (DPhil) in Mathematics. He has had nine successful years as a secondary school teacher, specialising in A-Level Further Maths and running extension classes for Oxbridge Maths applicants. Alongside his teaching, he has written five internal textbooks, introduced new spiralling school curriculums and trained other Maths teachers through outreach programmes.