Chi Squared Tests for Standard Distributions (Edexcel A Level Further Maths: Further Statistics 1)

Revision Note

Author: Roger

Expertise: Maths

Chi Squared for Discrete Uniform

How do I do a chi-squared test with a discrete uniform distribution?

A chi-squared ( $χ^{2}$ ) goodness of fit test can be used to test whether sample data are consistent with a population that has a discrete uniform distribution
For a random variable $X$ with the discrete uniform distribution
- $X$ can take a finite number $k$ of distinct values
- each value is equally likely
  - $P (X = x) = \frac{1}{k}, \quad x = 1, 2, \ldots, k$
There will never be any parameters to estimate for a discrete uniform goodness of fit test

What are the steps?

STEP 1: Write the hypotheses
- $H_{0}$ : A discrete uniform distribution is a suitable model for Variable X
- $H_{1}$ : A discrete uniform distribution is not a suitable model for Variable X
  - The hypotheses should always be stated in the context of the question
  - Make sure you clearly write what the variable is and don’t just call it 'Variable X'
STEP 2: Calculate the expected frequencies
- each expected frequency is the same
- divide the total frequency $N$ by the number of possible outcomes $k$
STEP 3: Calculate the degrees of freedom for the test
- For $k$ possible outcomes
- degrees of freedom is $ν = k - 1$
STEP 4: Calculate $X^{2}$ using either version of the formula

$X^{2} = \sum_{i = 1}^{k} \frac{{(O_{i} - E_{i})}^{2}}{E_{i}} = \left( \sum_{i = 1}^{k} \frac{{O_{i}}^{2}}{E_{i}} \right) - N$

- Determine the appropriate $χ^{2}$ critical value
  - $χ_{ν}^{2} (α \%)$ is the critical value with $ν$ degrees of freedom for significance level $α$
  - use the 'Percentage Points of the $χ^{2}$ Distribution' table in the exam formula booklet
- Or, alternatively, use a calculator to find the $χ_{ν}^{2}$ p-value
  - This is the probability of obtaining a chi-squared value of $X^{2}$ or more
STEP 5: Decide whether there is evidence to reject the null hypothesis
- Compare the statistic with the critical value you have determined
  - If $X^{2}$ > critical value (or $p < α$ ) then there is sufficient evidence to reject $H_{0}$
  - If $X^{2}$ < critical value (or $p > α$ ) then there is insufficient evidence to reject $H_{0}$
STEP 6: Write your conclusion
- If you reject H₀
  - A discrete uniform distribution is not a suitable model
- If you do not reject H₀
  - A discrete uniform distribution is a suitable model
- Be sure to state your conclusion in the context of the question
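
As a minimal sketch of Steps 2 to 5, the Python below computes $X^{2}$ with both versions of the formula and compares it with a critical value read from the tables. The function name `chi_squared_statistic` and the frequencies (a six-sided die rolled 60 times) are hypothetical and chosen only for illustration.

```python
def chi_squared_statistic(observed, expected):
    """Return X^2 computed with both versions of the formula."""
    total = sum(observed)  # N, the total frequency
    direct = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    shortcut = sum(o ** 2 / e for o, e in zip(observed, expected)) - total
    return direct, shortcut

# Hypothetical data: a six-sided die rolled 60 times
observed = [8, 12, 9, 11, 13, 7]
k = len(observed)
expected = [sum(observed) / k] * k    # STEP 2: N / k for every outcome
dof = k - 1                           # STEP 3: no parameters are estimated

direct, shortcut = chi_squared_statistic(observed, expected)
critical_value = 11.070               # 5% point for 5 degrees of freedom, from tables
reject_h0 = direct > critical_value   # STEP 5
print(round(direct, 3), round(shortcut, 3), reject_h0)
```

Both versions of the formula give the same value here because the expected frequencies add up to $N$. The second version is often quicker by hand.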

Worked example

A car salesperson is interested in how her sales are distributed and records her sales results over a period of six weeks. The data is shown in the table.

Week	1	2	3	4	5	6
Number of sales	15	17	11	21	14	12

Test, at the 5% significance level, whether or not the observed frequencies could be modelled by a discrete uniform distribution.
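
One way to check the arithmetic for this example is sketched below in Python. It follows the six steps above; the variable names are illustrative, and the critical value is the 5% point for 5 degrees of freedom taken from the tables.

```python
# Observed weekly sales from the table
observed = [15, 17, 11, 21, 14, 12]

total = sum(observed)             # N = 90
k = len(observed)                 # 6 weeks
expected = total / k              # 15 sales expected in each week
dof = k - 1                       # 5 degrees of freedom

x_squared = sum((o - expected) ** 2 / expected for o in observed)

critical_value = 11.070           # 5% point for 5 degrees of freedom, from tables
reject_h0 = x_squared > critical_value
print(f"X^2 = {x_squared:.2f}, reject H0: {reject_h0}")
```

Under these steps $X^{2} = 4.4$, which is less than 11.070, so there is insufficient evidence at the 5% level to reject $H_{0}$. A discrete uniform distribution is a suitable model for the number of sales per week.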

Chi Squared for Binomial

How do I do a chi-squared test with a binomial distribution?

A chi-squared ( $χ^{2}$ ) goodness of fit test can be used to test whether sample data are consistent with a population that has a binomial distribution
For a random variable $X$ to have a binomial distribution:
- the number of trials ( $n$ ) must be fixed in each observation
- the trials must be independent
- each trial can have only two outcomes (success and failure)
- the probability of success ( $p$ ) must be constant
A question may give a precise binomial distribution $B (n, p)$ to test
- with an assumed value for $p$
Or you may be asked to test whether a binomial distribution is suitable without being given an assumed value for $p$
- In this case you will have to calculate an estimate for the value of $p$ for the binomial distribution
- For $N$ observations of the variable

$p = \frac{\text{total number of successes}}{\text{number of trials} \times N} = \frac{\sum (x \times f)}{n \times N}$

  - $f$ is the frequency for each value of $x$ (these are given in a table in the question)
  - $n$ is from $B (n, p)$ and $N$ is the sum of the observed frequencies (the total number of observations)
- Remember that estimating this parameter uses up one degree of freedom
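
As an illustration of the estimate, the sketch below applies the formula to the frequency table from the binomial worked example in this section (three trials per observation). That example is given $p = 0.2$ in the question, so the estimate here is for illustration only; the variable names are not from the original note.

```python
n = 3                                   # trials per observation, from B(n, p)
values = [0, 1, 2, 3]                   # x: number of successes
frequencies = [490, 384, 111, 15]       # f: observed frequency of each x

N = sum(frequencies)                    # total number of observations
total_successes = sum(x * f for x, f in zip(values, frequencies))
p_estimate = total_successes / (n * N)  # sum(x * f) / (n * N)
print(p_estimate)
```

Here the total number of successes is 651 out of $3 \times 1000$ trials, giving an estimate of 0.217.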

What are the steps?

STEP 1: Write the hypotheses
- $H_{0}$ : A binomial distribution is a suitable model for Variable X
- $H_{1}$ : A binomial distribution is not a suitable model for Variable X
  - The hypotheses should always be stated in the context of the question
  - Make sure you clearly write what the variable is and don’t just call it 'Variable X'
  - If you are given the assumed value of $p$ then state the precise distribution $B (n, p)$
STEP 2: Calculate the expected frequencies
- If you were not given the assumed value of $p$ then you will first have to estimate it using the observed data
- Find the probability of the outcome using the binomial distribution $P (X = x)$
- Multiply the probability by the total number of observations $P (X = x) \times N$
- If any expected frequency is less than 5, combine adjacent rows/columns (both observed and expected frequencies) until every expected frequency is at least 5
STEP 3: Calculate the degrees of freedom for the test
- For $k$ outcomes (after combining expected values if needed)
- Degrees of freedom is
  - $ν = k - 1$ if you were given the assumed value of $p$ in the question
  - $ν = k - 2$ if you had to estimate the value of $p$ using data in the question

STEP 4: Calculate $X^{2}$ using either version of the formula

$X^{2} = \sum_{i = 1}^{k} \frac{{(O_{i} - E_{i})}^{2}}{E_{i}} = \left( \sum_{i = 1}^{k} \frac{{O_{i}}^{2}}{E_{i}} \right) - N$

- Determine the appropriate $χ^{2}$ critical value
  - $χ_{ν}^{2} (α \%)$ is the critical value with $ν$ degrees of freedom for significance level $α$
  - use the 'Percentage Points of the $χ^{2}$ Distribution' table in the exam formula booklet
- Or, alternatively, use a calculator to find the $χ_{ν}^{2}$ p-value
  - This is the probability of obtaining a chi-squared value of $X^{2}$ or more
STEP 5: Decide whether there is evidence to reject the null hypothesis
- Compare the statistic with the critical value you have determined
  - If $X^{2}$ > critical value (or $p < α$ ) then there is sufficient evidence to reject $H_{0}$
  - If $X^{2}$ < critical value (or $p > α$ ) then there is insufficient evidence to reject $H_{0}$
STEP 6: Write your conclusion
- If you reject H₀
  - A binomial distribution is not a suitable model
- If you do not reject H₀
  - A binomial distribution is a suitable model
- Be sure to state your conclusion in the context of the question
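
Combining classes in Step 2 changes the value of $k$ used in Step 3. A minimal sketch of one way to do this is shown below. The helper `pool_small_classes` and the frequencies are hypothetical, and the helper only merges classes at the two ends of the table, which is where small expected frequencies usually appear.

```python
def pool_small_classes(observed, expected, minimum=5):
    """Merge end classes into their neighbours until each end class
    has an expected frequency of at least `minimum`."""
    obs, exp = list(observed), list(expected)
    # Merge the last class into the one before it while it is too small
    while len(exp) > 1 and exp[-1] < minimum:
        last_o, last_e = obs.pop(), exp.pop()
        obs[-1] += last_o
        exp[-1] += last_e
    # Merge the first class into the one after it while it is too small
    while len(exp) > 1 and exp[0] < minimum:
        first_o, first_e = obs.pop(0), exp.pop(0)
        obs[0] += first_o
        exp[0] += first_e
    return obs, exp

# Hypothetical frequencies for six classes
observed = [12, 30, 35, 15, 6, 2]
expected = [10.2, 31.5, 33.1, 17.4, 5.6, 2.2]

obs, exp = pool_small_classes(observed, expected)
k = len(obs)          # number of classes after combining
dof = k - 2           # assuming p was estimated from the data
print(obs, exp, dof)
```

In this made-up table the last class has an expected frequency of 2.2, so it is merged with the class before it. That leaves 5 classes and, with $p$ estimated, 3 degrees of freedom.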

Worked example

A stage in a video game has three boss battles. 1000 people try this stage of the video game and the number of bosses defeated by each player is recorded.

Number of bosses defeated	0	1	2	3
Frequency	490	384	111	15

It is suggested that the distribution can be modelled by a binomial distribution with $p = 0.2$ .

Test, at the 5% significance level, whether or not a binomial distribution is a good model.
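
The calculation for this example can be sketched in Python as follows. It assumes $n = 3$ (one trial per boss battle) with the given $p = 0.2$; the variable names are illustrative and the critical value is the 5% point for 3 degrees of freedom from the tables.

```python
from math import comb

n, p = 3, 0.2                           # B(3, 0.2), as suggested in the question
observed = [490, 384, 111, 15]          # bosses defeated: 0, 1, 2, 3
N = sum(observed)                       # 1000 players

# STEP 2: expected frequency = N * P(X = x)
expected = [N * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)]
# Every expected frequency is at least 5, so no classes are combined

dof = len(observed) - 1                 # STEP 3: p was given, so nu = k - 1 = 3
x_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 7.815                  # 5% point for 3 degrees of freedom, from tables
reject_h0 = x_squared > critical_value
print([round(e, 1) for e in expected], round(x_squared, 3), reject_h0)
```

The expected frequencies are 512, 384, 96 and 8, giving $X^{2} \approx 9.41$. This is greater than 7.815, so there is sufficient evidence at the 5% level to reject $H_{0}$: $B (3, 0.2)$ is not a suitable model for the number of bosses defeated.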

Chi Squared for Poisson

How do I do a chi-squared test with a Poisson distribution?

A chi-squared ( $χ^{2}$ ) goodness of fit test can be used to test whether sample data are consistent with a population that has a Poisson distribution
For a random variable $X$ to have a Poisson distribution:
- events must occur independently of each other
- events must occur singly and randomly
- events must occur at a constant rate (in space or time)
- the mean and the variance must be equal
You will either be given a precise Poisson distribution $Po (λ)$ to test
- with an assumed value for $λ$
Or you will be asked to test whether a Poisson distribution is suitable without being given an assumed value for $λ$
- In this case you will have to calculate an estimate for the value of $λ$ for the Poisson distribution
- The estimate for $N$ observations is just the mean of the observed sample:

$λ = \frac{\sum (x \times f)}{N}$

  - $f$ is the frequency for each value of $x$ (these are given in a table in the question)
  - $N$ is the sum of the observed frequencies (the total number of observations)
- Remember that estimating this parameter uses up one degree of freedom

What are the steps?

STEP 1: Write the hypotheses
- $H_{0}$ : A Poisson distribution is a suitable model for Variable X
- $H_{1}$ : A Poisson distribution is not a suitable model for Variable X
  - The hypotheses should always be stated in the context of the question
  - Make sure you clearly write what the variable is and don’t just call it 'Variable X'
  - If you are given the assumed value of $λ$ then state the precise distribution $Po (λ)$
STEP 2: Calculate the expected frequencies
- If you were not given the assumed value of $λ$ then you will first have to estimate it using the observed data
- Find the probability of the outcome using the Poisson distribution $P (X = x)$
- Multiply the probability by the total number of observations $P (X = x) \times N$
- Poisson variables start on $X = 0$ and go up to infinity
  - If $a$ is the smallest value in the table then use $P (X \leq a)$ for that column
  - If $b$ is the largest value in the table then use $P (X \geq b)$ for that column, so that the probabilities sum to 1
    - $P (X \geq b) = 1 - P (X \leq b - 1)$
- If any expected frequency is less than 5, combine adjacent rows/columns (both observed and expected frequencies) until every expected frequency is at least 5
STEP 3: Calculate the degrees of freedom for the test
- For $k$ outcomes (after combining expected values if needed)
- Degrees of freedom is
  - $ν = k - 1$ if you were given the assumed value of $λ$
  - $ν = k - 2$ if you had to estimate the value of $λ$

STEP 4: Calculate $X^{2}$ using either version of the formula

$X^{2} = \sum_{i = 1}^{k} \frac{{(O_{i} - E_{i})}^{2}}{E_{i}} = \left( \sum_{i = 1}^{k} \frac{{O_{i}}^{2}}{E_{i}} \right) - N$

- Determine the appropriate $χ^{2}$ critical value
  - $χ_{ν}^{2} (α \%)$ is the critical value with $ν$ degrees of freedom for significance level $α$
  - use the 'Percentage Points of the $χ^{2}$ Distribution' table in the exam formula booklet
- Or, alternatively, use a calculator to find the $χ_{ν}^{2}$ p-value
  - This is the probability of obtaining a chi-squared value of $X^{2}$ or more
STEP 5: Decide whether there is evidence to reject the null hypothesis
- Compare the statistic with the critical value you have determined
  - If $X^{2}$ > critical value (or $p < α$ ) then there is sufficient evidence to reject $H_{0}$
  - If $X^{2}$ < critical value (or $p > α$ ) then there is insufficient evidence to reject $H_{0}$
STEP 6: Write your conclusion
- If you reject H₀
  - A Poisson distribution is not a suitable model
- If you do not reject H₀
  - A Poisson distribution is a suitable model
- Be sure to state your conclusion in the context of the question

Worked example

A parent claims that the number of messages they receive from their teenage child within an hour can be modelled by a Poisson distribution. The parent collects data from 100 one-hour periods and records the observed frequencies of the messages received from the child. The parent calculates the mean number of messages received from the sample and uses this to calculate the expected frequencies if a Poisson model is used.

Number of messages	Observed frequency	Expected frequency
0	9	7.28
1	16	$a$
2	23	24.99
3	22	21.82
4	16	14.29
5	14	7.49
6 or more	0	$b$

A goodness of fit test at the 10% significance level is to be used to test the parent’s claim.

Write down null and alternative hypotheses to test the parent’s claim.

Show that the mean number of messages received per hour for the sample is 2.62.

Calculate the values of $a$ and $b$, giving your answers to 2 decimal places.
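
A minimal Python sketch of these two calculations is given below. It takes the sample mean as the estimate of $λ$, then finds $a$ from $P (X = 1)$ and $b$ from $1 - P (X \leq 5)$. The helper `poisson_pmf` and the variable names are illustrative only.

```python
from math import exp, factorial

messages = [0, 1, 2, 3, 4, 5]           # the "6 or more" class has frequency 0
observed = [9, 16, 23, 22, 16, 14]
N = 100                                 # number of one-hour periods

# Sample mean: sum(x * f) / N
mean = sum(x * f for x, f in zip(messages, observed)) / N

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

a = N * poisson_pmf(1, mean)
# b is the "6 or more" class: N * (1 - P(X <= 5))
b = N * (1 - sum(poisson_pmf(x, mean) for x in range(6)))
print(round(mean, 2), round(a, 2), round(b, 2))
```

This gives a mean of 2.62, with $a = 19.07$ and $b = 5.05$ to 2 decimal places.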

Perform the hypothesis test.
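
The test itself can be sketched in Python as below. The expected frequencies are recomputed from $λ = 2.62$ rather than taken from the rounded table, and two degrees of freedom are lost because $λ$ was estimated from the sample. The variable names are illustrative; the critical value is the 10% point for 5 degrees of freedom from the tables.

```python
from math import exp, factorial

lam = 2.62                                    # sample mean, so lambda was estimated
N = 100
observed = [9, 16, 23, 22, 16, 14, 0]         # classes 0, 1, 2, 3, 4, 5 and "6 or more"

probs = [exp(-lam) * lam**x / factorial(x) for x in range(6)]
probs.append(1 - sum(probs))                  # P(X >= 6)
expected = [N * pr for pr in probs]
# The smallest expected frequency is about 5.05, so no classes are combined

dof = len(observed) - 2                       # 7 classes, lambda estimated: nu = 5
x_squared = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 9.236                        # 10% point for 5 degrees of freedom, from tables
reject_h0 = x_squared > critical_value
print(round(x_squared, 2), reject_h0)
```

Here $X^{2}$ is roughly 12, which is greater than 9.236, so there is sufficient evidence at the 10% level to reject $H_{0}$. A Poisson distribution is not a suitable model for the number of messages received per hour.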

Chi Squared for Geometric

How do I do a chi-squared test with a geometric distribution?

A chi-squared ( $χ^{2}$ ) goodness of fit test can be used to test whether sample data are consistent with a population that has a geometric distribution
For a random variable $X$ to have a geometric distribution:
- the trials must be independent
- each trial can have only two outcomes (success and failure)
- trials are repeated until the first success
- the probability of success ( $p$ ) must be constant
- the value of the variable is the number of trials until the first success
A question may give a precise geometric distribution $Geo (p)$ to test
- with an assumed value for $p$
Or you may be asked to test whether a geometric distribution is suitable without being given an assumed value for $p$
- In this case you will have to calculate an estimate for the value of $p$ for the geometric distribution
- For $N$ observations of the variable

$p = \frac{\text{total number of successes}}{\text{total number of trials}} = \frac{N}{\sum (x \times f)}$

  - $f$ is the frequency for each value of $x$ (these are given in a table in the question)
  - $N$ is the sum of the observed frequencies (the total number of observations)
- Remember that estimating this parameter uses up one degree of freedom

What are the steps?

STEP 1: Write the hypotheses
- $H_{0}$ : A geometric distribution is a suitable model for Variable X
- $H_{1}$ : A geometric distribution is not a suitable model for Variable X
  - The hypotheses should always be stated in the context of the question
  - Make sure you clearly write what the variable is and don’t just call it 'Variable X'
  - If you are given the assumed value of $p$ then state the precise distribution $Geo (p)$
STEP 2: Calculate the expected frequencies
- If you were not given the assumed value of $p$ then you will first have to estimate it using the observed data
- Find the probability of the outcome using the geometric distribution $P (X = x)$
- Multiply the probability by the total number of observations $P (X = x) \times N$
- Geometric variables start on 1 and go up to infinity
  - If $a$ is the smallest value in the table then use $P (X \leq a)$ for that column
  - If $b$ is the largest value in the table then use $P (X \geq b)$ for that column, so that the probabilities sum to 1
    - The formulae $P (X \leq x) = 1 - {(1 - p)}^{x}$ and $P (X \geq x) = {(1 - p)}^{x - 1}$ can help
- If any expected frequency is less than 5, combine adjacent rows/columns (both observed and expected frequencies) until every expected frequency is at least 5
STEP 3: Calculate the degrees of freedom for the test
- For $k$ outcomes (after combining expected values if needed)
- Degrees of freedom is
  - $ν = k - 1$ if you were given the assumed value of $p$
  - $ν = k - 2$ if you had to estimate the value of $p$

STEP 4: Calculate $X^{2}$ using either version of the formula

$X^{2} = \sum_{i = 1}^{k} \frac{{(O_{i} - E_{i})}^{2}}{E_{i}} = \left( \sum_{i = 1}^{k} \frac{{O_{i}}^{2}}{E_{i}} \right) - N$

- Determine the appropriate $χ^{2}$ critical value
  - $χ_{ν}^{2} (α \%)$ is the critical value with $ν$ degrees of freedom for significance level $α$
  - use the 'Percentage Points of the $χ^{2}$ Distribution' table in the exam formula booklet
- Or, alternatively, use a calculator to find the $χ_{ν}^{2}$ p-value
  - This is the probability of obtaining a chi-squared value of $X^{2}$ or more
STEP 5: Decide whether there is evidence to reject the null hypothesis
- Compare the statistic with the critical value you have determined
  - If $X^{2}$ > critical value (or $p < α$ ) then there is sufficient evidence to reject $H_{0}$
  - If $X^{2}$ < critical value (or $p > α$ ) then there is insufficient evidence to reject $H_{0}$
STEP 6: Write your conclusion
- If you reject H₀
  - A geometric distribution is not a suitable model
- If you do not reject H₀
  - A geometric distribution is a suitable model
- Be sure to state your conclusion in the context of the question

Worked example

Mercurio is a door-to-door salesman. Over the course of a week he records the number of doors he needs to knock on each time before getting an answer.

Number of doors	1	2	3	4	5	Total
Frequency	205	61	22	8	4	300

Mercurio thinks he can model the number of doors he needs to knock on each time using a geometric random variable $X \sim \text{Geo} (p)$.

Using the observed frequencies, find an estimate for $p$.
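
A short Python sketch of this estimate follows. Each recorded observation ends in exactly one success, so the total number of successes is $N$ and the total number of trials is $\sum (x \times f)$. The variable names are illustrative.

```python
doors = [1, 2, 3, 4, 5]
frequencies = [205, 61, 22, 8, 4]

N = sum(frequencies)                                            # total number of successes
total_trials = sum(x * f for x, f in zip(doors, frequencies))   # total doors knocked on
p_estimate = N / total_trials
print(round(p_estimate, 4))
```

This gives $p = \frac{300}{445} \approx 0.6742$.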

Conduct a goodness of fit test at the 10% significance level, and say whether a geometric random variable is a good model for the data.
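
One possible route through this test is sketched below in Python. It assumes the last class is treated as "5 or more" so that the probabilities sum to 1, and it combines that class with the class for 4 doors because its expected frequency is below 5. These choices and the variable names are assumptions made for illustration. The critical value is the 10% point for 2 degrees of freedom from the tables.

```python
observed = [205, 61, 22, 8, 4]                # doors: 1, 2, 3, 4, 5
N = sum(observed)                             # 300
# Estimate of p: N / sum(x * f) = 300 / 445
p = N / sum(x * f for x, f in enumerate(observed, start=1))
q = 1 - p

# Expected frequencies, treating the last class as "5 or more": P(X >= 5) = q^4
expected = [N * q**(x - 1) * p for x in range(1, 5)] + [N * q**4]

# The "5 or more" class has an expected frequency below 5 (about 3.4),
# so combine it with the class for 4 doors
observed_pooled = observed[:3] + [observed[3] + observed[4]]
expected_pooled = expected[:3] + [expected[3] + expected[4]]

dof = len(observed_pooled) - 2                # 4 classes, p estimated: nu = 2
x_squared = sum((o - e) ** 2 / e for o, e in zip(observed_pooled, expected_pooled))

critical_value = 4.605                        # 10% point for 2 degrees of freedom, from tables
reject_h0 = x_squared > critical_value
print(round(x_squared, 3), reject_h0)
```

Under these assumptions $X^{2}$ is approximately 0.67, which is less than 4.605, so there is insufficient evidence at the 10% level to reject $H_{0}$. A geometric distribution is a suitable model for the number of doors Mercurio needs to knock on.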
