Chapter 15: Probability Distribution
Master probability distributions, random variables, binomial and normal distributions with comprehensive SPM exam strategies.
Overview
A probability distribution describes how probability is spread over the possible values of a random variable. This chapter covers random variables, discrete and continuous distributions, the binomial distribution, the normal distribution, and their applications in real-world scenarios. Understanding probability distributions is essential for statistical inference, decision-making, and analyzing random phenomena in science, engineering, and the social sciences.
Learning Objectives
After completing this chapter, you will be able to:
- Define and classify random variables
- Understand probability mass and density functions
- Apply binomial distribution to binary outcome scenarios
- Use normal distribution for continuous random variables
- Calculate probabilities and parameters for different distributions
- Apply these concepts to real-world problems
Key Concepts
15.1 Random Variables
Definition of Random Variables
A random variable is a numerical quantity whose value is determined by the outcome of a random experiment.
Types of Random Variables:
- Discrete Random Variable: Takes values from a countable set, usually whole numbers (e.g. the number of heads in 5 coin flips)
- Continuous Random Variable: Takes any value in an interval (e.g. a person's height)
Probability Distribution
For a random variable X, the probability distribution is a function that gives the probability of each possible value.
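Discrete Case: P(X = x) = p(x), where p(x) ≥ 0 and Σp(x) = 1 over all possible values x
Continuous Case: P(a ≤ X ≤ b) = ∫f(x)dx from a to b, where f(x) ≥ 0 and the total area under f is 1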
Cumulative Distribution Function
The cumulative distribution function (CDF) gives the probability that the random variable takes a value less than or equal to x.
General Form: F(x) = P(X ≤ x)
15.2 Binomial Distribution
Definition of Binomial Distribution
The binomial distribution describes the number of successes in a fixed number of independent trials, each with the same probability of success.
Parameters:
- n: Number of trials
- p: Probability of success on each trial
- q = 1-p: Probability of failure on each trial
Probability Mass Function: P(X = k) = C(n,k)pᵏqⁿ⁻ᵏ
Where: k = 0, 1, 2, ..., n and C(n,k) = n!/(k!(n-k)!)
Mean and Variance of Binomial Distribution
- Mean (Expected Value): μ = np
- Variance: σ² = npq
- Standard Deviation: σ = √(npq)
Conditions for Binomial Distribution
- Fixed number of trials (n)
- Independent trials
- Constant probability of success (p)
- Binary outcomes (success/failure)
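The PMF and the mean/variance formulas in this section can be sketched in a few lines of Python. This is a minimal illustration using only the standard library (Python 3.8+ for `math.comb`); the names `binomial_pmf` and `binomial_summary` are illustrative choices, not part of any library.

```python
from math import comb, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p), using C(n,k) * p^k * q^(n-k)."""
    q = 1 - p
    return comb(n, k) * p**k * q**(n - k)


def binomial_summary(n: int, p: float) -> tuple:
    """Return (mean, variance, standard deviation) of B(n, p)."""
    q = 1 - p
    variance = n * p * q
    return n * p, variance, sqrt(variance)


# Illustrative parameters: 5 flips of a fair coin.
n, p = 5, 0.5
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

for k, prob in enumerate(pmf):
    print(f"P(X = {k}) = {prob:.5f}")
print("sum of probabilities:", sum(pmf))
print("mean, variance, sd:", binomial_summary(n, p))
```

Listing the whole PMF makes it easy to confirm that the probabilities add up to 1, which is a quick sanity check on any hand calculation.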
15.3 Normal Distribution
Definition of Normal Distribution
The normal distribution (or Gaussian distribution) is a continuous probability distribution that is symmetric and bell-shaped.
Probability Density Function: f(x) = (1/(σ√(2π)))e^[-½((x-μ)/σ)²]
Where:
- μ: Mean (location parameter)
- σ: Standard deviation (scale parameter)
Properties of Normal Distribution
- Symmetric about the mean μ
- Bell-shaped curve with peak at μ
- Mean = Median = Mode
- Total area under curve = 1
- Inflection points at μ ± σ
Standard Normal Distribution
The standard normal distribution has μ = 0 and σ = 1.
Z-score transformation: z = (x - μ)/σ
Standard normal density: φ(z) = (1/√(2π))e^(-z²/2)
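Standardizing does not change a probability: P(X ≤ x) for a normal X equals P(Z ≤ z) for the matching z-score. A short sketch using `statistics.NormalDist` from the Python standard library (3.8+); the parameters μ = 100, σ = 15 and the helper name `z_score` are illustrative assumptions.

```python
from statistics import NormalDist


def z_score(x: float, mu: float, sigma: float) -> float:
    """Standardize x: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


standard = NormalDist()                # standard normal: mu = 0, sigma = 1
scores = NormalDist(mu=100, sigma=15)  # illustrative normal distribution

x = 130
z = z_score(x, scores.mean, scores.stdev)

print("z-score:", z)
print("P(X <= 130) from N(100, 15):", round(scores.cdf(x), 4))
print("P(Z <= z)   from N(0, 1):   ", round(standard.cdf(z), 4))
```

Both printed probabilities agree, which is why a single standard normal table is enough for every normal distribution.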
Important Formulas and Methods
Key Distribution Formulas
| Distribution | Parameters | Mean | Variance | PMF/PDF |
|---|---|---|---|---|
| Binomial | n, p | np | npq | C(n,k)pᵏqⁿ⁻ᵏ |
| Normal | μ, σ | μ | σ² | (1/(σ√(2π)))e^[-½((x-μ)/σ)²] |
Probability Calculation Methods
For Binomial Distribution:
- Identify n, p, and k
- Use binomial formula: P(X = k) = C(n,k)pᵏqⁿ⁻ᵏ
- For cumulative probabilities: P(X ≤ k) = ΣC(n,i)pⁱqⁿ⁻ⁱ from i=0 to k
For Normal Distribution:
- Identify μ and σ
- Convert to z-score: z = (x - μ)/σ
- Use standard normal table
- For intervals: P(a < X < b) = Φ((b-μ)/σ) - Φ((a-μ)/σ)
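The two calculation methods above can be sketched as small helper functions. This assumes Python 3.8+ and replaces the printed standard normal table with `statistics.NormalDist().cdf` as Φ; the function names are illustrative.

```python
from math import comb
from statistics import NormalDist


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p): sum the PMF from i = 0 to k."""
    q = 1 - p
    return sum(comb(n, i) * p**i * q**(n - i) for i in range(k + 1))


def normal_interval(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) for normal X, as Phi((b-mu)/sigma) - Phi((a-mu)/sigma)."""
    phi = NormalDist().cdf  # standard normal CDF, in place of a table lookup
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


print("P(X <= 2), X ~ B(5, 0.5):", binomial_cdf(2, 5, 0.5))
print("P(40 < X < 60), X ~ N(50, 10):", round(normal_interval(40, 60, 50, 10), 4))
```

Because the software CDF is not rounded to two-decimal z-values, its results can differ from table-based answers in the fourth decimal place.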
Solved Examples
Example 1: Binomial Distribution Basics
A fair coin is flipped 5 times. Find the probability of getting: a) Exactly 3 heads b) At least 3 heads c) Exactly 2 heads
Solution:
This is a binomial distribution with n = 5, p = 0.5, q = 0.5
a) P(X = 3) = C(5,3)(0.5)³(0.5)² = 10 × 0.125 × 0.25 = 0.3125
b) P(X ≥ 3) = P(X = 3) + P(X = 4) + P(X = 5) = C(5,3)(0.5)⁵ + C(5,4)(0.5)⁵ + C(5,5)(0.5)⁵ = (10 + 5 + 1) × 0.03125 = 16 × 0.03125 = 0.5
c) P(X = 2) = C(5,2)(0.5)²(0.5)³ = 10 × 0.25 × 0.125 = 0.3125
Example 2: Binomial Distribution with Applications
A factory produces light bulbs with a 2% defect rate. If 20 bulbs are randomly selected, find: a) The probability that exactly 2 are defective b) The probability that at least 1 is defective c) The expected number and standard deviation
Solution:
Binomial distribution with n = 20, p = 0.02, q = 0.98
a) P(X = 2) = C(20,2)(0.02)²(0.98)¹⁸ = 190 × 0.0004 × 0.6951 ≈ 190 × 0.00027804 ≈ 0.0528
b) P(X ≥ 1) = 1 - P(X = 0) = 1 - C(20,0)(0.02)⁰(0.98)²⁰ = 1 - 1 × 1 × (0.98)²⁰ = 1 - 0.6676 = 0.3324
c) Expected number: μ = np = 20 × 0.02 = 0.4; standard deviation: σ = √(npq) = √(20 × 0.02 × 0.98) = √0.392 ≈ 0.626
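Powers such as (0.98)¹⁸ are easy to mis-key on a calculator, so it can help to recompute the three parts of this example directly. A minimal sketch, assuming Python 3.8+; the variable names are illustrative.

```python
from math import comb, sqrt


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


n, p = 20, 0.02  # 20 bulbs, 2% defect rate

p_exactly_2 = binomial_pmf(2, n, p)       # part a
p_at_least_1 = 1 - binomial_pmf(0, n, p)  # part b, via the complement rule
mean = n * p                              # part c
sd = sqrt(n * p * (1 - p))

print(f"P(X = 2)  = {p_exactly_2:.4f}")
print(f"P(X >= 1) = {p_at_least_1:.4f}")
print(f"mean = {mean:.3f}, sd = {sd:.3f}")
```

The complement rule in part b avoids summing twenty separate terms: only P(X = 0) needs to be evaluated.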
Example 3: Normal Distribution Basics
A normal distribution has mean μ = 50 and standard deviation σ = 10. Find: a) P(X < 45) b) P(40 < X < 60) c) P(X > 70)
Solution:
Convert to z-scores and use standard normal table.
a) P(X < 45): z = (45 - 50)/10 = -0.5 P(Z < -0.5) = 0.3085
b) P(40 < X < 60): z₁ = (40 - 50)/10 = -1.0, z₂ = (60 - 50)/10 = 1.0, so P(-1.0 < Z < 1.0) = P(Z < 1.0) - P(Z < -1.0) = 0.8413 - 0.1587 = 0.6826
c) P(X > 70): z = (70 - 50)/10 = 2.0 P(Z > 2.0) = 1 - P(Z < 2.0) = 1 - 0.9772 = 0.0228
Example 4: Normal Distribution Applications
The heights of adult males in a population are normally distributed with mean 175 cm and standard deviation 7 cm. Find: a) The probability that a randomly selected male is taller than 185 cm b) The height that 95% of males are shorter than c) The range that contains the middle 68% of heights
Solution:
Normal distribution with μ = 175, σ = 7
a) P(X > 185): z = (185 - 175)/7 ≈ 1.4286 P(Z > 1.4286) ≈ 1 - 0.9236 = 0.0764
b) Find height h such that P(X < h) = 0.95 z = 1.645 (from standard normal table for 95%) h = μ + zσ = 175 + 1.645 × 7 ≈ 175 + 11.515 = 186.515 cm
c) Middle 68% corresponds to z = ±1 Range: μ ± σ = 175 ± 7 = 168 cm to 182 cm
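Part b of this example runs the table "backwards": it starts from a probability and looks for a value. In software that step is the inverse CDF. A sketch with `statistics.NormalDist` (Python 3.8+), under the same μ = 175 and σ = 7; variable names are illustrative.

```python
from statistics import NormalDist

heights = NormalDist(mu=175, sigma=7)  # parameters from the example

# a) Upper-tail probability P(X > 185).
p_taller_185 = 1 - heights.cdf(185)

# b) Height below which 95% of males fall (inverse CDF at 0.95).
h_95 = heights.inv_cdf(0.95)

# c) Probability contained in the range mu - sigma to mu + sigma.
p_within_1sd = heights.cdf(175 + 7) - heights.cdf(175 - 7)

print(f"P(X > 185)         = {p_taller_185:.4f}")
print(f"95th percentile    = {h_95:.2f} cm")
print(f"P(168 < X < 182)   = {p_within_1sd:.4f}")
```

Table-based answers round z to two decimal places, so a difference in the fourth decimal place between the hand calculation and this output is expected.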
Example 5: Binomial Probability Calculations
A multiple-choice test has 10 questions, each with 4 options. A student guesses all answers. Find: a) The probability of getting exactly 3 correct answers b) The probability of getting at least 5 correct answers c) The most likely number of correct answers
Solution:
Binomial distribution with n = 10, p = 0.25, q = 0.75
a) P(X = 3) = C(10,3)(0.25)³(0.75)⁷ = 120 × 0.015625 × 0.13348 ≈ 120 × 0.0020857 ≈ 0.2503
b) P(X ≥ 5) = P(X = 5) + P(X = 6) + P(X = 7) + P(X = 8) + P(X = 9) + P(X = 10) = C(10,5)(0.25)⁵(0.75)⁵ + C(10,6)(0.25)⁶(0.75)⁴ + ... + C(10,10)(0.25)¹⁰(0.75)⁰ ≈ 0.0584 + 0.0162 + 0.0031 + 0.0004 + 0.00003 + 0.000001 ≈ 0.0781
c) The mode is the most likely value. Check the values around np = 10 × 0.25 = 2.5: P(X = 2) = C(10,2)(0.25)²(0.75)⁸ ≈ 45 × 0.0625 × 0.1001 ≈ 0.2816 and P(X = 3) ≈ 0.2503 (from part a). Since P(X = 2) > P(X = 3), and P(X = 1) = C(10,1)(0.25)(0.75)⁹ ≈ 0.1877 is smaller still, the mode is 2.
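With only 11 possible outcomes, the whole distribution can be tabulated, which gives the tail sum and the mode without choosing which neighbours of np to compare. A minimal sketch assuming Python 3.8+; names are illustrative.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


n, p = 10, 0.25  # 10 questions, 1-in-4 chance of guessing correctly

pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

p_exactly_3 = pmf[3]                               # part a
p_at_least_5 = sum(pmf[5:])                        # part b: k = 5, ..., 10
mode = max(range(n + 1), key=pmf.__getitem__)      # part c: k with largest P(X = k)

for k, prob in enumerate(pmf):
    print(f"P(X = {k:2d}) = {prob:.6f}")
print(f"P(X = 3)  = {p_exactly_3:.4f}")
print(f"P(X >= 5) = {p_at_least_5:.4f}")
print("mode =", mode)
```

Printing every term of the tail makes it straightforward to check that no term has been dropped from a sum such as P(X ≥ 5).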
Example 6: Normal Distribution with Percentiles
IQ scores are normally distributed with mean 100 and standard deviation 15. Find: a) The IQ score that corresponds to the 90th percentile b) The percentage of people with IQ scores between 85 and 115 c) The range that contains the middle 95% of IQ scores
Solution:
Normal distribution with μ = 100, σ = 15
a) 90th percentile: z = 1.282 IQ = 100 + 1.282 × 15 ≈ 100 + 19.23 = 119.23
b) P(85 < X < 115): z₁ = (85 - 100)/15 = -1.0, z₂ = (115 - 100)/15 = 1.0, so P(-1.0 < Z < 1.0) = 0.6826 (68.26%)
c) Middle 95%: z = ±1.96 Range: 100 ± 1.96 × 15 = 100 ± 29.4 = 70.6 to 129.4
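Percentiles and "middle x%" ranges both come from the inverse CDF: a central range leaves equal probability in each tail. The sketch below assumes Python 3.8+; `central_interval` is a hypothetical helper written for this illustration.

```python
from statistics import NormalDist


def central_interval(dist: NormalDist, coverage: float) -> tuple:
    """Range containing the middle `coverage` share, with equal tails."""
    tail = (1 - coverage) / 2
    return dist.inv_cdf(tail), dist.inv_cdf(1 - tail)


iq = NormalDist(mu=100, sigma=15)  # parameters from the example

p90 = iq.inv_cdf(0.90)                   # part a: 90th percentile
share_85_115 = iq.cdf(115) - iq.cdf(85)  # part b
low, high = central_interval(iq, 0.95)   # part c

print(f"90th percentile  = {p90:.2f}")
print(f"P(85 < X < 115)  = {share_85_115:.4f}")
print(f"middle 95% range = {low:.1f} to {high:.1f}")
```

The same helper with `coverage=0.68` or `coverage=0.90` covers the other central ranges that appear in this chapter.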
Mathematical Derivations
Derivation of Binomial Mean
For binomial distribution X ~ B(n, p):
E[X] = Σk × C(n,k)pᵏqⁿ⁻ᵏ from k=0 to n
= Σk × [n!/(k!(n-k)!)] pᵏqⁿ⁻ᵏ from k=1 to n (the k=0 term is zero)
= np × Σ[(n-1)!/((k-1)!(n-k)!)] pᵏ⁻¹qⁿ⁻ᵏ
= np × ΣC(n-1,j)pʲqⁿ⁻¹⁻ʲ from j=0 to n-1 (where j = k-1)
= np × (p + q)ⁿ⁻¹ = np × 1ⁿ⁻¹ = np
Derivation of Binomial Variance
Var[X] = E[X²] - (E[X])²
E[X²] = Σk² × C(n,k)pᵏqⁿ⁻ᵏ from k=0 to n = np + n(n-1)p² (after detailed algebraic manipulation)
Var[X] = np + n(n-1)p² - (np)² = np - np² = np(1 - p) = npq
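The step labelled "after detailed algebraic manipulation" can be filled in through the factorial moment E[X(X-1)]. The following is one standard route, given as a sketch. It reuses the index shift from the derivation of the mean, applied twice, with q = 1 - p throughout.

```latex
\begin{aligned}
E[X(X-1)] &= \sum_{k=2}^{n} k(k-1)\binom{n}{k}p^{k}q^{n-k} \\
          &= n(n-1)p^{2}\sum_{k=2}^{n}\binom{n-2}{k-2}p^{k-2}q^{n-k} \\
          &= n(n-1)p^{2}\,(p+q)^{n-2} = n(n-1)p^{2}, \\[4pt]
E[X^{2}]  &= E[X(X-1)] + E[X] = n(n-1)p^{2} + np, \\[4pt]
\operatorname{Var}[X] &= n(n-1)p^{2} + np - (np)^{2} = np - np^{2} = npq.
\end{aligned}
```

The second line uses k(k-1)C(n,k) = n(n-1)C(n-2,k-2), which follows from cancelling k(k-1) against k! in the factorial form of C(n,k).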
Derivation of Normal Distribution Properties
The normal distribution arises as the limit of the binomial distribution as n → ∞ with p fixed (De Moivre-Laplace theorem): the standardized variable (X - np)/√(npq) converges in distribution to the standard normal. This is a special case of the central limit theorem, and it explains why binomial histograms look bell-shaped for large n.
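The convergence can be observed numerically by comparing each binomial probability P(X = k) with the normal density of mean np and standard deviation √(npq) at the same k. This is an illustrative check rather than a proof; p = 0.3 and the values of n are arbitrary choices, and `max_pmf_gap` is a hypothetical helper name.

```python
from math import comb, sqrt
from statistics import NormalDist


def max_pmf_gap(n: int, p: float) -> float:
    """Largest |P(X = k) - f(k)| over k = 0..n, where X ~ B(n, p)
    and f is the N(np, sqrt(npq)) density."""
    q = 1 - p
    normal = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
    return max(
        abs(comb(n, k) * p**k * q**(n - k) - normal.pdf(k))
        for k in range(n + 1)
    )


# The worst-case gap shrinks as n grows.
for n in (10, 100, 400):
    print(f"n = {n:3d}: largest gap = {max_pmf_gap(n, 0.3):.5f}")
```

The gap is measured on the probabilities themselves, so part of the shrinkage comes from each individual probability getting smaller as n grows; the bell shape is matched increasingly well nonetheless.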
Real-World Applications
1. Quality Control and Manufacturing
Acceptance Sampling:
- Binomial distribution for defective products
- Normal distribution for quality measurements
- Statistical process control
Example: A factory monitors defect rates using binomial distribution. If 5% of products are defective, what's the probability that a sample of 200 contains more than 15 defects?
2. Medicine and Healthcare
Diagnostic Testing:
- Binomial distribution for test results
- Normal distribution for biological measurements
- Risk assessment
Example: If a test gives the correct result for 95% of patients, what's the probability that it correctly diagnoses exactly 18 out of 20 patients?
3. Finance and Economics
Risk Analysis:
- Normal distribution for stock returns
- Binomial distribution for credit defaults
- Portfolio optimization
Example: If annual stock returns are normally distributed with mean 8% and standard deviation 15%, what's the probability of a negative return?
4. Social Sciences
Survey Analysis:
- Binomial distribution for yes/no responses
- Normal distribution for continuous measurements
- Sampling distributions
Example: If 60% of voters support a policy, what's the probability that a poll of 1000 voters shows support between 55% and 65%?
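A question about a sample percentage becomes a binomial question once the percentages are converted to counts: 55% and 65% of 1000 voters are 550 and 650 supporters. The sketch below treats the endpoints as included and assumes independent respondents, both of which are assumptions added here; it needs Python 3.8+.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 1000, 0.60
lo, hi = 550, 650  # 55% and 65% of 1000, endpoints included (assumption)

# Exact binomial probability P(550 <= X <= 650).
exact = sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(lo, hi + 1))

# Normal approximation with continuity correction.
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = normal.cdf(hi + 0.5) - normal.cdf(lo - 0.5)

print(f"exact binomial:       {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

With n this large the two results are very close, which is the situation where the normal approximation is most useful.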
Complex Problem-Solving Techniques
Problem: A factory produces widgets with a 3% defect rate. Quality control samples 200 widgets. Find the probability that:
a) Exactly 5 are defective b) More than 8 are defective c) Between 2 and 7 (inclusive) are defective
Solution:
Binomial distribution with n = 200, p = 0.03, q = 0.97
a) P(X = 5) = C(200,5)(0.03)⁵(0.97)¹⁹⁵ = [200!/(5!195!)] × 0.0000000243 × 0.002633 ≈ 2,535,650,040 × 0.0000000243 × 0.002633 ≈ 0.162
b) P(X > 8) = 1 - P(X ≤ 8). This would normally require summing 9 terms, but for large n we can use the normal approximation: μ = np = 200 × 0.03 = 6, σ = √(npq) = √(200 × 0.03 × 0.97) = √5.82 ≈ 2.412
P(X > 8) = P(X ≥ 9) ≈ P(Y > 8.5) (continuity correction, where Y is the approximating normal variable), z = (8.5 - 6)/2.412 ≈ 1.037, P(Z > 1.037) ≈ 0.150
c) P(2 ≤ X ≤ 7) = P(X ≤ 7) - P(X ≤ 1). Using the normal approximation with continuity correction (Y is the approximating normal variable): P(Y ≤ 7.5) - P(Y ≤ 1.5), with z₁ = (7.5 - 6)/2.412 ≈ 0.622 and z₂ = (1.5 - 6)/2.412 ≈ -1.866. P(Z ≤ 0.622) ≈ 0.733 and P(Z ≤ -1.866) ≈ 0.031, so P(2 ≤ X ≤ 7) ≈ 0.733 - 0.031 = 0.702
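Since a computer can sum binomial terms directly, the normal approximation used here can be compared against the exact values. A sketch assuming Python 3.8+; the helper names and the dictionary layout are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p)."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


n, p = 200, 0.03
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

exact = {
    "P(X = 5)": binomial_pmf(5, n, p),
    "P(X > 8)": 1 - binomial_cdf(8, n, p),
    "P(2 <= X <= 7)": binomial_cdf(7, n, p) - binomial_cdf(1, n, p),
}
# Normal approximation, widening each integer k to (k - 0.5, k + 0.5).
approx = {
    "P(X = 5)": normal.cdf(5.5) - normal.cdf(4.5),
    "P(X > 8)": 1 - normal.cdf(8.5),
    "P(2 <= X <= 7)": normal.cdf(7.5) - normal.cdf(1.5),
}

for label in exact:
    print(f"{label:15s} exact = {exact[label]:.4f}   normal = {approx[label]:.4f}")
```

With np = 6 the binomial is noticeably right-skewed, so the symmetric normal curve matches some probabilities better than others. The comparison shows how much accuracy the shortcut costs in each part.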
Problem: The lifetimes of light bulbs are normally distributed with mean 1000 hours and standard deviation 100 hours. Find:
a) The probability that a bulb lasts more than 1200 hours b) The probability that a bulb lasts between 800 and 1100 hours c) The lifetime that only 5% of bulbs exceed
Solution:
Normal distribution with μ = 1000, σ = 100
a) P(X > 1200): z = (1200 - 1000)/100 = 2.0 P(Z > 2.0) = 1 - 0.9772 = 0.0228
b) P(800 < X < 1100): z₁ = (800 - 1000)/100 = -2.0, z₂ = (1100 - 1000)/100 = 1.0, so P(-2.0 < Z < 1.0) = P(Z < 1.0) - P(Z < -2.0) = 0.8413 - 0.0228 = 0.8185
c) Find lifetime L such that P(X > L) = 0.05 P(X ≤ L) = 0.95 ⇒ z = 1.645 L = μ + zσ = 1000 + 1.645 × 100 = 1164.5 hours
Problem: A multiple-choice test has 25 questions, each with 5 options. A student gets 12 answers correct. Is this evidence of knowledge rather than guessing?
Solution:
If guessing alone: n = 25, p = 0.20, q = 0.80
Find probability of getting 12 or more correct: P(X ≥ 12) = ΣC(25,k)(0.20)ᵏ(0.80)²⁵⁻ᵏ from k=12 to 25
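This sum is tedious to evaluate by hand, so use the normal approximation: μ = np = 25 × 0.20 = 5, σ = √(npq) = √(25 × 0.20 × 0.80) = √4 = 2
P(X ≥ 12) ≈ P(Y ≥ 11.5) (continuity correction, where Y is the approximating normal variable), z = (11.5 - 5)/2 = 3.25, P(Z ≥ 3.25) ≈ 0.0006
Since P(X ≥ 12) ≈ 0.0006 < 0.05, getting 12 correct is statistically significant evidence against guessing alone. The approximation is rough this far into the tail with np = 5 (the exact binomial sum is about 0.0015), but the conclusion is the same.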
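The tail sum from k = 12 to 25 has only 14 terms, so it can also be evaluated exactly instead of approximated. A minimal sketch assuming Python 3.8+; `binomial_tail` is an illustrative helper, and the 0.05 threshold is the one used in the solution.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ B(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


n, p = 25, 0.20  # 25 questions, 1-in-5 chance of guessing correctly
p_value = binomial_tail(12, n, p)
significant = p_value < 0.05

print(f"exact P(X >= 12) = {p_value:.5f}")
print("evidence against pure guessing at the 5% level:", significant)
```

The exact tail is larger than the normal-approximation figure because the binomial with np = 5 is right-skewed, but it stays far below 0.05.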
Summary Points
- Random Variables: Numerical outcomes of random experiments (discrete/continuous)
- Binomial Distribution: Fixed trials, constant probability, binary outcomes
- Normal Distribution: Bell-shaped, symmetric, defined by mean and standard deviation
- Parameters: Mean (location) and variance (spread) describe distributions
- Applications: Quality control, medicine, finance, social sciences
- Calculations: Use formulas for exact values or approximations for large samples
Common Mistakes to Avoid
- Distribution selection errors - Choose binomial for discrete counts, normal for continuous measurements
- Parameter misidentification - Correctly identify n, p for binomial; μ, σ for normal
- Continuity correction errors - Use 0.5 adjustments when approximating discrete with continuous
- Interpretation errors - Understand what probability values mean in context
- Calculation errors - Be careful with factorials, exponents, and z-score calculations
SPM Exam Tips
Exam Strategies
- Identify the distribution - Determine whether to use binomial or normal
- Check conditions - Verify independence and sample size requirements
- Use appropriate formulas - Apply correct formulas for each distribution
- Show working clearly - Step-by-step calculations for partial marks
- Use approximations wisely - Normal approximation for large n when appropriate
Key Exam Topics
- Binomial distribution calculations (30% of questions)
- Normal distribution applications (30% of questions)
- Mean and variance calculations (20% of questions)
- Probability between values (10% of questions)
- Real-world applications (10% of questions)
Time Management Tips
- Simple binomial problems: 4-5 minutes
- Simple normal problems: 4-5 minutes
- Complex binomial applications: 7-8 minutes
- Complex normal applications: 7-8 minutes
- Real-world application problems: 8-10 minutes
Practice Problems
Level 1: Binomial Distribution Basics
- A coin is flipped 8 times. Find the probability of getting exactly 3 heads.
- A basketball player makes 70% of free throws. If she attempts 5 shots, find the probability she makes exactly 3.
Level 2: Binomial Applications
- A factory produces 5% defective items. If 15 items are selected randomly, find: a) The probability of exactly 2 defective items b) The probability of at least 1 defective item c) The expected number and standard deviation of defective items
- A multiple-choice test has 12 questions, each with 4 options. A student guesses all answers. Find: a) The probability of getting exactly 4 correct b) The probability of getting more than 6 correct c) The most likely number of correct answers
Level 3: Normal Distribution Basics
- IQ scores are normally distributed with mean 100 and standard deviation 15. Find: a) P(IQ > 120) b) P(90 < IQ < 110) c) The IQ score that 80% of people score below
- Heights are normally distributed with mean 170 cm and standard deviation 10 cm. Find: a) P(height > 185 cm) b) P(160 < height < 180) c) The heights that contain the middle 90% of the population
Level 4: Complex Applications
- Quality Control: A machine produces parts with 3% defect rate. In a sample of 500 parts, find the probability that more than 20 are defective.
- Medical Testing: A diagnostic test has 95% sensitivity and 90% specificity. If 10% of a population has the disease, find: a) The probability a random person tests positive b) The probability a person who tests positive actually has the disease
Level 5: Advanced Problems
- Manufacturing: Production rates follow a normal distribution with mean 100 units/hour and standard deviation 10 units/hour. Find the probability that production falls below 80 units/hour on any given hour.
- Finance: Stock returns are normally distributed with mean 12% and standard deviation 20%. Find the probability of a negative return and the return that corresponds to the 95th percentile.
Did You Know? 📚
The normal distribution was first discovered by Abraham de Moivre in 1733 while studying games of chance. Carl Friedrich Gauss developed it extensively in the early 19th century for astronomical calculations. The central limit theorem, which explains why normal distributions are so common in nature, was proven by Laplace and Lyapunov. Today, the normal distribution is fundamental in statistics, used in everything from quality control to quantum mechanics.
Quick Reference Guide
| Concept | Formula/Method | Key Points |
|---|---|---|
| Binomial PMF | P(X=k) = C(n,k)pᵏqⁿ⁻ᵏ | Discrete, fixed trials |
| Binomial mean | μ = np | Average number of successes |
| Binomial variance | σ² = npq | Spread around mean |
| Normal PDF | f(x) = (1/(σ√(2π)))e^[-½((x-μ)/σ)²] | Continuous, bell-shaped |
| Z-score | z = (x-μ)/σ | Standardization |
| Normal approximation | μ = np, σ = √(npq) | For large n in binomial |
Probability distributions provide mathematical frameworks for understanding randomness and uncertainty in the world. Mastering these concepts enables statistical reasoning and informed decision-making in scientific, business, and everyday contexts.