1. Sampling Distribution of \(\bar{X}\)
When we repeatedly draw samples of size \(n\) from a population and compute the sample mean \(\bar{X}\) each time, the distribution of all those sample means is called the sampling distribution of \(\bar{X}\).
Population (\(\mu, \sigma\))
↓ draw samples ↓
CSUF: \(\bar{X}_1\), UCI: \(\bar{X}_2\), USC: \(\bar{X}_3\), UCLA: \(\bar{X}_4\)
Each university yields a sample mean; together they form the sampling distribution.
Key Formulas
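Mean: \(\mu_{\bar{X}} = \mu\)
Standard error: \(\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}\)
Distribution: \(\bar{X} \sim N\!\left(\mu,\;\dfrac{\sigma}{\sqrt{n}}\right)\), where the second parameter is the standard error (a standard deviation, not a variance).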
When Is It Normal?
- If the population itself is Normal — always.
- If the population is not Normal — by the Central Limit Theorem, approximately Normal when \(n \ge 30\).
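The two shape rules above can be checked with a small simulation. The sketch below is illustrative only: the skewed exponential population with mean 10, the sample size of 36, the 5,000 repetitions, and the helper name `simulate_sample_means` are all assumptions chosen for the demonstration, not values from this guide.

```python
import random
import statistics

def simulate_sample_means(draw, n, reps, seed=0):
    """Draw `reps` samples of size n using draw(rng); return the list of sample means."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [statistics.fmean(draw(rng) for _ in range(n)) for _ in range(reps)]

# Hypothetical non-Normal population: exponential with mean 10.
# For an exponential distribution the standard deviation equals the mean.
mu = sigma = 10.0
n = 36
means = simulate_sample_means(lambda rng: rng.expovariate(1 / mu), n=n, reps=5000)

center = statistics.fmean(means)  # should land close to mu
spread = statistics.stdev(means)  # should land close to sigma / sqrt(n)
print(f"mean of sample means: {center:.3f} (mu = {mu})")
print(f"SD of sample means:   {spread:.3f} (sigma/sqrt(n) = {sigma / n ** 0.5:.3f})")
```

Even though the individual draws are strongly right-skewed, a histogram of `means` should look roughly bell-shaped, which is what the Central Limit Theorem predicts for \(n \ge 30\).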
2. Sampling Distribution of \(\hat{p}\)
For a two-category (yes/no) variable, we estimate the population proportion \(p\) with the sample proportion \(\hat{p}\). Repeated sampling gives a distribution of \(\hat{p}\) values, called the sampling distribution of \(\hat{p}\).
Key Formulas
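Mean: \(\mu_{\hat{p}} = p\)
Standard error: \(\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}\)
Distribution: \(\hat{p} \sim N\!\left(p,\;\sqrt{\dfrac{p(1-p)}{n}}\right)\), where the second parameter is the standard error (a standard deviation, not a variance).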
Normality Check
The sampling distribution of \(\hat{p}\) is approximately Normal when both conditions hold:
\(np \ge 5 \quad\text{and}\quad n(1-p) \ge 5\)
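The standard error and the two-part normality check can be bundled into one small helper. This is a minimal sketch: the function name `phat_sampling_distribution` and the `threshold` parameter are hypothetical, and the inputs are the values used in Example 2 and Practice Problem 3 of this guide.

```python
import math

def phat_sampling_distribution(p, n, threshold=5):
    """Return (mean, standard_error, normal_ok) for the sample proportion."""
    se = math.sqrt(p * (1 - p) / n)
    # Both expected counts must reach the threshold for the Normal approximation.
    normal_ok = n * p >= threshold and n * (1 - p) >= threshold
    return p, se, normal_ok

print(phat_sampling_distribution(0.76, 400))  # Example 2: both conditions hold
print(phat_sampling_distribution(0.02, 100))  # Practice Problem 3: np = 2 fails
```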
3. Z-Score Conversions
To find probabilities, convert to the standard Normal distribution using the appropriate Z-score formula.
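The two conversions can be written out from the standard errors in Sections 1 and 2. This is a reconstruction from those formulas, consistent with how the worked examples compute \(Z\): subtract the mean of the sampling distribution, then divide by its standard error.

\[
Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}
\qquad\text{and}\qquad
Z = \frac{\hat{p} - p}{\sqrt{p(1-p)/n}}
\]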
4. Problem Types
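Tip: For N1 (left-tail) problems, read \(P(Z < z)\) directly from the table. For N2 (right-tail) problems, use \(P(Z > z) = 1 - P(Z < z)\). For N3 (between) problems, subtract the two cumulative areas: \(P(z_1 < Z < z_2) = P(Z < z_2) - P(Z < z_1)\).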
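The three problem types map onto three one-line computations with the standard Normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library. The function names are hypothetical, and N1 is taken to mean a left-tail problem, as in part (iii) of Example 2.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

def left_tail(z):
    """N1: P(Z < z), read directly from the CDF."""
    return Z.cdf(z)

def right_tail(z):
    """N2: P(Z > z) = 1 - P(Z < z)."""
    return 1 - Z.cdf(z)

def between(z1, z2):
    """N3: P(z1 < Z < z2) = P(Z < z2) - P(Z < z1)."""
    return Z.cdf(z2) - Z.cdf(z1)

print(round(left_tail(-1.25), 4))
print(round(right_tail(1.50), 4))
print(round(between(-1.40, 1.40), 4))
```

The printed values should agree with a printed Z-table to about four decimal places.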
5. Worked Examples
Example 1 — Fullerton Household Incomes
Problem Setup
Let \(X\) = Fullerton household income. The population has \(\mu = 72{,}000\) and \(\sigma = 6{,}000\). Samples are drawn from various universities (CSUF, UCI, USC, UCLA). Describe the sampling distribution of \(\bar{X}\).
Mean of \(\bar{X}\): \(\mu_{\bar{X}} = \mu = 72{,}000\)
Standard error: \(\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{6{,}000}{\sqrt{n}}\)
Shape: Normal if the population is Normal; approximately Normal if \(n \ge 30\) (CLT).
Key Insight: The mean of the sampling distribution always equals the population mean, regardless of sample size. Increasing \(n\) only reduces the spread.
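To see how the spread shrinks while the center stays put, the standard error from Example 1 can be evaluated at a few sample sizes. The sizes 25, 100, and 400 are illustrative assumptions, since the example leaves \(n\) unspecified.

```python
import math

sigma = 6_000  # population standard deviation from Example 1

for n in (25, 100, 400):  # illustrative sample sizes, not given in the problem
    se = sigma / math.sqrt(n)
    print(f"n = {n:>3}: standard error = {se:,.0f}")
```

Each fourfold increase in \(n\) cuts the standard error in half, while \(\mu_{\bar{X}}\) stays at 72,000 throughout.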
Example 2 — Trader Joe's Customers
Problem Setup
76% of Trader Joe's customers read ingredients before purchasing (\(p = .76\)). A random sample of \(n = 400\) customers is selected.
Standard error:
\(\sigma_{\hat{p}} = \sqrt{\dfrac{.76 \times .24}{400}} = \sqrt{\dfrac{.1824}{400}} = \sqrt{.000456} = .0214\)
Normality check:
\(400 \times .76 = 304 \ge 5\) ✓
\(400 \times .24 = 96 \ge 5\) ✓
Distribution: \(\hat{p} \sim N(.76,\;.0214)\)
(i) Find \(P(\hat{p} > .75)\) — N2
Convert: \(Z = \dfrac{.75 - .76}{.0214} = \dfrac{-.01}{.0214} = -.4673\)
\(P(\hat{p} > .75) = P(Z > -.4673) = P(Z < .4673) = .6799\), where the middle step uses the symmetry of the standard Normal curve.
(ii) Find \(P(.73 < \hat{p} < .79)\) — N3
Lower: \(Z_1 = \dfrac{.73 - .76}{.0214} = -1.4019\)
Upper: \(Z_2 = \dfrac{.79 - .76}{.0214} = 1.4019\)
\(P(.73 < \hat{p} < .79) = P(-1.4019 < Z < 1.4019) \approx P(-1.40 < Z < 1.40) = 2 \times P(Z < 1.40) - 1 \approx .8385\)
(iii) Find \(P(\hat{p} < .75)\) — N1
From part (i): \(Z = -.4673\)
\(P(\hat{p} < .75) = P(Z < -.4673) = 1 - .6799 = .3201\)
Key Insight: Parts (i) and (iii) are complements, so they must sum to 1: \(.6799 + .3201 = 1\). This makes a quick self-check.
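All three parts of Example 2 can be cross-checked in software. The sketch below models \(\hat{p}\) directly with `statistics.NormalDist` and keeps the standard error unrounded, so the results may differ from the table-based answers in the third decimal place. The variable names are illustrative.

```python
import math
from statistics import NormalDist

p, n = 0.76, 400
se = math.sqrt(p * (1 - p) / n)    # unrounded standard error, about .0214
phat = NormalDist(mu=p, sigma=se)  # Normal model for the sample proportion

p_i = 1 - phat.cdf(0.75)                # (i)   P(p-hat > .75)
p_ii = phat.cdf(0.79) - phat.cdf(0.73)  # (ii)  P(.73 < p-hat < .79)
p_iii = phat.cdf(0.75)                  # (iii) P(p-hat < .75)

print(f"(i) {p_i:.4f}  (ii) {p_ii:.4f}  (iii) {p_iii:.4f}")
print("(i) + (iii) =", round(p_i + p_iii, 10))
```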
Example 3 — First-Time Amazon Customers
Problem Setup
30% of customers are first-time Amazon buyers (\(p = .30\)). A sample of \(n = 100\) is drawn.
Standard error:
\(\sigma_{\hat{p}} = \sqrt{\dfrac{.30 \times .70}{100}} = \sqrt{\dfrac{.21}{100}} = \sqrt{.0021} = .0458\)
Normality check:
\(100 \times .30 = 30 \ge 5\) ✓
\(100 \times .70 = 70 \ge 5\) ✓
Distribution: \(\hat{p} \sim N(.30,\;.0458)\)
Key Insight: Even with a relatively low proportion (\(p = .30\)), a sample of 100 is more than enough to satisfy the normality conditions.
Example 4 — Fullerton Solar Energy
Problem Setup
20% of Fullerton households use solar energy (\(p = .20\)). A sample of \(n = 100\) is drawn.
Standard error:
\(\sigma_{\hat{p}} = \sqrt{\dfrac{.20 \times .80}{100}} = \sqrt{\dfrac{.16}{100}} = \sqrt{.0016} = .04\)
Normality check:
\(100 \times .20 = 20 \ge 5\) ✓
\(100 \times .80 = 80 \ge 5\) ✓
Distribution: \(\hat{p} \sim N(.20,\;.04)\)
Key Insight: The standard error \(\sigma_{\hat{p}}\) is also called the standard deviation of \(\hat{p}\) — both terms are used interchangeably.
Example 5 — Car Insurance
Problem Setup
The mean annual cost of car insurance is \(\mu = \$939\) with \(\sigma = \$245\). A random sample of \(n = 50\) policies is selected.
(a) Mean: \(\mu_{\bar{X}} = \mu = 939\)
(b) Standard error: \(\sigma_{\bar{X}} = \dfrac{245}{\sqrt{50}} = \dfrac{245}{7.071} = 34.65\)
(c) Shape: approximately Normal, since \(n = 50 \ge 30\) (CLT). Thus \(\bar{X} \sim N(939,\;34.65)\), approximately.
(d) Find \(P(\bar{X} < 964)\)
Convert to Z:
\(Z = \dfrac{964 - 939}{34.65} = \dfrac{25}{34.65} = 0.72\)
Look up:
\(P(\bar{X} < 964) = P(Z < 0.72) = .7642\)
Key Insight: There is about a 76.4% chance that the sample mean insurance cost is less than $964 — even though individual policies vary widely (\(\sigma = 245\)), the sampling distribution is much tighter (\(\sigma_{\bar{X}} = 34.65\)).
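Example 5 can be reproduced in a few lines. This sketch assumes the Normal approximation from part (c) and uses the unrounded standard error, so the probability may differ slightly from the table lookup in the last digit or two.

```python
import math
from statistics import NormalDist

mu, sigma, n = 939, 245, 50
se = sigma / math.sqrt(n)           # (b) standard error of the sample mean
xbar = NormalDist(mu=mu, sigma=se)  # (c) approximate sampling distribution
prob = xbar.cdf(964)                # (d) P(X-bar < 964)

print(f"standard error = {se:.2f}")
print(f"P(X-bar < 964) = {prob:.4f}")
```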
📝 Practice Problems
Test your understanding.
1. A population has \(\mu = 500\) and \(\sigma = 80\). If you take a sample of \(n = 64\), what is the standard error \(\sigma_{\bar{X}}\)?
\[\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}} = \frac{80}{\sqrt{64}} = \frac{80}{8} = 10\]
2. A poll finds that \(p = 0.45\) of voters support a measure. If \(n = 200\), what is \(\sigma_{\hat{p}}\)?
\[\sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.45 \times 0.55}{200}} = \sqrt{\frac{0.2475}{200}} = \sqrt{0.001238} = 0.0352\]
3. Can you use the Normal approximation for \(\hat{p}\) if \(p = 0.02\) and \(n = 100\)?
No. Check: \(np = 100 \times 0.02 = 2\), which is less than 5. The condition \(np \geq 5\) is not met, so the Normal approximation is not appropriate.
4. Household income has \(\mu = \$50{,}000\) and \(\sigma = \$12{,}000\). For a sample of \(n = 36\), find \(P(\bar{X} > 53{,}000)\).
Step 1: \(\sigma_{\bar{X}} = \frac{12000}{\sqrt{36}} = 2000\)
Step 2: \(Z = \frac{53000 - 50000}{2000} = 1.50\)
Step 3: \(P(Z > 1.50) = 1 - 0.9332 = 0.0668\)
5. In a city, 60% of residents recycle. A sample of \(n = 150\) is taken. Find \(P(\hat{p} < 0.55)\).
Step 1: \(\sigma_{\hat{p}} = \sqrt{\frac{0.60 \times 0.40}{150}} = \sqrt{0.0016} = 0.04\)
Step 2: \(Z = \frac{0.55 - 0.60}{0.04} = -1.25\)
Step 3: \(P(Z < -1.25) = 0.1056\)
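The answers to Problems 4 and 5 can be verified with two small helpers, one for a right-tail probability of \(\bar{X}\) and one for a left-tail probability of \(\hat{p}\). The function names are hypothetical, and both rely on the Normal approximation.

```python
import math
from statistics import NormalDist

def prob_xbar_above(x, mu, sigma, n):
    """P(X-bar > x) under the Normal model with standard error sigma / sqrt(n)."""
    se = sigma / math.sqrt(n)
    return 1 - NormalDist(mu, se).cdf(x)

def prob_phat_below(x, p, n):
    """P(p-hat < x) under the Normal model with standard error sqrt(p(1-p)/n)."""
    se = math.sqrt(p * (1 - p) / n)
    return NormalDist(p, se).cdf(x)

q4 = prob_xbar_above(53_000, mu=50_000, sigma=12_000, n=36)
q5 = prob_phat_below(0.55, p=0.60, n=150)
print(f"Problem 4: {q4:.4f}")
print(f"Problem 5: {q5:.4f}")
```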
6. Which is larger: the standard error when \(n = 25\) or when \(n = 100\)? Why?
\(n = 25\) has the larger standard error. Since \(\sigma_{\bar{X}} = \sigma/\sqrt{n}\), a smaller \(n\) means dividing by a smaller number, giving a larger standard error. Larger samples produce more precise estimates.
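The comparison can be made exact by taking the ratio of the two standard errors, which holds for any value of \(\sigma\):

\[
\frac{\sigma/\sqrt{25}}{\sigma/\sqrt{100}} = \frac{\sqrt{100}}{\sqrt{25}} = \frac{10}{5} = 2
\]

So the standard error at \(n = 25\) is twice the standard error at \(n = 100\). Quadrupling the sample size halves the standard error.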