Chapter 7 — Sampling Distributions

1. Sampling Distribution of $\bar{X}$

When we repeatedly draw samples of size $n$ from a population and compute the sample mean $\bar{X}$ each time, the distribution of all those sample means is called the sampling distribution of $\bar{X}$.

[Diagram: samples are drawn from a population with parameters ($\mu, \sigma$) at four universities, giving the sample means $\bar{X}_1$ (CSUF), $\bar{X}_2$ (UCI), $\bar{X}_3$ (USC), and $\bar{X}_4$ (UCLA).]

Each university's sample yields its own sample mean; the sampling distribution describes how such means vary across all possible samples of size $n$.

Key Formulas

Mean: $\mu_{\bar{X}} = \mu$

Standard error: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}}$

Distribution: $\bar{X} \sim N\!\left(\mu,\;\dfrac{\sigma}{\sqrt{n}}\right)$, where the second entry is the standard deviation (the standard error), not the variance.

When Is It Normal?

If the population itself is Normal: $\bar{X}$ is exactly Normal for any sample size $n$.
If the population is not Normal: by the Central Limit Theorem, $\bar{X}$ is approximately Normal when $n \ge 30$ (a common rule of thumb).
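The two formulas and the CLT rule of thumb can be checked by simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be Exponential with rate 1 (so $\mu = 1$ and $\sigma = 1$, strongly right-skewed), and the sample size, number of repetitions, and seed are arbitrary choices, not values from this chapter.

```python
import random
import statistics
from math import sqrt

def simulate_sample_means(n, reps, seed=0):
    """Draw `reps` samples of size n from an Exponential(1) population
    (assumed here: mu = 1, sigma = 1) and return the list of sample means."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

means = simulate_sample_means(n=36, reps=5000)
center = statistics.fmean(means)   # theory: mu = 1
spread = statistics.stdev(means)   # theory: sigma / sqrt(n) = 1/6

print(f"mean of sample means = {center:.3f}  (theory: 1.000)")
print(f"sd of sample means   = {spread:.3f}  (theory: {1 / sqrt(36):.3f})")
```

The simulated center and spread should land close to $\mu$ and $\sigma/\sqrt{n}$, even though the population itself is far from Normal.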

2. Sampling Distribution of $\hat{p}$

For a categorical variable (yes/no), we estimate the population proportion $p$ with the sample proportion $\hat{p}$. Repeated sampling gives a distribution of $\hat{p}$ values.

Key Formulas

Mean: $\mu_{\hat{p}} = p$

Standard error: $\sigma_{\hat{p}} = \sqrt{\dfrac{p(1-p)}{n}}$

Distribution: $\hat{p} \sim N\!\left(p,\;\sqrt{\dfrac{p(1-p)}{n}}\right)$

Normality Check

The sampling distribution of $\hat{p}$ is approximately Normal when both conditions hold:

$np \ge 5 \quad\text{and}\quad n(1-p) \ge 5$
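The standard error and the normality check are simple enough to script. The following is a minimal sketch; the function names `phat_standard_error` and `normal_approx_ok` are made up for illustration, and the threshold of 5 follows the rule stated above.

```python
from math import sqrt

def phat_standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)

def normal_approx_ok(p, n, threshold=5):
    """True when both np and n(1-p) reach the threshold."""
    return n * p >= threshold and n * (1 - p) >= threshold

# Values from Example 2 (p = .76, n = 400)
print(round(phat_standard_error(0.76, 400), 4))
print(normal_approx_ok(0.76, 400))
```

Both conditions must hold; a large $np$ does not compensate for a small $n(1-p)$, or vice versa.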

3. Z-Score Conversions

To find probabilities, convert to the standard Normal distribution using the appropriate Z-score formula.

For $\bar{X}$: $Z = \dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}$

For $\hat{p}$: $Z = \dfrac{\hat{p} - p}{\sigma_{\hat{p}}} = \dfrac{\hat{p} - p}{\sqrt{p(1-p)/n}}$
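As a rough sketch, the two conversions can be written as small helper functions. The names `z_for_mean` and `z_for_proportion` are hypothetical; the inputs are borrowed from Examples 5 and 2 later in the chapter.

```python
from math import sqrt

def z_for_mean(xbar, mu, sigma, n):
    """Z-score of a sample mean: (xbar - mu) / (sigma / sqrt(n))."""
    return (xbar - mu) / (sigma / sqrt(n))

def z_for_proportion(phat, p, n):
    """Z-score of a sample proportion: (phat - p) / sqrt(p(1-p)/n)."""
    return (phat - p) / sqrt(p * (1 - p) / n)

print(round(z_for_mean(964, 939, 245, 50), 2))        # 0.72
print(round(z_for_proportion(0.75, 0.76, 400), 2))    # -0.47
```

In both cases the denominator is the standard error, not the population standard deviation.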

4. Problem Types

N1 — Left Tail
$P(\bar{X} < a)$ or $P(\hat{p} < a)$

N2 — Right Tail
$P(\bar{X} > a)$ or $P(\hat{p} > a)$

N3 — Between
$P(a < \bar{X} < b)$ or $P(a < \hat{p} < b)$

Tip: For N2 (right-tail) problems, use $P(Z > z) = 1 - P(Z < z)$. For N3 (between) problems, subtract the two left-tail areas: $P(z_1 < Z < z_2) = P(Z < z_2) - P(Z < z_1)$.
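The three problem types map directly onto the standard Normal CDF. The sketch below uses `statistics.NormalDist` from the Python standard library in place of a Z-table; the function names are illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard Normal: mean 0, standard deviation 1

def left_tail(z):
    """N1: P(Z < z)."""
    return Z.cdf(z)

def right_tail(z):
    """N2: P(Z > z) = 1 - P(Z < z)."""
    return 1 - Z.cdf(z)

def between(z_low, z_high):
    """N3: P(z_low < Z < z_high) = P(Z < z_high) - P(Z < z_low)."""
    return Z.cdf(z_high) - Z.cdf(z_low)

print(round(left_tail(0.72), 4))
print(round(right_tail(0.72), 4))
print(round(between(-1.40, 1.40), 4))
```

Software carries more decimal places than a printed table, so results may differ from table lookups in the third or fourth decimal.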

5. Worked Examples

Example 1 — Fullerton Household Incomes

Problem Setup

Let $X$ = Fullerton household income, with population mean $\mu = 72{,}000$ and standard deviation $\sigma = 6{,}000$. Samples of size $n$ are drawn at several universities (CSUF, UCI, USC, UCLA). Describe the sampling distribution of $\bar{X}$.

Mean of $\bar{X}$: $\mu_{\bar{X}} = \mu = 72{,}000$

Standard error: $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{6{,}000}{\sqrt{n}}$

Shape: exactly Normal if the population is Normal; approximately Normal if $n \ge 30$ (CLT).

Key Insight: The mean of the sampling distribution always equals the population mean, regardless of sample size. Increasing $n$ only reduces the spread.
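To see how the spread shrinks as $n$ grows, the standard error can be tabulated for a few sample sizes. The example does not specify $n$, so the sizes below (25, 100, 400) are assumptions chosen only for illustration.

```python
from math import sqrt

def standard_error_of_mean(sigma, n):
    """Standard error of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 6000  # population standard deviation from Example 1
for n in (25, 100, 400):  # hypothetical sample sizes
    print(n, standard_error_of_mean(sigma, n))
# 25 1200.0
# 100 600.0
# 400 300.0
```

Quadrupling the sample size halves the standard error, because $n$ enters through a square root.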

Example 2 — Trader Joe's Customers

Problem Setup

76% of Trader Joe's customers read ingredients before purchasing ($p = .76$). A random sample of $n = 400$ customers is selected.

Standard error: $\sigma_{\hat{p}} = \sqrt{\dfrac{.76 \times .24}{400}} = \sqrt{\dfrac{.1824}{400}} = \sqrt{.000456} = .0214$

Normality check: $400 \times .76 = 304 \ge 5$ ✓ $400 \times .24 = 96 \ge 5$ ✓

Distribution: $\hat{p} \sim N(.76,\;.0214)$

(i) Find $P(\hat{p} > .75)$ — N2

Convert: $Z = \dfrac{.75 - .76}{.0214} = \dfrac{-.01}{.0214} = -.4673$

$P(\hat{p} > .75) = P(Z > -.4673) = P(Z < .4673) = .6799$

(ii) Find $P(.73 < \hat{p} < .79)$ — N3

Lower: $Z_1 = \dfrac{.73 - .76}{.0214} = -1.4019$

Upper: $Z_2 = \dfrac{.79 - .76}{.0214} = 1.4019$

$P(.73 < \hat{p} < .79) = P(-1.40 < Z < 1.40) = 2 \times P(Z < 1.40) - 1 \approx .8385$

(iii) Find $P(\hat{p} < .75)$ — N1

From part (i): $Z = -.4673$

$P(\hat{p} < .75) = P(Z < -.4673) = 1 - .6799 = .3201$

Key Insight: Parts (i) and (iii) are complements — they must sum to 1. This is a great self-check.
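The three parts can be cross-checked in software. This is a sketch using `statistics.NormalDist` with the unrounded standard error, so the results agree with the hand calculations to about two decimal places rather than exactly (the hand work rounds the standard error to .0214 and, in part (ii), Z to 1.40).

```python
from math import sqrt
from statistics import NormalDist

p, n = 0.76, 400
se = sqrt(p * (1 - p) / n)                  # about .0214
sampling_dist = NormalDist(mu=p, sigma=se)  # approximate distribution of p-hat

p_right = 1 - sampling_dist.cdf(0.75)                            # (i)   N2
p_between = sampling_dist.cdf(0.79) - sampling_dist.cdf(0.73)    # (ii)  N3
p_left = sampling_dist.cdf(0.75)                                 # (iii) N1

print(f"(i)   P(p-hat > .75)       = {p_right:.3f}")
print(f"(ii)  P(.73 < p-hat < .79) = {p_between:.3f}")
print(f"(iii) P(p-hat < .75)       = {p_left:.3f}")
print(f"(i) + (iii) = {p_right + p_left:.3f}")
```

Passing the mean and standard error to `NormalDist` directly skips the explicit Z conversion; the probabilities are the same either way.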

Example 3 — First-Time Amazon Customers

Problem Setup

30% of customers are first-time Amazon buyers ($p = .30$). A sample of $n = 100$ is drawn.

Standard error: $\sigma_{\hat{p}} = \sqrt{\dfrac{.30 \times .70}{100}} = \sqrt{\dfrac{.21}{100}} = \sqrt{.0021} = .0458$

Normality check: $100 \times .30 = 30 \ge 5$ ✓ $100 \times .70 = 70 \ge 5$ ✓

Distribution: $\hat{p} \sim N(.30,\;.0458)$

Key Insight: Even with a relatively low proportion ($p = .30$), a sample of 100 is more than enough to satisfy the normality conditions.

Example 4 — Fullerton Solar Energy

Problem Setup

20% of Fullerton households use solar energy ($p = .20$). A sample of $n = 100$ is drawn.

Standard error: $\sigma_{\hat{p}} = \sqrt{\dfrac{.20 \times .80}{100}} = \sqrt{\dfrac{.16}{100}} = \sqrt{.0016} = .04$

Normality check: $100 \times .20 = 20 \ge 5$ ✓ $100 \times .80 = 80 \ge 5$ ✓

Distribution: $\hat{p} \sim N(.20,\;.04)$

Key Insight: The standard error $\sigma_{\hat{p}}$ is also called the standard deviation of $\hat{p}$ — both terms are used interchangeably.
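Examples 3 and 4 use the same $n$ but different $p$, which makes it easy to see how the standard error depends on $p$. In the sketch below, $p = .50$ is an added hypothetical value, included only to show where $p(1-p)$ is largest.

```python
from math import sqrt

def phat_standard_error(p, n):
    """Standard error of the sample proportion: sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)

n = 100
for p in (0.20, 0.30, 0.50):  # .50 is a hypothetical comparison value
    print(f"p = {p:.2f}  SE = {phat_standard_error(p, n):.4f}")
# p = 0.20  SE = 0.0400
# p = 0.30  SE = 0.0458
# p = 0.50  SE = 0.0500
```

For a fixed $n$, the standard error grows as $p$ moves toward .50 and shrinks as $p$ moves toward 0 or 1.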

Example 5 — Car Insurance

Problem Setup

The mean annual cost of car insurance is $\mu = \$939$ with $\sigma = \$245$. A random sample of $n = 50$ policies is selected.

(a) Mean: $\mu_{\bar{X}} = \mu = 939$

(b) Standard error: $\sigma_{\bar{X}} = \dfrac{245}{\sqrt{50}} = \dfrac{245}{7.071} = 34.65$

(c) Shape: approximately Normal, since $n = 50 \ge 30$ (CLT). Thus $\bar{X} \sim N(939,\;34.65)$, approximately.

(d) Find $P(\bar{X} < 964)$

Convert to Z: $Z = \dfrac{964 - 939}{34.65} = \dfrac{25}{34.65} = 0.72$

Look up: $P(\bar{X} < 964) = P(Z < 0.72) = .7642$

Key Insight: There is about a 76.4% chance that the sample mean insurance cost is less than \$964. Even though individual policies vary widely ($\sigma = 245$), the sampling distribution is much tighter ($\sigma_{\bar{X}} = 34.65$).
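Parts (b) and (d) can be reproduced with a few lines of code. This sketch keeps the unrounded Z-score, so the probability may differ from the table value .7642 in the third or fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 939, 245, 50
se = sigma / sqrt(n)                     # (b) standard error of the sample mean
z = (964 - mu) / se                      # Z-score for a sample mean of 964
prob = NormalDist().cdf(z)               # (d) P(X-bar < 964)

print(f"standard error = {se:.2f}")
print(f"z = {z:.2f}")
print(f"P(X-bar < 964) = {prob:.3f}")
```

The ratio $\sigma / \sigma_{\bar{X}} = \sqrt{50} \approx 7.07$ shows how much tighter the sampling distribution is than the population.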

📝 Practice Problems

Test your understanding.

1. A population has $\mu = 500$ and $\sigma = 80$. If you take a sample of $n = 64$, what is the standard error $\sigma_{\bar{X}}$?

2. A poll finds that $p = 0.45$ of voters support a measure. If $n = 200$, what is $\sigma_{\hat{p}}$?

3. Can you use the Normal approximation for $\hat{p}$ if $p = 0.02$ and $n = 100$?

4. Household income has $\mu = \$50{,}000$ and $\sigma = \$12{,}000$. For a sample of $n = 36$, find $P(\bar{X} > 53{,}000)$.

5. In a city, 60% of residents recycle. A sample of $n = 150$ is taken. Find $P(\hat{p} < 0.55)$.

6. Which is larger: the standard error when $n = 25$ or when $n = 100$? Why?
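One way to check your own work on the numerical problems is to script the same formulas used in the chapter. The sketch below is a self-check aid rather than an official answer key; the variable names are illustrative, and the probabilities come from software rather than a Z-table, so they may differ slightly from table lookups.

```python
from math import sqrt
from statistics import NormalDist

Z = NormalDist()  # standard Normal

# Problem 1: standard error of the mean
se1 = 80 / sqrt(64)

# Problem 2: standard error of the proportion
se2 = sqrt(0.45 * 0.55 / 200)

# Problem 3: normality check, both np and n(1-p) must be at least 5
ok3 = 100 * 0.02 >= 5 and 100 * 0.98 >= 5

# Problem 4: right tail for a sample mean
z4 = (53_000 - 50_000) / (12_000 / sqrt(36))
p4 = 1 - Z.cdf(z4)

# Problem 5: left tail for a sample proportion
z5 = (0.55 - 0.60) / sqrt(0.60 * 0.40 / 150)
p5 = Z.cdf(z5)

print(f"1. SE = {se1:.2f}")
print(f"2. SE = {se2:.4f}")
print(f"3. Normal approximation OK? {ok3}")
print(f"4. z = {z4:.2f}, P = {p4:.4f}")
print(f"5. z = {z5:.2f}, P = {p5:.4f}")
```

For Problem 6, compare $\sigma/\sqrt{25}$ with $\sigma/\sqrt{100}$: the smaller sample has the larger standard error.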
