p sample, july 10

rob mahurin

date	page	question	duration	ans	result	thoughts
2025-07-10 Thu 11:24	32	71	14	a	E	double normal? also, insufficiently gay
		72	7	b?	D	my answer not an option
	202	492	3	a	C	ARGH, $F$ vs. $S$
		493	4	b	✅
	75	175	5	d	✅	$F$ vs. $S$
		176	7	c	✅	fail sanity check
	163	397	5	b	✅
		398	11	a	E	guess close

Dang, that's rough.

Almost three hours on these problems today, though I took some breaks.

71. Pensions for married policewomen.
- The solution manual's approach is better.
72. The mean of binned data.
- The solution manual
492. Machine survival time.
398. Conditional homeowner probability
- The solution manual is so annoying.

71. Pensions for married policewomen.

A city has just added 100 new female recruits to its police force. The city will provide a pension to each new hire who remains with the force until retirement. In addition, if the new hire is married at the time of her retirement, a second pension will be provided for her husband. A consulting actuary makes the following assumptions:

Each new recruit has a 0.4 probability of remaining with the police force until retirement.

Given that a new recruit reaches retirement with the police force, the probability that she is not married at the time of retirement is 0.25.

The events of different new hires reaching retirement and the events of different new hires being married at retirement are all mutually independent events.

Calculate the probability that the city will provide at most 90 pensions to the 100 new hires and their husbands.

It was straightforward enough to calculate the expectation value of seventy pensions: one hundred officers, of whom forty retire, of whom thirty are married. But the variances confused me. I think the right distribution is binomial, with $p=0.40$, so the number of actual retirees should be distributed like

$$\begin{aligned} P(k) &= {100\choose k} p_\text{retire}^k q_\text{retire}^{100-k} \end{aligned}$$

This has mean $np$ and variance $npq$ — which I am just realizing now I took as a standard deviation of $n\sqrt{pq}$, too large by a factor of ten. Oops.

Maybe now my central-limit-theorem approach will work. We should have $40 \pm 4.9$ retirees. The number of husbands¹, given forty retirees, will be distributed like

$$\begin{aligned} P(h) &= {40 \choose h} p_\text{spouse}^h q_\text{spouse}^{40-h} \end{aligned}$$

with expectation value $30\pm 2.74$. Using the Gaussian approximation, the sum is $70 \pm 5.61$ pensions.

That should never go above ninety pensions.

Which is of course the question: "at most 90" pensions. The $z$-score is $\frac{20}{5.61} = 3.57$, so I should have picked 99% probability, perhaps without even working the problem so hard.
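A sketch of why simply adding the two binomial variances undercounts the spread: the number of husbands depends on the number of retirees, so the two counts are correlated. Here $R$ (retirees) and $H$ (husbands) are labels introduced only for this sketch, and the decomposition used is the law of total variance.

$$\begin{aligned} V(R) &= 100\times0.4\times0.6 = 24 \\ V(H) &= E\big(V(H|R)\big) + V\big(E(H|R)\big) \\ &= 40\times0.75\times0.25 + 0.75^2\times 24 = 7.5 + 13.5 = 21 \\ \text{Cov}(R,H) &= 0.75\, V(R) = 18 \\ V(R+H) &= V(R) + V(H) + 2\,\text{Cov}(R,H) \\ &= 24 + 21 + 36 = 81 \end{aligned}$$

That is a standard deviation of $\sqrt{81} = 9$ pensions rather than $5.61$.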

The solution manual's approach is better.

For each recruit, the probabilities of pensions are

number	probability
0	$0.6$
1	$0.4\times0.25 = 0.1$
2	$0.4\times0.75 = 0.3$

First moment, second moment, variance:

$$\begin{aligned} E(P) &= 0\times0.6 + 1\times 0.1 + 2\times 0.3 &&= 0.7 \\ E(P^2) &= 0^2\times0.6 + 1^2 \times 0.1 + 2^2 \times 0.3 &&= 1.3 \\ V(P) &= E(P^2) - (E(P))^2 \\ &= 1.3 - 0.49 &&= 0.81 \end{aligned}$$

So now we should have $100\times 0.7 = 70$ pensions, with a standard deviation of $\sqrt{100\times 0.81} = 9$. That means having at most 90 pensions has $z$-score $z= \frac{20}{9} = 2.22$, which has probability 0.9868.

The solution manual uses $z=\frac{20.5}{9}$, including a "continuity correction."
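The per-recruit calculation is easy to check numerically. This is a minimal sketch, not the manual's method verbatim: the helper `phi` and the variable names are my own, and the normal c.d.f. is built from `math.erf` instead of a printed table.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal c.d.f., written with the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

# Pensions paid per recruit and their probabilities, from the table above.
pension_pmf = {0: 0.6, 1: 0.4 * 0.25, 2: 0.4 * 0.75}

# First moment and variance for a single recruit.
mean_one = sum(k * p for k, p in pension_pmf.items())
var_one = sum(k**2 * p for k, p in pension_pmf.items()) - mean_one**2

# Sum over 100 independent recruits: means add, variances add.
n_recruits = 100
mean_total = n_recruits * mean_one
sd_total = sqrt(n_recruits * var_one)

# "At most 90" without and with the continuity correction.
z_plain = (90 - mean_total) / sd_total
z_corrected = (90.5 - mean_total) / sd_total

print(f"mean = {mean_total:.1f}, sd = {sd_total:.2f}")
print(f"no correction:   z = {z_plain:.2f}, P = {phi(z_plain):.4f}")
print(f"with correction: z = {z_corrected:.2f}, P = {phi(z_corrected):.4f}")
```

Either way the probability rounds to 0.99, so the continuity correction does not change which answer to pick here.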

72. The mean of binned data.

In an analysis of healthcare data, ages have been rounded to the nearest multiple of 5 years. The difference between the true age and the rounded age is assumed to be uniformly distributed on the interval from −2.5 years to 2.5 years. The healthcare data are based on a random sample of 48 people.

Calculate the approximate probability that the mean of the rounded ages is within 0.25 years of the mean of the true ages.

I think perhaps I made the same squaring-the-variance mistake here. Each uniformly-distributed estimate has variance $\frac{5}{12}$ and standard deviation $\sqrt{\frac{5}{12}}$. The sum should have variance $n\frac{5}{12}$ and standard deviation $\sqrt{n\frac{5}{12}}$, because we are adding independent random variables. The mean should then have standard deviation $\sqrt{\frac1n\frac{5}{12}}$, because we're just dividing by the constant. So the uncertainty on the mean is

$$\begin{aligned} \sigma_\text{mean} &= \sqrt{\frac{5}{48\times12}} &= 0.093 \end{aligned}$$

Being farther than 0.25 years above the true mean age would be a $z$-score of $z=\frac{0.25}{0.093} = 2.68$, with probability $1-0.9963 = 0.0037.$ You have the same odds of being too low, for a total probability of 0.9926 of being in the desired range.

That feels way too high, and it is. Let's see the solution manual's approach.

The solution manual

The variance is $\frac{5^2}{12}$. Oops. Units would have helped me here.

That gives $\sigma_\text{mean} = \sqrt{\frac{25}{48\times12}} = 0.208$, and a $z$-score of $z=\frac{0.25}{0.208} = 1.2$. The cumulative probability at $z=1.2$ is $0.8849$, which leaves $1-0.8849 = 0.1151$ in each tail, so the probability of being symmetrically within the interval is $1-2\times 0.1151 = 0.7698$.

That's the right answer.
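To see how much the missing square matters, here is a small sketch that runs the same normal approximation with both variances. The function name `prob_mean_within` and its parameters are hypothetical, chosen for this note, and `phi` is again built from `math.erf`.

```python
from math import erf, sqrt

def phi(z):
    """Standard normal c.d.f. via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def prob_mean_within(var_one, n_people=48, tolerance=0.25):
    """Normal-approximation probability that the mean of n_people
    independent rounding errors, each with variance var_one, lies
    within tolerance of zero."""
    sd_mean = sqrt(var_one / n_people)
    return 2.0 * phi(tolerance / sd_mean) - 1.0

width = 5.0  # years: full width of the rounding interval, -2.5 to 2.5

p_wrong = prob_mean_within(width / 12)     # the slip: width not squared
p_right = prob_mean_within(width**2 / 12)  # variance of a uniform distribution

print(f"variance 5/12:  P = {p_wrong:.4f}")
print(f"variance 25/12: P = {p_right:.4f}")
```

Carrying units would catch the slip immediately: a variance must be in years squared, and $\frac{5}{12}$ is only in years.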

492. Machine survival time.

A machine's lifetime $X$, in years, is modeled by an exponential distribution. The probability that the machine still functions after one year is $0.80$.

$F$ is the cumulative distribution function for $X$.

Determine $F(x)$ for $x ≥ 0$.

I got thrown here because the answers contained $0.8^{+x}$ and $e^{-0.8x}$. I figured out it was $0.8^{+x}$, with the negative sign coming from $\ln 0.80$, but then I mixed up the c.d.f. with the survival function.

Annoying and stupid.
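For the record, a short derivation that keeps the two functions apart. The rate $\lambda$ and the survival function $S$ are symbols introduced here; the problem statement names only $F$.

$$\begin{aligned} S(x) &= P(X > x) = e^{-\lambda x} \\ S(1) &= e^{-\lambda} = 0.80 \quad\Rightarrow\quad \lambda = -\ln 0.80 \\ S(x) &= e^{x \ln 0.80} = 0.8^{x} \\ F(x) &= 1 - S(x) = 1 - 0.8^{x} \end{aligned}$$

A quick sanity check: $F(0) = 0$ and $F(1) = 0.20$, the probability that the machine has failed within one year.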

398. Conditional homeowner probability

Conditional probabilities! The bane of my existence.

Ten percent of homeowners in a certain city are classified as high-risk, and ninety percent are classified as low-risk. Each homeowner’s classification remains unchanged over the next four years.

In any given year, each high-risk homeowner has probability 0.80 of experiencing no fires, and each low-risk homeowner has probability 0.99 of experiencing no fires. For each homeowner, the numbers of fires in different years are mutually independent.

A randomly chosen homeowner experiences no fires in the first and second years.

Calculate the probability that this homeowner will experience no fires in the third and fourth years.

Here I think I needed to figure out the probability that the homeowner was in each risk group. I made a table:

years without fire	high	low	p(no fires)
1	0.80	0.99	0.971
2	0.64	0.98	0.946

This is assuming that the second year is independent of the first. Now I need to find the probability that we're in each group:

$$\begin{aligned} P(\text{high} | \text{no fires}) &= \frac{ P(\text{no fires} | \text{high}) P(\text{high})} { P(\text{no fires})} \\&= \frac{0.64\times0.10}{0.946} = 0.0676^* \\ P(\text{low} | \text{no fires}) &= \frac{0.98\times0.90}{0.946} = 0.9323 \end{aligned}$$

Note$^*$ that I accidentally used 0.80 for one year instead of 0.64 for two years, but I don't think this was my problem. Now I have

$$\begin{aligned} P(\text{no more fires}) &= P(\text{no more fires} | \text{high} ) P(\text{high}) + P(\text{no more fires} | \text{low} ) P(\text{low}) \\&= 0.98\times 0.9323 + 0.64\times 0.0676 \\&= 0.9569 \end{aligned}$$

That makes a bigger difference than I thought, but it still lands between two answers, 0.9548 and 0.9571. Let's see what I should have done.

The solution manual is so annoying.

Make the following definitions:

$$\begin{aligned} X &= \text{no fires first two years} \\ Y &= \text{no fires following two years} \\ H &= \text{homeowner is high risk} & P(H)&= 0.10 \\ L &= \text{homeowner is low risk} & P(L) &= 0.90\\ \end{aligned}$$

The table I made says $P(X|H) = 0.80^2$ and $P(X|L) = 0.99^2$.

What I didn't do was

$$\begin{aligned} P(X\land Y | H) &= 0.80^4 & P(X\land Y|L) &= 0.99^4 \end{aligned}$$

So now I want the probability of $Y$ given $X$:

$$\begin{aligned} P(Y | X) &= \frac{P(X\land Y)}{P(X)} \\&= \frac{ P(X\land Y|H) P(H) + P(X\land Y|L) P(L) }{ P(X|H)P(H) + P(X|L)P(L) } \\&= \frac{ 0.80^4 \times 0.10 + 0.99^4 \times 0.90 } { 0.80^2 \times 0.10 + 0.99^2 \times 0.90 } \\&= \frac{0.9055}{0.9461} = 0.9571 \end{aligned}$$
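A sketch comparing the two routes numerically: updating the risk-group probabilities first and then predicting, versus taking the manual's ratio directly. The dictionary and function names are my own. With unrounded inputs the two routes should agree, which suggests that the gap in my hand calculation came from rounding $0.99^2$ to $0.98$, not from the method.

```python
# Prior probabilities and one-year no-fire probabilities from the problem.
prior = {"high": 0.10, "low": 0.90}
no_fire_one_year = {"high": 0.80, "low": 0.99}

def p_no_fire(group, years):
    """Probability of no fires in `years` independent years for one risk group."""
    return no_fire_one_year[group] ** years

# Route 1: Bayes update on two fire-free years, then predict two more.
p_x = sum(p_no_fire(g, 2) * prior[g] for g in prior)
posterior = {g: p_no_fire(g, 2) * prior[g] / p_x for g in prior}
p_posterior_route = sum(p_no_fire(g, 2) * posterior[g] for g in prior)

# Route 2: the manual's ratio P(X and Y) / P(X).
p_xy = sum(p_no_fire(g, 4) * prior[g] for g in prior)
p_ratio_route = p_xy / p_x

# Route 1 again, rounding 0.99^2 to 0.98 as in the hand-made table.
rounded_two_year = {"high": 0.64, "low": 0.98}
p_x_rounded = sum(rounded_two_year[g] * prior[g] for g in prior)
p_rounded_route = sum(
    rounded_two_year[g] * rounded_two_year[g] * prior[g] / p_x_rounded
    for g in prior
)

print(f"posterior route: {p_posterior_route:.4f}")
print(f"ratio route:     {p_ratio_route:.4f}")
print(f"rounded inputs:  {p_rounded_route:.4f}")
```

The two exact routes are the same expression rearranged, since the posterior weights already carry the factor $\frac{1}{P(X)}$.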

¹ This should be "spouses," not "husbands." A pre-Obergefell problem.