Hassett, chapter 4: Discrete random variables

$ %%%%%%%%%%%%%%%% Helpful KaTeX macros. \gdef\bar#1{\overline{ #1} } \gdef\and{\cap} \gdef\or{\cup} \gdef\set#1{\left\{ #1\right\} } \gdef\mean#1{\left\langle#1\right\rangle} $

4.1 Random variables

Definition 4.1
A "random variable" is a numerical quantity that depends on chance.

Examples and counterexamples: the number of heads in a series of coin tosses is a random variable. But a sequence of outcomes like $HTHHT$ isn't numerical, so it isn't a random variable.

Continuous vs. discrete: this chapter is discrete.

Definition 4.1a
A random variable is a function mapping a sample space to the real numbers.

Two-coin example $$ \begin{aligned} HH &\to 2 \\ TH &\to 1 \\ HT &\to 1 \\ TT &\to 0 \end{aligned} $$
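This map is easy to spell out in code. A quick sketch of my own (not from the book): build the same outcome-to-number function and tally the induced probabilities.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All outcomes of two fair coin tosses.
outcomes = ["".join(t) for t in product("HT", repeat=2)]  # HH, HT, TH, TT

# The random variable X maps each outcome to its number of heads.
X = {omega: omega.count("H") for omega in outcomes}

# Each outcome has probability 1/4; group them by the value of X.
p = Counter()
for omega, x in X.items():
    p[x] += Fraction(1, 4)

# X = 0, 1, 2 get probability 1/4, 1/2, 1/4 respectively.
print(dict(sorted(p.items())))
```

The tally `p` is exactly the probability function of the next section.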

A capitalization convention: * $X$ is the "entire" random variable, which can take on any value * $x$ is a specific result

Consider "let $X$ be the number of heads in the first two coin tosses"; the event that this comes out to a particular value $x$ is abbreviated $X=x$.

4.2 The probability function of a discrete random variable

Definition 4.2

Let $X$ be a discrete random variable. A "probability function" for $X$ is a function $p(x)$ which assigns a probability to each value of the random variable, so that

  1. $p(x) \geq 0$ for all $x$, and
  2. $\sum p(x) = 1$, so there isn't any missing probability.

Also known as the "probability mass function" or the "probability density function."

The probability function can be given as a table or as a formula.

Example 4.7: a slot machine

Probability of winning on an individual play is $0.05$. Let $X$ be the number of unsuccessful attempts before the first win. From independence, $$ p(k) = P(X=k) = 0.05 \times 0.95^k,\text{ for } k\in\set{0,1,2,\cdots}. $$
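A quick numerical sanity check of mine (not from the book): per Definition 4.2, the pmf should sum to one, and truncating the infinite sum far out loses essentially nothing.

```python
# Slot-machine pmf from Example 4.7: p(k) = 0.05 * 0.95**k.
def pmf(k):
    return 0.05 * 0.95**k

# The infinite sum is exactly 1; the truncated tail is only 0.95**2000.
total = sum(pmf(k) for k in range(2000))
assert abs(total - 1) < 1e-9
```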

4.3 Cumulative distribution function

It's $F(x) = P(X\leq x)$. For the slot machine, there's a geometric series: $$ \begin{aligned} F(x) &= \sum_{k=0}^x P(X=k) \\ &= 0.05\sum_{k=0}^x 0.95^k \\ &= 0.05\times\frac{1-0.95^{x+1}}{1-0.95} \\ &= 1-0.95^{x+1} \end{aligned} $$
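Checking that closed form against the direct partial sums (my sketch, nothing from the book):

```python
def pmf(k):
    return 0.05 * 0.95**k

def cdf_partial_sum(x):
    # The definition: F(x) = sum of p(k) for k = 0..x.
    return sum(pmf(k) for k in range(x + 1))

def cdf_closed(x):
    # The geometric-series closed form.
    return 1 - 0.95 ** (x + 1)

for x in (0, 1, 10, 100):
    assert abs(cdf_partial_sum(x) - cdf_closed(x)) < 1e-12
```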

4.4 Measuring central tendency; expected value

Mean

The mean is $\langle x\rangle = \sum x\cdot p(x)$; for a random variable's distribution this same quantity is called the expectation value, written $E(X) = \mu$.

For the slot machine, $$ \begin{aligned} E(X) &= \sum k\cdot p(k) = \sum k\times0.05\times{0.95^k} \\ &= 0.05\times\frac{0.95}{(1-0.95)^2} = \frac{0.95}{0.05} = 19, \end{aligned} $$ where the sum comes from some derivative-of-sum cuteness, $\sum k\,x^k = \frac{x}{(1-x)^2}$.
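A truncated numerical version of that sum, as a check of mine:

```python
# E(X) for the slot machine: sum of k * p * q**k, truncated far out.
p, q = 0.05, 0.95
mean = sum(k * p * q**k for k in range(5000))

# Closed form from sum k*x**k = x/(1-x)**2: E(X) = q/p = 19.
assert abs(mean - q / p) < 1e-6
```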

Note that for random variable $Y=aX + b$, with $a,b$ any constants, $E(Y) = a\cdot E(X) + b$.

Mode

It's the value $x$ for which $p(x)$ is the largest.

4.5 Variance and standard deviation

The "variance" is the expectation value of the square of the distance from a random $x$ to the mean: $$ \begin{aligned} V(X) &= E\left((X-\mu)^2\right) \\ &= \sum (x-\mu)^2 p(x) \end{aligned} $$ The "standard deviation" is the square root of the variance, $\sigma = \sqrt{V(X)}$.
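For the two-coin example above, the definition agrees with the standard shortcut $V(X) = E(X^2) - \mu^2$ (an identity the book uses again in problem 4-13 below); a small check of mine, in exact arithmetic:

```python
from fractions import Fraction

# Number-of-heads pmf for two fair coins.
pmf = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

mu = sum(x * p for x, p in pmf.items())                     # mean: 1
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E((X-mu)^2)
var_short = sum(x * x * p for x, p in pmf.items()) - mu**2  # E(X^2) - mu^2
assert var_def == var_short == Fraction(1, 2)
```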

Variance and standard deviation of $Y=aX$

This has to be $\sigma_Y = a\sigma_X$, by dimensional analysis. $$ \begin{aligned} V(aX) &= \mean{(aX-a\mu)^2} \\ &= a^2\mean{(X-\mu)^2} = a^2 V(X). \end{aligned} $$ Oops, it's $\sigma_Y = |a|\sigma_X$, because the standard deviation is always the positive square root.
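A one-line check of the $|a|$ point, using Python's statistics module on made-up data:

```python
import statistics

data = [0, 1, 1, 2]  # made-up sample (two-coin head counts, say)
a = -3
scaled = [a * x for x in data]

# Negative scale factor, but the spread still grows by |a| = 3.
assert abs(statistics.pstdev(scaled) - abs(a) * statistics.pstdev(data)) < 1e-12
```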

$z$-scores

The "$z$-score" is a name for what I have called the "significance": $$ z = \frac{x-\mu}{\sigma} $$

… and Chebyshev's theorem

For any random variable $X$, the probability that $X$ is within $k\sigma$ of the mean is no smaller than $1-\frac1{k^2}$: $$ \begin{aligned} P\big( \mu-k\sigma \leq X \leq \mu + k\sigma \big) \geq 1-\frac{1}{k^2}. \end{aligned} $$

This is apparently too conservative to be useful very often. For the normal distribution, $k=1$ is the 68% confidence limit, but Chebyshev's theorem gives only $P \geq 0$; the 95% confidence limit corresponds to $k=2$, but the theorem gives only $P \geq \frac34$.
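The bound is slack away from the normal case too. Here's a check of mine against the slot-machine distribution of Example 4.7, using the standard geometric-distribution mean and variance (mean $q/p$, variance $q/p^2$):

```python
# Exact probability vs. the Chebyshev bound for the slot machine.
p, q = 0.05, 0.95
mu = q / p            # mean of the failures-before-success geometric
sigma = q**0.5 / p    # its standard deviation, sqrt(q)/p

def pmf(k):
    return p * q**k

for k in (1, 2, 3):
    lo, hi = mu - k * sigma, mu + k * sigma
    actual = sum(pmf(j) for j in range(2000) if lo <= j <= hi)
    # The bound holds, with lots of room to spare.
    assert actual >= 1 - 1 / k**2
```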

4.6 Population and sample statistics

For an entire population, let $f(x)$ be the number of population members for which the random variable takes the value $X=x$, with $\sum f = n$. $$ \begin{aligned} \text{\emph{population} mean: }&& \mu &= \frac1n \sum f\cdot x \\ \text{standard deviation: } && \sigma &= \sqrt{ \frac1n \sum f\cdot (x-\mu)^2 } \end{aligned} $$ For a sample, however, the denominator of the standard deviation changes, and the deviations are measured from the sample mean: $$ \begin{aligned} \text{\emph{sample} mean: } && \bar x &= \frac1{n} \sum f\cdot x \\ \text{standard deviation: } && s &= \sqrt{ \frac1{n-1} \sum f\cdot (x-\bar x)^2 } \end{aligned} $$ These are estimators, not values. Note the change in symbols. Apparently these symbols are common on the "stats" menus of some calculators.
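Python's statistics module makes exactly this population/sample split, which gives a quick way to see the two denominators side by side (my example data):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up data with mean 5

# Population versions divide by n: sum of squared deviations is 32, n = 8.
assert statistics.pstdev(data) == 2.0           # sqrt(32/8)
# Sample versions divide by n - 1.
assert abs(statistics.stdev(data) - (32 / 7) ** 0.5) < 1e-12
```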

At some point I had understood the $\frac{1}{n-1}$ denominator in terms of degrees of freedom; on my back-burner might be some modeling to understand it better, or digging for a proof. (I know without checking that there's a proof on Wikipedia, and that it's unreadable.) Intuitively, a sample of $n=1$ doesn't give any information about the standard deviation, so $s\to\infty$ makes sense there.
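A quick version of that back-burner modeling (my sketch, assuming nothing beyond the definitions): draw many small samples from a die roll, whose true variance is $\frac{35}{12}\approx 2.92$, and average the two estimators.

```python
import random

random.seed(0)  # reproducible made-up experiment
true_var = sum((x - 3.5) ** 2 for x in range(1, 7)) / 6  # 35/12

n, trials = 5, 100_000
biased = unbiased = 0.0
for _ in range(trials):
    sample = [random.randint(1, 6) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)
    biased += ss / n            # divide by n: comes out systematically low
    unbiased += ss / (n - 1)    # divide by n-1: averages to true_var

biased /= trials
unbiased /= trials
```

The $\frac1n$ estimator averages to $\frac{n-1}{n}\sigma^2$ because the sample mean chases the data, shrinking the squared deviations; dividing by $n-1$ undoes exactly that.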

Problems

Strategerizing:

The problems are broken down like so:

  • 1–4: probability functions
  • 5–11: central tendencies and expected values
  • 12–15: variance and standard deviation
  • 16–17: population and sample statistics
  • 18–20: sample actuarial examination problems

This is miles different from Chapter 3, which had almost seventy problems.

problem 4-10: the number of rolls of a fair die.

Let $X$ be the random variable for the number of times a fair die is tossed before a six appears. Find $E(X)$.

We have probability density $p(k) = \frac16\left(\frac56\right)^k$ and cumulative distribution $$ \begin{aligned} F(k) &= \frac16\ \frac{1-(5/6)^{k+1}}{1-(5/6)} = 1-\left(\frac56\right)^{k+1}. \end{aligned} $$

The expectation value is $$ \begin{aligned} E(X) &= \sum k\cdot p(k) \\ &= \sum k\cdot\frac16\left(\frac56\right)^k \\ &= \frac16 \ \frac{1}{(1-(5/6))^2} = 6 \end{aligned} $$

But the key gives five, which is clearly correct: any particular face should, on average, first come up on the sixth roll, i.e. after five unsuccessful rolls.

Where is this off-by-one error? It's apparently in my geometric sum. I know by heart that $$ \begin{aligned} \sum_{k=0}^\infty x^k &= \frac{1}{1-x}. \end{aligned} $$ Take the derivative of both sides, $$ \begin{aligned} \frac{\mathrm d}{\mathrm dx}\sum_{k=0}^\infty x^k &= \frac{\mathrm d}{\mathrm dx}\frac{1}{1-x} \\ \sum_{k=1}^\infty k\,x^{k-1} &= \frac{1}{(1-x)^2}. \end{aligned} $$ Notice that the $k=0$ term was killed from the sum by the derivative.

I was treating $\sum_{k=1}^\infty k\,x^{k}$ as if it were $\frac{1}{(1-x)^2} = \sum_{j=0}^\infty (j+1)\,x^{j}$, which is too big by $\sum_j x^j$; with the $\frac16$ prefactor included, that excess is the sum of all the probabilities (which is one, of course).
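With the bookkeeping fixed, the numbers line up (a quick check of mine):

```python
# Problem 4-10 done right: p(k) = (1/6)(5/6)**k, and
# sum k*x**k = x/(1-x)**2 gives E(X) = (1/6)(5/6)/(1/6)**2 = 5.
p, q = 1 / 6, 5 / 6
closed = p * q / (1 - q) ** 2
truncated = sum(k * p * q**k for k in range(5000))

assert abs(closed - 5) < 1e-9
assert abs(truncated - 5) < 1e-6
```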

problem 4-13: An insurance policy.

From a previous problem, 4-7:

One unit of insurance pays \$1k for an injury and \$10k for a death. In a year, 7.3% of workers are injured and 0.41% are killed. What is the expected unit claim amount (pure premium) for this insurance? If a company has 10,000 employees and exactly 7.3% are injured and exactly 0.41% are killed, what is the average cost per unit of the insurance claims?

The expectation value of the claim amount is $$ \rm \$1k\times 0.073 + \$10k \times 0.0041 = \$114. $$

If 730 employees are injured and 41 are killed (wow, talk about high risk), there are \$730,000 in injury payments and \$410,000 in death payments, spread over 10,000 policies: also \$114 per policy.

Now we have

For this policy, what's the standard deviation for the claim amount of five units of insurance?

There's a long note in the book about remembering to include the people who don't make claims, which biased me to expect a mistake that comes out too low.

The mean-square expectation value is $$ \begin{aligned} \mean{\text{payment}^2} &= {(\$1000)^2\times0.073 + (\$10,000)^2\times0.0041} \\&= (\$)^2 483\,000 \end{aligned} $$ giving variance and standard deviation $$ \begin{aligned} \sigma^2 &= \mean{\text{payment}^2} - \mean{\text{payment}}^2 \\ &= (\$)^2 470\,004 \\ \sigma &= \$685.57 \end{aligned} $$
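The whole computation in a few lines (my check of the arithmetic above):

```python
# Claim amount per unit: $1000 w.p. 0.073, $10,000 w.p. 0.0041,
# and $0 otherwise -- the easy-to-forget no-claim case.
pmf = {1_000: 0.073, 10_000: 0.0041}
pmf[0] = 1 - sum(pmf.values())

mean = sum(x * p for x, p in pmf.items())         # $114
mean_sq = sum(x * x * p for x, p in pmf.items())  # 483,000 $^2
sigma = (mean_sq - mean**2) ** 0.5                # ~$685.57

assert abs(mean - 114) < 1e-9
assert abs(sigma - 685.57) < 0.01
assert round(5 * sigma) == 3428                   # five units of insurance
```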

I did this part just fine; I only missed that it was five units of insurance instead of one, so the answer in the book is $5\times\$685.57 = \$3428$.