Hassett, chapter 11: Applying multivariate distributions

$ %%%%%%%%%%%%%%%% Helpful KaTeX macros. \gdef\bar#1{\overline{ #1} } \gdef\and{\cap} \gdef\or{\cup} \gdef\set#1{\left\{ #1\right\} } \gdef\mean#1{\left\langle#1\right\rangle} $

11.1 Distributions of functions of two random variables

Consider the random variable functions

$$ X+Y \quad X-Y \quad \min(X,Y) \quad \max(X,Y) $$

For sums, we're basically adding up all of the probabilities

$$\begin{aligned} p_S(s) &= \sum_x p(x, s-x) \\ &= \sum_x p_X(x) \cdot p(s-x | x) \\ &= \sum_x p_X(x) \cdot p_Y(s-x) \quad \text{(if independent)} \end{aligned}$$
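As a sanity check of the discrete convolution formula, here's a minimal sketch (my own toy example with two fair dice, not from the text):

```python
# Convolve the pmfs of two independent fair dice to get the pmf of their sum.
from collections import defaultdict

p_X = {x: 1/6 for x in range(1, 7)}  # pmf of one die
p_Y = {y: 1/6 for y in range(1, 7)}  # pmf of the other die

p_S = defaultdict(float)
for x, px in p_X.items():
    for y, py in p_Y.items():
        p_S[x + y] += px * py   # p_S(s) = sum_x p_X(x) * p_Y(s - x)

print(p_S[7])             # 6/36 ≈ 0.1667, the most likely total
print(sum(p_S.values()))  # ≈ 1.0, so p_S is a valid pmf
```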

For independent continuous variables, likewise

$$\begin{aligned} f_S(s) &= \int_{-\infty}^{\infty} f_X(x)\cdot f_Y(s-x) \mathrm dx \end{aligned}$$

Sums of exponential variables are gamma-distributed

Let's look at the waiting times between accidents in two towns. The joint p.d.f. and the marginal density functions are

$$\begin{aligned} f(x,y) &= e^{-x-y} \\ f_X(x) &= e^{-x} \\ f_Y(y) &= e^{-y} \\ \end{aligned}$$

Let's show (as was done in the previous chapter, but I skipped it) that $X$ and $Y$ are independent. We have the marginal probability

$$\begin{aligned} f_Y(y) &= \int_0^\infty \mathrm dx\ e^{-x-y} \\ &= e^{-y} \cdot \left(1-0\right) \text{, as assumed above} \end{aligned}$$

and the conditional probability

$$\begin{aligned} f(x|y) &= \frac{f(x,y)}{f_Y(y)} \\ &= \frac{e^{-x-y}}{e^{-y}} = f_X(x) \end{aligned}$$

and vice-versa. So they're independent. Now we want to find the density function for $X+Y$.

$$\begin{aligned} f_S(s) &= \int_0^\infty \mathrm dx\ f_X(x) \cdot f_Y(s-x) \\&= \int_0^\infty \mathrm dx\ e^{-x} \cdot e^{-(s-x)} \\&= \int_0^\infty \mathrm dx\ e^{-s} \qquad ??? \end{aligned}$$

Oh, this is subtle: because the probability density for a negative waiting time is zero, I should have changed the limits of the integral. Let's write that again, more clearly.

$$\begin{aligned} f(x,y) &= \begin{cases} e^{-x-y} & x \geq 0 \text{ and } y \geq 0 \\ 0 & x<0 \text{ or } y < 0 \\ \end{cases} \\ f_X(x) &= \begin{cases} e^{-x} & x \geq 0 \\ 0 & x < 0 \end{cases} \\ f_Y(y) &= \begin{cases} e^{-y} & y \geq 0 \\ 0 & y < 0 \end{cases} \end{aligned}$$

So now the density for the sum $S = X+Y$ is

$$\begin{aligned} f_S(s) &= \int_0^\infty \mathrm dx\ f_X(x) \cdot f_Y(s-x) \\&= \int_0^s \mathrm dx\ e^{-x} \cdot e^{-(s-x)} + \int_s^\infty \mathrm dx\ e^{-x} \cdot 0 \\&= e^{-s} \int_0^s \mathrm dx\ 1 \\&= s e^{-s} \end{aligned}$$

Because $X$ and $Y$ were exponentially distributed with $\beta=1$, their sum $X+Y$ should be gamma-distributed with $(\alpha,\beta) = (2,1)$. Remember that the gamma distribution has density

$$\begin{aligned} f_\Gamma(x) &= \frac{\beta^\alpha}{\Gamma(\alpha)} x^{\alpha-1} e^{-\beta x}. \end{aligned}$$

I can sort of see how to get this relation in general from induction, but I shouldn't spend the time on it.
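A quick Monte Carlo sketch of the $s e^{-s}$ result (the sample size and use of numpy are my own choices, not from the text): histogram the sum of two independent $\mathrm{Exp}(1)$ samples and compare to the gamma $(\alpha,\beta)=(2,1)$ density.

```python
# Compare a histogram of X+Y (X, Y ~ Exp(1), independent) to the density s*exp(-s).
import numpy as np

rng = np.random.default_rng(0)
s = rng.exponential(scale=1.0, size=100_000) + rng.exponential(scale=1.0, size=100_000)

hist, edges = np.histogram(s, bins=50, range=(0, 10), density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
gamma_pdf = centers * np.exp(-centers)     # gamma(alpha=2, beta=1) density

print(np.max(np.abs(hist - gamma_pdf)))    # small, so histogram and density agree
```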

Minimum of two random variables

Now let's look at $\min(X,Y)$ for exponential distributions. Remember the cumulative and survival functions

$$\begin{aligned} F(t) &= P(X \leq t) = 1-e^{-\beta t} \\ S(t) &= P(X > t) = e^{-\beta t} \end{aligned}$$

Suppose $X$ and $Y$ are exponentially distributed with parameters $\beta,\lambda$. The probability that $\min(X,Y)$ is at least some time $t$ is the probability that both $X$ and $Y$ have survived through $t$. That is

$$\begin{aligned} S(t) &= P(\min(X,Y) > t) \\ &= P(X > t \and Y > t) \\ &= P(X > t) \cdot P(Y > t), \text{ by independence} \\ &= S_X(t) \cdot S_Y(t) \\ &= e^{-\beta t}e^{-\lambda t} = e^{-(\beta+\lambda)t} \end{aligned}$$

This survival function means that the minimum is exponentially distributed with parameter $\beta+\lambda$.
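A simulation sketch of that claim, with $\beta=2$ and $\lambda=3$ chosen arbitrarily (note that numpy's `exponential` takes the scale $1/\beta$, not the rate):

```python
# Empirical survival of min(X, Y) for X ~ Exp(beta=2), Y ~ Exp(lambda=3),
# compared to the claimed exp(-(beta + lambda) * t).
import numpy as np

rng = np.random.default_rng(1)
beta, lam = 2.0, 3.0
x = rng.exponential(scale=1/beta, size=200_000)
y = rng.exponential(scale=1/lam, size=200_000)
m = np.minimum(x, y)

for t in (0.1, 0.3, 0.5):
    empirical = np.mean(m > t)
    predicted = np.exp(-(beta + lam) * t)
    print(t, empirical, predicted)   # the two columns agree to about three decimals
```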

This procedure works for the minimum of any two independent random variables. The survival function obeys

$$\begin{aligned} S_\text{min}(t) & = S_X(t) \cdot S_Y(t) \end{aligned}$$

because both independent variables must survive. The survival function for the minimum might correspond to a known distribution. Likewise, the distribution for the maximum of two variables can be found from the product of the c.d.f.s, since both variables must fall at or below $t$:

$$\begin{aligned} F_\text{max}(t) &= F_X(t) \cdot F_Y(t) \end{aligned}$$
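And the same kind of check for the maximum, again with arbitrary rates (a sketch, not from the text):

```python
# Check F_max(t) = F_X(t) * F_Y(t) for two independent exponentials.
import numpy as np

rng = np.random.default_rng(2)
beta, lam = 2.0, 3.0
x = rng.exponential(scale=1/beta, size=200_000)
y = rng.exponential(scale=1/lam, size=200_000)
big = np.maximum(x, y)

for t in (0.2, 0.5, 1.0):
    empirical = np.mean(big <= t)
    predicted = (1 - np.exp(-beta * t)) * (1 - np.exp(-lam * t))
    print(t, empirical, predicted)   # agree to within sampling error
```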

11.2 Expected values of functions of random variables

We don't need the distribution of $g(X,Y)$ itself to find its expectation value, because we can compute it directly:

$$\begin{aligned} E[ g(X,Y) ] &= \sum_{x,y} g(x,y) \cdot p(x,y) \end{aligned}$$

or its continuous equivalent.
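For instance, with a small made-up joint p.m.f. (my own numbers), the double sum is just a loop over the table:

```python
# E[g(X, Y)] computed directly from a toy joint pmf; no distribution of g(X, Y) needed.
p = {                         # p[(x, y)] = P(X = x, Y = y), a made-up table
    (0, 0): 0.2, (0, 1): 0.3,
    (1, 0): 0.1, (1, 1): 0.4,
}

def g(x, y):                  # g(X, Y) = XY as the example function
    return x * y

expectation = sum(g(x, y) * pxy for (x, y), pxy in p.items())
print(expectation)            # 0.4
```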

Because expectation values are linear,

$$\begin{aligned} E(X+Y) &= E(X) + E(Y) \end{aligned}$$

Products don't factor like that unless $X$ and $Y$ are independent, in which case $E(XY) = E(X)\cdot E(Y)$.

Covariance

The covariance is

$$\begin{aligned} \text{Cov} (X,Y) &= E[ (X-\mu_X) \cdot (Y-\mu_Y) ] \end{aligned}$$

with positive and negative associations/correlations having the usual meaning.

An alternative definition is

$$\begin{aligned} \text{Cov}(X,Y) &= E(XY) - E(X)\cdot E(Y) \end{aligned}$$

This formulation makes it clear (based on the statements above) that, if $X$ and $Y$ are independent, their covariance will be zero.

The variance of the sum depends on the covariance:

$$\begin{aligned} V(X+Y) &= V(X) + V(Y) + 2\cdot\text{Cov}(X,Y) \end{aligned}$$
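A numerical check of this identity on artificially correlated samples (the construction of $Y$ from $X$ is my own choice); it holds exactly for sample variances and covariances too, since both are bilinear:

```python
# Numerical check of V(X + Y) = V(X) + V(Y) + 2 Cov(X, Y) on correlated samples.
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=100_000)
y = 0.5 * x + rng.normal(size=100_000)   # correlated with x by construction

c = np.cov(x, y)                         # 2x2 sample covariance matrix
lhs = np.var(x + y, ddof=1)
rhs = c[0, 0] + c[1, 1] + 2 * c[0, 1]
print(lhs, rhs)                          # identical up to floating-point rounding
```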

Some properties:

  • The covariance is symmetric: $\text{Cov}(X,Y) = \text{Cov}(Y,X)$.
  • The covariance of a random variable with itself is its variance.
  • The covariance of a random variable with a constant is zero.
  • Scaling either random variable scales the covariance: $$ \text{Cov}(aX,bY) = ab\cdot\text{Cov}(X,Y).$$
  • Covariances are distributive: $$\text{Cov}(X,Y+Z) = \text{Cov}(X,Y) + \text{Cov}(X,Z).$$

Correlation coefficients

The correlation coefficient is

$$\begin{aligned} \rho_{XY} &= \frac{\text{Cov}(X,Y)}{\sigma_X\sigma_Y} \\&= \frac{\text{Cov}(X,Y)}{\sqrt{V(X)\cdot V(Y)}} \end{aligned}$$

Variables that are linearly related have a correlation coefficient of $\pm 1$:

$$\begin{aligned} \rho_{XY} &= \frac{\text{Cov}(X, aX+b)}{\sigma_X\sigma_{aX+b}} \\&= \frac{\text{Cov}(X,aX) + \text{Cov}(X,b)}{\sigma_X\left(|a|\,\sigma_X\right)} \\&= \frac{a\cdot V(X) + 0}{|a| \sigma_X^2} \\&= \pm 1, \text{ depending on the sign of } a. \end{aligned}$$
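A quick numerical illustration, with arbitrary values of $a$ and $b$ (my own choices):

```python
# Correlation coefficient of X with a*X + b is +1 or -1, matching the sign of a.
import numpy as np

rng = np.random.default_rng(4)
x = rng.normal(size=100_000)

for a in (2.0, -3.0):
    y = a * x + 5.0
    rho = np.corrcoef(x, y)[0, 1]
    print(a, rho)   # 1.0 for a > 0 and -1.0 for a < 0, up to float rounding
```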

(exclude) Bivariate normal distribution

A pair of correlated normal variables can have the joint (bivariate normal) density

$$\begin{aligned} f(x,y) &= \frac{1}{2\pi\sigma_x\sigma_y\sqrt{1-\rho^2}} e^{ \frac{-1}{2(1-\rho^2)} \left[ \left( \frac{x-\mu_x}{\sigma_x}\right)^2 - 2\rho \left( \frac{x-\mu_x}{\sigma_x}\right) \left( \frac{y-\mu_y}{\sigma_y}\right) + \left( \frac{y-\mu_y}{\sigma_y}\right)^2 \right] } \end{aligned}$$

11.3 (exclude) Moment generating functions for sums of independent random variables

I don't care about moment-generating functions.

11.4 The sum of more than two random variables

Sums of different distributions

Poisson $\to$ Poisson

If $X_1,X_2,\cdots,X_n$ are independent Poisson random variables with parameters $\lambda_1,\lambda_2,\cdots,\lambda_n$, then their sum $\sum_i X_i$ is Poisson distributed with parameter $\sum_i\lambda_i$.
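A Monte Carlo sketch of the Poisson case, with $\lambda_1=1.5$ and $\lambda_2=2.5$ chosen arbitrarily (not from the text):

```python
# Monte Carlo check: Poisson(1.5) + Poisson(2.5) should look like Poisson(4.0).
import math
import numpy as np

rng = np.random.default_rng(5)
s = rng.poisson(1.5, size=200_000) + rng.poisson(2.5, size=200_000)

for k in (2, 4, 6):
    empirical = np.mean(s == k)
    predicted = math.exp(-4.0) * 4.0**k / math.factorial(k)
    print(k, empirical, predicted)   # agree to within sampling error
```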

Geometric $\to$ negative binomial

The sum of $n$ i.i.d. geometric random variables with success probability $p$ is a negative binomial random variable with the same $p$ and $r=n$.

Normal $\to$ normal

If the $X_i$ are independent normal variables with means $\mu_i$ and variances $\sigma_i^2$, then the sum $\sum_i X_i$ has mean $\sum_i\mu_i$ and variance $\sum_i \sigma_i ^2$.

Exponential $\to$ gamma

If the $X_i$ are i.i.d. exponential random variables with parameter $\beta$, their sum $\sum_i X_i$ is a gamma random variable with $\alpha=n$ and the same $\beta$.

Mean and variance of multiple sums

In a triple sum, the pairwise covariances all enter the variance twice.

$$\begin{aligned} E(X+Y+Z) &= E(X) + E(Y) + E(Z) \\ V(X+Y+Z) &= V(X) + V(Y) + V(Z) \\& \quad + 2\times\left( \text{Cov}(X,Y) + \text{Cov}(X,Z) + \text{Cov}(Y,Z) \right) \end{aligned}$$

In fact, that's true no matter how many terms are in the sum.

$$\begin{aligned} E\left( \sum_i X_i \right) &= \sum_i E(X_i) \\ V\left( \sum_i X_i \right) &= \sum_i V(X_i) + 2\sum_{i<j} \text{Cov}(X_i,X_j) \end{aligned}$$
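A numerical check of the general variance formula on three artificially correlated variables (the construction is my own; numpy's sample covariance matrix does the bookkeeping):

```python
# Check V(X + Y + Z) = sum of variances + 2 * (sum of pairwise covariances).
import numpy as np

rng = np.random.default_rng(6)
x = rng.normal(size=300_000)
y = 0.3 * x + rng.normal(size=300_000)
z = -0.5 * y + rng.normal(size=300_000)

cov = np.cov(np.vstack([x, y, z]))        # 3x3 sample covariance matrix
lhs = np.var(x + y + z, ddof=1)
rhs = np.trace(cov) + 2 * (cov[0, 1] + cov[0, 2] + cov[1, 2])
print(lhs, rhs)   # equal up to floating-point rounding (sample covariance is bilinear)
```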

Central limit theorem

Sums of many independent random variables are approximately normally distributed.
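A small demonstration (my own setup): the sum of 50 uniform variables already has tail probabilities close to the normal approximation.

```python
# Sum of 50 Uniform(0, 1) variables: compare P(S > mean + std) to the normal tail P(Z > 1).
import math
import numpy as np

rng = np.random.default_rng(7)
n = 50
s = rng.uniform(size=(200_000, n)).sum(axis=1)

mu, sigma = n * 0.5, math.sqrt(n / 12)                      # exact mean and std of the sum
empirical = np.mean(s > mu + sigma)
normal_tail = 1 - 0.5 * (1 + math.erf(1 / math.sqrt(2)))    # P(Z > 1) ≈ 0.1587
print(empirical, normal_tail)                               # close, as the CLT promises
```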

11.5 Double expectation theorem

The expectation value of the conditional expectation is just the expectation value of the variable back again:

$$\begin{aligned} E[ E(X|Y) ] &= E(X) \end{aligned}$$
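Here's a direct check on a small made-up joint p.m.f. (my own numbers): compute $E(X|Y=y)$ for each $y$, average it against the marginal of $Y$, and compare to $E(X)$.

```python
# E[E(X | Y)] = E(X), checked on a small toy joint pmf.
p = {
    (0, 0): 0.1, (1, 0): 0.3,   # p[(x, y)] = P(X = x, Y = y)
    (0, 1): 0.4, (1, 1): 0.2,
}

E_X = sum(x * pxy for (x, y), pxy in p.items())

# Inner expectation E(X | Y = y), then average over the marginal pmf of Y.
E_E = 0.0
for y0 in (0, 1):
    p_y = sum(pxy for (x, y), pxy in p.items() if y == y0)
    e_x_given_y = sum(x * pxy for (x, y), pxy in p.items() if y == y0) / p_y
    E_E += e_x_given_y * p_y

print(E_X, E_E)   # both 0.5
```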

The conditional variance is

$$\begin{aligned} V(X|Y=y) &= E(X^2 | Y=y) - \left( E(X|Y=y) \right)^2 \end{aligned}$$

Eventually this cute thing happens (the law of total variance):

$$\begin{aligned} V(X) &= E[ V(X|Y) ] + V[ E(X|Y) ] \\ V(Y) &= E[ V(Y|X) ] + V[ E(Y|X) ] \\ \end{aligned}$$
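A simulation sketch of the decomposition, with a hierarchy I made up for illustration: $Y \sim \mathrm{Exp}(1)$ and $X \mid Y \sim \mathrm{Normal}(Y, \text{sd}=2)$.

```python
# Law of total variance, checked by simulation: Y ~ Exp(1), X | Y ~ Normal(Y, sd=2).
import numpy as np

rng = np.random.default_rng(8)
y = rng.exponential(scale=1.0, size=500_000)
x = rng.normal(loc=y, scale=2.0)

# For this construction E(X | Y) = Y and V(X | Y) = 4, so
# E[V(X|Y)] + V[E(X|Y)] = 4 + V(Y) = 4 + 1 = 5.
print(np.var(x))          # ≈ 5
print(4.0 + np.var(y))    # ≈ 5 as well
```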

11.6 Applying the double expectation theorem

Problems