# Probability and Distribution Theory (PDT)

### Question 1

Let Y1, Y2, …, Yn be independent and identically distributed (i.e., i.i.d.) random variables with pdf given by the Standard Normal distribution N(0,1). Suppose this random sample is randomly split into two parts, one part with k observations and the second part with n − k observations. Define the sample means of these two parts respectively as Ȳk = (1/k)(Y1 + … + Yk) and Ȳn−k = (1/(n − k))(Yk+1 + … + Yn).

(a) Find the distribution of ½(Ȳk + Ȳn−k).

(b) Find the distribution of kȲk² + (n − k)Ȳn−k².

### Question 2

This question considers the systolic (SBP) and diastolic (DBP) blood pressures of a random sample of adults from a population. For i = 1, 2, …, n, let the random variables Xi and Yi denote respectively the systolic and diastolic blood pressure (SBP and DBP) of the ith randomly chosen adult. Assume that the pairs (Xi, Yi) constitute a random sample of size n from a bivariate normal population which has E(Xi) = µx, E(Yi) = µy, Var(Xi) = Var(Yi) = σ², and Corr(Xi, Yi) = ρ. Consider the following random variables Ui = (Xi + Yi) (the sum of SBP and DBP) and Vi = (Xi − Yi) (the difference between SBP and DBP, called the pulse pressure), for i = 1, 2, …, n. Let Ū and V̄ denote the sample means of the Ui and Vi respectively, and Su² and Sv² their sample variances.

(a) Derive expressions for the means and variances of the random variables Ui and Vi, i = 1, 2, …, n.

(b) Derive that Cov(Ui, Vi) = 0, i = 1, 2, …, n, which indicates that Ui and Vi are independent random variables.

(c) Using sampling distribution arguments, prove that the random variable W has an F distribution, where W = (1 − ρ)Su² / ((1 + ρ)Sv²).

### Question 3

An emergency department research group is interested in developing a statistical model for the number of in-hospital deaths among patients presenting to the emergency department with community-acquired pneumonia (CAP). Let N be the number of patients presenting to the emergency department with CAP in a given year, and assume that this random variable follows a geometric distribution with P(N = n) = θ(1 − θ)^(n−1) for n = 1, 2, …, where 0 < θ < 1.

Next, for the ith patient define the random variable Yi to be equal to 1 if the patient dies during their hospital stay and Yi = 0 if they are discharged alive, with P(Yi = 1) = π, where 0 < π < 1. Assume that the mortality outcomes of different patients are all independent of each other.
The key quantity of interest to the researchers is the total number of deaths among
the patients presenting with CAP in a given year, defined as the random variable
T:
T = Y1 + Y2 + · · · + YN.

(a) Derive expressions for E(T) and Var(T) in terms of the parameters π and θ.
(b) Derive an expression for Corr(N, T) in terms of π and θ.
(c) Derive an expression for P(T = 0).

### Question 4

Let X1, X2, …, Xn be a random sample of size n from a distribution with probability density function (pdf) f(x) = α/x^(α+1) for x > 1, where α > 0.

Let X(1) be defined as the minimum of (X1, X2, …, Xn), i.e., X(1) = min(X1, X2, …, Xn). This question concerns the distribution of Qn = nα(X(1) − 1), and in particular the distribution of Qn as n becomes very large, i.e., as n → ∞. [Note that Qn has the subscript “n” because it depends on the sample size n.]
(a) Derive the cumulative distribution function (cdf) of Qn.
(b) Determine the distribution that Qn converges to as n → ∞, providing your reasoning.

## Solution

\documentclass{article}

\usepackage[utf8]{inputenc}

\usepackage{amsmath}

\title{Probability and Distribution Theory (PDT)\\Assignment 2 2017}

\author{}

\date{}

\begin{document}

\maketitle

\section{Question 1}

Given that $Y_{1},Y_{2},\ldots,Y_{n}$ are i.i.d. random variables following $N(0,1)$, the standard normal distribution, define:\\

\begin{center}

$\overline{Y}_{k}=\dfrac{1}{k}\sum_{i=1}^{k} Y_{i},\qquad
\overline{Y}_{n-k}=\dfrac{1}{n-k}\sum_{i=k+1}^{n} Y_{i}$

\end{center}

$\overline{Y}_{k}$ and $\overline{Y}_{n-k}$ are mutually independent, since they are functions of disjoint sets of independent random variables.

\subsection{Part (a)}

Using the fact that a sum of $l$ independent normal variables $X_{i}\sim N(\mu_{i},\sigma_{i}^{2})$ is itself normal, \begin{center}$\sum_{i=1}^{l} X_{i}\sim N\big(\sum_{i=1}^{l} \mu_{i},\sum_{i=1}^{l} \sigma_{i}^{2}\big)$,\end{center} together with $Var(aX)=a^{2}Var(X)$, we have:\\

The distributions of \begin{center}$\overline{Y}_{k}$ and $\overline{Y}_{n-k}$ are $N(0,\frac{1}{k})$ and $N(0,\frac{1}{n-k})$ respectively.\end{center}
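Explicitly, the variance of $\overline{Y}_{k}$ follows from $Var(aX)=a^{2}Var(X)$ and independence:

```latex
\begin{equation*}
Var(\overline{Y}_{k})
  =Var\Big(\frac{1}{k}\sum_{i=1}^{k}Y_{i}\Big)
  =\frac{1}{k^{2}}\sum_{i=1}^{k}Var(Y_{i})
  =\frac{k}{k^{2}}
  =\frac{1}{k}
\end{equation*}
```

and likewise $Var(\overline{Y}_{n-k})=\frac{1}{n-k}$.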

Again using the same fact, we have that the distribution of

\begin{center}

$\overline{Y}_{k}+\overline{Y}_{n-k}$ is $N(0,\frac{1}{k}+\frac{1}{n-k})$

\end{center}

Hence the distribution of

\begin{center}

$\frac{1}{2}(\overline{Y}_{k}+\overline{Y}_{n-k})$ is $N\big(0,\frac{1}{4}(\frac{1}{k}+\frac{1}{n-k})\big)=N\big(0,\frac{n}{4k(n-k)}\big)$

\end{center}
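As an informal check (not part of the required derivation), a short Monte Carlo sketch can confirm the variance $\frac{1}{4}(\frac{1}{k}+\frac{1}{n-k})$; the values $n=10$, $k=4$ and the replication count are illustrative choices, not from the assignment:

```python
import numpy as np

# Monte Carlo check of Question 1(a): with Y_i iid N(0,1), the statistic
# (1/2)(Ybar_k + Ybar_{n-k}) should be N(0, (1/4)(1/k + 1/(n-k))).
rng = np.random.default_rng(0)
n, k, reps = 10, 4, 200_000          # illustrative values, not from the question
Y = rng.standard_normal((reps, n))   # each row is one i.i.d. N(0,1) sample
half_sum = 0.5 * (Y[:, :k].mean(axis=1) + Y[:, k:].mean(axis=1))
theory_var = 0.25 * (1 / k + 1 / (n - k))
print(half_sum.mean(), half_sum.var(), theory_var)
```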

\subsection{Part (b)}

From the previous part we have that $\overline{Y}_{k}$ follows $N(0,\frac{1}{k})$ and $\overline{Y}_{n-k}$ follows $N(0,\frac{1}{n-k})$.\\

Hence $k^{\frac{1}{2}}\overline{Y}_{k}$ follows $N(0,1)$ and $(n-k)^{\frac{1}{2}}\overline{Y}_{n-k}$ follows $N(0,1)$.\\

Using independence and the facts that the square of a standard normal random variable is $\chi^{2}_{1}$ and that a sum of independent chi-squared random variables is chi-squared with the degrees of freedom added, we have that

\begin{center}

$k(\overline{Y}_{k}^{2})+(n-k)(\overline{Y}_{n-k}^{2})=(k^{\frac{1}{2}}\overline{Y}_{k})^{2}+((n-k)^{\frac{1}{2}}\overline{Y}_{n-k})^{2}$ follows $\chi^{2}_{2}$

\end{center}
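Similarly, a simulation can check that this statistic behaves like a $\chi^{2}_{2}$ variable (mean $2$, variance $4$); again $n$, $k$ and the replication count are illustrative:

```python
import numpy as np

# Monte Carlo check of Question 1(b): k*Ybar_k^2 + (n-k)*Ybar_{n-k}^2
# should follow a chi-squared distribution with 2 degrees of freedom
# (mean 2, variance 4), regardless of n and k.
rng = np.random.default_rng(1)
n, k, reps = 10, 4, 200_000
Y = rng.standard_normal((reps, n))
stat = k * Y[:, :k].mean(axis=1) ** 2 + (n - k) * Y[:, k:].mean(axis=1) ** 2
print(stat.mean(), stat.var())
```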

\newpage

\section{Question 2}

Given $(X_{i},Y_{i})\sim N_{2}(\mu_{x},\mu_{y},\sigma^{2},\sigma^{2},\rho)$ (bivariate normal), $U_{i}=X_{i}+Y_{i}$, $V_{i}=X_{i}-Y_{i}$ and

\begin{center}

$\overline{U}=\frac{1}{n}\sum_{i=1}^{n} U_{i},\qquad
\overline{V}=\frac{1}{n}\sum_{i=1}^{n} V_{i}$

$S_{u}^{2}=\frac{1}{n-1}\sum_{i=1}^{n} (U_{i}-\overline{U})^{2},\qquad
S_{v}^{2}=\frac{1}{n-1}\sum_{i=1}^{n} (V_{i}-\overline{V})^{2}$

\end{center}

\subsection{Part (a)}

Using linearity of expectation, the formula for the variance of a sum, and $Cov(X_{i},Y_{i})=\rho\sigma_{x}\sigma_{y}=\rho\sigma^{2}$, we have

\begin{center}

\begin{equation*}

E(U_{i})=E(X_{i}+Y_{i})=E(X_{i})+E(Y_{i})=\mu_{x}+\mu_{y}

\end{equation*}

\begin{equation*}

E(V_{i})=E(X_{i}-Y_{i})=E(X_{i})-E(Y_{i})=\mu_{x}-\mu_{y}

\end{equation*}

\begin{equation*}

Var(U_{i})=Var(X_{i}+Y_{i})=Var(X_{i})+Var(Y_{i})+2Cov(X_{i},Y_{i})=2\sigma^{2}+2\rho\sigma^{2}

\end{equation*}

\begin{equation*}

Var(V_{i})=Var(X_{i}-Y_{i})=Var(X_{i})+Var(Y_{i})-2Cov(X_{i},Y_{i})=2\sigma^{2}-2\rho\sigma^{2}

\end{equation*}

\end{center}
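A small simulation (with illustrative parameter values $\mu_{x}=120$, $\mu_{y}=80$, $\sigma^{2}=1$, $\rho=0.5$, chosen here purely for the check) can confirm these moment formulas:

```python
import numpy as np

# Monte Carlo check of Question 2(a): with (X, Y) bivariate normal with
# common variance sigma2 and correlation rho,
#   E(U) = mux + muy,  E(V) = mux - muy,
#   Var(U) = 2*sigma2*(1+rho),  Var(V) = 2*sigma2*(1-rho).
rng = np.random.default_rng(2)
mux, muy, sigma2, rho, reps = 120.0, 80.0, 1.0, 0.5, 200_000  # illustrative
cov = sigma2 * np.array([[1.0, rho], [rho, 1.0]])
XY = rng.multivariate_normal([mux, muy], cov, size=reps)
U, V = XY[:, 0] + XY[:, 1], XY[:, 0] - XY[:, 1]
print(U.mean(), V.mean(), U.var(), V.var(), np.cov(U, V)[0, 1])
```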

\subsection{Part (b)}

\begin{center}

\begin{equation*}

\begin{split}

Cov(U_{i},V_{i}) & =Cov(X_{i}+Y_{i},X_{i}-Y_{i})\\

& =Var(X_{i})-Var(Y_{i})+Cov(Y_{i},X_{i})-Cov(X_{i},Y_{i})\\

& =\sigma^{2}-\sigma^{2}+\rho\sigma^{2}-\rho\sigma^{2}=0

\end{split}

\end{equation*}

\end{center}

Since $(U_{i},V_{i})$ is itself bivariate normal (being a linear transformation of $(X_{i},Y_{i})$), zero covariance implies that $U_{i}$ and $V_{i}$ are independent.

\subsection{Part (c)}

Using the fact that the sum of two jointly normal random variables $A\sim N(\mu_{a},\sigma_{a}^{2})$ and $B\sim N(\mu_{b},\sigma_{b}^{2})$ satisfies $A+B\sim N(\mu_{a}+\mu_{b},\sigma_{a}^{2}+\sigma_{b}^{2}+2\rho\sigma_{a}\sigma_{b})$, where $\rho$ is the correlation between $A$ and $B$, we have:\\

\begin{center}

\begin{equation*}

U_{i}\sim N(\mu_{x}+\mu_{y},\sigma_{x}^{2}+\sigma_{y}^{2}+2\rho\sigma_{x}\sigma_{y})

\end{equation*}

\begin{equation*}

V_{i}\sim N(\mu_{x}-\mu_{y},\sigma_{x}^{2}+\sigma_{y}^{2}-2\rho\sigma_{x}\sigma_{y})

\end{equation*}

\end{center}

Using the fact that the sample variance of a normal sample satisfies $(n-1)S^{2}/\tau^{2}\sim\chi_{n-1}^{2}$, where $\tau^{2}$ is the population variance, we have:\\

\begin{center}

\begin{equation*}

\frac{(n-1)}{\sigma_{x}^{2}+\sigma_{y}^{2}+2\rho\sigma_{x}\sigma_{y}}S_{u}^{2}\sim\chi_{n-1}^{2}

\end{equation*}

\begin{equation*}

\frac{(n-1)}{\sigma_{x}^{2}+\sigma_{y}^{2}-2\rho\sigma_{x}\sigma_{y}}S_{v}^{2}\sim\chi_{n-1}^{2}

\end{equation*}

\end{center}

and since here $\sigma_{x}^{2}=\sigma_{y}^{2}=\sigma^{2}$ then:

\begin{center}

\begin{equation*}

\frac{(n-1)}{2(1+\rho)\sigma^{2}}S_{u}^{2}\sim\chi_{n-1}^{2}

\end{equation*}

\begin{equation*}

\frac{(n-1)}{2(1-\rho)\sigma^{2}}S_{v}^{2}\sim\chi_{n-1}^{2}

\end{equation*}

\end{center}

Again from (b) we know that $U_{i}$ and $V_{i}$ are independent for all $i=1,2,\ldots,n$; moreover $Cov(U_{i},V_{j})=0$ for $i\neq j$ because different subjects are independent. Hence the vector $(U_{1},\ldots,U_{n})$ is independent of $(V_{1},\ldots,V_{n})$, and so $S_{u}^{2}$ and $S_{v}^{2}$ are independent. Hence:

\begin{center}

\begin{equation*}

\begin{split}

W & =\frac{(1-\rho)S_{u}^{2}}{(1+\rho)S_{v}^{2}}\\

& =\frac{\frac{1}{2(1+\rho)\sigma^{2}}S_{u}^{2}}{\frac{1}{2(1-\rho)\sigma^{2}}S_{v}^{2}}
=\frac{\frac{(n-1)S_{u}^{2}}{2(1+\rho)\sigma^{2}}\big/(n-1)}{\frac{(n-1)S_{v}^{2}}{2(1-\rho)\sigma^{2}}\big/(n-1)}

\end{split}

\end{equation*}

\end{center}

Then $W$ follows $F_{n-1,n-1}$, since the ratio of two independent chi-squared random variables, each divided by its degrees of freedom, follows an F distribution.
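A Monte Carlo sketch (with illustrative choices $n=10$, $\rho=0.4$, unit variances and zero means, which do not affect $W$) can check this against the $F_{9,9}$ mean $(n-1)/(n-3)=9/7$:

```python
import numpy as np

# Monte Carlo check of Question 2(c): W = (1-rho)*S_u^2 / ((1+rho)*S_v^2)
# should follow F_{n-1,n-1}; for n = 10 the F_{9,9} mean is 9/7.
rng = np.random.default_rng(3)
n, rho, reps = 10, 0.4, 100_000                    # illustrative values
Z1 = rng.standard_normal((reps, n))
Z2 = rng.standard_normal((reps, n))
X = Z1                                             # X ~ N(0,1)
Y = rho * Z1 + np.sqrt(1 - rho**2) * Z2            # Corr(X,Y) = rho, Var(Y) = 1
Su2 = (X + Y).var(axis=1, ddof=1)                  # sample variance of U = X+Y
Sv2 = (X - Y).var(axis=1, ddof=1)                  # sample variance of V = X-Y
W = (1 - rho) * Su2 / ((1 + rho) * Sv2)
print(W.mean())
```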

\newpage

\section{Question 3}

Given $P(N=n)=\theta(1-\theta)^{n-1}$ for $n=1,2,\ldots$ with $0<\theta<1$, and $P(Y_{i}=1)=\pi$, we can see that the $Y_{i}$ are i.i.d. Bernoulli($\pi$) random variables, independent of $N$.

\begin{center}

$T=Y_{1}+Y_{2}+Y_{3}+\cdots+Y_{N}$

\end{center}

\subsection{Part (a)}

Using conditional expectation, the moments of the Bernoulli and geometric distributions, and the i.i.d. property of the $Y_{i}$, we have

\begin{center}

\begin{equation*}

\begin{split}

E(T)= E(E(T|N))

\end{split}

\end{equation*}

\begin{equation*}

\begin{split}

E(T|N=n) & =E(\sum_{i=1}^{n} Y_{i})=nE(Y_{1})=n\pi

\end{split}

\end{equation*}

\begin{equation*}

\begin{split}

E(T)= E(E(T|N))=E(N\pi)=\frac{\pi}{\theta}

\end{split}

\end{equation*}

\begin{equation*}

Var(T|N=n)=Var(\sum_{i=1}^{n} Y_{i})=nVar(Y_{1})=n\pi(1-\pi)

\end{equation*}

\begin{equation*}

\begin{split}

Var(T) & =Var(E(T|N))+E(Var(T|N))=Var(N\pi)+E(N\pi(1-\pi))\\

& =\pi^{2}\frac{1-\theta}{\theta^{2}}+\frac{\pi(1-\pi)}{\theta}=\frac{\pi^{2}(1-\theta)+\pi\theta(1-\pi)}{\theta^{2}}=\frac{\pi(\pi+\theta-2\pi\theta)}{\theta^{2}}

\end{split}

\end{equation*}

\end{center}
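A simulation of the compound distribution (with illustrative values $\theta=0.3$, $\pi=0.4$) can confirm $E(T)=\pi/\theta$ and $Var(T)=\pi(\pi+\theta-2\pi\theta)/\theta^{2}$:

```python
import numpy as np

# Monte Carlo check of Question 3(a): N ~ Geometric(theta) on {1,2,...},
# T | N ~ Binomial(N, pi).  Then E(T) = pi/theta and
# Var(T) = pi*(pi + theta - 2*pi*theta) / theta**2.
rng = np.random.default_rng(4)
theta, pi_, reps = 0.3, 0.4, 300_000     # illustrative parameter values
N = rng.geometric(theta, size=reps)      # support {1, 2, ...}
T = rng.binomial(N, pi_)                 # sum of N Bernoulli(pi) outcomes
mean_theory = pi_ / theta
var_theory = pi_ * (pi_ + theta - 2 * pi_ * theta) / theta**2
print(T.mean(), mean_theory, T.var(), var_theory)
```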

\subsection{Part (b)}

\begin{center}

\begin{equation*}

Corr(N,T)=\frac{Cov(N,T)}{\sqrt{Var(T)Var(N)}}

\end{equation*}

\begin{equation*}

Cov(N,T)=E(NT)-E(N)E(T)

\end{equation*}

\end{center}

Using conditional expectation as in part (a), we have

\begin{center}

\begin{equation*}

E(NT)=E\big(E(NT|N)\big)=E(N^{2}\pi)=\pi\Big(\frac{1-\theta}{\theta^{2}}+\frac{1}{\theta^{2}}\Big)=\frac{\pi(2-\theta)}{\theta^{2}}

\end{equation*}

\begin{equation*}

Cov(N,T)=\frac{\pi(2-\theta)}{\theta^{2}}-\frac{1}{\theta}\cdot\frac{\pi}{\theta}=\frac{\pi(1-\theta)}{\theta^{2}}

\end{equation*}

\end{center}

Now substituting the values of $Cov(N,T)$, $Var(T)$ and $Var(N)$, we have

\begin{center}

\begin{equation*}

Corr(N,T)=\frac{\pi(1-\theta)/\theta^{2}}{\sqrt{\frac{1-\theta}{\theta^{2}}\cdot\frac{\pi(\pi+\theta-2\pi\theta)}{\theta^{2}}}}=\sqrt{\frac{\pi(1-\theta)}{\pi+\theta-2\pi\theta}}

\end{equation*}

\end{center}
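The same simulation setup (illustrative $\theta=0.3$, $\pi=0.4$) checks the correlation formula; a useful sanity check is that the expression equals $1$ when $\pi=1$, since then $T=N$:

```python
import numpy as np

# Monte Carlo check of Question 3(b):
# Corr(N, T) = sqrt(pi*(1-theta) / (pi + theta - 2*pi*theta)).
rng = np.random.default_rng(5)
theta, pi_, reps = 0.3, 0.4, 300_000    # illustrative parameter values
N = rng.geometric(theta, size=reps)
T = rng.binomial(N, pi_)
corr_emp = np.corrcoef(N, T)[0, 1]
corr_theory = np.sqrt(pi_ * (1 - theta) / (pi_ + theta - 2 * pi_ * theta))
print(corr_emp, corr_theory)
```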

\subsection{Part (c)}

\begin{center}

\begin{equation*}

\begin{split}

P(T=0)& =\sum_{n=1}^{\infty}P(T=0|N=n)P(N=n)\\

& = \sum_{n=1}^{\infty}P(Y_{i}=0, \forall i=1,2,\ldots,n)\,\theta(1-\theta)^{n-1}\\

& = \sum_{n=1}^{\infty}P(Y_{1}=0)^{n}\,\theta(1-\theta)^{n-1}\\

& =\sum_{n=1}^{\infty} (1-\pi)^{n}\theta(1-\theta)^{n-1}\\

& =\theta(1-\pi)\sum_{n=1}^{\infty}\big((1-\theta)(1-\pi)\big)^{n-1}\\

& =\frac{\theta(1-\pi)}{1-(1-\theta)(1-\pi)}=\frac{\theta(1-\pi)}{\theta+\pi-\theta\pi}

\end{split}

\end{equation*}

\end{center}

In the above derivation we have used the fact that if $T=0$ then $Y_{i}=0$ for all $i=1,2,\ldots,N$, since one non-zero $Y_{i}$ would imply a non-zero $T$; the fact that the $Y_{i}$'s are i.i.d.; and the geometric series $\sum_{n=1}^{\infty}r^{n-1}=\frac{1}{1-r}$ with $r=(1-\theta)(1-\pi)\in(0,1)$.
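As a final informal check for this question (same illustrative $\theta=0.3$, $\pi=0.4$), the empirical frequency of $\{T=0\}$ can be compared with $\theta(1-\pi)/(\theta+\pi-\theta\pi)$:

```python
import numpy as np

# Monte Carlo check of Question 3(c):
# P(T = 0) = theta*(1-pi) / (theta + pi - theta*pi).
rng = np.random.default_rng(6)
theta, pi_, reps = 0.3, 0.4, 300_000    # illustrative parameter values
N = rng.geometric(theta, size=reps)
T = rng.binomial(N, pi_)
p_emp = (T == 0).mean()
p_theory = theta * (1 - pi_) / (theta + pi_ - theta * pi_)
print(p_emp, p_theory)
```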

\newpage

\section{Question 4}

Given $X_{1},X_{2},X_{3},\ldots,X_{n}$ are a random sample from a distribution with density function

\begin{center}

\begin{equation*}

f_{X}(x)=\frac{\alpha}{x^{\alpha+1}},\quad x>1,\; \alpha>0

\end{equation*}

\end{center}

with $X_{(1)}=\min(X_{1},X_{2},\ldots,X_{n})$ and $Q_{n}=n\alpha(X_{(1)}-1)$.

\subsection{Part (a)}

\begin{center}

\begin{equation*}

\begin{split}

1-F_{Q_{n}}(y)& =1-P(Q_{n}\leq y)=P(Q_{n}>y)\\

& = P(n\alpha(X_{(1)}-1)>y)=P\Big(X_{(1)}>\frac{y}{n\alpha}+1\Big)\\

& =P\Big(X_{1}>\frac{y}{n\alpha}+1\Big)^{n}\quad\text{(by independence)}\\

&=\Big(1-P\Big(X_{1}\leq\frac{y}{n\alpha}+1\Big)\Big)^{n}

\end{split}

\end{equation*}

\begin{equation*}

\begin{split}

P(X_{1}\leq x)& =\int_{1}^{x} \frac{\alpha}{t^{\alpha+1}}\,dt\\

& = \alpha\Big[\frac{t^{-\alpha}}{-\alpha}\Big]_{1}^{x}\\

& = 1-\frac{1}{x^{\alpha}},\quad x>1

\end{split}

\end{equation*}

\begin{equation*}

\begin{split}

F_{Q_{n}}(y)& =1-\Big(\Big(\frac{y}{n\alpha}+1\Big)^{-\alpha}\Big)^{n}=1-\Big(\frac{y}{n\alpha}+1\Big)^{-n\alpha}

\end{split}

\end{equation*}

\end{center}

Hence, since $X_{(1)}>1$ implies $Q_{n}>0$, the cdf of $Q_{n}$ for $y\in(0,\infty)$ is given by:

\begin{center}

$F_{Q_{n}}(y)=1-(\frac{y}{n\alpha}+1)^{-n\alpha}$

\end{center}
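A simulation (with illustrative values $\alpha=2$, $n=5$) can check this finite-sample cdf; $X$ is sampled by inverse transform, since $F_{X}(x)=1-x^{-\alpha}$ gives $X=U^{-1/\alpha}$ for $U\sim\mathrm{Uniform}(0,1)$:

```python
import numpy as np

# Monte Carlo check of Question 4(a): for the density
# f(x) = alpha / x^(alpha+1), x > 1, the cdf of Q_n = n*alpha*(X_(1) - 1)
# is F(y) = 1 - (1 + y/(n*alpha))**(-n*alpha) for y > 0.
rng = np.random.default_rng(7)
alpha, n, reps, y = 2.0, 5, 200_000, 1.0   # illustrative values
U = rng.uniform(size=(reps, n))
X = U ** (-1.0 / alpha)                    # inverse-transform sampling, X > 1
Q = n * alpha * (X.min(axis=1) - 1.0)
F_emp = (Q <= y).mean()
F_theory = 1.0 - (1.0 + y / (n * alpha)) ** (-n * alpha)
print(F_emp, F_theory)
```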

\subsection{Part (b)}

Now we compute the limit of $F_{Q_{n}}(y)$ as $n\rightarrow\infty$:

\begin{center}

\begin{equation*}

\begin{split}

\lim_{n\rightarrow\infty}F_{Q_{n}}(y)& = \lim_{n\rightarrow\infty}1-(\frac{y}{n\alpha}+1)^{-n\alpha}\\

& =1-\frac{1}{\lim_{n\rightarrow\infty}(\frac{y}{n\alpha}+1)^{n\alpha}}\\

& = 1-\exp(-y)

\end{split}

\end{equation*}

\end{center}

The standard limit identity $\lim_{m\rightarrow\infty}\big(1+\frac{x}{m}\big)^{m}=\exp(x)$, here with $m=n\alpha$ and $x=y$, has been used.\\

We can identify the limiting cdf $1-\exp(-y)$, $y>0$, as that of the Exponential(1) distribution. Hence $Q_{n}$ converges in distribution to $\mathrm{Exp}(1)$ as $n\rightarrow\infty$.
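Numerically, the convergence of the finite-$n$ cdf to the Exponential(1) cdf can be seen by evaluating both at a few points for a large value of $n\alpha$ (the values below are illustrative):

```python
import math

# Numerical check of Question 4(b): F_{Q_n}(y) = 1 - (1 + y/(n*alpha))**(-n*alpha)
# approaches the Exponential(1) cdf 1 - exp(-y) as n grows.
alpha = 2.0                                  # illustrative value
for y in (0.5, 1.0, 3.0):
    f_limit = 1.0 - math.exp(-y)
    f_large_n = 1.0 - (1.0 + y / (1e6 * alpha)) ** (-1e6 * alpha)
    print(y, f_large_n, f_limit)
```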

\end{document}
