Review of Probability, Random Variables, and Distributions

SCU

This note is a backup from Notion.

Lecture 8

Definition of Variance

  • Def: The variance of $X$, denoted $Var(X)$, is:
    • If $X$ is discrete

      $$\sigma^2=Var(X)=E((X-\mu_x)^2)=\sum_x(x-\mu_x)^2f(x)$$
    • If $X$ is continuous

      $$\sigma^2=Var(X)=E((X-\mu_x)^2)=\int_{-\infty}^{+\infty}(x-\mu_x)^2f(x)dx$$
    • $\sigma$ is called the standard deviation

    • $\sigma^2=E((X-\mu_x)^2)=E(X^2-2\mu_xX+\mu_x^2)=E(X^2)-2\mu_xE(X)+\mu_x^2=E(X^2)-E^2(X)$

  • Properties
    1. $\forall X,\ Var(X)\ge0,\ Var(C)=0$

      $Var(X)=0\Leftrightarrow P(X=C)=1$

    2. $Var(CX)=C^2Var(X)$

    3. If $X$ and $Y$ are independent, then

      $E(XY)=E(X)E(Y)$

      $Var(X\pm Y)=Var(X)+Var(Y)$

    4. If $X_1,X_2,...,X_n$ are mutually independent, $Var(\sum_{i=1}^nC_iX_i+b)=\sum_{i=1}^{n}C_i^2Var(X_i)$
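
The shortcut $Var(X)=E(X^2)-E^2(X)$ and property 2 can be checked exactly on a small example. The sketch below assumes a fair six-sided die as the distribution; the helper names `expectation` and `variance` are illustrative, not part of the notes.

```python
from fractions import Fraction

def expectation(pmf, g=lambda x: x):
    """E[g(X)] for a discrete pmf given as {value: probability}."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) computed from the definition E[(X - mu)^2]."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

# Assumed example: a fair six-sided die.
die = {x: Fraction(1, 6) for x in range(1, 7)}

var_def = variance(die)
var_shortcut = expectation(die, lambda x: x * x) - expectation(die) ** 2

# Scaling by C = 3: the pmf of 3X puts the same mass on 3x.
scaled = {3 * x: p for x, p in die.items()}
var_scaled = variance(scaled)

print(var_def, var_shortcut, var_scaled)
```

Exact fractions are used so that the identities hold without rounding error.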

Covariance

  • Def: the covariance of $X$ and $Y$ is

    $$\sigma_{XY}=Cov(X,Y)=E[(X-\mu_x)(Y-\mu_y)]$$
  • $\sigma_{XY}=Cov(X,Y)=E(XY)-E(X)E(Y)$

  • If $X$ and $Y$ are independent, $Cov(X,Y)=0$

  • The converse may not be true: $Cov(X,Y)=0$ does not imply independence.

  • In general,

    $$Var(X\pm Y)=Var(X)+Var(Y)\pm 2Cov(X,Y)$$
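
A standard counterexample shows why zero covariance does not give independence. The sketch assumes $X$ uniform on $\{-1,0,1\}$ and $Y=X^2$; this choice of variables is an illustration, not taken from the notes.

```python
from fractions import Fraction

# Assumed example: X uniform on {-1, 0, 1} and Y = X^2.
joint = {(x, x * x): Fraction(1, 3) for x in (-1, 0, 1)}

def expect(g):
    """E[g(X, Y)] under the joint pmf."""
    return sum(g(x, y) * p for (x, y), p in joint.items())

e_x = expect(lambda x, y: x)
e_y = expect(lambda x, y: y)
cov = expect(lambda x, y: x * y) - e_x * e_y

# Independence would require P(X=0, Y=0) = P(X=0) * P(Y=0).
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y0 = sum(p for (x, y), p in joint.items() if y == 0)
p_joint00 = joint.get((0, 0), Fraction(0))

print(cov, p_joint00, p_x0 * p_y0)
```

Here the covariance is zero, yet the joint probability differs from the product of the marginals, so $X$ and $Y$ are dependent.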

Correlation coefficient

  • Def: The correlation coefficient of $X$ and $Y$ is

    $$\rho_{XY}=\frac{\sigma_{XY}}{\sigma_X\sigma_Y}$$
  • If $X$ and $Y$ are independent, $\rho_{XY}=0$

  • Properties:

    1. $\forall X,Y,\ |\rho_{XY}|\le1$

      pf: use $Var(Y-tX)\ge0$ for every real $t$

    2. $|\rho_{XY}|=1\Leftrightarrow \exists a\ne0,\ b\ s.t.\ P(Y=aX+b)=1$

    3. If $|\rho_{XY}|=1$, $X$ and $Y$ are said to be completely linearly correlated

    4. If $\rho_{XY}=0$, $X$ and $Y$ are called uncorrelated, meaning there is no linear relationship between $X$ and $Y$.

    5. independent $\longrightarrow$ uncorrelated

  • $|\rho_{XY}|$ measures the strength of the linear correlation between $X$ and $Y$

  • $\rho_{XY}\gt0$ means there is a positive linear correlation between $X$ and $Y$:

    if $X$ becomes larger, then $Y$ tends to become larger
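
The proof hint for property 1 can be expanded as follows. This is a sketch that assumes $Var(X)\gt0$ and $Var(Y)\gt0$: the variance of $Y-tX$ is a quadratic in $t$ that is never negative, so its discriminant cannot be positive.

```latex
\begin{aligned}
0\le Var(Y-tX)&=Var(Y)-2t\,Cov(X,Y)+t^2Var(X)\quad \forall t\in\mathbb{R}\\
\Rightarrow\quad &4\,Cov^2(X,Y)-4\,Var(X)Var(Y)\le0\\
\Rightarrow\quad &\rho_{XY}^2=\frac{\sigma_{XY}^2}{\sigma_X^2\sigma_Y^2}\le1
\end{aligned}
```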

Lecture 9

Bernoulli Distribution

  • 0-1 distribution $X\sim B(1,p)$

    $$F(x)=\left\{ \begin{aligned} &0,&x\lt0\\&1-p,&0\le x\lt1\\&1,&x\ge1\end{aligned} \right.$$
  • $E(X)=p,\ Var(X)=p-p^2=pq$, where $q=1-p$

  • Indicator: for $A\subset S$, $I_A(\omega)=\left \{ \begin{aligned}&1,&if\ \omega\in A\\&0,&if\ \omega\notin A\end{aligned}\right.$

  • Indicators are Bernoulli variables with $E(I_A)=P(A)$, so many counts can be written as sums of 0-1 variables

Binomial Distribution

  • Def: the number of successes $X$ in $n$ Bernoulli trials, $X\sim B(n,p)$
  • If $n=1$, it becomes the Bernoulli distribution
  • pmf: $f(x)=P(X=x)=b(x;n,p)=C_n^xp^xq^{n-x},\ x=0,1,...,n$
  • By the binomial theorem the pmf sums to 1: $\sum_{x=0}^nb(x;n,p)=\sum^{n}_{x=0}C_n^xp^xq^{n-x}=(p+q)^n=1$
  • $E(X)=np,\ Var(X)=npq$
    • hint: $X_i=\left\{\begin{aligned} &1,&\text{the } i\text{-th trial succeeds}\\&0,&\text{the } i\text{-th trial fails} \end{aligned} \right.$
    • $X_i$ are mutually independent, $X_i\sim B(1,p),\ X=\sum_{i=1}^nX_i$
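
The normalization, mean, and variance of the binomial can be verified numerically. The sketch below assumes $n=10$ and $p=0.3$ purely for illustration and uses `math.comb` for $C_n^x$.

```python
from math import comb

def binom_pmf(x, n, p):
    """b(x; n, p) = C(n, x) p^x q^(n-x)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.3  # assumed parameters
pmf = [binom_pmf(x, n, p) for x in range(n + 1)]

total = sum(pmf)                                          # should be (p + q)^n = 1
mean = sum(x * f for x, f in enumerate(pmf))              # should be n p
var = sum((x - mean) ** 2 * f for x, f in enumerate(pmf)) # should be n p q

print(total, mean, var)
```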

Multinomial Distribution

  • Def: Multinomial experiment: independent repeated trials, each with $k$ possible outcomes
  • Def: Multinomial distribution: the number of occurrences of each outcome in $n$ trials
  • Joint pmf: $f(x_1,x_2,...,x_k;p_1,p_2,...,p_k,n)=\frac{n!}{x_1!x_2!...x_k!}p_1^{x_1}p_2^{x_2}...p_k^{x_k}$, where $\sum_i x_i=n$ and $\sum_i p_i=1$
  • Each marginal distribution is binomial

Lecture 10

Hypergeometric Distribution

  • Motivation: Sampling without replacement
  • Def: $X$ is the number of successes when
    1. $n$ items are selected from $N$ items without replacement;
    2. of the $N$ items, $k$ are successes and $N-k$ are failures.

    $X\sim H(N,n,k)$
  • pmf:

    $$f(x;N,n,k)=\frac{C_{k}^{x}C_{N-k}^{n-x}}{C_{N}^{n}},\ \max(0,n-(N-k))\le x\le \min(n,k)$$
  • Relationship to Binomial

    • Binomial is the limit case of hypergeometric when $N$ approaches infinity with $\frac kN$ fixed
    • When $N$ is large enough ($\frac nN$ is small): $f(x;N,n,k)\approx b(x;n,\frac kN)$
  • If $X$ is hypergeometric with parameters $N$, $n$ and $k$, then

    $$E(X)=n\frac kN$$

    $$Var(X)=\frac{N-n}{N-1}n\frac kN(1-\frac kN)$$
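
A numerical check of the hypergeometric pmf, its moments, and the binomial approximation may help. The parameters $N=1000$, $n=10$, $k=300$ are assumed for illustration, and `math.comb(a, b)` plays the role of $C_a^b$.

```python
from math import comb

def hyper_pmf(x, N, n, k):
    """f(x; N, n, k) = C(k, x) C(N-k, n-x) / C(N, n)."""
    return comb(k, x) * comb(N - k, n - x) / comb(N, n)

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

N, n, k = 1000, 10, 300  # assumed parameters, n/N small
support = range(max(0, n - (N - k)), min(n, k) + 1)

total = sum(hyper_pmf(x, N, n, k) for x in support)
mean = sum(x * hyper_pmf(x, N, n, k) for x in support)
var = sum((x - mean) ** 2 * hyper_pmf(x, N, n, k) for x in support)

# Largest pointwise difference from the binomial b(x; n, k/N).
max_gap = max(abs(hyper_pmf(x, N, n, k) - binom_pmf(x, n, k / N)) for x in support)

print(total, mean, var, max_gap)
```

The variance comes out slightly below $npq$ because of the finite-population factor $\frac{N-n}{N-1}$.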

Multivariate Hypergeometric

  • $N$ items are classified into $k$ kinds, with $a_i$ items of kind $i$; select $n$ at random without replacement and let $x_i$ be the number selected of kind $i$

    $$f(x_1,x_2,...,x_k;a_1,a_2,...,a_k,N,n)=\frac{C_{a_1}^{x_1}C_{a_2}^{x_2}...C_{a_k}^{x_k}}{C_N^n}$$
  • Each marginal is hypergeometric!

Geometric Distribution

  • Def: Repeat Bernoulli trials until the first success; $X$ is the number of trials, $X\sim G(p)$

  • pmf: $g(x;p)=q^{x-1}p,\ x=1,2,3,...$

  • Mean $E(X)$ and variance $Var(X)$

    $$E(X)=\frac 1p,\ Var(X)=\frac{q}{p^2}$$

Negative Binomial Distribution

  • Def: Repeat Bernoulli trials until the $k$-th success; $X$ is the number of trials, $X\sim NB(k,p)$

  • pmf:

    $$b^*(x;k,p)=C_{x-1}^{k-1}q^{x-k}p^k,\ x=k,k+1,k+2,...$$

  • Mean $E(X)$ and variance $Var(X)$

    $$E(X)=\frac kp,\ Var(X)=\frac{kq}{p^2}$$
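
The mean and variance of the negative binomial (and of the geometric, which is the case $k=1$) can be checked by summing the series numerically. The values $k=3$ and $p=0.4$ and the truncation point 400 are assumptions for this sketch; the neglected tail is far below floating-point precision.

```python
from math import comb

def negbin_pmf(x, k, p):
    """P(X = x): the k-th success occurs on trial x (x = k, k+1, ...)."""
    return comb(x - 1, k - 1) * (1 - p) ** (x - k) * p**k

def moments(k, p, upper=400):
    """Mean and variance from the truncated series over x = k .. upper-1."""
    xs = range(k, upper)
    mean = sum(x * negbin_pmf(x, k, p) for x in xs)
    var = sum((x - mean) ** 2 * negbin_pmf(x, k, p) for x in xs)
    return mean, var

p = 0.4  # assumed success probability
q = 1 - p
nb_mean, nb_var = moments(3, p)    # expected k/p and k q / p^2
geo_mean, geo_var = moments(1, p)  # expected 1/p and q / p^2

print(nb_mean, nb_var, geo_mean, geo_var)
```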

Poisson Distribution

  • Def: the number of occurrences in a Poisson process over a fixed interval

  • Derivation: Poisson theorem

    $$\lim_{n\to\infty}C_n^x(\frac \lambda n)^x (1-\frac \lambda n)^{n-x}=\frac{\lambda^x}{x!}e^{-\lambda}$$

  • pmf:

    $$p(x;\lambda)=\frac{\lambda^x}{x!}e^{-\lambda},\ x=0,1,2,...$$

  • Expectation:

    $$X\sim P(\lambda),\ E(X)=\lambda,\ Var(X)=\lambda$$

  • Relationship to Binomial

    • Poisson distribution is the limit case of binomial when $n$ approaches infinity while $np$ is fixed
    • If $n$ is large ($n\ge50$) while $p$ is small ($p\le0.1$), $B(n,p)\approx P(np)$
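
To see how close the Poisson approximation is under the rule of thumb above, one can compare the two pmfs point by point. The parameters $n=100$, $p=0.03$ are assumed for illustration.

```python
from math import comb, exp, factorial

def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

def poisson_pmf(x, lam):
    """p(x; lambda) = lambda^x e^(-lambda) / x!"""
    return lam**x / factorial(x) * exp(-lam)

n, p = 100, 0.03  # assumed: n >= 50 and p <= 0.1
lam = n * p

# Largest pointwise difference between b(x; n, p) and p(x; np).
max_gap = max(abs(binom_pmf(x, n, p) - poisson_pmf(x, lam)) for x in range(n + 1))

for x in range(6):
    print(x, round(binom_pmf(x, n, p), 4), round(poisson_pmf(x, lam), 4))
```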

Lecture 11

Uniform Distribution

  • Def: $X$ is said to have the uniform distribution on $[a,b]$, written $X\sim U(a,b)$, if its density satisfies

    $$f(x)=\left\{\begin{aligned} &\frac{1}{b-a},&x\in[a,b]\\&0,&elsewhere\end{aligned}\right.$$
  • cdf and probability: $F(x)=\frac{x-a}{b-a}$ for $a\le x\le b$ (0 below $a$, 1 above $b$), so $P(c\lt X\lt d)=\frac{d-c}{b-a}$ for $a\le c\le d\le b$

  • Expectations: $E(X)=\frac{a+b}{2},\ Var(X)=\frac{(b-a)^2}{12}$

Exponential Distribution

  • Def: $X$ is said to have the exponential distribution with parameter $\beta\gt0$, written $X\sim e(\beta)$, if

    $$f(x)=\left\{\begin{aligned} &\frac{1}{\beta}e^{-\frac{x}{\beta}},&x\gt0\\ & 0,&x\le0\end{aligned} \right.$$
  • cdf: $F(x)= \left\{\begin{aligned} &0,&x\le0\\ &1-e^{-\frac{x}{\beta}}, &x\gt0\end{aligned}\right.$

Gamma Distribution

Gamma Function

  • Def: Gamma function

    $$\Gamma(\alpha)=\int_{0}^{+\infty}x^{\alpha-1}e^{-x}dx,\ \alpha\gt0$$
  • Properties:

    $$\Gamma(1)=1,\ \Gamma(0.5)=\sqrt \pi,\ \Gamma(\alpha+1)=\alpha\Gamma(\alpha),\ \Gamma(n)=(n-1)!$$

  • Def: $X\sim \Gamma(\alpha,\beta)$ if its density is

    $$f(x)=\left\{\begin{aligned} &\frac{1}{\beta^\alpha\Gamma(\alpha)}x^{\alpha-1}e^{-\frac x\beta}, &x\gt0\\&0,&x\le0\end{aligned} \right.$$

  • Exponential is a special case of the Gamma density: $X\sim e(\beta)=\Gamma(1,\beta)$

  • Expectations:

    $$E(X)=\alpha\beta,\ Var(X)=\alpha\beta^2$$

    $$X\sim e(\beta),\ E(X)=\beta,\ Var(X)=\beta^2$$
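
The Gamma-function properties and the moments of the Gamma density can be checked numerically with the standard library. The values $\alpha=2$, $\beta=1.5$, the integration range $[0,60]$, and the step size are assumptions of this sketch; the integral uses a plain midpoint rule.

```python
import math

# Properties of the Gamma function.
gamma_one = math.gamma(1.0)
gamma_half = math.gamma(0.5)
recurrence_gap = math.gamma(3.7) - 2.7 * math.gamma(2.7)
factorial_pairs = [(math.gamma(n), math.factorial(n - 1)) for n in range(1, 8)]

def gamma_density(x, alpha, beta):
    """Density of Gamma(alpha, beta) with scale parameter beta, for x > 0."""
    return x ** (alpha - 1) * math.exp(-x / beta) / (beta**alpha * math.gamma(alpha))

def integrate(g, lo, hi, steps):
    """Midpoint rule; adequate here because the integrand is smooth."""
    h = (hi - lo) / steps
    return h * sum(g(lo + (i + 0.5) * h) for i in range(steps))

alpha, beta = 2.0, 1.5  # assumed parameters
mean = integrate(lambda x: x * gamma_density(x, alpha, beta), 0.0, 60.0, 60000)
second = integrate(lambda x: x * x * gamma_density(x, alpha, beta), 0.0, 60.0, 60000)
var = second - mean**2

print(gamma_half**2, mean, var)
```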

Normal Distribution

Standard Normal

  • Def: $X$ is called standard normal if its density is

    $$\varphi(x)=\frac{1}{\sqrt{2\pi}}e^{-\frac{x^2}{2}},\ x\in(-\infty,+\infty)$$
  • The cdf has no closed form; its values can be found from tables

    $$\Phi(x)=\int_{-\infty}^{x}\varphi(t)dt=\int_{-\infty}^{x}\frac{1}{\sqrt{2\pi}}e^{-\frac{t^2}{2}}dt$$

    $$\Phi(0)=0.5,\ \Phi(-x)=1-\Phi(x)$$

  • Expectations: if $X$ is standard normal

    $$E(X)=0,\ Var(X)=1$$

    $$X\sim N(0,1)$$

  • Def: $X$ is normal with parameters $\mu,\sigma^2$

    $$X\sim N(\mu,\sigma^2)\Leftrightarrow \frac{X-\mu}{\sigma}\sim N(0,1)$$
  • The cdf and density of $N(\mu,\sigma^2)$ are:

    $$F(x)=P(X\le x)=P(\frac{X-\mu}{\sigma}\le\frac{x-\mu}{\sigma})=\Phi(\frac{x-\mu}{\sigma})$$

    $$f(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-\frac{(x-\mu)^2}{2\sigma^2}},\ x\in(-\infty,+\infty)$$

  • Expectations:

    $$E(X)=\mu,\ Var(X)=\sigma^2$$

  • pth quantile

    • Def: for $p\in(0,1)$, the pth quantile $x_p$ of $X$ satisfies $P(X\le x_p)=p$
    • Def: for $p\in(0,1)$, the critical value $c_p$ of $X$ satisfies $P(X\ge c_p)=p$
    • $x_p=c_{1-p}$
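
In place of tables, the standard library's `statistics.NormalDist` (Python 3.8+) gives $\Phi$ and its inverse. The sketch below illustrates standardization and the relation between quantiles and critical values; the numbers $\mu=10$, $\sigma=2$, $x=13$ are assumed, and `quantile` and `critical_value` are illustrative helper names.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal N(0, 1)

phi_zero = std.cdf(0.0)                             # Phi(0) = 0.5
symmetry_gap = std.cdf(-1.3) - (1 - std.cdf(1.3))   # Phi(-x) = 1 - Phi(x)

# Standardization: F(x) = Phi((x - mu) / sigma). NormalDist takes sigma, not sigma^2.
mu, sigma, x = 10.0, 2.0, 13.0  # assumed values
cdf_direct = NormalDist(mu, sigma).cdf(x)
cdf_standardized = std.cdf((x - mu) / sigma)

def quantile(p):
    """x_p with P(X <= x_p) = p for the standard normal."""
    return std.inv_cdf(p)

def critical_value(p):
    """c_p with P(X >= c_p) = p for the standard normal."""
    return std.inv_cdf(1 - p)

print(cdf_direct, quantile(0.975), critical_value(0.025))
```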

Lecture 12

Central Limit Theorem

  • Th (Lindeberg-Lévy): if $\{X_i\}$ is an iid sequence with $E(X_k)=\mu$ and $Var(X_k)=\sigma^2$ ($0\lt\sigma^2\lt\infty$), define

    $$Y_n=\frac{\sum_{k=1}^{n}X_k-n\mu}{\sqrt n \sigma}=\frac{\frac 1n\sum_{k=1}^{n}X_k-\mu}{\sigma/\sqrt n}$$
  • Then

    $$\lim_{n\to+\infty}P(Y_n\le x)=\Phi(x)$$

    so for large $n$, approximately,

    $$\sum_{k=1}^{n}X_k\sim N(n\mu,n\sigma^2),\ \frac 1n\sum_{k=1}^{n}X_k\sim N(\mu, \frac{\sigma^2}{n})$$
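
A small simulation can illustrate the theorem. The sketch assumes $X_k\sim U(0,1)$ (so $\mu=\frac12$, $\sigma^2=\frac1{12}$), $n=30$, 20000 repetitions, and a fixed seed; it compares the empirical $P(Y_n\le1)$ with $\Phi(1)$.

```python
import random
from statistics import NormalDist

def standardized_sum(n, rng):
    """Y_n for n iid U(0, 1) draws (mu = 1/2, sigma^2 = 1/12)."""
    mu, sigma = 0.5, (1 / 12) ** 0.5
    total = sum(rng.random() for _ in range(n))
    return (total - n * mu) / (n**0.5 * sigma)

rng = random.Random(0)  # fixed seed so the run is repeatable
n, reps = 30, 20000     # assumed sample size and number of repetitions
samples = [standardized_sum(n, rng) for _ in range(reps)]

x = 1.0
empirical = sum(y <= x for y in samples) / reps
target = NormalDist().cdf(x)

print(empirical, target)
```

The two values agree only up to simulation noise, which shrinks as the number of repetitions grows.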

Lecture 13

Estimation Methods

  1. Moment estimate
    • Fundamental basis: $\{X_i\}$ iid with $E(X_i)=\mu,\ Var(X_i)=\sigma^2$; for large $n$, approximately,

      $$\overline X=\frac 1n\sum_{i=1}^n X_i\sim N(\mu,\frac{\sigma^2}{n})\Rightarrow\overline X\xrightarrow{n\to\infty} \mu$$

    • The distribution parameter $\theta$ is related to $\mu$

    • Estimation:

      $$E(X)=\mu=g(\theta)\longrightarrow\theta=h(\mu)\approx h(\overline X)=\hat\theta$$
  2. The Method of Maximum Likelihood
  • Suppose the population $X\sim f(x,\theta)$ and $x_1,...,x_n$ is an observed iid sample

    $$P(X_1=x_1,X_2=x_2,...,X_n=x_n)=f(x_1,\theta)f(x_2, \theta)...f(x_n,\theta)\equiv L(\theta)$$

  • $L(\theta)$ is called the likelihood function

  • The maximum likelihood estimate (mle) $\hat\theta$ is chosen so that:

    $$L(\hat\theta)=\max_\theta L(\theta)$$
  • Solution of mle for uniform distribution

    1. find the likelihood function for $X\sim U(a,b)$; when every observation lies in $[a,b]$ (otherwise $L=0$),

      $$L(a,b)=\prod_{i=1}^{n}f(x_i)=(\frac{1}{b-a})^n$$

    2. check the monotonicity of $L$ and the constraints on $a$ and $b$:

      $$\frac{\partial L(a,b)}{\partial a}\gt0,\ \frac{\partial L(a,b)}{\partial b}\lt0$$

      $$\forall i,\ a\le X_i\le b\Rightarrow a\le \min\{X_i\},\ b\ge \max\{X_i\}$$

    3. The likelihood function is strictly increasing in $a$ and strictly decreasing in $b$, so the mle are:

      $$\hat a=\min\{X_i\},\ \hat b=\max\{X_i\}$$
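
The uniform mle steps above can be sketched in code. The true endpoints 2 and 5, the sample size 50, and the seed are assumptions made for illustration; `likelihood` is a hypothetical helper that returns 0 whenever an observation falls outside $[a,b]$.

```python
import random

def likelihood(a, b, xs):
    """L(a, b) for U(a, b); zero if any observation falls outside [a, b]."""
    if a >= b or min(xs) < a or max(xs) > b:
        return 0.0
    return (1.0 / (b - a)) ** len(xs)

rng = random.Random(1)      # fixed seed so the run is repeatable
true_a, true_b = 2.0, 5.0   # assumed true parameters
xs = [true_a + (true_b - true_a) * rng.random() for _ in range(50)]

# The mle: the tightest interval that still contains every observation.
a_hat, b_hat = min(xs), max(xs)

l_hat = likelihood(a_hat, b_hat, xs)
l_wider = likelihood(a_hat - 0.1, b_hat + 0.1, xs)  # feasible but wider: smaller L
l_narrow = likelihood(a_hat + 0.1, b_hat, xs)       # excludes the minimum: L = 0

print(a_hat, b_hat, l_hat, l_wider, l_narrow)
```

Widening the interval lowers the likelihood and narrowing it makes the sample impossible, which is why the maximum sits at the sample extremes.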

Lecture 14

Unbiasedness

  • Def: if $E(\hat\theta)=\theta$, $\hat\theta$ is called unbiased
  • Def: $b(\hat\theta)=E(\hat\theta)-\theta$ is called the bias
  • Def: if $b(\hat\theta)\ne0$ but $\lim_{n\to+\infty}b(\hat\theta)=0$, $\hat\theta$ is called asymptotically unbiased

Efficiency

  • Def: if both $\hat\theta_1$ and $\hat\theta_2$ are unbiased, $\hat\theta_1$ is more efficient than $\hat\theta_2$ if $Var(\hat\theta_1)\lt Var(\hat\theta_2)$

Mean Squared Error(MSE)

  • Def: the mean squared error is:

    $$M(\hat\theta)=E[(\hat\theta-\theta)^2]$$

  • The MSE can be computed as:

    $$M(\hat\theta)=Var(\hat\theta)+b^2(\hat\theta)$$
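
The decomposition $M(\hat\theta)=Var(\hat\theta)+b^2(\hat\theta)$ can be verified exactly for a biased estimator. The sketch assumes $n=10$ Bernoulli trials with $p=\frac{3}{10}$ and the estimator $\hat p=\frac{S+1}{n+2}$, where $S$ is the number of successes; this estimator is chosen only as an illustration of a biased one.

```python
from fractions import Fraction
from math import comb

n, p = 10, Fraction(3, 10)  # assumed sample size and true parameter

def estimator(s):
    """Illustrative biased estimator of p from s successes in n trials."""
    return Fraction(s + 1, n + 2)

# Exact distribution of S ~ B(n, p).
pmf = {s: comb(n, s) * p**s * (1 - p) ** (n - s) for s in range(n + 1)}

mean_est = sum(estimator(s) * w for s, w in pmf.items())
bias = mean_est - p
var_est = sum((estimator(s) - mean_est) ** 2 * w for s, w in pmf.items())
mse = sum((estimator(s) - p) ** 2 * w for s, w in pmf.items())

print(bias, var_est, mse)
```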

Lecture 15

Chi-Squared Distribution

  • Def: if $X_1,...,X_n$ are independent with $X_i\sim N(0,1)$, then $X=\sum_{i=1}^{n}X_i^2\sim \chi^2(n)$
  • Derivation of the density:

    $$\chi^2(n)=\Gamma(\frac n2, 2)$$

    $$f(x;n)= \left\{\begin{aligned} &\frac{1}{2^{n/2}\Gamma(n/2)}x^{n/2-1}e^{-x/2},&x\gt0\\&0,&elsewhere \end{aligned} \right.$$

  • Expectations: $X\sim\chi^2(n)\Rightarrow E(X)=n,\ Var(X)=2n$

  • Chi-squared distributions are additive:

    $$X\sim\chi^2(n),\ Y\sim\chi^2(m),\ X,Y\ indep\Rightarrow X+Y\sim\chi^2(n+m)$$

t-Distribution

  • Def: if $X\sim N(0,1)$ and $Y\sim\chi^2(n)$ are independent, then $T=\frac{X}{\sqrt{Y/n}}\sim t(n)$
  • Density:

    $$f(t)=\frac{\Gamma[(n+1)/2]}{\Gamma(n/2)\sqrt{n\pi}}(1+\frac{t^2}{n})^{-(n+1)/2},\ -\infty\lt t\lt +\infty$$

  • The density is an even function

  • Limit is standard normal: $\lim_{n\to\infty} f(t)=\varphi(t)$

F-Distribution

  • Def: if $X\sim \chi^2(n_1)$ and $Y\sim \chi^2(n_2)$ are independent, then $F=\frac{X/n_1}{Y/n_2}\sim F(n_1, n_2)$
  • Property: $F\sim F(n_1,n_2)\Rightarrow1/F\sim F(n_2,n_1)$
  • Limit case: when both $n_1$ and $n_2$ are large, $F$ is approximately normally distributed

Sampling Distribution Theorems

  • Suppose the population is normal, $X\sim N(\mu,\sigma^2)$, with an iid sample $X_1,...,X_n$, sample mean $\overline X$ and sample variance $S^2$

  • Th1:

    $$\overline{X}\sim N(\mu,\frac{\sigma^2}{n})\ \text{or}\ \frac{\overline X-\mu}{\sigma/\sqrt{n}}\sim N(0,1)$$
  • Th2: $\overline X$ and $S^2$ are independent, and

    $$\frac{(n-1)S^2}{\sigma^2}=\sum_{i=1}^{n}\frac{(X_i-\overline X)^2}{\sigma^2}\sim\chi^2(n-1)$$
  • Th3:

    $$\frac{\overline X-\mu}{S/\sqrt{n}}\sim t(n-1)$$

Lecture 16

CI under Normal Distribution

  • find $\mu$
    • $X\sim N(\mu, \sigma^2)$, and $\sigma^2$ is given
      1. use $\overline X$ as the point estimate, $\overline X\approx\mu$
      2. construct $Z=\frac{\overline X-\mu}{\sigma/\sqrt n}\sim N(0,1)$
      3. find $z_{\alpha/2}$ such that $P(-z_{\alpha/2}\lt Z\lt z_{\alpha/2})=1-\alpha$
      4. solve $-z_{\alpha/2}\lt Z\lt z_{\alpha/2}\Leftrightarrow \overline X-z_{\alpha/2}\frac{\sigma}{\sqrt n}\lt \mu \lt \overline X+z_{\alpha/2}\frac{\sigma}{\sqrt n}$
    • $X\sim N(\mu, \sigma^2)$, and $\sigma^2$ is unknown
      1. use $\overline X$ as the point estimate, $\overline X\approx\mu$
      2. construct $T=\frac{\overline X-\mu}{S/\sqrt n}\sim t(n-1)$
      3. find $t_{\alpha/2}$ such that $P(-t_{\alpha/2}\lt T\lt t_{\alpha/2})=1-\alpha$
      4. solve $-t_{\alpha/2}\lt T\lt t_{\alpha/2}\Leftrightarrow \overline X-t_{\alpha/2}\frac{S}{\sqrt n}\lt \mu\lt \overline X+t_{\alpha/2}\frac{S}{\sqrt n}$
  • find $\sigma$
    • $X\sim N(\mu,\sigma^2)$, and $\mu$ is given
      1. construct $W=\sum_{i=1}^{n}\frac{(X_i-\mu)^2}{\sigma^2}\sim\chi^2(n)$
      2. solve $P(\chi^2_{1-\alpha/2}\lt W\lt \chi^2_{\alpha/2})=1-\alpha$ for $\sigma^2$: $\frac{\sum_{i=1}^{n}(X_i-\mu)^2}{\chi^2_{\alpha/2}}\lt \sigma^2\lt \frac{\sum_{i=1}^{n}(X_i-\mu)^2}{\chi^2_{1-\alpha/2}}$
    • $X\sim N(\mu,\sigma^2)$, and $\mu$ is unknown
      1. construct $W=\frac{n-1}{\sigma^2}S^2=\sum_{i=1}^{n}\frac{(X_i-\overline X)^2}{\sigma^2}\sim\chi^2(n-1)$
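
The known-variance interval for $\mu$ described above can be sketched with the standard library, which provides normal quantiles through `statistics.NormalDist` (the $t$ and $\chi^2$ cases need quantiles that the standard library does not offer, so they are left out). The data values and $\sigma=0.3$ are made up for illustration, and `z_interval` is a hypothetical helper name.

```python
from statistics import NormalDist, mean

def z_interval(xs, sigma, alpha=0.05):
    """(1 - alpha) confidence interval for mu when sigma is known."""
    n = len(xs)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # critical value z_{alpha/2}
    half_width = z * sigma / n**0.5
    xbar = mean(xs)
    return xbar - half_width, xbar + half_width

# Assumed data: nine measurements with known sigma = 0.3.
data = [9.8, 10.4, 10.1, 9.6, 10.3, 10.0, 9.9, 10.2, 10.5]
low, high = z_interval(data, sigma=0.3)

print(low, high)
```

The interval is centered at $\overline X$ and its half-width shrinks like $1/\sqrt n$.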

Sampling Distribution under Two Populations

  • Suppose $X\sim N(\mu_1, \sigma_1^2),\ Y\sim N(\mu_2,\sigma_2^2)$

  • $X$, $Y$ are independent, with samples of sizes $n_1$ and $n_2$ drawn from $X$ and $Y$ respectively

  • Th1: variances known

    $$\frac{(\overline X-\overline Y)-(\mu_1-\mu_2)}{\sqrt{\sigma_1^2/n_1+\sigma_2^2/n_2}}\sim N(0,1)$$
  • Th2: variances unknown but equal

    $$\frac{(\overline X-\overline Y)-(\mu_1-\mu_2)}{S_p\sqrt{1/n_1+1/n_2}}\sim t(n_1+n_2-2)$$

    $$S_p^2=\frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{n_1+n_2-2}$$
  • Th3: Sampling theorem for variance

    $$\frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}\sim F(n_1-1,n_2-1)$$

Sample variance

$$S^2=\frac{\sum(X_i-\overline X)^2}{n-1}$$

Let $X_1,...,X_n$ be an iid sample from a population whose distribution is unknown but has mean $\mu$ and variance $\sigma^2$. The sample mean is approximately normal for large $n$; only its mean and variance are used below.

$$X\sim \ ?,\ E(X)=\mu,\ Var(X)=\sigma^2$$

$$\overline X=\frac 1n\sum X_i\sim N(\mu,\frac{\sigma^2}{n})$$

$$Var(X)=E(X^2)-E^2(X),\ Var(\overline X )=E({\overline X}^2)-E^2(\overline X)$$

$$E(X^2)=\mu^2+\sigma^2,\ E(\overline X^2)=\mu^2+\frac{\sigma^2}{n}$$

$$\begin{aligned}E(\sum(X_i-\overline X)^2)=& E(\sum(X_i^2+\overline X^2-2X_i\overline X))\\=&E(\sum X_i^2+n\overline X^2-2\overline X\sum X_i)\\=&E(\sum X_i^2+n\overline X^2-2n\overline X^2)\\=&\sum E(X_i^2)-E(n\overline X^2)\\=&nE(X^2)-nE(\overline X^2)\\=&n(\mu^2+\sigma^2)-n(\mu^2+\frac{\sigma^2}{n})\\=&n\sigma^2-\sigma^2=(n-1)\sigma^2\end{aligned}$$

$$\begin{aligned}\Rightarrow E(\sum(X_i-\overline X)^2)&=(n-1)\sigma^2\\\frac{E(\sum(X_i-\overline X)^2)}{n-1}&=\sigma^2\\E(\frac{\sum(X_i-\overline X)^2}{n-1})&=\sigma^2\end{aligned}$$

$$S^2=\frac{\sum(X_i-\overline X)^2}{n-1}\Rightarrow E(S^2)=\sigma^2$$
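
The unbiasedness of $S^2$ can be confirmed exactly by enumerating every possible sample. The sketch assumes a small discrete population $\{1,2,4,7\}$ with equal probabilities and samples of size $n=3$ drawn with replacement (so the draws are iid); both choices are illustrative.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 4, 7]  # assumed equally likely values
n = 3                      # assumed sample size

mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def sample_var(xs, ddof):
    """Sum of squared deviations from the sample mean, divided by (n - ddof)."""
    xbar = Fraction(sum(xs), len(xs))
    return sum((x - xbar) ** 2 for x in xs) / (len(xs) - ddof)

# All ordered samples of size n; with iid draws each is equally likely.
samples = list(product(population, repeat=n))

e_s2 = sum(sample_var(s, 1) for s in samples) / len(samples)      # divisor n - 1
e_biased = sum(sample_var(s, 0) for s in samples) / len(samples)  # divisor n

print(sigma2, e_s2, e_biased)
```

Dividing by $n$ instead of $n-1$ gives an expected value of $\frac{n-1}{n}\sigma^2$, which is why the $n-1$ divisor is used.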