Let \(X_{1}, X_{2}, \ldots, X_{n}\) be an independent and identically distributed (i.i.d.) sample drawn from an exponential family.
`f(x ; \theta) = c(\theta)\, h(x) \exp\left(p(\theta)\, T(x)\right) \tag{1}`
The joint density function of the sample is
`f(\underline{x} \, ; \, \theta) = c(\theta)^n \prod_{i=1}^{n} h(x_{i}) \exp\left(p(\theta) \sum_{i=1}^{n} T(x_{i})\right)`
Here the joint density factors as $f(\underline{x};\theta) = g\left(\sum_{i=1}^{n} T(x_i),\, p(\theta)\right) h(\underline{x})$, where $h(\underline{x})=\prod_{i=1}^{n}h(x_{i})$ and
`g\left(\sum_{i=1}^{n} T(x_i),\, p(\theta)\right) = \left[c(\theta)\right]^n \exp\left(p(\theta) \sum_{i=1}^{n} T(x_i)\right)`
The exponential families of distributions are also called regular families, since they satisfy certain mild regularity conditions in addition to the property that their support does not depend on the parameter $\theta$. In the natural parameterization $\boldsymbol{\eta} = p(\theta)$, with log-partition function $A(\boldsymbol{\eta}) = -\log c(\theta)$, the density takes the canonical form $f(x; \boldsymbol{\eta}) = \exp\left\{ \sum_{j=1}^{k} \eta_j T_j(x) - A(\boldsymbol{\eta}) \right\} h(x)$, and we have the important results
`E_{\boldsymbol{\eta}} \left( \sum_{i=1}^{n} T_j(X_i) \right) = n \frac{\partial A(\boldsymbol{\eta})}{\partial \eta_j} \tag{1.2.12}`
and
`\mathrm{cov} \left( \sum_{i=1}^{n} T_j(X_i), \sum_{i=1}^{n} T_k(X_i) \right) = n \frac{\partial^2 A(\boldsymbol{\eta})}{\partial \eta_j \, \partial \eta_k} \tag{1.2.13}`
The statistic $\mathbf{T}$ contains all the information about $\boldsymbol{\eta}$ (or $\boldsymbol{\theta}$) that is contained in the data. For this reason we are interested in the family of distributions of $\mathbf{T} = (T_1, \dots, T_k)$.
Theorem 1.2.1. If the r.v. $X$ is distributed according to an exponential family in the canonical form above, then $\mathbf{T} = (T_1, \dots, T_k)$ is distributed according to an exponential family with density
`f(\mathbf{T}; \boldsymbol{\eta}) = \exp \left\{ \sum_{i=1}^{k} \eta_i T_i - A(\boldsymbol{\eta}) \right\} k(\mathbf{T}) \tag{1.2.14}`
Therefore, by the factorization above, `T(\underline{X})=\sum_{i=1}^{n} T(X_{i})` is sufficient for $\theta$; in a full-rank exponential family it is also complete.
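The following is a quick numerical sketch of the moment identity (1.2.12), assuming NumPy is available: for the Poisson family, $T(x) = x$, $\eta = \log\lambda$, and $A(\eta) = e^{\eta}$, so $\partial A / \partial \eta = e^{\eta} = \lambda = E[X]$. The code compares a finite-difference derivative of $A$ with a Monte Carlo estimate of $E[T(X)]$.

```python
# Numerical check of E[T(X)] = dA/d(eta) for the Poisson family,
# where T(x) = x, eta = log(lambda), A(eta) = exp(eta).
import numpy as np

rng = np.random.default_rng(0)

lam = 2.5
eta = np.log(lam)

# Finite-difference derivative of the log-partition A(eta) = exp(eta).
eps = 1e-6
A = lambda e: np.exp(e)
dA_deta = (A(eta + eps) - A(eta - eps)) / (2 * eps)

# Monte Carlo estimate of E[T(X)] = E[X].
sample = rng.poisson(lam, size=200_000)
print(dA_deta, sample.mean())  # both should be close to lambda = 2.5
```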
Base Measure
The function $h(x)$ is called the base measure.
Example (Binomial Distribution). Let $X \sim \mathrm{Bin}(n, p)$ with $n > 1$ and $0 < p < 1$. We represent this Binomial pmf in exponential family form.
The pmf is given by
`P(X=x)=\binom{n}{x} p^{x} (1-p)^{n-x} \, \mathbb{I}_{\{x \in \{0, 1, 2, \ldots, n\}\}} \tag{2}`
Rewriting it in exponential family form, we get
$\Rightarrow f(x \, | \, p) = \binom{n}{x} \left(\frac{p}{1-p}\right)^x (1-p)^n \, \mathbb{I}_{\{x \in \{0, 1, 2, \ldots, n\}\}}$
$\Rightarrow f(x \, | \, p) = \binom{n}{x} \exp\left[x \log \frac{p}{1-p} + n \log(1-p)\right] \mathbb{I}_{\{x \in \{0, 1, \ldots, n\}\}}$
Writing the natural parameter \( \eta(p) = \log \frac{p}{1-p} \), \( T(x) = x \), \( \psi(p) = -n \log (1 - p) \), and \( h(x) = \binom{n}{x} \mathbb{I}_{\{x \in \{0, 1, \ldots, n\}\}} \), we have represented the pmf \( f(x \mid p) \) in the one-parameter Exponential family form \( f(x \mid p) = h(x) \exp\left( \eta(p)\, T(x) - \psi(p) \right) \), as long as \( p \in (0, 1) \).
For \( p = 0 \) or \( p = 1 \), the distribution degenerates to a one-point distribution. Consequently, the family \( \{f(x \mid p) : 0 < p < 1\} \) forms a one-parameter Exponential family, but if either boundary value \( p = 0 \) or \( p = 1 \) is included, the enlarged family is no longer an Exponential family.
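A quick numerical sketch (assuming SciPy is available) confirming that the representation above reproduces the Binomial pmf; the values of $n$ and $p$ are arbitrary illustrative choices.

```python
# Check that h(x) * exp(eta * x - psi) equals the Binomial pmf.
import numpy as np
from scipy.stats import binom
from scipy.special import comb

n, p = 10, 0.3
x = np.arange(n + 1)

eta = np.log(p / (1 - p))     # natural parameter log(p/(1-p))
psi = -n * np.log(1 - p)      # psi(p) = -n log(1-p)
pmf_expfam = comb(n, x) * np.exp(eta * x - psi)

assert np.allclose(pmf_expfam, binom.pmf(x, n, p))
print("max abs error:", np.abs(pmf_expfam - binom.pmf(x, n, p)).max())
```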
Example (Gamma Distribution). Suppose $X$ has the Gamma density `f(x; a, \frac{1}{\lambda}) = \frac{\lambda^a x^{a-1} e^{-\lambda x}}{\Gamma(a)}, \quad x > 0, \, a > 0, \, \lambda > 0. \tag{3}`
As such, it has two parameters, $\lambda$ and $a$. If we assume that $a$ is known, we may write the density in the one-parameter Exponential family form:
`f\left(x; a, \frac{1}{\lambda}\right) = x^{a-1} \exp\left( -\lambda x + a \log(\lambda) - \log(\Gamma(a)) \right), \quad x > 0, \, a > 0, \, \lambda > 0.`
where $h(x) = x^{a-1}$, $T(x) = x$, $p(\theta) = -\lambda$, and $c(\theta) = \exp\left( a \log(\lambda) - \log(\Gamma(a)) \right) = \dfrac{\lambda^a}{\Gamma(a)}$, matching the form in Eq. (1).
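A numerical sketch (assuming SciPy is available) checking the exponential family form against the Gamma density with shape $a$ and rate $\lambda$; the values of $a$ and $\lambda$ below are arbitrary.

```python
# Check x^(a-1) * exp(-lambda*x + a*log(lambda) - log(Gamma(a)))
# against the reference Gamma density (shape a, rate lambda).
import numpy as np
from scipy.stats import gamma
from scipy.special import gammaln

a, lam = 3.0, 2.0
x = np.linspace(0.1, 5.0, 50)

pdf_expfam = x**(a - 1) * np.exp(-lam * x + a * np.log(lam) - gammaln(a))
pdf_ref = gamma.pdf(x, a, scale=1 / lam)  # scipy uses scale = 1/rate

assert np.allclose(pdf_expfam, pdf_ref)
print("max abs error:", np.abs(pdf_expfam - pdf_ref).max())
```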
In contrast, \textit{irregular} distributions, whose support depends on the parameter, cannot be fitted into the regular exponential family form. Examples are $X \sim U(0, \theta)$, $X \sim U(-\theta, \theta)$, and the shifted exponential distribution $f(x \, | \, \theta) = e^{-(x-\theta)} \, \mathbb{I}_{(x > \theta)}$.
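To see why such a family fails to be exponential, consider $X \sim U(0, \theta)$; the following is a sketch of the standard support argument.

```latex
% Density of U(0, theta): the indicator couples x and theta.
f(x; \theta) = \frac{1}{\theta}\, \mathbb{I}_{(0 < x < \theta)} .
% If this were of the form c(\theta) h(x) \exp(p(\theta) T(x)), then,
% since c(\theta) and the exponential factor are strictly positive,
% the support would be \{x : h(x) > 0\}, which cannot depend on
% \theta, a contradiction with supp f(.; theta) = (0, theta).
```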
Example (Normal Distribution). The familiar form of the univariate Gaussian density is
`p(x \, | \, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left\{ -\frac{(x - \mu)^2}{2\sigma^2} \right\} \tag{4}`
We put it in exponential family form by expanding the square:
`p(x \, | \, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi}} \exp\left\{ \frac{\mu}{\sigma^2}\, x - \frac{1}{2\sigma^2}\, x^2 - \frac{\mu^2}{2\sigma^2} - \log \sigma \right\} \tag{5}`
We see that:
$\eta = \left( \frac{\mu}{\sigma^2}, -\frac{1}{2\sigma^2} \right) \tag{6}$
$t(x) = \left( x, x^2 \right) \tag{7}$
$A(\eta) = \frac{\mu^2}{2\sigma^2} + \log \sigma \tag{8}$
or, in terms of the natural parameters,
$A(\eta) = -\frac{\eta_1^2}{4\eta_2} - \frac{1}{2} \log(-2\eta_2) \tag{9}$
$h(x) = \frac{1}{\sqrt{2\pi}} \tag{10}$
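As a numerical sketch (assuming NumPy), one can verify that the gradient of $A(\eta)$ in Eq. (9) recovers the moments of the sufficient statistics, $E[x] = \mu$ and $E[x^2] = \mu^2 + \sigma^2$, consistent with Eq. (1.2.12) for $n = 1$; the values of $\mu$ and $\sigma$ are arbitrary.

```python
# Check that the gradient of A(eta) gives E[x] = mu and E[x^2] = mu^2 + sigma^2.
import numpy as np

mu, sigma = 1.5, 0.7
eta1 = mu / sigma**2          # eta_1 = mu / sigma^2
eta2 = -1 / (2 * sigma**2)    # eta_2 = -1 / (2 sigma^2)

def A(e1, e2):
    # Log-partition of the Gaussian in natural parameters, Eq. (9).
    return -e1**2 / (4 * e2) - 0.5 * np.log(-2 * e2)

eps = 1e-6
dA_de1 = (A(eta1 + eps, eta2) - A(eta1 - eps, eta2)) / (2 * eps)
dA_de2 = (A(eta1, eta2 + eps) - A(eta1, eta2 - eps)) / (2 * eps)

print(dA_de1, mu)                # ~ E[x] = mu
print(dA_de2, mu**2 + sigma**2)  # ~ E[x^2] = mu^2 + sigma^2
```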
Many common distributions with multiple parameters also belong to the general class of multiparameter exponential families. For instance, the normal distribution with both its mean and variance unknown is part of this family, as is the multivariate normal distribution.
The methods and properties of multiparameter exponential families closely resemble those of single-parameter exponential families.
Let $X = (X_1, \ldots, X_d)$ have a distribution $P_{\theta}, \theta \in \Theta \subseteq \mathbb{R}^k$. The family of distributions $\{P_{\theta}, \theta \in \Theta\}$ is said to belong to the $k$-parameter Exponential family if its density (or pmf) may be represented in the form
`f(x | \theta) = \exp\left( \sum_{i=1}^{k} p_i(\theta) T_i(x) - \psi(\theta) \right) h(x). \tag{11}`
For $k = 2$, writing $\theta = (\theta_1, \theta_2)$, the density takes the form
`f(x \, ; \, \theta_1, \theta_2) = c(\theta_1, \theta_2) h(x) \exp\left( p_1(\theta_1, \theta_2) T_1(x) + p_2(\theta_1, \theta_2) T_2(x) \right)`
The joint density of the sample observations is
`f(\underline{x} \, ; \, \theta_1, \theta_2) = \left[c(\theta_1, \theta_2)\right]^n \prod_{i=1}^{n} h(x_i) \exp\left( p_1(\theta_1, \theta_2) \sum_{i=1}^{n} T_1(x_i) + p_2(\theta_1, \theta_2) \sum_{i=1}^{n} T_2(x_i) \right)`
so, by the Neyman–Fisher factorization theorem, $\left(\sum_{i=1}^{n} T_1(x_{i}), \sum_{i=1}^{n} T_2(x_{i})\right)$ is jointly sufficient for $(\theta_1, \theta_2)$.
Example (Trinomial Distribution). Consider the trinomial distribution with parameters $\theta_1$ and $\theta_2$. The probability mass function (pmf) is given by:
`f(x, y; \theta_1, \theta_2) = \dfrac{n!}{x! \, y! \, (n - x - y)!} \left( \dfrac{\theta_1}{1 - \theta_1 - \theta_2} \right)^x \left( \dfrac{\theta_2}{1 - \theta_1 - \theta_2} \right)^y (1 - \theta_1 - \theta_2)^{n}`
Rewriting the pmf in exponential family form:
`= (1 - \theta_1 - \theta_2)^n \cdot \dfrac{n!}{x! \, y! \, (n - x - y)!} \exp \left\{ x \ln \left( \dfrac{\theta_1}{1 - \theta_1 - \theta_2} \right) + y \ln \left( \dfrac{\theta_2}{1 - \theta_1 - \theta_2} \right) \right\}`
Therefore, the statistic `\left( \sum X_i, \sum Y_i \right)` is jointly sufficient for `( \theta_1, \theta_2 )` by the Neyman–Fisher factorization theorem.
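As a sanity check (a sketch assuming SciPy is available), the rewrite above can be compared numerically with `scipy.stats.multinomial` for a single observation; the values of $n$, $\theta_1$, $\theta_2$, $x$, $y$ below are arbitrary illustrative choices.

```python
# Check the exponential-family rewrite of the trinomial pmf against
# the reference multinomial pmf with three categories.
import numpy as np
from scipy.stats import multinomial
from scipy.special import gammaln

n, th1, th2 = 8, 0.2, 0.5
x, y = 3, 4  # counts for the first two categories; n - x - y for the third

# log of n! / (x! y! (n-x-y)!)
log_coef = gammaln(n + 1) - gammaln(x + 1) - gammaln(y + 1) - gammaln(n - x - y + 1)
rest = 1 - th1 - th2
pmf_expfam = rest**n * np.exp(log_coef + x * np.log(th1 / rest) + y * np.log(th2 / rest))

pmf_ref = multinomial.pmf([x, y, n - x - y], n=n, p=[th1, th2, rest])
print(pmf_expfam, pmf_ref)  # should agree
```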
Example (Discrete Uniform Distribution). Let $X_1, X_2, \ldots, X_n$ be a random sample from the discrete uniform distribution
`P_{N}(x)=\frac{1}{N}, \quad x=1,2,3,\ldots,N`
The cdf of $T = X_{(n)}$ is
`P(X_{(n)} \leq t)=P(X_{1} \leq t, X_{2} \leq t, \ldots, X_{n} \leq t) = \left(\frac{t}{N}\right)^{n}, \quad t = 1, 2, \ldots, N`
`P(X_{(n)} = t)=P(X_{(n)} \leq t) - P(X_{(n)} \leq t-1) = \left(\frac{t}{N}\right)^{n} - \left(\frac{t-1}{N}\right)^{n}`
The conditional distribution of $\underline{X}$ given $T=t$ is
`P(\underline{X}=\underline{x} \, | \, T=t) = \frac{P(X_{1}=x_{1}, \ldots, X_{n}=x_{n},\, X_{(n)}=t)}{P(X_{(n)}=t)}`
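The pmf of $X_{(n)}$ derived above can be checked by simulation; the following Monte Carlo sketch (assuming NumPy is available) compares the exact formula with empirical frequencies for arbitrary illustrative values of $N$ and $n$.

```python
# Monte Carlo check of P(X_(n) = t) = (t/N)^n - ((t-1)/N)^n
# for a sample of size n from the discrete uniform on {1, ..., N}.
import numpy as np

rng = np.random.default_rng(1)
N, n, reps = 6, 4, 200_000

# Each row is a sample of size n; take the row-wise maximum X_(n).
maxima = rng.integers(1, N + 1, size=(reps, n)).max(axis=1)

t = np.arange(1, N + 1)
empirical = np.array([(maxima == k).mean() for k in t])
exact = (t / N)**n - ((t - 1) / N)**n

print(np.abs(empirical - exact).max())  # should be small
```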