In statistical hypothesis testing, our goal is to use sample data to make decisions about population characteristics. We set up two competing claims: the null hypothesis ($H_0$), which typically represents the status quo or no effect, and the alternative hypothesis ($H_1$), which represents the effect we are looking for. However, not all statistical tests are created equal. The challenge is to find the "best" test: one that correctly rejects a false null hypothesis as often as possible. This is where the concepts of Most Powerful (MP) and Uniformly Most Powerful (UMP) tests become critical. A most powerful test of size $\alpha$ is the test that has the highest power among all tests whose size does not exceed $\alpha$.
Consider, for instance, a clinical trial designed to evaluate whether a new drug is more effective than the standard treatment. A poorly chosen test may fail to detect a genuine improvement, leading to the rejection of a potentially life-saving therapy.
Similarly, in quality control and manufacturing, detecting even a slight deviation from the target mean can prevent large-scale production losses. In finance, identifying abnormal market returns may help investors detect inefficiencies or fraudulent activities. In each of these examples, an optimal testing procedure—one that maximizes the power for detecting true effects—is crucial for making reliable decisions.
The concept of Most Powerful Tests provides a solution to this challenge by focusing on tests that offer the highest probability of correctly rejecting the null hypothesis for a given significance level. Extending this idea, Uniformly Most Powerful (UMP) Tests seek procedures that maintain maximal power across all possible parameter values under the alternative hypothesis.
In the parameter space, we denote the set of parameter values specified by the null hypothesis by $ \Theta_0 $ and the set specified by the alternative by $ \Theta_1 $. If a hypothesis contains a single parameter value (i.e., it completely specifies the distribution), it is called a simple hypothesis.
We write the testing problem as $ H_0: \theta \in \Theta_0 $ versus $ H_1: \theta \in \Theta_1 $. When either $ \Theta_0 $ or $ \Theta_1 $ contains multiple parameter values, the hypothesis is called composite (for example, $H_0: \mu = \mu_0$ is simple, while $H_1: \mu > \mu_0$ is composite). This distinction affects both the form of the most powerful tests and the methods used to construct them.
Test function: The test function $ \phi(x) $ gives the probability of rejecting $H_0$ when $X=x$ is observed. For a non-randomized test with critical (rejection) region $ \omega $, it is simply the indicator of that region:
` \phi(x) = \begin{cases} 1, & \text{if } x \in \omega \\ 0, & \text{if } x \in \omega^c \end{cases} `
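As a concrete illustration, here is a minimal sketch of such a test function in Python, assuming a critical region of the form $\{x : \bar{x} \ge c\}$ (the cutoff $c$ and the data are hypothetical):

```python
import numpy as np

def phi(x, c):
    """Non-randomized test function: return 1 (reject H0) iff the
    sample mean lies in the critical region {x : x_bar >= c}."""
    return 1 if np.mean(x) >= c else 0

# Hypothetical observed sample and cutoff
x_obs = np.array([0.2, 0.9, 0.7, 0.4])
print(phi(x_obs, c=0.5))   # mean is 0.55 >= 0.5, so the test rejects (prints 1)
```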
Error probabilities are:
` P(\text{Type I error}) = P(x \in \omega \mid H_0) = \int_\omega f_{\theta_0}(x)\,dx `
` P(\text{Type II error}) = P(x \in \omega^c \mid H_1) = \int_{\omega^c} f_{\theta_1}(x)\,dx `
The power of a test is the probability of correctly rejecting a false null hypothesis; the size (significance level) $ \alpha $ is the probability of incorrectly rejecting a true null.
` 1 - \beta = P(x \in \omega \mid H_1) = \int_\omega f_{\theta_1}(x)\,dx `
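As a hedged numerical sketch (the values $\mu_0 = 0$, $\mu_1 = 1$, $\sigma = 1$, $n = 9$ and the cutoff are illustrative assumptions, not from the text), the two error probabilities and the power of the one-sided test that rejects when $\bar{x} \ge c$ follow directly from the normal distribution of $\bar{X}$:

```python
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma, n = 0.0, 1.0, 1.0, 9      # illustrative parameter values
se = sigma / np.sqrt(n)                    # standard error of the sample mean
c = 0.55                                   # an arbitrary cutoff defining omega = {x_bar >= c}

alpha = norm.sf(c, loc=mu0, scale=se)      # P(x_bar >= c | H0): Type I error (size)
beta  = norm.cdf(c, loc=mu1, scale=se)     # P(x_bar <  c | H1): Type II error
power = 1 - beta                           # P(x_bar >= c | H1): power
print(f"size = {alpha:.4f}, Type II error = {beta:.4f}, power = {power:.4f}")
```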
A Best Critical Region (BCR) of size $ \alpha $ maximizes power among all regions with Type I error $ \le \alpha $. For simple versus simple problems the Neyman–Pearson lemma gives a complete characterization of the BCR.
The NP lemma states that the most powerful test of size $ \alpha $ for testing $ H_0 : \theta = \theta_0 $ against $ H_1 : \theta = \theta_1 $ rejects $H_0$ when the likelihood ratio exceeds a threshold $k$, chosen so that the test has size $\alpha$.
` \omega = \{ x : \dfrac{L(\theta_1; x)}{L(\theta_0; x)} \ge k \}, \quad k > 0 `
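A minimal sketch of this likelihood-ratio rule for two fully specified normal means (the parameter values, threshold $k$, and simulated data below are assumptions for illustration, not a prescription for choosing $k$):

```python
import numpy as np
from scipy.stats import norm

def likelihood_ratio(x, theta0, theta1, sigma=1.0):
    """L(theta1; x) / L(theta0; x) for an i.i.d. N(theta, sigma^2) sample,
    computed on the log scale for numerical stability."""
    ll1 = norm.logpdf(x, loc=theta1, scale=sigma).sum()
    ll0 = norm.logpdf(x, loc=theta0, scale=sigma).sum()
    return np.exp(ll1 - ll0)

rng = np.random.default_rng(0)
x = rng.normal(loc=0.8, scale=1.0, size=20)            # data actually drawn with mu = 0.8
ratio = likelihood_ratio(x, theta0=0.0, theta1=1.0)
k = 1.0                                                # in practice k is chosen to meet the size condition
print("reject H0" if ratio >= k else "retain H0")
```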
Let $X_1,\dots,X_n$ be a sample from $N(\mu,\sigma^2)$ with known $\sigma^2$. Test $H_0: \mu = \mu_0$ vs $H_1: \mu = \mu_1$ with $\mu_1 > \mu_0$.
The NP most powerful region is of the form ` \bar{x} \ge k_1 `. After standardizing under $H_0$ we obtain:
` \phi(x) = \begin{cases} 1, & \text{if } \bar{x} \ge \mu_0 + z_{\alpha}\dfrac{\sigma}{\sqrt{n}} \\[6pt] 0, & \text{otherwise} \end{cases} `
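A short sketch of this rejection rule and its power (the values of $\mu_0$, $\mu_1$, $\sigma$, $n$, and $\alpha$ are illustrative assumptions):

```python
import numpy as np
from scipy.stats import norm

mu0, mu1, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05    # assumed values
se = sigma / np.sqrt(n)

c = mu0 + norm.ppf(1 - alpha) * se       # critical value mu0 + z_alpha * sigma / sqrt(n)
power = norm.sf(c, loc=mu1, scale=se)    # P(x_bar >= c | mu = mu1)
print(f"critical value = {c:.3f}, power = {power:.3f}")
```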
For the opposite one-sided alternative ($\mu_1 < \mu_0$) the direction reverses, with $k_1 = \mu_0 - z_{\alpha}\dfrac{\sigma}{\sqrt{n}}$:
` \phi(x) = \begin{cases} 1, & \text{if } \bar{x} \le k_1 \\ 0, & \text{otherwise} \end{cases} `
Let $X_1,\dots,X_n$ be a sample from $N(\mu,\sigma^2)$ with known $\mu$. Test $H_0: \sigma = \sigma_0$ vs $H_1: \sigma = \sigma_1$ with $\sigma_1 > \sigma_0$.
The likelihood ratio leads to a test based on the sum of squared deviations:
` \sum_{i=1}^n (x_i - \mu)^2 \ge k_1 `
Under $H_0$ the statistic `\sum_{i=1}^n\left(\frac{X_i-\mu}{\sigma_0}\right)^2` has a $ \chi^2_n $ distribution. Choosing $k_1$ to satisfy the size condition gives:
` \phi(x) = \begin{cases} 1, & \text{if } \sum_{i=1}^{n}\left(\frac{x_i - \mu}{\sigma_0}\right)^2 \ge \chi^2_{n,\alpha} \\[6pt] 0, & \text{otherwise} \end{cases} `
Here `\chi^2_{n,\alpha}` denotes the upper $\alpha$-quantile of $ \chi^2_n $. In some texts you will see the critical constant written as $ \sigma_0^2 \chi^2_{n,\alpha} $ depending on algebraic arrangement.
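A brief sketch of this variance test with $\mu$ known (the parameter values, sample size, and simulated data are illustrative assumptions):

```python
import numpy as np
from scipy.stats import chi2

mu, sigma0, n, alpha = 0.0, 1.0, 30, 0.05     # assumed values
crit = chi2.ppf(1 - alpha, df=n)              # upper alpha-quantile chi^2_{n, alpha}

rng = np.random.default_rng(1)
x = rng.normal(loc=mu, scale=1.4, size=n)     # simulated data with inflated standard deviation
stat = np.sum(((x - mu) / sigma0) ** 2)       # sum of squared standardized deviations
print("reject H0" if stat >= crit else "retain H0")
```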
For reference, the following tables summarize the standard rejection regions for tests about a normal mean, a normal variance, and the difference of two normal means.

| H0 | H1 | Rejection region (σ² known) | Rejection region (σ² unknown) |
|---|---|---|---|
| μ = μ0 | μ ≠ μ0 | `\left|\dfrac{\sqrt{n}(\bar{x}-\mu_0)}{\sigma}\right| > z_{\alpha/2}` | `\left|\dfrac{\sqrt{n}(\bar{x}-\mu_0)}{s}\right| > t_{\alpha/2,n-1}` | 
| μ ≤ μ0 | μ > μ0 | `\dfrac{\sqrt{n}(\bar{x}-\mu_0)}{\sigma} > z_{\alpha}` | `\dfrac{\sqrt{n}(\bar{x}-\mu_0)}{s} > t_{\alpha,n-1}` | 
| μ ≥ μ0 | μ < μ0 | `\dfrac{\sqrt{n}(\bar{x}-\mu_0)}{\sigma} < -z_{\alpha}` | `\dfrac{\sqrt{n}(\bar{x}-\mu_0)}{s} < -t_{\alpha,n-1}` | 
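For the σ² unknown column, the same statistic is what scipy's one-sample $t$-test computes; a short sketch with hypothetical data:

```python
import numpy as np
from scipy.stats import ttest_1samp

x = np.array([5.1, 4.8, 5.4, 5.0, 5.3, 4.9])   # hypothetical measurements
res = ttest_1samp(x, popmean=5.0)               # two-sided test of H0: mu = 5
print(res.statistic, res.pvalue)                # reject H0 if the p-value is below alpha
```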

| H0 | H1 | Rejection region (μ known) | Rejection region (μ unknown) |
|---|---|---|---|
| `\sigma^2 = \sigma_0^2` | `\sigma^2 \ne \sigma_0^2` | `\dfrac{\sum_{i=1}^{n}(x_i-\mu)^2}{\sigma_0^2} \le \chi^2_{n,1-\alpha/2}` or `\dfrac{\sum_{i=1}^{n}(x_i-\mu)^2}{\sigma_0^2} \ge \chi^2_{n,\alpha/2}` | `\dfrac{(n-1)s^2}{\sigma_0^2} \le \chi^2_{n-1,1-\alpha/2}` or `\dfrac{(n-1)s^2}{\sigma_0^2} \ge \chi^2_{n-1,\alpha/2}` | 
| `\sigma^2 \le \sigma_0^2` | `\sigma^2 > \sigma_0^2` | `\dfrac{\sum_{i=1}^{n}(x_i-\mu)^2}{\sigma_0^2} \ge \chi^2_{n,\alpha}` | `\dfrac{(n-1)s^2}{\sigma_0^2} \ge \chi^2_{n-1,\alpha}` | 
| `\sigma^2 \ge \sigma_0^2` | `\sigma^2 < \sigma_0^2` | `\dfrac{\sum_{i=1}^{n}(x_i-\mu)^2}{\sigma_0^2} \le \chi^2_{n,1-\alpha}` | `\dfrac{(n-1)s^2}{\sigma_0^2} \le \chi^2_{n-1,1-\alpha}` | 
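A sketch of the two-sided variance test with μ unknown, based on the $(n-1)s^2/\sigma_0^2$ statistic from the table (the data and $\sigma_0^2$ below are hypothetical):

```python
import numpy as np
from scipy.stats import chi2

x = np.array([2.1, 1.8, 2.6, 2.4, 1.9, 2.2, 2.7, 2.0])   # hypothetical sample
sigma0_sq, alpha = 0.10, 0.05                              # assumed null variance and level
n = len(x)
stat = (n - 1) * np.var(x, ddof=1) / sigma0_sq             # (n-1) s^2 / sigma_0^2

lower = chi2.ppf(alpha / 2, df=n - 1)         # lower critical value, chi^2_{n-1, 1-alpha/2} in the table's notation
upper = chi2.ppf(1 - alpha / 2, df=n - 1)     # upper critical value, chi^2_{n-1, alpha/2} in the table's notation
print("reject H0" if (stat <= lower or stat >= upper) else "retain H0")
```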

| H0 | H1 | Rejection region (σ₁², σ₂² known) | Rejection region (σ₁² = σ₂² unknown) |
|---|---|---|---|
| `\mu_1 = \mu_2` | `\mu_1 \ne \mu_2` | `\dfrac{|\bar{x}-\bar{y}|}{\sqrt{\dfrac{\sigma_1^2}{m}+\dfrac{\sigma_2^2}{n}}} > z_{\alpha/2}` | `\dfrac{|\bar{x}-\bar{y}|}{s\sqrt{\dfrac{1}{m}+\dfrac{1}{n}}} > t_{\alpha/2,m+n-2}` | 
| `\mu_1 \le \mu_2` | `\mu_1 > \mu_2` | `\dfrac{\bar{x}-\bar{y}}{\sqrt{\dfrac{\sigma_1^2}{m}+\dfrac{\sigma_2^2}{n}}} > z_{\alpha}` | `\dfrac{\bar{x}-\bar{y}}{s\sqrt{\dfrac{1}{m}+\dfrac{1}{n}}} > t_{\alpha,m+n-2}` | 
| `\mu_1 \ge \mu_2` | `\mu_1 < \mu_2` | `\dfrac{\bar{x}-\bar{y}}{\sqrt{\dfrac{\sigma_1^2}{m}+\dfrac{\sigma_2^2}{n}}} < -z_{\alpha}` | `\dfrac{\bar{x}-\bar{y}}{s\sqrt{\dfrac{1}{m}+\dfrac{1}{n}}} < -t_{\alpha,m+n-2}` |
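For the equal-but-unknown-variance column, scipy's pooled two-sample $t$-test computes the same statistic with $m + n - 2$ degrees of freedom; a sketch with hypothetical samples:

```python
import numpy as np
from scipy.stats import ttest_ind

x = np.array([10.2, 9.8, 10.5, 10.1, 9.9])   # hypothetical sample 1
y = np.array([9.6, 9.4, 10.0, 9.5, 9.3])     # hypothetical sample 2
res = ttest_ind(x, y, equal_var=True)         # pooled-variance t test
print(res.statistic, res.pvalue)              # two-sided p-value; reject H0 if below alpha
```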