Hey there! Ready to crunch some numbers? Let’s dive into the world of statistics!

Analysis of Variance

Analysis of Variance: One-Way and Two-Way

Introduction 

Suppose that we wish to be more objective in our analysis of the data. Specifically, suppose that we wish to test for differences between the mean etch rates at all $a = 4$ levels of RF power; that is, we are interested in testing the equality of all four means. It might seem that this problem could be solved by performing a t-test on all six possible pairs of means. However, this is not the best solution. First, performing all six pairwise t-tests is inefficient: it takes considerable effort. Second, conducting all these pairwise comparisons inflates the Type I error. Suppose that all four means are equal and we select $\alpha = 0.05$; then the probability of reaching the correct decision on any single comparison is 0.95, but the probability of reaching the correct conclusion on all six comparisons is considerably less than 0.95, so the Type I error is inflated. The appropriate procedure for testing the equality of several means is the analysis of variance. The analysis of variance, however, has a much wider range of application than this problem; it is probably the most useful technique in the field of statistical inference.
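To see the inflation concretely, if the six comparisons were independent, the chance of at least one false rejection would be:

`1 - (1 - 0.05)^6 = 1 - (0.95)^6 \approx 0.265`

The six pairwise tests are not actually independent, so this is only an approximation, but it illustrates how quickly the family-wise error rate grows.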

ANOVA mind map 

Introduction to ANOVA 

Definitions 

ANOVA is a statistical technique for comparing the means of three or more groups to determine whether there are statistically significant differences between them.

Purpose 

It tests the null hypothesis that all group means are equal against the alternative hypothesis that at least one group mean is different.

Origin

Introduced by Prof. R. A. Fisher in the 1920s, primarily for agricultural experiments.

Applications

Widely used in fields such as agriculture, industry, psychology, education, and business to analyse experimental data.

Assumptions

Independence 

Observations must be independent of each other.

Normality

The data in each group should be normally distributed.

Homogeneity of Variance

The variance of each group should be equal (homoscedasticity).

Additivity

The effects of the different factors are additive (there are no interaction effects in one-way ANOVA).
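As a practical aside, the normality and equal-variance assumptions can be screened with standard tests. A short scipy sketch (using three of the etch-rate groups from the example later in this post):

```python
from scipy import stats

# Three treatment groups (etch rates from the worked example below)
g1 = [575, 542, 530, 539, 570]
g2 = [565, 593, 590, 579, 610]
g3 = [600, 651, 610, 637, 629]

# Normality within each group (Shapiro-Wilk test)
for g in (g1, g2, g3):
    print(stats.shapiro(g).pvalue)

# Homogeneity of variances across groups (Levene's test)
print(stats.levene(g1, g2, g3).pvalue)
```

Large p-values give no evidence against the assumptions; small samples have little power, so these checks are a screen, not a guarantee.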

Introduction to ANOVA (One-Way)


| Treatment (Level) | Observations | Totals | Averages |
|---|---|---|---|
| 1 | $y_{11}, y_{12}, \dots, y_{1n}$ | $y_{1\cdot}$ | $\overline{y}_{1\cdot}$ |
| 2 | $y_{21}, y_{22}, \dots, y_{2n}$ | $y_{2\cdot}$ | $\overline{y}_{2\cdot}$ |
| $\vdots$ | $\vdots$ | $\vdots$ | $\vdots$ |
| $a$ | $y_{a1}, y_{a2}, \dots, y_{an}$ | $y_{a\cdot}$ | $\overline{y}_{a\cdot}$ |
| Totals | | $y_{\cdot\cdot}$ | $\overline{y}_{\cdot\cdot}$ |

Fixed Effect Model

The name ANOVA stems from a partitioning of the total variability in the response variable into components that are consistent with a model for the experiment.

The basic single-factor ANOVA model is:

` y_{ij} = \mu + \tau_i + \varepsilon_{ij}`

where:

  • $y_{ij}$: the $j$th observation under the $i$th treatment
  • $\mu$: grand mean
  • $\tau_i$: fixed effect of the $i$th treatment
  • $\varepsilon_{ij} \sim N(0, \sigma^2)$: random error

Models for the Data

Means Model:

` y_{ij} = \mu_i + \varepsilon_{ij}`

Effect Model:

  • `\mu_i = \mu + \tau_i \quad \text{for } i=1,2,\dots,a`
  • `y_{ij} = \mu + \tau_i + \varepsilon_{ij} \quad \text{for } i=1,2,\dots,a; \ j=1,2,\dots,n`

ANOVA Model

1) Fixed Effect Model

In this model, the \( i \)-th treatment mean \( \mu_i \) is broken into two components: \( \mu_i = \mu + \tau_i \). We consider \( \mu \) as the overall mean so that:

`\frac{1}{a} \sum_{i=1}^{a} \mu_i = \mu`

It implies that:

`\sum_{i=1}^{a} \tau_i = 0`
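This follows directly from the definition of the treatment effects, $\tau_i = \mu_i - \mu$:

`\sum_{i=1}^{a} \tau_i = \sum_{i=1}^{a} (\mu_i - \mu) = a\mu - a\mu = 0`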

2) Random Effect Model

Here, the \( \tau_i \) are random variables. Knowledge of the particular treatment means is less informative; instead, hypotheses are tested about the variability of the \( \tau_i \).

Decomposition of Sum of Squares

Consider the model:

`y_{ij} = \mu + \tau_i + \varepsilon_{ij}, \quad \varepsilon_{ij} \sim N(0,\sigma^2)`

With conditions:

  • \( \sum_{i=1}^{a} \tau_i = 0 \)
  • \( \sum_{j=1}^{n} \varepsilon_{ij} = 0 \)

The decomposition can be written as:

`y_{ij} = \overline{y}_{\cdot\cdot} + (\overline{y}_{i\cdot} - \overline{y}_{\cdot\cdot}) + (y_{ij} - \overline{y}_{i\cdot})`

Rearranging:

`y_{ij} - \overline{y}_{\cdot\cdot} = (\overline{y}_{i\cdot} - \overline{y}_{\cdot\cdot}) + (y_{ij} - \overline{y}_{i\cdot})`

The sum of squares can be decomposed as:

`\sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \overline{y}_{\cdot\cdot})^2 = \sum_{i=1}^{a} \sum_{j=1}^{n} \left[(\overline{y}_{i\cdot} - \overline{y}_{\cdot\cdot}) + (y_{ij} - \overline{y}_{i\cdot})\right]^2`

Total Variability (TSS)

Total variability is measured by the Total Sum of Squares (TSS):

`SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \overline{y}_{\cdot\cdot})^2`

The partitioning of TSS in ANOVA is:

`SS_T = SS_{\text{Treatment}} + SS_E`
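This partition can be checked numerically. Here is a minimal NumPy sketch (the group count, sample size, and simulated values are arbitrary choices, not data from this article):

```python
import numpy as np

rng = np.random.default_rng(42)
a, n = 4, 5                           # a treatments, n observations each
y = rng.normal(600, 20, size=(a, n))  # simulated responses

grand_mean = y.mean()
group_means = y.mean(axis=1)          # ybar_{i.}

ss_total = ((y - grand_mean) ** 2).sum()
ss_treatment = n * ((group_means - grand_mean) ** 2).sum()
ss_error = ((y - group_means[:, None]) ** 2).sum()

# The identity SS_T = SS_Treatment + SS_E holds exactly
assert np.isclose(ss_total, ss_treatment + ss_error)
print(ss_total, ss_treatment + ss_error)
```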

Under the null hypothesis $H_0: \mu_1 = \mu_2 = \cdots = \mu_a$, the test statistics follow:

  • $\dfrac{SS_T}{\sigma^2} \sim \chi^2_{an-1}$
  • $\dfrac{SS_{\text{Treatment}}}{\sigma^2} \sim \chi^2_{a-1}$
  • $\dfrac{SS_E}{\sigma^2} \sim \chi^2_{a(n-1)}$

Hypotheses

  • $H_0: \mu_1 = \mu_2 = \cdots = \mu_a$
  • $H_1: \mu_i \neq \mu_j \text{ for at least one pair } (i, j)$

A large value of $SS_{\text{Treatment}}$ reflects significant differences among treatment means, whereas a small value suggests no substantial difference.

Example

Consider the observed etch rate data:

| RF Power (W) | 1 | 2 | 3 | 4 | 5 | $y_{i\cdot}$ | $\overline{y}_{i\cdot}$ |
|---|---|---|---|---|---|---|---|
| 160 | 575 | 542 | 530 | 539 | 570 | 2756 | 551.2 |
| 180 | 565 | 593 | 590 | 579 | 610 | 2937 | 587.4 |
| 200 | 600 | 651 | 610 | 637 | 629 | 3127 | 625.4 |
| 220 | 725 | 700 | 715 | 685 | 710 | 3535 | 707.0 |

Grand total: $y_{\cdot\cdot} = 12{,}355$   |   Grand mean: $\overline{y}_{\cdot\cdot} = 617.75$

Sum of Squares Calculations

Error sum of squares:

`SS_E = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \overline{y}_{i\cdot})^2 = 5339.20`

Treatment sum of squares:

`SS_{\text{Treatment}} = n \sum_{i=1}^{a} (\overline{y}_{i\cdot} - \overline{y}_{\cdot\cdot})^2 = 66870.55`

Total sum of squares:

`SS_T = SS_E + SS_{\text{Treatment}} = 5339.20 + 66870.55 = 72209.75`
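These figures are easy to reproduce in Python. The sketch below recomputes the sums of squares from the etch rate table and, for comparison, runs the same test with `scipy.stats.f_oneway`:

```python
import numpy as np
from scipy import stats

# Etch rate data (rows: 160, 180, 200, 220 W)
power = {
    160: [575, 542, 530, 539, 570],
    180: [565, 593, 590, 579, 610],
    200: [600, 651, 610, 637, 629],
    220: [725, 700, 715, 685, 710],
}
y = np.array(list(power.values()), dtype=float)
a, n = y.shape

grand_mean = y.mean()
ss_treatment = n * ((y.mean(axis=1) - grand_mean) ** 2).sum()
ss_error = ((y - y.mean(axis=1, keepdims=True)) ** 2).sum()
print(ss_treatment, ss_error)     # 66870.55, 5339.20

# Same hypothesis test, done directly by scipy
f0, p = stats.f_oneway(*power.values())
print(f0, p)                      # F0 ≈ 66.80, p far below 0.05
```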

Degrees of Freedom (df)

Degrees of freedom relations:

`df_{\text{Total}} = df_{\text{Treatment}} + df_{\text{Error}}`

`an - 1 = (a - 1) + a(n - 1)`

Mean Squares

Mean squares are calculated as:

`MS_{\text{Treatment}} = \dfrac{SS_{\text{Treatment}}}{a-1}`

`MS_E = \dfrac{SS_E}{a(n-1)} \quad \text{(an unbiased estimator of } \sigma^2 \text{)}`
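For the etch rate example, with $a - 1 = 3$ and $a(n - 1) = 16$ degrees of freedom:

`MS_{\text{Treatment}} = \dfrac{66870.55}{3} = 22290.18 \qquad MS_E = \dfrac{5339.20}{16} = 333.70 \qquad F_0 = \dfrac{22290.18}{333.70} \approx 66.80`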

Two-Way ANOVA (Analysis of Variance with One Observation per Cell)

Statistical Model:

`Y_{ij} = \mu + T_i + B_j + \epsilon_{ij}, \quad i=1,2,\dots,a, \quad j=1,2,\dots,b`

where:

  • $ Y_{ij} $ = Observed response for the $ i $-th treatment in the $ j $-th block.
  • $ \mu $ = Overall mean.
  • $ T_i $ = Effect of the $ i $-th treatment.
  • $ B_j $ = Effect of the $ j $-th block.
  • $ \epsilon_{ij} $ = Random error component, assumed to be normally distributed with mean 0 and variance $ \sigma^2 $.

Hypothesis:

`H_0: T_1 = T_2 = \dots = T_a = 0`
`H_1: T_i \neq 0 \quad \text{for at least one } i`

Sums of Squares Computation:

`SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} Y_{ij}^2 - \frac{Y_{..}^2}{ab}` 
`SS_B = \frac{1}{a} \sum_{j=1}^{b} Y_{\cdot j}^2 - \frac{Y_{..}^2}{ab}`
`SS_A = \frac{1}{b} \sum_{i=1}^{a} Y_{i \cdot}^2 - \frac{Y_{..}^2}{ab}`
`SS_E = SS_T - SS_A - SS_B`

Total sum of squares can be written as:

`SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} (Y_{ij} - \bar{Y}_{..})^2`

By expanding, we get:

`\sum_{i=1}^{a} \sum_{j=1}^{b} (Y_{ij} - \bar{Y}_{..})^2 = b \sum_{i=1}^{a} (\bar{Y}_{i.} - \bar{Y}_{..})^2 + a \sum_{j=1}^{b} (\bar{Y}_{.j} - \bar{Y}_{..})^2 + \sum_{i=1}^{a} \sum_{j=1}^{b} (Y_{ij} - \bar{Y}_{i.} - \bar{Y}_{.j} + \bar{Y}_{..})^2`

Partitioning the Sum of Squares:

`SS_T = SS_{\text{Treatment}} + SS_{\text{Block}} + SS_E`
`ab - 1 = (a - 1) + (b - 1) + (a-1)(b-1)`

Sum of Squares Expressions (writing $N = ab$ for the total number of observations):

`SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} Y_{ij}^2 - \frac{Y_{..}^2}{N}`
`SS_{\text{Treatment}} = b \sum_{i=1}^{a} \bar{Y}_{i.}^2 - \frac{Y_{..}^2}{N}`
`SS_{\text{Block}} = a \sum_{j=1}^{b} \bar{Y}_{.j}^2 - \frac{Y_{..}^2}{N}`
`SS_E = SS_T - SS_{\text{Treatment}} - SS_{\text{Block}}`

Mean Squares Computation:

`MS_{\text{Treatment}} = \frac{SS_{\text{Treatment}}}{a-1}`
`MS_{\text{Block}} = \frac{SS_{\text{Block}}}{b-1}`
`MS_E = \frac{SS_E}{(a-1)(b-1)}`

F-Test for Treatment Effect:

`F_0 = \frac{MS_{\text{Treatment}}}{MS_E}`

Reject the null hypothesis if:

`F_0 > F_{\alpha, (a-1), (a-1)(b-1)}`

Analysis of Variance Table:

| Source of Variation | Sum of Squares (SS) | DF | Mean Square (MS) | F-value |
|---|---|---|---|---|
| Treatment (A) | $SS_A$ | $a-1$ | $MS_A = \frac{SS_A}{a-1}$ | $F_A = \frac{MS_A}{MS_E}$ |
| Blocks (B) | $SS_B$ | $b-1$ | $MS_B = \frac{SS_B}{b-1}$ | $F_B = \frac{MS_B}{MS_E}$ |
| Error (E) | $SS_E$ | $(a-1)(b-1)$ | $MS_E = \frac{SS_E}{(a-1)(b-1)}$ | |
| Total | $SS_T$ | $ab-1$ | | |
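In practice this blocked layout is usually fitted with software. Below is a minimal `statsmodels` sketch; the factor names and simulated data are illustrative placeholders, not values from this article:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Illustrative layout: a = 3 treatments, b = 4 blocks, one observation per cell
rng = np.random.default_rng(1)
a, b = 3, 4
df = pd.DataFrame({
    "treatment": np.repeat([f"T{i}" for i in range(1, a + 1)], b),
    "block": np.tile([f"B{j}" for j in range(1, b + 1)], a),
    "y": rng.normal(50, 5, size=a * b),
})

# No interaction term: with one observation per cell, the (a-1)(b-1) df go to error
model = smf.ols("y ~ C(treatment) + C(block)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```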

Understanding the One-Way ANOVA Table ($n$ Observations per Treatment)

If the treatment means are equal, the treatment and error mean squares of the model will be (theoretically) equal.

If the treatment means differ, the treatment mean square will be larger than the error mean square of the model.

ANOVA Table

| Source of Variation | Sum of Squares | df | Mean Square (M.Sq) | $F_0$ |
|---|---|---|---|---|
| Between Treatments | `SS_{\text{Treatment}} = n \sum_{i=1}^{a} (\overline{y}_{i\cdot} - \overline{y}_{\cdot\cdot})^2` | $a - 1$ | `MS_{\text{Treatment}} = \dfrac{SS_{\text{Treatment}}}{a - 1}` | `F_0 = \dfrac{MS_{\text{Treatment}}}{MS_E}` |
| Error (Within) | `SS_E = SS_T - SS_{\text{Treatment}}` | $N - a$ | `MS_E = \dfrac{SS_E}{N - a}` | |
| Total | `SS_T = \sum_{i=1}^{a} \sum_{j=1}^{n} (y_{ij} - \overline{y}_{\cdot\cdot})^2` | $N - 1$ | | |

Test Statistic and Decision Rule

The reference distribution for $F_0$ is the $F_{a-1, (N-a)}$ distribution.

Reject the null hypothesis (equal treatment means) if: `F_0 > F_{\alpha, a-1, (N-a)}`

Point and Confidence Interval Estimates

The point estimate of $\mu_i$ is given by: `\hat{\mu}_i = \hat{\mu} + \hat{\tau}_i = \overline{y}_{i\cdot}`

A $100(1 - \alpha)\%$ confidence interval for the $i$th treatment mean is:

`\overline{y}_{i\cdot} - t_{\frac{\alpha}{2}, (N-a)} \sqrt{\dfrac{MS_E}{n}} < \mu_i < \overline{y}_{i\cdot} + t_{\frac{\alpha}{2}, (N-a)} \sqrt{\dfrac{MS_E}{n}}`

For the difference between two treatment means $\mu_i - \mu_j$, the $100(1 - \alpha)\%$ confidence interval is:

`\overline{y}_{i\cdot} - \overline{y}_{j\cdot} - t_{\frac{\alpha}{2}, (N-a)} \sqrt{\dfrac{2MS_E}{n}} < \mu_i - \mu_j < \overline{y}_{i\cdot} - \overline{y}_{j\cdot} + t_{\frac{\alpha}{2}, (N-a)} \sqrt{\dfrac{2MS_E}{n}}`
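As an illustration, the sketch below computes the 95% confidence interval for the 220 W treatment mean in the etch rate example, using the $MS_E$ obtained earlier:

```python
import numpy as np
from scipy import stats

n, N, a = 5, 20, 4
ms_e = 5339.20 / (N - a)          # MS_E = 333.70
ybar = 707.0                      # mean etch rate at 220 W

t_crit = stats.t.ppf(1 - 0.05 / 2, df=N - a)   # t_{0.025,16} ≈ 2.120
half_width = t_crit * np.sqrt(ms_e / n)
print(ybar - half_width, ybar + half_width)    # ≈ (689.68, 724.32)
```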

Important Notes

  • If the null hypothesis is false, the expected value of $MS_{\text{Treatment}}$ is greater than $\sigma^2$.
  • Use the p-value of the $F$-test when deciding whether the treatment effect is significant.
  • Sum of squares between treatments: $SS_{\text{Treatment}} = n \sum_{i=1}^{a} (\overline{y}_{i\cdot} - \overline{y}_{\cdot\cdot})^2$.
  • Sum of squares within (error): $SS_E = SS_T - SS_{\text{Treatment}}$.

Two-Factor ANOVA

The basic two-factor ANOVA model is given by:

`y_{ijk} = \mu + \tau_i + \beta_j + (\tau\beta)_{ij} + \epsilon_{ijk}`

where:

  • $\mu$ = overall mean
  • $\tau_i$ = effect of the $i$th treatment (row factor)
  • $\beta_j$ = effect of the $j$th column factor
  • $(\tau\beta)_{ij}$ = interaction effect between $\tau_i$ and $\beta_j$
  • $\epsilon_{ijk}$ = experimental error, assumed to be independently and normally distributed with mean 0 and variance $\sigma^2$

Index Ranges

  • $i = 1, 2, \dots, a$ (levels of Factor A)
  • $j = 1, 2, \dots, b$ (levels of Factor B)
  • $k = 1, 2, \dots, n$ (number of replications)

Data Layout

| Factor A | Factor B: L1 | L2 | ... | Lb |
|---|---|---|---|---|
| 1 | `y_{111}, y_{112}, \dots, y_{11n}` | `y_{121}, y_{122}, \dots, y_{12n}` | ... | `y_{1b1}, y_{1b2}, \dots, y_{1bn}` |
| 2 | `y_{211}, y_{212}, \dots, y_{21n}` | `y_{221}, y_{222}, \dots, y_{22n}` | ... | `y_{2b1}, y_{2b2}, \dots, y_{2bn}` |
| $\vdots$ | $\vdots$ | $\vdots$ | ... | $\vdots$ |
| $a$ | `y_{a11}, y_{a12}, \dots, y_{a1n}` | `y_{a21}, y_{a22}, \dots, y_{a2n}` | ... | `y_{ab1}, y_{ab2}, \dots, y_{abn}` |

Mean Calculations

The model equation can be written more compactly as:

`y_{ijk} = \mu + M_{ij} + \epsilon_{ijk}`

where $M_{ij} = \tau_i + \beta_j + (\tau\beta)_{ij}$ collects all the treatment effects for cell $(i, j)$.

Mean definitions:

  • Row mean: `\overline{y}_{i..} = \dfrac{1}{bn} \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk}` for $i = 1, 2, \dots, a$
  • Column mean: `\overline{y}_{.j.} = \dfrac{1}{an} \sum_{i=1}^{a} \sum_{k=1}^{n} y_{ijk}` for $j = 1, 2, \dots, b$
  • Cell mean: `\overline{y}_{ij.} = \dfrac{1}{n} \sum_{k=1}^{n} y_{ijk}` for $i=1,\dots,a$ and $j=1,\dots,b$
  • Grand mean: `\overline{y}_{...} = \dfrac{1}{abn} \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk}`

Experimental Setup

In general:

  • There are two factors (row and column).
  • Factor A has $a$ levels (row treatments).
  • Factor B has $b$ levels (column treatments).
  • Each treatment combination has $n$ replications.
  • Total number of runs: $abn$.

Note: This model assumes a fixed effect case. Random effect models require different analysis approaches.

Decomposition of Total Sum of Squares

The objective is to test hypotheses concerning the row (A treatments), column (B treatments), and interaction effects.

The decomposition of the total sum of squares is given by:

`y_{ijk} = \overline{y}_{...} + (\overline{y}_{i..} - \overline{y}_{...}) + (\overline{y}_{.j.} - \overline{y}_{...}) + (\overline{y}_{ij.} - \overline{y}_{i..} - \overline{y}_{.j.} + \overline{y}_{...}) + (y_{ijk} - \overline{y}_{ij.})`
  • $\overline{y}_{...}$ = grand mean
  • $(\overline{y}_{i..} - \overline{y}_{...})$ = row (Factor A) effect
  • $(\overline{y}_{.j.} - \overline{y}_{...})$ = column (Factor B) effect
  • $(\overline{y}_{ij.} - \overline{y}_{i..} - \overline{y}_{.j.} + \overline{y}_{...})$ = interaction effect
  • $(y_{ijk} - \overline{y}_{ij.})$ = error

Total variability is measured by the Total Sum of Squares (TSS):

`SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} (y_{ijk} - \overline{y}_{...})^2`

The total sum of squares is decomposed as:

`SS_T = SS_A + SS_B + SS_{AB} + SS_E`

Where:

  • `SS_A` = Sum of squares due to Factor A (row treatments)
  • `SS_B` = Sum of squares due to Factor B (column treatments)
  • `SS_{AB}` = Sum of squares due to interaction between A and B
  • `SS_E` = Sum of squares due to error

Degrees of Freedom

  • Total: `N - 1 = abn - 1`
  • Factor A: `a - 1`
  • Factor B: `b - 1`
  • Interaction: `(a - 1)(b - 1)`
  • Error: `ab(n - 1)`

Two-Way ANOVA Table

| Source of Variation | Sum of Squares (SS) | Degrees of Freedom (df) | Mean Square (MS) | F-Statistic |
|---|---|---|---|---|
| A treatments | `SS_A` | `a - 1` | `MS_A = \dfrac{SS_A}{a - 1}` | `F_0 = \dfrac{MS_A}{MS_E}` |
| B treatments | `SS_B` | `b - 1` | `MS_B = \dfrac{SS_B}{b - 1}` | `F_0 = \dfrac{MS_B}{MS_E}` |
| Interaction (A × B) | `SS_{AB}` | `(a - 1)(b - 1)` | `MS_{AB} = \dfrac{SS_{AB}}{(a - 1)(b - 1)}` | `F_0 = \dfrac{MS_{AB}}{MS_E}` |
| Error | `SS_E` | `ab(n - 1)` | `MS_E = \dfrac{SS_E}{ab(n - 1)}` | |
| Total | `SS_T` | `abn - 1` | | |
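A minimal `statsmodels` sketch for fitting the full two-factor model with interaction is shown below; the factor names and simulated responses are illustrative placeholders, not data from this article:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(7)
a, b, n = 3, 2, 4                    # levels of A, levels of B, replications
df = pd.DataFrame(
    [(f"A{i}", f"B{j}", rng.normal(100, 6))
     for i in range(1, a + 1) for j in range(1, b + 1) for _ in range(n)],
    columns=["A", "B", "y"],
)

# 'C(A) * C(B)' expands to both main effects plus the A:B interaction
model = smf.ols("y ~ C(A) * C(B)", data=df).fit()
print(sm.stats.anova_lm(model, typ=2))
```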

Interpretation

  • Two-way ANOVA compares mean differences between groups split on two independent variables (factors).
  • The primary purpose is to understand if there is an interaction effect between the two factors on the dependent variable.

Hypothesis Testing in Two-Way ANOVA

In Two-Way ANOVA, we test three hypotheses:

  1. Hypothesis for the equality of row (Factor A) treatment effects:
    Null Hypothesis: `H_0: \tau_1 = \tau_2 = \cdots = \tau_a = 0`
    Alternative Hypothesis: `H_1:` at least one `\tau_i \neq 0`
  2. Hypothesis for the equality of column (Factor B) treatment effects:
    Null Hypothesis: `H_0: \beta_1 = \beta_2 = \cdots = \beta_b = 0`
    Alternative Hypothesis: `H_1:` at least one `\beta_j \neq 0`
  3. Hypothesis for the interaction effects between Factor A and Factor B:
    Null Hypothesis: `H_0: (\tau\beta)_{ij} = 0 \text{ for all } i, j`
    Alternative Hypothesis: `H_1:` at least one `(\tau\beta)_{ij} \neq 0`

Formulas for Sum of Squares

  • Total Sum of Squares (TSS): `SS_T = \sum_{i=1}^{a} \sum_{j=1}^{b} \sum_{k=1}^{n} y_{ijk}^2 - \dfrac{y_{...}^2}{abn}`
  • Sum of Squares for Factor A (Rows): `SS_A = \dfrac{1}{bn}\sum_{i=1}^{a} y_{i..}^2 - \dfrac{y_{...}^2}{abn}`
  • Sum of Squares for Factor B (Columns): `SS_B = \dfrac{1}{an}\sum_{j=1}^{b} y_{.j.}^2 - \dfrac{y_{...}^2}{abn}`
  • Sum of Squares for Interaction: `SS_{AB} = SS_{\text{subtotal}} - SS_A - SS_B`, where `SS_{\text{subtotal}} = \dfrac{1}{n}\sum_{i=1}^{a}\sum_{j=1}^{b} y_{ij.}^2 - \dfrac{y_{...}^2}{abn}` is computed from the cell totals $y_{ij.}$
  • Error Sum of Squares: `SS_E = SS_T - SS_A - SS_B - SS_{AB}`
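These computing formulas can be checked directly in NumPy on a $y_{ijk}$ array. A small sketch with simulated data (the array shape and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, n = 3, 2, 4
y = rng.normal(100, 6, size=(a, b, n))
N = a * b * n
cf = y.sum() ** 2 / N                       # correction factor y_...^2 / abn

ss_t = (y ** 2).sum() - cf
ss_a = (y.sum(axis=(1, 2)) ** 2).sum() / (b * n) - cf
ss_b = (y.sum(axis=(0, 2)) ** 2).sum() / (a * n) - cf
ss_subtotal = (y.sum(axis=2) ** 2).sum() / n - cf   # from cell totals y_{ij.}
ss_ab = ss_subtotal - ss_a - ss_b
ss_e = ss_t - ss_a - ss_b - ss_ab
print(ss_t, ss_a + ss_b + ss_ab + ss_e)     # the two totals agree
```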

Worked Example

An engineer is studying methods to improve target detection on a radar scope. The factors are:

  • Ground Clutter (A): Low, Medium, High
  • Filter Type (B): Type-1, Type-2

Data Table

| Ground Clutter | Filter Type-1 | Filter Type-2 |
|---|---|---|
| Low | 90, 96, 108, 98 | 66, 84, 92, 94 |
| Medium | 102, 105, 106, 109 | 92, 98, 91, 95 |
| High | 114, 112, 108, 109 | 93, 91, 95, 83 |

Step-by-Step Calculations

Total Sum of Squares:

`SS_T = \sum y_{ijk}^2 - \dfrac{y_{...}^2}{abn} = 1985.33`

Sum of Squares for Ground Clutter (A):

`SS_A = \dfrac{1}{bn}\sum y_{i..}^2 - \dfrac{y_{...}^2}{abn} = 353.083`

Sum of Squares for Filter Type (B):

`SS_B = \dfrac{1}{an}\sum y_{.j.}^2 - \dfrac{y_{...}^2}{abn} = 937.5`

Sum of Squares for Interaction:

`SS_{AB} = SS_{\text{subtotal}} - SS_A - SS_B = 81.25`

Error Sum of Squares:

`SS_E = SS_T - SS_A - SS_B - SS_{AB} = 613.5`

ANOVA Summary Table

| Source of Variation | SS | df | MS | $F_0$ | Decision |
|---|---|---|---|---|---|
| Ground Clutter | 353.083 | 2 | 176.54 | 5.18 | Significant |
| Filter Type | 937.5 | 1 | 937.5 | 27.5 | Significant |
| Interaction | 81.25 | 2 | 40.62 | 1.19 | Not Significant |
| Error | 613.5 | 18 | 34.08 | | |
| Total | 1985.33 | 23 | | | |
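The decisions in the table can be checked against critical values of the $F$ distribution. A short scipy sketch using the $F_0$ values and degrees of freedom above (assuming $\alpha = 0.05$):

```python
from scipy import stats

alpha = 0.05
tests = {                      # F0, numerator df, denominator df
    "Ground Clutter": (5.18, 2, 18),
    "Filter Type": (27.5, 1, 18),
    "Interaction": (1.19, 2, 18),
}
for name, (f0, df1, df2) in tests.items():
    f_crit = stats.f.ppf(1 - alpha, df1, df2)
    p = stats.f.sf(f0, df1, df2)
    print(f"{name}: F0={f0}, Fcrit={f_crit:.2f}, p={p:.4f}")
# F_{0.05,2,18} ≈ 3.55 and F_{0.05,1,18} ≈ 4.41, so the two main effects
# exceed their critical values while the interaction does not.
```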

Interpretation of Results

  • Ground Clutter (p < 0.05): Significant effect on detection ability.
  • Filter Type (p < 0.05): Significant effect on detection ability.
  • Interaction (p > 0.05): No significant interaction between Ground Clutter and Filter Type.
