- ASSUMPTIONS
- Samples are independent and random
- Each group is normally distributed (check with the Shapiro-Wilk test)
- For the Shapiro-Wilk test,
- H0: Data is normal
- Ha: Data is not normal
- Favourable: P > 0.05, because we do not want to reject the NULL hypothesis
- Homogeneity of variances (check with Levene's test)
- For Levene's test,
- H0: Sigma1^2 = Sigma2^2 = Sigma3^2 = … (all group variances are equal)
- Ha: At least one group variance is different
- Favourable: P > 0.05
- NOTE: The ANOVA model is robust to violation of the normality assumption, provided the homogeneity of variances condition holds
- TukeyHSD Test (post hoc, run after ANOVA rejects H0): for each pair of levels of the factor, H0: the two level means are equal
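The two assumption checks above can be sketched in Python. This is only an illustration, assuming SciPy is available (the notes do not name a library); the data is the drinks example used later in these notes, and `check_assumptions` is a hypothetical helper name.

```python
from scipy import stats

# Productive hours per drink, taken from the worked example in these notes.
groups = {
    "Juice": [10, 12, 15, 8, 5],
    "Coffee": [15, 20, 21, 20, 19],
    "Tea": [18, 16, 15, 14, 17],
}
ALPHA = 0.05  # significance level used throughout the notes


def check_assumptions(groups):
    """Return Shapiro-Wilk p-values per group and the Levene p-value."""
    # Shapiro-Wilk is run on each group separately (H0: the group is normal).
    shapiro_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}
    # Levene's test compares all groups at once (H0: equal variances).
    levene_p = stats.levene(*groups.values()).pvalue
    return shapiro_p, levene_p


shapiro_p, levene_p = check_assumptions(groups)
for name, p in shapiro_p.items():
    verdict = "do not reject normality" if p > ALPHA else "reject normality"
    print(f"Shapiro-Wilk, {name}: p = {p:.3f} -> {verdict}")

verdict = "do not reject equal variances" if levene_p > ALPHA else "reject equal variances"
print(f"Levene: p = {levene_p:.3f} -> {verdict}")
```

With only five observations per group these tests have little power, so a P value above 0.05 should be read as "no evidence against the assumption" rather than proof that it holds.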
- One-way ANOVA is when only one categorical variable is considered
- Total Sum of Squares (SST) = Sum of Squares Among Groups (SSA) + Sum of Squares Within Groups (SSW)
- EXAMPLE - Consider the productive hours recorded for employees served different drinks (Juice, Coffee, Tea)
- n = 15 observations, c = 3 groups (levels)
- Grand mean Xbar = (10 + 12 + 15 + 8 + 5 + 15 + 20 + 21 + 20 + 19 + 18 + 16 + 15 + 14 + 17)/15 = 225/15 = 15
- Juice Xbar = (10 + 12 + 15 + 8 + 5)/5 = 50/5 = 10
- Coffee Xbar = (15 + 20 + 21 + 20 + 19)/5 = 95/5 = 19
- Tea Xbar = (18 + 16 + 15 + 14 + 17)/5 = 80/5 = 16
- SST = 300 = (10 - 15)^2 + (12 - 15)^2 + (15 - 15)^2 + (8 - 15)^2 + (5 - 15)^2 + (15 - 15)^2 + (20 - 15)^2 + (21 - 15)^2 + (20 - 15)^2 + (19 - 15)^2 + (18 - 15)^2 + (16 - 15)^2 + (15 - 15)^2 + (14 - 15)^2 + (17 - 15)^2
- SSA = 210 = (10 - 15)^2 * 5 + (19 - 15)^2 * 5 + (16 - 15)^2 * 5
- SSW = 90 = (10 - 10)^2 + (12 - 10)^2 + (15 - 10)^2 + (8 - 10)^2 + (5 - 10)^2 + (15 - 19)^2 + (20 - 19)^2 + (21 - 19)^2 + (20 - 19)^2 + (19 - 19)^2 + (18 - 16)^2 + (16 - 16)^2 + (15 - 16)^2 + (14 - 16)^2 + (17 - 16)^2
- Mean Squares = Sum of Squares / Degrees of Freedom
- MSA = SSA/(c - 1) = 210/(3 - 1) = 105
- MSW = SSW/(n - c) = 90/(15 - 3) = 90/12 = 7.5
- F-Statistic = MSA/MSW = 105/7.5 = 14
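The hand calculation above can be reproduced with a few lines of plain Python. This is a minimal sketch rather than a reference implementation; `one_way_anova` is a hypothetical helper name, and the data is the drinks example from these notes.

```python
# Productive hours per drink, copied from the worked example.
groups = {
    "Juice": [10, 12, 15, 8, 5],
    "Coffee": [15, 20, 21, 20, 19],
    "Tea": [18, 16, 15, 14, 17],
}


def one_way_anova(groups):
    """Return sums of squares, mean squares and the F-statistic."""
    all_values = [v for values in groups.values() for v in values]
    n = len(all_values)  # total number of observations
    c = len(groups)      # number of groups (levels)
    grand_mean = sum(all_values) / n
    group_means = {name: sum(values) / len(values) for name, values in groups.items()}

    # Total variation: every observation against the grand mean.
    sst = sum((v - grand_mean) ** 2 for v in all_values)
    # Among-group variation: each group mean against the grand mean,
    # weighted by the group size.
    ssa = sum(
        len(values) * (group_means[name] - grand_mean) ** 2
        for name, values in groups.items()
    )
    # Within-group variation: every observation against its own group mean.
    ssw = sum(
        (v - group_means[name]) ** 2
        for name, values in groups.items()
        for v in values
    )

    msa = ssa / (c - 1)
    msw = ssw / (n - c)
    return {"SST": sst, "SSA": ssa, "SSW": ssw, "MSA": msa, "MSW": msw, "F": msa / msw}


result = one_way_anova(groups)
for name, value in result.items():
    print(f"{name} = {value:g}")
```

The result also confirms the decomposition SST = SSA + SSW for this data.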
- NOTE: The F-Statistic by itself will not help. One should compare it with the F-Critical value for the same degrees of freedom (c - 1 and n - c) at the chosen significance level to decide whether the NULL hypothesis should be rejected
- NOTE: All the formulas discussed above work with a balanced dataset, i.e. an equal number of observations under each level
- F-Statistic > F-Critical => Reject H0
- F-Statistic <= F-Critical => Do not reject H0
- Here F-Statistic = 14 exceeds F-Critical of about 3.89 (alpha = 0.05, degrees of freedom 2 and 12), so the NULL hypothesis is rejected, i.e. at least one level mean differs from the others
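The decision rule above can be checked numerically. The sketch below assumes SciPy is available (the notes do not name a library) and alpha = 0.05; it uses `scipy.stats.f.ppf` for the critical value and `scipy.stats.f_oneway` as a cross-check of the hand-computed F-Statistic.

```python
from scipy import stats

# Productive hours per drink (Juice, Coffee, Tea) from the worked example.
groups = [
    [10, 12, 15, 8, 5],
    [15, 20, 21, 20, 19],
    [18, 16, 15, 14, 17],
]
alpha = 0.05                      # assumed significance level
n = sum(len(g) for g in groups)   # total observations
c = len(groups)                   # number of groups

# F-statistic and p-value computed by SciPy's one-way ANOVA.
f_stat, p_value = stats.f_oneway(*groups)

# Critical value: the (1 - alpha) quantile of the F distribution
# with (c - 1, n - c) degrees of freedom.
f_crit = stats.f.ppf(1 - alpha, c - 1, n - c)

reject_h0 = f_stat > f_crit
print(f"F-Statistic = {f_stat:.2f}, F-Critical = {f_crit:.2f}, p = {p_value:.5f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Comparing the p-value with alpha gives the same decision as comparing the F-Statistic with F-Critical.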