One-way ANOVA

  • ASSUMPTIONS
    • Samples are independent and random
    • Each group is normally distributed (checked with the Shapiro-Wilk test)
      • For the Shapiro-Wilk test:
        • H0: Data is normal
        • Ha: Data is not normal
        • Favourable: p > 0.05, because we do not want to reject the null hypothesis
    • Homogeneity of variances (checked with Levene's test)
      • For Levene's test:
        • H0: Sigma1 = Sigma2 = Sigma3… (all group variances are equal)
        • Ha: At least one is different
        • Favourable: p > 0.05
    • NOTE: The ANOVA model is fairly robust to violations of the normality assumption, provided the homogeneity of variances condition holds
    • Tukey HSD test (a post-hoc test, run after ANOVA rejects H0): for each pair of levels of the factor, H0: the two level means are equal
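The two assumption checks above can be sketched in Python. This is a minimal illustration, assuming SciPy is installed and borrowing the drink data from the example later in these notes; the variable names (`groups`, `ALPHA`, `shapiro_p`, `levene_p`) are made up for this sketch. With only 5 observations per group, both tests have little power, so a large p-value is weak evidence at best.

```python
from scipy import stats

# Productive hours per drink, taken from the worked example in these notes.
groups = {
    "Juice": [10, 12, 15, 8, 5],
    "Coffee": [15, 20, 21, 20, 19],
    "Tea": [18, 16, 15, 14, 17],
}
ALPHA = 0.05  # significance level used throughout these notes

# Shapiro-Wilk, one test per group. H0: the group is normally distributed.
shapiro_p = {name: stats.shapiro(values).pvalue for name, values in groups.items()}

# Levene's test across all groups. H0: all group variances are equal.
levene_p = stats.levene(*groups.values()).pvalue

for name, p in shapiro_p.items():
    verdict = "do not reject normality" if p > ALPHA else "reject normality"
    print(f"Shapiro-Wilk, {name}: p = {p:.3f} -> {verdict}")

verdict = "do not reject equal variances" if levene_p > ALPHA else "reject equal variances"
print(f"Levene: p = {levene_p:.3f} -> {verdict}")
```

A p-value above 0.05 in every row is the "favourable" outcome described above: neither assumption is contradicted by the data.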
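  • One-way ANOVA is used when a single categorical variable (factor) is considered and the means of a numeric response are compared across its levels
    • H0: The means of all levels are equal
    • Ha: At least one level mean is different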
  • Total Sum of Squares (SST) = Sum of Squares Among Groups (SSA) + Sum of Squares Within Groups (SSW)
Where c is the number of levels and n is the total number of observations. For example, if Promotions is a categorical variable with 3 levels (High, Low, Medium), then c = 3
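For reference, the decomposition can be written out in full. The symbols here are assumptions introduced for this sketch, since the notes only name c and n: X_ij is the i-th observation in level j, n_j is the number of observations in level j, \bar{X}_j is the mean of level j, and \bar{\bar{X}} is the grand mean of all n observations.

```latex
\begin{aligned}
SST &= \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{\bar{X}} \right)^2 \\
SSA &= \sum_{j=1}^{c} n_j \left( \bar{X}_j - \bar{\bar{X}} \right)^2 \\
SSW &= \sum_{j=1}^{c} \sum_{i=1}^{n_j} \left( X_{ij} - \bar{X}_j \right)^2 \\
SST &= SSA + SSW \\
MSA &= \frac{SSA}{c - 1}, \qquad MSW = \frac{SSW}{n - c}, \qquad F = \frac{MSA}{MSW}
\end{aligned}
```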
  • EXAMPLE - Consider the productive hours recorded for employees served different drinks (Juice, Coffee, Tea; 5 employees each)
    • n = 15, c = 3
    • Grand mean Xbar = (10 + 12 + 15 + 8 + 5 + 15 + 20 + 21 + 20 + 19 + 18 + 16 + 15 + 14 + 17)/15 = 225/15 = 15
    • Juice Xbar = (10 + 12 + 15 + 8 + 5)/5 = 50/5 = 10
    • Coffee Xbar = (15 + 20 + 21 + 20 + 19)/5 = 95/5 = 19
    • Tea Xbar = (18 + 16 + 15 + 14 + 17)/5 = 80/5 = 16
    • SST = 300
      • (10 - 15)^2 + (12 - 15)^2 + (15 - 15)^2 + (8 - 15)^2 + (5 - 15)^2 + (15 - 15)^2 + (20 - 15)^2 + (21 - 15)^2 + (20 - 15)^2 + (19 - 15)^2 + (18 - 15)^2 + (16 - 15)^2 + (15 - 15)^2 + (14 - 15)^2 + (17 - 15)^2
    • SSA = 210
      • 5 * (10 - 15)^2 + 5 * (19 - 15)^2 + 5 * (16 - 15)^2
    • SSW = 90
      • (10 - 10)^2 + (12 - 10)^2 + (15 - 10)^2 + (8 - 10)^2 + (5 - 10)^2 + (15 - 19)^2 + (20 - 19)^2 + (21 - 19)^2 + (20 - 19)^2 + (19 - 19)^2 + (18 - 16)^2 + (16 - 16)^2 + (15 - 16)^2 + (14 - 16)^2 + (17 - 16)^2
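    • MST = Mean Square Total = SST/DegreesOfFreedom = 300/(n - 1) = 300/14 ≈ 21.43 (each mean square is a sum of squares divided by its degrees of freedom)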
    • MSA = SSA/DegreesOfFreedom = 210/(3-1) = 105
    • MSW = SSW/DegreesOfFreedom = 90/(n - c) = 90/12 = 7.5
    • F-statistic = MSA/MSW = 105/7.5 = 14
    • NOTE: The F-statistic by itself is not enough. Compare it with the F-critical value for the same degrees of freedom (c - 1 = 2 and n - c = 12) at the chosen significance level to decide whether the null hypothesis should be rejected
    • NOTE: The calculation above uses a balanced dataset, i.e. an equal number of observations under each level
    • F-statistic > F-critical => Reject H0
    • F-statistic <= F-critical => Do not reject H0
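The hand calculation above can be reproduced with a short script. This is a sketch using only the Python standard library and the observations listed in the example; the variable names are chosen here for illustration.

```python
from statistics import mean

# Productive hours per drink, as listed in the example above.
groups = {
    "Juice": [10, 12, 15, 8, 5],
    "Coffee": [15, 20, 21, 20, 19],
    "Tea": [18, 16, 15, 14, 17],
}

observations = [x for values in groups.values() for x in values]
n = len(observations)  # total number of observations (15)
c = len(groups)        # number of levels (3)
grand_mean = mean(observations)

# Total variation: every observation against the grand mean.
sst = sum((x - grand_mean) ** 2 for x in observations)
# Among-group variation: each group mean against the grand mean, weighted by group size.
ssa = sum(len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values())
# Within-group variation: every observation against its own group mean.
ssw = sum((x - mean(values)) ** 2 for values in groups.values() for x in values)

msa = ssa / (c - 1)  # mean square among groups
msw = ssw / (n - c)  # mean square within groups
f_statistic = msa / msw

print(f"SST = {sst}, SSA = {ssa}, SSW = {ssw}")
print(f"MSA = {msa}, MSW = {msw}, F = {f_statistic}")
```

The script gives SST = 300, SSA = 210, SSW = 90, MSA = 105, MSW = 7.5 and F = 14, matching the figures worked out by hand. Because SSA weights each group by its own size, the same code also handles groups of unequal size.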
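CONCLUSION:
  • The null hypothesis is rejected: the F-statistic of 14 exceeds the F-critical value of about 3.89 (significance level 0.05, degrees of freedom 2 and 12), so at least one level mean differs from the others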
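The decision can be cross-checked with SciPy. This is a minimal sketch, assuming SciPy is installed and a significance level of 0.05 (the threshold used for the assumption tests in these notes); `f_oneway` computes the F-statistic and its p-value, and `f.ppf` gives the F-critical value.

```python
from scipy import stats

# Productive hours per drink, as listed in the example.
juice = [10, 12, 15, 8, 5]
coffee = [15, 20, 21, 20, 19]
tea = [18, 16, 15, 14, 17]

ALPHA = 0.05                              # assumed significance level
n = len(juice) + len(coffee) + len(tea)   # total observations
c = 3                                     # number of levels

# One-way ANOVA: returns the F-statistic and the p-value.
result = stats.f_oneway(juice, coffee, tea)

# F-critical: the (1 - ALPHA) quantile of the F distribution
# with c - 1 and n - c degrees of freedom.
f_critical = stats.f.ppf(1 - ALPHA, c - 1, n - c)

reject_h0 = result.statistic > f_critical
print(f"F = {result.statistic:.2f}, F-critical = {f_critical:.2f}, p = {result.pvalue:.5f}")
print("Reject H0" if reject_h0 else "Do not reject H0")
```

Comparing the F-statistic with F-critical and comparing the p-value with 0.05 are equivalent rules, so both lead to the same decision.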
