- The Chi-Square test is used to test the independence of two categorical variables
- NULL Hypothesis: The two categorical variables are independent
- Alternative Hypothesis: The two categorical variables are not independent
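- The Chi-Square Test Statistic is calculated as χ² = Σ (O - E)² / E, summed over every cell of the contingency table, where O is the observed frequency of a cell and E is its expected frequency under independence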
- The degrees of freedom = (r - 1)(c - 1), where r = number of rows, c = number of columns of the contingency table of categorical variables
- If the Test Statistic is greater than the Chi-Square Critical Value for the given degrees of freedom and significance level, then reject the NULL hypothesis
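- EXERCISE: A survey with 395 observations records Gender vs Educational Level. The observed counts across the four education levels (High School, Bachelors and two further levels) are Female: 60, 54, 46, 41 and Male: 40, 44, 53, 57
- Are gender and education level dependent at the 5% significance level?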
- SOLUTION:
- H0: Gender is independent of Education Level
- Ha: Gender is not independent of Education Level
- Calculating Expected frequency for level combination: Female vs High School
- (Row Total * Column Total)/(Sample Size) = 100 * 201/395 = 50.886
- Calculating Expected frequency for level combination: Female vs Bachelors
- (Row Total * Column Total)/(Sample Size) = 98 * 201/395 = 49.868
- Repeating this for the other level combinations gives the expected frequencies Female: 50.886, 49.868, 50.377, 49.868 and Male: 49.114, 48.132, 48.623, 48.132
- Using the Chi-Square Test Statistic formula: (60 - 50.886)²/50.886 + (54 - 49.868)²/49.868 + (46 - 50.377)²/50.377 + (41 - 49.868)²/49.868 + (40 - 49.114)²/49.114 + (44 - 48.132)²/48.132 + (53 - 48.623)²/48.623 + (57 - 48.132)²/48.132 = 8.006
- With degrees of freedom = (4 - 1)(2 - 1) = 3, the Chi-Square Critical Value at the 5% significance level is 7.815
- The Chi-Square Test Statistic = 8.006 is greater than the Chi-Square Critical Value = 7.815, hence we reject the NULL hypothesis and conclude that gender and education level are dependent at the 5% significance level
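
The hand calculation above can be checked with a short script. This is a minimal sketch in plain Python, not a definitive implementation. The observed counts are taken from the terms of the test-statistic sum. The table layout (education levels as rows, gender as columns), the function name `chi_square_independence`, and the constant name are assumptions made for illustration. The names of the third and fourth education levels are not given in the text, so they are left unnamed. The critical value 7.815 is the figure quoted in the solution rather than something the script computes.

```python
# Observed counts recovered from the test-statistic sum in the solution.
# Assumed layout: rows are education levels, columns are (Female, Male).
observed = [
    [60, 40],  # High School
    [54, 44],  # Bachelors
    [46, 53],  # third education level (name not given in the text)
    [41, 57],  # fourth education level (name not given in the text)
]

# Critical value for 3 degrees of freedom at 5% significance, as quoted above.
CRITICAL_VALUE_5PCT_DF3 = 7.815


def chi_square_independence(table):
    """Return (statistic, degrees_of_freedom, expected) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected frequency of a cell = (row total * column total) / sample size.
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )

    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


statistic, dof, expected = chi_square_independence(observed)
reject_h0 = statistic > CRITICAL_VALUE_5PCT_DF3

print(f"chi-square = {statistic:.3f}, dof = {dof}, reject H0: {reject_h0}")
```

Run on this table, the script should reproduce the statistic of about 8.006 with 3 degrees of freedom. Transposing the table (gender as rows) leaves both the statistic and the degrees of freedom unchanged.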