Salary is hypothesized to depend on educational qualification and occupation. To understand the dependency, the salaries of 40 individuals [SalaryData.csv] are collected, and each person’s educational qualification and occupation are noted. Educational qualification has three levels: High school graduate, Bachelor, and Doctorate. Occupation has four levels: Administrative and clerical, Sales, Professional or specialty, and Executive or managerial. The number of observations differs across the education-occupation combinations.
[Assume that the data follows a normal distribution. In reality, the normality assumption may not always hold if the sample size is small.]
State the null and the alternate hypothesis for conducting one-way ANOVA for both Education and Occupation individually.
Perform a one-way ANOVA on Salary with respect to Education. State whether the null hypothesis is rejected or not rejected based on the ANOVA results.
Perform a one-way ANOVA on Salary with respect to Occupation. State whether the null hypothesis is rejected or not rejected based on the ANOVA results.
If the null hypothesis is rejected in either of the two one-way ANOVAs above, find out which class means are significantly different. Interpret the result.
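To make the one-way ANOVA questions concrete, here is a minimal sketch of how the F statistic is built from between-group and within-group variation. The function name `one_way_anova_f` and the salary figures are hypothetical and are not taken from SalaryData.csv; the group sizes are deliberately unequal, as in the case study.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of {level: [values]}."""
    all_values = [v for values in groups.values() for v in values]
    grand_mean = mean(all_values)

    # Between-group sum of squares: how far each group mean is from the grand mean,
    # weighted by the group size.
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
    )
    # Within-group sum of squares: spread of observations around their own group mean.
    ss_within = sum(
        (v - mean(values)) ** 2 for values in groups.values() for v in values
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up salaries (in thousands) per education level; NOT real data.
toy_salaries = {
    "High school graduate": [50, 55, 60],
    "Bachelor": [90, 100, 110],
    "Doctorate": [150, 160, 170, 180],
}

f_stat, df_between, df_within = one_way_anova_f(toy_salaries)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}")
```

A large F relative to the F distribution with these degrees of freedom leads to rejecting the null hypothesis of equal means. In practice the p-value would usually come from a library routine such as `scipy.stats.f_oneway`, and a post-hoc comparison such as Tukey's HSD (for example `pairwise_tukeyhsd` in statsmodels) would identify which class means differ.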
Problem 1B:
What is the interaction between two treatments? Analyze the effects of one variable on the other (Education and Occupation) with the help of an interaction plot. [hint: use the ‘pointplot’ function from the ‘seaborn’ library]
Perform a two-way ANOVA based on Salary with respect to both Education and Occupation (along with their interaction Education*Occupation). State the null and alternative hypotheses and state your results. How will you interpret this result?
Explain the business implications of performing ANOVA for this particular case study.
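For the interaction question in Problem 1B, an interaction plot draws the mean Salary of each Education-Occupation cell, and non-parallel lines suggest an interaction. The sketch below computes those cell means by hand and checks whether the gap between two education levels stays the same across two occupations. The rows and the helper name `cell_means` are hypothetical and are not taken from SalaryData.csv.

```python
from collections import defaultdict
from statistics import mean

# Made-up (education, occupation, salary in thousands) rows; NOT real data.
rows = [
    ("Bachelor", "Sales", 100),
    ("Bachelor", "Sales", 110),
    ("Bachelor", "Executive or managerial", 120),
    ("Doctorate", "Sales", 130),
    ("Doctorate", "Executive or managerial", 200),
    ("Doctorate", "Executive or managerial", 220),
]


def cell_means(rows):
    """Mean salary for every (education, occupation) combination present."""
    cells = defaultdict(list)
    for education, occupation, salary in rows:
        cells[(education, occupation)].append(salary)
    return {cell: mean(values) for cell, values in cells.items()}


means = cell_means(rows)

# Gap between Doctorate and Bachelor within each occupation.
gap_sales = means[("Doctorate", "Sales")] - means[("Bachelor", "Sales")]
gap_exec = (
    means[("Doctorate", "Executive or managerial")]
    - means[("Bachelor", "Executive or managerial")]
)

# If the two gaps were equal, the lines in the interaction plot would be parallel.
interaction_contrast = gap_exec - gap_sales
print(gap_sales, gap_exec, interaction_contrast)
```

Here the education gap is much larger for one occupation than the other, which is the pattern an interaction term in a two-way ANOVA is meant to test. Whether such a difference is statistically significant in the real data has to be judged from the two-way ANOVA table, not from the cell means alone.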