L1022 Project A1 2020-21
Maximum project length: 3,000 words
The relationship between workers’ wages, age, gender, education and firm size
Worker-level wages are potentially affected by a number of factors, including the worker’s age, gender and education and the size of the employing firm.
The objective of this project is to establish whether and to what extent workers’ age, education, gender and size of the employing firm are determinants of worker-level wages in a cross-sectional dataset of workers.
The data below cover a sample of 60 workers and record each worker’s wage (Wage), age (Age), education (Edu), the number of workers in the employing firm (Size), which is a commonly used measure of firm size, and a dummy variable indicating whether the worker is male (Gender=0) or female (Gender=1). Wages are measured in pounds per hour worked, while Age and Edu are measured in years.
- Describe the data, using summary statistics and graphs, as appropriate.
- Calculate the pairwise correlation coefficients between Wage and each of the other variables. Test the statistical significance of each correlation coefficient.
- Consider the two variables Gender and Size. Compute the pairwise correlation of the two variables and test the significance of the correlation coefficient. Now consider the two values of Gender while grouping the Size variable into 3 intervals (3 intervals of 20 observations each) and construct a contingency table. Using the contingency table, perform a test for the presence of association between the two variables. Compare and discuss the results from the contingency table analysis with the results from the correlation analysis.
- Consider the two variables Age and Edu and test the null hypothesis that the two variables have equal variance.
- Estimate a regression model of the form:
$$\text{Wage}_i = \alpha + \beta_1 \text{Age}_i + \beta_2 \text{Edu}_i + \beta_3 \text{Gender}_i + \beta_4 \text{Size}_i + u_i$$
where the i subscript corresponds to worker i. Interpret the coefficients that you obtain, and comment on their economic and statistical significance.
- Interpret the $R^2$ statistic from the regression and test whether it is statistically significant.
- Re-estimate the model adding the variable Age to the power two (Age^2) and comment on any changes to the results and goodness of fit:
$$\text{Wage}_i = \alpha + \beta_1 \text{Age}_i + \beta_2 \text{Age}_i^2 + \beta_3 \text{Edu}_i + \beta_4 \text{Gender}_i + \beta_5 \text{Size}_i + u_i$$
- Estimate a (partial) log-version of the regression model of the form:
$$\log(\text{Wage}_i) = \alpha + \beta_1 \text{Age}_i + \beta_2 \text{Age}_i^2 + \beta_3 \text{Edu}_i + \beta_4 \text{Gender}_i + \beta_5 \log(\text{Size}_i) + u_i$$
Note that this version of the model is the one typically used in applied labour economics analyses. Interpret the coefficients that you obtain, and comment on their economic and statistical significance. Compare this model with the one estimated in point 7 (the model that adds Age squared).
- What conclusions do you draw from your analysis?
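The test statistics behind the tasks above can be summarised as follows. These are the standard textbook forms, stated here as assumptions because the brief itself does not prescribe particular formulas. Here n = 60 is the sample size, k is the number of slope coefficients, O and E are observed and expected cell counts indexed by gender group g and size group s, and s² denotes a sample variance. The variance-ratio test additionally assumes two independent normal samples, which is worth discussing since Age and Edu are observed on the same workers.

```latex
% Significance of a sample correlation r under H0: rho = 0
\[ t = \frac{r\sqrt{n-2}}{\sqrt{1-r^{2}}} \sim t_{n-2} \]

% Association in the 2 x 3 contingency table of Gender by Size group
\[ \chi^{2} = \sum_{g}\sum_{s} \frac{(O_{gs}-E_{gs})^{2}}{E_{gs}},
   \qquad E_{gs} = \frac{(\text{row total}_g)(\text{column total}_s)}{n},
   \qquad \text{df} = (2-1)(3-1) = 2 \]

% Equality of the variances of Age and Edu
\[ F = \frac{s_{\text{Age}}^{2}}{s_{\text{Edu}}^{2}} \sim F_{n-1,\,n-1} \]

% Overall significance of a regression with k slope coefficients
\[ F = \frac{R^{2}/k}{(1-R^{2})/(n-k-1)} \sim F_{k,\,n-k-1} \]

% Quadratic in Age: the fitted wage profile turns at
\[ \text{Age}^{*} = -\frac{\beta_1}{2\beta_2} \]

% Log-wage model: a one-unit change in a level regressor x_j changes the wage
% by approximately 100*beta_j percent; for the Gender dummy the exact effect is
\[ 100\,\bigl(e^{\beta_4}-1\bigr)\ \text{percent} \]
% and beta_5, the coefficient on log(Size), is the elasticity of wage with respect to firm size.
```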
Copy and paste the data into Excel and conduct all the analysis in Excel.
| Worker_id | Wage | Age | Edu | Gender | Size |
| --- | --- | --- | --- | --- | --- |
| 1 | 23 | 53 | 11 | 0 | 5 |
| 2 | 18 | 49 | 7 | 0 | 30 |
| 3 | 30 | 49 | 14 | 0 | 7 |
| 4 | 23 | 40 | 11 | 0 | 46 |
| 5 | 19 | 32 | 10 | 0 | 8 |
| 6 | 34 | 47 | 14 | 0 | 24 |
| 7 | 33 | 61 | 19 | 1 | 4 |
| 8 | 21 | 35 | 15 | 0 | 23 |
| 9 | 25 | 51 | 11 | 0 | 3 |
| 10 | 29 | 50 | 14 | 0 | 31 |
| 11 | 21 | 37 | 13 | 0 | 38 |
| 12 | 34 | 57 | 14 | 0 | 42 |
| 13 | 11 | 23 | 11 | 0 | 11 |
| 14 | 16 | 50 | 10 | 0 | 36 |
| 15 | 19 | 44 | 10 | 1 | 13 |
| 16 | 18 | 40 | 12 | 0 | 6 |
| 17 | 16 | 34 | 12 | 0 | 9 |
| 18 | 25 | 49 | 11 | 0 | 9 |
| 19 | 26 | 54 | 13 | 1 | 43 |
| 20 | 25 | 56 | 12 | 0 | 5 |
| 21 | 21 | 45 | 10 | 1 | 13 |
| 22 | 29 | 53 | 13 | 0 | 88 |
| 23 | 20 | 44 | 10 | 1 | 24 |
| 24 | 23 | 41 | 10 | 0 | 10 |
| 25 | 19 | 33 | 9 | 0 | 134 |
| 26 | 32 | 45 | 14 | 0 | 9 |
| 27 | 29 | 44 | 12 | 1 | 27 |
| 28 | 11 | 34 | 11 | 1 | 10 |
| 29 | 28 | 41 | 13 | 0 | 17 |
| 30 | 29 | 45 | 14 | 0 | 7 |
| 31 | 33 | 48 | 16 | 0 | 15 |
| 32 | 25 | 55 | 13 | 0 | 10 |
| 33 | 26 | 47 | 8 | 0 | 16 |
| 34 | 23 | 39 | 8 | 1 | 37 |
| 35 | 28 | 42 | 14 | 0 | 21 |
| 36 | 27 | 46 | 10 | 0 | 28 |
| 37 | 26 | 54 | 11 | 0 | 23 |
| 38 | 19 | 40 | 10 | 0 | 22 |
| 39 | 17 | 42 | 12 | 1 | 43 |
| 40 | 27 | 45 | 15 | 0 | 14 |
| 41 | 31 | 47 | 14 | 0 | 45 |
| 42 | 34 | 45 | 15 | 0 | 127 |
| 43 | 25 | 45 | 13 | 1 | 16 |
| 44 | 22 | 45 | 11 | 0 | 56 |
| 45 | 34 | 53 | 15 | 1 | 115 |
| 46 | 30 | 42 | 14 | 1 | 91 |
| 47 | 28 | 57 | 15 | 0 | 17 |
| 48 | 33 | 51 | 12 | 0 | 73 |
| 49 | 13 | 23 | 11 | 0 | 5 |
| 50 | 29 | 43 | 14 | 0 | 80 |
| 51 | 36 | 57 | 10 | 0 | 6 |
| 52 | 26 | 46 | 12 | 0 | 57 |
| 53 | 31 | 63 | 15 | 0 | 13 |
| 54 | 38 | 52 | 11 | 0 | 91 |
| 55 | 27 | 44 | 17 | 0 | 40 |
| 56 | 29 | 44 | 13 | 0 | 14 |
| 57 | 35 | 62 | 15 | 0 | 76 |
| 58 | 28 | 32 | 16 | 0 | 19 |
| 59 | 28 | 44 | 13 | 0 | 27 |
| 60 | 33 | 58 | 16 | 1 | 22 |
where:
Worker_id = Worker identifier
Wage = Wage of the worker in pounds per hour worked
Age = Age of the worker in years
Edu = Number of years of education of the worker
Gender = Dummy variable indicating whether the worker is male (Gender=0) or female (Gender=1)
Size = Number of workers in the firm employing the worker
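
The project asks for the analysis to be carried out in Excel. The Python sketch below is only an optional cross-check of two of the steps: the correlations of Wage with each other variable (with their t statistics) and the first regression model. It uses the standard library only. The helper names `pearson_r`, `corr_t_stat`, `solve` and `ols` are hypothetical, introduced here for illustration, and the columns are transcribed from the table above in Worker_id order.

```python
import math

# Columns transcribed from the data table above (60 workers, Worker_id order).
WAGE = [23, 18, 30, 23, 19, 34, 33, 21, 25, 29, 21, 34, 11, 16, 19, 18, 16, 25, 26, 25,
        21, 29, 20, 23, 19, 32, 29, 11, 28, 29, 33, 25, 26, 23, 28, 27, 26, 19, 17, 27,
        31, 34, 25, 22, 34, 30, 28, 33, 13, 29, 36, 26, 31, 38, 27, 29, 35, 28, 28, 33]
AGE = [53, 49, 49, 40, 32, 47, 61, 35, 51, 50, 37, 57, 23, 50, 44, 40, 34, 49, 54, 56,
       45, 53, 44, 41, 33, 45, 44, 34, 41, 45, 48, 55, 47, 39, 42, 46, 54, 40, 42, 45,
       47, 45, 45, 45, 53, 42, 57, 51, 23, 43, 57, 46, 63, 52, 44, 44, 62, 32, 44, 58]
EDU = [11, 7, 14, 11, 10, 14, 19, 15, 11, 14, 13, 14, 11, 10, 10, 12, 12, 11, 13, 12,
       10, 13, 10, 10, 9, 14, 12, 11, 13, 14, 16, 13, 8, 8, 14, 10, 11, 10, 12, 15,
       14, 15, 13, 11, 15, 14, 15, 12, 11, 14, 10, 12, 15, 11, 17, 13, 15, 16, 13, 16]
GENDER = [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0,
          1, 0, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0,
          0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
SIZE = [5, 30, 7, 46, 8, 24, 4, 23, 3, 31, 38, 42, 11, 36, 13, 6, 9, 9, 43, 5,
        13, 88, 24, 10, 134, 9, 27, 10, 17, 7, 15, 10, 16, 37, 21, 28, 23, 22, 43, 14,
        45, 127, 16, 56, 115, 91, 17, 73, 5, 80, 6, 57, 13, 91, 40, 14, 76, 19, 27, 22]


def pearson_r(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def corr_t_stat(r, n):
    """t statistic for H0: rho = 0, to be compared with a t distribution on n - 2 df."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols(y, columns):
    """OLS with an intercept via the normal equations; returns (beta, residuals, R^2)."""
    X = [[1.0] + [float(col[i]) for col in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(row[p] * row[q] for row in X) for q in range(k)] for p in range(k)]
    xty = [sum(row[p] * yi for row, yi in zip(X, y)) for p in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]
    ybar = sum(y) / len(y)
    r2 = 1 - sum(e * e for e in resid) / sum((yi - ybar) ** 2 for yi in y)
    return beta, resid, r2


n = len(WAGE)
for name, col in [("Age", AGE), ("Edu", EDU), ("Gender", GENDER), ("Size", SIZE)]:
    r = pearson_r(WAGE, col)
    print(f"corr(Wage, {name}) = {r:.3f}, t = {corr_t_stat(r, n):.2f} on {n - 2} df")

# Wage_i = alpha + b1*Age_i + b2*Edu_i + b3*Gender_i + b4*Size_i + u_i
beta, resid, r2 = ols(WAGE, [AGE, EDU, GENDER, SIZE])
k = 4  # number of slope coefficients
f_stat = (r2 / k) / ((1 - r2) / (n - k - 1))
for label, b in zip(["alpha", "Age", "Edu", "Gender", "Size"], beta):
    print(f"{label:>6}: {b:.4f}")
print(f"R^2 = {r2:.4f}, overall F = {f_stat:.2f} on ({k}, {n - k - 1}) df")
```

The printed coefficients and R² should agree with the Excel regression output up to rounding. Standard errors and p-values are deliberately left out of this sketch and still need to come from Excel or from statistical tables.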