Module: ADVANCED QUANTITATIVE METHODS FOR MANAGERS AND DECISION MAKING
3rd Written Assignment (WA3)
Assignment guidelines
The assignment should be well structured in a managerial style and easy to read.
Briefly explain what you do in each subject.
Avoid repeating theory or listing basic formulas.
Define the quantities in your calculations.
Interpret results not in “dry statistical language” but in terms of what they mean for the specific problem (in context).
Try to provide answers, not just statistical calculations.
The assignment is submitted as a business report in a Word document. You should also
submit an Excel (or other statistical package) file with your calculations.
PART I (Subjects 1 to 3)
For this part you will use the same data set (250 observations) as in your first and second written
assignments. Variables are indicated in italics.
Subject 1 (15%)
Investigate whether Job-type is a factor that affects Credit card debt.
i. Select the appropriate test of hypothesis and state the null and alternative hypotheses. (5%)
ii. Test the null hypothesis at the 95% confidence level and state your conclusions. (10%)
Subject 2 (20%)
A two-way ANOVA in SPSS (Excel output would include the same information), regarding the effect of
factors Age and Marital status on Household Income produced the following results:
Descriptive Statistics
Dependent Variable: Household income in thousands
Age category | Marital status | Mean | Std. Deviation | N |
18-24 | Unmarried | 21.8667 | 5.48852 | 15 |
18-24 | Married | 22.1429 | 6.57334 | 14 |
18-24 | Total | 22.0000 | 5.92814 | 29 |
25-34 | Unmarried | 37.5000 | 12.38330 | 14 |
25-34 | Married | 33.5217 | 18.49784 | 23 |
25-34 | Total | 35.0270 | 16.38001 | 37 |
35-49 | Unmarried | 54.1875 | 27.92437 | 32 |
35-49 | Married | 71.4167 | 46.89708 | 36 |
35-49 | Total | 63.3088 | 39.80897 | 68 |
50-64 | Unmarried | 82.9310 | 57.43501 | 29 |
50-64 | Married | 110.5000 | 91.84548 | 28 |
50-64 | Total | 96.4737 | 76.87585 | 57 |
>65 | Unmarried | 57.5926 | 59.67106 | 27 |
>65 | Married | 33.1563 | 24.61328 | 32 |
>65 | Total | 44.3390 | 45.50506 | 59 |
Tests of Between-Subjects Effects
Dependent Variable: Household income in thousands
Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
Model | 1003214.089a | 10 | 100321.409 | 44.330 | 0.000 |
agecat | 151441.306 | 4 | 37860.326 | 16.730 | 0.000 |
marital | 616.066 | 1 | 616.066 | 0.272 | 0.602 |
agecat * marital | 23338.370 | 4 | 5834.593 | 2.578 | 0.038 |
Error | 543137.911 | 240 | 2263.075 | | |
Total | 1546352.000 | 250 | | | |
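As a reading aid for the table above, the following minimal sketch recomputes each mean square as sum of squares divided by degrees of freedom, and each F ratio as mean square divided by the error mean square. The numbers are transcribed from the table; the names `rows`, `mse` and `recompute` are illustrative only and are not part of the SPSS output.

```python
# Rows transcribed from the "Tests of Between-Subjects Effects" table:
# source -> (Type III sum of squares, df, reported mean square, reported F)
rows = {
    "agecat": (151441.306, 4, 37860.326, 16.730),
    "marital": (616.066, 1, 616.066, 0.272),
    "agecat * marital": (23338.370, 4, 5834.593, 2.578),
}
error_ss, error_df = 543137.911, 240

# Error mean square: error sum of squares divided by its degrees of freedom.
mse = error_ss / error_df


def recompute(ss, df):
    """Return (mean square, F ratio) recomputed from a sum of squares and its df."""
    ms = ss / df
    return ms, ms / mse


for source, (ss, df, ms_reported, f_reported) in rows.items():
    ms, f_stat = recompute(ss, df)
    print(f"{source:18s} MS={ms:10.3f} (reported {ms_reported})  "
          f"F={f_stat:6.3f} (reported {f_reported})")
```

The significance values in the Sig. column come from the F distribution with the effect's df and 240 error df, which this sketch does not evaluate.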
i. Interpret the result of the analysis of variance and state your conclusions in context. (10%)
ii. Explain the interaction effect by plotting the relevant data, and comment on the significance of the interaction effect, providing an explanation in business terms. (10%)
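One possible starting point for the interaction plot is the gap between the married and unmarried cell means within each age category. The sketch below uses the means from the Descriptive Statistics table; the names `cell_means` and `married_gap` are illustrative. If the two lines in an interaction plot were parallel, this gap would be roughly constant across age categories.

```python
# Cell means (household income, in thousands) transcribed from the
# Descriptive Statistics table: age category -> (unmarried, married)
cell_means = {
    "18-24": (21.8667, 22.1429),
    "25-34": (37.5000, 33.5217),
    "35-49": (54.1875, 71.4167),
    "50-64": (82.9310, 110.5000),
    ">65": (57.5926, 33.1563),
}


def married_gap(means):
    """Married mean minus unmarried mean for each age category."""
    return {age: round(married - unmarried, 4)
            for age, (unmarried, married) in means.items()}


gaps = married_gap(cell_means)
for age, gap in gaps.items():
    print(f"{age:>6s}: married - unmarried = {gap:+.4f}")
```

The same dictionary can be passed to any charting tool (Excel, SPSS or a plotting library) with age category on the horizontal axis and one line per marital status.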
Subject 3 (10%)
Consider the six variables income, debtinc, creddebt, othdebt, age, and ed, which are all numerical
variables.
i. Compute (using the proper tools) the pairwise correlation coefficients between those variables and indicate which ones are significant at the 5% level.
ii. You are asked to choose a dependent variable and build a regression model selecting explanatory variables from the list of the variables above. Make sure that the explanatory variables you finally choose in the regression are significant.
iii. Justify the causality between the explanatory and dependent variables and explain how your model quantifies this association.
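For the significance check on a correlation coefficient, a minimal sketch of the usual test statistic is given here. It assumes the 250 observations stated for Part I and, because the Python standard library has no t distribution, it uses a normal quantile in place of the t quantile; that substitution is an approximation that is reasonable only for large samples. The function names and the value r = 0.20 are hypothetical.

```python
import math
from statistics import NormalDist


def correlation_t_stat(r, n):
    """t statistic for H0: rho = 0, given sample correlation r and n pairs."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


def approx_critical_r(n, alpha=0.05):
    """Approximate smallest |r| that is significant at level alpha (two-sided).

    Assumption: the normal quantile stands in for the t quantile with
    n - 2 degrees of freedom, which is close only when n is large.
    """
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / math.sqrt(z * z + n - 2)


n = 250  # number of observations stated for Part I
print(f"approximate critical |r| at 5%: {approx_critical_r(n):.4f}")
print(f"t statistic for a hypothetical r = 0.20: {correlation_t_stat(0.20, n):.3f}")
```

A statistical package reports the exact p-value for each pair, so the threshold here is only a quick cross-check on that output.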
PART II
The data set for subjects 4 and 5 is given in the file “WA#3 CarSales.xlsx”. The file contains data
regarding sales of different car models along with technical characteristics of the specific cars. The
description of the variables is given in the sheet Data.
Subject 4 (20%)
i. Create two scatter plots that show how “Resale Value” is associated with “Price” and with “Fuel Efficiency (mpg)”, and comment on the shape of the association in each case.
ii. Develop a simple linear regression model between Resale Value as the dependent variable and Price as the explanatory variable. Use the least squares method to estimate the regression coefficients. (Do not use mathematical formulas. Use available tools in Excel, SPSS or another statistical package.)
iii. State the regression equation, check the significance of the coefficients at the 5% level and give the interpretation of the regression coefficients b0 and b1 in context.
iv. Based on the regression model, what is the expected average Resale Value for a car priced at 30 thousand and for a car priced at 15 thousand?
v. Provide a range for the resale value of the two prices in (iv), with 95% confidence.
vi. Interpret the value of the R2 and the value of the Standard Error of the regression.
vii. Produce and examine the residual plot and the normal probability plot of residuals and indicate whether the assumptions of regression analysis hold true in this case.
viii. Add mileage (mpg) as a second explanatory variable in the regression model and run the regression. Compare your results with those of the model derived in (ii). If you had to choose one of the two models for prediction purposes, which one would you choose? Justify your choice both statistically and in business terms.
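To illustrate what a statistical package computes for the point prediction and the 95% range in this subject, here is a minimal sketch on made-up numbers. The price and resale figures are hypothetical and do not come from the CarSales file, the function names are illustrative, and a normal quantile is assumed in place of the t quantile, so the range is slightly too narrow for small samples. The submitted calculations should still come from Excel, SPSS or another package, as the assignment requires.

```python
import math
from statistics import NormalDist


def fit_line(x, y):
    """Least-squares intercept b0, slope b1 and residual standard error s."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    s = math.sqrt(sse / (n - 2))
    return b0, b1, s


def prediction_interval(x, y, x_new, level=0.95):
    """Approximate range for one new observation at x_new.

    Assumption: a normal quantile replaces the t quantile with n - 2
    degrees of freedom.
    """
    n = len(x)
    b0, b1, s = fit_line(x, y)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * s * math.sqrt(1 + 1 / n + (x_new - x_bar) ** 2 / sxx)
    centre = b0 + b1 * x_new
    return centre - half_width, centre, centre + half_width


# Hypothetical prices and resale values (thousands), NOT from the CarSales file.
price = [12.0, 15.5, 18.0, 22.5, 26.0, 30.0, 35.5, 41.0]
resale = [8.1, 10.2, 12.4, 14.9, 18.3, 20.1, 24.8, 27.9]

for p in (15, 30):
    low, mid, high = prediction_interval(price, resale, p)
    print(f"price {p}: expected resale {mid:.2f}, approx. 95% range {low:.2f} to {high:.2f}")
```

The term under the square root grows as the new price moves away from the average price, which is why ranges are wider for cars priced far from the centre of the data.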
Subject 5 (25%)
i. Find the correlation between Resale Value and all other numerical variables in the data
set. Comment on the rationality of the correlation coefficients.
ii. According to your results, which of the above correlation coefficients are not significant at the 5% level of significance?
iii. Develop a regression model with Resale Value as the dependent variable, using all variables that had a significant correlation coefficient in (ii) as explanatory variables.
iv. Do you observe any cases of explanatory variables that had a statistically significant correlation coefficient with Resale Value in (ii), but whose regression coefficient in (iii) is not significant? How can you explain that?
v. Rerun the regression in (iii) using only the explanatory variables that were statistically significant. Interpret the values of the regression coefficients, R square, and standard error of the regression.
vi. Give a numerical example of how you can use this model to predict the resale price of a car.
vii. Compare the regression model against the one in 4.viii. Which one would you choose for prediction purposes? Justify your choice both statistically and in business terms.
Subject 6 (10%)
A linear regression of variable Y against two explanatory variables X1 and X2 produced the
following estimation model:
Y = 160.976 – 1.732X1 – 2.526X2 + e
(40.298) (0.427) (1.382)
The numbers in parentheses are the standard errors of each coefficient.
i. Fill in the cells in the following regression output table
| Coefficients | Standard Error | t Stat | P-value | Lower 95% | Upper 95% |
Intercept | | | | | | |
X1 | | | | | | |
X2 | | | | | | |
ii. Which independent variables are statistically significant at the 5% and 10% levels?
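The columns of the output table are linked by simple relationships: the t statistic is the coefficient divided by its standard error, and the confidence bounds are the coefficient plus or minus a critical value times the standard error. The sketch below shows these relationships for the model above. The sample size is not stated in the problem, so the sketch assumes a standard normal distribution in place of the t distribution; with the true degrees of freedom the p-values would be somewhat larger and the bounds somewhat wider. The names `estimates` and `table_row` are illustrative.

```python
from statistics import NormalDist

# Coefficients and standard errors as given in the estimated model.
estimates = {
    "Intercept": (160.976, 40.298),
    "X1": (-1.732, 0.427),
    "X2": (-2.526, 1.382),
}


def table_row(coef, se, level=0.95):
    """Return (t stat, approximate p-value, lower bound, upper bound).

    Assumption: the sample size is not stated, so the standard normal
    distribution stands in for the t distribution.
    """
    normal = NormalDist()
    t_stat = coef / se
    p_value = 2 * (1 - normal.cdf(abs(t_stat)))  # two-sided
    z = normal.inv_cdf(0.5 + level / 2)
    return t_stat, p_value, coef - z * se, coef + z * se


for name, (coef, se) in estimates.items():
    t_stat, p_value, low, high = table_row(coef, se)
    print(f"{name:9s} t={t_stat:7.3f}  p~{p_value:.4f}  [{low:9.3f}, {high:9.3f}]")
```

A coefficient is significant at a given level when its p-value is below that level, or equivalently when the matching confidence interval excludes zero.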