The following analysis was obtained using data in MEAP93.RAW, which contains school-level pass rates (as a percent) on a 10th-grade math test.
(i) The variable expend is expenditures per student, in dollars, and math10 is the pass rate on the exam. The following simple regression relates math10 to lexpend = log(expend):
Interpret the coefficient on lexpend. In particular, if expend increases by 10%, what is the estimated percentage point change in math10? What do you make of the large negative intercept estimate? (The minimum value of lexpend is 8.11 and its average value is 8.37.)
(ii) Does the small R-squared in part (i) imply that spending is correlated with other factors affecting math10? Explain. Would you expect the R-squared to be much higher if expenditures were randomly assigned to schools (that is, independent of other school and student characteristics) rather than having the school districts determine spending?
(iii) When log of enrollment and the percent of students eligible for the federal free lunch program are included, the estimated equation becomes
Comment on what happens to the coefficient on lexpend. Is the spending coefficient still statistically different from zero?
(iv) What do you make of the R-squared in part (iii)? What are some other factors that could be used to explain math10 (at the school level)?
The obtained regression results are as follows:
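i) The coefficient on lexpend is 11.16. Because math10 is measured in percentage points and expend enters in logs, a 1% increase in expenditure per student is associated with an estimated increase in the pass rate of about 11.16/100 = 0.1116 percentage points.
Equivalently, a 10% increase in expenditure is associated with an estimated increase of about 1.116 percentage points (roughly 1.1 points).
The intercept, -69.34, is the predicted pass rate when lexpend = 0, that is, when expend = $1, not when spending is zero (log(0) is undefined). That value lies far outside the data, where the minimum of lexpend is 8.11, so the negative intercept has no meaningful interpretation. Within the sample range the predictions are sensible: about 21.2 at the minimum of lexpend and about 24.1 at its average of 8.37.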
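The arithmetic in part (i) can be checked with a short sketch. It uses only the intercept and slope quoted in the answer; the function names are illustrative, and the "exact" option simply replaces the usual 1% approximation with the log difference.

```python
import math

# Coefficients quoted in the answer to part (i).
INTERCEPT = -69.34
SLOPE = 11.16


def predicted_math10(lexpend):
    """Fitted pass rate, in percentage points, at a given value of log(expend)."""
    return INTERCEPT + SLOPE * lexpend


def change_in_math10(pct_increase, exact=False):
    """Estimated change in math10 when expend rises by pct_increase percent.

    exact=False uses the approximation slope * (pct / 100);
    exact=True uses slope * log(1 + pct / 100).
    """
    if exact:
        return SLOPE * math.log(1 + pct_increase / 100)
    return SLOPE * pct_increase / 100


approx_10 = change_in_math10(10)             # approximation used in the answer
exact_10 = change_in_math10(10, exact=True)  # slightly smaller than the approximation
at_min = predicted_math10(8.11)              # fitted value at the minimum of lexpend
at_mean = predicted_math10(8.37)             # fitted value at the average of lexpend

print(f"10% increase, approximate: {approx_10:.3f} points")
print(f"10% increase, exact:       {exact_10:.3f} points")
print(f"fitted math10 at min lexpend:  {at_min:.2f}")
print(f"fitted math10 at mean lexpend: {at_mean:.2f}")
```

The fitted values at the minimum and the mean of lexpend are both positive, which is why the negative intercept is not a practical concern.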
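ii) R-squared measures goodness of fit: the fraction of the sample variation in math10 that is explained by the regressors.
A small R-squared does not, by itself, imply that spending is correlated with other factors affecting math10. It says only that lexpend explains little of the variation in pass rates; the unexplained factors may or may not be correlated with spending.
Here a single variable is used to predict pass rates, and the many other determinants of math10 are left in the error term, which is why R-squared is low.
If expenditures were randomly assigned to schools, independent of other school and student characteristics, R-squared would not be expected to be much higher. Random assignment removes any correlation between spending and the omitted factors, but those factors would still account for most of the variation in math10.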
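A small simulation can illustrate the point about random assignment. This is only a sketch under assumed values: the spread of log spending (SD_X) and of the other factors (SD_U) are hypothetical and are not estimated from MEAP93. Spending is drawn independently of the error term, which mimics random assignment, and R-squared still comes out small.

```python
import random

random.seed(0)

N = 5000
BETA = 11.16   # slope from part (i), treated here as the true effect
SD_X = 0.15    # hypothetical standard deviation of log spending
SD_U = 10.0    # hypothetical standard deviation of all other factors

# Spending is generated independently of u, i.e. "randomly assigned".
x = [random.gauss(8.37, SD_X) for _ in range(N)]
u = [random.gauss(0.0, SD_U) for _ in range(N)]
y = [-69.34 + BETA * xi + ui for xi, ui in zip(x, u)]


def ols_simple(x, y):
    """Return (slope, r_squared) from a simple OLS regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)  # squared sample correlation
    return slope, r_squared


slope, r_squared = ols_simple(x, y)
print(f"estimated slope: {slope:.2f}")
print(f"R-squared:       {r_squared:.4f}")
```

Under these assumed values the population R-squared is BETA² · SD_X² / (BETA² · SD_X² + SD_U²), about 0.027, even though spending is unrelated to everything else by construction. A low R-squared therefore says nothing about whether the slope estimate is biased.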
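iii) The coefficient on lexpend falls from 11.16 to 7.75 once lenroll and lnchprg are included. Holding enrollment and the free lunch percentage fixed, a 1% increase in expenditure is associated with an estimated increase in the pass rate of about 7.75/100 = 0.0775 percentage points. Whether the coefficient is still statistically different from zero is judged from its t statistic, 7.75 divided by the standard error reported with the estimated equation; an absolute t statistic above roughly 1.96 indicates significance at the 5% level.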
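The comparison in part (iii) can be sketched as follows. The two coefficients are the ones quoted in the answer; the standard error is not reproduced in the text, so SE_PLACEHOLDER below is a purely hypothetical value that should be replaced with the figure reported under the estimated equation. The 1.96 cutoff assumes a large-sample two-sided test at the 5% level.

```python
def t_statistic(coef, std_err):
    """t statistic for the null hypothesis that the coefficient is zero."""
    if std_err <= 0:
        raise ValueError("standard error must be positive")
    return coef / std_err


def significant_at_5pct(coef, std_err, critical=1.96):
    """Two-sided test at the 5% level using a large-sample critical value."""
    return abs(t_statistic(coef, std_err)) > critical


COEF_SIMPLE = 11.16    # lexpend coefficient in part (i)
COEF_MULTIPLE = 7.75   # lexpend coefficient in part (iii)

# Relative fall in the spending coefficient after adding the controls.
pct_drop = 100 * (COEF_SIMPLE - COEF_MULTIPLE) / COEF_SIMPLE

# Hypothetical standard error, NOT taken from the source; substitute the
# value printed beneath the lexpend coefficient in the estimated equation.
SE_PLACEHOLDER = 3.0

print(f"coefficient falls by {pct_drop:.1f}%")
print(f"t statistic (placeholder s.e.): {t_statistic(COEF_MULTIPLE, SE_PLACEHOLDER):.2f}")
print(f"significant at 5%: {significant_at_5pct(COEF_MULTIPLE, SE_PLACEHOLDER)}")
```

The fall of roughly 30% in the coefficient is what one would expect if the added controls are correlated with spending and also affect pass rates.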
iv) The R-squared of the new regression is 0.1893, so about 18.93% of the sample variation in math10 is explained by lexpend, lenroll, and lnchprg. Adding lenroll and lnchprg has raised R-squared relative to part (i), but more than 80% of the variation in pass rates remains unexplained.
Other factors that could help explain math10 at the school level include the attendance rate, average marks on in-class tests, and the average number of hours students spend studying at home.