**Percentage Variance**

**Instructions:**

The Stata output below is from a dataset that includes transfer students in the state of Ohio. Researchers were interested in the factors that predict transfer students’ GPA (transfer GPA is on a scale from 0 to 4.0). Researchers hypothesized that four variables predict or explain the variance in GPA: gender, age, number of credits earned prior to transfer, and participation in remedial coursework prior to transfer. To begin the analysis, researchers entered three independent variables into a regression analysis and the Stata output below shows the results of these three variables on the outcome variable transfer GPA: gender (“Gender”), age (“CalcAge”), and number of transfer credits earned prior to transfer (“NtcEarn”). Answer the following set of questions using the Stata output below.

- What percentage of the variance in the outcome (i.e., variance in transfer GPA) is explained by the overall model?

- Identify which independent variables are statistically significant in the model and interpret the coefficients. Assume that “2.Gender” means females and the coefficient is interpreted relative to males. Also assume that “CalcAge” variable is a continuous variable and each unit is one year, and “NtcEarn” is a continuous variable.

Next researchers added a fourth variable to the regression model: remediation participation prior to transfer (“RemPreTran”); the Stata output below displays the results of this regression model with the four independent variables. Assume that “1.RemPreTran” means participation in remediation prior to transfer and the coefficient is interpreted relative to students who did not participate in remediation. Use this output to answer the questions below.

- What percentage of the variance in the outcome (i.e., variance in transfer GPA) is explained by the overall model?

- What percentage of the variance in the outcome (i.e., variance in transfer GPA) is explained by remediation participation prior to transfer?

- Identify which independent variables are statistically significant in the model and interpret the coefficients.

**Solution**

- What percentage of the variance in the outcome (i.e., variance in transfer GPA) is explained by the overall model?

The share of variance in the outcome explained by the overall model is given by the model's R-squared. In the given model, R-squared equals 0.0121, i.e., 1.21% of the variance in transfer GPA is explained by the overall model.
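As a minimal sketch of what R-squared measures, the snippet below computes it as one minus the ratio of residual to total sum of squares and converts the reported value to a percentage. The function names `r_squared` and `as_percent` are illustrative assumptions, not Stata commands; only the value 0.0121 comes from the output discussed above.

```python
def r_squared(y, y_hat):
    """Share of outcome variance explained: 1 - SS_residual / SS_total."""
    mean_y = sum(y) / len(y)
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    ss_resid = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))
    return 1 - ss_resid / ss_total


def as_percent(r2):
    """Express an R-squared value as a percentage, rounded to two decimals."""
    return round(r2 * 100, 2)


# R-squared of the three-variable model, as reported in the solution text.
MODEL_1_R2 = 0.0121
print(as_percent(MODEL_1_R2))  # 1.21
```

A model whose predictions equal the outcome mean for every student would give an R-squared of 0, and a model that predicts every GPA exactly would give 1.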

- Identify which independent variables are statistically significant in the model and interpret the coefficients. Assume that “2.Gender” means females and the coefficient is interpreted relative to males. Also assume that “CalcAge” variable is a continuous variable and each unit is one year, and “NtcEarn” is a continuous variable.

At a 0.05 significance level, “Gender” and “CalcAge” are the only statistically significant variables; “NtcEarn” is not significant. “2.Gender” has a positive coefficient, which implies that, holding the other variables constant, females have a higher predicted transfer GPA than males. “CalcAge” has a small negative coefficient, which implies that younger students tend to have a higher GPA than older students: for each one-year increase in the transfer student’s age, predicted GPA decreases by 0.007 points, holding the other variables constant.
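To make the age coefficient concrete, here is a small sketch of how a linear coefficient translates into a predicted GPA difference. The helper `predicted_gpa_difference` is a hypothetical name for illustration; the coefficient of -0.007 is the one quoted in the answer above, and the other predictors are assumed to be held constant.

```python
# Age coefficient as quoted in the solution text (GPA points per year of age).
AGE_COEF = -0.007


def predicted_gpa_difference(age_gap_years, coef=AGE_COEF):
    """Predicted GPA difference between two otherwise identical students
    whose ages differ by age_gap_years (linear model, others held constant)."""
    return coef * age_gap_years


print(round(predicted_gpa_difference(1), 3))   # -0.007
print(round(predicted_gpa_difference(10), 3))  # -0.07
```

Under these assumptions, a student ten years older than an otherwise identical peer is predicted to have a GPA about 0.07 points lower, which shows how small the effect is in practical terms even though it is statistically significant.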

Next researchers added a fourth variable to the regression model: remediation participation prior to transfer (“RemPreTran”); the Stata output below displays the results of this regression model with the four independent variables. Assume that “1.RemPreTran” means participation in remediation prior to transfer and the coefficient is interpreted relative to students who did not participate in remediation. Use this output to answer the questions below.

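- What percentage of the variance in the outcome (i.e., variance in transfer GPA) is explained by the overall model?

Using the R-squared of the second model (0.0842), 8.42% of the variance in the outcome is explained by the overall model.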

- What percentage of the variance in the outcome (i.e., variance in transfer GPA) is explained by remediation participation prior to transfer?

The increase in R-squared from the first model to the second is the additional variance explained once “RemPreTran” is added: 8.42% - 1.21% = 7.21%. If we assume that “RemPreTran” is uncorrelated with “Gender”, “CalcAge” and “NtcEarn”, this increase can be attributed entirely to “RemPreTran”. Therefore, 7.21% of the variance in the outcome is explained by remediation participation prior to transfer.
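The subtraction behind this answer can be sketched as follows. The function name `incremental_r2_percent` is an illustrative assumption; the two R-squared values are the ones reported for the three-variable and four-variable models.

```python
# R-squared values reported for the two nested models.
R2_MODEL_1 = 0.0121  # Gender, CalcAge, NtcEarn
R2_MODEL_2 = 0.0842  # Gender, CalcAge, NtcEarn, RemPreTran


def incremental_r2_percent(r2_reduced, r2_full):
    """Percentage of outcome variance added by the extra predictor(s)
    when moving from the reduced model to the full model."""
    return round((r2_full - r2_reduced) * 100, 2)


print(incremental_r2_percent(R2_MODEL_1, R2_MODEL_2))  # 7.21
```

This difference is only meaningful because the first model is nested in the second, that is, the second model contains every predictor of the first plus “RemPreTran”.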

- Identify which independent variables are statistically significant in the model and interpret the coefficients.

Using a 0.05 significance level, “Gender”, “CalcAge” and “RemPreTran” are the statistically significant variables in the model. The interpretations of “Gender” and “CalcAge” carry over from the first model. “1.RemPreTran” has a negative coefficient, which indicates that, holding the other variables constant, students who participated in remedial coursework prior to transfer have a lower predicted transfer GPA than students who did not.