
Multiple Linear Regression Equations Homework Help
Multiple Regression Equations
In this homework you will examine the regression assumptions for a multiple regression equation fitted to the dataset posted to Moodle for this week. Use the notes and other resources posted in Moodle for this week to support your investigation. Report all relevant tables and charts to justify your work, and be sure they are formatted to class expectations.
Response variable: The response variable, also known as the dependent variable, is the variable whose value depends on the independent (explanatory) variables. For the purpose of this study, the selected response variable is total expenditure, because it depends on all the other variables. A change in the number of admissions will increase or reduce total expenditure. Similarly, total expenditure depends on the number of births and the number of personnel.
Descriptive statistics Estimates
Table 1: Descriptive statistics

| Statistic | Admission | Births | Payroll Exp | Personnel | Tot Exp |
|---|---|---|---|---|---|
| Mean | 6831.84 | 874.05 | 30500.89 | 861.5 | 67139.81 |
| Median | 4777 | 480 | 20739.5 | 589.5 | 43364.5 |
| Variance | 44171146 | 1131385 | 1.07E+09 | 675021.6 | 4.95E+09 |
| Std deviation | 6646.138 | 1063.67 | 32715.84 | 821.597 | 70386.44 |
| Skewness | 1.6109 | 1.5898 | 2.2307 | 1.7909 | 2.0073 |
| Kurtosis | 3.0956 | 2.5195 | 6.0754 | 3.1891 | 4.5156 |
| Minimum | 111 | 0 | 1053 | 50 | 2082 |
| Maximum | 37375 | 5691 | 188865 | 4087 | 367706 |
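A normal, bell-shaped distribution has a kurtosis of exactly 3. A distribution with kurtosis less than 3 is called platykurtic: compared with a normal distribution, its tails are shorter and thinner, and its central peak is often lower and broader. In the descriptive table above, Births (2.5195) is the only variable with kurtosis below 3. Admission (3.0956) and Personnel (3.1891) have the kurtosis values closest to 3, so on this measure their distributions are the closest to the normal benchmark, whereas payroll expense (6.0754) is the furthest from it. All five variables are also positively skewed (skewness between 1.59 and 2.23), so none of the distributions is symmetric.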
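The skewness and kurtosis figures discussed above can be reproduced from raw data with a few lines of Python. This is a minimal sketch under stated assumptions: it uses the moment-based (Pearson) definition, for which a normal distribution scores 3, because the text benchmarks against 3. The source does not say which definition Table 1 uses, and the function names and the sample list are hypothetical, not part of the dataset.

```python
def moments_summary(values):
    """Return mean, sample variance, skewness and kurtosis of a list of numbers."""
    n = len(values)
    mean = sum(values) / n
    # Central moments, dividing by n (population form).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    kurtosis = m4 / m2 ** 2  # equals 3 for a normal distribution
    return {
        "mean": mean,
        "sample_variance": m2 * n / (n - 1),
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": kurtosis,
        "excess_kurtosis": kurtosis - 3,
    }


def classify_kurtosis(kurtosis, tol=0.0):
    """Label a Pearson kurtosis value relative to the normal benchmark of 3.

    `tol` is a hypothetical tolerance band around 3; it is not from the source.
    """
    if kurtosis < 3 - tol:
        return "platykurtic"
    if kurtosis > 3 + tol:
        return "leptokurtic"
    return "mesokurtic"


# Made-up sample, for illustration only.
sample = [1, 2, 3, 4, 5]
summary = moments_summary(sample)
print(summary)
print(classify_kurtosis(2.5195))  # Births value from Table 1
print(classify_kurtosis(6.0754))  # Payroll Exp value from Table 1
```

Spreadsheet and statistics packages often report excess kurtosis (kurtosis minus 3) or apply a small-sample correction, so values from this sketch may differ slightly from a package's output on the same data.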
Correlation matrix
Table 2: Correlation with response variable

| | Admissions | Births | Payroll Exp. | Personnel | Tot. Exp. |
|---|---|---|---|---|---|
| Admissions | 1 | | | | |
| Births | 0.855624 | 1 | | | |
| Payroll Exp. | 0.848209 | 0.659576079 | 1 | | |
| Personnel | 0.879457 | 0.697463374 | 0.95187 | 1 | |
| Tot. Exp. | 0.90249 | 0.713219085 | 0.982541 | 0.964709 | 1 |
From the correlation table above, payroll expense has the highest correlation with the response variable (r = 0.9825). This indicates a strong positive correlation between total expenditure and payroll expense.
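Each entry in the correlation table is a Pearson correlation coefficient. As a minimal sketch of how such a coefficient is computed, the function below implements the textbook formula in plain Python. The function name and the two short lists are hypothetical illustrations, not values from the dataset.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Made-up numbers, for illustration only.
payroll = [10, 20, 30, 40, 50]
total = [25, 41, 62, 79, 101]
print(round(pearson_r(payroll, total), 4))
```

A value near +1, such as the 0.9825 reported for payroll expense and total expenditure, means the points lie close to an upward-sloping straight line.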
Scatter plot between Total expenditure and payroll expense.
Evidence of Multicollinearity
Table 3: Correlation matrix of independent variables

| | Admissions | Births | Payroll Exp. | Personnel |
|---|---|---|---|---|
| Admissions | 1 | | | |
| Births | 0.855624455 | 1 | | |
| Payroll Exp. | 0.848209291 | 0.659576 | 1 | |
| Personnel | 0.879456785 | 0.697463 | 0.951870085 | 1 |
Multicollinearity is a phenomenon in which one predictor variable in a multiple regression model can be linearly predicted from the others with a substantial degree of accuracy. From the table above, there is evidence of multicollinearity between payroll expense and personnel (r = 0.9519). Similarly, there is evidence of multicollinearity between Admissions and Births (r = 0.8556), Admissions and payroll expense (r = 0.8482), and Admissions and personnel (r = 0.8795): the correlation coefficient for each of these pairs is greater than 0.80.
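The screening rule used above (flag any pair of predictors whose correlation exceeds 0.80) can be sketched in a few lines. The correlations are copied from Table 3, rounded to six decimals. The function name and the dictionary layout are illustrative choices, and the 0.80 cutoff is the one used in the text rather than a universal standard.

```python
# Pairwise correlations between predictors, copied from Table 3.
predictor_corr = {
    ("Admissions", "Births"): 0.855624,
    ("Admissions", "Payroll Exp."): 0.848209,
    ("Admissions", "Personnel"): 0.879457,
    ("Births", "Payroll Exp."): 0.659576,
    ("Births", "Personnel"): 0.697463,
    ("Payroll Exp.", "Personnel"): 0.951870,
}


def flag_collinear_pairs(corr, threshold=0.80):
    """Return predictor pairs with |r| above the threshold, strongest first."""
    flagged = [pair for pair, r in corr.items() if abs(r) > threshold]
    return sorted(flagged, key=lambda pair: abs(corr[pair]), reverse=True)


for pair in flag_collinear_pairs(predictor_corr):
    print(pair, predictor_corr[pair])
```

Pairwise correlations only detect collinearity between two predictors at a time. A variance inflation factor computed from a regression of each predictor on all the others is a common complementary check.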
Multiple Regression Assumptions
No multicollinearity: One of the key assumptions of multiple regression analysis is that the independent variables are not highly correlated with one another. Table 3 above, which shows the correlation matrix of all independent variables in this study, reveals high correlations between several pairs of independent variables, which means this assumption is violated.
Linearity: This assumption states that there must be a linear relationship between the response variable and each independent variable. The strong positive correlations between total expenditure and each predictor in Table 2 (0.71 to 0.98) are consistent with this, so the assumption is not violated.
Homoscedasticity: This assumption states that the variance of the error terms is the same across all values of the independent variables. It can be checked by plotting the standardized residuals against the predicted values and examining whether the points are spread evenly across the range of predicted values. This assumption is also not violated.
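A residuals-versus-predicted plot is judged by eye, but the same idea can be sketched numerically. The snippet below fits a one-predictor least-squares line, sorts the residuals by predicted value, and compares the mean squared residual in the upper half with that in the lower half. This is a crude spread ratio for illustration only, not the formal Goldfeld-Quandt or Breusch-Pagan test, and it is not the procedure used in the source. All names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares intercept and slope for one predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope


def residual_spread_ratio(x, y):
    """Mean squared residual in the upper half of fitted values divided by
    that in the lower half. Values near 1 suggest a roughly constant spread."""
    intercept, slope = fit_line(x, y)
    # Pair each fitted value with its residual, then order by fitted value.
    pairs = sorted((intercept + slope * a, b - (intercept + slope * a))
                   for a, b in zip(x, y))
    half = len(pairs) // 2
    low = [res for _, res in pairs[:half]]
    high = [res for _, res in pairs[-half:]]

    def mean_square(residuals):
        return sum(r * r for r in residuals) / len(residuals)

    return mean_square(high) / mean_square(low)


# Made-up data: constant noise versus noise that grows with x.
x = list(range(1, 9))
y_constant = [2 * i + (1 if i % 2 else -1) for i in x]
y_growing = [2 * i + (i if i % 2 else -i) for i in x]
print(residual_spread_ratio(x, y_constant))
print(residual_spread_ratio(x, y_growing))
```

A ratio far above 1 indicates that the residuals fan out as the predicted value grows, which is the pattern a residual plot would show under heteroscedasticity.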
Multiple regression equation
Table 4: Multiple regression model

| Coefficient | B (Std. Err.) | t-value | p-value |
|---|---|---|---|
| Intercept | 2607.73 (934.288) | 2.791 | 0.006 |
| Admission | 2.493 (0.285) | 8.748 | 1.02E-15 |
| Births | 1.781 (1.179) | 1.510 | 0.133 |
| Payroll Exp | 1.409 (0.063) | 22.261 | 1.92E-55 |
| Personnel | 13.101 (2.793) | 4.691 | 5.11E-06 |

| Model statistic | Value |
|---|---|
| R-Square | 0.9844 |
| Adjusted R-Square | 0.9841 |
| Multiple R | 0.9922 |
| F-value | 3076.655 |
| Pr(>F) | 6.5E-175 |
Table 4 above shows the multiple regression model relating the dependent variable to the independent variables. The regression model is significant (F(4, 195) = 3076.655, p-value = 6.5E-175): because the p-value of the model is less than the 0.05 level of significance, we conclude that the model is significant. The coefficient of determination, R-square, is the proportion of the variability in the response variable that is explained by the independent variables in the model. R-square was computed to be 0.984, which means that 98.4% of the variation in total expenditure can be accounted for by the independent variables. Furthermore, the tests of significance for the individual independent variables indicate that all variables are significant except Births, which has a p-value (0.133) greater than the 0.05 level of significance. Lastly, the multiple regression equation for this model can be written as

$$\text{Total Exp} = 2607.73 + 2.493\,\text{Admissions} - 1.781\,\text{Births} - 1.409\,\text{Payroll Exp} + 13.101\,\text{Personnel}$$
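The summary statistics in the regression table are linked by standard formulas, so they can be cross-checked against one another. The sketch below recomputes the adjusted R-square, the overall F statistic, and one t-value from the reported figures. It assumes n = 200 observations, inferred from the reported degrees of freedom (195 = n - k - 1 with k = 4 predictors), and it uses the coefficient magnitudes as printed in the table. The function names are illustrative.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R-square for n observations and k predictors."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


def f_statistic(r2, n, k):
    """Overall F statistic of the regression, computed from R-square."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def t_statistic(coefficient, std_err):
    """t-value of a single coefficient: estimate divided by its standard error."""
    return coefficient / std_err


n, k = 200, 4   # assumption: n inferred from F(4, 195)
r2 = 0.9844     # R-square as reported

print(round(adjusted_r2(r2, n, k), 4))    # compare with the reported 0.9841
print(round(f_statistic(r2, n, k), 1))    # compare with the reported 3076.655
print(round(t_statistic(13.101, 2.793), 3))  # Personnel, reported 4.691
```

Small differences from the reported values are expected because the inputs are rounded to three or four decimal places.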