## Comparing Two Regression Tests

Source | Sum of Squares | df | Mean Square | F | P-value |
---|---|---|---|---|---|
Regression | 41.0379 | 1 | 41.0379 | 72.2735 | .000 |
Residual | 9.6528 | 17 | 0.5678 | | |
Total | 50.6907 | 18 | | | |

The ANOVA table above gives strong evidence against the null hypothesis of no linear relationship, because the p-value is less than 0.05. We therefore conclude that there is a linear relationship between female mass and egg number for species 1.
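As a quick check, the F statistic in the ANOVA table can be recomputed from the sums of squares and degrees of freedom. The sketch below is illustrative only; the function name `anova_f` is an assumption and not part of the original analysis.

```python
def anova_f(ss_regression, df_regression, ss_residual, df_residual):
    """Return (mean square regression, mean square residual, F ratio)."""
    ms_regression = ss_regression / df_regression
    ms_residual = ss_residual / df_residual
    return ms_regression, ms_residual, ms_regression / ms_residual


# Values copied from the species 1 ANOVA table.
ms_reg, ms_res, f_stat = anova_f(41.0379, 1, 9.6528, 17)
r_squared = 41.0379 / 50.6907  # share of total variation explained

print(f"MS residual = {ms_res:.4f}")
print(f"F = {f_stat:.2f}")
print(f"R^2 = {r_squared:.3f}")
```

The recomputed F is about 72.27, in line with the table, and the ratio of the regression to the total sum of squares suggests that female mass explains roughly 81% of the variation in egg number.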

Term | Coefficient | Std. Error | t | P-value |
---|---|---|---|---|
(Constant) | 4.1292 | 1.4409 | 2.8656 | 0.0107 |
Female Mass | 0.2011 | 0.0237 | 8.5014 | .0000 |

The regression equation is Egg number = 4.1292 + 0.2011 × Female mass. The test shows that the variable (female mass) is significant at the 5% level of significance.
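The coefficient table for species 1 can be used both to recover the t statistics (coefficient divided by its standard error) and to produce fitted values. This is a minimal sketch; the helper names and the example mass of 280 are hypothetical and chosen only for illustration.

```python
# Coefficients and standard errors copied from the species 1 table.
INTERCEPT, INTERCEPT_SE = 4.1292, 1.4409
SLOPE, SLOPE_SE = 0.2011, 0.0237


def t_ratio(estimate, std_error):
    """t statistic for testing that a coefficient equals zero."""
    return estimate / std_error


def predict_egg_number(female_mass):
    """Fitted egg number for a given female mass (same units as the data)."""
    return INTERCEPT + SLOPE * female_mass


print(round(t_ratio(INTERCEPT, INTERCEPT_SE), 3))
print(round(t_ratio(SLOPE, SLOPE_SE), 3))
print(round(predict_egg_number(280.0), 2))  # 280 is a hypothetical mass
```

Because the table values are rounded, the recomputed slope t (about 8.49) differs slightly from the reported 8.5014.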

Null hypothesis: there is no linear relationship between female mass and egg number in species 2

Alternative hypothesis: there is a linear relationship between female mass and egg number in species 2

Linear regression test

Source | Sum of Squares | df | Mean Square | F | P-value |
---|---|---|---|---|---|
Regression | 105.1745 | 1 | 105.1745 | 154.1147 | .000 |
Residual | 30.7099 | 45 | 0.6824 | | |
Total | 135.8844 | 46 | | | |

The ANOVA table above gives strong evidence against the null hypothesis, because the p-value is less than 0.05. We therefore conclude that there is a linear relationship between female mass and egg number for species 2.

Term | Coefficient | Std. Error | t | P-value |
---|---|---|---|---|
(Constant) | 4.9654 | 0.9759 | 5.0880 | 0.000 |
Female Mass | 0.1875 | 0.0151 | 12.4143 | 0.000 |

The regression equation is Egg number = 4.9654 + 0.1875 × Female mass. The test shows that the variable (female mass) is significant at the 5% level of significance.

## Independent T-test

An independent-samples t-test is used to compare the predicted values between the two species (grouping variable: species 1 and 2). The result is displayed below:

Species | N | Mean | Std. Deviation | Std. Error Mean |
---|---|---|---|---|
1 | 19 | 61.0065954 | 6.94321843 | 1.59288355 |
2 | 47 | 63.8909508 | 7.11113481 | 1.03726562 |

The descriptive statistics show that species 1 has N = 19, mean = 61.01, SD = 6.94, while species 2 has N = 47, mean = 63.89, SD = 7.11.

F | t | df | Sig. (2-tailed) | Mean Difference |
---|---|---|---|---|
.280 | -1.502 | 64 | .138 | -2.88435533 |

The test gives insufficient evidence to reject the null hypothesis, because the p-value (.138) is greater than 0.05. We therefore find no significant difference between the group means, which is consistent with the mathematical relationship between female mass and egg number being the same in both species.
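The reported t statistic can be reproduced from the group statistics alone. The sketch below assumes the equal-variance (pooled) form of the test, which matches the 64 degrees of freedom in the table; the function name `pooled_t` is illustrative.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    std_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Group statistics copied from the table (species 1, then species 2).
t_stat, df = pooled_t(61.0065954, 6.94321843, 19,
                      63.8909508, 7.11113481, 47)
print(f"t = {t_stat:.3f}, df = {df}")
```

The result is about -1.50 on 64 degrees of freedom, in agreement with the table.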

## Nested ANOVA Test

Null hypothesis: Saint Polycarp General Hospital (Smyrna) requires fewer days post-op before release than Saint Genesius General Hospital (Rome)

Alternative hypothesis: Saint Genesius General Hospital (Rome) requires fewer days post-op before release than Saint Polycarp General Hospital (Smyrna)

The appropriate test for this question is the nested ANOVA, because we need to compare the means of both hospitals to determine the better choice for getting patients home sooner.

Statistic | Smyrna G.H. | Rome G.H. |
---|---|---|
Mean | 30.97143 | 28.37143 |
Variance | 4.057143 | 4.584679 |
Observations | 70 | 70 |
df | 69 | 69 |
F | 0.884935 | |
P(F<=f) one-tail | 0.306529 | |
F Critical one-tail | 0.671141 | |

The table above shows the test result. Smyrna General Hospital has a mean of 30.97 days and a variance of 4.057; Rome General Hospital has a mean of 28.37 days and a variance of 4.585. The test statistic is 7.399 and the critical value is 1.977, while the table reports F = 0.885 with a one-tailed p-value of 0.3065. Because this p-value is greater than 0.05, we do not have strong evidence to reject the null hypothesis that Saint Polycarp General Hospital (Smyrna) requires fewer days post-op before release than Saint Genesius General Hospital (Rome). The sample means, however, point the other way: the average stay is about 2.6 days shorter at Rome (28.37) than at Smyrna (30.97), so the descriptive statistics favor Rome rather than Smyrna for patients who want to get home sooner.
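The figures quoted above can be reproduced from the summary statistics alone. The sketch below is illustrative (plain Python; the variable names are assumptions): the ratio of the two sample variances gives the F value in the table, and a two-sample t statistic on the means gives a value close to the quoted 7.399.

```python
import math

# Summary statistics copied from the table above.
smyrna = {"mean": 30.97143, "var": 4.057143, "n": 70}
rome = {"mean": 28.37143, "var": 4.584679, "n": 70}

# Ratio of sample variances: the F value reported in the table.
f_ratio = smyrna["var"] / rome["var"]

# Two-sample t statistic for the difference in mean days (Smyrna - Rome).
std_error = math.sqrt(smyrna["var"] / smyrna["n"] + rome["var"] / rome["n"])
t_stat = (smyrna["mean"] - rome["mean"]) / std_error

print(f"F = {f_ratio:.6f}")
print(f"t = {t_stat:.3f}")
```

The F value compares the two variances, whereas the t statistic compares the mean number of days. A positive t here means the Smyrna mean is the larger of the two.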

**Test for Normality for Smyrna General Hospital**

Smyrna G.H. | Statistic | df | Sig. |
---|---|---|---|
Kolmogorov-Smirnov | .158 | 70 | .000 |
Shapiro-Wilk | .964 | 70 | .042 |

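Both the Kolmogorov-Smirnov test (p = .000) and the Shapiro-Wilk test (p = .042) are significant at the 5% level, so the null hypothesis of normality is rejected: the data for Smyrna General Hospital depart significantly from a normal distribution, and the normality assumption is not supported.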

##### Normality test for Rome General Hospital

## Linear Regression Test

Source | Sum of Squares | df | Mean Square | F | P-value |
---|---|---|---|---|---|
Regression | 1112490.230 | 1 | 1112490.23 | 14.766 | .002 |
Residual | 1130146.277 | 15 | 75343.085 | | |
Total | 2242636.507 | 16 | | | |

The ANOVA table above gives strong evidence against the null hypothesis, because the p-value (.002) is less than 0.05. We therefore conclude that there is a linear relationship between light and depth, and that the linear model provides a statistically significant fit to the two variables.

Term | Coefficient | Std. Error | t | P-value |
---|---|---|---|---|
(Constant) | 660.70 | 139.247 | 4.745 | .000 |
Depth | -52.218 | 13.589 | -3.843 | .002 |
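The two tables above are linked: in a regression with a single predictor, the overall F statistic equals the square of the slope's t statistic. The sketch below checks this, computes R square from the sums of squares, and evaluates the fitted line at a hypothetical depth of 5 (an assumed value used only for illustration).

```python
# Values copied from the ANOVA and coefficient tables above.
ss_regression, ss_total = 1112490.230, 2242636.507
f_stat = 14.766
intercept, slope, slope_se = 660.70, -52.218, 13.589


def predict_light(depth):
    """Fitted light value at a given depth (units as in the data)."""
    return intercept + slope * depth


r_squared = ss_regression / ss_total
t_slope = slope / slope_se

print(f"R^2 = {r_squared:.3f}")
print(f"t^2 = {t_slope**2:.2f} vs F = {f_stat}")
print(f"light at depth 5 = {predict_light(5):.2f}")  # depth 5 is hypothetical
```

On these figures depth accounts for roughly half of the variation in light, and the negative slope means the fitted light value falls by about 52 units for each additional unit of depth.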

## Three-way ANOVA Test

Null hypothesis: there is no interaction among the independent variables (elevation in feet, season and peaks)

Alternative hypothesis: there is an interaction among the independent variables (elevation in feet, season and peaks)

The most appropriate test for this question is the three-way ANOVA, because there are three variables of interest and we want to know the interaction effect among them. To determine whether there is a statistically significant three-way interaction, we consult the “Elevationfeet * Season * Peaks” row in the Tests of Between-Subjects Effects table shown below.

Source | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
---|---|---|---|---|---|---|
Corrected Model | 117142.996^{a} | 35 | 3346.943 | 1367.016 | .000 | .997 |
Intercept | 195261.735 | 1 | 195261.735 | 79752.179 | .000 | .998 |
Elevationfeet | 31883.951 | 2 | 15941.976 | 6511.298 | .000 | .989 |
Season | 3917.200 | 1 | 3917.200 | 1599.931 | .000 | .917 |
Peaks | 34467.387 | 5 | 6893.477 | 2815.553 | .000 | .990 |
Elevationfeet * Season | 644.600 | 2 | 322.300 | 131.639 | .000 | .646 |
Elevationfeet * Peaks | 5645.555 | 10 | 564.556 | 230.586 | .000 | .941 |
Season * Peaks | 34447.424 | 5 | 6889.485 | 2813.923 | .000 | .990 |
Elevationfeet * Season * Peaks | 6136.878 | 10 | 613.688 | 250.653 | .000 | .946 |
Error | 352.563 | 144 | 2.448 | | | |
Total | 312757.294 | 180 | | | | |
Corrected Total | 117495.560 | 179 | | | | |
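Each F value and partial eta squared in this table follows from the effect's sum of squares and the error term. The sketch below recomputes them; the dictionary layout and function name are illustrative assumptions, while the numbers are copied from the table.

```python
# Sums of squares and df copied from the Tests of Between-Subjects Effects table.
SS_ERROR, DF_ERROR = 352.563, 144

effects = {
    "Elevationfeet": (31883.951, 2),
    "Season": (3917.200, 1),
    "Peaks": (34467.387, 5),
    "Elevationfeet * Season": (644.600, 2),
    "Elevationfeet * Peaks": (5645.555, 10),
    "Season * Peaks": (34447.424, 5),
    "Elevationfeet * Season * Peaks": (6136.878, 10),
}


def f_and_partial_eta_squared(ss_effect, df_effect):
    """F ratio and partial eta squared for one effect."""
    f_ratio = (ss_effect / df_effect) / (SS_ERROR / DF_ERROR)
    partial_eta_sq = ss_effect / (ss_effect + SS_ERROR)
    return f_ratio, partial_eta_sq


for name, (ss, df) in effects.items():
    f_ratio, eta = f_and_partial_eta_squared(ss, df)
    print(f"{name:32s} F = {f_ratio:9.3f}  partial eta^2 = {eta:.3f}")
```

For the three-way interaction row this gives an F of about 250.65 and a partial eta squared of about 0.946, matching the reported values.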

(I) Elevation(feet) | (J) Elevation(feet) | Mean Difference (I-J) | Std. Error | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
---|---|---|---|---|---|---|
4000 | 6000 | -16.1303^{*} | .28568 | .000 | -16.6950 | -15.5657 |
4000 | 8000 | -32.6000^{*} | .28568 | .000 | -33.1647 | -32.0353 |
6000 | 4000 | 16.1303^{*} | .28568 | .000 | 15.5657 | 16.6950 |
6000 | 8000 | -16.4697^{*} | .28568 | .000 | -17.0343 | -15.9050 |
8000 | 4000 | 32.6000^{*} | .28568 | .000 | 32.0353 | 33.1647 |
8000 | 6000 | 16.4697^{*} | .28568 | .000 | 15.9050 | 17.0343 |

Dependent Variable: Weight. Post hoc method: LSD.

The post hoc test (LSD) is used to locate where the differences occur between groups. The table above compares the three elevation levels (4000, 6000 and 8000 feet) and shows a statistically significant difference between every pair, because all p-values are less than 0.05 at the 5% level of significance. In the Tests of Between-Subjects Effects table, all main effects and interactions of elevation, season and peaks are likewise significant.
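The standard error shared by every row of the post hoc table can be traced back to the ANOVA error term. The sketch below assumes a balanced design, so that the 180 observations split evenly into 60 per elevation level; that split is an inference from the table totals, not something stated in the output.

```python
import math

# From the ANOVA table: error sum of squares and error df.
ms_error = 352.563 / 144

# Assumption: a balanced design, so each of the 3 elevation levels
# holds 180 / 3 = 60 observations.
n_per_level = 180 // 3

# Standard error of the difference between two level means.
se_difference = math.sqrt(2 * ms_error / n_per_level)

# Half-width of the reported 95% interval for the 4000 vs 6000 comparison.
lower, upper = -16.6950, -15.5657
half_width = (upper - lower) / 2
implied_multiplier = half_width / se_difference

print(f"SE of difference = {se_difference:.5f}")
print(f"implied critical t = {implied_multiplier:.3f}")
```

Under that assumption the standard error comes out at about 0.2857, as in the table, and the interval half-width corresponds to a multiplier just under 2, as expected for a 95% interval with 144 error degrees of freedom.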

Null hypothesis: there is no linear relationship between A1C protein and age

Source | Sum of Squares | df | Mean Square | F | P-value |
---|---|---|---|---|---|
Regression | 0.874 | 1 | 0.874 | 77.149 | .000 |
Residual | 0.589 | 52 | 0.011 | | |
Total | 1.464 | 53 | | | |

Term | Coefficient | Std. Error | t | P-value |
---|---|---|---|---|
(Constant) | 4.465 | 0.094 | 47.709 | .000 |
Age | 0.015 | 0.002 | 8.783 | .000 |

The best regression equation is A1C = 4.465 + 0.015 × Age. The variable Age is significant at the 5% level of significance because its p-value is less than 0.05. The R square, which measures the proportion of variability in A1C concentration explained by the model, is 0.597, so Age explains about 60% of the variation in A1C concentration.
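The R square value and a fitted A1C value can both be derived from the tables above. This is a minimal sketch: the function name is illustrative and the age of 50 is a hypothetical input.

```python
# Values copied from the ANOVA and coefficient tables above.
SS_REGRESSION, SS_TOTAL = 0.874, 1.464
INTERCEPT, SLOPE = 4.465, 0.015


def predict_a1c(age):
    """Fitted A1C value for a given age (units as in the data)."""
    return INTERCEPT + SLOPE * age


r_squared = SS_REGRESSION / SS_TOTAL
print(f"R^2 = {r_squared:.3f}")
print(f"Predicted A1C at age 50 = {predict_a1c(50):.3f}")  # age 50 is hypothetical
```

The ratio of the regression to the total sum of squares reproduces the R square of about 0.597 quoted in the text.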

Residual Plot

Statistic | Minimum | Maximum | Mean | Std. Deviation | N |
---|---|---|---|---|---|
Predicted Value | 5.0910 | 5.4635 | 5.2772 | .12844 | 54 |
Residual | -.20898 | .20454 | .00000 | .10545 | 54 |
Std. Predicted Value | -1.450 | 1.450 | .000 | 1.000 | 54 |
Std. Residual | -1.963 | 1.921 | .000 | .991 | 54 |

Dependent Variable: A1C

The table above summarizes the residuals and predicted values of the dependent variable (minimum, maximum, mean and standard deviation). The residuals have a mean of zero, and all standardized residuals lie between -1.963 and 1.921.

## Step-wise Linear Regression

Null hypothesis: there is no association between the dependent variable (Exact protein) and the independent variables (Protein 1, Protein 2, Protein 3, and Protein 4)

Alternative hypothesis: there is an association between the dependent variable (Exact protein) and the independent variables (Protein 1, Protein 2, Protein 3, and Protein 4)

The appropriate test for this question is forward stepwise linear regression. It was chosen because we are interested in the final model among all the models produced by the regression analysis.

Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | Sig. F Change |
---|---|---|---|---|---|---|
1 | .897^{a} | .805 | .796 | 8.768 | .805 | .000 |
2 | .966^{b} | .933 | .927 | 5.251 | .128 | .000 |
3 | .981^{c} | .962 | .956 | 4.072 | .029 | .001 |

The table above shows the model summary. Model 1 has an adjusted R square of 0.796, model 2 has 0.927, and model 3 has 0.956 (with an R square of 0.962). Each step produces a statistically significant change in R square (Sig. F Change of .001 or less).

Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
---|---|---|---|---|---|---|
1 | Regression | 7285.977 | 1 | 7285.977 | 94.782 | .000^{b} |
1 | Residual | 1768.023 | 23 | 76.871 | | |
1 | Total | 9054.000 | 24 | | | |
2 | Regression | 8447.343 | 2 | 4223.671 | 153.168 | .000^{c} |
2 | Residual | 606.657 | 22 | 27.575 | | |
2 | Total | 9054.000 | 24 | | | |
3 | Regression | 8705.803 | 3 | 2901.934 | 175.018 | .000^{d} |
3 | Residual | 348.197 | 21 | 16.581 | | |
3 | Total | 9054.000 | 24 | | | |

a. Dependent Variable: Exact[Protein]

b. Predictors: (Constant), Method 3[Protein]

c. Predictors: (Constant), Method 3[Protein], Method 1[Protein]

d. Predictors: (Constant), Method 3[Protein], Method 1[Protein], Method 4[Protein]

The ANOVA table above shows that all three models are statistically significant: model 1 (Method 3 protein), model 2 (Method 3 and Method 1 protein), and model 3 (Method 3, Method 1 and Method 4 protein).
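The model summary and the ANOVA table are two views of the same sums of squares. The sketch below rebuilds R square, adjusted R square and F for each model; it assumes 25 observations, inferred from the total of 24 degrees of freedom, and the function name is illustrative.

```python
# Regression sums of squares and predictor counts from the ANOVA table.
SS_TOTAL, N = 9054.000, 25  # total df of 24 implies 25 observations
models = {1: (7285.977, 1), 2: (8447.343, 2), 3: (8705.803, 3)}


def fit_summary(ss_regression, n_predictors):
    """Return R^2, adjusted R^2 and the overall F statistic for one model."""
    df_residual = N - n_predictors - 1
    ss_residual = SS_TOTAL - ss_regression
    r2 = ss_regression / SS_TOTAL
    adj_r2 = 1 - (1 - r2) * (N - 1) / df_residual
    f_stat = (ss_regression / n_predictors) / (ss_residual / df_residual)
    return r2, adj_r2, f_stat


for model, (ss_reg, k) in models.items():
    r2, adj_r2, f_stat = fit_summary(ss_reg, k)
    print(f"Model {model}: R^2 = {r2:.3f}, adjusted R^2 = {adj_r2:.3f}, F = {f_stat:.3f}")
```

Adjusted R square penalizes each added predictor, so it is the more suitable figure for comparing models with different numbers of predictors.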

Model | Term | B (Unstandardized) | Std. Error | Beta (Standardized) | t | Sig. |
---|---|---|---|---|---|---|
1 | (Constant) | -106.133 | 20.447 | | -5.191 | .000 |
1 | Method 3[Protein] | 1.968 | .202 | .897 | 9.736 | .000 |
2 | (Constant) | -127.596 | 12.685 | | -10.059 | .000 |
2 | Method 3[Protein] | 1.823 | .123 | .831 | 14.814 | .000 |
2 | Method 1[Protein] | .348 | .054 | .364 | 6.490 | .000 |
3 | (Constant) | -124.200 | 9.874 | | -12.578 | .000 |
3 | Method 3[Protein] | 1.357 | .152 | .619 | 8.937 | .000 |
3 | Method 1[Protein] | .296 | .044 | .310 | 6.784 | .000 |
3 | Method 4[Protein] | .517 | .131 | .284 | 3.948 | .001 |

a. Dependent Variable: Exact[Protein]

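The regression equation of the final model is Exact[Protein] = -124.200 + 1.357 × Method 3[Protein] + 0.296 × Method 1[Protein] + 0.517 × Method 4[Protein].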

Residual plot for the predicted values
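The unstandardized coefficients of Model 3 in the coefficients table can be turned into a small prediction helper. This is a sketch under stated assumptions: the function and key names are hypothetical, and the input readings of 100 are made up purely to show the call.

```python
# Unstandardized coefficients of Model 3, copied from the coefficients table.
COEFFICIENTS = {
    "constant": -124.200,
    "method_3": 1.357,
    "method_1": 0.296,
    "method_4": 0.517,
}


def predict_exact_protein(method_1, method_3, method_4):
    """Fitted Exact[Protein] value from the three retained method readings."""
    return (COEFFICIENTS["constant"]
            + COEFFICIENTS["method_3"] * method_3
            + COEFFICIENTS["method_1"] * method_1
            + COEFFICIENTS["method_4"] * method_4)


# Hypothetical readings, chosen only to illustrate the call.
print(round(predict_exact_protein(method_1=100.0, method_3=100.0, method_4=100.0), 1))
```

Subtracting such fitted values from the observed Exact[Protein] values would give the residuals that a residual plot displays against the predictions.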