# A Report to Do Data Analysis in SPSS and Conduct Relevant Hypotheses Homework Solution

## Instructions

1. The primary objective of the Study on the Efficacy of Nosocomial Infection Control (SENIC Project) was to determine whether infection surveillance and control programs have reduced the rates of nosocomial (hospital-acquired) infection in United States hospitals. This data set consists of a random sample of 113 hospitals selected from the original 338 hospitals surveyed. Each line of the data set has an identification number and provides information on 11 other variables for a single hospital. The data presented here are for the 1975-76 study period. Please download hospital.xlsx from Blackboard. The dataset contains 12 variables shown below.

 1 Identification number: 1-113 2 Infection risk: Average estimated probability of acquiring infection in hospital (in percent) 3 Length of stay: Average length of stay of all patients in hospital (in days) 4 Age: Average age of patients (in years) 5 Routine culturing ratio: Ratio of number of cultures performed to number of patients without signs or symptoms of hospital-acquired infection, times 100 6 Routine chest X-ray ratio: Ratio of number of X-rays performed to number of patients without signs or symptoms of pneumonia, times 100 7 Number of beds: Average number of beds in hospital during study period 8 Medical school affiliation: 1=Yes, 2=No 9 Region: Geographic region, where: 1=Northeast, 2=Northcentral, 3=South, 4=West 10 Average daily census: Average number of patients in hospital per day during study period 11 Number of nurses: Average number of full-time equivalent registered and licensed practical nurses during study period (number full time plus one half the number part time) 12 Available facilities and services: Percent of 35 potential facilities and services that are provided by the hospital

1.1 Import the dataset into SPSS. Make sure you correctly specify the Measure for each variable. Please add Values for two categorical variables – medical school affiliation and region. Report descriptive measures and create graphical displays for the following variables – length, age, infection risk, available facilities and services, and number of beds. Provide a summary of your findings (no more than 200 words) based on the descriptive statistics and displays.

1.2 Confirmatory approach: Consider a regression model with infectious risk against age, routine culturing ratio, average daily census, available facilities and service, and Medical school affiliation. Provide a write-up of your findings (no more than 300 words) in APA format to address the following three issues.

• Assumption of homoscedasticity, assumption of normality, independence of error terms (i.e. autocorrelation), and collinearity between predictors.

• Provide the prediction equation. Interpret the value of unstandardized coefficients within the context.

• Identify any outlier or influential point based on Cook’s D and standardized DfBeta.

1.3 exploratory approach: Use either forward or stepwise selection method to find the best set of predictors for explaining length of stay. Consider all other variables as candidate predictors excluding medical school affiliation and region. Describe the variable selection procedure and report the final model (no more than 200 words). Hint: Make sure you adjust the tolerance level to be 0.10.

1.4 Consider a regression model with infectious risk against average daily census and Medical school affiliation. Assume there is no interaction between Medical school affiliation and daily census (i.e., equal slopes). Provide separate prediction equations for those affiliated with a medical school (yes) and those are not affiliated (no). Report your findings.

2. A researcher would like to investigate factors that are related to the years of graduate school for a student and number of students graduating. The data responses are stored in Graduate.sav. Four variables are measured.

 year: years of graduate school (values range from 1 year to 14 years) university: 1 – UC, Berkeley; 2 – Columbia University; 3 – Princeton University residence: 1 – permanent residents; 2 – temporary residents events: number of students graduating in each category For example, the first line of data would read: there are 31 students, who are permanent residents and spent one year in graduate school, are graduating in UC, Berkeley.

2.1 Are years of graduate school differ among students in different universities and with different residence status? Use appropriate GLM method to examine the main effect of university, main effect of residence status, and their interaction effect on years of graduate school. Create an APA style summary (no more than 200 words) and include the test result on homogeneity assumption, ANOVA summary table, interpretation of overall model usefulness as well as main and interaction effects based on F test and effect size measures.

2.2 This researcher is also interested in the relationship between number of students graduating and their years of graduate school. Create an appropriate graphical display to show this relationship, and summarize your observations (no more than 100 words). Then use an appropriate statistical measure to report the linear association between these two variables, and interpret the value within the context (no more than 100 words).

2.3At last, this researcher wants to examine whether number of students graduating differs by university and residence status. Use appropriate GLM method to examine the main effect of university, main effect of residence status, and their interaction effect on number of students graduating. If the interaction effect is significant, conduct simple effect analysis. Provide a summary to describe your analysis and all of the findings (no more than 300 words).

3. A group of researchers is asked to examine the effect of a new brand of Margarine (called as Clora Margarine) on the cholesterol measures. Eighteen participants were recruited through a random sampling process, and used Clora Margarine for 8 weeks. Their cholesterol was measured before the special diet, after 4 weeks, and after 8 weeks. The data responses are stored in Cholesterol.sav.

3.1 Report descriptive measures and create graphical displays

(a) Create a table to display descriptive measures for cholesterol levels at three different time points (i.e., before, after 4 weeks, and after 8 weeks).

(b) Two graphs are shown below. The first graph displays changes in cholesterol measures across three-time points for each participant. The second graph only indicates the average cholesterol measure at each time point. Note that the scale of the y-axis is different in these two plots. Please comment on the mean difference of cholesterol across three-time points as well as individual differences in the changes of cholesterol measures. 3.2Use appropriate GLM method to test whether change in mean cholesterol is significant across three time points. Provide an APA format write-up to summarize all the procedures in your analysis and general findings. Please at least cover the following information in your report.

• Assumption of sphericity.
• The F test and effect size measures for the main analysis. (Please make sure you use the most appropriate F statistic and corresponding degrees of freedom.)
• Conduct post hoc comparisons if applicable on cholesterol measures between each pair of time points using Sidak method. (Hint: Time is a within-subject factor. In SPSS, the Post Hoc option conducts post hoc comparisons for between-subject factors only.)

## Assignment solution

Question 1.1

Table 1: Descriptive Statistics

 N Minimum Maximum Mean Std. Deviation Infectious Risk 113 1.3 17.9 5.102 2.4735 Length 113 1.60 42.00 10.1073 4.18667 Age 113 38.8 65.9 53.232 4.4616 Available facilities and services 113 6 835 Number of beds 113 29 835 43.16 15.201 Valid N (listwise) 113 252.17 192.843 Fig 1: Histogram for infectious risk Fig 2: Histogram for Age Fig 3: Histogram for Length Fig 4: Histogram for Number of beds Fig 5: Histogram of available facilities and services

Table 1 above shows the descriptive statistics of the interest variables including the mean, standard deviation, and minimum and maximum values. The mean age of respondents is 53.23 while its standard deviation is 4.462. The average length of stay in the hospital is 10.11 while its standard deviation is 4.187. Infectious risk has a mean value of 5.10 and a standard deviation value of 2.474. The average number of beds in the hospital is 252.17 and its standard deviation is 192.843. Lastly, the available facilities and services have a mean of 43.16 and a standard deviation value of 15.201. To visualize all these variables the histogram bar chart was used with a super imposed normal curve which tells which the direction of their distribution. Only Age and available facilities and services were normally skewed to the left while other variables were skewed to the right.

Question 1.2

ANOVAa

 Model Sum of Squares df Mean Square F Sig. Regression 104.885 5 20.977 3.867 .003b 1 Residual 104.885 107 20.977 Total 685.251 112

a. Dependent Variable: Infectious Risk
b. Predictors: (Constant), Medical school affiliation, Age, Routine culturing ratio, Available facilities and services, Average daily census
Model Summaryb
 Model R R Square Adjusted R Square Std. Error of the Estimate 1 .391a .153 .113 2.3289

a. Predictors: (Constant), Medical school affiliation, Age, Routine culturing ratio, Available facilities and services, Average daily census
b. Dependent Variable: Infectious Risk

Coefficients'a
Model Unstandardized Coefficients Standardized Coefficients t Sig Correlations Collinearity Statistics
B Std. Error Beta Zero-order Partial Part Tolerance VIF
1 (Constant) .267 3.231   .082 .934
Age .114 .051 .206 2.231 .028 .179 .211 .198 .925 1.081
Routine culturing ratio -.002 .023 -.007 -.077 .939 -.031 -.007 -.007 .874 1.144
Average daily census 006 .002 .402 2.587 .011 .313 .243 .230 .329 3.044
Available facilities and services -.028 .024 -.175 1.207 .230 .178 -.116 -.107 .376 2.657
Medical school affiliation -.669 .796 -.097 -841 .402 -.221 -.081 -.075 .593 1.686
Residuals Statistics
Minimum Maximum Mean Std. Deviation N
Predicted Value 2.886 9.336 5.102 9677 113
Std. Predicted Value -2,290 4.375 .000 1.000 113
Standard Error of Predicted Value 266 1.102 505 .182 113
Adjusted Predicted Value 2.809 7.816 5.078 .9494 113
Residual -4.5034 8.6040 0000 2.2764 113
Std. Residual -1.934 3.694 .000 .977 113
Stud. Residual -2.014 4.194 .005 1.026 113
Deleted Residual -4.8855 11.0873 .0241 2.5141 113
Stud. Deleted Residual -2.044 4.566 .012 1.052 113
Mahal. Distance .473 24.094 4.956 4.646 113
Cook's Distance .000 .846 .019 .083 113
Centered Leverage Value .004 .215 .044 .041 113
a. Dependent Variable: Infectious Risk
Collinearity Diagnostics
Model Dimension Eigenvalue Condition Index Variance Proportions
(Constant) Age Routine culturing ratio Average daily census Available facilities and services Medical school affiliation
1 1 5.273 1.000 .00 .00 .01 .00 .00 .00
2 .390 3.679 .00 .00 .02 .22 .01 .01
3 .289 4.274 .00 .00 .83 .00 ,00 .01
4 .033 12.633 .00 .00 .05 .69 .92 .04
5 .013 20.256 .04 .16 .00 .07 .06 .87
6 .003 42.457 .95 .83 .09 .01 .01 .08
a. Dependent Variable: Infectious Risk  Interpretations

Multiple linear regressions were used to estimate the relationship between the response variable infectious risk and the five independent variables. The assumptions of homoscedasticity of variance were not violated since the scatter plot between the studentized residuals against the unstandardized predicted values shows a linear trend. However, assumption of normality is violated since the Q-Q plot above shows the data points straying away from the linear trend line in an obvious form. Assumption of independence of error term is not violated also with no relationship found between the residuals and the response variable and lastly cook distance below mean shows no presence of outlier in this model.
The regression model is significant with (F5,107=3.867, p-value = 0.003) with the p-value of the model lesser than 0.05 level of significance we establish the fact that the model is significant and the independent variables could predict the response variable. The coefficient of determination R-square is the amount of variability in the regression model that the independent variables caused by the independent variable in the model. The R-square was computed to be 0.153 which means 15.3% of the variation in the response variable (Infectious risk) can be accounted for by the independent variables. Furthermore the test of significance of the independent variables indicates that both variables Average daily census and Age were the only significant independent variables with their p-value 0.000<0.05 level of significance. The other variables in the model were insignificant with their p-value 0.165>0.05 level of significance. Furthermore, VIF values below 10 in the model above represent no presence of multicollinearity among the pairs of the independent variable. Lastly, the regression prediction equation for this model can be written as
Infectious risk = 0.26 + 〖0.114〗_Age–〖0.002〗_(Routine culturing ratio) + 〖0.006〗_(Average daily census) - 〖0.028〗_(Available facilities and services) - 〖0.669〗_(Medical school affiliation).

Question 1.3

Model Summary
Mod el R R Square Adjusted R Square Std. Error of the Estimate
1 .473a .224 .217 2.1885
2 .517b .267 .254 2.1366
3 .546c .298 .278 2.1014
4 .572d .327 .302 2.0665
a. Predictors: (Constant), Length
b. Predictors: (Constant), Length, Average daily census
c. Predictors: (Constant), Length, Average daily census, Age
d. Predictors: (Constant), Length, Average daily census, Age, Routine chest X-ray ratio
e. Dependent Variable: Infectious Risk

ANOVAa
Model Unstandardized Coefficients Standardized Coefficients t Sig.
B Std. Error Beta
(Constant) 1 2.275 540   4.213 .000
Length 280 .049 .473 5.663 .000
(Constant) 1.917 .546   3.514 .001
2 Length 250 .050 .423 5.041 .000
Average daily census .003 .001 213 2.542 .012
(Constant) -3.222 2.426   -1.328 187
Length 3 244 .049 413 4,999 .000
Average daily census .004 .001 .225 2.723 .008
Age .097 .045 175 2.172 .032
(Constant) -4.967 2.518   -1.973 .051
Length 224 .049 .379 4.580 .000
4 Average daily census .004 .001 223 2.735 .007
Age .099 .044 .179 2.265 .025
Routine chest X-ray ratio .022 .010 175 2.170 .032
a. Dependent Variable: Infectious Risk

Interpretations

In this model, the forward stepwise regression model was used. Forward selection is a form of stepwise regression that starts with a blank model and gradually adds variables. In each time you we take a step forward, we add one variable that improves the model the most. However, the coefficient of determination R-square will be used to determine the best model out of the four fitted model above and the model with the highest R-square will be considered as the best model. The last model is considered as the best model which has a R-square value of 0.327 the highest among the four above. This model has the four predictor variables which are Length, Average daily census, Age and Routine check X-ray rotation. This model is found to be significant with [F(4,108) = 13.117, p<0.05] and lastly all independent variables in this model is significant with their p-value lesser than 0.05 level of significance.

Question 1.4

Model   Summary
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .315a 0.099 0.083 2.3689
a. Predictors: (Constant),   Medical school affiliation (Yes), Average daily census
b. Dependent Variable:   Infectious Risk
ANOVAa
Model Sum of Squares df Mean Square F Sig.
1 Regression 67.988 2 33.994 6.058 .003b
Residual 617.263 110 5.611
Total 685.251 112
a. Dependent Variable: Infectious Risk
b. Predictors: (Constant), Medical school affiliation (Yes), Average daily census
Coefficientsa
Model Unstandardized Coefficients Standardized Coefficients t Sig.
B Std. Error Beta
1 (Constant) 4.804 1.716 2.799 0.006
Average daily census 0.005 0.002 0.285 2.484 0.015
Medical school affiliation (Yes) -0.313 0.79 -0.045 -0.397 0.692
a. Dependent Variable: Infectious Risk
Model Summary
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .313a 0.098 0.09 2.3598
a. Predictors: (Constant), Medical school affiliation (No), Average daily census
b. Dependent Variable: Infectious Risk
Model Summary
Model R R Square Adjusted R Square Std. Error of the Estimate
1 .313a 0.098 0.09 2.3598
a. Predictors: (Constant), Medical school affiliation (No), Average daily census
b. Dependent Variable: Infectious Risk
ANOVAa
Model Sum of Squares df Mean Square F Sig.
Regression 67.105 1 67.105 12.05 .001b
Residual 618.146 111 5.569
Total 685.251 112
a. Dependent Variable: Infectious Risk

b. Predictors: (Constant), Medical school affiliation (No), Average daily census

Coefficientsa
Model Unstandardized Coefficients Standardized Coefficients t Sig.
B Std. Error Beta
1 (Constant) 4.139 0.355 11.645 0
Average daily census
Medical school affiliation (No)
.005
-0.289
.001
0.655
.313
-0.021
3.471
-0.155
.001
0.601
a. Dependent Variable: Infectious Risk
Interpretations

The two multiple linear regression predicting infectious risk is significant at 5% level of significance. While R-square of the model with Yes medical school affiliation is greater than that of No. The predicted equation for each is given respectively;

Medical school affiliation (Yes)

Infectious risk = 4.804 + 〖0.005〗_(Average daily census)–〖0.313〗_(Medical school affilation)

Medical school affiliation (No)

Infectious risk = 4.139 + 〖0.005〗_(Average daily census)–〖0.289〗_(Medical school affilation)

Question 2.1
Levene's Test of Equality of Error Variancesa
Dependent Variable: years of graduate school
F df1 df2 Sig.
0.857 5 67 0.515
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + university + residence + university * residence

Tests of Between-Subjects Effects
Dependent Variable: years of graduate school
Source Type III Sum of Squares df Mean Square F Sig. Partial Eta Squared
Corrected Model 81.507a 5 16.301 1.036 0.404 0.072
Intercept 3066.16 1 3066.16 194.908 0 0.744
university 26.319 2 13.159 0.837 0.438 0.024
residence 47.909 1 47.909 3.045 0.086 0.043
university * residence 26.319 2 13.159 0.837 0.438 0.024
Error 1054 67 15.731
Total 4629 73
Corrected Total 1135.507 72
a. R Squared = .072 (Adjusted R Squared = .003)
Interpretations

The univariate Generalized linear model was used to test if years of graduate school differ among students in different universities and with different residence statuses. Two main effects identified in this model as university and residence and their interaction term are also included in the model. At a 5% level of significance, the homogeneity assumption is not violated in the model from the table above with (F=0.857, p=0.515).
The overall model is not significant with [F(5,67)=1.036, p>0.05]. This shows there is no difference among the years of graduate students in different universities and with different residence statuses. The main effect of University is not significant with [F(2,67)=0.837, p>0.05]. Similarly, the main effect of residence is not significant with [F(1,67)=3.045, p>0.05]. While, the interaction effect is also not significant with [F(2,67)=0.837, p>0.05].
Lastly, Eta-Square is the effect size measure of this model which were given in the model above, with the least measure given as 0.24 we can conclude the effect size is large enough in the model above.

Question 2.2 Scatter plot between Year of graduate school and number of students graduating
Interpretations: A scatterplot is a data visualization that depicts the relationship between two numerical variables. Each member of the dataset is represented as a point whose x-y coordinates correspond to the two variables' values. Hence, a scatterplot is used above to determine the relationship between years of graduate school and the number of graduate students, and the graph above shows some points of linearity which means there exists a linear relationship between both variables.

Correlations
number of students graduating Pearson Correlation 1 -.340**
Sig. (2-tailed) 0.003
N 73 73
years of graduate school Pearson Correlation -.340** 1
Sig. (2-tailed) 0.003
N 73 73
**. Correlation is significant at the 0.01 level (2-tailed).
Interpretations: Correlation is a statistical technique used in determining the degree of ass if association or relationship between two variables. Correlation can either be negative or positive; also it ranges from -1 to +1. Hence, from the table above the correlation coefficient between both variables is (r=-0.340, p<0.03) which means there is a negative significant association between both variable's number of students graduating and years of graduate school.

Question 2.3

Levene's Test of Equality of Error Variancesa
Dependent Variable: number of students graduating
 F df1 df2 Sig. 14.300 5 67 .000
Tests the null hypothesis that the error variance of the dependent variable is equal across groups.
a. Design: Intercept + university + residence + university * residence

Tests of Between-Subjects Effects
Dependent Variable: number of students graduating
Source Type III Sum of Squares df Mean Square F Sig. Partial Eta Squared
Corrected Model 314364.322a 5 62872.864 7.298 0 0.353
Intercept 348799.192 1 348799.192 40.484 0 0.377
university 152597.273 2 76298.636 8.856 0 0.209
residence 87246.036 1 87246.036 10.126 0.002 0.131
university * residence 60588.929 2 30294.465 3.516 0.035 0.095
Error 577248.335 67 8615.647
Total 1336525 73
Corrected Total 891612.658 72

a. R Squared = .353 (Adjusted R Squared = .304)

Interpretations

The univariate Generalized linear model was used to test if the number of student graduating differ among students in different universities and with different residence status. With two main effects identified in this model as university and residence and their interaction term is also included in the model. At a 5% level of significance, the homogeneity assumption is violated in the model from the table above with (F=14.300, p<0.05). Hence, heteroscedasticity is present.
The overall model is significant with [F(5,67)=7.298, p<0.05]. This shows there is a significant difference among the number of students graduating from different universities and with different residence statuses. The main effect of the University is significant with [F(2,67)=8.856, p<0.05]. Similarly, the main effect of residence is also significant with [F(1,67)=10.126, p<0.05]. While, the interaction effect is significant too with [F(2,67)=3.516, p<0.05].
Lastly, Eta-Square the effect size measure of this model shows there is a medium effect size across all predictor variables in the model with the least effect size given as 0.095.

Question 3.1

A.

Descriptive Statistics
N Minimum Maximum Mean Std. Deviation
Before the special diet 18 3.91 8.43 6.4078 1.19109
After 4 weeks 18 3.7 7.71 5.8417 1.12335
After 8 weeks 18 3.66 7.67 5.7789 1.10191
Valid N (listwise) 18 Fig 1: Histogram distribution of before the special diet Fig 2: Histogram distribution of diets after 4 weeks Fig 3: Histogram distribution of diets after 8 weeks

Interpretations

The descriptive statistics table above includes the variable's mean, standard deviation, and minimum and maximum values. The first cholesterol levels “before the diet” have a mean of 6.41 and a standard deviation of 1.191. Diets after 4 weeks have a mean value of 5.84 and a standard deviation 1.123, diets after 8 weeks have a mean value of 5.78 and a standard deviation value of 1.102. The histogram bar chart was used to visualize the three distributions above with a superimposed normal curve and the three graphs show the variables were normally distributed and skewed to the right.
B Interpretations: The graph above for each participant's changes in cholesterol level indicates the 5th participants have the highest change in cholesterol level, while the 10th participants have the lowest changes in cholesterol level among the participants. Interpretations: The graph above indicates the average cholesterol measure at each time point. The mean difference in cholesterol levels “before the diet” is the highest among the three, followed by diets after 4 weeks, while diets after 8 weeks have the lowest mean difference.

Question 3.2
Mauchly's Test of Sphericitya
Measure: Cholestrol_levels
Within Subjects Effect Mauchly's W Approx. Chi-Square df Sig. Epsilonb
Greenhouse-Geisser Huynh-Feldt Lower-bound
Time 0.381 15.44 2 0 0.618 0.642 0.5
Tests the null hypothesis that the error covariance matrix of the orthonormalized transformed dependent variables is proportional to an identity matrix.
a. Design: Intercept
Within Subjects Design: Time
b. May be used to adjust the degrees of freedom for the averaged tests of significance. Corrected tests are displayed in the Tests of Within-Subjects Effects table.
Tests of Within-Subjects Effects
Measure: Cholestrol_levels
Source Type III Sum of Squares df Mean Square F Sig. Partial Eta Squared
Time Sphericity Assumed 4.32 2 2.16 212.321 0 0.926
Greenhouse-Geisser 4.32 1.235 3.497 212.321 0 0.926
Huynh-Feldt 4.32 1.284 3.365 212.321 0 0.926
Lower-bound 4.32 1 4.32 212.321 0 0.926
Error(Time) Sphericity Assumed 0.346 34 0.01
Greenhouse-Geisser 0.346 21.001 0.016
Huynh-Feldt 0.346 21.822 0.016
Lower-bound 0.346 17 0.02
Tests of Between-Subjects Effects
Measure: Cholestrol_levels
Transformed Variable: Average
Source Type III Sum of Squares df Mean Square F Sig. Partial Eta Squared
Intercept 1950.125 1 1950.125 503.326 0 0.967
Error 65.866 17 3.874
Estimates
Measure: Cholestrol_levels
Time Mean Std. Error 95% Confidence Interval
Lower Bound Upper Bound
1 6.408 0.281 5.815 7
2 5.842 0.265 5.283 6.4
3 5.779 0.26 5.231 6.327
Interpretations From the table above, we can conclude sphericity assumptions is violated with (χ^2(2) =15.440, p<0.05). Violation of sphericity assumptions means the variances of the differences between all combinations of related groups are not equal. As our data violated the assumption of sphericity, we look at the values in the "Greenhouse-Geisser" row as given in the within-subject-effects above. The ANOVA with repeated measures with a Greenhouse-Geisser correction is reported, the mean scores for the cholesterols levels were statistically significantly different (F(1.235, 21.001) = 212.321, p < 0.001). The Sidak pairwise comparison table presents the results for post hoc test. This table gives us the significance level for differences between the individual time points. We can see that there was a significant difference in cholesterol levels between diets after 4 weeks and diets after 8 weeks (p = 0.000), and between before the diets and diets after 8 weeks (p = 0.000), also there is significant differences between before the diets and diets after 4 weeks (p = 0.004).
Pairwise Comparisons
Measure: Cholestrol_levels
(I) Time (J) Time Mean Difference (I-J) Std. Error Sig.b 95% Confidence Interval for Differenceb
Lower Bound Upper Bound
1 2 .566* 0.037 0 0.469 0.663
3 .629* 0.042 0 0.518 0.74
2 1 -.566* 0.037 0 -0.663 -0.469
3 .063* 0.017 0.004 0.019 0.107
3 1 -.629* 0.042 0 -0.74 -0.518
2 -.063* 0.017 0.004 -0.107 -0.019
Based on estimated marginal means
*. The mean difference is significant at the .05 level.
b. Adjustment for multiple comparisons: Sidak.
Interpretations

From the table above, we can conclude sphericity assumptions are violated with (χ^2(2) =15.440, p<0.05). Violation of sphericity assumptions means the variances of the differences between all combinations of related groups are not equal. As our data violated the assumption of sphericity, we look at the values in the "Greenhouse-Geisser" row as given in the within-subject effects above. The ANOVA with repeated measures with a Greenhouse-Geisser correction reported the mean scores for the cholesterols levels were statistically significantly different (F(1.235, 21.001) = 212.321, p < 0.001).

The Sidak pairwise comparison table presents the results for the post hoc test. This table gives us the significance level for differences between the individual time points. We can see that there was a significant difference in cholesterol levels between diets after 4 weeks and diets after 8 weeks (p = 0.000), and between before the diets and diets after 8 weeks (p = 0.000), also there is significant differences between before the diets and diets after 4 weeks (p = 0.004).