Univariate ANOVA Analysis
MANOVA
- Using GSSWEBMAIL.sav, given in the resources (with 25 variables and 2,041 cases created from a subset of the 2010 GSS), consider the DVs WWWHR and EMAILHR, and IVs GENERATION and SEX.
- State a research question, and the omnibus null and alternative hypotheses for a MANOVA.
- Transform the DV WWWHR with the Ln transformation and include its before and after figure (Histogram) in the report.
- Using the attached MANOVA checklist for conducting MANOVA testing, screen and statistically test for the assumptions.
- Conduct a MANOVA with post hoc tests, using SPSS GLM.
- Provide a complete narrative of results in a maximum of six double-spaced pages in current APA format, in the order listed below:
- The main effects for each IV on the combined DV (test statistic, F ratio, p value, and effect size).
- The main effect for factor interaction (test statistic, F ratio, p value, and effect size).
- The univariate ANOVA results including main effects for each IV and DV (F ratio, p-value, and effect size).
- The post hoc results and your conclusions.
- Include the SPSS output file with your analysis.
Solution
Research questions and hypotheses:
We are interested in evaluating the effect of generation and sex on the combined DV of WWW hours per week and Email hours per week.
The appropriate research questions are:
- Is there a significant effect of generation on the combined DV of WWW hours per week and Email hours per week? (main effect)
- Is there a significant effect of sex on the combined DV of WWW hours per week and Email hours per week? (main effect)
- Is there a significant effect of the interaction between generation and sex on the combined DV of WWW hours per week and Email hours per week? (interaction effect)
Hence, the null and the alternate hypotheses defined for these research questions are:
H0: There is no effect of generation on the combined DV of WWW hours per week and Email hours per week vs H1: There is a significant effect of generation on the combined DV of WWW hours per week and Email hours per week.
H0: There is no effect of sex on the combined DV of WWW hours per week and Email hours per week vs H1: There is a significant effect of sex on the combined DV of WWW hours per week and Email hours per week.
H0: There is no effect of an interaction between generation and sex on the combined DV of WWW hours per week and Email hours per week vs H1: There is a significant effect of an interaction between generation and sex on the combined DV of WWW hours per week and Email hours per week.
Missing Data:
Statistics
| | WWW HOURS PER WEEK | EMAIL HOURS PER WEEK | Generation by age range | RESPONDENTS SEX |
|---|---|---|---|---|
| N Valid | 1045 | 1100 | 2038 | 2041 |
| N Missing | 996 | 941 | 3 | 0 |
WWW hours per week is missing for 996 cases and email hours per week is missing for 941 cases. We exclude these cases from the analysis.
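Excluding every case that lacks a value on any analysis variable is commonly called listwise deletion. The sketch below illustrates the idea on a small hypothetical dataset (the rows and the use of `None` for a missing response are assumptions for illustration, not values from GSSWEBMAIL.sav). Note that a reported value of zero hours is a valid response and is kept.

```python
# Hypothetical mini-dataset; None marks a missing response.
rows = [
    {"wwwhr": 10, "emailhr": 5, "generation": "Gen X", "sex": "Female"},
    {"wwwhr": None, "emailhr": 2, "generation": "Gen Y", "sex": "Male"},
    {"wwwhr": 0, "emailhr": None, "generation": "Silent", "sex": "Female"},
    {"wwwhr": 3, "emailhr": 0, "generation": None, "sex": "Male"},
    {"wwwhr": 7, "emailhr": 1, "generation": "Gen Y", "sex": "Female"},
]
variables = ["wwwhr", "emailhr", "generation", "sex"]


def missing_counts(rows, variables):
    """Count missing values per variable, like the 'Missing' row of the table."""
    return {v: sum(1 for r in rows if r[v] is None) for v in variables}


def listwise_delete(rows, variables):
    """Keep only cases with a value on every analysis variable."""
    return [r for r in rows if all(r[v] is not None for v in variables)]


counts = missing_counts(rows, variables)
complete = listwise_delete(rows, variables)
print(counts)
print(len(complete), "complete cases out of", len(rows))
```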
Transformation of the Dependent Variables:
The histograms show that both dependent variables, email hours per week (emailhr) and WWW hours per week (wwwhr), violate the normality assumption, and their boxplots show many outliers in the untransformed data.
A log transformation of these DVs is therefore appropriate.
Both variables contain zero values, and the logarithm of zero is undefined, so a plain log (emailhr) or log (wwwhr) transformation would drop those cases from the transformed data.
Hence, we transform the dependent variables as log (emailhr + 1) and log (wwwhr + 1).
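A minimal sketch of the transformation follows. It assumes the natural logarithm (the assignment asks for an Ln transformation) and uses made-up hours values; `log_plus_one` is a hypothetical helper name.

```python
import math


def log_plus_one(hours):
    """Return ln(hours + 1), which is defined for hours = 0."""
    return math.log1p(hours)


hours = [0, 1, 5, 20, 60]  # hypothetical weekly hours
transformed = [log_plus_one(h) for h in hours]
print([round(t, 3) for t in transformed])

# A plain log fails on zero, which is why the +1 shift is needed.
try:
    math.log(0)
    plain_log_of_zero_fails = False
except ValueError:
    plain_log_of_zero_fails = True
print("ln(0) raises an error:", plain_log_of_zero_fails)
```

A case with zero hours maps to 0 after the transformation and stays in the analysis.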
Outliers in the dependent variables:
From the boxplots of the transformed dependent variables, log (wwwhr + 1) and log (emailhr + 1), we observe that neither has outliers.
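Boxplots conventionally flag points more than 1.5 interquartile ranges beyond the quartiles. The sketch below applies that rule to hypothetical hours before and after a ln(x + 1) transformation. The data are invented, and the quartile method used here (`method="inclusive"`) is an assumption that may differ slightly from the hinges SPSS uses.

```python
import math
import statistics


def boxplot_outliers(values):
    """Return values beyond 1.5 * IQR from the first or third quartile."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [v for v in values if v < low or v > high]


raw = [0, 1, 2, 3, 4, 5, 6, 8, 10, 30]   # hypothetical weekly hours
logged = [math.log1p(v) for v in raw]     # ln(hours + 1)

raw_outliers = boxplot_outliers(raw)
logged_outliers = boxplot_outliers(logged)
print("outliers before transformation:", raw_outliers)
print("outliers after transformation:", logged_outliers)
```

In this toy example the long right tail is pulled in far enough that the largest value is no longer flagged.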
Linearity of Dependent variables:
The Pearson correlation between the two dependent variables is 0.444, and it is statistically significant with a very low p-value.
Hence, a linear relationship between the dependent variables can be assumed.
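For reference, the Pearson correlation is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it directly; the paired values are hypothetical and are not expected to reproduce the 0.444 reported above.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical transformed scores for six respondents.
log_wwwhr = [0.0, 0.7, 1.1, 1.6, 2.3, 3.0]
log_emailhr = [0.7, 0.0, 1.4, 1.1, 1.8, 2.2]

r = pearson_r(log_wwwhr, log_emailhr)
print(f"r = {r:.3f}")
```

A significant correlation indicates a linear association but does not by itself rule out curvature, so a scatterplot of the two DVs is a useful companion check.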
Homogeneity of Variance – Covariance:
Box's Test of Equality of Covariance Matrices gives F = 1.008 with a p-value of 0.455, which is not significant.
Hence, we can assume that the variance-covariance matrices of the dependent variables are equal across groups, and we can use Wilks' Lambda as the test statistic.
MANOVA Results:
Multivariate Tests (a)
| Effect | Statistic | Value | F | Hypothesis df | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|
| Intercept | Wilks' Lambda | .333 | 960.394 | 2.000 | < .001 | .667 |
| Generation | Wilks' Lambda | .939 | 6.078 | 10.000 | < .001 | .031 |
| sex | Wilks' Lambda | .994 | 3.091 | 2.000 | .046 | .006 |
| Generation * sex | Wilks' Lambda | .992 | .747 | 10.000 | .680 | .004 |

a. Design: Intercept + Generation + sex + Generation * sex
Main effects for each IV on the combined DV:
The test statistic (Wilks' Lambda) for the main effect of generation is 0.939, with F-ratio = 6.078, p-value < 0.001, and effect size (partial eta squared) = 0.031.
The test statistic (Wilks' Lambda) for the main effect of sex is 0.994, with F-ratio = 3.091, p-value = 0.046, and effect size (partial eta squared) = 0.006.
Hence, the main effects of both IVs, generation and sex, on the combined DV are statistically significant.
Main effect for factor interaction on the combined DV:
The test statistic (Wilks' Lambda) for the factor interaction (generation*sex) is 0.992, with F-ratio = 0.747, p-value = 0.680, and effect size (partial eta squared) = 0.004.
The factor interaction is not significant.
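The multivariate effect sizes can be cross-checked from Wilks' Lambda. A commonly used formula is partial eta squared = 1 - Lambda^(1/s), where s is the smaller of the number of DVs and the effect's degrees of freedom. The sketch below applies it to the values in the Multivariate Tests table; the effect degrees of freedom (5 for generation, 1 for sex, 5 for the interaction) are taken from the univariate table, and `wilks_partial_eta_squared` is a hypothetical helper name.

```python
def wilks_partial_eta_squared(wilks_lambda, n_dvs, effect_df):
    """Partial eta squared implied by Wilks' Lambda: 1 - Lambda ** (1 / s)."""
    s = min(n_dvs, effect_df)  # smaller of number of DVs and effect df
    return 1 - wilks_lambda ** (1 / s)


N_DVS = 2  # the two transformed DVs

# effect: (Wilks' Lambda, effect df, reported partial eta squared)
reported = {
    "Generation": (0.939, 5, 0.031),
    "sex": (0.994, 1, 0.006),
    "Generation * sex": (0.992, 5, 0.004),
}

recomputed = {
    name: round(wilks_partial_eta_squared(lam, N_DVS, df), 3)
    for name, (lam, df, _) in reported.items()
}

for name, (_, _, eta) in reported.items():
    print(f"{name}: reported {eta}, recomputed {recomputed[name]}")
```

The recomputed values agree with the reported ones to three decimals, which also makes clear how small these effects are even where they are statistically significant.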
Univariate ANOVA results:
UNIVARIATE ANOVA Results
| Source | Dependent Variable | Type III Sum of Squares | df | Mean Square | F | Sig. | Partial Eta Squared |
|---|---|---|---|---|---|---|---|
| Generation | LOG (emailhr + 1) | 10.476 | 5 | 2.095 | 1.962 | .082 | .010 |
| Generation | LOG (wwwhr + 1) | 50.747 | 5 | 10.149 | 9.802 | < .001 | .049 |
| sex | LOG (emailhr + 1) | 1.946 | 1 | 1.946 | 1.822 | .177 | .002 |
| sex | LOG (wwwhr + 1) | 1.650 | 1 | 1.650 | 1.593 | .207 | .002 |
| Generation * sex | LOG (emailhr + 1) | 4.977 | 5 | .995 | .932 | .459 | .005 |
| Generation * sex | LOG (wwwhr + 1) | 2.706 | 5 | .541 | .523 | .759 | .003 |
For the dependent variable – LOG (email hours per week + 1):
The F-ratio for IV generation is 1.962 with a p-value = 0.082 and effect size = 0.010
The F-ratio for IV sex is 1.822 with a p-value = 0.177 and effect size = 0.002
The F-ratio for the interaction effect generation*sex is 0.932 with a p-value = 0.459 and effect size = 0.005
Hence, the IVs generation and sex as well as the interaction effect generation*sex are not significant for the DV LOG (emailhr + 1).
For the dependent variable – LOG (www hours per week + 1):
The F-ratio for IV generation is 9.802 with a p-value < 0.001 and effect size = 0.049
The F-ratio for IV sex is 1.593 with a p-value = 0.207 and effect size = 0.002
The F-ratio for the interaction effect generation*sex is 0.523 with a p-value = 0.759 and effect size = 0.003
Hence, the IV generation is statistically significant for the DV LOG (wwwhr + 1) and the IV sex as well as the interaction effect generation*sex are not significant for the DV LOG (wwwhr + 1).
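As a consistency check on the univariate table, each mean square should equal the sum of squares divided by its degrees of freedom, and because F = MS(effect) / MS(error), every effect for a given DV should imply the same error mean square. The sketch below runs both checks on the reported values. The error mean square itself is not shown in the table, so the implied value is an inference from rounded figures, not a reported result.

```python
# (source, DV, Type III SS, df, reported mean square, reported F)
rows = [
    ("Generation", "LOG (emailhr + 1)", 10.476, 5, 2.095, 1.962),
    ("Generation", "LOG (wwwhr + 1)", 50.747, 5, 10.149, 9.802),
    ("sex", "LOG (emailhr + 1)", 1.946, 1, 1.946, 1.822),
    ("sex", "LOG (wwwhr + 1)", 1.650, 1, 1.650, 1.593),
    ("Generation * sex", "LOG (emailhr + 1)", 4.977, 5, 0.995, 0.932),
    ("Generation * sex", "LOG (wwwhr + 1)", 2.706, 5, 0.541, 0.523),
]

# Check 1: mean square = sum of squares / df (to three decimals).
ms_matches = all(round(ss / df, 3) == ms for _, _, ss, df, ms, _ in rows)
print("mean squares consistent:", ms_matches)

# Check 2: implied error mean square = (SS / df) / F, grouped by DV.
implied_error_ms = {}
for _, dv, ss, df, _, f in rows:
    implied_error_ms.setdefault(dv, []).append((ss / df) / f)

for dv, values in implied_error_ms.items():
    print(dv, [round(v, 3) for v in values])
```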
Post Hoc Analysis:
For the DV LOG (emailhr + 1), the mean difference between any two generation groups is not statistically significant, as the p-values of Scheffe's multiple comparison test for every pair of generation groups are higher than 0.05.
Multiple Comparisons – Significant Groups (Scheffe)
| Dependent Variable | (I) Generation by age range | (J) Generation by age range | Mean Difference (I-J) | Sig. | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|---|---|
| LOG (WWW HOURS PER WEEK + 1) | Gen Y (Millenials) | Silent Generation | .4586 | .019 | .0430 | .8742 |
| LOG (WWW HOURS PER WEEK + 1) | Gen Y (Millenials) | G.I. Generation | 1.0208 | < .001 | .4206 | 1.6209 |
| LOG (WWW HOURS PER WEEK + 1) | Gen X | G.I. Generation | .9813 | < .001 | .3739 | 1.5888 |
| LOG (WWW HOURS PER WEEK + 1) | Younger Boomers | G.I. Generation | .7691 | .004 | .1502 | 1.3879 |
| LOG (WWW HOURS PER WEEK + 1) | Older Boomers | G.I. Generation | .7045 | .017 | .0736 | 1.3354 |
For the DV LOG (wwwhr + 1), the mean difference is statistically significant for five pairs of groups: Gen Y and Silent Generation, Gen Y and G.I. Generation, Gen X and G.I. Generation, Younger Boomers and G.I. Generation, and Older Boomers and G.I. Generation. The p-values of Scheffe's multiple comparison test for these pairs are less than 0.05, whereas the p-values for all other pairs of generation groups are higher than 0.05.
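A quick way to sanity-check the post hoc table is to use the confidence intervals: each interval should be centred on its mean difference, and a comparison that is significant at the 0.05 level should have a 95% interval that excludes zero. The sketch below runs both checks on the reported values, with "< .001" entered as an upper bound of 0.001; the helper names are hypothetical.

```python
# (group I, group J, mean difference, p-value, CI lower, CI upper)
comparisons = [
    ("Gen Y (Millenials)", "Silent Generation", 0.4586, 0.019, 0.0430, 0.8742),
    ("Gen Y (Millenials)", "G.I. Generation", 1.0208, 0.001, 0.4206, 1.6209),
    ("Gen X", "G.I. Generation", 0.9813, 0.001, 0.3739, 1.5888),
    ("Younger Boomers", "G.I. Generation", 0.7691, 0.004, 0.1502, 1.3879),
    ("Older Boomers", "G.I. Generation", 0.7045, 0.017, 0.0736, 1.3354),
]


def ci_midpoint(lower, upper):
    """Centre of a symmetric confidence interval."""
    return (lower + upper) / 2


def excludes_zero(lower, upper):
    """True when the interval lies entirely above or entirely below zero."""
    return lower > 0 or upper < 0


for group_i, group_j, diff, p, lower, upper in comparisons:
    centred = abs(ci_midpoint(lower, upper) - diff) <= 1e-4  # rounding tolerance
    print(f"{group_i} vs {group_j}: centred={centred}, "
          f"excludes zero={excludes_zero(lower, upper)}, p={p}")
```

All five mean differences are positive, so in each listed pair the first (younger) group has the higher mean LOG (wwwhr + 1).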
Conclusions:
The interaction between generation and sex has no significant effect on the combined DV, while the main effects of generation (p-value < 0.001) and sex (p-value = 0.046) on the combined DV are statistically significant.
In the univariate analysis, neither generation, sex, nor their interaction is statistically significant for the DV log (emailhr + 1). For the DV log (wwwhr + 1), sex and the generation*sex interaction are not significant, but generation is statistically significant with p-value < 0.001.
The post hoc multiple comparisons show that the mean difference in LOG (wwwhr + 1) is statistically significant for the following pairs of groups: Gen Y and Silent Generation, Gen Y and G.I. Generation, Gen X and G.I. Generation, Younger Boomers and G.I. Generation, and Older Boomers and G.I. Generation.