# Understanding Univariate ANOVA analysis

Univariate ANOVA is an analysis of variance with a single dependent variable; in its simplest (one-way) form there is only one independent variable. It is used to compare mean differences between two or more groups. You can run a univariate ANOVA either through a compare-means procedure or through the general linear model.

## Statistics corrections

The appropriate research design is a 2-way, 3×2 factorial, between-subjects design. It is two-way because we have two independent variables. It is 3×2 because one independent variable has 3 levels and the other has 2 levels. It is between-subjects because both independent variables use independent samples.

The appropriate research design is a 4-way, 2×2×3×2 factorial, mixed design. It is 4-way because we have four independent variables, and 2×2×3×2 because three of the independent variables have 2 levels and one has 3 levels. It is a mixed design because one of the independent variables uses repeated samples and the other three use independent samples.

## T-test

a. An independent-samples t-test, because the independent variable has only two levels and the groups are independent. Moreover, the research question is a difference question and the dependent variable is continuous.
b. Ordinal logistic regression, because the question is an association question and the dependent variable is measured on an ordinal scale while the independent variable is measured on a normal/continuous scale.
c. Kendall’s tau-b or Spearman’s rho, because both variables are continuous but there is a significant violation of the normality assumption.

## Pearson inferential statistic

The Pearson correlation was selected to determine whether there is a statistically significant relationship between the number of hours spent wearing the Fitbit and older adults’ activity level (as measured in steps). Descriptive statistics for the number of hours spent wearing the Fitbit were N = 50, M = 12.94, median = 14.00, mode = 15, SD = 5.83, skewness = -.258, and range = 22.00. For older adults’ activity level (as measured in steps) they were N = 50, M = 7299.20, median = 8100.00, mode = 10000.00, SD = 3673.99, skewness = -.399, and range = 12900. No errors were found after checking the raw data against the data in the Data View: minimum and maximum values were within the range of the codebook, the M and SD appeared reasonable, and no missing data or outliers were identified. The correlation was statistically significant, r(48) = .88, p < .001. The direction of the correlation was positive, indicating that participants who wore the Fitbit for more hours tended to record more steps. Using Cohen’s (1988) guidelines, the effect size is much larger than typical for studies in this area. The r² indicated that approximately 77% of the variance in older adults’ activity level (as measured in steps) can be predicted from hours spent wearing the Fitbit.
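As a quick arithmetic check on the reported correlation, the following minimal sketch recomputes r² from r = .88 and derives the t statistic that corresponds to a Pearson r with 48 degrees of freedom. The variable names are illustrative, and the t formula is the standard textbook conversion rather than anything shown in the original output.

```python
import math

# Values reported in the write-up
r = 0.88   # Pearson correlation
df = 48    # degrees of freedom, n - 2 with n = 50

# Proportion of variance shared by the two variables
r_squared = r ** 2

# Standard conversion of r to a t statistic: t = r * sqrt(df / (1 - r^2))
t_stat = r * math.sqrt(df / (1 - r_squared))

print(f"r^2 = {r_squared:.4f}")   # about 0.77, the 77% quoted in the text
print(f"t({df}) = {t_stat:.2f}")  # a large t, consistent with p < .001
```

The recomputed r² of roughly 0.77 agrees with the "approximately 77%" stated in the interpretation.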

A Kruskal-Wallis test was used, because patient satisfaction is ordinal, to compare patient satisfaction among three care approaches in an outpatient oncology clinic (standard of care, GA only, and GA+ integration intervention). Descriptive statistics for the three levels were: standard of care (N = 20, M = 3.25, median = 3, mode = 3, SD = 0.85, skewness = .04, range = 3.00), GA only (N = 20, M = 2.5, median = 2.5, mode = 2, SD = 0.82, skewness = 0, range = 3.00), and GA+ integration intervention (N = 20, M = 4.35, median = 4, mode = 4, SD = 0.67, skewness = -.54, range = 2.00). No errors were found after checking the raw data against the data in the Data View, and minimum and maximum values were within the range of the codebook. There was a significant difference in patient satisfaction between the care approaches (H(2) = 29.796, p < 0.001), with a mean rank of 28.40 (N = 20) for standard of care, 17.10 (N = 20) for GA only, and 46 (N = 20) for GA+ integration intervention.

A post hoc test shows a significant difference in mean rank between GA only and GA+ integration intervention (p < 0.001) and between standard of care and GA+ integration intervention (p = 0.003), but no significant difference in mean rank between GA only and standard of care (p = 0.103).

Table 1: Comparison of patient satisfaction between care approaches

| Care approach | n | Mean rank |
| --- | --- | --- |
| Standard of care | 20 | 28.4 |
| GA only | 20 | 17.1 |
| GA+ integration intervention | 20 | 46 |

p-value: <0.0001
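The Kruskal-Wallis H statistic can be approximated directly from the group sizes and mean ranks reported for the three care approaches. The sketch below uses the standard formula without a tie correction; the function name is illustrative. Because satisfaction is ordinal and has many tied values, the software-reported H of 29.796 is presumably tie-corrected, so the uncorrected value computed here is expected to be somewhat smaller.

```python
def kruskal_wallis_h(group_sizes, mean_ranks):
    """Uncorrected Kruskal-Wallis H from group sizes and mean ranks.

    H = 12 / (N (N + 1)) * sum_i n_i * (mean_rank_i - (N + 1) / 2) ** 2
    No correction for ties is applied (an assumption of this sketch).
    """
    n_total = sum(group_sizes)
    grand_mean_rank = (n_total + 1) / 2
    between = sum(
        n * (mean_rank - grand_mean_rank) ** 2
        for n, mean_rank in zip(group_sizes, mean_ranks)
    )
    return 12 / (n_total * (n_total + 1)) * between


# Standard of care, GA only, GA+ integration intervention
sizes = [20, 20, 20]
mean_ranks = [28.40, 17.10, 46.0]

# Sanity check: the rank sums must add up to N (N + 1) / 2
rank_sum = sum(n * r for n, r in zip(sizes, mean_ranks))

h_uncorrected = kruskal_wallis_h(sizes, mean_ranks)
print(f"sum of ranks = {rank_sum:.0f}")
print(f"uncorrected H = {h_uncorrected:.2f}")
```

The rank sums add up to 1830, which is what 60 observations require, so the reported mean ranks are internally consistent.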

The median age is 72.5, with an interquartile range of 13.25. For gender, 46% are female and 54% are male. For the functional score, the median is 3 and the interquartile range is 1.25. Finally, for self-rated quality of life, the median is 0 and the interquartile range is 1.

A simple linear regression model was used to predict patients’ self-rated quality of life from age. Descriptive statistics for age were N = 50, M = 74, median = 72.5, mode = 67, SD = 8.08, skewness = .553, and range = 30.00; for patients’ self-rated quality of life they were N = 50, M = 0.68, median = 0, mode = 0, SD = 1.05, skewness = 1.763, and range = 4.00. No errors were found after checking the raw data against the data in the Data View, and minimum and maximum values were within the range of the codebook. The regression model was not significant (F(1, 48) = 0.121, p = 0.730, R² = 0.003). The fitted regression model is: predicted quality of life = 0.195 + 0.007 × age. A one-year increase in age is associated with a 0.007 increase in predicted self-rated quality of life, t(48) = 0.347, p = 0.730. Age was therefore not a significant predictor of patients’ self-rated quality of life, and the effect size was very small (R² = 0.003).
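The reported regression figures can be cross-checked against each other. In a simple linear regression the squared t statistic for the slope equals the model F, and R² can be recovered from F and the residual degrees of freedom. The sketch below applies those identities to the reported values; `predict_qol` is a hypothetical helper name, and the residual degrees of freedom of 48 are taken from F(1, 48).

```python
# Values reported in the write-up
intercept = 0.195
slope = 0.007
t_slope = 0.347      # t statistic for the age coefficient
f_model = 0.121      # model F with (1, 48) degrees of freedom
df_residual = 48


def predict_qol(age):
    """Predicted self-rated quality of life from the fitted line."""
    return intercept + slope * age


# Identity for a single predictor: t^2 equals F (up to rounding)
t_squared = t_slope ** 2

# R^2 recovered from F: R^2 = F / (F + df_residual) with one predictor
r_squared_from_f = f_model / (f_model + df_residual)

print(f"t^2 = {t_squared:.3f} vs F = {f_model}")
print(f"R^2 from F = {r_squared_from_f:.3f}")
print(f"prediction at the mean age of 74: {predict_qol(74):.3f}")
```

Both identities hold to rounding, and the prediction at the mean age is close to the observed mean of 0.68, as expected for a least-squares line computed from rounded coefficients.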

Kendall’s tau-b was used to determine whether gender is associated with functional score. Descriptive statistics were: functional score (N = 50, M = 2.76, median = 3, mode = 3, SD = 1.17, skewness = 0.33, range = 4.00) and gender (N = 50, male = 27 (54%), female = 23 (46%)). No errors were found after checking the raw data against the data in the Data View, and minimum and maximum values were within the range of the codebook. The result was not significant (tau = 0.444, p = 0.1). This means that there is no evidence that gender influences the patient’s functional score.

## Logistic Regression

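##### Odds ratio calculation

The 95% confidence interval for the odds ratio is calculated on the log scale as

Lower 95% CI = exp(ln(OR) − 1.96 × SE), Upper 95% CI = exp(ln(OR) + 1.96 × SE), where SE is the standard error of ln(OR).

exposure * foodp Crosstabulation (Count)

| Exposure | Food poisoning | No food poisoning | Total |
| --- | --- | --- | --- |
| Ate fish | 66 | 34 | 100 |
| Did not eat fish | 15 | 85 | 100 |
| Total | 81 | 119 | 200 |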

Variables in the Equation

| Step 1a | B | S.E. | Wald | df | Sig. | Exp(B) | 95% C.I. for Exp(B), lower | 95% C.I. for Exp(B), upper |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| exposure | 2.398 | .351 | 46.749 | 1 | .000 | 11.000 | 5.532 | 21.873 |
| Constant | -3.061 | .507 | 36.507 | 1 | .000 | .047 | | |

a. Variable(s) entered on step 1: exposure.

A logistic regression model was estimated to compare the odds of developing food poisoning between participants who ate fish at restaurant X and those who did not. A significantly higher proportion of participants who ate fish developed food poisoning (66%) than of participants who did not eat fish (15%). Participants who ate fish had 11 times the odds of developing food poisoning compared with participants who did not eat fish (OR = 11, p < 0.001, 95% CI [5.532, 21.873]). Thus, for the whole population, we expect the odds of developing food poisoning among those who ate fish to be between 5.532 and 21.873 times the odds among those who did not. Because the p-value is less than 0.001 and the hypothesized null value of 1 does not fall within the confidence interval, the odds ratio is statistically significant.
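The odds ratio and its confidence interval can be reproduced from the crosstabulation counts alone. The sketch below assumes the usual Wald interval on the log-odds scale, with the standard error taken as the square root of the summed reciprocal cell counts; the function name and the choice of z = 1.96 are assumptions of this example, not something stated in the output.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of ln(OR) from the four cell counts
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    lower = math.exp(log_or - z * se_log_or)
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, se_log_or, lower, upper


# Ate fish: 66 ill, 34 not ill; did not eat fish: 15 ill, 85 not ill
odds_ratio, se_log_or, lower, upper = odds_ratio_wald_ci(66, 34, 15, 85)

print(f"OR = {odds_ratio:.1f}")
print(f"ln(OR) = {math.log(odds_ratio):.3f}, SE = {se_log_or:.3f}")
print(f"95% CI = [{lower:.2f}, {upper:.2f}]")
```

The log odds ratio and its standard error agree with the B and S.E. values for exposure in the regression output (2.398 and .351), and the interval matches the reported 5.532 to 21.873 to within rounding.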

### Correlation

Journal Issue #: 61
First author: Bridie McCarthy
Number of authors: 8
Interdisciplinary team: Yes
Biostatistician or statistician listed as coauthor: No
Study design: quasi-experimental, one-group pre-post-test.

| Criteria | Yes/No | Remarks |
| --- | --- | --- |
| 1. Study contains a power analysis | No | A power analysis was neither mentioned nor carried out. The adequacy of the sample size and how it was arrived at were not indicated. |
| 2. Analytic approach is appropriate to the study design | Uncertain | The authors used an independent t-test and its non-parametric equivalent (Mann-Whitney U test), ANOVA and its non-parametric equivalent (Kruskal-Wallis test), as well as a linear regression model. Because no research question was stated, we cannot tell whether these are appropriate. The topic suggests a comparison of pre- and post-intervention coping strategies, but this is missing from the whole analysis. |
| 3. Analysis addresses each research question or hypothesis | No | No research question or hypothesis was formulated by the authors, which makes it difficult to evaluate the methods since we do not know what they set out to achieve. The only thing they stated is the aim, which is to investigate the impact of a psycho-educational intervention “Coping with Stressful Events” with first-year undergraduate nursing and midwifery students, but this is not enough; there ought to be specific research questions that point to the aim. |
| 4. Normality assumption of the data addressed | Yes | The authors addressed the non-normality of the data by presenting a non-parametric test result that is robust to non-normality. However, the non-parametric test used (Mann-Whitney U test) is not appropriate for paired samples. |
| 5. Level of data (categorical, ordinal, continuous) is identified | Yes | Though this is not explicitly stated, it is implied in the discussion of their research instrument. |
| 6. Statistical/analytical approach is described and appropriate to the level of data | No | The authors only mentioned their statistical approach; the reason it was adopted was not stated. Why an independent t-test? ANOVA? And linear regression? These questions were not answered, and the assumptions underlying the tests were not addressed. |
| 7. Both descriptive data and inferential statistics are reported | Partly | The authors reported all descriptive statistics as well as the results of the inferential analysis. However, the raw data were not presented. |
| 8. Description of analytic approach sufficient to replicate the analysis | Yes | The authors indicated which tests they ran and which variables were used in each test. I believe this is enough to replicate the work. |
| 9. Differentiates between clinical and statistical significance | No | The authors only describe the statistical significance and statistical implications of their study; clinical significance was not discussed. |

The paper did not meet two-thirds of the criteria proposed by Cohen et al. (2009). The first observation is that the research questions or hypotheses were not spelled out, and without them there is no way to evaluate whether what was done is appropriate. If I were to do this study, the first thing I would do is spell out the research questions or hypotheses.

From the topic, it is implied that the authors want to compare pre-intervention and post-intervention coping with stressful events, but they went on to compare gender, age group, and living arrangements separately for pre- and post-intervention. This departs somewhat from what the aim of the study implies. I would have done a paired t-test or its non-parametric equivalent (the Wilcoxon signed-rank test). Better still, if I wanted to study the effect of these demographic variables as well, I would have used a mixed design to cater for both the dependent and the independent groups. The authors did carry out a regression with the interaction of time and demographic variables, but this is not enough when the interaction is significant. With a mixed model, we would have been able to determine the simple effect of each independent variable, which is not possible in the regression model.
