Problem Description:
In this SPSS homework, we explore the relationships between High-Density Lipoprotein (HDL) cholesterol and various biomarkers. We collected data from 80 subjects, including BMI, AGE, GENDER, PULSE rate, SYSTOLIC blood pressure, DIASTOLIC blood pressure, High-Density Lipoprotein (HDL) cholesterol, and Low-Density Lipoprotein (LDL) cholesterol. The objective is to analyze the data and draw meaningful conclusions.
Solution
Suppose you conduct a study of the relationship between High-Density Lipoprotein (HDL) cholesterol and several biomarkers.
You collected the following measurements from 80 subjects (download the “BODY1.sav” data): BMI (kg/m²), AGE in years, GENDER (0 = female, 1 = male), PULSE rate (beats per minute), SYSTOLIC blood pressure (mm Hg), DIASTOLIC blood pressure (mm Hg), High-Density Lipoprotein (HDL) cholesterol (mg/dL), and Low-Density Lipoprotein (LDL) cholesterol (mg/dL).
Specifically, you should:
- Calculate the correlation between all continuous variables. Interpret your results.
- Group the age into three different AGE brackets “18-25”, “26-45” and “46 and above”. Test the claim that subjects in those AGE brackets have the same mean LDL.
- Test whether DIASTOLIC blood pressure and PULSE rate varied by GENDER. What are the null and alternative hypotheses?
- Use GENDER, AGE, BMI, DIASTOLIC blood pressure, SYSTOLIC blood pressure, and PULSE rate to predict LDL. Interpret the result and present the regression equation.
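The age grouping in the second task can be sketched in Python as follows. This is only an illustration of the recoding step, not the SPSS procedure used for the answers; the function name `age_bracket` and the sample ages are hypothetical, and the sketch assumes every subject is at least 18 years old.

```python
def age_bracket(age):
    """Map an age in years to one of the three brackets used in the ANOVA."""
    # Assumption: the youngest bracket starts at 18, so younger ages are rejected.
    if age < 18:
        raise ValueError("age is below the lowest bracket (18-25)")
    if age <= 25:
        return "18-25"
    if age <= 45:
        return "26-45"
    return "46 and above"


# Hypothetical ages for illustration only (not taken from BODY1.sav).
sample_ages = [18, 25, 26, 45, 46, 70]
brackets = [age_bracket(a) for a in sample_ages]
print(brackets)
```

The resulting labels play the role of the grouping factor in the one-way ANOVA on LDL.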
Answers:
i) The correlation table
Correlations
| | | AGE | PULSE | SYS | DIAS | HDL | LDL | BMI |
|---|---|---|---|---|---|---|---|---|
| AGE | Pearson Correlation | 1 | -.179 | .426** | .220* | -.170 | .386** | .204 |
| | Sig. (2-tailed) | | .112 | .000 | .050 | .131 | .000 | .069 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| PULSE | Pearson Correlation | -.179 | 1 | -.240* | -.141 | .255* | -.091 | .078 |
| | Sig. (2-tailed) | .112 | | .032 | .211 | .022 | .420 | .489 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| SYS | Pearson Correlation | .426** | -.240* | 1 | .191 | -.150 | .246* | .116 |
| | Sig. (2-tailed) | .000 | .032 | | .090 | .183 | .028 | .306 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| DIAS | Pearson Correlation | .220* | -.141 | .191 | 1 | -.273* | .258* | .179 |
| | Sig. (2-tailed) | .050 | .211 | .090 | | .014 | .021 | .113 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| HDL | Pearson Correlation | -.170 | .255* | -.150 | -.273* | 1 | -.245* | -.142 |
| | Sig. (2-tailed) | .131 | .022 | .183 | .014 | | .029 | .209 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| LDL | Pearson Correlation | .386** | -.091 | .246* | .258* | -.245* | 1 | .106 |
| | Sig. (2-tailed) | .000 | .420 | .028 | .021 | .029 | | .348 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| BMI | Pearson Correlation | .204 | .078 | .116 | .179 | -.142 | .106 | 1 |
| | Sig. (2-tailed) | .069 | .489 | .306 | .113 | .209 | .348 | |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
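The significantly correlated pairs (at the 0.05 level) are:
(AGE, SYS), (AGE, DIAS), (AGE, LDL), (PULSE, SYS), (PULSE, HDL), (SYS, LDL), (DIAS, HDL), (DIAS, LDL), and (HDL, LDL).
The strongest correlations are the positive ones between AGE and SYS (r = .426) and between AGE and LDL (r = .386), both significant at the 0.01 level. HDL is negatively correlated with DIAS and LDL and positively correlated with PULSE. BMI is not significantly correlated with any other variable. The AGE and DIAS correlation sits right at the threshold (p = .050).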
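As a rough cross-check on the significance flags, a Pearson correlation r from n subjects can be converted to a t statistic with n - 2 degrees of freedom, t = r·sqrt(n - 2)/sqrt(1 - r²). The sketch below applies this to three correlations from the table. The critical value 1.991 is an approximate table value for 78 degrees of freedom that is assumed here rather than computed, and `t_from_r` is a hypothetical helper name.

```python
import math

N = 80             # number of subjects, from the correlation table
T_CRIT_05 = 1.991  # assumed: approximate two-sided 5% critical value of t with 78 df


def t_from_r(r, n=N):
    """Convert a Pearson correlation to its t statistic with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Correlations copied from the table above.
pairs = {("AGE", "SYS"): 0.426, ("PULSE", "SYS"): -0.240, ("AGE", "BMI"): 0.204}

for pair, r in pairs.items():
    t = t_from_r(r)
    print(pair, round(t, 3), "significant" if abs(t) > T_CRIT_05 else "not significant")
```

Because the table reports r to three decimals, borderline cases such as AGE and DIAS (p = .050) cannot be settled reliably this way.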
ii) ANOVA result:
ANOVA (dependent variable: LDL)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 18008.017 | 2 | 9004.008 | 7.478 | .001 |
| Within Groups | 92711.933 | 77 | 1204.051 | | |
| Total | 110719.950 | 79 | | | |
Since p = .001 is below 0.05, we reject the null hypothesis of equal means: mean LDL is not the same across the three age groups (F(2, 77) = 7.478).
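The F statistic can be reproduced from the sums of squares and degrees of freedom in the ANOVA table. The short sketch below does that arithmetic. It also reports eta squared, the share of LDL variation explained by age group, which is an extra summary not shown in the SPSS output.

```python
# Values copied from the ANOVA table for LDL by age bracket.
ss_between, df_between = 18008.017, 2
ss_within, df_within = 92711.933, 77

ms_between = ss_between / df_between  # mean square between groups
ms_within = ss_within / df_within     # mean square within groups
f_stat = ms_between / ms_within

# Eta squared: between-group sum of squares as a share of the total (not in the SPSS output).
eta_squared = ss_between / (ss_between + ss_within)

print(round(f_stat, 3), round(eta_squared, 3))
```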
iii) Null hypothesis: Mean diastolic blood pressure is the same for males and females.
Alternative hypothesis: Mean diastolic blood pressure differs between males and females.
We use the independent two-sample t-test. Here is the SPSS output:
Group Statistics

| | Gender | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| DIAS | Female | 40 | 64.95 | 15.332 | 2.424 |
| | Male | 40 | 71.25 | 10.886 | 1.721 |

Independent Samples Test (F and Sig. are Levene's Test for Equality of Variances; the remaining columns are the t-test for Equality of Means)

| | | F | Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference, Lower | 95% CI of the Difference, Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| DIAS | Equal variances assumed | .783 | .379 | -2.119 | 78 | .037 | -6.300 | 2.973 | -12.219 | -.381 |
| | Equal variances not assumed | | | -2.119 | 70.352 | .038 | -6.300 | 2.973 | -12.229 | -.371 |
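Levene's test for equality of variances is not significant (F = .783, p = .379), so we use the "Equal variances assumed" row. The two-sample t-test is significant (t(78) = -2.119, p = .037). Thus there is sufficient evidence that mean diastolic blood pressure differs between males and females; in this sample, males average 6.3 mm Hg higher than females (71.25 vs. 64.95).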
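The t statistic and the Welch degrees of freedom can be reproduced from the Group Statistics table alone. The sketch below uses the standard pooled-variance and Welch-Satterthwaite formulas on the reported means and standard deviations; it is a check on the arithmetic, not a replacement for running the test on the raw data.

```python
import math

# Summary statistics copied from the Group Statistics table.
n_f, mean_f, sd_f = 40, 64.95, 15.332  # females
n_m, mean_m, sd_m = 40, 71.25, 10.886  # males

diff = mean_f - mean_m  # female minus male, matching the sign in the SPSS output

# Equal variances assumed: pooled variance and its standard error.
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / (n_f + n_m - 2)
se_pooled = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_pooled = diff / se_pooled

# Equal variances not assumed: Welch-Satterthwaite degrees of freedom.
v_f, v_m = sd_f**2 / n_f, sd_m**2 / n_m
welch_df = (v_f + v_m) ** 2 / (v_f**2 / (n_f - 1) + v_m**2 / (n_m - 1))

print(round(t_pooled, 3), round(se_pooled, 3), round(welch_df, 2))
```

With equal group sizes the pooled and Welch standard errors coincide, which is why both rows of the SPSS table show the same t value.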
iv) Here LDL is the dependent variable and the other six variables are the independent variables. Here are the model coefficients:
Coefficients

| Model | | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | 19.018 | 46.942 | | .405 | .687 |
| | AGE | .786 | .282 | .336 | 2.785 | .007 |
| | PULSE | .200 | .376 | .064 | .533 | .596 |
| | BMI | .096 | .636 | .017 | .151 | .880 |
| | Gender | 9.932 | 9.045 | .133 | 1.098 | .276 |
| | SYS | .166 | .239 | .082 | .695 | .489 |
| | DIAS | .396 | .312 | .144 | 1.269 | .208 |

a. Dependent Variable: LDL
The model ANOVA:

ANOVA

| Model | | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 22046.797 | 6 | 3674.466 | 3.025 | .011b |
| | Residual | 88673.153 | 73 | 1214.701 | | |
| | Total | 110719.950 | 79 | | | |

a. Dependent Variable: LDL
b. Predictors: (Constant), DIAS, PULSE, BMI, SYS, AGE, Gender
The ANOVA shows that the overall linear regression model is significant (F(6, 73) = 3.025, p = .011).
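The overall F statistic and the model's R² follow directly from the sums of squares in the model ANOVA table. The sketch below recomputes both. R² is not shown in the output above, so treat it as a derived figure.

```python
# Values copied from the regression ANOVA table.
ss_regression, df_regression = 22046.797, 6
ss_residual, df_residual = 88673.153, 73

ss_total = ss_regression + ss_residual
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# R squared: share of the variation in LDL explained by the six predictors.
r_squared = ss_regression / ss_total

print(round(f_stat, 3), round(r_squared, 3))
```

So although the model is statistically significant, the six predictors together explain only about a fifth of the variation in LDL.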
The regression equation is:
LDL = 19.018 + 0.786*AGE + 0.200*PULSE + 0.096*BMI + 9.932*Gender + 0.166*SYS + 0.396*DIAS
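Among these variables, only AGE is a significant predictor (p = .007): holding the other variables constant, each additional year of age is associated with an LDL increase of about 0.786 mg/dL. None of the other predictors is significant at the 0.05 level.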
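To show how the fitted equation is used, the sketch below plugs a subject's values into the coefficients from the Coefficients table (the B column). The function name `predict_ldl` and the example subject are hypothetical and are not drawn from BODY1.sav.

```python
# Unstandardized coefficients (B) copied from the Coefficients table.
INTERCEPT = 19.018
COEF = {
    "AGE": 0.786,
    "PULSE": 0.200,
    "BMI": 0.096,
    "GENDER": 9.932,  # 0 = female, 1 = male
    "SYS": 0.166,
    "DIAS": 0.396,
}


def predict_ldl(subject):
    """Return the predicted LDL (mg/dL) for a dict of predictor values."""
    return INTERCEPT + sum(COEF[name] * subject[name] for name in COEF)


# Hypothetical subject for illustration only.
subject = {"AGE": 40, "PULSE": 70, "BMI": 25, "GENDER": 1, "SYS": 120, "DIAS": 75}
print(round(predict_ldl(subject), 2))
```

Predictions like this are only meaningful for subjects whose values fall within the range observed in the sample.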