# Exploring the Relationships Between HDL Cholesterol and Biomarkers Using SPSS

In this analysis, we examine how High-Density Lipoprotein (HDL) cholesterol relates to other health biomarkers. The SPSS analysis uses data collected from 80 subjects: BMI, age, gender, pulse rate, systolic blood pressure, diastolic blood pressure, HDL cholesterol, and Low-Density Lipoprotein (LDL) cholesterol.

## Problem Description:

In this SPSS homework, we explore the relationships between High-Density Lipoprotein (HDL) cholesterol and various biomarkers. We collected data from 80 subjects, including BMI, AGE, GENDER, PULSE rate, SYSTOLIC blood pressure, DIASTOLIC blood pressure, High-Density Lipoprotein (HDL) cholesterol, and Low-Density Lipoprotein (LDL) cholesterol. The objective is to analyze the data and draw meaningful conclusions.

## Solution

Suppose you conduct a study of the relationship between High-Density Lipoprotein (HDL) cholesterol and several biomarkers.

You collected the following measurements from 80 subjects (download the “BODY1.sav” data): BMI (kg/m²), AGE in years, GENDER (0 = female, 1 = male), PULSE (pulse rate, beats per minute), SYSTOLIC (systolic blood pressure, mm Hg), DIASTOLIC (diastolic blood pressure, mm Hg), HDL (High-Density Lipoprotein cholesterol, mg/dL), and LDL (Low-Density Lipoprotein cholesterol, mg/dL).

Specifically, you should:

1. Calculate the correlation between all continuous variables. Interpret your results.
2. Group AGE into three brackets: “18-25”, “26-45”, and “46 and above”. Test the claim that subjects in those AGE brackets have the same mean LDL.
3. Test whether DIASTOLIC blood pressure and PULSE rate vary by GENDER. What are the null and alternative hypotheses?
4. Use GENDER, AGE, BMI, DIASTOLIC blood pressure, SYSTOLIC blood pressure, and PULSE rate to predict LDL. Interpret the result and present the regression equation.
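The age grouping in task 2 can be sketched outside SPSS as a small recoding function. This is a minimal illustration, not the recode used to produce the output below: it assumes ages are whole years and that no subject is younger than 18, and the `sample_ages` list is hypothetical rather than taken from BODY1.sav.

```python
def age_bracket(age):
    """Map an age in whole years to one of the three brackets from task 2."""
    if age < 18:
        # Assumption: the brackets start at 18, so younger ages are rejected.
        raise ValueError("brackets start at age 18")
    if age <= 25:
        return "18-25"
    if age <= 45:
        return "26-45"
    return "46 and above"


# Hypothetical ages chosen to hit every bracket boundary (not from BODY1.sav).
sample_ages = [18, 25, 26, 45, 46, 73]
brackets = [age_bracket(a) for a in sample_ages]
print(brackets)
```

The resulting bracket label plays the role of the grouping factor in the one-way ANOVA of LDL.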

i) The correlation table

Correlations

| | | AGE | PULSE | SYS | DIAS | HDL | LDL | BMI |
|---|---|---|---|---|---|---|---|---|
| AGE | Pearson Correlation | 1 | -.179 | .426** | .220* | -.170 | .386** | .204 |
| | Sig. (2-tailed) | | .112 | .000 | .050 | .131 | .000 | .069 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| PULSE | Pearson Correlation | -.179 | 1 | -.240* | -.141 | .255* | -.091 | .078 |
| | Sig. (2-tailed) | .112 | | .032 | .211 | .022 | .420 | .489 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| SYS | Pearson Correlation | .426** | -.240* | 1 | .191 | -.150 | .246* | .116 |
| | Sig. (2-tailed) | .000 | .032 | | .090 | .183 | .028 | .306 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| DIAS | Pearson Correlation | .220* | -.141 | .191 | 1 | -.273* | .258* | .179 |
| | Sig. (2-tailed) | .050 | .211 | .090 | | .014 | .021 | .113 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| HDL | Pearson Correlation | -.170 | .255* | -.150 | -.273* | 1 | -.245* | -.142 |
| | Sig. (2-tailed) | .131 | .022 | .183 | .014 | | .029 | .209 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| LDL | Pearson Correlation | .386** | -.091 | .246* | .258* | -.245* | 1 | .106 |
| | Sig. (2-tailed) | .000 | .420 | .028 | .021 | .029 | | .348 |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |
| BMI | Pearson Correlation | .204 | .078 | .116 | .179 | -.142 | .106 | 1 |
| | Sig. (2-tailed) | .069 | .489 | .306 | .113 | .209 | .348 | |
| | N | 80 | 80 | 80 | 80 | 80 | 80 | 80 |

**. Correlation is significant at the 0.01 level (2-tailed).

*. Correlation is significant at the 0.05 level (2-tailed).

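The significantly correlated pairs (flagged at the 0.05 level or better) are:

(AGE, SYS), (AGE, DIAS), (AGE, LDL), (PULSE, SYS), (PULSE, HDL), (SYS, LDL), (DIAS, HDL), (DIAS, LDL), and (HDL, LDL).

AGE, SYS, DIAS, and LDL are positively related to one another in these pairs, HDL is negatively related to DIAS and LDL and positively related to PULSE, and PULSE is negatively related to SYS. All of these correlations are weak to moderate; the largest is AGE with SYS (r = .426). BMI is not significantly correlated with any other variable.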
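As a rough cross-check on the significance flags, each Pearson coefficient can be converted to a t statistic with n - 2 degrees of freedom, t = r·sqrt(n - 2) / sqrt(1 - r²). The sketch below uses only the standard library and a few coefficients copied from the correlation table; the critical value 1.991 is an approximate two-tailed 5% cutoff for 78 degrees of freedom taken from standard t tables, which is an assumption of this example and not part of the SPSS output.

```python
import math

N = 80  # number of subjects, from the correlation table


def t_from_r(r, n=N):
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Assumption: approximate two-tailed 5% critical value of t with 78 df.
T_CRIT_05 = 1.991

# A few coefficients copied from the correlation table.
pairs = {
    ("AGE", "SYS"): 0.426,
    ("PULSE", "SYS"): -0.240,
    ("AGE", "BMI"): 0.204,
}

significant = {pair: abs(t_from_r(r)) > T_CRIT_05 for pair, r in pairs.items()}

for pair, r in pairs.items():
    print(pair, round(t_from_r(r), 3), significant[pair])
```

Pairs whose p-value sits right at the cutoff, such as (AGE, DIAS) with Sig. = .050, are sensitive to rounding of r, so the SPSS flags remain the reference for borderline cases.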

ii) ANOVA result:

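ANOVA (dependent variable: LDL)

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 18008.017 | 2 | 9004.008 | 7.478 | .001 |
| Within Groups | 92711.933 | 77 | 1204.051 | | |
| Total | 110719.950 | 79 | | | |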

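The one-way ANOVA is significant, F(2, 77) = 7.478, p = .001, so we reject the claim that the three age brackets have the same mean LDL: at least one bracket's mean differs from the others.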
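The F statistic in the one-way ANOVA follows directly from the sums of squares and degrees of freedom: each mean square is a sum of squares divided by its df, and F is the ratio of the between-groups mean square to the within-groups mean square. A minimal sketch of that arithmetic, using the values reported by SPSS:

```python
# Values copied from the one-way ANOVA output for LDL.
ss_between, df_between = 18008.017, 2
ss_within, df_within = 92711.933, 77

ms_between = ss_between / df_between  # mean square between groups
ms_within = ss_within / df_within     # mean square within groups
f_stat = ms_between / ms_within

# Consistency check: the two components should add up to the reported total.
ss_total = ss_between + ss_within

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
```

The p-value itself comes from the F distribution with (2, 77) degrees of freedom, which SPSS reports as .001.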

iii) Null hypothesis: Mean diastolic blood pressure is the same for males and females.

Alternative hypothesis: Mean diastolic blood pressure differs between males and females.

We use the two-sample (independent-samples) t-test. Here is the SPSS output:

Group Statistics

| | Gender | N | Mean | Std. Deviation | Std. Error Mean |
|---|---|---|---|---|---|
| DIAS | Female | 40 | 64.95 | 15.332 | 2.424 |
| | Male | 40 | 71.25 | 10.886 | 1.721 |

Independent Samples Test

| DIAS | Levene's F | Levene's Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI Lower | 95% CI Upper |
|---|---|---|---|---|---|---|---|---|---|
| Equal variances assumed | .783 | .379 | -2.119 | 78 | .037 | -6.300 | 2.973 | -12.219 | -.381 |
| Equal variances not assumed | | | -2.119 | 70.352 | .038 | -6.300 | 2.973 | -12.229 | -.371 |

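Because Levene's test for equality of variances is not significant (F = .783, p = .379), we use the "Equal variances assumed" row. The two-sample t-test is significant, t(78) = -2.119, p = .037, so there is sufficient evidence that mean diastolic blood pressure differs between females and males. Males average 6.3 mm Hg higher (71.25 vs. 64.95).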
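The equal-variances t statistic can be reproduced from the Group Statistics table alone, which is a useful sanity check on the output. The sketch below pools the two sample variances, computes the standard error of the difference in means, and divides the mean difference (female minus male, matching the sign in the SPSS output) by it.

```python
import math

# Summary statistics copied from the Group Statistics table for DIAS.
n_f, mean_f, sd_f = 40, 64.95, 15.332  # females
n_m, mean_m, sd_m = 40, 71.25, 10.886  # males

df = n_f + n_m - 2
# Pooled variance: weighted average of the two sample variances.
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / df
se_diff = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_stat = (mean_f - mean_m) / se_diff

print(df, round(se_diff, 3), round(t_stat, 3))
```

The result agrees with the reported standard error of 2.973 and t of -2.119 on 78 degrees of freedom.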

iv) Here LDL is the dependent variable and the other six variables are the independent variables. The model coefficients are:

Coefficients

| Model 1 | Unstandardized B | Unstandardized Std. Error | Standardized Beta | t | Sig. |
|---|---|---|---|---|---|
| (Constant) | 19.018 | 46.942 | | .405 | .687 |
| AGE | .786 | .282 | .336 | 2.785 | .007 |
| PULSE | .200 | .376 | .064 | .533 | .596 |
| BMI | .096 | .636 | .017 | .151 | .880 |
| Gender | 9.932 | 9.045 | .133 | 1.098 | .276 |
| SYS | .166 | .239 | .082 | .695 | .489 |
| DIAS | .396 | .312 | .144 | 1.269 | .208 |

a. Dependent Variable: LDL

The model ANOVA:

ANOVA

| Model 1 | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Regression | 22046.797 | 6 | 3674.466 | 3.025 | .011b |
| Residual | 88673.153 | 73 | 1214.701 | | |
| Total | 110719.950 | 79 | | | |

a. Dependent Variable: LDL

b. Predictors: (Constant), DIAS, PULSE, BMI, SYS, AGE, Gender

The ANOVA shows that the overall linear regression model is significant, F(6, 73) = 3.025, p = .011.
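The same sums of squares also give the model's R², the share of the total variation in LDL that the six predictors account for. R² is not shown in the output above, so the figure below is derived here from the ANOVA table and is not copied from SPSS. A minimal sketch:

```python
# Values copied from the regression ANOVA table.
ss_regression, df_regression = 22046.797, 6
ss_residual, df_residual = 88673.153, 73

ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual
f_stat = ms_regression / ms_residual

# R squared: regression sum of squares as a fraction of the total.
r_squared = ss_regression / (ss_regression + ss_residual)

print(round(f_stat, 3), round(r_squared, 3))
```

An R² of about 0.20 means the model, although significant overall, leaves roughly 80% of the variation in LDL unexplained.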

The regression equation is:

LDL = 19.018 + 0.786*AGE + 0.200*PULSE + 0.096*BMI + 9.932*Gender + 0.166*SYS + 0.396*DIAS

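Among the six predictors, only AGE is significant (B = 0.786, p = .007): holding the other variables constant, each additional year of age is associated with an LDL about 0.79 mg/dL higher.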
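To see how the fitted equation is used, the sketch below plugs one subject's values into it, with the intercept and slopes copied from the Coefficients table (B column). The `example` subject is hypothetical and does not come from BODY1.sav; GENDER is coded 0 = female, 1 = male as in the data description.

```python
# Unstandardized coefficients (B) copied from the Coefficients table.
INTERCEPT = 19.018
COEFFICIENTS = {
    "AGE": 0.786,
    "PULSE": 0.200,
    "BMI": 0.096,
    "GENDER": 9.932,  # 0 = female, 1 = male
    "SYS": 0.166,
    "DIAS": 0.396,
}


def predict_ldl(subject):
    """Predicted LDL (mg/dL) for a dict holding all six predictor values."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * subject[name] for name in COEFFICIENTS
    )


# Hypothetical subject, for illustration only (not from BODY1.sav).
example = {"AGE": 40, "PULSE": 70, "BMI": 25.0, "GENDER": 1, "SYS": 120, "DIAS": 75}
predicted = predict_ldl(example)
print(round(predicted, 2))
```

Because only AGE is a significant predictor, predictions from this equation should be read as rough estimates.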