Problem Description:
In this SPSS homework, we explore the relationships between High-Density Lipoprotein (HDL) cholesterol and several biomarkers. The data come from 80 subjects and include BMI, AGE, GENDER, PULSE rate, SYSTOLIC blood pressure, DIASTOLIC blood pressure, High-Density Lipoprotein (HDL) cholesterol, and Low-Density Lipoprotein (LDL) cholesterol. The objective is to analyze the data and draw meaningful conclusions.
Solution
Suppose you conduct a study of the relationship between High-Density Lipoprotein (HDL) cholesterol and several biomarkers.
You collected the following measurements from 80 subjects (download the “BODY1.sav” data): BMI (kg/m²), AGE in years, GENDER (0 = female and 1 = male), PULSE rate (beats per minute), SYSTOLIC blood pressure (mm Hg), DIASTOLIC blood pressure (mm Hg), High-Density Lipoprotein (HDL) cholesterol (mg/dL), and Low-Density Lipoprotein (LDL) cholesterol (mg/dL).
Specifically, you should:
i) Calculate the correlation between all continuous variables. Interpret your results.
ii) Group the subjects into three AGE brackets: “18-25”, “26-45” and “46 and above”. Test the claim that subjects in those AGE brackets have the same mean LDL.
iii) Test whether DIASTOLIC blood pressure and PULSE rate vary by GENDER. What are the null and alternative hypotheses?
iv) Use GENDER, AGE, BMI, DIASTOLIC blood pressure, SYSTOLIC blood pressure, and PULSE rate to predict LDL. Interpret the result and present the regression equation.
Answers:
i) The correlation table
Correlations

                              AGE     PULSE   SYS     DIAS    HDL     LDL     BMI
AGE    Pearson Correlation    1       .179    .426**  .220*   .170    .386**  .204
       Sig. (2-tailed)                .112    .000    .050    .131    .000    .069
       N                      80      80      80      80      80      80      80
PULSE  Pearson Correlation    .179    1       .240*   .141    .255*   .091    .078
       Sig. (2-tailed)        .112            .032    .211    .022    .420    .489
       N                      80      80      80      80      80      80      80
SYS    Pearson Correlation    .426**  .240*   1       .191    .150    .246*   .116
       Sig. (2-tailed)        .000    .032            .090    .183    .028    .306
       N                      80      80      80      80      80      80      80
DIAS   Pearson Correlation    .220*   .141    .191    1       .273*   .258*   .179
       Sig. (2-tailed)        .050    .211    .090            .014    .021    .113
       N                      80      80      80      80      80      80      80
HDL    Pearson Correlation    .170    .255*   .150    .273*   1       .245*   .142
       Sig. (2-tailed)        .131    .022    .183    .014            .029    .209
       N                      80      80      80      80      80      80      80
LDL    Pearson Correlation    .386**  .091    .246*   .258*   .245*   1       .106
       Sig. (2-tailed)        .000    .420    .028    .021    .029            .348
       N                      80      80      80      80      80      80      80
BMI    Pearson Correlation    .204    .078    .116    .179    .142    .106    1
       Sig. (2-tailed)        .069    .489    .306    .113    .209    .348
       N                      80      80      80      80      80      80      80
**. Correlation is significant at the 0.01 level (2-tailed).
*. Correlation is significant at the 0.05 level (2-tailed).
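The significantly correlated pairs (flagged in the table above) are:
(AGE, SYS), (AGE, DIAS), (AGE, LDL), (PULSE, SYS), (PULSE, HDL), (SYS, LDL), (DIAS, HDL), (DIAS, LDL), and (HDL, LDL). Of these, (AGE, SYS) and (AGE, LDL) are significant at the 0.01 level; the rest are significant at the 0.05 level. All of the correlations are weak to moderate (the largest is .426, between AGE and SYS), and BMI is not significantly correlated with any other variable.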
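The Sig. (2-tailed) values in the correlation table can be checked by hand, because a Pearson correlation r from n subjects converts to a t statistic with n - 2 degrees of freedom: t = r * sqrt(n - 2) / sqrt(1 - r^2). The sketch below is not SPSS output; it is a minimal standard-library Python check in which the function names (`t_pdf`, `two_tailed_p`, `r_to_t`) are illustrative, and the p-value is approximated by numerically integrating the t density rather than by calling a statistics package.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value: 1 - 2 * (area under the density from 0 to |t|).

    The area is approximated with Simpson's rule; steps must be even.
    """
    b = abs(t)
    if b == 0.0:
        return 1.0
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1.0 - 2.0 * area)

def r_to_t(r, n):
    """t statistic for testing that a Pearson correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

n = 80  # number of subjects, from the problem description
# Correlations copied from the table above (rounded to three decimals).
for pair, r in [("AGE-SYS", 0.426), ("PULSE-SYS", 0.240), ("AGE-BMI", 0.204)]:
    t = r_to_t(r, n)
    p = two_tailed_p(t, n - 2)
    print(f"{pair}: r = {r:.3f}, t = {t:.3f}, p = {p:.3f}")
```

Because the correlations in the table are rounded to three decimals, the recomputed p-values can differ from the SPSS values in the last digit.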
ii) ANOVA result:
ANOVA
LDL

                  Sum of Squares   df   Mean Square   F       Sig.
Between Groups    18008.017        2    9004.008      7.478   .001
Within Groups     92711.933        77   1204.051
Total             110719.950       79
Since F(2, 77) = 7.478 with p = .001 < .05, we reject the claim of equal means: the mean LDL is not the same across the three age groups.
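The arithmetic behind the ANOVA table can be reproduced from the sums of squares alone. The following is a minimal Python sketch, not the SPSS procedure itself: the `age_bracket` helper is a hypothetical illustration of the recoding described in the task (it assumes every subject is at least 18), and the eta-squared effect size is an extra quantity that SPSS did not print in the output above.

```python
def age_bracket(age):
    """Hypothetical recode of AGE into the three brackets named in the task.

    Assumes all subjects are 18 or older.
    """
    if age <= 25:
        return "18-25"
    if age <= 45:
        return "26-45"
    return "46 and above"

# Sums of squares and degrees of freedom copied from the ANOVA table.
ss_between, df_between = 18008.017, 2
ss_within, df_within = 92711.933, 77

ms_between = ss_between / df_between   # mean square between groups
ms_within = ss_within / df_within      # mean square within groups
f_stat = ms_between / ms_within        # F ratio tested against F(2, 77)

# Share of the total LDL variation explained by the age brackets.
eta_squared = ss_between / (ss_between + ss_within)

print(f"MS between = {ms_between:.3f}, MS within = {ms_within:.3f}")
print(f"F = {f_stat:.3f}, eta squared = {eta_squared:.3f}")
```

The degrees of freedom follow from the design: 3 groups give 3 - 1 = 2 between-group degrees of freedom, and 80 subjects give 80 - 3 = 77 within-group degrees of freedom.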
iii) Null hypothesis: Mean diastolic pressure is the same for males and females.
Alternative hypothesis: Mean diastolic pressure differs between males and females.
We use the two-sample t-test. Here is the SPSS output:
Group Statistics

        Gender   N    Mean    Std. Deviation   Std. Error Mean
DIAS    Female   40   64.95   15.332           2.424
        Male     40   71.25   10.886           1.721

Independent Samples Test (DIAS)

                                           Equal variances   Equal variances
                                           assumed           not assumed
Levene's Test for Equality of Variances
  F                                        .783
  Sig.                                     .379
t-test for Equality of Means
  t                                        -2.119            -2.119
  df                                       78                70.352
  Sig. (2-tailed)                          .037              .038
  Mean Difference                          -6.300            -6.300
  Std. Error Difference                    2.973             2.973
  95% Confidence Interval, Lower           -12.219           -12.229
  95% Confidence Interval, Upper           -.381             -.371
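Because Levene's test for equality of variances is not significant (F = .783, p = .379), we use the "Equal variances assumed" results. The two-sample t-test is significant (|t| = 2.119, df = 78, p = .037 < .05). Thus there is sufficient evidence to reject the null hypothesis and conclude that mean diastolic pressure differs between males (mean 71.25) and females (mean 64.95).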
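Both rows of the Independent Samples Test can be reconstructed from the Group Statistics table. The sketch below is a hand check in plain Python rather than SPSS syntax: it computes the pooled-variance t statistic, the Welch (unequal variances) degrees of freedom, and the 95% confidence interval. The critical value 1.9908 is an assumption taken from a standard t table for 78 degrees of freedom, not a number from the SPSS output.

```python
import math

# Group statistics for DIAS, copied from the SPSS output.
n_f, mean_f, sd_f = 40, 64.95, 15.332   # females (Gender = 0)
n_m, mean_m, sd_m = 40, 71.25, 10.886   # males   (Gender = 1)

diff = mean_f - mean_m  # SPSS reports first group minus second group

# Equal variances assumed: pool the two sample variances.
df_pooled = n_f + n_m - 2
pooled_var = ((n_f - 1) * sd_f**2 + (n_m - 1) * sd_m**2) / df_pooled
se_pooled = math.sqrt(pooled_var * (1 / n_f + 1 / n_m))
t_pooled = diff / se_pooled

# Equal variances not assumed: Welch-Satterthwaite degrees of freedom.
v_f, v_m = sd_f**2 / n_f, sd_m**2 / n_m
se_welch = math.sqrt(v_f + v_m)
df_welch = (v_f + v_m) ** 2 / (v_f**2 / (n_f - 1) + v_m**2 / (n_m - 1))

# 95% confidence interval for the mean difference (pooled case).
t_crit = 1.9908  # assumed two-tailed 5% critical value for df = 78
ci_low = diff - t_crit * se_pooled
ci_high = diff + t_crit * se_pooled

print(f"difference = {diff:.3f}, SE = {se_pooled:.3f}, t = {t_pooled:.3f}")
print(f"Welch SE = {se_welch:.3f}, Welch df = {df_welch:.3f}")
print(f"95% CI = ({ci_low:.3f}, {ci_high:.3f})")
```

With equal group sizes the pooled and Welch standard errors coincide, which is why the two rows of the SPSS table share the same t and differ only in their degrees of freedom.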
iv) Here LDL is the dependent variable and the other six variables are the independent variables. The model coefficients are:
Coefficients

                 Unstandardized Coefficients   Standardized Coefficients
Model            B         Std. Error          Beta                        t       Sig.
1  (Constant)    19.018    46.942                                          .405    .687
   AGE           .786      .282                .336                        2.785   .007
   PULSE         .200      .376                .064                        .533    .596
   BMI           .096      .636                .017                        .151    .880
   Gender        9.932     9.045               .133                        1.098   .276
   SYS           .166      .239                .082                        .695    .489
   DIAS          .396      .312                .144                        1.269   .208
a. Dependent Variable: LDL
The model ANOVA:
ANOVA

Model            Sum of Squares   df   Mean Square   F       Sig.
1  Regression    22046.797        6    3674.466      3.025   .011b
   Residual      88673.153        73   1214.701
   Total         110719.950       79
a. Dependent Variable: LDL
b. Predictors: (Constant), DIAS, PULSE, BMI, SYS, AGE, Gender
The model ANOVA shows that the linear regression model is significant overall (F = 3.025, p-value = 0.011).
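The model ANOVA also says how much of the variation in LDL the six predictors explain. The sketch below derives R squared and adjusted R squared from the sums of squares in the table; these two values are computed here and were not part of the SPSS output shown, so treat them as a supplementary check rather than reported results.

```python
# Sums of squares and degrees of freedom copied from the model ANOVA table.
ss_regression, df_regression = 22046.797, 6
ss_residual, df_residual = 88673.153, 73
ss_total = ss_regression + ss_residual
df_total = df_regression + df_residual

# Overall F test of the model: regression mean square over residual mean square.
f_stat = (ss_regression / df_regression) / (ss_residual / df_residual)

# Proportion of LDL variation explained by the six predictors.
r_squared = ss_regression / ss_total

# Adjusted R squared penalises the number of predictors in the model.
adj_r_squared = 1 - (ss_residual / df_residual) / (ss_total / df_total)

print(f"F = {f_stat:.3f}")
print(f"R squared = {r_squared:.3f}, adjusted R squared = {adj_r_squared:.3f}")
```

A significant F test together with a modest R squared means the model does better than predicting the mean LDL for everyone, but most of the variation in LDL remains unexplained.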
The regression equation, using the unstandardized coefficients (B) from the Coefficients table, is:
LDL = 19.018 + 0.786*AGE + 0.200*PULSE + 0.096*BMI + 9.932*Gender + 0.166*SYS + 0.396*DIAS
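Among the six predictors, only AGE is significant (t = 2.785, p = .007): each additional year of age is associated with a 0.786 mg/dL higher predicted LDL, holding the other variables fixed. None of the other predictors is significant at the 0.05 level.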
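To make the equation concrete, the sketch below recomputes each t ratio as B divided by its standard error, lists the predictors with Sig. below .05, and evaluates the equation for one subject. The coefficients are copied from the Coefficients table; the subject's values are hypothetical and chosen only for illustration, and `predict_ldl` is an illustrative helper name.

```python
# Rows copied from the Coefficients table: (B, Std. Error, Sig.).
coefficients = {
    "(Constant)": (19.018, 46.942, 0.687),
    "AGE": (0.786, 0.282, 0.007),
    "PULSE": (0.200, 0.376, 0.596),
    "BMI": (0.096, 0.636, 0.880),
    "Gender": (9.932, 9.045, 0.276),
    "SYS": (0.166, 0.239, 0.489),
    "DIAS": (0.396, 0.312, 0.208),
}

# Each t statistic in the table is the coefficient divided by its standard error.
t_ratios = {name: b / se for name, (b, se, _) in coefficients.items()}

# Predictors (excluding the constant) that are significant at the 0.05 level.
significant = [name for name, (_, _, sig) in coefficients.items()
               if name != "(Constant)" and sig < 0.05]

def predict_ldl(subject):
    """Evaluate the fitted regression equation for one subject."""
    total = coefficients["(Constant)"][0]
    for name, value in subject.items():
        total += coefficients[name][0] * value
    return total

# Hypothetical subject, not taken from BODY1.sav.
hypothetical_subject = {"AGE": 40, "PULSE": 70, "BMI": 25,
                        "Gender": 1, "SYS": 120, "DIAS": 75}

for name, t in t_ratios.items():
    print(f"{name}: t = {t:.3f}")
print("Significant predictors:", significant)
print(f"Predicted LDL: {predict_ldl(hypothetical_subject):.2f} mg/dL")
```

The recomputed t ratios agree with the table up to rounding of B and the standard errors.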