Parsimonious model
The dataset consists of 11,627 observations. The mean age of the sample is 54.79 years and the median is 54 years, with a standard deviation of 9.56 years. The youngest subject is 32 years old and the oldest is 81. The mean systolic BP is 136.32 with a standard deviation of 22.80. The median systolic BP is 132, the lowest value is 83.65, and the highest is 295. The mean time to death is 21.52 years with a standard deviation of 4.90 years. The median time to death is 24.02 years. The shortest time to death is 0.07 years and the longest is 24.02 years.
The result below shows that 92.76% of the observations have no previous coronary heart disease, while 7.24% do. By education level, 41.39% have 0-11 years of schooling, 30.09% have a high school diploma, 16.63% have some college, and 11.89% have more than college.
|Education level|Frequency|Percent|Cumulative percent|
|High school diploma|3,410|30.09|71.48|
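The cumulative column in the surviving table row can be checked against the percentages quoted in the text. The following is a minimal sketch, assuming the four education levels are ordered as listed above; the label "0-11 years" is read from the text, and the function name is illustrative only.

```python
# Percentages reported in the text for each education level (in table order).
percent = {
    "0-11 years": 41.39,
    "High school diploma": 30.09,
    "Some college": 16.63,
    "More than college": 11.89,
}


def cumulative_percent(pcts):
    """Return running totals, rounded to two decimals like the table."""
    running = 0.0
    out = {}
    for level, p in pcts.items():
        running += p
        out[level] = round(running, 2)
    return out


cum = cumulative_percent(percent)
for level, c in cum.items():
    # Columns: level, percent, cumulative percent.
    print(f"{level:<22}{percent[level]:>8.2f}{c:>8.2f}")
```

The running total for the high school diploma row should agree with the 71.48 shown in the table, and the last row should reach 100.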
We check the log-linearity of the continuous covariates (age and systolic BP) in the model. The plot below shows that age satisfies the log-linearity assumption, so there is no need to transform it.
Figure 1: log-linearity test for age
Similarly, the plot for systolic BP below is also linear, which suggests that it satisfies the log-linearity assumption and does not need any transformation.
There is a significant interaction of previous coronary heart disease (prevchd) with both continuous variables, “age” and “sysbp”.
The result of the proportional hazards assumption test is shown below. The assumption is not met: the global test gives chi2(5)=103.42, p<0.001, so we reject the null hypothesis of proportional hazards. The variables that violate the assumption are age, systolic BP, and prevchd. Only the interactions satisfy the assumption.
To solve this, we include time-varying terms for age, systolic BP, and prevchd. Testing the proportional hazards assumption in the newly estimated model, we see that the global chi2(7)=3.89, p=0.7924, which means that we cannot reject the null hypothesis of proportional hazards. Moreover, all the variables have non-significant p-values, which means every variable satisfies the proportional hazards assumption. The result is shown below.
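The p-values attached to the two global tests follow from the upper tail of the chi-square distribution. The following is a minimal sketch using only the standard library; `chi2_sf` is a hypothetical helper written for this illustration, not a function from the software that produced the results, and it handles integer degrees of freedom only.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square with integer df."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    if df % 2 == 0:
        # Even df: closed-form finite sum.
        terms = (h ** j / math.factorial(j) for j in range(df // 2))
        return math.exp(-h) * sum(terms)
    # Odd df: complementary error function plus a finite sum.
    terms = (
        h ** (j - 0.5) / math.gamma(j + 0.5)
        for j in range(1, (df - 1) // 2 + 1)
    )
    return math.erfc(math.sqrt(h)) + math.exp(-h) * sum(terms)


# Global tests reported in the text: before and after adding time-varying terms.
p_before = chi2_sf(103.42, 5)
p_after = chi2_sf(3.89, 7)
print(f"chi2(5)=103.42 -> p = {p_before:.3g}")
print(f"chi2(7)=3.89   -> p = {p_after:.4f}")
```

The first value should be far below 0.001 and the second should be close to the reported 0.7924; a small difference is expected because the test statistic is quoted to two decimals.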
Using the Cox-Snell residuals to test the goodness of fit of the model, we see that the two lines (the cumulative hazard of the residuals and the reference line) intersect only at time 0, which calls the fit of the model into question; that is, the model does not fit well.
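The idea behind the Cox-Snell check can be sketched as follows: if the model fits, the residuals behave like censored draws from a unit exponential distribution, so their Nelson-Aalen cumulative hazard should track the 45-degree line. This is a minimal sketch under that assumption; the residuals and death indicators below are made-up toy values, not output from the fitted model, and the function names are illustrative.

```python
from collections import Counter


def nelson_aalen(values, events):
    """Nelson-Aalen cumulative hazard at each distinct event value.

    values: Cox-Snell residuals, treated as the 'time' axis
    events: 1 if the death was observed, 0 if censored
    """
    deaths = Counter(v for v, e in zip(values, events) if e == 1)
    cum_hazard = 0.0
    curve = []
    for v in sorted(deaths):
        # Subjects still at risk just before v are those with value >= v.
        at_risk = sum(1 for u in values if u >= v)
        cum_hazard += deaths[v] / at_risk
        curve.append((v, cum_hazard))
    return curve


def max_gap_from_diagonal(curve):
    """Largest vertical distance between the curve and the 45-degree line."""
    return max(abs(h - r) for r, h in curve)


# Hypothetical residuals and death indicators, for illustration only.
toy_residuals = [0.05, 0.10, 0.20, 0.35, 0.50, 0.80, 1.20, 1.90]
toy_events = [1, 0, 1, 1, 0, 1, 1, 1]

curve = nelson_aalen(toy_residuals, toy_events)
for r, h in curve:
    print(f"residual={r:.2f}  cumulative hazard={h:.3f}")
print("max gap from 45-degree line:", round(max_gap_from_diagonal(curve), 3))
```

A curve that stays close to the diagonal supports the fit; one that touches it only at the origin, as described above, does not.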
From the result presented in g, we see that age is associated with an increased risk of death from heart disease. A one-year increase in age increases the risk by 2.8%, and this effect is significant (p<0.001). Systolic blood pressure is associated with a slight reduction in the risk of death from CHD. A one-unit increase in systolic blood pressure reduces the risk by 0.03%, and this effect is significant (p<0.001). A history of previous CHD is associated with increased risk. A person with previous CHD has a risk of death 9.2% higher than a person without previous CHD (p<0.001). The interactions of previous CHD with age and with systolic BP are both significant.
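The percentage statements above correspond to hazard ratios through the relation percent change = (HR - 1) x 100. The following is a minimal sketch of that arithmetic; it takes the reported percentages at face value and ignores the interaction and time-varying terms, so the 10-year figure is only an illustration of how a per-year hazard ratio compounds. The function names are hypothetical.

```python
import math


def percent_to_hazard_ratio(pct_change):
    """Convert a percent change in hazard into a hazard ratio."""
    return 1.0 + pct_change / 100.0


def hazard_ratio_for_difference(hr_per_unit, units):
    """Hazard ratios multiply, so a difference of `units` compounds the ratio."""
    return hr_per_unit ** units


# Percent changes as reported in the text.
hr_age = percent_to_hazard_ratio(2.8)       # per year of age
hr_sysbp = percent_to_hazard_ratio(-0.03)   # per unit of systolic BP
hr_prevchd = percent_to_hazard_ratio(9.2)   # previous CHD vs none

beta_age = math.log(hr_age)                 # implied Cox coefficient for age
hr_age_10y = hazard_ratio_for_difference(hr_age, 10)

print(f"age: HR={hr_age:.4f}, coefficient={beta_age:.4f}")
print(f"sysbp: HR={hr_sysbp:.4f}")
print(f"prevchd: HR={hr_prevchd:.4f}")
print(f"10-year age difference (illustrative): HR={hr_age_10y:.3f}")
```

Because hazards multiply, a 10-year difference in age is not simply 10 x 2.8% but the per-year ratio raised to the tenth power.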
The survival plots for ages 40 and 60 years are presented in h. The plot shows that the risk is the same up to time 5, after which the age-60 curve falls more quickly, signifying an increasing risk of death in the later time period. This supports the estimate that higher age increases the risk of death from CHD.