One-Sample T-Test
Homework Assignment SPSS Exercises
The questionnaire used to collect the data for the survey is in our textbook – Avery Fitness Center. You will need it to define the labels in SPSS.
Questions:
 Get freq, mean, std and sem for each variable. Tell for which variables the mean, std and sem are “meaningful” and give the interpretation of each.
Variable  Mean  Std  SEM  Can the mean of this type of variable be interpreted? (Y/N)  
Weight  
Classes  
Circuit  
Station  
Pool  
Visits  
Daypart  
Doctor  
Enjoy  
Age  
Gender 
 Using only the standard deviation for each of the Importance variables (in survey, how important …), which variable had the greatest amount of agreement? List these four variables in table below in order of most to least agreement.
Importance Variables  Standard Deviation 
Create and present a Frequency table to present your answers to the following questions:
 What percentage of the respondents who answered the Gender question are male?
 What percentage of everyone who took the survey are female?
Create and present a single Frequency table using the variable Income and answer the following questions:
 What percentage of the respondents who answered the question make over $120,000 per year?
 What percentage of the respondents who answered the question reported making over $60,000 per year?
 What percentage of the respondents who answered the question reported making $30,000 or less per year?
 What percentage of the survey respondents reported making between $45,001 and $60,000 per year?
Create and present a Histogram with a normal curve (can use the SPSS graph) using the variable Age and answer the following questions:
 What are the mean, standard deviation, and count for age?
 What are the upper and lower boundaries (i.e., ages) of the normal distribution? How did you calculate these numbers?
 Identify (by specific ages) any outliers (if any)?
 If there are outliers, what do you recommend be done with them and why?
Create and present a Frequency table using the variable Gender:
 Based on the percentages of males, calculate the sampling error for the proportion using the formula from our book. Be sure to show and explain the numbers you used. Also be sure to show the resultant confidence interval.
Create and present a single table that lists the following:
 Percentages and counts for each category of the four continuous variables: General Health/Fitness, Social Aspects, Physical Enjoyment, and Specific Medical Concerns;
 The top two boxes for each of these variables;
Create a single table to compare the means between:
 The pairs of all of these four continuous Importance variables (General Fitness; Social Aspects; Physical Enjoyment; Specific Medical Concerns). Explain if there are/are not significant differences between each pair of variables.
 List all the variables in the table in the order of most important to least important (be sure to show why/how you determined the level of importance). Be sure to show the ranking numbers (e.g., 1,2,3,4).
Run a One-Sample T-Test and present a table to determine:
 If the average number of monthly visits (i.e., the variable Visits) is significantly different from the national average of eight. Interpret and explain your relevant results. Be sure to report the mean difference, t-value, degrees of freedom, and significance level.
Create and present a Crosstabulation table of the variables Pool and Doctor.
 What percentage of the total sample utilized the therapy pool?
 What percentage of those who used the therapy pool were recommended by a doctor?
 What percentage of those recommended by a doctor utilized the therapy pool?
 Are the results significant?
 How strongly, if at all, are the variables associated with each other?
Show in a table:
 The comparison of the means between the number of Visits and whether people had Utilized the exercise circuit. Explain, and show, if the means are significantly different from each other.
Run and interpret a correlation analysis and create a single table that:
 Uses the four Importance variables (General Fitness; Social Aspects; Physical Enjoyment; Specific Medical Concerns) showing the correlations and which are significant.
 Replace the diagonal values with the respective means in the table.
 Interpret the table.
Recommendations
 Based on your analysis of ALL the data in this assignment, write an Executive Summary of your findings with clear managerial recommendations (200-300 words).
Solution
Answers:
Variable  Mean  Std  SEM  Can the mean of this type of variable be interpreted? (Y/N)  
weight  0.32  0.465  0.022  No 
classes  0.26  0.440  0.021  No 
circuit  0.22  0.415  0.020  No 
station  0.12  0.325  0.015  No 
pool  0.45  0.498  0.023  No 
visits  14.20  7.733  0.387  Yes 
daypart  1.30  0.549  0.027  No 
doctor  0.26  0.439  0.021  No 
enjoy  3.91  1.090  0.055  Yes 
age  62.56  19.630  0.937  Yes 
gender  1.79  0.405  0.019  No 
The mean can be interpreted for quantitative variables only. In our case, we can interpret the means of visits, enjoy, and age:
 For the 400 persons who answered the question “Number of visits to AFC in previous 30 days”, the mean number of visits in the previous 30 days is 14.20, with a standard deviation of 7.733 and a standard error of the mean of 0.387.
 The mean score for physical enjoyment is 3.91 (standard deviation 1.090, standard error 0.055), meaning that physical enjoyment is an important reason for participating in AFC programs.
 For the 439 persons who answered the question related to age, the mean age is 62.56 years, with a standard deviation of 19.630 and a standard error of the mean of 0.937.
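The SEM column can be checked by hand from the standard deviation and the number of valid answers. The sketch below assumes the usual formula SEM = Std / sqrt(n) and uses the two sample sizes quoted in the interpretation above (400 for visits, 439 for age); the function and variable names are illustrative only.

```python
import math

# (standard deviation, number of valid answers) as reported in the solution
reported = {
    "visits": (7.733, 400),
    "age": (19.630, 439),
}

def standard_error(std, n):
    """Standard error of the mean, assuming SEM = std / sqrt(n)."""
    return std / math.sqrt(n)

sem = {name: standard_error(std, n) for name, (std, n) in reported.items()}

for name, value in sem.items():
    print(f"{name}: SEM = {value:.3f}")
```

Both values agree with the SEM column of the table when rounded to three decimals.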
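 Using only the standard deviation of each Importance variable, the table below lists the variables in order from most to least agreement (a smaller standard deviation means the answers are closer together, that is, more agreement).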
Importance Variables  Standard Deviation 
Fitness  0.745 
Enjoy  1.090 
Medical  1.206 
Social  1.272 
Let us create and present, using SPSS, a Frequency table for the variable gender
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  male  89  19.8  20.6  20.6 
female  344  76.4  79.4  100.0  
Total  433  96.2  100.0  
Missing  System  17  3.8  
Total  450  100.0 
 The percentage of males among the respondents who answered the Gender question is 20.6% (89 males among 433 valid respondents).
 The percentage of females among everyone who took the survey is 76.4% (344 females among 450 persons).
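The difference between the two answers is the base of the percentage: "Valid Percent" divides by those who answered, while "Percent" divides by everyone surveyed. A minimal sketch using the counts from the Gender table above (variable names are illustrative):

```python
# Counts copied from the Gender frequency table
counts = {"male": 89, "female": 344}
missing = 17

valid_total = sum(counts.values())      # respondents who answered: 433
grand_total = valid_total + missing     # everyone surveyed: 450

# "Percent" column: share of everyone who took the survey
percent = {k: 100 * v / grand_total for k, v in counts.items()}
# "Valid Percent" column: share of those who answered the question
valid_percent = {k: 100 * v / valid_total for k, v in counts.items()}

print(f"Male, valid percent: {valid_percent['male']:.1f}%")
print(f"Female, percent of all: {percent['female']:.1f}%")
```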
Using SPSS, let us create and present a single Frequency table using the variable Income
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  $0 – $15,000  14  3.1  4.0  4.0 
$15,001 – $30,000  43  9.6  12.3  16.3  
$30,001 – $45,000  49  10.9  14.0  30.3  
$45,001 – $60,000  83  18.4  23.7  54.0  
$60,001 – $75,000  60  13.3  17.1  71.1  
$75,001 – $90,000  35  7.8  10.0  81.1  
$90,001 – $105,000  25  5.6  7.1  88.3  
$105,001 – $120,000  23  5.1  6.6  94.9  
more than $120,000  18  4.0  5.1  100.0  
Total  350  77.8  100.0  
Missing  System  100  22.2  
Total  450  100.0 
 5.1% (18 persons among 350) of the respondents who answered the question make over $120,000 per year.
 46.0% (161 persons among 350) of the respondents who answered the question reported making over $60,000 per year.
 16.3% (57 persons among 350) of the respondents who answered the question reported making $30,000 or less per year.
 23.7% (83 persons among 350) of the respondents who answered the question reported making between $45,001 and $60,000 per year; this is 18.4% of all 450 survey respondents.
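These four answers can be recomputed directly from the counts in the Income table. The sketch below is only a cross-check; the bracket labels and the helper `share` are illustrative names, not SPSS output.

```python
# Frequency counts per income bracket, copied from the Income table
income_counts = {
    "0-15,000": 14,
    "15,001-30,000": 43,
    "30,001-45,000": 49,
    "45,001-60,000": 83,
    "60,001-75,000": 60,
    "75,001-90,000": 35,
    "90,001-105,000": 25,
    "105,001-120,000": 23,
    "more than 120,000": 18,
}
missing = 100

valid_total = sum(income_counts.values())   # 350 who answered
grand_total = valid_total + missing         # 450 surveyed

def share(brackets, base):
    """Percentage of `base` falling in the listed brackets."""
    return 100 * sum(income_counts[b] for b in brackets) / base

over_120k = share(["more than 120,000"], valid_total)
over_60k = share(
    ["60,001-75,000", "75,001-90,000", "90,001-105,000",
     "105,001-120,000", "more than 120,000"],
    valid_total,
)
at_most_30k = share(["0-15,000", "15,001-30,000"], valid_total)
between_45k_60k_valid = share(["45,001-60,000"], valid_total)
between_45k_60k_all = share(["45,001-60,000"], grand_total)

print(f"{over_120k:.1f} {over_60k:.1f} {at_most_30k:.1f} "
      f"{between_45k_60k_valid:.1f} {between_45k_60k_all:.1f}")
```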
The following figure presents a Histogram with a normal curve using the variable Age
 The mean age of the 439 respondents is of 62.56 years with a standard deviation of 19.63
 The upper and lower boundaries can be calculated as the mean plus or minus three standard deviations: 62.56 ± 3 × 19.63 = 62.56 ± 58.89
Upper boundary: 121.45 years
Lower boundary: 3.67 years
 There are no outliers in the data.
 An outlier can be replaced with the maximum or minimum acceptable value, depending on whether it lies above the upper boundary or below the lower boundary. Alternatively, it can simply be replaced with the median.
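The boundary calculation can be sketched as follows, assuming the boundaries are defined as the mean plus or minus three standard deviations (this choice reproduces the lower boundary of 3.67 reported above). The helper `is_outlier` and the ages passed to it are hypothetical illustrations, not values from the survey.

```python
mean_age = 62.56
std_age = 19.63
k = 3  # assumed number of standard deviations for the boundaries

lower = mean_age - k * std_age
upper = mean_age + k * std_age

def is_outlier(age):
    """True if an age falls outside the mean +/- k*std boundaries."""
    return age < lower or age > upper

print(f"Lower boundary: {lower:.2f} years")
print(f"Upper boundary: {upper:.2f} years")

# Hypothetical ages, used only to show how the check behaves
print(is_outlier(62), is_outlier(130))
```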
The following table presents a Frequency table using the variable Gender
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  male  89  19.8  20.6  20.6 
female  344  76.4  79.4  100.0  
Total  433  96.2  100.0  
Missing  System  17  3.8  
Total  450  100.0 
Based on the percentages of males, the sampling error for the proportion is:
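The textbook formula is not reproduced in this solution, so the sketch below assumes the common large-sample form, sampling error = z × sqrt(p(1 − p)/n), with z = 1.96 for 95% confidence. Both the formula and the confidence level are assumptions; p and n come from the Gender table (89 males among 433 valid answers).

```python
import math

males = 89
n = 433          # respondents who answered the Gender question
z = 1.96         # assumed: 95% confidence level

p = males / n                                   # sample proportion of males
standard_error = math.sqrt(p * (1 - p) / n)     # standard error of a proportion
sampling_error = z * standard_error             # margin of error

ci_lower = p - sampling_error
ci_upper = p + sampling_error

print(f"p = {p:.3f}, sampling error = {sampling_error:.3f}")
print(f"95% CI: {ci_lower:.3f} to {ci_upper:.3f}")
```

Under these assumptions the sampling error is roughly 3.8 percentage points, so the confidence interval for the proportion of males runs from about 17% to about 24%.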
 The following table presents the percentages and counts for each category of the four continuous variables: General Health/Fitness, Social Aspects, Physical Enjoyment, and Specific Medical Concerns
Variable  Category  Count  Percent  Valid Percent  Cumulative Percent  
Fitness  Valid  1  10  2.2  2.2  2.2  
2  4  .9  .9  3.1  
3  8  1.8  1.8  4.9  
4  50  11.1  11.2  16.1  
5  374  83.1  83.9  100.0  
Total  446  99.1  100.0  
Missing  System  4  .9  
Total  450  100.0  
Social  Valid  1  53  11.8  13.5  13.5  
2  66  14.7  16.8  30.2  
3  113  25.1  28.7  58.9  
4  94  20.9  23.9  82.7  
5  68  15.1  17.3  100.0  
Total  394  87.6  100.0  
Missing  System  56  12.4  
Total  450  100.0  
Enjoy  Valid  1  18  4.0  4.6  4.6  
2  20  4.4  5.1  9.6  
3  84  18.7  21.3  31.0  
4  128  28.4  32.5  63.5  
5  144  32.0  36.5  100.0  
Total  394  87.6  100.0  
Missing  System  56  12.4  
Total  450  100.0  
Medical  Valid  1  33  7.3  8.1  8.1  
2  14  3.1  3.4  11.6  
3  43  9.6  10.6  22.2  
4  122  27.1  30.0  52.2  
5  194  43.1  47.8  100.0  
Total  406  90.2  100.0  
Missing  System  44  9.8  
Total  450  100.0  
 The following table shows the top two boxes (the two highest importance ratings, 4 and 5) for each of these variables
Variable  Category  Count  Percent  Valid Percent  
Fitness  4  50  11.1  11.2  
5  374  83.1  83.9  
Social  4  94  20.9  23.9  
5  68  15.1  17.3  
Enjoy  4  128  28.4  32.5  
5  144  32.0  36.5  
Medical  4  122  27.1  30.0  
5  194  43.1  47.8  
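A top-two-box score is usually reported as one number per variable: the share of valid answers in the two highest categories. The sketch below assumes that 5 is the highest importance rating (consistent with reading higher means as more important) and takes the counts from the category table above; the names are illustrative.

```python
# Counts for ratings 1..5, copied from the category table
rating_counts = {
    "fitness": [10, 4, 8, 50, 374],
    "social": [53, 66, 113, 94, 68],
    "enjoy": [18, 20, 84, 128, 144],
    "medical": [33, 14, 43, 122, 194],
}

def top_two_box(counts):
    """Percent of valid answers in the two highest categories (assumed 4 and 5)."""
    return 100 * sum(counts[-2:]) / sum(counts)

scores = {name: top_two_box(c) for name, c in rating_counts.items()}

# Print from highest to lowest top-two-box score
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {score:.1f}%")
```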
 There are six possible pairs. The following table shows the comparisons of means between all pairs of the four continuous Importance variables (General Fitness; Social Aspects; Physical Enjoyment; Specific Medical Concerns)
Pair  Variables  Mean Difference  Std. Deviation  Std. Error Mean  95% CI Lower  95% CI Upper  t  df  Sig. (2-tailed)  
Pair 1  fitness – social  1.629  1.321  .067  1.499  1.760  24.482  393  .000 
Pair 2  fitness – enjoy  .853  1.069  .054  .747  .959  15.829  393  .000 
Pair 3  fitness – medical  .700  1.210  .060  .581  .818  11.645  405  .000 
Pair 4  social – enjoy  -.788  1.110  .057  -.899  -.676  -13.937  385  .000 
Pair 5  social – medical  -.925  1.537  .080  -1.081  -.768  -11.608  371  .000 
Pair 6  enjoy – medical  -.128  1.488  .077  -.279  .023  -1.664  375  .097 
From the table above, all the differences between the pairs are statistically significant (the p-values of the two-tailed tests are smaller than the 5% significance level), except the difference between enjoy and medical (Pair 6), which is not significant (|t| = 1.664, df = 375, p-value = 0.097 > 0.05). Hence the means of these two variables do not differ significantly.
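The t-values in the paired-samples table can be approximated from the other columns, since t = mean difference / (Std. Deviation / sqrt(n)) with n = df + 1. The sketch below works with magnitudes only (the sign depends on the order of subtraction) and uses |t| > 1.96 as a large-sample stand-in for the exact critical value, which is an assumption. Small gaps from the SPSS values come from the rounded inputs.

```python
import math

# (|mean difference|, std. deviation of differences, df) per pair
pairs = {
    "fitness-social": (1.629, 1.321, 393),
    "fitness-enjoy": (0.853, 1.069, 393),
    "fitness-medical": (0.700, 1.210, 405),
    "social-enjoy": (0.788, 1.110, 385),
    "social-medical": (0.925, 1.537, 371),
    "enjoy-medical": (0.128, 1.488, 375),
}

CRITICAL = 1.96  # assumed large-sample critical value at the 5% level

def paired_t(mean_diff, std_diff, df):
    """Magnitude of the paired t statistic from summary values."""
    n = df + 1
    return mean_diff / (std_diff / math.sqrt(n))

t_values = {name: paired_t(*stats) for name, stats in pairs.items()}
significant = {name: t > CRITICAL for name, t in t_values.items()}

for name, t in t_values.items():
    print(f"{name}: |t| = {t:.2f}, significant = {significant[name]}")
```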
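 Using the mean of each variable (a higher mean indicates greater importance), the following table lists the variables in order from most important to least important
Rank  Importance Variable  Mean  
1  Fitness  4.74 
2  Medical  4.06 
3  Enjoy  3.91 
4  Social  3.15 
Note that Medical and Enjoy, ranked 2 and 3, do not differ significantly from each other (Pair 6).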
 In this question, we want to examine whether the average number of monthly visits is significantly different from the national average of eight. To do so, we run a One-Sample T-Test with a test value of 8. The following table shows the results obtained via SPSS.
Variable  t  df  Sig. (2-tailed)  Mean Difference  95% CI Lower  95% CI Upper  
visits  16.022  399  .000  6.195  5.43  6.96 
From the table above, we can conclude, at the 5% significance level, that the average number of monthly visits (14.20) is significantly different from, and higher than, the national average of eight (mean difference = 6.195, t = 16.022, df = 399, p-value < 0.001).
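The reported t-value can be reproduced from the summary figures already given in this solution. The sketch assumes the standard one-sample formula t = (mean − test value) / (Std / sqrt(n)) and takes the mean difference (6.195), the standard deviation of visits (7.733), and n = 400 (df + 1) from the tables above.

```python
import math

mean_difference = 6.195   # sample mean minus the test value of 8
std_visits = 7.733        # standard deviation of visits
n = 400                   # df + 1

standard_error = std_visits / math.sqrt(n)
t_value = mean_difference / standard_error
df = n - 1

print(f"SE = {standard_error:.3f}, t = {t_value:.3f}, df = {df}")
```

The result matches the SPSS output (t = 16.022, df = 399) to three decimals.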
The following table presents a crosstabulation of the variables Pool and Doctor
Doctor  Pool: No (Count)  Pool: Yes (Count)  
No  203  130 
Yes  44  73  
Chi-square  19.071  
df  1  
Sig.  0.000 
 The percentage of the total sample that utilized the therapy pool is 45.1% (203 of 450).
 The percentage of those who used the therapy pool who were recommended by a doctor is 36.0% (73 of 203).
 The percentage of those recommended by a doctor who utilized the therapy pool is 62.4% (73 of 117).
 From the table above, the chi-square statistic is 19.071 with 1 degree of freedom and a p-value below 0.001, meaning that, at the 5% significance level, there is a significant association between use of the therapy pool and a doctor's recommendation.
 The coefficient of correlation (phi) between pool use and doctor recommendation is 0.206, meaning that the association between these two variables, although significant, is weak to moderate in strength.
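The percentages, the chi-square statistic, and the strength of association can all be recomputed from the four cell counts. The sketch below assumes the Pearson chi-square without continuity correction and the phi coefficient, phi = sqrt(chi-square / N), as the measure of association; variable names are illustrative.

```python
import math

# Observed counts from the crosstabulation, keyed by (doctor, pool)
counts = {
    ("no", "no"): 203, ("no", "yes"): 130,
    ("yes", "no"): 44, ("yes", "yes"): 73,
}
n = sum(counts.values())  # 450

# Marginal totals for each doctor row and each pool column
row_totals = {d: sum(v for (dd, _), v in counts.items() if dd == d)
              for d in ("no", "yes")}
col_totals = {p: sum(v for (_, pp), v in counts.items() if pp == p)
              for p in ("no", "yes")}

pct_pool = 100 * col_totals["yes"] / n
pct_doctor_given_pool = 100 * counts[("yes", "yes")] / col_totals["yes"]
pct_pool_given_doctor = 100 * counts[("yes", "yes")] / row_totals["yes"]

# Pearson chi-square: sum of (observed - expected)^2 / expected
chi_square = 0.0
for (d, p), observed in counts.items():
    expected = row_totals[d] * col_totals[p] / n
    chi_square += (observed - expected) ** 2 / expected

phi = math.sqrt(chi_square / n)

print(f"{pct_pool:.1f}% {pct_doctor_given_pool:.1f}% {pct_pool_given_doctor:.1f}%")
print(f"chi-square = {chi_square:.3f}, phi = {phi:.3f}")
```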
 The following table presents the comparison of the mean number of Visits between people who had and had not utilized the exercise circuit.
t-test for Equality of Means  
Variable  Assumption  t  df  Sig. (2-tailed)  Mean Difference  Std. Error Difference  95% CI Lower  95% CI Upper  
visits  Equal variances assumed  -2.522  398  .012  -2.302  .913  -4.096  -.508 
Equal variances not assumed  -2.849  184.863  .005  -2.302  .808  -3.896  -.708 
From the table above, we can conclude that the mean number of visits differs significantly between people who utilized the exercise circuit and those who did not. Both versions of the t-test are significant: assuming equal variances (|t| = 2.522, df = 398, p-value = 0.012 < 0.05) and not assuming equal variances (|t| = 2.849, df = 184.863, p-value = 0.005 < 0.05). In both cases the 95% confidence interval of the difference does not include zero.
 The following table shows the correlations between the Importance variables
Variable  Statistic  fitness  social  enjoy  medical  
fitness  Pearson Correlation  4.74  .188**  .340**  .271** 
Sig. (2-tailed)  .000  .000  .000  
N  446  394  394  406  
social  Pearson Correlation  .188**  3.15  .565**  .238** 
Sig. (2-tailed)  .000  .000  .000  
N  394  394  386  372  
enjoy  Pearson Correlation  .340**  .565**  3.91  .188** 
Sig. (2-tailed)  .000  .000  .000  
N  394  386  394  376  
medical  Pearson Correlation  .271**  .238**  .188**  4.06 
Sig. (2-tailed)  .000  .000  .000  
N  406  372  376  406  
**. Correlation is significant at the 0.01 level (2-tailed). 
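 See the table above: the diagonal values have been replaced with the respective means.
 The table above shows significant positive correlations between all pairs of the four Importance variables (all significant at the 0.01 level). This means that respondents who rate one reason as more important also tend to rate the others as more important. The strongest correlation is between social and enjoy (.565); the remaining correlations range from .188 to .340.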
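One way to see why even the smallest correlations are flagged as significant is to convert each r into a t statistic, t = r × sqrt((n − 2) / (1 − r²)). The sketch below does this for the six off-diagonal pairs and compares against 2.576, the large-sample two-tailed critical value at the 0.01 level; using the normal approximation instead of the exact t distribution is an assumption.

```python
import math

# (Pearson r, pairwise N) for each pair, copied from the correlation table
correlations = {
    "fitness-social": (0.188, 394),
    "fitness-enjoy": (0.340, 394),
    "fitness-medical": (0.271, 406),
    "social-enjoy": (0.565, 386),
    "social-medical": (0.238, 372),
    "enjoy-medical": (0.188, 376),
}

CRITICAL_01 = 2.576  # assumed large-sample critical value, 0.01 level, two-tailed

def t_from_r(r, n):
    """t statistic for testing that a Pearson correlation is zero."""
    return r * math.sqrt((n - 2) / (1 - r ** 2))

t_stats = {pair: t_from_r(r, n) for pair, (r, n) in correlations.items()}
all_significant = all(t > CRITICAL_01 for t in t_stats.values())

for pair, t in t_stats.items():
    print(f"{pair}: t = {t:.2f}")
print("All significant at 0.01:", all_significant)
```

With samples of several hundred respondents, a correlation as small as .188 still clears the threshold, which is why statistical significance here should not be read as a strong relationship.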
 In this question, we summarize the results obtained from the statistical analysis of the Avery Fitness Center survey. Starting with the personal characteristics of the respondents, the mean age of the 439 respondents who reported their age is 62.56 years, with a standard deviation of 19.63. Among those who answered the Gender question, 79.4% are female. The most common income category is $45,001 to $60,000 per year (23.7% of those who answered). We also analyzed the personal reasons for participating in AFC programs: the most important reasons are, in order, fitness, medical concerns, and enjoyment, with social aspects last. Besides, the average number of monthly visits (14.20) is significantly higher than the national average of eight. There is also a significant difference in the mean number of visits between people who utilized the exercise circuit and those who did not. Furthermore, there is a significant association between use of the therapy pool and a doctor's recommendation.
This information would help the center focus its marketing strategy. It should target the female population aged between 30 and 70 years and making between $45,001 and $60,000 per year. The center should also strengthen its fitness programs through good monitoring, and develop its medical programs and enjoyment-related offerings. Finally, the center should work with doctors to obtain recommendations.