SPSS 1 Assignment Instructions
This assignment is designed to teach you to describe a single variable: to report its central location and dispersion, to create an appropriate graphic to illustrate it, and to discuss how the variable is distributed.
Use the following format:
 A title page with “SPSS 1: Describing a single variable” as the title, and your name, section, TA’s name, professor’s name, date, and G# in the upper right-hand corner.
 Start a new page for each variable.
 The variable name in bold and underlined at the top of the page.
 Your answers should have the following sections:
 a) A properly formatted frequency table for the variable you are describing.
 b) A table of appropriate summary statistics.
 c) An appropriate graphic.
 d) A paragraph describing your variable.
For each of the variables:
 Identify the level of measurement for each variable.
 Build a table that shows the cumulative % and frequencies.
 The table must be in APA format.
 Some variables will have to be recoded to display effectively in a table. RULE OF THUMB – no more than 10 categories should appear in ANY table.
 Report the summary statistics that describe the variable in terms of all the appropriate measures of central location and dispersion.
 Create 1 appropriate graphic to display the distribution.
 Write a paragraph that describes the distribution in terms of central location, dispersion, outliers, and skew (if any).
These are your variables:
 From the WORLD 2012 dataset use the variable named “polity” with the label “Higher scores more democratic (Polity)”.
 From the NES 2012 dataset use the variable named “dem_marital” with the label “Marital Status”.
 From the NES 2012 dataset use the variable named “relig_attend” with the label “Attendance: Religious Services”.
 From the GSS 2012 dataset use the variable named “wordsum” with the label “Number of words correct in vocabulary test”.
 From the GSS 2012 dataset use the variable named “educ_4” with the label “Education in 4 Categories”.
***Sample Problem*** Variable: age5
 Level of measurement for this variable is ordinal.
 Cumulative % frequency table:
Age in 5 Categories (a)
X        F       %       Cum %
18-30    437     21.7    21.7
31-40    384     19.1    40.8
41-50    403     20.0    60.8
51-60    369     18.3    79.1
61+      421     20.9    100.0
Total    2015    100.0
a. General Social Survey 2008
III. Table of summary statistics:
Summary Statistics
N         2015
Median    41-50
Mode      18-30
Range     (61+) – (18-30)
Minimum   18-30
Maximum   61+
Q1        31-40
Q3        51-60
IQR       (51-60) – (31-40)
V ratio   0.783
Bar chart:
 Descriptive paragraph requirements:
 What does your variable measure?
 Describe this variable in terms of all appropriate measurements of central location.
 Describe this variable in terms of the appropriate measurements of dispersion.
 If appropriate, is this distribution skewed negative or positive?
 Discuss any other interesting and relevant details about this distribution.
SPSS1 Frequently Asked Questions and Point Breakdown
5 questions:
 Level of measurement:
 You must have the correct level of measurement
 You should not put interval/ratio. You must clearly identify whether the data are interval or ratio for full points.
 You need to report the level of measurement on the original data, not the recoded data.
 Cumulative Frequency Table:
 You must have a title, a source and appropriate columns.
 Your table must be formatted according to APA guidelines. Week 6 under course content on our Blackboard Course website has both the template and an instructional video if you would like to refresh your memory from lab.
 If you have more than 10 rows you should recode the data. Make sure that the valid total of your recoded cumulative frequency table matches the valid total of the cumulative frequency table on the original data.
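The recode-and-check step can also be sketched outside SPSS. The Python snippet below is only an illustration, not part of the required SPSS workflow: it uses the wordsum counts as they are displayed in the frequency output (11 rows, so one more than the rule of thumb allows), and the five bins and the names `BINS`, `recode_counts` and `cumulative_percent` are hypothetical choices made for this example.

```python
# Displayed wordsum frequencies (score -> count) from the SPSS frequency output.
WORDSUM_COUNTS = {0: 9, 1: 17, 2: 43, 3: 69, 4: 138, 5: 227,
                  6: 310, 7: 195, 8: 151, 9: 76, 10: 46}

# Hypothetical recode: 11 scores collapsed into 5 ordered bins (inclusive bounds).
BINS = [("0-2", 0, 2), ("3-4", 3, 4), ("5-6", 5, 6), ("7-8", 7, 8), ("9-10", 9, 10)]


def recode_counts(counts, bins):
    """Collapse score counts into labelled bins."""
    recoded = {label: 0 for label, _, _ in bins}
    for score, n in counts.items():
        for label, low, high in bins:
            if low <= score <= high:
                recoded[label] += n
                break
        else:
            raise ValueError(f"score {score} falls outside every bin")
    return recoded


def cumulative_percent(counts):
    """Return (label, count, percent, cumulative percent) rows in order."""
    total = sum(counts.values())
    rows, running = [], 0
    for label, n in counts.items():
        running += n
        rows.append((label, n, round(100 * n / total, 1),
                     round(100 * running / total, 1)))
    return rows


recoded = recode_counts(WORDSUM_COUNTS, BINS)
# The recode must not gain or lose cases: totals before and after must match.
assert sum(recoded.values()) == sum(WORDSUM_COUNTS.values())
for row in cumulative_percent(recoded):
    print(row)
```

The same comparison of totals before and after recoding is the check to make by eye on the two SPSS frequency tables.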
 Descriptive Statistics:
 Do not directly copy and paste from SPSS. You must format the tables according to the APA Guidelines.
 Measures of central tendency must be correct for the level of measurement of the data.
 Measures of dispersion must be appropriate for the level of measurement of the data.
 Always report the category labels for categorical data.
 You will need to perform some calculations. Remember, SPSS does not give you the IQR or the V ratio. You will need to calculate these out correctly for full points.
 Remember, if you have ordinal data, the numbers associated with the data are just the coding. Therefore, you need to simplify the IQR and the Range as much as possible. (For example, if you were using a Likert scale, you could end up with Hate-Love for the range and Somewhat dislike-Somewhat love for the IQR.) Remember that for categorical data you must work with the category labels.
 You must run statistics on the original data, not the recoded data. As such, your descriptive statistics should be appropriate to the level of measurement of your original data.
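Because SPSS does not report the V ratio, it has to be worked out by hand as 1 minus the share of valid cases in the modal category. A minimal Python sketch of that arithmetic follows, using the dem_marital counts from the frequency output; the function name `variation_ratio` is a hypothetical helper for this example, and the calculation would normally be done by hand or in a spreadsheet.

```python
# dem_marital valid frequencies (category label -> count) from the SPSS output.
MARITAL_COUNTS = {
    "Married: spouse present": 3043,
    "Married: spouse absent": 320,
    "Widowed": 629,
    "Divorced": 347,
    "Separated": 1042,
    "Never married": 524,
}


def variation_ratio(counts):
    """Return (modal label, V ratio), where V ratio = 1 - modal count / valid N."""
    valid_n = sum(counts.values())
    modal_label = max(counts, key=counts.get)
    return modal_label, 1 - counts[modal_label] / valid_n


label, v = variation_ratio(MARITAL_COUNTS)
print(f"Mode: {label}; V ratio: {v:.5f}")
```

Note that the denominator is the valid N (cases with missing values excluded), which is the total the frequency table reports under “Valid”.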
 Graphics:
 These can be copied and pasted from SPSS. You need to make sure that your graphs have titles and sources. You may use Excel, but we strongly encourage you to use SPSS.
 Your graph must be appropriate to the level of measurement of your data.
 (nominal = pie chart; ordinal = bar graph; interval/ratio = histogram)
 If you use a recoded variable, choose the graph for the level of measurement AFTER the recoding (an interval variable recoded to ordinal should use a bar graph).
 However, if you use a histogram with the original interval/ratio data, you will not be penalized.
 Include value labels. Remember to use the crosshairs in the Graph Editor in SPSS.
 Bar charts should use % of cases (“Percent”), not counts.
 Paragraph:
 You need to describe both central tendency and dispersion. Do not just list all of the statistics. Focus on the most salient univariate statistics for your level of measurement (e.g., if you have ratio data, you should discuss the mean for central tendency and the standard deviation and the variance for dispersion). You need to interpret these as well. What do they tell you about the variable being measured? (For example, if you have a variable on age and the average age was 25, what could you infer about the population?) Report interesting trends from the statistics.
 Discuss the shape of the distribution curve (positive or negative skew) if it is appropriate for the level of measurement.
 Make sure that you report outliers if it is appropriate to the level of measurement for your data.
Solution
SPSS1: Describing a Single variable
Variable named “polity” with the label “Higher scores more democratic (Polity)” from WORLD 2012 dataset
 Level of measurement
Level of measurement for this variable is interval: the scores represent how democratic a country is on a scale from -10 to 10, and the scale has no absolute zero.
 Cumulative Frequency Table
Higher scores more democratic (Polity)
Score    Cum %
-10      1.4
-9       3.5
-8       4.9
-7       11.8
-6       12.5
-5       13.2
-4       16.0
-3       18.1
-2       23.6
-1       25.0
0        25.7
1        27.1
2        29.2
3        31.9
4        34.7
5        38.9
6        47.2
7        54.9
8        66.7
9        77.8
10       100.0
 Descriptive Statistics
Summary Statistics
N         167
Median    7
Mode      10
Range     20
Minimum   -10
Maximum   10
Q1        -0.75
Q3        9
IQR       9.75 (Min to Q1: 9.25; Q1 to Q2: 7.75; Q2 to Q3: 2; Q3 to Max: 1)
V ratio   1 - 32/167 = 0.8083
 Graphics
5. Paragraph
Because polity is a numeric score, the mean and the median can be compared directly. The mean is 4.36 and the median is 7.00; the median exceeds the mean, so the data are skewed to the left, with a long tail of low scores pulling the mean down more than the median. The skewness statistic of -.972 supports this: skewness measures asymmetry, and a negative value indicates that the left tail is longer and that the mass of the distribution is concentrated on the right of the figure. The standard deviation is 6.104. In a normal distribution approximately 95% of the data lie within 2 standard deviations of the mean, which here would be between 4.36 – 2×6.104 and 4.36 + 2×6.104, i.e. between -7.848 and 16.568. That rule fits poorly in this case, because the distribution is left-skewed and the upper bound lies beyond the scale maximum of 10.
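The two-standard-deviation interval quoted above is simple arithmetic, and it can be compared against the limits of the scale. The Python sketch below is an illustration only; it assumes the polity scale runs from -10 to 10, which matches the reported range of 20, and `two_sd_interval`, `SCALE_MIN` and `SCALE_MAX` are hypothetical names introduced for the example.

```python
def two_sd_interval(mean, sd):
    """Return the (lower, upper) bounds of mean +/- 2 standard deviations."""
    return mean - 2 * sd, mean + 2 * sd


# Reported polity statistics from the solution paragraph.
lower, upper = two_sd_interval(4.36, 6.104)

# Assumed limits of the polity scale (minimum and maximum scores).
SCALE_MIN, SCALE_MAX = -10, 10

print(f"Interval: {lower:.3f} to {upper:.3f}")
print("Upper bound exceeds scale maximum:", upper > SCALE_MAX)
print("Lower bound below scale minimum:", lower < SCALE_MIN)
```

When one bound of the interval falls outside the possible scores, the normal-curve rule of thumb is a weak description of the data, which is consistent with the skew discussed in the paragraph.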
Variable named “dem_marital” with the label “Marital Status” from the NES 2012 dataset
 Level of measurement
Level of measurement for this variable is nominal
 Cumulative Frequency Table
PRE: Marital status
Category                   Cum %
Married: spouse present    51.5
Married: spouse absent     57.0
Widowed                    67.6
Divorced                   73.5
Separated                  91.1
Never married              100.0
 Descriptive Statistics
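Summary Statistics
N         5905
Median    1 (Married: spouse present)
Mode      1 (Married: spouse present)
Range     5
Minimum   1 (Married: spouse present)
Maximum   6 (Never married)
Q1        1 (Married: spouse present)
Q3        5 (Separated)
IQR       4 (Min to Q1: 0; Q1 to Q2: 0; Q2 to Q3: 4; Q3 to Max: 1)
V ratio   1 - 3043/5905 = 0.48467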
 Graphics
5. Paragraph
The data are nominal, so a pie chart has been plotted. The mode is 1 (Married: spouse present), which accounts for 51.5% of valid cases. The mean of the numeric codes is 2.59 and the median is 1, so the median is less than the mean, but because the categories have no inherent order these two figures describe only the coding and have a limited role here. Since the data are nominal, there is no scope for outliers.
Variable named “relig_attend” with the label “Attendance: Religious Services” from the NES 2012 dataset
 Level of measurement
Level of measurement for this variable is ordinal
 Cumulative Frequency Table
Attendance: Religious Services
Category    Cum %
Never       42.9
Few/Yr      57.9
1-2/Mnth    67.5
Alm/Evwk    78.6
Ev Week     100.0
 Descriptive Statistics
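Summary Statistics
N         5884
Median    1 (Few/Yr)
Mode      0 (Never)
Range     4 (Never to Ev Week)
Minimum   0 (Never)
Maximum   4 (Ev Week)
Q1        0 (Never)
Q3        3 (Alm/Evwk)
IQR       3 (Never to Alm/Evwk) (Min to Q1: 0; Q1 to Q2: 1; Q2 to Q3: 2; Q3 to Max: 1)
V ratio   1 - 2526/5884 = 0.5707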
 Graphics
5. Paragraph
The data are ordinal, so a bar chart has been plotted. The median is 1 (Few/Yr) and the mode is 0 (Never), which accounts for 42.9% of valid cases. The mean of the numeric codes is 1.53, which is above the median, but the mean has a limited role for ordinal data. Since the data are ordinal, there is no scope for outliers.
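For ordinal data the median and quartiles are read off the cumulative percentages and reported as category labels. The Python sketch below shows one way to do that lookup for relig_attend, using the valid frequencies from the SPSS output; `category_at_percentile` is a hypothetical helper, and the rule it applies (the first category whose cumulative percent reaches the target) is an assumption about how the quartiles are read from the table.

```python
# relig_attend valid frequencies in category order (label, count).
RELIG_ATTEND = [
    ("Never", 2526),
    ("Few/Yr", 879),
    ("1-2/Mnth", 566),
    ("Alm/Evwk", 657),
    ("Ev Week", 1256),
]


def category_at_percentile(ordered_counts, percentile):
    """Return the first category whose cumulative percent reaches `percentile`."""
    total = sum(n for _, n in ordered_counts)
    running = 0
    for label, n in ordered_counts:
        running += n
        if 100 * running / total >= percentile:
            return label
    raise ValueError("percentile must be between 0 and 100")


q1, median, q3 = (category_at_percentile(RELIG_ATTEND, p) for p in (25, 50, 75))
print(f"Median: {median}; IQR: {q1} to {q3}")
```

Reporting the result as a pair of labels rather than as a difference of codes follows the FAQ guidance on working with category labels.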
Variable named “wordsum” with the label “Number of words correct in vocabulary test” from the GSS 2012 dataset
 Level of measurement
Level of measurement for this variable is ratio scale
 Cumulative Frequency Table
Number Words Correct In Vocabulary Test
Score    Cum %
0        .7
1        2.1
2        5.5
3        10.8
4        21.6
5        39.3
6        63.5
7        78.7
8        90.5
9        96.4
10       100.0
 Descriptive Statistics
Summary Statistics
N         1283 (valid cases; 692 missing)
Median    6
Mode      6
Range     10
Minimum   0
Maximum   10
Q1        5
Q3        7
IQR       2 (Min to Q1: 5; Q1 to Q2: 1; Q2 to Q3: 1; Q3 to Max: 3)
V ratio   1 - 310/1283 = 0.7584
 Graphics
5. Paragraph
Because the data are ratio-level, the mean and the median can be compared directly. The mean is 5.91 and the median is 6.00; the median is slightly greater than the mean, so the data are mildly skewed to the left, with a tail of low scores pulling the mean down more than the median. The skewness statistic of -.234 supports this: skewness measures asymmetry, and a negative value indicates that the left tail is longer and that the mass of the distribution is concentrated on the right of the figure. The standard deviation is 1.988. In a normal distribution approximately 95% of the data lie within 2 standard deviations of the mean, which here would be between 5.91 – 2×1.988 and 5.91 + 2×1.988, i.e. between 1.934 and 9.886. In this case the rule holds only approximately, because the distribution is left-skewed.
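The two-standard-deviation rule can be checked against the actual wordsum frequencies. The Python sketch below recomputes the mean and standard deviation from the counts displayed in the SPSS frequency table and then measures the share of cases inside the interval. It is an illustration under one stated assumption: the displayed counts appear to be rounded, so the results may differ from the SPSS output in the last digit. The names `describe` and `share_within` are hypothetical.

```python
import math

# Displayed wordsum frequencies (score -> count) from the SPSS frequency output.
WORDSUM_COUNTS = {0: 9, 1: 17, 2: 43, 3: 69, 4: 138, 5: 227,
                  6: 310, 7: 195, 8: 151, 9: 76, 10: 46}


def describe(counts):
    """Return (n, mean, sample standard deviation) from a frequency table."""
    n = sum(counts.values())
    mean = sum(x * f for x, f in counts.items()) / n
    squared_dev = sum(f * (x - mean) ** 2 for x, f in counts.items())
    return n, mean, math.sqrt(squared_dev / (n - 1))


def share_within(counts, low, high):
    """Return the proportion of cases with low <= score <= high."""
    n = sum(counts.values())
    return sum(f for x, f in counts.items() if low <= x <= high) / n


n, mean, sd = describe(WORDSUM_COUNTS)
share = share_within(WORDSUM_COUNTS, mean - 2 * sd, mean + 2 * sd)
print(f"mean={mean:.2f}, sd={sd:.2f}, share within 2 SD={share:.1%}")
```

With these counts the share inside the interval comes out at roughly 94%, close to the 95% expected under a normal curve, which fits the mild skew described above.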
Variable named “educ_4” with the label “Education in 4 Categories” from the GSS 2012 dataset
 Level of measurement
Level of measurement for this variable is ordinal: the four education categories run in order from <HS to Coll+.
 Cumulative Frequency Table
Education: 4 Cats
Category     Cum %
<HS          16.2
HS           42.9
Some Coll    69.9
Coll+        100.0
 Descriptive Statistics
Summary Statistics
N         1974 (valid cases; 1 missing)
Median    3 (Some Coll)
Mode      4 (Coll+)
Range     3 (<HS to Coll+)
Minimum   1 (<HS)
Maximum   4 (Coll+)
Q1        2 (HS)
Q3        4 (Coll+)
IQR       2 (HS to Coll+) (Min to Q1: 1; Q1 to Q2: 1; Q2 to Q3: 1; Q3 to Max: 0)
V ratio   1 - 593/1974 = 0.6996
 Graphics
5. Paragraph
The data are ordinal, so a bar chart is the appropriate graphic. The median is 3 (Some Coll) and the mode is 4 (Coll+), which accounts for 30.1% of valid cases. The mean of the numeric codes is 2.71, which is below the median, but the mean has a limited role for ordinal data. Since the data are ordinal, there is no scope for outliers.
GET
FILE='C:\Users\Akki\Desktop\fwdfiles\GSS2012.sav'.
DATASET NAME DataSet0 WINDOW=FRONT.
FREQUENCIES VARIABLES=wordsum
/NTILES=4
/STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT
/HISTOGRAM
/ORDER=ANALYSIS.
Frequencies
Notes  
Output Created  25-Oct-2017 01:21:40  
Comments  
Input  Data  C:\Users\Akki\Desktop\fwdfiles\GSS2012.sav 
Active Dataset  DataSet1  
Filter  <none>  
Weight  Weight Variable  
Split File  <none>  
N of Rows in Working Data File  1974  
Missing Value Handling  Definition of Missing  User-defined missing values are treated as missing. 
Cases Used  Statistics are based on all cases with valid data.  
Syntax  FREQUENCIES VARIABLES=wordsum
/NTILES=4 /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT /HISTOGRAM /ORDER=ANALYSIS.


Resources  Processor Time  00:00:00.280 
Elapsed Time  00:00:00.310 
[DataSet1] C:\Users\Akki\Desktop\fwdfiles\GSS2012.sav
Statistics  
Number Words Correct In Vocabulary Test  
N  Valid  1283 
Missing  692  
Mean  5.91  
Std. Error of Mean  .056  
Median  6.00  
Mode  6  
Std. Deviation  1.988  
Variance  3.954  
Skewness  -.234  
Std. Error of Skewness  .068  
Kurtosis  .067  
Std. Error of Kurtosis  .137  
Range  10  
Minimum  0  
Maximum  10  
Percentiles  25  5.00 
50  6.00  
75  7.00 
Number Words Correct In Vocabulary Test  
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  0  9  .5  .7  .7 
1  17  .9  1.4  2.1  
2  43  2.2  3.4  5.5  
3  69  3.5  5.4  10.8  
4  138  7.0  10.8  21.6  
5  227  11.5  17.7  39.3  
6  310  15.7  24.2  63.5  
7  195  9.9  15.2  78.7  
8  151  7.7  11.8  90.5  
9  76  3.9  5.9  96.4  
10  46  2.3  3.6  100.0  
Total  1283  64.9  100.0  
Missing  IAP  662  33.5  
DID NOT TRY  30  1.5  
Total  692  35.1  
Total  1975  100.0 
FREQUENCIES VARIABLES=dem_marital
/NTILES=4
/STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT
/PIECHART FREQ
/ORDER=ANALYSIS.
Frequencies
Notes  
Output Created  25-Oct-2017 00:14:58  
Comments  
Input  Data  C:\Users\Akki\Desktop\fwdfiles\NES2012.sav 
Active Dataset  DataSet1  
Filter  <none>  
Weight  Weight variable  
Split File  <none>  
N of Rows in Working Data File  5916  
Missing Value Handling  Definition of Missing  User-defined missing values are treated as missing. 
Cases Used  Statistics are based on all cases with valid data.  
Syntax  FREQUENCIES VARIABLES=dem_marital
/NTILES=4 /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT /PIECHART FREQ /ORDER=ANALYSIS.


Resources  Processor Time  00:00:00.358 
Elapsed Time  00:00:00.621 
[DataSet1] C:\Users\Akki\Desktop\fwdfiles\NES2012.sav
Statistics  
PRE: Marital status  
N  Valid  5905 
Missing  11  
Mean  2.59  
Std. Error of Mean  .024  
Median  1.00  
Mode  1  
Std. Deviation  1.874  
Variance  3.514  
Skewness  .614  
Std. Error of Skewness  .032  
Kurtosis  -1.263  
Std. Error of Kurtosis  .064  
Range  5  
Minimum  1  
Maximum  6  
Percentiles  25  1.00 
50  1.00  
75  5.00 
PRE: Marital status  
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  1. Married: spouse present  3043  51.4  51.5  51.5 
2. Married: spouse absent {VOL}  320  5.4  5.4  57.0  
3. Widowed  629  10.6  10.6  67.6  
4. Divorced  347  5.9  5.9  73.5  
5. Separated  1042  17.6  17.7  91.1  
6. Never married  524  8.9  8.9  100.0  
Total  5905  99.8  100.0  
Missing  System  11  .2  
Total  5916  100.0 
FREQUENCIES VARIABLES=relig_attend
/NTILES=4
/STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT
/BARCHART FREQ
/ORDER=ANALYSIS.
Frequencies
Notes  
Output Created  25-Oct-2017 00:54:18  
Comments  
Input  Data  C:\Users\Akki\Desktop\fwdfiles\NES2012.sav 
Active Dataset  DataSet1  
Filter  <none>  
Weight  Weight variable  
Split File  <none>  
N of Rows in Working Data File  5916  
Missing Value Handling  Definition of Missing  User-defined missing values are treated as missing. 
Cases Used  Statistics are based on all cases with valid data.  
Syntax  FREQUENCIES VARIABLES=relig_attend
/NTILES=4 /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SKEWNESS SESKEW KURTOSIS SEKURT /BARCHART FREQ /ORDER=ANALYSIS.


Resources  Processor Time  00:00:00.296 
Elapsed Time  00:00:00.270 
[DataSet1] C:\Users\Akki\Desktop\fwdfiles\NES2012.sav
Statistics  
Attendance: Religious Services  
N  Valid  5884 
Missing  32  
Mean  1.53  
Std. Error of Mean  .021  
Median  1.00  
Mode  0  
Std. Deviation  1.616  
Variance  2.613  
Skewness  .478  
Std. Error of Skewness  .032  
Kurtosis  -1.413  
Std. Error of Kurtosis  .064  
Range  4  
Minimum  0  
Maximum  4  
Percentiles  25  .00 
50  1.00  
75  3.00 
Attendance: Religious Services  
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  Never  2526  42.7  42.9  42.9 
Few/Yr  879  14.9  14.9  57.9  
1-2/Mnth  566  9.6  9.6  67.5  
Alm/Evwk  657  11.1  11.2  78.6  
EvWeek  1256  21.2  21.4  100.0  
Total  5884  99.5  100.0  
Missing  System  32  .5  
Total  5916  100.0 
GET
FILE='C:\Users\Akki\Desktop\fwdfiles\GSS2012.sav'.
DATASET NAME DataSet0 WINDOW=FRONT.
FREQUENCIES VARIABLES=educ_4
/NTILES=4
/STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SUM SKEWNESS SESKEW KURTOSIS SEKURT
/PIECHART FREQ
/ORDER=ANALYSIS.
Frequencies
Notes  
Output Created  25-Oct-2017 01:29:31  
Comments  
Input  Data  C:\Users\Akki\Desktop\fwdfiles\GSS2012.sav 
Active Dataset  DataSet1  
Filter  <none>  
Weight  Weight Variable  
Split File  <none>  
N of Rows in Working Data File  1974  
Missing Value Handling  Definition of Missing  User-defined missing values are treated as missing. 
Cases Used  Statistics are based on all cases with valid data.  
Syntax  FREQUENCIES VARIABLES=educ_4
/NTILES=4 /STATISTICS=STDDEV VARIANCE RANGE MINIMUM MAXIMUM SEMEAN MEAN MEDIAN MODE SUM SKEWNESS SESKEW KURTOSIS SEKURT /PIECHART FREQ /ORDER=ANALYSIS.


Resources  Processor Time  00:00:00.421 
Elapsed Time  00:00:00.630 
[DataSet1] C:\Users\Akki\Desktop\fwdfiles\GSS2012.sav
Statistics  
Education: 4 Cats  
N  Valid  1974 
Missing  1  
Mean  2.71  
Std. Error of Mean  .024  
Median  3.00  
Mode  4  
Std. Deviation  1.064  
Variance  1.132  
Skewness  -.209  
Std. Error of Skewness  .055  
Kurtosis  -1.214  
Std. Error of Kurtosis  .110  
Range  3  
Minimum  1  
Maximum  4  
Sum  5347  
Percentiles  25  2.00 
50  3.00  
75  4.00 
Education: 4 Cats  
Frequency  Percent  Valid Percent  Cumulative Percent  
Valid  <HS  320  16.2  16.2  16.2 
HS  528  26.7  26.8  42.9  
Some Coll  533  27.0  27.0  69.9  
Coll+  593  30.0  30.1  100.0  
Total  1974  99.9  100.0  
Missing  System  1  .1  
Total  1975  100.0 