Quantitative Research Methods
Instructions:
The purpose of this assignment is to continue to develop your skills using the essential commands in Stata, statistical tests, and interpreting Stata output. Create a DoFile to track the commands you use.
Dataset:
 Create a new categorical variable called locale. Using the values in the variable urbanization, create three categories for the new locale variable: urban, suburban, and rural (combine towns and rural together).
 Create a table with the frequency distribution of locale and paste the output below. Interpret the descriptive results of the frequency distribution.
 Assume that community colleges in California have different resource allocation policies. Let’s assume that there are two primary categories of resource allocation policies. In one category are colleges that engage a wide range of students, faculty, and staff in decisions about resource allocation (e.g., how much is spent on instruction, student services, etc.); let’s call these participatory resource allocation policies. In the other category are colleges where resource allocation decisions are made only among the executive leadership team; let’s call these authoritarian resource allocation policies. The variable res_allocation represents this policy: colleges with a value of 1 have a participatory policy and colleges with a value of 0 have an authoritarian policy.
 Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have a participatory policy. Paste the output table below.
 Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have an authoritarian policy. Paste the output table below.
 Interpret the descriptive results from 3a and 3b.
 You hypothesize that there might be a relationship between resource allocation policy and instructional expenditures. Use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges that have a participatory policy compared to those that have an authoritarian policy. Paste the output below and interpret the results.
 You want to examine if there is a relationship between the average instructional expenditures per FTE and the new variable locale, because you suspect that locale might influence instructional expenditures. Find the mean, standard deviation, minimum, and maximum values of Instructional Expenditures per FTE (inst_exp_fte) based on the community college locale. Paste the output of the table below and interpret the descriptive results.
 Next, use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges in the three different locales. Paste the output below and interpret the results.
 You are interested in the factors that influence a community college’s resource allocation policy. You hypothesize that there might be a relationship between the resource allocation policy and locale. Create a crosstabulation of the frequencies of the resource allocation policy and locale using the locale and res_allocation variables. Include row and column percentages in the table. Paste the table below and interpret the descriptive results.
 Use a statistical test to examine if there is a statistically significant relationship between locale and the resource allocation policy. Paste the table below and interpret the results.
 Print your entire output to a PDF by typing: translate @Results assignment2.pdf
 Upload the PDF, the Assignment in a Word document, and the DoFile to Canvas.
Solution
 Create a new categorical variable called locale. Using the values in the variable urbanization, create three categories for the new locale variable: urban, suburban, and rural (combine towns and rural together).
The Stata commands to create the new variable and label the values of the variable follow:
generate locale = 1 if urbanization == 12 | urbanization == 13
replace locale = 2 if urbanization == 21 | urbanization == 22 | urbanization == 23
replace locale = 3 if urbanization > 23 & urbanization < .
label define LOCALE 1 "City" 2 "Suburb" 3 "Rural"
label values locale LOCALE
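The recoding above can be cross-checked with a second construction. The sketch below is illustrative only: it assumes urbanization holds the numeric codes used in the commands above, and locale2 is a hypothetical scratch variable that is not part of the assignment.

```stata
* Sketch only: locale2 is a hypothetical check variable.
generate locale2 = .
replace locale2 = 1 if inlist(urbanization, 12, 13)
replace locale2 = 2 if inlist(urbanization, 21, 22, 23)
replace locale2 = 3 if urbanization > 23 & !missing(urbanization)
* Each urbanization code should land in exactly one locale column.
tabulate urbanization locale2, missing
* Stop with an error if the two constructions disagree on any observation.
assert locale2 == locale
drop locale2
```

If the assert passes silently, both versions classify every college the same way, including any left as missing.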
 Create a table with the frequency distribution of locale and paste the output below. Interpret the descriptive results of the frequency distribution.
The Stata command to create the frequency table and its results follow:
. tab1 locale

-> tabulation of locale

     locale |      Freq.     Percent        Cum.
------------+-----------------------------------
       City |         24       28.57       28.57
     Suburb |         42       50.00       78.57
      Rural |         18       21.43      100.00
------------+-----------------------------------
      Total |         84      100.00
Of the 84 community colleges with a locale value, half (42/84) are in suburbs, 28.57% (24/84) are in cities, and 21.43% (18/84) are in rural environments.
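The frequency table totals 84 colleges, while the t test later in this solution reports 104 observations, which suggests that some colleges may have no locale value. A minimal sketch for checking this, assuming the locale and urbanization variables as defined above:

```stata
* Sketch: show missing values of locale as their own row.
tabulate locale, missing
* Count colleges whose urbanization code was not assigned to any locale.
count if missing(locale)
* Show which urbanization codes, if any, those unassigned colleges carry.
tabulate urbanization if missing(locale), missing
```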
 Assume that community colleges in California have different resource allocation policies. Let’s assume that there are two primary categories of resource allocation policies. In one category are colleges that engage a wide range of students, faculty, and staff in decisions about resource allocation (e.g., how much is spent on instruction, student services, etc.); let’s call these participatory resource allocation policies. In the other category are colleges where resource allocation decisions are made only among the executive leadership team; let’s call these authoritarian resource allocation policies. The variable res_allocation represents this policy: colleges with a value of 1 have a participatory policy and colleges with a value of 0 have an authoritarian policy.
 Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have a participatory policy. Paste the output table below.
The Stata commands to label the res_allocation variable and find the statistics, along with their results, follow:
label define RES 0 "Authoritarian" 1 "Participatory"
label values res_allocation RES
. display "Participatory schools"
Participatory schools
. tabstat inst_exp_fte if res_allocation==1, statistics( count mean sd min max ) columns(statistics)

    variable |         N      mean        sd       min       max
-------------+--------------------------------------------------
inst_exp_fte |        36  6104.556  1289.864      5009     11396
----------------------------------------------------------------
 Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have an authoritarian policy. Paste the output table below.
The Stata commands to find the statistics, along with their results, follow:
. display "Authoritarian schools"
Authoritarian schools
. tabstat inst_exp_fte if res_allocation==0, statistics( count mean sd min max ) columns(statistics)

    variable |         N      mean        sd       min       max
-------------+--------------------------------------------------
inst_exp_fte |        68  4007.235  569.8288      2508      4927
----------------------------------------------------------------
 Interpret the descriptive results from 3a and 3b.
Both the mean and the standard deviation of instructional expenditures per FTE are higher for participatory schools (mean 6,105, standard deviation 1,290) than for authoritarian schools (mean 4,007, standard deviation 570). In fact, the minimum expenditure per FTE for a participatory school (5,009) is larger than the maximum expenditure per FTE for an authoritarian school (4,927), so the two distributions do not overlap in this sample.
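The two tables in 3a and 3b can also be produced side by side in a single call. This is a sketch of an alternative, not the command the solution used; it relies only on the by() option of tabstat and the variables already named above.

```stata
* Sketch: one table with a row for each resource allocation policy.
tabstat inst_exp_fte, by(res_allocation) statistics(count mean sd min max) columns(statistics)
```

With the value labels attached, each row is identified by its policy name, which makes the group comparison easier to read than two separate tables.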
 You hypothesize that there might be a relationship between resource allocation policy and instructional expenditures. Use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges that have a participatory policy compared to those that have an authoritarian policy. Paste the output below and interpret the results.
The Stata command for a two-group t test with unequal variances and its results follow:
. ttest inst_exp_fte, by(res_allocation) unequal welch

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
Authorit |      68    4007.235     69.1019    569.8288    3869.307    4145.163
Particip |      36    6104.556    214.9773    1289.864    5668.128    6540.983
---------+--------------------------------------------------------------------
combined |     104    4733.231    130.8922    1334.844    4473.637    4992.825
---------+--------------------------------------------------------------------
    diff |            -2097.32    225.8104               -2552.777   -1641.864
------------------------------------------------------------------------------
    diff = mean(Authorit) - mean(Particip)                        t =  -9.2880
Ho: diff = 0                         Welch's degrees of freedom =  42.7845

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000
The null hypothesis is that the population mean expenditure per FTE is the same for authoritarian and participatory schools. The sample mean expenditure per FTE is 4,007 for authoritarian schools and 6,105 for participatory schools, a difference (authoritarian minus participatory) of -2,097. The t statistic is -9.288 with 42.78 Welch degrees of freedom, and the p value for the two-tailed alternative is less than 0.0001. We therefore reject the null hypothesis and conclude that participatory schools spend more per FTE on average than authoritarian schools.
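The test above can be cross-checked from the summary statistics alone, without the dataset. The sketch below uses Stata's immediate command ttesti with the group sizes, means, and standard deviations reported in 3a and 3b, then recomputes the t statistic by hand as the difference in means divided by its standard error. The scalar name se_diff is a hypothetical label chosen for illustration.

```stata
* Sketch: ttesti takes n, mean, and sd for each group in turn
* (authoritarian first, then participatory).
ttesti 68 4007.235 569.8288 36 6104.556 1289.864, unequal welch
* The same t statistic by hand: difference in means over its standard error.
scalar se_diff = sqrt(569.8288^2/68 + 1289.864^2/36)
display "SE = " se_diff "   t = " (4007.235 - 6104.556)/se_diff
```

Both results should agree, up to rounding, with the standard error and t statistic in the ttest output.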
You want to examine if there is a relationship between the average instructional expenditures per FTE and the new variable locale, because you suspect that locale might influence instructional expenditures. Find the mean, standard deviation, minimum, and maximum values of Instructional Expenditures per FTE (inst_exp_fte) based on the community college locale. Paste the output of the table below and interpret the descriptive results.
The Stata command to produce the statistics and its results follow:
. tabstat inst_exp_fte, statistics( count mean sd min max ) by(locale) columns(statistics)

Summary for variables: inst_exp_fte
     by categories of: locale

locale |         N      mean        sd       min       max
-------+--------------------------------------------------
  City |        24  4347.583  932.7743      3077      6323
Suburb |        42  4711.524  1117.377      3106      8367
 Rural |        18  4856.222  1237.523      3004      7326
-------+--------------------------------------------------
 Total |        84  4638.548  1099.532      3004      8367
----------------------------------------------------------
In our sample, the 18 rural schools have the highest average spending per FTE (4,856), followed by the 42 suburban schools (4,712) and the 24 city schools (4,348). The three locales have similar minimums (3,004 to 3,106); the suburban schools have the largest maximum (8,367) and the city schools the smallest (6,323). The ranges overlap heavily: within each locale there are schools that spend more than some schools in the other locales and schools that spend less.
 Next, use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges in the three different locales. Paste the output below and interpret the results.
The Stata command for a one-factor ANOVA with locale as the factor and its results follow:
. anova inst_exp_fte locale

                 Number of obs =         84    R-squared     =  0.0310
                 Root MSE      =    1095.65    Adj R-squared =  0.0071

      Source | Partial SS    df          MS         F    Prob>F
  -----------+---------------------------------------------------
       Model | 3108397.39     2   1554198.69      1.29   0.2796
             |
      locale | 3108397.39     2   1554198.69      1.29   0.2796
             |
    Residual | 97236197.4    81   1200446.88
  -----------+---------------------------------------------------
       Total |  100344595    83   1208971.02
The null hypothesis is that the population mean spending per FTE is the same across the three locales, and the alternative is that at least one population mean is different. The F statistic with 2 and 81 degrees of freedom is 1.29, with a p value of 0.2796. Because the p value is well above 0.05, we fail to reject the null hypothesis and conclude that there is no evidence that average spending per FTE differs across locales.
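As a cross-check on the ANOVA table, the F statistic is the ratio of the model mean square to the residual mean square, and its p value is the upper tail of the F distribution with 2 and 81 degrees of freedom. The sketch below recomputes both from the numbers in the table and also shows oneway, an alternative command for a one-factor ANOVA. The scalar name Fstat is a hypothetical label.

```stata
* Sketch: oneway fits the same one-factor ANOVA; tabulate adds group summaries.
oneway inst_exp_fte locale, tabulate
* Recompute F and its p value from the mean squares in the table above.
scalar Fstat = 1554198.69 / 1200446.88
display "F = " Fstat "   p = " Ftail(2, 81, Fstat)
```

The displayed values should be close to the reported F of 1.29 and p value of 0.2796.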
 You are interested in the factors that influence a community college’s resource allocation policy. You hypothesize that there might be a relationship between the resource allocation policy and locale. Create a crosstabulation of the frequencies of the resource allocation policy and locale using the locale and res_allocation variables. Include row and column percentages in the table. Paste the table below and interpret the descriptive results.
The Stata command and its results follow:
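. tabulate locale res_allocation, column row

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

           |    res_allocation
    locale | Authorita  Participa |     Total
-----------+----------------------+----------
      City |        18          6 |        24
           |     75.00      25.00 |    100.00
           |     32.73      20.69 |     28.57
-----------+----------------------+----------
    Suburb |        26         16 |        42
           |     61.90      38.10 |    100.00
           |     47.27      55.17 |     50.00
-----------+----------------------+----------
     Rural |        11          7 |        18
           |     61.11      38.89 |    100.00
           |     20.00      24.14 |     21.43
-----------+----------------------+----------
     Total |        55         29 |        84
           |     65.48      34.52 |    100.00
           |    100.00     100.00 |    100.00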
Each cell shows the number of schools in the category defined by its row and column, followed by the row percentage and the column percentage. Of the 24 city schools, 18 (75%) are authoritarian and 6 (25%) are participatory. The 18 authoritarian city schools are 32.73% (18/55) of the 55 authoritarian schools, and the 6 participatory city schools are 20.69% (6/29) of the 29 participatory schools. Of the 42 suburban schools, 26 (61.90%) are authoritarian and 16 (38.10%) are participatory. The suburban schools make up 47.27% (26/55) of the authoritarian schools and 55.17% (16/29) of the participatory schools. Of the 18 rural schools, 11 (61.11%) are authoritarian and 7 (38.89%) are participatory. The rural schools represent 21.43% (18/84) of all schools, 20% (11/55) of the authoritarian schools, and 24.14% (7/29) of the participatory schools. Comparing row percentages across locales, city schools in this sample are less often participatory (25%) than suburban (38.10%) or rural (38.89%) schools.
 Use a statistical test to examine if there is a statistically significant relationship between locale and the resource allocation policy. Paste the table below and interpret the results.
The Stata command to perform a chi-square test of the null hypothesis that the proportions of authoritarian and participatory schools do not differ across the three locales (city, suburban, and rural) follows. The listing of the table is suppressed because it is the same as for question 7:
. tabulate locale res_allocation, chi2 nofreq
Pearson chi2(2) = 1.3517 Pr = 0.509
The null hypothesis is that, in the population, a school's resource allocation policy (authoritarian or participatory) is independent of its locale. The chi-square statistic has 2 degrees of freedom and a value of 1.3517, with a p value of 0.509. Because the p value is well above 0.05, we fail to reject the null hypothesis and conclude that there is no evidence that the proportion of authoritarian and participatory schools differs by locale.
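The chi-square result can be reproduced from the cell counts in the question 7 crosstabulation, without the dataset. The sketch below uses Stata's immediate command tabi, with rows entered as city, suburban, and rural and columns as authoritarian and participatory; the expected option prints the counts expected under independence.

```stata
* Sketch: rows are City, Suburb, Rural; columns are authoritarian, participatory.
tabi 18 6 \ 26 16 \ 11 7, chi2 expected
* Upper-tail probability of a chi-square(2) variable at the reported statistic.
display chi2tail(2, 1.3517)
```

The expected counts are also worth checking against the usual rule of thumb that each should be at least 5 for the Pearson chi-square approximation to be reasonable.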
 Print your entire output to a PDF by typing: translate @Results assignment2.pdf
The Stata command follows:
translate @Results assignment2.pdf
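The command above captures whatever is currently in the Results window. A log file is an alternative that records the output as the do-file runs. The sketch below is one way to set that up; assignment2.smcl is a hypothetical file name, and translating a log to PDF is assumed to be supported on the platform in use.

```stata
* Sketch: assignment2.smcl is a hypothetical log file name.
log using assignment2.smcl, replace
* ... run the analysis commands for questions 1 through 8 here ...
log close
* Convert the saved log to a PDF, overwriting any earlier copy.
translate assignment2.smcl assignment2.pdf, replace
```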
 Upload the PDF, the Assignment in a Word document, and the DoFile to Canvas.
assignment.do
set more off
cd "C:\STATA"
/* Import Excel data file. */
import excel "C:\STATA\Assignment Data.xls", sheet("Sheet1") firstrow
/* 1. Create a new categorical variable called locale. Using */
/* the values in the variable urbanization, create three */
/* categories for the new locale variable: urban, suburban, */
/* and rural (combine towns and rural together). */
generate locale = 1 if urbanization == 12 | urbanization == 13
replace locale = 2 if urbanization == 21 | urbanization == 22 | urbanization == 23
replace locale = 3 if urbanization > 23 & urbanization < .
label define LOCALE 1 "City" 2 "Suburb" 3 "Rural"
label values locale LOCALE
/* 2. Create a table with the frequency distribution of */
/* locale and paste the output below. Interpret the */
/* descriptive results of the frequency distribution. */
tab1 locale
/* 3. Assume that community colleges in California have */
/* different resource allocation policies. Let’s assume */
/* that there are two primary categories of resource */
/* allocation policies. In one category are colleges that */
/* engage a wide range of students, faculty, and staff in */
/* decisions about resource allocation (e.g., how much is */
/* spent on instruction, student services, etc.); let’s call */
/* these participatory resource allocation policies. In the */
/* other category are colleges where resource allocation */
/* decisions are made only among the executive leadership */
/* team; let’s call these authoritarian resource allocation */
/* policies. The variable res_allocation represents this */
/* policy, and colleges with a value of 1 have a */
/* participatory policy and colleges with a value of 0 have */
/* an authoritarian policy. */
label define RES 0 "Authoritarian" 1 "Participatory"
label values res_allocation RES
/* a. Find the mean, standard deviation, minimum, and */
/* maximum values of instructional expenditures per FTE */
/* (inst_exp_fte) for community colleges that have a */
/* participatory policy. Paste the output table below. */
display "Participatory schools"
tabstat inst_exp_fte if res_allocation==1, statistics( count mean sd min max ) columns(statistics)
/* b. Find the mean, standard deviation, minimum, and */
/* maximum values of instructional expenditures per FTE */
/* (inst_exp_fte) for community colleges that have an */
/* authoritarian policy. Paste the output table below. */
display "Authoritarian schools"
tabstat inst_exp_fte if res_allocation==0, statistics( count mean sd min max ) columns(statistics)
/* 4. You hypothesize that there might be a relationship */
/* between resource allocation policy and instructional */
/* expenditures. Use a statistical test to examine if there */
/* is a statistically significant difference in the average */
/* instructional expenditures per FTE between community */
/* colleges that have a participatory policy compared to */
/* those that have an authoritarian policy. Paste the output */
/* below and interpret the results. */
ttest inst_exp_fte, by(res_allocation) unequal welch
/* 5. You want to examine if there is a relationship between */
/* the average instructional expenditures per FTE and the */
/* new variable locale, because you suspect that locale */
/* might influence instructional expenditures. Find the mean,*/
/* standard deviation, minimum, and maximum values of */
/* Instructional Expenditures per FTE (inst_exp_fte) based */
/* on the community college locale. Paste the output of the */
/* table below and interpret the descriptive results. */
tabstat inst_exp_fte, statistics( count mean sd min max ) by(locale) columns(statistics)
/* 6. Next, use a statistical test to examine if there is a */
/* statistically significant difference in the average */
/* instructional expenditures per FTE between community */
/* colleges in the three different locales. Paste the output */
/* below and interpret the results. */
anova inst_exp_fte locale
/* 7. You are interested in the factors that influence a */
/* community college’s resource allocation policy. You */
/* hypothesize that there might be a relationship between */
/* the resource allocation policy and locale. Create a */
/* crosstabulation of the frequencies of the resource */
/* allocation policy and locale using the locale and */
/* res_allocation variables. Include row and column */
/* percentages in the table. Paste the table below and */
/* interpret the descriptive results. */
tabulate locale res_allocation, column row
/* 8. Use a statistical test to examine if there is a */
/* statistically significant relationship between locale and */
/* the resource allocation policy. Paste the table below and */
/* interpret the results. */
tabulate locale res_allocation, chi2 nofreq
/* 9. Print your entire output to a PDF by typing: */
/* translate @Results assignment2.pdf */
translate @Results assignment2.pdf