Quantitative Research Methods

Instructions:

The purpose of this assignment is to continue to develop your skills using the essential commands in Stata, statistical tests, and interpreting Stata output. Create a Do-File to track the commands you use.

Dataset:

  1. Create a new categorical variable called locale. Using the values in the variable urbanization, create three categories for the new locale variable: urban, suburban, and rural (combine towns and rural together).
  2. Create a table with the frequency distribution of locale and paste the output below. Interpret the descriptive results of the frequency distribution.
  3. Assume that community colleges in California have different resource allocation policies. Let’s assume that there are two primary categories of resource allocation policies. In one category are colleges that engage a wide range of students, faculty, and staff in decisions about resource allocation (e.g., how much is spent on instruction, student services, etc.); let’s call these participatory resource allocation policies. In the other category are colleges where resource allocation decisions are made only among the executive leadership team; let’s call these authoritarian resource allocation policies. The variable res_allocation represents this policy: colleges with a value of 1 have a participatory policy and colleges with a value of 0 have an authoritarian policy.
    a. Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have a participatory policy. Paste the output table below.
    b. Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have an authoritarian policy. Paste the output table below.
    c. Interpret the descriptive results from 3a and 3b.
  4. You hypothesize that there might be a relationship between resource allocation policy and instructional expenditures. Use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges that have a participatory policy compared to those that have an authoritarian policy. Paste the output below and interpret the results.
  5. You want to examine if there is a relationship between the average instructional expenditures per FTE and the new variable locale, because you suspect that locale might influence instructional expenditures. Find the mean, standard deviation, minimum, and maximum values of Instructional Expenditures per FTE (inst_exp_fte) based on the community college locale. Paste the output of the table below and interpret the descriptive results.
  6. Next, use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges in the three different locales. Paste the output below and interpret the results.
  7. You are interested in the factors that influence a community college’s resource allocation policy. You hypothesize that there might be a relationship between the resource allocation policy and locale. Create a cross-tabulation of the frequencies of the resource allocation policy and locale using the locale and res_allocation variables. Include row and column percentages in the table. Paste the table below and interpret the descriptive results.
  8. Use a statistical test to examine if there is a statistically significant relationship between locale and the resource allocation policy. Paste the table below and interpret the results.
  9. Print your entire output to a PDF by typing: translate @Results assignment2.pdf
  10. Upload the PDF, the Assignment in a Word document, and the Do-File to Canvas. 

Solution 

  1. Create a new categorical variable called locale. Using the values in the variable urbanization, create three categories for the new locale variable: urban, suburban, and rural (combine towns and rural together).

The Stata commands to create the new variable and label the values of the variable follow:

generate locale = 1 if urbanization == 12 | urbanization == 13

replace locale = 2 if urbanization == 21 | urbanization == 22 | urbanization == 23

replace locale = 3 if urbanization > 23 & urbanization < .

label define LOCALE 1 "City" 2 "Suburb" 3 "Rural"

label values locale LOCALE
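As a quick check on the recode, the following sketch cross-tabulates the original codes against the new categories and counts any colleges left unclassified. It uses only the two variables named above. Note that, as written, the generate and replace commands leave locale missing for any urbanization value below 12, so it is worth confirming whether such values occur in the data.

```stata
* Sketch: verify the recode of urbanization into locale.
* Show every urbanization code alongside the locale it was assigned
* (the missing option also displays colleges with no locale).
tabulate urbanization locale, missing

* Count colleges that have an urbanization code but no locale.
count if missing(locale) & !missing(urbanization)
```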

  2. Create a table with the frequency distribution of locale and paste the output below. Interpret the descriptive results of the frequency distribution.

The Stata command to create the frequency table and its results follow:

. tab1 locale

-> tabulation of locale

     locale |      Freq.     Percent        Cum.
------------+-----------------------------------
       City |         24       28.57       28.57
     Suburb |         42       50.00       78.57
      Rural |         18       21.43      100.00
------------+-----------------------------------
      Total |         84      100.00

Of the 84 community colleges, half (42/84) are in suburbs, 28.57% (24/84) are in cities, and 21.43% (18/84) are in rural areas (towns and rural combined).

  3. Assume that community colleges in California have different resource allocation policies. Let’s assume that there are two primary categories of resource allocation policies. In one category are colleges that engage a wide range of students, faculty, and staff in decisions about resource allocation (e.g., how much is spent on instruction, student services, etc.); let’s call these participatory resource allocation policies. In the other category are colleges where resource allocation decisions are made only among the executive leadership team; let’s call these authoritarian resource allocation policies. The variable res_allocation represents this policy: colleges with a value of 1 have a participatory policy and colleges with a value of 0 have an authoritarian policy.
    a. Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have a participatory policy. Paste the output table below.

The Stata commands to label the res_allocation variable and find the statistics along with its results follow:

label define RES 0 "Authoritarian" 1 "Participatory"

label values res_allocation RES

. display "Participatory schools"

Participatory schools

. tabstat inst_exp_fte if res_allocation==1, statistics( count mean sd min max ) columns(statistics)

    variable |         N      mean        sd       min       max
-------------+--------------------------------------------------
inst_exp_fte |        36  6104.556  1289.864      5009     11396
----------------------------------------------------------------

    b. Find the mean, standard deviation, minimum, and maximum values of instructional expenditures per FTE (inst_exp_fte) for community colleges that have an authoritarian policy. Paste the output table below.

The Stata commands to find the statistics along with its results follow:

. display "Authoritarian schools"

Authoritarian schools

. tabstat inst_exp_fte if res_allocation==0, statistics( count mean sd min max ) columns(statistics)

    variable |         N      mean        sd       min       max
-------------+--------------------------------------------------
inst_exp_fte |        68  4007.235  569.8288      2508      4927
----------------------------------------------------------------

    c. Interpret the descriptive results from 3a and 3b.

Both the mean and the standard deviation of instructional expenditures per full-time-equivalent (FTE) student are higher for participatory schools (mean 6,105; SD 1,290) than for authoritarian schools (mean 4,007; SD 570). In fact, the minimum expenditure per FTE student among participatory schools (5,009) exceeds the maximum among authoritarian schools (4,927).
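The two tabstat calls in 3a and 3b can also be combined into a single table. A minimal sketch, using the same by() option that the solution uses for locale in question 5:

```stata
* Sketch: descriptive statistics for both policy groups in one table.
tabstat inst_exp_fte, statistics(count mean sd min max) by(res_allocation) columns(statistics)
```

This produces one row per policy group plus a Total row, which makes the two groups easier to compare side by side.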

  4. You hypothesize that there might be a relationship between resource allocation policy and instructional expenditures. Use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges that have a participatory policy compared to those that have an authoritarian policy. Paste the output below and interpret the results.

The Stata command for a two group t test with unequal variances and its results follow:

. ttest inst_exp_fte, by(res_allocation) unequal welch

Two-sample t test with unequal variances
------------------------------------------------------------------------------
   Group |     Obs        Mean    Std. Err.   Std. Dev.   [95% Conf. Interval]
---------+--------------------------------------------------------------------
Authorit |      68    4007.235     69.1019    569.8288    3869.307    4145.163
Particip |      36    6104.556    214.9773    1289.864    5668.128    6540.983
---------+--------------------------------------------------------------------
combined |     104    4733.231    130.8922    1334.844    4473.637    4992.825
---------+--------------------------------------------------------------------
    diff |            -2097.32    225.8104               -2552.777   -1641.864
------------------------------------------------------------------------------
    diff = mean(Authorit) - mean(Particip)                        t =  -9.2880
Ho: diff = 0                             Welch's degrees of freedom =  42.7845

    Ha: diff < 0                 Ha: diff != 0                 Ha: diff > 0
 Pr(T < t) = 0.0000         Pr(|T| > |t|) = 0.0000          Pr(T > t) = 1.0000

The null hypothesis is that the population mean expenditure per full-time-equivalent (FTE) student is the same for authoritarian and participatory schools. The sample mean expenditure per FTE student is 4,007 for authoritarian schools and 6,105 for participatory schools, a difference of -2,097. The t statistic is -9.288 with 42.78 (Welch) degrees of freedom, and the two-tailed p value is less than 0.0001. We therefore reject the null hypothesis: the sample provides strong evidence that participatory schools spend more per FTE student on average than authoritarian schools.
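The test can be reproduced from the summary statistics alone, which is a useful check on the pasted output. The sketch below uses Stata's immediate form, ttesti, with the group sizes, means, and standard deviations from 3a and 3b, and then recomputes the t statistic by hand with display. Both should agree with the t reported above up to rounding.

```stata
* Sketch: Welch t test from summary statistics (n, mean, sd per group).
* Group 1 = authoritarian, group 2 = participatory, values from 3a and 3b.
ttesti 68 4007.235 569.8288 36 6104.556 1289.864, unequal welch

* Hand computation: difference in means divided by its standard error.
display (4007.235 - 6104.556) / sqrt(569.8288^2/68 + 1289.864^2/36)
```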

  5. You want to examine if there is a relationship between the average instructional expenditures per FTE and the new variable locale, because you suspect that locale might influence instructional expenditures. Find the mean, standard deviation, minimum, and maximum values of Instructional Expenditures per FTE (inst_exp_fte) based on the community college locale. Paste the output of the table below and interpret the descriptive results.

The Stata command to produce the statistics and its results follow:

. tabstat inst_exp_fte, statistics( count mean sd min max ) by(locale) columns(statistics)

Summary for variables: inst_exp_fte
     by categories of: locale

locale |         N      mean        sd       min       max
-------+--------------------------------------------------
  City |        24  4347.583  932.7743      3077      6323
Suburb |        42  4711.524  1117.377      3106      8367
 Rural |        18  4856.222  1237.523      3004      7326
-------+--------------------------------------------------
 Total |        84  4638.548  1099.532      3004      8367
----------------------------------------------------------

In our sample, the 18 rural schools have the highest average spending per FTE student (4,856), followed by the 42 suburban schools (4,712) and the 24 city schools (4,348). The minimums are similar across locales (3,004 to 3,106); the suburban schools have the largest maximum (8,367) and the city schools the smallest (6,323). The ranges overlap heavily: within each locale, some schools spend more and some spend less than schools in the other locales.

  6. Next, use a statistical test to examine if there is a statistically significant difference in the average instructional expenditures per FTE between community colleges in the three different locales. Paste the output below and interpret the results.

The Stata command for a one factor ANOVA with locale as the factor and its results follow:

. anova inst_exp_fte locale

Number of obs =      84     R-squared     =  0.0310
Root MSE      = 1095.65     Adj R-squared =  0.0071

    Source |  Partial SS    df       MS           F     Prob > F
-----------+----------------------------------------------------
     Model |  3108397.39     2  1554198.69       1.29     0.2796
           |
    locale |  3108397.39     2  1554198.69       1.29     0.2796
           |
  Residual |  97236197.4    81  1200446.88
-----------+----------------------------------------------------
     Total |   100344595    83  1208971.02

The null hypothesis is that the population mean spending per FTE student is the same across the three locales; the alternative is that at least one population mean differs. The F statistic with 2 and 81 degrees of freedom is 1.29, with a p value of 0.2796. Because the p value is well above 0.05, we fail to reject the null hypothesis: the sample provides no evidence that average spending per FTE student differs across locales.
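The F statistic is the ratio of the model mean square to the residual mean square, so it can be recomputed from the sums of squares and degrees of freedom in the ANOVA table. The sketch below does that with display and Ftail(), and also shows oneway, an alternative command for one-factor ANOVA that can add group summaries and Bonferroni pairwise comparisons. Given the non-significant F, the pairwise comparisons are included only for illustration.

```stata
* Sketch: recompute F and its p value from the ANOVA table above.
display (3108397.39/2) / (97236197.4/81)
display Ftail(2, 81, (3108397.39/2) / (97236197.4/81))

* Alternative one-way ANOVA with group summaries and pairwise comparisons.
oneway inst_exp_fte locale, tabulate bonferroni
```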

  7. You are interested in the factors that influence a community college’s resource allocation policy. You hypothesize that there might be a relationship between the resource allocation policy and locale. Create a cross-tabulation of the frequencies of the resource allocation policy and locale using the locale and res_allocation variables. Include row and column percentages in the table. Paste the table below and interpret the descriptive results.

The Stata command and its results follow:

. tabulate locale res_allocation, column row

+-------------------+
| Key               |
|-------------------|
|     frequency     |
|  row percentage   |
| column percentage |
+-------------------+

           |    res_allocation
    locale | Authorita  Participa |     Total
-----------+----------------------+----------
      City |        18          6 |        24
           |     75.00      25.00 |    100.00
           |     32.73      20.69 |     28.57
-----------+----------------------+----------
    Suburb |        26         16 |        42
           |     61.90      38.10 |    100.00
           |     47.27      55.17 |     50.00
-----------+----------------------+----------
     Rural |        11          7 |        18
           |     61.11      38.89 |    100.00
           |     20.00      24.14 |     21.43
-----------+----------------------+----------
     Total |        55         29 |        84
           |     65.48      34.52 |    100.00
           |    100.00     100.00 |    100.00

Each cell shows the number of schools in the category defined by its row and column, followed by the row and column percentages. Of the 24 city schools, 18 (75%) are authoritarian and 6 (25%) are participatory. The 18 authoritarian city schools are 32.73% (18/55) of the 55 authoritarian schools, and the 6 participatory city schools are 20.69% (6/29) of the 29 participatory schools. Of the 42 suburban schools, 26 (61.9%) are authoritarian and 16 (38.1%) are participatory. The suburban schools make up 47.27% of the authoritarian schools and 55.17% of the participatory schools. Of the 18 rural schools, 11 (61.11%) are authoritarian and 7 (38.89%) are participatory. The rural schools represent 21.43% (18/84) of all schools, 20% (11/55) of the authoritarian schools, and 24.14% (7/29) of the participatory schools. Comparing row percentages across locales, city schools in this sample are less likely to be participatory (25%) than suburban (38.1%) or rural (38.89%) schools.

  8. Use a statistical test to examine if there is a statistically significant relationship between locale and the resource allocation policy. Paste the table below and interpret the results.

The Stata command to perform a chi-square test of the null hypothesis that the proportions of authoritarian and participatory schools do not differ across the three locales (city, suburban, and rural) follows. The table itself is suppressed with nofreq because it is the same as in question 7:

. tabulate locale res_allocation, chi2 nofreq

Pearson chi2(2) =   1.3517   Pr = 0.509

The null hypothesis is that resource allocation policy (authoritarian versus participatory) is independent of school locale. The chi-square statistic has 2 degrees of freedom and a value of 1.3517, with a p value of 0.509. We therefore fail to reject the null hypothesis: the sample provides no evidence that the proportion of participatory schools differs by locale.

  9. Print your entire output to a PDF by typing: translate @Results assignment2.pdf
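Because the chi-square test depends only on the cell counts, it can be rerun from the cross-tabulation in question 7 with the immediate command tabi. The sketch below also requests expected frequencies on the original data, a common check that the expected counts are large enough for the chi-square approximation to be reasonable.

```stata
* Sketch: chi-square test from the cell counts in question 7.
* Rows are City, Suburb, Rural; columns are authoritarian, participatory.
tabi 18 6 \ 26 16 \ 11 7, chi2

* Same test on the data, with expected frequencies listed under each count.
tabulate locale res_allocation, chi2 expected
```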

The Stata command follows:

translate @Results assignment2.pdf
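Note that translate @Results captures only what is still held in the Results window. A common alternative, sketched here, is to record the session in a log file and translate the log to PDF. The log file name assignment2.smcl is an assumption chosen for illustration, and PDF translation may depend on the Stata version and platform.

```stata
* Sketch: keep a log of the whole session and convert it to PDF.
* assignment2.smcl is a hypothetical file name.
log using assignment2.smcl, replace

* ... run the assignment commands here ...

log close
translate assignment2.smcl assignment2.pdf, replace
```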

  10. Upload the PDF, the Assignment in a Word document, and the Do-File to Canvas.

assignment.do 

set more off

cd "C:\STATA"

/* Import Excel data file.                                   */

import excel "C:\STATA\Assignment Data.xls", sheet("Sheet1") firstrow

/* 1. Create a new categorical variable called locale. Using */

/* the values in the variable urbanization, create three     */

/* categories for the new locale variable: urban, suburban,  */

/* and rural (combine towns and rural together).             */

generate locale = 1 if urbanization == 12 | urbanization == 13

replace locale = 2 if urbanization == 21 | urbanization == 22 | urbanization == 23

replace locale = 3 if urbanization > 23 & urbanization < .

label define LOCALE 1 "City" 2 "Suburb" 3 "Rural"

label values locale LOCALE

/* 2. Create a table with the frequency distribution of      */

/* locale and paste the output below. Interpret the          */

/* descriptive results of the frequency distribution.        */

tab1 locale

/* 3. Assume that community colleges in California have      */

/* different resources allocation policies. Let’s assume     */

/* that there are two primary categories of resource         */

/* allocation policies. In one category are colleges that    */

/* engage a wide range of students, faculty, and staff in    */

/* decisions about resource allocation (e.g., how much is    */

/* spent on instruction, student services, etc.); let’s call */

/* these participatory resource allocation policies. In the  */

/* other category are colleges whereby resource allocation   */

/* decisions are made only among the executive leadership    */

/* team; let’s call these authoritarian resource allocation  */

/* policies. The variable res_allocation represents this     */

/* policy, and colleges with a value of 1 have a             */

/* participatory policy and colleges with a value of 0 have  */

/* an authoritarian policy.                                  */

label define RES 0 "Authoritarian" 1 "Participatory"

label values res_allocation RES

/* a. Find the mean, standard deviation, minimum, and        */

/* maximum values of instructional expenditures per FTE      */

/* (inst_exp_fte) for community colleges that have a         */

/* participatory policy. Paste the output table below.       */

display "Participatory schools"

tabstat inst_exp_fte if res_allocation==1, statistics( count mean sd min max ) columns(statistics)

/* b. Find the mean, standard deviation, minimum, and        */

/* maximum values of instructional expenditures per FTE      */

/* (inst_exp_fte) for community colleges that have an        */

/* authoritarian policy. Paste the output table below.       */

display "Authoritarian schools"

tabstat inst_exp_fte if res_allocation==0, statistics( count mean sd min max ) columns(statistics)

/* 4. You hypothesize that there might be a relationship     */

/* between resource allocation policy and instructional      */

/* expenditures. Use a statistical test to examine if there  */

/* is a statistically significant difference in the average  */

/* instructional expenditures per FTE between community      */

/* colleges that have a participatory policy compared to     */

/* those that have an authoritarian policy. Paste the output */

/* below and interpret the results.                          */

ttest inst_exp_fte, by(res_allocation) unequal welch

/* 5. You want to examine if there is a relationship between */

/* the average instructional expenditures per FTE and the    */

/* new variable locale, because you suspect that locale      */

/* might influence instructional expenditures. Find the mean,*/

/*  standard deviation, minimum, and maximum values of       */

/* Instructional Expenditures per FTE (inst_exp_fte) based   */

/* on the community college locale. Paste the output of the  */

/* table below and interpret the descriptive results.        */

tabstat inst_exp_fte, statistics( count mean sd min max ) by(locale) columns(statistics)

/* 6. Next, use a statistical test to examine if there is a  */

/* statistically significant difference in the average       */

/* instructional expenditures per FTE between community      */

/* colleges in the three different locales. Paste the output */

/* below and interpret the results.                          */

anova inst_exp_fte locale

/* 7. You are interested in the factors that influence a     */

/* community college’s resource allocation policy. You       */

/* hypothesize that there might be a relationship between    */

/* the resource allocation policy and locale. Create a       */

/* cross-tabulation of the frequencies of the resource       */

/* allocation policy and locale using the locale and         */

/* res_allocation variables. Include row and column          */

/* percentages in the table. Paste the table below and       */

/* interpret the descriptive results.                        */

tabulate locale res_allocation, column row

/* 8. Use a statistical test to examine if there is a        */

/* statistically significant relationship between locale and */

/* the resource allocation policy. Paste the table below and */

/* interpret the results.                                    */

tabulate locale res_allocation, chi2 nofreq

/* 9. Print your entire output to a PDF by typing:           */

/* translate @Results assignment2.pdf                        */

translate @Results assignment2.pdf