# Interpretation Coefficient

Documentation for College Distance Data

These data are taken from the High School and Beyond survey conducted by the Department of Education in 1980, with a follow-up in 1986. The survey included students from approximately 1100 high schools.

The data in CollegeDistance exclude students in the western states. The data in CollegeDistanceWest include only those students in the western states.

Series in Data Set

| Name | Description |
| --- | --- |
| ed | Years of Education Completed |
| female | 1 = Female / 0 = Male |
| black | 1 = Black / 0 = Not-Black |
| Hispanic | 1 = Hispanic / 0 = Not-Hispanic |
| bytest | Base Year Composite Test Score. (These are achievement tests given to high school seniors in the sample) |
| momcoll | 1 = Mother is a College Graduate / 0 = Mother is not a College Graduate |
| incomehi | 1 = Family Income > \$25,000 per year / 0 = Income ≤ \$25,000 per year |
| ownhome | 1 = Family Owns Home / 0 = Family Does not Own Home |
| urban | 1 = School in Urban Area / 0 = School not in Urban Area |
| cue80 | County Unemployment Rate in 1980 |
| stwmfg80 | State Hourly Wage in Manufacturing in 1980 |
| dist | Distance from 4yr College in 10’s of miles |
| tuition | Avg. State 4yr College Tuition in \$1000’s |

Years of Education: Rouse computed years of education by assigning 12 years to all members of the senior class. Each additional year of secondary education counted as one year. Students with vocational degrees were assigned 13 years, AA degrees were assigned 14 years, BA degrees were assigned 16 years, those with some graduate education were assigned 17 years, and those with a graduate degree were assigned 18 years.
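The coding rule above can be written as a simple lookup. This is only an illustrative sketch: the dictionary keys are hypothetical labels chosen here, not variable names or codes from the data set, and the rule for additional years of secondary education is not modeled.

```python
# Years of education assigned to each attainment level, as described above.
# The keys are hypothetical labels introduced for illustration only.
YEARS_BY_ATTAINMENT = {
    "high_school_senior": 12,
    "vocational_degree": 13,
    "aa_degree": 14,
    "ba_degree": 16,
    "some_graduate_education": 17,
    "graduate_degree": 18,
}


def years_of_education(attainment: str) -> int:
    """Return the years of education assigned to an attainment label."""
    return YEARS_BY_ATTAINMENT[attainment]


print(years_of_education("ba_degree"))
```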

1. Use STATA and the data from CollegeDistance.dta.

   a) Run a regression of ED on Dist, Female, Bytest, Tuition, Black, Hispanic, Incomehi, Ownhome, DadColl, MomColl, Cue80, and Stwmfg80. If Dist increases from 2 to 3 (that is, from 20 to 30 miles), how are years of education expected to change? What is the predicted change if Dist increases from 6 to 7?

   b) Run a regression of ED on Dist, Dist², Female, Bytest, Tuition, Black, Hispanic, Incomehi, Ownhome, DadColl, MomColl, Cue80, and Stwmfg80. If Dist increases from 2 to 3 (that is, from 20 to 30 miles), how are years of education expected to change? What is the predicted change if Dist increases from 6 to 7?

   c) Do you prefer the regression in (a) or (b)? Explain.

   d) Consider a Hispanic female with Tuition = \$950, Bytest = 58, Incomehi = 0, Ownhome = 0, DadColl = 1, MomColl = 1, Cue80 = 7.1, and Stwmfg80 = \$10.06.

      (i) Plot in the same graph the estimated regression relation between Dist and ED from (a) and (b) (i.e. look only at the part of the regression equation that depends on Dist) for Dist in the range of 0 to 10 (from 0 to 100 miles). Describe the similarities and differences between the estimated regression functions. Would your answer change if you plotted the regression functions for a white male with the same characteristics?

      (ii) How does the regression function in (b) behave for Dist > 10? How many observations are there with Dist > 10?

   e) Add the interaction term DadColl × MomColl to the regression in (b). Interpret its estimated coefficient.

   f) Is there any evidence that the effect of Dist on ED depends on the family’s income?
2. You study how unemployment benefits affect the duration of unemployment using pooled data from the Labour Force Surveys in 25 EU countries. You obtain the regression results given in Table 1.

   a) What is the interpretation of the coefficient on uben?

   b) What is the predicted effect of obtaining 3 more years of education on unemployment duration?

   c) Construct a 99% confidence interval for the predicted effect of obtaining 3 more years of education.

   d) Is there a (statistically) significant difference between the unemployment duration of married and non-married individuals? State and test the appropriate hypothesis at the 5% significance level.

   e) Test the hypothesis that all the slope coefficients in the regression are zero at the 1% significance level.

   f) Your friend argues that the relationship between log duration and age is cubic, i.e. age² and age³ should be included in the regression. How would you verify her claim? State the appropriate hypothesis and discuss all the necessary steps required to obtain the result.

Solution

Question No. 1:

a) The results are shown in the image below.

The coefficient on dist is negative and significant (-0.0366): each additional 10 miles of distance from a 4-year college is associated with about 0.037 fewer years of completed education, holding the other regressors constant. So if distance increases from 20 to 30 miles (Dist from 2 to 3), the predicted change in years of education is

(3 - 2)(-0.0366) = -0.0366 years.

Because the regression is linear in Dist, an increase from 6 to 7 gives the same predicted change: (7 - 6)(-0.0366) = (3 - 2)(-0.0366) = -0.0366 years.

b) The results are shown in the image below.

The coefficient on dist is negative and the coefficient on dist2 is positive, so the predicted effect of distance depends on the starting distance and must be computed from both terms.

At Dist = 2 (20 miles), the part of the prediction that depends on distance is (2*-0.0811732 + 2^2*0.00464) = -0.144.

At Dist = 3 (30 miles), it is (3*-0.0811732 + 3^2*0.00464) = -0.202.

So the predicted change when Dist increases from 2 to 3 is (-0.202) - (-0.144) = -0.058 years.

At Dist = 6 (60 miles), the distance-dependent part is (6*-0.0811732 + 6^2*0.00464) = -0.320.

At Dist = 7 (70 miles), it is (7*-0.0811732 + 7^2*0.00464) = -0.341.

So the predicted change when Dist increases from 6 to 7 is (-0.341) - (-0.320) = -0.021 years, which is smaller in magnitude than the -0.058 years found for the move from 2 to 3.
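The arithmetic for regression (b) can be checked with a few lines of Python. This is a minimal sketch that uses only the two coefficients quoted in the text; the function names are introduced here for illustration.

```python
# Coefficients quoted in the solution text for regression (b).
B_DIST = -0.0811732   # coefficient on dist
B_DIST2 = 0.00464     # coefficient on dist2 (dist squared)


def dist_part(dist: float) -> float:
    """Part of the fitted value that depends on distance (in tens of miles)."""
    return B_DIST * dist + B_DIST2 * dist ** 2


def predicted_change(start: float, end: float) -> float:
    """Predicted change in years of education when dist moves from start to end."""
    return dist_part(end) - dist_part(start)


for start, end in [(2, 3), (6, 7)]:
    # Prints about -0.058 for 2 -> 3 and about -0.021 for 6 -> 7.
    print(f"Dist {start} -> {end}: {predicted_change(start, end):+.3f} years")
```

Both changes are negative, and the second is smaller in magnitude, which is what a positive coefficient on the squared term implies.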

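c) We prefer (b). The coefficient on dist2 is statistically significant, which is evidence against the purely linear specification in (a), and R-squared is higher in (b). Since R-squared rises whenever a regressor is added, the significance of dist2 is the more informative of the two. (b) also describes the relationship in more detail, because it allows the effect of distance to change as distance grows.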

d) (i)

Solving for regression (a)

Hispanic = 1, Female = 1, Tuition = \$950, Bytest = 58, Incomehi = 0, Ownhome = 0, DadColl = 1, MomColl = 1, Cue80 = 7.1, and Stwmfg80 = \$10.06. Tuition is measured in \$1000’s, so \$950 enters the regression as 0.95.

Ed = dist (-0.037) + 1 (0.143) + 58 (0.093) + 0.95 (-0.191) + 1 (0.362) + 0 (0.372) + 1 (0.571) + 1 (0.378) + 7.1 (0.029) + 10.06 (-0.043) + 8.921

Ed = dist (-0.037) + 15.361

So for distance = 20 miles

Ed = 2 (-0.037) + 15.361 = 15.287

So for distance = 30 miles

Ed = 3 (-0.037) + 15.361 = 15.250

So for distance = 60 miles

Ed = 6 (-0.037) + 15.361 = 15.139

So for distance = 70 miles

Ed = 7 (-0.037) + 15.361 = 15.102

Solving for regression (b)

The same values apply: Hispanic = 1, Female = 1, Tuition = 0.95 (that is, \$950), Bytest = 58, Incomehi = 0, Ownhome = 0, DadColl = 1, MomColl = 1, Cue80 = 7.1, and Stwmfg80 = \$10.06.

Ed = dist (-0.0811732) + dist2 (0.0046413) + 1 (0.143) + 58 (0.0926) + 0.95 (-0.1928) + 1 (0.333) + 0 (0.369) + 1 (0.561) + 1 (0.377) + 7.1 (0.0259) + 10.06 (-0.04255) + 9.0121

Ed = dist (-0.0811732) + dist2 (0.0046413) + 15.370

For distance = 20 miles

Ed = 2 (-0.0811732) + 2^2 (0.0046413) + 15.370 = 15.226

For distance = 30 miles

Ed = 3 (-0.0811732) + 3^2 (0.0046413) + 15.370 = 15.168

For distance = 60 miles

Ed = 6 (-0.0811732) + 6^2 (0.0046413) + 15.370 = 15.050

For distance = 70 miles

Ed = 7 (-0.0811732) + 7^2 (0.0046413) + 15.370 = 15.029

Plotted over Dist from 0 to 10, both (a) and (b) show predicted years of education falling as distance increases. (a) is a straight line, while (b) falls more steeply at short distances and flattens as distance grows. For a white male with the same characteristics only the constant term changes, so each curve shifts vertically and its shape is unchanged.
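The two curves can be tabulated before plotting. The sketch below computes only the part of each regression that depends on Dist, using the distance coefficients quoted in the text; the function names are introduced here, and the resulting rows can be passed to any plotting tool.

```python
# Coefficients on distance quoted in the solution text.
A_DIST = -0.037                            # regression (a): linear in dist
B_DIST, B_DIST2 = -0.0811732, 0.0046413    # regression (b): dist and dist squared


def dist_effect_a(dist: float) -> float:
    """Distance-dependent part of regression (a)."""
    return A_DIST * dist


def dist_effect_b(dist: float) -> float:
    """Distance-dependent part of regression (b)."""
    return B_DIST * dist + B_DIST2 * dist ** 2


def curve_points(max_dist: int = 10):
    """Rows of (dist, effect under (a), effect under (b)) for dist = 0..max_dist."""
    return [(d, dist_effect_a(d), dist_effect_b(d)) for d in range(max_dist + 1)]


print(f"{'Dist':>4} {'(a)':>8} {'(b)':>8}")
for d, a, b in curve_points():
    print(f"{d:>4} {a:>8.3f} {b:>8.3f}")
```

Because the constant term is left out, the table applies to any individual: changing the other characteristics shifts both curves by a constant without altering their shapes.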

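d) (ii)

Regression (b) behaves differently from (a) because the squared term is included: each additional 10 miles of distance lowers predicted years of education by less than the previous 10 miles did. The fitted quadratic reaches its minimum at roughly Dist = 8.7 (about 87 miles) and slopes upward beyond that point, so for Dist > 10 it predicts that education rises with distance.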
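The point at which the quadratic in (b) stops falling follows from setting its derivative with respect to Dist to zero. A short derivation using the coefficients quoted above, where $\hat\beta_1$ and $\hat\beta_2$ are labels introduced here for the coefficients on dist and dist2:

```latex
\frac{\partial \widehat{ED}}{\partial Dist} = \hat\beta_1 + 2\hat\beta_2\,Dist = 0
\quad\Longrightarrow\quad
Dist^{*} = -\frac{\hat\beta_1}{2\hat\beta_2}
         = \frac{0.0811732}{2 \times 0.0046413} \approx 8.74
```

Since $\hat\beta_2 > 0$, this is a minimum: the slope is negative below about 87 miles and positive above it, which covers the whole range Dist > 10.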

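e) The interaction term DadColl × MomColl has a negative and significant coefficient. This means the effect of one parent being a college graduate is smaller when the other parent is also a college graduate, so the combined effect of having two college-educated parents is less than the sum of the two separate effects.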
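To see what the interaction coefficient measures, it helps to write out the predicted difference in education relative to a student with neither parent a college graduate. In this sketch $\hat\beta_{dad}$, $\hat\beta_{mom}$ and $\hat\beta_{int}$ are labels introduced here for the coefficients on DadColl, MomColl and DadColl × MomColl; their numerical values come from the regression output, which is not reproduced in the text.

```latex
\begin{aligned}
\text{father only a graduate:}\quad & \Delta\widehat{ED} = \hat\beta_{dad} \\
\text{mother only a graduate:}\quad & \Delta\widehat{ED} = \hat\beta_{mom} \\
\text{both parents graduates:}\quad & \Delta\widehat{ED} = \hat\beta_{dad} + \hat\beta_{mom} + \hat\beta_{int}
\end{aligned}
```

A negative $\hat\beta_{int}$ therefore lowers the two-graduate effect below $\hat\beta_{dad} + \hat\beta_{mom}$; it does not by itself imply that having two college-educated parents reduces education.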

f) No. There is no evidence that the effect of Dist on ED depends on the family’s income, although both Dist and Incomehi individually have significant effects on ED.