R-squared Value Regression Model

Documentation for CollegeDistance Data

These data are taken from the High School and Beyond survey conducted by the
Department of Education in 1980, with a follow-up in 1986. The survey included
students from approximately 1100 high schools.
The data used here were supplied by Professor Cecilia Rouse of Princeton University and
were used in her paper “Democratization or Diversion? The Effect of Community
Colleges on Educational Attainment,” Journal of Business and Economic Statistics, April
1995, Vol. 12, No. 2, pp 217-224.
The data in CollegeDistance exclude students in the western states. The data in
CollegeDistanceWest include only students in the western states.

Series in Data Set

Name       Description
ed         Years of Education Completed (see below)
female     1 = Female / 0 = Male
black      1 = Black / 0 = Not Black
Hispanic   1 = Hispanic / 0 = Not Hispanic
bytest     Base Year Composite Test Score (these are achievement tests given to
           high school seniors in the sample)
dadcoll    1 = Father is a College Graduate / 0 = Father is not a College Graduate
momcoll    1 = Mother is a College Graduate / 0 = Mother is not a College Graduate
incomehi   1 = Family Income > $25,000 per year / 0 = Income ≤ $25,000 per year
ownhome    1 = Family Owns Home / 0 = Family Does not Own Home
urban      1 = School in Urban Area / 0 = School not in Urban Area
cue80      County Unemployment Rate in 1980
stwmfg80   State Hourly Wage in Manufacturing in 1980
dist       Distance from 4-year College, in tens of miles
tuition    Average State 4-year College Tuition, in thousands of dollars

Years of Education: Rouse computed years of education by assigning 12 years to all
members of the senior class. Each additional year of secondary education counted as
one year. Students with vocational degrees were assigned 13 years, AA degrees were
assigned 14 years, BA degrees were assigned 16 years, those with some graduate
education were assigned 17 years, and those with a graduate degree were assigned 18
years.

Solution 

Q1.

Yes, the last statement is true. Because the markets that received the increased marketing budget were chosen at random, and the number of regional markets is large, comparing average sales in those markets with average sales in the remaining markets gives an unbiased estimate of the true causal effect of increased marketing spending on sales.
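
The argument above can be illustrated with a small simulation. This is only a sketch under assumed numbers: the market count, the baseline sales distribution, and the true effect of 5.0 are hypothetical and do not come from the problem. Half of the markets are picked at random for treatment, and the difference in mean sales between the two groups is compared with the assumed true effect.

```python
import random
from statistics import mean


def simulate_experiment(n_markets=10_000, true_effect=5.0, seed=0):
    """Return the treated-minus-control difference in mean sales.

    All numbers here are hypothetical; they only illustrate the logic
    of random assignment, not the data behind the question.
    """
    rng = random.Random(seed)

    # Randomly pick half of the markets to receive the larger budget.
    markets = list(range(n_markets))
    rng.shuffle(markets)
    treated = set(markets[: n_markets // 2])

    treated_sales, control_sales = [], []
    for m in range(n_markets):
        # Baseline sales vary across markets, independently of treatment.
        baseline = rng.gauss(100.0, 10.0)
        if m in treated:
            treated_sales.append(baseline + true_effect)
        else:
            control_sales.append(baseline)

    return mean(treated_sales) - mean(control_sales)


estimate = simulate_experiment()
print(f"difference in means: {estimate:.3f} (assumed true effect: 5.0)")
```

Because assignment is random, baseline sales are on average the same in both groups, so the difference in means lands close to the assumed effect, and the gap shrinks as the number of markets grows.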

Q2.

(a)

(b)

The estimated intercept is 13.95586

The estimated slope is -0.07337

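Predicted years of completed schooling fall by 0.07337 years for each additional unit (10 miles) of distance between a student's high school and the nearest college. Equivalently, if colleges are built 10 miles closer to where students go to high school, average years of completed schooling are predicted to rise by 0.07337 years.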
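
Written out as an equation, the estimated regression and the effect of a change in distance look as follows. The symbols \widehat{ed} (predicted years of education) and \Delta (change) are notation introduced here for illustration; dist is measured in tens of miles, as in the data description.

```latex
\[
\widehat{ed} = 13.95586 - 0.07337 \, dist,
\qquad
\Delta \widehat{ed} = -0.07337 \, \Delta dist .
\]
\[
\Delta dist = -1 \ (\text{10 miles closer})
\;\Longrightarrow\;
\Delta \widehat{ed} = (-0.07337)(-1) = +0.07337 \ \text{years}.
\]
```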

(c)

Bob’s high school was 20 miles from the nearest college, so dist = 2. Using the estimated regression, Bob’s predicted years of completed education are 13.95586 – 0.07337 * 2 = 13.80912 years.

If Bob lived 10 miles from the nearest college (dist = 1), the prediction would increase by 0.07337, so the predicted years of completed education would be 13.95586 – 0.07337 * 1 = 13.88249 years.
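
The arithmetic in this part can be checked with a few lines of Python. This is a minimal sketch: the function name `predict_ed` is made up for illustration, and the coefficients are the estimates reported in part (b).

```python
# Coefficients reported in part (b).
INTERCEPT = 13.95586
SLOPE = -0.07337  # change in predicted years per 10 miles of distance


def predict_ed(dist_miles: float) -> float:
    """Predicted years of completed education for a given distance in miles."""
    dist_units = dist_miles / 10.0  # the regressor is in tens of miles
    return INTERCEPT + SLOPE * dist_units


for miles in (20, 10):
    print(f"{miles} miles: {predict_ed(miles):.5f} years")
```

Dividing by 10 before applying the slope matters: plugging 20 into the equation directly would overstate the effect of distance tenfold.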

(d)

The R-squared value for the regression model is 0.00745

Hence, distance to college explains less than 1% (about 0.7%) of the variation in educational attainment across individuals, which is not a large fraction.
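
For reference, R-squared is one minus the ratio of the sum of squared residuals to the total sum of squares. The sketch below shows that calculation on a tiny made-up sample. The numbers are hypothetical and are not the CollegeDistance data; `r_squared` is an illustrative helper, not a library function.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - SSR / TSS for actual values y and fitted values y_hat."""
    y_bar = sum(y) / len(y)
    ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # unexplained variation
    tss = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    return 1.0 - ssr / tss


# Hypothetical example: fitted values that barely move with the regressor
# leave almost all of the variation in y unexplained.
y_actual = [12.0, 16.0, 13.0, 18.0, 12.0]
y_fitted = [14.1, 14.3, 14.2, 14.4, 14.0]
print(f"R^2 = {r_squared(y_actual, y_fitted):.3f}")
```

A value near zero, like the 0.00745 reported here, means the fitted values explain very little of the spread in the outcome, even if the slope itself is estimated precisely.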

(e)

The standard error of the regression is 1.807 years.
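
The standard error of the regression (SER) is usually defined as below. This is the standard textbook formula with the n - 2 degrees-of-freedom correction for a regression with one regressor; it is assumed, not confirmed by the solution, that the reported 1.807 was computed this way. Here \hat{u}_i denotes the residual for student i.

```latex
\[
SER = \sqrt{\frac{1}{n-2} \sum_{i=1}^{n} \hat{u}_i^{2}},
\qquad
\hat{u}_i = ed_i - \widehat{ed}_i .
\]
```

Read this way, an SER of 1.807 means that actual years of education typically differ from the regression's prediction by roughly 1.8 years.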

Q3.

X       Y       X-Xbar    Y-Ybar    (X-Xbar)(Y-Ybar)  (Y-Ybar)^2  (X-Xbar)^2
22.69   49.08   -35.2208  -64.6842  2278.231          4184.046    1240.507
12.28   61.86   -45.6308  -51.9042  2368.432          2694.046    2082.173
97.59   167.19  39.67917  53.4258   2119.891          2854.316    1574.437
86.15   161.09  28.23917  47.3258   1336.441          2239.731    797.4507
110.21  111.82  52.29917  -1.9442   -101.68           3.779914    2735.203
80.72   190.05  22.80917  76.2858   1740.016          5819.523    520.2582
95.96   156.04  38.04917  42.2758   1608.559          1787.243    1447.739
17.48   29.51   -40.4308  -84.2542  3406.467          7098.77     1634.652
18      88.01   -39.9108  -25.7542  1027.871          663.2788    1592.874
8.72    12.15   -49.1908  -101.614  4998.487          10325.45    2419.738
67.5    153.29  9.58917   39.5258   379.0196          1562.289    91.95218
77.63   185.08  19.71917  71.3158   1406.288          5085.943    388.8457
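
The columns of this table are the building blocks of a least-squares fit: deviations from the means, their cross products, and their squares. The question text for Q3 is not shown, so the sketch below assumes the goal is the OLS slope and intercept of Y on X. It recomputes the column sums from the twelve (X, Y) pairs listed above; the variable names are illustrative.

```python
# (X, Y) pairs copied from the first two columns of the table.
x = [22.69, 12.28, 97.59, 86.15, 110.21, 80.72,
     95.96, 17.48, 18.0, 8.72, 67.5, 77.63]
y = [49.08, 61.86, 167.19, 161.09, 111.82, 190.05,
     156.04, 29.51, 88.01, 12.15, 153.29, 185.08]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Column sums: cross products and squared deviations from the means.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_yy = sum((yi - y_bar) ** 2 for yi in y)

# OLS estimates for a regression of Y on X (assumed to be what Q3 asks for).
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

print(f"n = {n}, X mean = {x_bar:.4f}, Y mean = {y_bar:.4f}")
print(f"sum of cross products = {s_xy:.3f}")
print(f"sum of squared X deviations = {s_xx:.3f}")
print(f"sum of squared Y deviations = {s_yy:.3f}")
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
```

Working from the raw pairs rather than the rounded table entries avoids accumulating rounding error in the sums.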