R-squared Value Regression
Documentation for CollegeDistance Data
These data are taken from the High School and Beyond survey conducted by the
Department of Education in 1980, with a follow-up in 1986. The survey included
students from approximately 1100 high schools.
The data used here were supplied by Professor Cecilia Rouse of Princeton University and
were used in her paper “Democratization or Diversion? The Effect of Community
Colleges on Educational Attainment,” Journal of Business and Economic Statistics, April
1995, Vol. 12, No. 2, pp 217-224.
The data in CollegeDistance exclude students in the western states. The data in
CollegeDistanceWest include only those students in the western states.
Series in Data Set
|ed||Years of Education Completed (See below)|
|female||1 = Female/0 = Male|
|black||1 = Black/0 = Not-Black|
|Hispanic||1 = Hispanic/0 = Not-Hispanic|
|bytest||Base Year Composite Test Score. (These are achievement tests given to high school seniors in the sample)|
|dadcoll||1 = Father is a College Graduate/ 0 = Father is not a College Graduate|
|momcoll||1 = Mother is a College Graduate/ 0 = Mother is not a College Graduate|
|incomehi||1 = Family Income > $25,000 per year/ 0 = Income ≤ $25,000 per year.|
|ownhome||1 = Family Owns Home / 0 = Family Does not Own Home|
|urban||1 = School in Urban Area / 0 = School not in Urban Area|
|cue80||County Unemployment rate in 1980|
|stwmfg80||State Hourly Wage in Manufacturing in 1980|
|dist||Distance from 4yr College in 10’s of miles|
|tuition||Avg. State 4yr College Tuition in $1000’s|
Years of Education: Rouse computed years of education by assigning 12 years to all
members of the senior class. Each additional year of secondary education counted as
one year. Students with vocational degrees were assigned 13 years, AA degrees were
assigned 14 years, BA degrees were assigned 16 years, those with some graduate
education were assigned 17 years, and those with a graduate degree were assigned 18
years.
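The assignment rule for years of education can be written as a simple lookup. This is only an illustrative sketch: the credential labels used as dictionary keys are hypothetical and do not appear in the data set, which stores the resulting number of years directly in the series ed.

```python
# Years of education assigned per highest credential, following the rule
# described in the documentation. The key names are hypothetical labels.
YEARS_BY_CREDENTIAL = {
    "high_school_senior": 12,
    "vocational_degree": 13,
    "aa_degree": 14,
    "ba_degree": 16,
    "some_graduate_education": 17,
    "graduate_degree": 18,
}


def years_of_education(credential: str) -> int:
    """Return the years of education assigned to a credential label."""
    return YEARS_BY_CREDENTIAL[credential]


print(years_of_education("aa_degree"))
```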
Yes, the last statement is true. Comparing average sales in the markets that received the increased marketing budget with average sales in the remaining markets gives an unbiased estimate of the true causal effect of increased marketing spending on sales, because the half of the markets that received the increase was selected at random and the number of regional markets is large.
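A small simulation can illustrate why random assignment makes the difference in means a sensible estimator. Everything in this sketch is an assumption made for illustration: the number of markets, the baseline sales distribution, the true effect of 5.0, and the function name are all hypothetical and are not taken from the original question.

```python
import random


def simulate_difference_in_means(n_markets=10_000, true_effect=5.0, seed=0):
    """Randomly treat half the markets and return the difference in mean sales."""
    rng = random.Random(seed)

    # Randomly choose half of the markets to receive the larger budget.
    markets = list(range(n_markets))
    rng.shuffle(markets)
    treated = set(markets[: n_markets // 2])

    treated_sales, control_sales = [], []
    for market in range(n_markets):
        # Hypothetical baseline sales, unrelated to treatment by construction.
        baseline = rng.gauss(100.0, 10.0)
        if market in treated:
            treated_sales.append(baseline + true_effect)
        else:
            control_sales.append(baseline)

    mean_treated = sum(treated_sales) / len(treated_sales)
    mean_control = sum(control_sales) / len(control_sales)
    return mean_treated - mean_control


estimate = simulate_difference_in_means()
print(round(estimate, 2))
```

With many markets, the printed estimate should land close to the assumed true effect, since treated and untreated markets differ only by chance in everything except the budget.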
The estimated intercept is 13.95586
The estimated slope is -0.07337
The average value of years of completed schooling is lower by 0.07337 years for each additional unit (10 miles) of distance between a student's high school and the nearest college. Equivalently, it is higher by 0.07337 years if colleges are built 10 miles closer to where the students go to high school.
Bob’s high school was 20 miles from the nearest college, so dist = 2. Using the estimated regression, Bob’s predicted years of completed education are 13.95586 – 0.07337 * 2 = 13.80912 years.
If Bob lived 10 miles from the nearest college, the prediction would increase by 0.07337, so the predicted years of completed education would be 13.95586 – 0.07337 * 1 = 13.88249 years.
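The two predictions for Bob can be reproduced from the estimated intercept and slope reported above. This is a minimal sketch; the function and constant names are chosen here for illustration, and the only inputs are the two coefficients from the text.

```python
# Estimated coefficients reported in the text.
INTERCEPT = 13.95586
SLOPE = -0.07337


def predict_years(distance_miles: float) -> float:
    """Predicted years of education for a given distance to college in miles."""
    # The regressor dist is measured in tens of miles.
    dist_units = distance_miles / 10.0
    return INTERCEPT + SLOPE * dist_units


print(round(predict_years(20), 5))  # 13.80912
print(round(predict_years(10), 5))  # 13.88249
```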
The R-squared value for the regression model is 0.00745
Hence, distance to college explains less than 1% of the variation in educational attainment across individuals.
The standard error of the regression is 1.807 years.
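For reference, the quantities reported above (intercept, slope, R-squared, and the standard error of the regression) can be computed for a single-regressor model as sketched below. The function name is hypothetical, the toy arrays are made-up numbers used only to exercise the code (they are not the CollegeDistance data), and the standard error is assumed to use the n - 2 degrees-of-freedom correction.

```python
import math


def simple_ols(x, y):
    """Return (intercept, slope, r_squared, ser) for a regression of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n

    # OLS slope and intercept for a single regressor.
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar

    # Sum of squared residuals and total sum of squares.
    ssr = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    tss = sum((yi - y_bar) ** 2 for yi in y)

    r_squared = 1.0 - ssr / tss
    # Standard error of the regression, assuming n - 2 degrees of freedom.
    ser = math.sqrt(ssr / (n - 2))
    return intercept, slope, r_squared, ser


# Toy inputs for illustration only.
toy_x = [0.0, 1.0, 2.0, 3.0]
toy_y = [1.0, 2.0, 2.0, 3.0]
print(simple_ols(toy_x, toy_y))
```

A low R-squared together with a standard error of the regression near 1.8 years, as in the results above, indicates that individual outcomes are spread widely around the fitted line.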