+1 (315) 557-6473 

Top-notch SAS Code Homework Help

If your SAS code homework is giving you sleepless nights, there is no need to panic. Keep calm because our experienced and highly qualified SAS code homework helpers have got your back. Send us the details of your homework and we will deliver excellent solutions on time. Our SAS code homework help service is highly rated by students because we never compromise on quality. Besides our rates are also affordable. Our SAS code homework tutors give you the best chance of scoring a decent grade.

Wisconsin COVID-19 Data By Census Tract Boundary

Data is updated at 2:00PM CDT daily. All data are laboratory-confirmed cases of COVID-19 that we freeze once a day to verify and ensure that we are reporting accurate information. The number of people with positive/negative test results includes only Wisconsin residents who had their results reported electronically to DHS. Here are descriptions of the variables in the data.

Variable Name Variable Description
GEOID Geographic ID
State  State
CENSUS_TRACT Census Tract Number
COUNTY County Name
DATE Last Date of Report
POSITIVE Number of Positive on COVID19 Test
NEGATIVE Number of Negative on COVID19 Test
DEATHS Number of Deaths by COVID19
HOSP_YES Number of Hospitalized by COVID19
HOSP_NO Number of Not Hospitalized by COVID19
HOSP_UNKNOWN Unknown Number of Hospitalized  by COVID19
AREA_LAND Land Area Size
AREA_WATER Water Area Size
POPULATION  Total Population
POP_LT18  Percent of Population that is Less Than 18 Years
POP_65P  Percent of Population that is 65 Years and Over
HOUS_NO_VEH  Percent of households with no vehicle available
ADULT_LIMITED_ENGLISH  Percent of adults 18 years and over who have limited English ability
ADULT_SPANISH_LENG  Percent of adults 18 years and over who speak Spanish and have limited English ability
POP_BELOWPOV  Percent of Population whose income in the past 12 months is below poverty level
POP_DISABILITY  Percent of Population with a Disability
POP_MEDICAD  Percent of Population with Medicaid/Means-Tested Public Coverage
POP_MEDICARE  Percent of Population with Medicare Coverage
POP_HEALTHINS  Percent of Population with No Health Insurance Coverage
HOUS_NOSMARTPHN  Percent of Households that Have No Smartphone
HOUS_NOINTERNET  Percent of Households with No Internet Access

SAS Code

Here is the SAS code to load the data into your SAS program.

filenamewebdaturl "http://bigblue.depaul.edu/jlee141/econdata/eco520/COVID19_WI.csv" ;

proc import datafile=webdat out = COVID19 DBMS = csv replace ; run ;

run ;

/*Select 500 randomly selected census tracts in WI using YourDePaulID */

proc surveyselect data= COVID19 method=srs seed= YourID

     N=500 out= MYCOVID19 ;

run;

proc contents data=MYCOVID19 ;

run ;

Use SAS code to answer the following questions using MYCOVID19 data

Descriptive Analytics Questions.

Use descriptive statistics and plots to answer the following questions. 

(Confirmed cases means the test results show positive from the COVID19 test.)

1) Find the average number of confirmed cases, hospitalized, and deaths per 1000 persons by Census tract.

2) Find the average number of confirmed cases, hospitalized, and deaths per 1000 persons by County.

3) Find the five highest Census tracts and the five highest counties in terms of the number of confirmed cases.

4) The five Census tracts that have the highest probability to be hospitalized among the confirmed cases.

5) Find the correlation coefficients relate to the number of confirmed cases with demographic, social status, and health care service variables. Find any nonlinear relationship using scatter plots.

6) Suppose you are working as a consultant for the state of Wisconsin and make some recommendation. According to the descriptive statistics, where are the most venerable areas that the gov’t agent need to spend their resources? Explain clearly.

 Hierarchical and Non-hierarchical Clustering Analysis 

1) Hierarchical Clustering Analysis for the Demographic clusters: POP_65P, POP_BELOWPOV, POP_DISABILITY

2) Non-Hierarchical Clustering Analysis for the Demographic clusters: POP_65P, POP_BELOWPOV, POP_DISABILITY

3) Compare the results 1) and 2), and define or name the clusters using descriptive statistics and plots.

4) Test if the clusters are significant to the number of confirmed cases per 1000 persons using the ANOVA test

5) Test if the clusters are significant to the number of hospitalized per 1000 persons using the ANOVA test.

Predictive Analytics using Regression Model 

Use only TRAIN data (80%) to estimate the models and use the TEST data (20%) to perform the out-of-sample prediction

Suppose you are trying to predict the number of COVID19 confirmed cases using the demographic variables from Census data. (none of the variables related to COVID19 can be used to predict the COVID19 related dependent variables).

1) Find the regression models to explain the variation of the number of confirmed cases per 1000 persons. You can use any variables such as nonlinear variables and cluster variables to make the best models.

Model 1: Your own choices of variables 1

Model 2: Your own choices of variables 2

Model 3: Stepwise

Model 4: adjusted R square

2) Perform the out of sample prediction using the observation that were not used in the estimation. Find the following statistics and compare the results. Which model is the best performing model in terms of the following statistics?

a. MSE (mean square error)

b. RMSE (root mean square error)

c. MPE (mean percentage error)

d. MAE (mean absolute error)

3) Do the regression analysis for the number of hospitalized per 1000 persons as the dependent variable following the step 1) and 2). Explain what you find.

Solution

Use SAS code to answer the following questions using MYCOVID19 data

4. Descriptive Analytics Questions.

Use descriptive statistics and plots to answer the following questions. ( 9 points).

(Confirmed cases means the test results show positive from the COVID19 test.)

7) Find the average number of confirmed cases, hospitalized, and deaths per 1000 persons by Census tract.

procmeansdata=mycovid19;

classcensus_tract;

var positive hosp_yes deaths;

outputout = _1mean(where=(_stat_='MEAN'));

run;

procprintdata=_1mean;

varcensus_tract positive hosp_yes deaths;

run;

8) Find the average number of confirmed cases, hospitalized, and deaths per 1000 persons by County.

procmeansdata=mycovid19 nway;

class county;

var positive HOSP_YES deaths;

outputout=_2mean(where=(_stat_='MEAN'));

run;

procprintdata=_2mean;

var county positive hosp_yes deaths;

run;

SAS Code

SAS Code 1

SAS Code 2

9) Find the five highest Census tracts and the five highest counties in terms of the number of confirmed cases.

procmeansdata=mycovid19 nway;

classcensus_tract;

var positive;

outputout = _3max(where=(_stat_='MAX'));

run;

procsortdata=_3max(obs=5);

bydescending positive;

run;

SAS Code 3

procmeansdata=mycovid19 nway;

class county;

var positive;

outputout = _3max1(where=(_stat_='MAX'));

run;

procsortdata=_3max1(obs=5);

bydescending positive;

SAS Code 4

10) The five Census tracts that have the highest probability to be hospitalized among the confirmed cases.

data mycovid19_1;

set mycovid19;

prob=hosp_yes/positive;

run;

procmeansdata=mycovid19_1 nway;

class county;

varprob;

outputout = _4max(where=(_stat_='MAX'));

run;

procsortdata=_4max(obs=5);

bydescendingprob;

run;

SAS Code 5

11) Find the correlation coefficients relate to the number of confirmed cases with demographic, social status, and health care service variables. Find any nonlinear relationship using scatter plots.

proccorrdata=mycovid19;

var positive population HOUS_NO_VEH ADULT_LIMITED_ENGLISH POP_BELOWPOV POP_MEDICAD POP_HEALTHINS;

run;

SAS Code 6

The correlation between confirmed cases and:

i. Population is 0.8604.

ii. Percent of households with no vehicle available is -0.187.

iii. Percent of Population whose income in the past 12 months is below poverty level is -0.2111.

iv. Percent of Population with Medicaid/Means-Tested Public Coverage is -0.1946

Scatter plots

i. There is a linear relationship between confirmed cases and Population

procsgplotdata=mycovid19;

scattery = positive x = population;

run;

SAS Code 7

ii. There is a linear relationship between confirmed cases and Percent of households with no vehicle available.

procsgplotdata=mycovid19;

scattery = positive x = HOUS_NO_VEH;

run;

SAS Code 8

iii. There is a linear relationship between confirmed cases and Percent of Population whose income in the past 12 months is below poverty level.

procsgplotdata=mycovid19;

scattery = positive x = POP_BELOWPOV;

run;

SAS Code 9

iv. There is a linear relationship between confirmed cases and Percent of Population with Medicaid/Means-Tested Public Coverage.

procsgplotdata=mycovid19;

scattery = positive x = POP_MEDICAD;

run;

SAS Code 10

12) Suppose you are working as a consultant for the state of Wisconsin and make some recommendation. According to the descriptive statistics, where are the most venerable areas that the gov’t agent need to spend their resources? Explain clearly.

Based on the COVID-19 data collected on the state of Wisconsin through the DHS, there seems to be a relationship between population size and number of confirmed cases. My suggestion is therefore, to closely monitor the areas with a large population, to make sure the occupants are adhering to the covid-19 protocols set by WHO. The census tracts that have the highest confirmed cases are 1, 2, 2.02, 2.02 and 2.05; and the counties with the highest confirmed cases are: Brown, Barron, Bayfiel, Adams and Ashland. These sites and areas should be closely monitored, and strict guidelines put in place to ensure no further spread.

5. Hierarchical and Non-hierarchical Clustering Analysis on ( 7 points)

6) Hierarchical Clustering Analysis for the Demographic clusters: POP_65P, POP_BELOWPOV, POP_DISABILITY

procclusterdata=mycovid19 method=ward;

var POP_65P POP_BELOWPOV POP_DISABILITY;

run;

SAS Code 11

7) Non-Hierarchical Clustering Analysis for the Demographic clusters: POP_65P, POP_BELOWPOV, POP_DISABILITY

procfastclusdata=mycovid19 maxclusters=10maxiter=100converge=0

mean=mean out=prelim;

var POP_65P POP_BELOWPOV POP_DISABILITY;

run;

SAS Code 12

8) Compare the results 1) and 2), and define or name the clusters using descriptive statistics and plots. 

SAS Code 13

9) Test if the clusters are significant to the number of confirmed cases per 1000 persons using the ANOVA test

10) Test if the clusters are significant to the number of hospitalized per 1000 persons using the ANOVA test.

6. Predictive Analytics using Regression Model (9 points)

Use only TRAIN data (80%) to estimate the models and use the TEST data (20%) to perform the out-of-sample prediction

Suppose you are trying to predict the number of COVID19 confirmed cases using the demographic variables from Census data. (none of the variables related to COVID19 can be used to predict the COVID19 related dependent variables).