# Regression using STATA for Development Impact Evaluation

Are you struggling with regression analysis in STATA? If so, this blog is for you. Regression analysis is a statistical method used to predict the value of a dependent variable from one or more independent variables. In STATA, the regress command performs regression analysis, and the variables and options you specify determine what the command reports. In this blog, we use regression to carry out a development impact evaluation in STATA. For this sample STATA homework, we use the dataset from the study by Tarozzi et al. (2014), which analyzed the impact of insecticide-treated bednets (ITNs) in Orissa, India. We used STATA to create a .do file that imports the data and keeps only the households present at baseline. The solution is only meant to give you an idea of what our professionals can do. Do not hesitate to contact us if you need assistance with statistics homework in this area. We will do everything possible to deliver your work on time.

**1. Baseline characteristics**

a.

| Experimental group | Prevalence |
|---|---|
| Control | 0.018 |
| Micro loans | 0.033 |
| Free | 0.019 |

At baseline, the prevalence of ITN use is 0.018 in the control group, 0.033 in the micro loans group and 0.019 in the free group. In other words, only 1.8% of the control group used insecticide-treated bednets at baseline, compared with 3.3% of the micro loans group and 1.9% of the free group.

b.

**Control vs Micro credit**

Is there a statistically significant difference in the prevalence of ITN use between the micro credit group and the control group?

The hypotheses are stated as

$H_0: \mu_{mc} = \mu_{control}$

$H_1: \mu_{mc} \neq \mu_{control}$

The result of the independent t-test is presented in the table below. The average prevalence of ITN use in the micro credit group (M = 0.033, SE = 0.003) is significantly different from that in the control group (M = 0.018, SE = 0.002); t(5759) = -3.76, p = .0002.

Two-sample t test with equal variances

| Group | Obs | Mean | Std. Err. | Std. Dev. | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| control | 2,897 | .0176044 | .0024437 | .1315313 | .0128128 | .0223961 |
| micro cr | 2,864 | .0331704 | .0033469 | .1791126 | .0266079 | .0397329 |
| combined | 5,761 | .0253428 | .0020708 | .1571778 | .0212832 | .0294024 |
| diff | | -.015566 | .004137 | | -.023676 | -.0074559 |

diff = mean(control) - mean(micro cr); t = -3.7626

Ho: diff = 0; degrees of freedom = 5759

Ha: diff != 0: Pr(|T| > |t|) = 0.0002; Ha: diff > 0: Pr(T > t) = 0.9999
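The figures in the table can be cross-checked by hand. The sketch below is a minimal illustration in Python (not the STATA code behind this solution) that recomputes the equal-variance t statistic from the group sizes, means and standard deviations reported above. The function name `pooled_t_test` is our own and is only an assumption for illustration.

```python
import math

def pooled_t_test(n1, mean1, sd1, n2, mean2, sd2):
    """Equal-variance two-sample t test computed from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    diff = mean1 - mean2
    return diff, se_diff, diff / se_diff, df

# Inputs copied from the t-test table (control first, then micro credit).
diff, se_diff, t_stat, df = pooled_t_test(
    2897, 0.0176044, 0.1315313,
    2864, 0.0331704, 0.1791126,
)
print(f"diff = {diff:.6f}, se = {se_diff:.6f}, t = {t_stat:.4f}, df = {df}")
```

With these inputs the function should return a difference of about -0.0156 and a t statistic of about -3.76 on 5,759 degrees of freedom, in line with the table.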

**Control vs Free**

Is there a statistically significant difference in the prevalence of ITN use between the free group and the control group?

The hypotheses are stated as

$H_0: \mu_{free} = \mu_{control}$

$H_1: \mu_{free} \neq \mu_{control}$

The result of the independent t-test is presented in the table below. The average prevalence of ITN use in the free group (M = 0.019, SE = 0.002) is not significantly different from that in the control group (M = 0.018, SE = 0.002); t(6080) = -0.54, p = .591.

c.

At baseline, the prevalence of malaria is 0.112 in the control group, 0.123 in the micro loans group and 0.114 in the free group. In other words, 11.2% of the control group tested positive on the malaria RDT at baseline, compared with 12.3% of the micro loans group and 11.4% of the free group.

d.

Is there a statistically significant difference in the prevalence of malaria across groups at baseline?

The hypotheses are stated as

$H_0: \mu_{free} = \mu_{mc} = \mu_{control}$

$H_1$: at least one of $\mu_{free}$, $\mu_{mc}$, $\mu_{control}$ differs from the others

The result showed that we cannot reject the null hypothesis that the prevalence of malaria is equal across arms.

**2. Endline ITN use**

A

At endline, the prevalence of ITN use is 0.022 in the control group, 0.162 in the micro loans group and 0.468 in the free group. In other words, about 2% of the control group used insecticide-treated bednets at endline, compared with about 16% of the micro loans group and about 47% of the free group.

B

The results of the t-tests of the following hypotheses are summarized in the table above.

$H_{0,1}: \mu_{mc} = \mu_{control}$

$H_{1,1}: \mu_{mc} \neq \mu_{control}$

$H_{0,2}: \mu_{free} = \mu_{control}$

$H_{1,2}: \mu_{free} \neq \mu_{control}$

$H_{0,3}: \mu_{free} = \mu_{mc}$

$H_{1,3}: \mu_{free} \neq \mu_{mc}$

The result showed that we reject the null hypothesis in all three cases, which means there is a significant difference in the prevalence of ITN use between the control and micro loans arms, between the control and free arms, and between the free and micro loans arms.

C

The result showed that the prevalence of ITN use is significantly higher in the micro credit arm than in the control arm, and also significantly higher in the free arm than in the control arm.

D

The result above showed that ITN take-up for female-headed households is significantly different from that for male-headed households (p = .01).

**3. Endline malaria prevalence**


A

At endline, the prevalence of malaria is 0.183 in the control group, 0.227 in the micro loans group and 0.220 in the free group. In other words, 18.3% of the control group tested positive on the malaria RDT at endline, compared with 22.7% of the micro loans group and 22.0% of the free group.

B

The results of the t-tests of the following hypotheses are summarized in the table above.

$H_{0,1}: \mu_{mc} = \mu_{control}$

$H_{1,1}: \mu_{mc} \neq \mu_{control}$

$H_{0,2}: \mu_{free} = \mu_{control}$

$H_{1,2}: \mu_{free} \neq \mu_{control}$

$H_{0,3}: \mu_{free} = \mu_{mc}$

$H_{1,3}: \mu_{free} \neq \mu_{mc}$

The result showed that we reject the null hypothesis for the first two hypotheses, which means there is a significant difference in malaria prevalence between the control and micro loans arms and between the control and free arms. We could not reject the null hypothesis for the third hypothesis, which means there is no significant difference in malaria prevalence between the free and micro loans arms at endline.

C

The result in C confirms the results in A and B: the estimated coefficients are significant, which indicates a significant difference in malaria prevalence between the micro loan and control groups as well as between the free and control groups.
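The agreement between the regression and the earlier comparisons is what one would expect, because the OLS coefficient on a single 0/1 treatment dummy equals the difference in group means. The sketch below illustrates this in Python with a tiny made-up dataset. The values in `treated` and `malaria` are hypothetical and are not taken from the study, and `ols_slope` is our own helper name.

```python
def ols_slope(x, y):
    """Slope of a simple OLS regression of y on x (with an intercept)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    return sxy / sxx

# Hypothetical toy data (NOT from the study): 1 = treated, 1 = malaria positive.
treated = [0, 0, 0, 0, 1, 1, 1, 1]
malaria = [0, 0, 1, 0, 1, 0, 1, 1]

slope = ols_slope(treated, malaria)

# Difference in group means computed directly.
mean_treated = sum(m for t, m in zip(treated, malaria) if t == 1) / treated.count(1)
mean_control = sum(m for t, m in zip(treated, malaria) if t == 0) / treated.count(0)
diff_in_means = mean_treated - mean_control

print(slope, diff_in_means)
```

In this toy example both quantities equal 0.5, so the regression coefficient and the difference in means carry the same information.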

D

Clustering the standard errors at the village level leads to the opposite conclusion from B. The result showed no significant difference in malaria prevalence between the control and free groups or between the control and micro loan groups.
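One way to see why clustering can remove significance is the design-effect (Moulton) approximation: when a regressor does not vary within clusters and clusters are of equal size m, the sampling variance is inflated by roughly 1 + (m - 1)ρ, where ρ is the intra-cluster correlation of the outcome. The sketch below is only an illustration of that approximation in Python. The naive standard error, cluster size and ρ are hypothetical values, not estimates from this dataset.

```python
import math

def design_effect(cluster_size, icc):
    """Approximate variance inflation for equal-sized clusters."""
    return 1 + (cluster_size - 1) * icc

def clustered_se(naive_se, cluster_size, icc):
    """Scale a naive standard error by the square root of the design effect."""
    return naive_se * math.sqrt(design_effect(cluster_size, icc))

# Hypothetical inputs (NOT from the study).
naive_se = 0.010      # standard error that ignores clustering
cluster_size = 26     # individuals per village
icc = 0.04            # intra-cluster correlation of the outcome

deff = design_effect(cluster_size, icc)
adjusted_se = clustered_se(naive_se, cluster_size, icc)
print(f"design effect = {deff:.2f}, adjusted se = {adjusted_se:.4f}")
```

Under these assumed inputs the variance doubles, so the standard error grows by a factor of about 1.41. A coefficient that looked significant with the naive standard error can therefore lose significance once clustering is taken into account.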

E

Adding female and age as control variables results in slightly lower estimated coefficients. The estimate for the micro credit group falls from 0.0439 to 0.04296, while the estimated coefficient for the free group falls from 0.0367 to 0.0355.

F

The regression result showed that there is a significant difference in malaria infection at endline between males and females (p < .001). Malaria prevalence is 3.73 percentage points higher for females than for males. This is evidence of heterogeneity in malaria infection by the gender of the individual.

G

The regression result showed that there is a significant difference in malaria infection at endline between female-headed and male-headed households (p < .001). Malaria prevalence is 10.3 percentage points higher in female-headed households than in male-headed households.

**4. Local average treatment effects**

A

Using the micro-loan arm could lead to non-compliance. The commitment to pay at a future date will discourage some of those assigned to the group from taking up the treatment, so the probability of non-compliance would be higher than if the bednets were given for free.

B

From the result, the difference in malaria prevalence between the two groups is 0.0351 (3.51 percentage points).

C

The compliance rate is the proportion of subjects who followed the assignment they were given, i.e. those in the control group who did not take the treatment plus those in the treatment group who did take it. The cross-tabulation that reveals this is shown below.

The cross-tab showed that 2,251 subjects assigned to the control group did not take up an ITN; these are the compliers on the control side. On the other hand, 1,219 subjects in the free group took up an ITN; these are the compliers on the treatment side. Thus,

$CR = \frac{2251 + 1219}{4829} = 0.7186$

The compliance rate is 71.86%.
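The arithmetic behind the compliance rate can be reproduced in a few lines. This is a minimal Python sketch using the counts quoted from the cross-tabulation. The variable names are our own.

```python
# Counts quoted from the cross-tabulation in the text.
control_no_itn = 2251   # assigned to control, did not take up an ITN
free_with_itn = 1219    # assigned to free, took up an ITN
total_subjects = 4829   # all subjects in the two arms

compliance_rate = (control_no_itn + free_with_itn) / total_subjects
print(f"Compliance rate: {compliance_rate:.4f}")
```

Rounded to four decimals, this gives the 0.7186 reported above.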

D

The Wald estimator is given as

$\hat{\beta}_{wald} = \frac{\bar{y}_1 - \bar{y}_0}{\bar{x}_1 - \bar{x}_0}$

$\hat{\beta}_{wald} = \frac{0.2200315 - 0.1849106}{0.4806782 - 0.0183166}$

$\hat{\beta}_{wald} = 0.076$
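The calculation above can be checked with a short Python sketch. It divides the difference in mean outcomes between the two arms (the intention-to-treat effect) by the difference in mean take-up (the first stage). The function name `wald_estimator` and the variable names are our own. The four inputs are the means quoted in the formula.

```python
def wald_estimator(y1, y0, x1, x0):
    """Ratio of the outcome difference to the take-up difference."""
    return (y1 - y0) / (x1 - x0)

# Means quoted in the formula: subscript 1 is the treatment arm, 0 is control.
y_treat, y_control = 0.2200315, 0.1849106   # mean outcome in each arm
x_treat, x_control = 0.4806782, 0.0183166   # mean take-up in each arm

itt = y_treat - y_control            # numerator: intention-to-treat effect
first_stage = x_treat - x_control    # denominator: effect of assignment on take-up
late = wald_estimator(y_treat, y_control, x_treat, x_control)

print(f"ITT = {itt:.4f}, first stage = {first_stage:.4f}, Wald = {late:.3f}")
```

The numerator is the 0.0351 difference reported in B, and the ratio comes out at roughly 0.076.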

E

The LATE from the instrumental variables regression model is 0.076 and is statistically significant.