# Quiz Statistics for Social Research

*************************************
* Statistics for Social Research *
* Example Questions for Quiz 5 *
*************************************

1. A researcher wants to examine whether playing STEM-related video games has a positive
causal effect on high school students’ interest in science. However, she is also concerned about
the spurious association between these two variables, because students with better math skills
are more likely to play this type of video game and these students also tend to have a greater
interest in science.
(a) There are three variables involved in this case. Please list the three variables.
(b) Draw a diagram to illustrate the relationship between these three variables.
(c) TRUE or FALSE: If this is a spurious association, we expect to see no association between
playing STEM-related video games and interest in science among students with the same level
of math skill. ____________
(d) Please propose an alternative “third variable” (i.e. a variable that plays a similar role as
math skill in this scenario) that may also lead to a spurious association between playing video
games and interest in science. Explain why. Use no more than 2 sentences.

2. A researcher is interested in whether a country’s gender equality policy affects women’s labor
force participation rate in the country. The two variables are defined below:
Y = Women’s labor force participation rate (i.e. the percentage of women in the labor force)
X1= Gender equality policy score (higher score indicates greater support for gender equality)
(a) Write down the bivariate regression function using Y as the response variable and X1 as the
explanatory variable.
(b) The researcher suspects that a third variable, namely the country’s GDP (in trillions), may also
affect women’s labor force participation rate. Let X2=GDP. Write down the multiple regression
function with Y, X1, and X2. Then draw a diagram to illustrate the relationship between these
variables.
(c) In the multiple regression function specified in (b), the coefficient on X1 is estimated to be
4.2, and the coefficient on X2 is estimated to be 1.8. Interpret the coefficient on X1.
(d) In the bivariate regression function specified in (a), the coefficient on X1 is estimated to be
5.6. Why do you think we get a smaller coefficient on X1 in the multiple regression than in the
bivariate regression? Use no more than 2 sentences.

Solution

Q1.
SPURIOUS RELATIONSHIPS:
Spurious relationship: two variables are associated, yet there is no causal relationship
between the two.
Example: shoe size and mortality rate.
Suppose initially we hypothesize (foolishly) that shoe size has a causal effect on the mortality rate. This association may instead be spurious: the two variables are associated, but neither one causes the other (no causality at all).

Q2.
MULTIPLE LINEAR REGRESSION:
EXAMPLE:
Every one-unit increase on the social support scale leads to a decrease of 0.392 on the depression
scale when we do not consider family dysfunction at all.
Every one-unit increase on the family dysfunction scale leads to an increase of 0.358 on the
depression scale when we do not consider social support at all.

A one-point increase in social support leads to a 0.07-point decrease in depression, when
controlling for family dysfunction.
A one-point increase in family dysfunction leads to a 0.338-point increase in depression, when
controlling for social support.
The effect of social support becomes much weaker when controlling for family dysfunction.

Q1.

(a)

The three variables involved in this study are:

1. Playing STEM-related video games
2. Interest in science
3. Math skills

(b)

Math skills → Playing STEM-related video games
Math skills → Interest in science

(c)

TRUE. If the association is spurious, math skill is the common cause of both variables: students with better math skills are more likely to play STEM-related video games, and these students also tend to have a greater interest in science. Once math skill is held constant, that is, among students with the same level of math skill, nothing remains to link the two variables, so we expect to see no association between playing STEM-related video games and interest in science.
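The reasoning in (c) can be checked with a small simulation. The sketch below is only an illustration under assumed conditions: the five math skill levels, the noise level, and the sample size are all made up, and playing video games is given no effect on interest in science by construction.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

random.seed(0)
levels = [1, 2, 3, 4, 5]          # hypothetical math skill levels
students = []
for _ in range(10_000):
    skill = random.choice(levels)
    # Math skill drives BOTH variables; games has no effect on interest.
    games = skill + random.gauss(0, 1)
    interest = skill + random.gauss(0, 1)
    students.append((skill, games, interest))

# Association ignoring math skill
overall = pearson([g for _, g, _ in students], [i for _, _, i in students])

# Association among students with the same level of math skill
within = {
    lvl: pearson(
        [g for s, g, _ in students if s == lvl],
        [i for s, _, i in students if s == lvl],
    )
    for lvl in levels
}

print(f"overall correlation: {overall:.2f}")
for lvl, r in within.items():
    print(f"math skill = {lvl}: correlation = {r:.2f}")
```

In this simulated setting the overall correlation is clearly positive, while the correlation within each math skill level is close to zero, which is the pattern the TRUE answer describes.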

(d)

A student’s general mental ability (IQ level) may also lead to a spurious association between playing video games and interest in science. Students with higher mental ability are more likely to play STEM-related video games and are also more likely to have a greater interest in science, so the two variables can be associated even if the games have no causal effect on interest in science.

Q2.

A researcher is interested in whether a country’s gender equality policy affects women’s labor force participation rate in the country. The two variables are defined below:

Y = Women’s labor force participation rate (i.e. the percentage of women in the labor force)

X1= Gender equality policy score (higher score indicates greater support for gender equality)

(a)

The bivariate regression function using Y as the response variable and X1 as the explanatory variable is:

Y = a + b * X1, where a is the predicted percentage of women in the labor force when the gender equality policy score is 0, and b is the predicted change (in percentage points) in the percentage of women in the labor force when the gender equality policy score increases by 1 unit.

(b)

The researcher suspects that a third variable, namely the country’s GDP (in trillions), may also affect women’s labor force participation rate.

Let X2 = GDP (in trillions).

The multiple regression function with Y, X1, and X2 is:

Y = c + d * X1 + e * X2, where c is the predicted percentage of women in the labor force when the gender equality policy score is 0 and the country’s GDP is 0; d is the predicted change in the percentage of women in the labor force when the gender equality policy score increases by 1 unit and the country’s GDP remains unchanged; and e is the predicted change in the percentage of women in the labor force when the country’s GDP increases by 1 trillion and the gender equality policy score remains unchanged.

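A diagram to illustrate the relationship between these variables is:

X1 (gender equality policy score) → Y (women’s labor force participation rate)
X2 (GDP) → Y (women’s labor force participation rate)

(c)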

In the multiple regression function specified in (b), the coefficient on X1 is estimated to be 4.2. This means that, holding the country’s GDP constant, a 1-unit increase in the gender equality policy score is associated with an increase of 4.2 percentage points, on average, in the percentage of women in the labor force.
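As a quick numeric illustration of "holding GDP constant", the sketch below compares two hypothetical countries with the same GDP whose policy scores differ by 1 unit. The intercept value and the example scores are assumptions chosen for illustration only (the quiz does not report an intercept); only the coefficients 4.2 and 1.8 come from the question.

```python
def predicted_rate(x1, x2, c=30.0, d=4.2, e=1.8):
    """Predicted women's labor force participation rate (percent).

    c is a made-up intercept; d and e are the estimates given in the quiz.
    """
    return c + d * x1 + e * x2

# Two hypothetical countries: same GDP (2 trillion), policy scores 5 and 6.
low = predicted_rate(x1=5, x2=2.0)
high = predicted_rate(x1=6, x2=2.0)
difference = high - low

print(f"predicted difference: {difference:.1f} percentage points")
```

The difference equals the coefficient on X1 whatever intercept or GDP value is plugged in, as long as GDP is the same for both countries.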

(d)

In the bivariate regression function specified in (a), the coefficient on X1 is estimated to be 5.6, whereas the coefficient on X1 in the multiple regression function specified in (b) is estimated to be 4.2. A likely reason is that the gender equality policy score (X1) is positively correlated with the country’s GDP (X2), and GDP has a positive effect on the percentage of women in the labor force (Y), so the bivariate coefficient on X1 also picks up part of the effect of GDP; controlling for X2 removes that part.
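The gap between 5.6 and 4.2 can be made concrete with the standard omitted-variable relationship for least-squares fits on the same sample: bivariate slope = d + e * delta, where delta is the slope from regressing X2 on X1. The symbol delta and its value below are not reported in the quiz; the sketch only backs out the value that the three given estimates would imply.

```python
b_bivariate = 5.6   # coefficient on X1 in the bivariate regression (from the quiz)
d_multiple = 4.2    # coefficient on X1 controlling for GDP (from the quiz)
e_gdp = 1.8         # coefficient on X2 = GDP (from the quiz)

# Implied slope of GDP on policy score (an inferred quantity, not given in the quiz)
implied_delta = (b_bivariate - d_multiple) / e_gdp

# Part of the bivariate coefficient that is really GDP's effect
absorbed_by_x1 = e_gdp * implied_delta
reconstructed = d_multiple + absorbed_by_x1

print(f"implied delta: {implied_delta:.3f}")
print(f"portion of 5.6 attributable to GDP: {absorbed_by_x1:.1f}")
```

Because e is positive, the bivariate coefficient exceeds the multiple regression coefficient exactly when delta is positive, that is, when countries with higher policy scores also tend to have higher GDP.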