# Multiple Regression Basics

In this exercise, we will use data from the file JobChars.csv. This is a .csv file with variable names in the header that you can easily pull into R using the read.csv function. If done correctly, you should have 12 variables and 1142 cases. The research context is a survey taken by employees at a large organization on one particular day. Here are the variables and their meaning in the order they appear in the text file:

Commute        Minutes spent traveling to work on the day in question

TaskVar           How much variety did you have in the types of tasks today?

LMX                 How well did you get along with your supervisor today?

JobSat              How satisfied were you with your job today?

Stress               How much stress did you experience from your job today?

sOCB               Supervisor ratings of focal employee helpfulness on that day

sJPERF             Supervisor ratings of focal employee job performance on that day

sCWB               Supervisor ratings of focal employee counterproductive work behaviors today

Part I: Multiple Regression Analysis

Once you’ve loaded the data into R, proceed with the following regression analyses. Below is some generic code for the lm function in R and the lm.beta package and function as well. In the code, DV stands for the Dependent Variable (aka outcome variable, criterion variable, or Y) and IV stands for Independent Variable (aka, predictor variable, or X), numbered here from 1 to 3 different predictors.

# This code runs multiple regression for 3 predictors

output1 <- lm(DV ~ IV1 + IV2 + IV3, data = mydata)

summary(output1)

# This code installs and loads the lm.beta package and obtains

# new output that includes standardized regression coefficients

# (betas)

install.packages("lm.beta")

library(lm.beta)

output2 <- lm.beta(output1)

summary(output2)

In this part of the exercise, supervisory-rated Organizational Citizenship Behavior (sOCB) is the dependent variable. Autonomy, TaskVar, TaskSig, TaskID, and Feedbk represent predictor variables that we are interested in evaluating (see research on the Job Characteristics Model by Hackman and Oldham).

1. Begin by running an analysis that regresses sOCB on Autonomy, TaskVar, TaskSig, TaskID, and Feedbk. From the output for the lm and lm.beta analyses, identify the standardized and unstandardized regression coefficients for each predictor and the overall level of prediction of this model.
2. Compare the magnitudes of the standardized regression coefficients with the magnitudes of the unstandardized regression coefficients. Do they differ? If so, to what extent and why?
3. Take a look at the correlations between all of the variables (i.e., using the cor(mydata, use = "pairwise.complete.obs") function). Compare the magnitudes of the standardized regression coefficients for each variable in the model with the zero-order correlations for the same variable with sOCB (i.e., the validity coefficients). Do they differ? If so, is there a pattern to the differences that you see? Why do they differ?
4. R2 and adjusted R2 are both reported in the output. How does the value of adj. R2 in this analysis differ from the value for R2 and what might you conclude from a comparison of the two values?
5. The square root of R2 is R (Multiple R) which represents the multiple correlation of all of the variables in the predictor model with the DV. If you summed the validity coefficients (from 1b), how does this value compare to Multiple R? Why are these values different?
6. Now try running the same set of predictors for a different DV. This time, use supervisory-rated counterproductive work behavior (sCWB) as the DV with the same set of predictors.
7. Does this model do a better or worse job of predicting the DV than the model in 1? How can you tell?
8. Compare the b-weights and beta-weights between the two models. Do the same IVs predict both outcomes? What appear to be the best predictors of the two criteria?
9. In many instances, we use multiple regression to run hierarchical analyses. In these analyses, we are primarily interested in the extent to which the addition of variables into a regression model increases the value of R2. The traditional order in which variables are added to the model is as follows: first step, control variables; second step, previously identified predictors; last step, new study variables. Run a hierarchical analysis for the sJPERF outcome variable. In this model, Commute and Stress are control variables, the set of 5 job characteristics (Autonomy, TaskVar, TaskSig, TaskID, and Feedbk) are previously identified predictors, and LMX is the new study variable. To complete the steps, run three separate regression analyses, adding the new variables at each step. Don’t forget to run the lm.beta function to obtain both b-weights and beta-weights. Keep track of the results of each step (e.g., by using different object names for the different models, such as step1 <-, step2 <-, and step3 <-).
10. By how much does the value of R2 change at each step? Run the anova(stepX, stepX+1) function to determine (and report here) if the steps are significantly different.
11. In the last model, interpret the beta-weight for LMX.
12. Finally, run the same set of variables in a hierarchical analysis, but reverse the order of the steps. So, LMX will be entered first, then the set of job characteristics, and finally, the control variables.
1. Look over the standardized and unstandardized output for these models as well as the R-squared information. Did you prefer to see how the new variable (LMX) changed after adding prior variables and control variables or was it more informative the way you did it in 3? Why or why not?

Solution

JobChars

library(dplyr)

library(ggplot2)
library(corrplot)
library(lm.beta)
library(statisticalModeling)

# Data preprocessing

# Load the data (file name taken from the exercise text)
df <- read.csv("JobChars.csv")
str(df)

## 'data.frame':    1142 obs. of  12 variables:
##  $ Commute : int  0 0 0 0 0 0 0 0 0 0 ...
##  $ Autonomy: num  1 2 2 2 2.22 2.22 2.44 2.44 2.56 2.89 ...
##  $ TaskVar : num  4 2 4 5 2 2 1 5 4 2 ...
##  $ TaskSig : num  4 3.67 4 4.33 2.67 2.67 1 5 5 3.67 ...
##  $ TaskID  : num  3.33 4 3 1.67 2 3.67 3 2 3.33 1.67 ...
##  $ Feedbk  : num  3.67 4 3 3 1 4 1.67 2 2 3 ...
##  $ LMX     : num  4 3.25 3.75 3.5 1 2 1 3.5 5 3.5 ...
##  $ JobSat  : num  5 3.33 4 2 3.67 3 2.33 4 4 2 ...
##  $ Stress  : num  2 2 2.33 4 3.67 2.67 3 2 2.67 3.33 ...
##  $ sOCB    : num  3 NA 5 5 NA NA NA 5.67 3 3 ...
##  $ sJPERF  : num  5 NA 5 4.25 NA NA NA 5 5 5 ...
##  $ sCWB    : num  1.33 NA 1 1.67 NA NA NA 1 1 1 ...

summary(df)

##     Commute          Autonomy        TaskVar         TaskSig
##  Min.   :  0.00   Min.   :1.000   Min.   :1.000   Min.   :1.000
##  1st Qu.:  2.00   1st Qu.:3.780   1st Qu.:3.000   1st Qu.:2.330
##  Median : 20.00   Median :4.000   Median :4.000   Median :3.000
##  Mean   : 27.06   Mean   :4.025   Mean   :3.662   Mean   :3.015
##  3rd Qu.: 39.25   3rd Qu.:4.560   3rd Qu.:4.000   3rd Qu.:4.000
##  Max.   :560.00   Max.   :5.000   Max.   :5.000   Max.   :5.000
##
##      TaskID          Feedbk           LMX            JobSat
##  Min.   :1.000   Min.   :1.000   Min.   :1.000   Min.   :1.00
##  1st Qu.:2.330   1st Qu.:2.000   1st Qu.:3.000   1st Qu.:3.33
##  Median :3.330   Median :3.000   Median :3.500   Median :4.00
##  Mean   :3.151   Mean   :2.707   Mean   :3.376   Mean   :3.74
##  3rd Qu.:4.000   3rd Qu.:3.330   3rd Qu.:4.000   3rd Qu.:4.00
##  Max.   :5.000   Max.   :5.000   Max.   :5.000   Max.   :5.00
##
##      Stress           sOCB           sJPERF           sCWB
##  Min.   :1.000   Min.   :1.000   Min.   :2.250   Min.   :1.000
##  1st Qu.:2.000   1st Qu.:3.330   1st Qu.:4.000   1st Qu.:1.000
##  Median :2.670   Median :4.000   Median :4.500   Median :1.000
##  Mean   :2.759   Mean   :4.071   Mean   :4.384   Mean   :1.104
##  3rd Qu.:3.330   3rd Qu.:5.000   3rd Qu.:5.000   3rd Qu.:1.000
##  Max.   :5.000   Max.   :6.000   Max.   :5.000   Max.   :4.000
##                  NA’s   :412     NA’s   :407     NA’s   :406

sum(is.na(df))

## [1] 1225

# Replace missing values with the column median
df <- apply(df, 2, function(x) {
  if (is.numeric(x)) ifelse(is.na(x), median(x, na.rm = TRUE), x) else x
}) %>% as.data.frame()
corrplot(cor(df))

Part1

Let’s construct the regression model.

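# Model definition reconstructed from the coefficient table below
model1 <- lm(sOCB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk, data = df)
output1 <- lm.beta(model1)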
summary(model1)

##
## Call:
## lm(formula = sOCB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk,
##     data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.3198 -0.3094 -0.0048  0.3712  2.3255
##
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.62822    0.18228  25.391  < 2e-16 ***
## Autonomy    -0.13041    0.03738  -3.489 0.000504 ***
## TaskVar      0.07683    0.03248   2.365 0.018183 *
## TaskSig     -0.23706    0.03215  -7.374 3.18e-13 ***
## TaskID       0.02472    0.02753   0.898 0.369302
## Feedbk       0.10999    0.03000   3.667 0.000257 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.8794 on 1136 degrees of freedom
## Multiple R-squared:  0.05498,    Adjusted R-squared:  0.05082
## F-statistic: 13.22 on 5 and 1136 DF,  p-value: 1.549e-12

summary(output1)

##
## Call:
## lm(formula = sOCB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk,
##     data = df)
##
## Residuals:
##     Min      1Q  Median      3Q     Max
## -3.3198 -0.3094 -0.0048  0.3712  2.3255
##
## Coefficients:
##             Estimate Standardized Std. Error t value Pr(>|t|)
## (Intercept)  4.62822      0.00000    0.18228  25.391  < 2e-16 ***
## Autonomy    -0.13041     -0.10480    0.03738  -3.489 0.000504 ***
## TaskVar      0.07683      0.07688    0.03248   2.365 0.018183 *
## TaskSig     -0.23706     -0.25399    0.03215  -7.374 3.18e-13 ***
## TaskID       0.02472      0.02770    0.02753   0.898 0.369302
## Feedbk       0.10999      0.12111    0.03000   3.667 0.000257 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.8794 on 1136 degrees of freedom
## Multiple R-squared:  0.05498,    Adjusted R-squared:  0.05082
## F-statistic: 13.22 on 5 and 1136 DF,  p-value: 1.549e-12

coef.lm.beta(output1)

## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000 -0.10479903  0.07688488 -0.25398935  0.02770263  0.12111179

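fmodel(model1)
# Column selection reconstructed from the printed correlation matrix
cor1 <- cor(df[, c("sOCB", "Autonomy", "TaskVar", "TaskSig", "TaskID", "Feedbk")],
            use = "pairwise.complete.obs")
print(cor1)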

##                 sOCB    Autonomy     TaskVar     TaskSig      TaskID
## sOCB      1.00000000 -0.09275147 -0.01291696 -0.17054121 -0.01000259
## Autonomy -0.09275147  1.00000000  0.16032774  0.05792463  0.21571056
## TaskVar  -0.01291696  0.16032774  1.00000000  0.40412507 -0.04357096
## TaskSig  -0.17054121  0.05792463  0.40412507  1.00000000  0.15846109
## TaskID   -0.01000259  0.21571056 -0.04357096  0.15846109  1.00000000
## Feedbk    0.02653020  0.06983023  0.25473028  0.44634538  0.23530669
##              Feedbk
## sOCB     0.02653020
## Autonomy 0.06983023
## TaskVar  0.25473028
## TaskSig  0.44634538
## TaskID   0.23530669
## Feedbk   1.00000000

corrplot(cor1)

The regression coefficients (“betas”) are effect sizes. In a simple linear regression, the coefficient is the slope of the regression line; in a multiple linear regression, it is the slope of the (hyper-)plane in the direction of the predictor. That is, the coefficient tells you how much the predicted value changes when the corresponding predictor increases by 1 unit, holding all other predictors constant. Interpretation always depends on the purpose, the research question, the design, and the wider context; there is no reliable interpretation without specifying a context.

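In this model, the standardized and unstandardized coefficients differ only slightly (for example, 0.07683 versus 0.07688 for TaskVar, and -0.23706 versus -0.25399 for TaskSig). A standardized coefficient is the unstandardized coefficient multiplied by the ratio of the predictor’s standard deviation to the outcome’s standard deviation, so the two are close whenever predictor and outcome are measured on similar scales with similar spread, as they are here. When predictors are correlated, the coefficients are estimated so that the explained effect is divided between them. If two predictors explain the same thing, any other split of the coefficient values that gives the same total effect fits almost equally well, so the individual estimates carry a large uncertainty.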
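The relation between b-weights and beta-weights can be checked by hand. The following is a minimal sketch on simulated data (the variable names, sample size, and effect sizes are illustrative assumptions, not values from JobChars.csv), using base R only: each slope is rescaled by sd(predictor) / sd(outcome) and compared with the slopes from a model fitted on z-scored variables.

```r
# Simulated data; names and effect sizes are illustrative assumptions
set.seed(1)
n  <- 200
x1 <- rnorm(n, mean = 3, sd = 1)
x2 <- 0.5 * x1 + rnorm(n, sd = 2)   # correlated with x1, larger spread
y  <- 1 + 0.4 * x1 - 0.2 * x2 + rnorm(n)
sim <- data.frame(y, x1, x2)

fit <- lm(y ~ x1 + x2, data = sim)
b   <- coef(fit)[-1]                # unstandardized slopes (b-weights)

# Rescale each slope by sd(predictor) / sd(outcome)
beta_manual <- b * sapply(sim[names(b)], sd) / sd(sim$y)

# Same quantity obtained by refitting on z-scored variables
fit_z  <- lm(y ~ x1 + x2, data = as.data.frame(scale(sim)))
beta_z <- coef(fit_z)[-1]

print(round(cbind(b, beta_manual, beta_z), 4))
```

The last two columns should agree, and the gap between b and beta is larger for x2 because its standard deviation is further from that of y.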

The model has a multiple R-squared of 0.05498 (adjusted R-squared 0.05082) and an overall p-value of 1.549e-12. The most significant predictors are Autonomy (p = 0.000504), TaskSig (p = 3.18e-13), and Feedbk (p = 0.000257). Both R2 and adjusted R2 describe how much of the variation in the dependent variable the regression accounts for. The main difference is that R2 never decreases when a predictor is added, whether or not that predictor is useful, whereas adjusted R2 applies a penalty for the number of predictors: adjusted R2 = 1 - (1 - R2)(n - 1)/(n - p - 1), with n cases and p predictors. Here the two values are close (0.05498 versus 0.05082) because n = 1142 is large relative to p = 5, so very little of the R2 is attributable to the sheer number of predictors.
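As a quick arithmetic check, adjusted R2 can be recomputed from the multiple R-squared reported in the summary(model1) output, the number of cases, and the number of predictors. This sketch only uses the standard adjustment formula and the numbers printed in that output.

```r
# Values read from the summary(model1) output
r2 <- 0.05498   # Multiple R-squared
n  <- 1142      # number of cases
p  <- 5         # number of predictors

# Standard adjustment: penalize R2 for the number of predictors
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)
print(round(adj_r2, 5))
```

Up to rounding of the inputs, the result should match the adjusted R-squared shown in the output.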

R-square, also known as the coefficient of determination, is a commonly used statistic for evaluating model fit. R-square is 1 minus the ratio of residual variability to total variability. When the variability of the residuals around the regression line is small relative to the overall variability, the predictions from the regression equation are good. For example, if there is no relationship between the X and Y variables, the ratio of the residual variability of Y to the original variance equals 1.0, and R-square is 0. If X and Y are perfectly related, there is no residual variance, the ratio is 0.0, and R-square = 1. In most cases, the ratio and R-square fall somewhere between these extremes, that is, between 0.0 and 1.0. The value is directly interpretable. With an R-square of 0.05498, the variability of the Y values around the regression line is 1 - 0.05498 = 0.945 times the original variance; in other words, the model explains about 5.5% of the original variability and leaves about 94.5% as residual variability. Ideally, we would like to explain most if not all of the original variability. The R-square value indicates how well the model fits the data (an R-square close to 1.0 means that the variables in the model account for almost all of the variability).
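The definition of R-square as 1 minus the ratio of residual to total variability can be reproduced directly. This is a small sketch on simulated data (the data-generating values are arbitrary assumptions), comparing the hand-computed value with the one reported by summary().

```r
# Simulated data; the slope and noise level are arbitrary assumptions
set.seed(2)
x <- rnorm(100)
y <- 2 + 0.5 * x + rnorm(100)
fit <- lm(y ~ x)

ss_total    <- sum((y - mean(y))^2)      # variability of y around its mean
ss_residual <- sum(residuals(fit)^2)     # variability left around the line
r2_manual   <- 1 - ss_residual / ss_total

print(c(manual = r2_manual, from_summary = summary(fit)$r.squared))
```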

Standardized partial coefficients have the same interpretation as unstandardized partial coefficients, except that the units are standard deviations rather than arbitrary units. A standardized partial coefficient is therefore the number of standard deviations by which the outcome increases for every one standard deviation increase in the predictor, holding all other predictors constant. Most researchers understand standard deviations as a unit of measurement, which in theory makes standardized partial coefficients easier to interpret than their unstandardized counterparts. In our example, the standardized coefficients differ from the zero-order correlations (for example, TaskSig: beta = -0.254 versus r = -0.171; Feedbk: beta = 0.121 versus r = 0.027), because the predictors are correlated with one another. Researchers are familiar with correlations: they range from -1 to 1, and researchers know what values are considered weak versus strong in their field. Indeed, the standardized coefficient from a simple regression is the (zero-order) correlation between the predictor and the outcome. In multiple regression, however, standardized partial coefficients are not on the correlation metric. The added phrase “while holding all other predictors constant” changes the interpretation. Standardized partial coefficients are not bounded by -1 and +1, so the familiar benchmarks for correlations do not apply directly, and it is less clear what counts as a weak versus a strong standardized partial coefficient.
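The difference between the two metrics can be illustrated with a small simulation. The sketch below (simulated, overlapping predictors; all names and values are assumptions for illustration) shows that the standardized slope from a simple regression equals the zero-order correlation, while the standardized partial slope from a multiple regression does not.

```r
# Simulated data with overlapping predictors (illustrative assumptions)
set.seed(3)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.6 * x1 + rnorm(n, sd = 0.8)
y  <- 0.5 * x1 + 0.3 * x2 + rnorm(n)
z  <- as.data.frame(scale(data.frame(y, x1, x2)))   # z-score everything

beta_simple   <- coef(lm(y ~ x1, data = z))[["x1"]]       # one predictor
beta_multiple <- coef(lm(y ~ x1 + x2, data = z))[["x1"]]  # partial slope
r_zero_order  <- cor(z$y, z$x1)

print(round(c(r = r_zero_order,
              simple = beta_simple,
              multiple = beta_multiple), 3))
```

Because x2 shares variance with x1, part of what the zero-order correlation attributes to x1 is assigned to x2 once both are in the model.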

Part2

Let’s construct the regression model.

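# Model definition reconstructed from the coefficient table below
model2 <- lm(sCWB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk, data = df)
output2 <- lm.beta(model2)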
summary(model2)

##
## Call:
## lm(formula = sCWB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk,
##     data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.21435 -0.09156 -0.06395 -0.02532  2.95056
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  1.005e+00  5.982e-02  16.794  < 2e-16 ***
## Autonomy    -8.521e-03  1.227e-02  -0.695 0.487450
## TaskVar      3.343e-02  1.066e-02   3.136 0.001755 **
## TaskSig     -4.052e-02  1.055e-02  -3.841 0.000129 ***
## TaskID       3.059e-02  9.033e-03   3.387 0.000732 ***
## Feedbk      -9.880e-06  9.844e-03  -0.001 0.999199
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.2886 on 1136 degrees of freedom
## Multiple R-squared:  0.02248,    Adjusted R-squared:  0.01818
## F-statistic: 5.225 on 5 and 1136 DF,  p-value: 9.487e-05

summary(output2)

##
## Call:
## lm(formula = sCWB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk,
##     data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -0.21435 -0.09156 -0.06395 -0.02532  2.95056
##
## Coefficients:
##               Estimate Standardized Std. Error t value Pr(>|t|)
## (Intercept)  1.005e+00    0.000e+00  5.982e-02  16.794  < 2e-16 ***
## Autonomy    -8.521e-03   -2.122e-02  1.227e-02  -0.695 0.487450
## TaskVar      3.343e-02    1.037e-01  1.066e-02   3.136 0.001755 **
## TaskSig     -4.052e-02   -1.346e-01  1.055e-02  -3.841 0.000129 ***
## TaskID       3.059e-02    1.062e-01  9.033e-03   3.387 0.000732 ***
## Feedbk      -9.880e-06   -3.372e-05  9.844e-03  -0.001 0.999199
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.2886 on 1136 degrees of freedom
## Multiple R-squared:  0.02248,    Adjusted R-squared:  0.01818
## F-statistic: 5.225 on 5 and 1136 DF,  p-value: 9.487e-05

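fmodel(model2)
# Column selection reconstructed from the printed correlation matrix
cor2 <- cor(df[, c("sCWB", "Autonomy", "TaskVar", "TaskSig", "TaskID", "Feedbk")],
            use = "pairwise.complete.obs")
print(cor2)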

##                 sCWB   Autonomy     TaskVar     TaskSig      TaskID
## sCWB      1.00000000 0.01052185  0.04126681 -0.07706961  0.07580980
## Autonomy  0.01052185 1.00000000  0.16032774  0.05792463  0.21571056
## TaskVar   0.04126681 0.16032774  1.00000000  0.40412507 -0.04357096
## TaskSig  -0.07706961 0.05792463  0.40412507  1.00000000  0.15846109
## TaskID    0.07580980 0.21571056 -0.04357096  0.15846109  1.00000000
## Feedbk   -0.01016658 0.06983023  0.25473028  0.44634538  0.23530669
##               Feedbk
## sCWB     -0.01016658
## Autonomy  0.06983023
## TaskVar   0.25473028
## TaskSig   0.44634538
## TaskID    0.23530669
## Feedbk    1.00000000

corrplot(cor2)
coef.lm.beta(output2)

##   (Intercept)      Autonomy       TaskVar       TaskSig        TaskID
##  0.000000e+00 -2.122136e-02  1.036864e-01 -1.345619e-01  1.062359e-01
##        Feedbk
## -3.371585e-05

The model has a multiple R-squared of 0.02248 (adjusted R-squared 0.01818) and a p-value of 9.487e-05. The significant predictors are TaskSig (p = 0.000129), TaskID (p = 0.000732), and TaskVar (p = 0.001755). This model predicts its DV worse than the previous one: its R-squared is lower (0.02248 versus 0.05498). The coefficients also differ between the two models. TaskSig and TaskVar are significant predictors of both outcomes, Autonomy and Feedbk only of sOCB, and TaskID only of sCWB. Judged by the standardized coefficients, TaskSig is the strongest predictor in both models (-0.254 for sOCB, -0.135 for sCWB). The beta weights in the second model are small, meaning the regression plane is only slightly inclined, and the zero-order correlations with sCWB are weak as well. The predictors therefore explain very little of the variation in sCWB.

Part3

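# Intercept-only baseline (assumed, by analogy with modelp4.0 in Part4)
modelp3.0 <- lm(sJPERF ~ 1, data = df)
modelp3.1 <- lm(sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX, data = df)
modelp3.2 <- lm(sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX + Commute, data = df)
modelp3.3 <- lm(sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX + Commute + Stress, data = df)

anova(modelp3.0) # Sum sq 316.95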

## Analysis of Variance Table
##
## Response: sJPERF
##             Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 1141 316.95 0.27778

summary(modelp3.1)

##
## Call:
## lm(formula = sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID +
##     Feedbk + LMX, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.07563 -0.31562  0.08942  0.39341  0.90979
##
## Coefficients:
##              Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.891296   0.111118  35.019  < 2e-16 ***
## Autonomy     0.045185   0.022026   2.051   0.0405 *
## TaskVar     -0.040926   0.018938  -2.161   0.0309 *
## TaskSig      0.092087   0.018736   4.915 1.02e-06 ***
## TaskID      -0.034106   0.016041  -2.126   0.0337 *
## Feedbk       0.008746   0.018421   0.475   0.6350
## LMX          0.091278   0.018221   5.010 6.32e-07 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.5123 on 1135 degrees of freedom
## Multiple R-squared:  0.06031,    Adjusted R-squared:  0.05535
## F-statistic: 12.14 on 6 and 1135 DF,  p-value: 2.895e-13

anova(modelp3.1) #Residuals 297.832

## Analysis of Variance Table
##
## Response: sJPERF
##             Df  Sum Sq Mean Sq F value    Pr(>F)
## Autonomy     1   1.748  1.7479  6.6612  0.009978 **
## TaskVar      1   0.217  0.2170  0.8270  0.363343
## TaskSig      1   8.395  8.3946 31.9908 1.959e-08 ***
## TaskID       1   0.934  0.9336  3.5580  0.059515 .
## Feedbk       1   1.238  1.2377  4.7166  0.030079 *
## LMX          1   6.585  6.5853 25.0959 6.322e-07 ***
## Residuals 1135 297.832  0.2624
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

lm.beta(modelp3.1)

##
## Call:
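## lm(formula = sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID +
##     Feedbk + LMX, data = df)
##
## Standardized Coefficients::
## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000  0.06218675 -0.07014089  0.16897361 -0.06545282  0.01649208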
##         LMX
##  0.15614207

fmodel(modelp3.1)
effect_size(modelp3.1, ~Autonomy)

## 1 0.04518537        4    4.725357       4       3   3.33      3 3.5

summary(modelp3.2)

##
## Call:
## lm(formula = sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID +
##     Feedbk + LMX + Commute, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.02893 -0.31059  0.09081  0.39319  0.91726
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  3.9569673  0.1132964  34.926  < 2e-16 ***
## Autonomy     0.0434907  0.0219700   1.980  0.04800 *
## TaskVar     -0.0423576  0.0188897  -2.242  0.02513 *
## TaskSig      0.0875350  0.0187529   4.668 3.41e-06 ***
## TaskID      -0.0350189  0.0159973  -2.189  0.02880 *
## Feedbk       0.0090268  0.0183668   0.491  0.62319
## LMX          0.0895906  0.0181774   4.929 9.51e-07 ***
## Commute     -0.0011849  0.0004274  -2.772  0.00566 **
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.5108 on 1134 degrees of freedom
## Multiple R-squared:  0.06664,    Adjusted R-squared:  0.06088
## F-statistic: 11.57 on 7 and 1134 DF,  p-value: 2.951e-14

anova(modelp3.2) #Residuals 295.827

## Analysis of Variance Table
##
## Response: sJPERF
##             Df  Sum Sq Mean Sq F value    Pr(>F)
## Autonomy     1   1.748  1.7479  6.7004  0.009762 **
## TaskVar      1   0.217  0.2170  0.8318  0.361934
## TaskSig      1   8.395  8.3946 32.1793 1.783e-08 ***
## TaskID       1   0.934  0.9336  3.5789  0.058772 .
## Feedbk       1   1.238  1.2377  4.7444  0.029599 *
## LMX          1   6.585  6.5853 25.2437 5.866e-07 ***
## Commute      1   2.005  2.0051  7.6863  0.005655 **
## Residuals 1134 295.827  0.2609
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

lm.beta(modelp3.2)

##
## Call:
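## lm(formula = sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID +
##     Feedbk + LMX + Commute, data = df)
##
## Standardized Coefficients::
## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000  0.05985439 -0.07259477  0.16062076 -0.06720536  0.01702208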
##         LMX     Commute
##  0.15325597 -0.08032231

fmodel(modelp3.2)
summary(modelp3.3)

##
## Call:
## lm(formula = sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID +
##     Feedbk + LMX + Commute + Stress, data = df)
##
## Residuals:
##      Min       1Q   Median       3Q      Max
## -2.02023 -0.30888  0.08742  0.38835  0.91664
##
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)
## (Intercept)  4.1744204  0.1261053  33.103  < 2e-16 ***
## Autonomy     0.0321884  0.0220368   1.461 0.144384
## TaskVar     -0.0300414  0.0190497  -1.577 0.115075
## TaskSig      0.0905227  0.0186570   4.852 1.39e-06 ***
## TaskID      -0.0472796  0.0162203  -2.915 0.003629 **
## Feedbk       0.0110204  0.0182643   0.603 0.546375
## LMX          0.0855076  0.0181001   4.724 2.60e-06 ***
## Commute     -0.0010335  0.0004267  -2.422 0.015579 *
## Stress      -0.0663895  0.0173233  -3.832 0.000134 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1
##
## Residual standard error: 0.5077 on 1133 degrees of freedom
## Multiple R-squared:  0.07858,    Adjusted R-squared:  0.07208
## F-statistic: 12.08 on 8 and 1133 DF,  p-value: < 2.2e-16

anova(modelp3.3) #Residuals 292.042

## Analysis of Variance Table
##
## Response: sJPERF
##             Df  Sum Sq Mean Sq F value    Pr(>F)
## Autonomy     1   1.748  1.7479  6.7813 0.0093324 **
## TaskVar      1   0.217  0.2170  0.8419 0.3590537
## TaskSig      1   8.395  8.3946 32.5677 1.469e-08 ***
## TaskID       1   0.934  0.9336  3.6221 0.0572693 .
## Feedbk       1   1.238  1.2377  4.8017 0.0286353 *
## LMX          1   6.585  6.5853 25.5484 5.027e-07 ***
## Commute      1   2.005  2.0051  7.7790 0.0053741 **
## Stress       1   3.786  3.7857 14.6871 0.0001339 ***
## Residuals 1133 292.042  0.2578
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

lm.beta(modelp3.3)

##
## Call:
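## lm(formula = sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID +
##     Feedbk + LMX + Commute + Stress, data = df)
##
## Standardized Coefficients::
## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000  0.04429960 -0.05148655  0.16610295 -0.09073515  0.02078143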
##         LMX     Commute      Stress
##  0.14627158 -0.07005963 -0.11625054

fmodel(modelp3.3)
anova(modelp3.1, modelp3.2, modelp3.3)

## Analysis of Variance Table
##
## Model 1: sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX
## Model 2: sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX +
##     Commute
## Model 3: sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX +
##     Commute + Stress
##   Res.Df    RSS Df Sum of Sq      F    Pr(>F)
## 1   1135 297.83
## 2   1134 295.83  1    2.0051  7.779 0.0053741 **
## 3   1133 292.04  1    3.7857 14.687 0.0001339 ***
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

Here, we can see that each successive model is significant above and beyond the previous one. This suggests that each predictor added along the way is making an important contribution to the overall model.

When we use anova() with a single model, it shows analysis of variance for each variable. However, when we use anova() with multiple models, it does model comparisons.
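The model comparison that anova() performs for nested models can be reproduced by hand. The sketch below uses simulated data and the hypothetical object names step1 and step2 (nothing here comes from JobChars.csv): the F statistic for the added block is the drop in residual sum of squares per added degree of freedom, divided by the residual mean square of the larger model.

```r
# Simulated data; step1/step2 are hypothetical names for two nested steps
set.seed(4)
n  <- 300
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 0.4 * x1 + 0.2 * x2 + rnorm(n)

step1 <- lm(y ~ x1)
step2 <- lm(y ~ x1 + x2)

cmp <- anova(step1, step2)   # model comparison done by R

# Same test by hand
rss1 <- sum(residuals(step1)^2)
rss2 <- sum(residuals(step2)^2)
df_added <- df.residual(step1) - df.residual(step2)
f_manual <- ((rss1 - rss2) / df_added) / (rss2 / df.residual(step2))
p_manual <- pf(f_manual, df_added, df.residual(step2), lower.tail = FALSE)

print(cmp)
print(c(F = f_manual, p = p_manual))
```

When more than two models are passed to anova(), R uses the residual mean square of the largest model as the denominator for every step, which is why the F values in a three-model table can differ slightly from those in the single-model tables.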

Model 0: SSTotal = 316.95 (no predictors)

Model 1: SSResidual = 297.83 (after adding the job characteristics and LMX)

Model 2: SSResidual = 295.83, SSDifference = 2.0051, F(1, 1133) = 7.779, p = 0.0053741 (after adding Commute)

Model 3: SSResidual = 292.04, SSDifference = 3.7857, F(1, 1133) = 14.687, p = 0.0001339 (after adding Stress)

By adding Commute, the model accounts for an additional sum of squares of 2.0051, a statistically significant change according to the corresponding F-statistic and p-value. R2 increased by about 0.6 percentage points (2.0051 / 316.95 = 0.0063) in Model 2. By adding Stress, the model accounts for an additional sum of squares of 3.7857, again a statistically significant change. R2 increased by about 1.2 percentage points (3.7857 / 316.95 = 0.0119) in Model 3.

Aside from the coefficients, let’s look at the R2 of Models 1, 2, and 3 in the lm() summaries: 0.06031, 0.06664, and 0.07858 respectively (the adjusted values are 0.05535, 0.06088, and 0.07208). The R2 changes computed from the anova() results correspond to the differences in R2 between the models: 0.06664 - 0.06031 = 0.00633 for Model 2 and 0.07858 - 0.06664 = 0.01194 for Model 3. Although we can compute R2 differences between models from the lm() results, those results do not provide F-statistics and p-values for the increase in R2. It is also important to remember that adding variables never decreases R2, whether or not they explain meaningful additional variation in the DV. That is why it is crucial to run F-tests rather than rely on the difference in R2 alone.
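The R2 values, their changes, and the F statistics for the sJPERF models can all be recovered from the sums of squares printed in the anova() tables. This sketch only re-does that arithmetic with the reported (rounded) numbers, so small rounding differences are expected.

```r
# Numbers copied from the anova() output for the sJPERF models
ss_total <- 316.95                      # intercept-only model
rss      <- c(297.83, 295.83, 292.04)   # residual SS, Models 1 to 3
res_df   <- c(1135, 1134, 1133)         # residual df, Models 1 to 3
ss_diff  <- c(2.0051, 3.7857)           # SS added by Commute, then Stress

r2       <- 1 - rss / ss_total          # R2 of each model
delta_r2 <- ss_diff / ss_total          # change in R2 at each step

# anova() with several models divides by the largest model's residual mean square
ms_resid_largest <- rss[3] / res_df[3]
f_change <- ss_diff / ms_resid_largest

print(round(r2, 5))
print(round(delta_r2, 5))
print(round(f_change, 3))
```

The R2 values should agree with the Multiple R-squared lines of the three summaries, and the F values with the three-model anova() table, up to rounding.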

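The beta weight is -0.07 for Commute and -0.12 for Stress. The beta weight for LMX in the last model is 0.146: holding the job characteristics and the control variables constant, an increase of one standard deviation in LMX is associated with an increase of about 0.15 standard deviations in supervisor-rated job performance.

### let’s visualize the angle of inclination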

fmodel(modelp3.2, ~Commute)
fmodel(modelp3.3, ~Stress)
fmodel(modelp3.3, ~Commute + Stress)

Part4

modelp4.0<-lm(LMX ~1, data =df)

anova(modelp4.0)

## Analysis of Variance Table
##
## Response: LMX
##             Df Sum Sq Mean Sq F value Pr(>F)
## Residuals 1141 927.47 0.81286

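# modelp4.1 is reconstructed from the anova table below
modelp4.1 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk, data = df)
modelp4.2 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute, data = df)
modelp4.3 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute + Stress, data = df)

anova(modelp4.1) #Residuals 790.40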

## Analysis of Variance Table
##
## Response: LMX
##             Df Sum Sq Mean Sq  F value    Pr(>F)
## Autonomy     1  26.72  26.716  38.3971 8.059e-10 ***
## TaskVar      1  13.10  13.099  18.8269 1.559e-05 ***
## TaskSig      1   8.23   8.225  11.8217 0.0006066 ***
## TaskID       1   1.05   1.053   1.5133 0.2188910
## Feedbk       1  87.97  87.971 126.4354 < 2.2e-16 ***
## Residuals 1136 790.40   0.696
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

lm.beta(modelp4.1)

##
## Call:
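## lm(formula = LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk,
##     data = df)
##
## Standardized Coefficients::
## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000  0.14638296  0.04433831 -0.03565781 -0.02936169  0.35268875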

fmodel(modelp4.1)
anova(modelp4.2) #Residuals 789.52

## Analysis of Variance Table
##
## Response: LMX
##             Df Sum Sq Mean Sq  F value    Pr(>F)
## Autonomy     1  26.72  26.716  38.4064 8.024e-10 ***
## TaskVar      1  13.10  13.099  18.8315 1.555e-05 ***
## TaskSig      1   8.23   8.225  11.8246 0.0006057 ***
## TaskID       1   1.05   1.053   1.5137 0.2188358
## Feedbk       1  87.97  87.971 126.4659 < 2.2e-16 ***
## Commute      1   0.89   0.886   1.2735 0.2593413
## Residuals 1135 789.52   0.696
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

lm.beta(modelp4.2)

##
## Call:
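## lm(formula = LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk +
##     Commute, data = df)
##
## Standardized Coefficients::
## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000  0.14531312  0.04333566 -0.03886166 -0.03000938  0.35249928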
##     Commute
## -0.03119300

fmodel(modelp4.2)
anova(modelp4.3) #Residuals 786.78

## Analysis of Variance Table
##
## Response: LMX
##             Df Sum Sq Mean Sq  F value    Pr(>F)
## Autonomy     1  26.72  26.716  38.5060 7.640e-10 ***
## TaskVar      1  13.10  13.099  18.8803 1.517e-05 ***
## TaskSig      1   8.23   8.225  11.8552  0.000596 ***
## TaskID       1   1.05   1.053   1.5176  0.218240
## Feedbk       1  87.97  87.971 126.7938 < 2.2e-16 ***
## Commute      1   0.89   0.886   1.2768  0.258725
## Stress       1   2.74   2.735   3.9425  0.047321 *
## Residuals 1134 786.78   0.694
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ‘ 1

lm.beta(modelp4.3)

##
## Call:
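## lm(formula = LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk +
##     Commute + Stress, data = df)
##
## Standardized Coefficients::
## (Intercept)    Autonomy     TaskVar     TaskSig      TaskID      Feedbk
##  0.00000000  0.13709376  0.05365619 -0.03600759 -0.04157730  0.35314283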
##     Commute      Stress
## -0.02599415 -0.05766579

fmodel(modelp4.1)
anova(modelp4.1, modelp4.2, modelp4.3)

## Analysis of Variance Table
##
## Model 1: LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk
## Model 2: LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute
## Model 3: LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute +
##     Stress
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)
## 1   1136 790.40
## 2   1135 789.52  1   0.88589 1.2768 0.25873
## 3   1134 786.78  1   2.73536 3.9425 0.04732 *
## ---
## Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Model 0: SSTotal = 927.47 (no predictors)

Model 1: SSResidual = 790.40 (after adding the job characteristics)

Model 2: SSResidual = 789.52, SSDifference = 0.88589, F(1, 1134) = 1.2768, p = 0.25873 (after adding Commute)

Model 3: SSResidual = 786.78, SSDifference = 2.73536, F(1, 1134) = 3.9425, p = 0.04732 (after adding Stress)

By adding Commute, the model accounts for an additional sum of squares of 0.88589, which is not a statistically significant change (p = 0.25873). R2 increased by about 0.00096 (0.88589 / 927.47) in Model 2. By adding Stress, the model accounts for an additional sum of squares of 2.73536, which is statistically significant at the 0.05 level (p = 0.04732). R2 increased by about 0.00295 (2.73536 / 927.47) in Model 3.

Aside from the coefficients, let’s look at the adjusted R2 of Models 1, 2, and 3, which are 0.144, 0.1442, and 0.1465 respectively. The unadjusted R2 values, computed as 1 - SSResidual / SSTotal, are approximately 0.1478, 0.1487, and 0.1517, and their differences (about 0.001 and 0.003) match the R2 changes computed from the anova() results. As we see, adding the new variables contributes very little to the fit of the model: only the addition of Stress reaches significance, and even then the gain in R2 is about 0.3 percentage points.

### let’s visualize the angle of inclination

fmodel(modelp4.2, ~Commute)
fmodel(modelp4.3, ~Stress)
fmodel(modelp4.3, ~Commute + Stress)

JobChars.R

library(dplyr)

library(data.table)

library(ggplot2)

library(corrplot)

library(lm.beta)

library(statisticalModeling)

str(df)

summary(df)

sum(is.na(df))

df<- apply(df, 2, function(x) {

if(is.numeric(x)) ifelse(is.na(x), median(x, na.rm = T), x) else x

}) %>%as.data.frame()

corrplot(cor(df))

#Part1

output1 <- lm.beta(model1)

summary(model1)

summary(output1)

coef.lm.beta(output1)

fmodel(model1)

use = “pairwise.complete.obs”)

corrplot(cor1)

#Part2

output2 <- lm.beta(model2)

summary(model2)

summary(output2)

fmodel(model2)

use = “pairwise.complete.obs”)

corrplot(cor2)

#Part3

modelp3.1 <- lm(sJPERF ~ Autonomy +TaskVar + TaskSig + TaskID + Feedbk + LMX, data = df)

modelp3.2 <- lm(sJPERF ~ Autonomy +TaskVar + TaskSig + TaskID + Feedbk + LMX + Commute, data = df)

modelp3.3 <- lm(sJPERF ~ Autonomy +TaskVar + TaskSig + TaskID + Feedbk + LMX + Commute + Stress, data = df)

modelp3.0 <- lm(sJPERF ~ 1, data = df) # intercept-only baseline (no predictors)

anova(modelp3.0) # Sum sq 316.95

summary(modelp3.1)

anova(modelp3.1) #Residuals 297.832

lm.beta(modelp3.1)

fmodel(modelp3.1)

evaluate_model(modelp3.1)

effect_size(modelp3.1, ~Autonomy)

summary(modelp3.2)

anova(modelp3.2) #Residuals 295.827

lm.beta(modelp3.2)

fmodel(modelp3.2)

fmodel(modelp3.2, ~ Commute)

summary(modelp3.3)

anova(modelp3.3) #Residuals 292.042

lm.beta(modelp3.3)

fmodel(modelp3.3)

fmodel(modelp3.3, ~ Stress)

anova(modelp3.1, modelp3.2, modelp3.3)

#Part4

modelp4.1 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk, data = df)

modelp4.2 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute, data = df)

modelp4.3 <- lm(LMX ~ Autonomy +TaskVar + TaskSig + TaskID + Feedbk +  Commute + Stress, data = df)

summary(modelp4.1)

anova(modelp4.1) #Residuals 790.40

lm.beta(modelp4.1)

fmodel(modelp4.1)

summary(modelp4.2)

anova(modelp4.2) #Residuals 789.52

lm.beta(modelp4.2)

fmodel(modelp4.2)

anova(modelp4.3) #Residuals 786.78

lm.beta(modelp4.3)

fmodel(modelp4.3)

anova(modelp4.1, modelp4.2, modelp4.3)

JobChars.Rmd

---
title: "JobChars"
output: word_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r }
library(dplyr)
library(data.table)
library(ggplot2)
library(corrplot)
library(lm.beta)
library(statisticalModeling)
```

# Data preprocessing

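```{r }
df <- read.csv("JobChars.csv") # load the survey data; adjust the path as needed
str(df)
summary(df)
sum(is.na(df))

# Replace missing values in numeric columns with the column median
df <- apply(df, 2, function(x) {
  if (is.numeric(x)) ifelse(is.na(x), median(x, na.rm = TRUE), x) else x
}) %>% as.data.frame()

corrplot(cor(df))
```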

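# Part 1

Let’s construct the regression model.

```{r }
# Part 1 model from the exercise: sOCB regressed on the five job characteristics
model1 <- lm(sOCB ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk, data = df)
output1 <- lm.beta(model1)
summary(model1)
summary(output1)
coef.lm.beta(output1)
fmodel(model1)

# Assumption: the start of this call was lost; correlations are taken over the model variables
cor1 <- cor(model.frame(model1), use = "pairwise.complete.obs")
print(cor1)
corrplot(cor1)
```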

The regression coefficients (“betas”) are effect sizes. In a simple linear regression, the coefficient is the slope of the regression line; in a multiple linear regression, it is the slope of the (hyper-)plane in the direction of the predictor. This means that the value of beta tells you how much the predicted value changes when the corresponding predictor is increased by 1 unit, holding all other predictors constant.

Interpretation always depends on the particular purpose, question, experimental design and the whole context. There is never any reliable interpretation without specifying a context. Secondly, the coeff.of the betas. So it makes sense.

In this model, the standardized and unstandardized coefficients differ. When two predictors are correlated, the betas are estimated so that the explanation of the total effect is balanced between the two predictors. But if both predictors explain the same things, any other distribution of the values for the betas that gives the same effect in sum will be almost equally valid, so there is a large uncertainty in these estimates.

The model has R squared 0.05082 and p-value 1.549e-12. The most significant variables in the model are Autonomy (p-value 0.000504), TaskSig (p-value 3.18e-13), and Feedbk (p-value 0.000257). Both R2 and the adjusted R2 describe how much of the variation in the dependent variable the regression equation accounts for. However, there is one main difference between them: R2 never decreases when a predictor is added, even one unrelated to the dependent variable, whereas the adjusted R2 applies a penalty for the number of predictors and rises only when a new predictor improves the fit by more than would be expected by chance.

R-Square, also known as the coefficient of determination, is a commonly used statistic to evaluate model fit. R-square is 1 minus the ratio of residual variability to total variability. When the variability of the residual values around the regression line is small relative to the overall variability, the predictions from the regression equation are good. For example, if there is no relationship between the X and Y variables, then the ratio of the residual variability of the Y variable to the original variance is equal to 1.0, and R-square is 0. If X and Y are perfectly related, then there is no residual variance and the ratio of variance is 0.0, making R-square = 1. In most cases, the ratio and R-square will fall somewhere between these extremes, that is, between 0.0 and 1.0. This ratio is immediately interpretable. If we have an R-square of 0.05082, then we know that the variability of the Y values around the regression line is 1 - 0.05082 = 0.94918 times the original variance; in other words, we have explained about 5% of the original variability and are left with about 95% residual variability. Ideally, we would like to explain most if not all of the original variability. The R-square value is an indicator of how well the model fits the data (e.g., an R-square close to 1.0 indicates that we have accounted for almost all of the variability with the variables specified in the model).
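As a minimal sketch of the definition above, the following R code computes R-square as 1 minus the ratio of residual to total sum of squares and checks it against what `summary()` reports. It uses the built-in `mtcars` data as a stand-in (an assumption, since the JobChars data are not reproduced here), and the model formula is purely illustrative.

```r
# Stand-in data: built-in mtcars (assumption; not the JobChars data)
fit <- lm(mpg ~ wt + hp, data = mtcars)

ss_total    <- sum((mtcars$mpg - mean(mtcars$mpg))^2)  # variability of Y around its mean
ss_residual <- sum(residuals(fit)^2)                   # variability of Y around the fitted values

r2 <- 1 - ss_residual / ss_total

# Adjusted R-square penalizes the number of predictors p given sample size n
n <- nrow(mtcars)
p <- length(coef(fit)) - 1
adj_r2 <- 1 - (1 - r2) * (n - 1) / (n - p - 1)

print(c(r2 = r2, adj_r2 = adj_r2))
```

Both values should agree with `summary(fit)$r.squared` and `summary(fit)$adj.r.squared`; the adjusted value is always the smaller of the two when at least one predictor is present.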

Standardized partial coefficients have the same interpretation as unstandardized partial coefficients except that the units are now standard deviations rather than arbitrary units. Therefore, standardized partial coefficients can be interpreted as the number of standard deviations the outcome increases for every standard deviation increase in the predictor, holding all other predictors constant. Most researchers have an understanding of standard deviations as a unit of measurement, theoretically making standardized partial coefficients easier to interpret than their unstandardized counterparts.

In our example, the standardized coefficients differ from the zero-order correlations between each predictor and the outcome.

Researchers are familiar with correlations: they range from -1 to 1, have standard deviation units, and researchers know what values are considered weak versus strong in their scientific field. Indeed, the standardized coefficient from a simple regression is the (zero-order) correlation between the predictor and outcome. However, when moving to multiple regression, standardized partial coefficients are not on the correlation metric. The added phrase “while holding all other predictors constant” changes the interpretation. Standardized partial coefficients range from -∞ to +∞ rather than -1 to +1 and have fractions of standard deviation units rather than standard deviation units. This results in an unclear understanding of what a weak versus strong standardized partial coefficient is.
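The relationship between standardized coefficients and correlations can be checked with a short sketch. It rescales the unstandardized slope by sd(predictor) / sd(outcome), which is the usual way to obtain a standardized coefficient by hand. The built-in `mtcars` data and the chosen variables are stand-ins (assumptions), not the JobChars variables.

```r
# Stand-in data: built-in mtcars (assumption; not the JobChars data)
simple <- lm(mpg ~ wt, data = mtcars)

# Standardized slope = unstandardized slope * sd(predictor) / sd(outcome)
beta_simple  <- unname(coef(simple)["wt"]) * sd(mtcars$wt) / sd(mtcars$mpg)
r_zero_order <- cor(mtcars$wt, mtcars$mpg)

# With a second, correlated predictor the standardized partial coefficient
# no longer equals the zero-order correlation
multiple     <- lm(mpg ~ wt + hp, data = mtcars)
beta_partial <- unname(coef(multiple)["wt"]) * sd(mtcars$wt) / sd(mtcars$mpg)

print(c(beta_simple = beta_simple, r = r_zero_order, beta_partial = beta_partial))
```

In the simple regression the standardized slope and the correlation coincide; once `hp` is held constant, the coefficient for `wt` moves away from the correlation.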

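# Part 2

Let’s construct the regression model.

```{r }
output2 <- lm.beta(model2)
summary(model2)
summary(output2)
fmodel(model2)

# Assumption: the start of this call was lost; correlations are taken over the model variables
cor2 <- cor(model.frame(model2), use = "pairwise.complete.obs")
print(cor2)
corrplot(cor2)
coef.lm.beta(output2)
```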

The model has a multiple R-squared of 0.02248 (adjusted R-squared 0.01818) and a p-value of 9.487e-05. The most significant variables in the model are TaskSig (p-value 0.000129) and TaskID (p-value 0.000732). This model is worse than the previous one: its R squared is lower, and the beta coefficients differ. In the second model, the beta coefficients are very small, indicating only a slight slope of the hyperplane in the direction of each predictor. The correlation coefficients also indicate weak dependencies between the variables. Taken together, the small beta weights and the low R squared suggest that these variables explain very little of the variation in the outcome.

```{r }
#Part3
modelp3.0 <- lm(sJPERF ~ 1, data = df) # intercept-only baseline (no predictors)
modelp3.1 <- lm(sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX, data = df)
modelp3.2 <- lm(sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX + Commute, data = df)
modelp3.3 <- lm(sJPERF ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + LMX + Commute + Stress, data = df)

anova(modelp3.0) # Sum sq 316.95

summary(modelp3.1)
anova(modelp3.1) #Residuals 297.832
lm.beta(modelp3.1)
fmodel(modelp3.1)
#evaluate_model(modelp3.1)
#effect_size(modelp3.1, ~Autonomy)

summary(modelp3.2)
anova(modelp3.2) #Residuals 295.827
lm.beta(modelp3.2)
fmodel(modelp3.2)

summary(modelp3.3)
anova(modelp3.3) #Residuals 292.042
lm.beta(modelp3.3)
fmodel(modelp3.3)

anova(modelp3.1, modelp3.2, modelp3.3)
```

Here, we can see that each successive model is significant above and beyond the previous one. This suggests that each predictor added along the way is making an important contribution to the overall model.

When we use anova() with a single model, it shows analysis of variance for each variable. However, when we use anova() with multiple models, it does model comparisons.
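The two uses of `anova()` can be seen side by side in a small sketch. It relies on the built-in `mtcars` data as a stand-in (an assumption; the variables are not those of the exercise) and builds three nested models in the same stepwise fashion as above.

```r
# Stand-in data: built-in mtcars (assumption; not the JobChars data)
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)
m3 <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# One model: sequential sums of squares, one row per term
anova(m3)

# Several nested models: one F-test per added predictor
cmp <- anova(m1, m2, m3)
print(cmp)
```

In the comparison table, the `RSS` column holds each model's residual sum of squares and `Sum of Sq` holds the reduction achieved by the added predictor.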

Model 0: $SS_{Total}$ = 316.95 (no predictors)

Model 1: $SS_{Residual}$ = 297.83 (after adding basic variables)

Model 2: $SS_{Residual}$ = 295.83, $SS_{Difference}$ = 2.0051, $F(1, 96)$ = 7.779, $p$ = 0.0053741 (after adding Commute)

Model 3: $SS_{Residual}$ = 292.04, $SS_{Difference}$ = 3.7857, $F(1, 95)$ = 14.687, $p$ = 0.0001339 (after adding Stress)

By adding Commute, the model accounts for an additional $SS$ of 2.0051, and this was a statistically significant change according to the corresponding F-statistic and p-value. $R^2$ increased by about 0.6% (2.0051 / 316.95 = 0.0063) in Model 2. By adding Stress, the model accounts for an additional $SS$ of 3.7857, and the change was statistically significant again. $R^2$ increased by about 1.2% (3.7857 / 316.95 = 0.0119) in Model 3.

Aside from the coefficients of variables, let’s take a look at the adjusted R2 of Models 1, 2, and 3, which are 0.05535, 0.06088, and 0.07208 respectively. The R2 changes computed using the anova() results roughly correspond to the differences in adjusted R2 in the lm() results for each model: 0.06088 – 0.05535 = 0.00553 for Model 2 and 0.07208 – 0.06088 = 0.0112 for Model 3 (with rounding errors). Although we can compute R2 differences between models using lm() results, lm() results don’t provide corresponding F-statistics and p-values for an increase in R2. And it’s important to remember that adding variables never decreases the unadjusted R2, whether or not they actually explain additional variation in the DV. That’s why it’s crucial to perform F-tests and not just rely on the difference in R2 between models.
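The change in R2 can be recomputed directly from the sums of squares reported for the sJPERF models. The sketch below only divides each extra explained sum of squares by the total sum of squares; the variable names are illustrative.

```r
# Sums of squares reported above for the sJPERF models
ss_total <- 316.95                 # Model 0, no predictors
ss_diff  <- c(Commute = 2.0051,    # Model 1 -> Model 2
              Stress  = 3.7857)    # Model 2 -> Model 3

# Change in R-squared = extra explained sum of squares / total sum of squares
delta_r2 <- ss_diff / ss_total
print(round(delta_r2, 4))
```

This gives increases of roughly 0.006 for Commute and 0.012 for Stress.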

Beta weight for variable Commute is -0.07, for variable Stress is -0.11.

### Let’s visualize the angle of inclination

```{r }
fmodel(modelp3.2, ~ Commute)
fmodel(modelp3.3, ~ Stress)
fmodel(modelp3.3, ~ Commute + Stress)
```

```{r }
#Part4
modelp4.0 <- lm(LMX ~ 1, data = df)
modelp4.1 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk, data = df)
modelp4.2 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute, data = df)
modelp4.3 <- lm(LMX ~ Autonomy + TaskVar + TaskSig + TaskID + Feedbk + Commute + Stress, data = df)

anova(modelp4.0)

anova(modelp4.1) #Residuals 790.40
lm.beta(modelp4.1)
fmodel(modelp4.1)

anova(modelp4.2) #Residuals 789.52
lm.beta(modelp4.2)
fmodel(modelp4.2)

anova(modelp4.3) #Residuals 786.78
lm.beta(modelp4.3)
fmodel(modelp4.3)

anova(modelp4.1, modelp4.2, modelp4.3)
```

Model 0: $SS_{Total}$ = 927.47 (no predictors)

Model 1: $SS_{Residual}$ = 790.40 (after adding basic variables)

Model 2: $SS_{Residual}$ = 789.52, $SS_{Difference}$ = 0.88589, $F(1, 96)$ = 1.2768, $p$ = 2.2e-16 (after adding Commute)

Model 3: $SS_{Residual}$ = 786.78, $SS_{Difference}$ = 2.73536, $F(1, 95)$ = 3.9425, $p$ = 2.2e-16 (after adding Stress)

By adding Commute, the model accounts for an additional $SS$ of 0.88589. With $F$ = 1.2768 on 1 numerator degree of freedom, well below the .05 critical value of about 3.84, this change is not statistically significant, so the $p$ = 2.2e-16 listed above cannot belong to this test. $R^2$ increased by about 0.00096 (0.88589 / 927.47) in Model 2. By adding Stress, the model accounts for an additional $SS$ of 2.73536; $F$ = 3.9425 sits right at the .05 threshold, so this change is marginal at best. $R^2$ increased by about 0.0029 (2.73536 / 927.47) in Model 3.

Aside from the coefficients of variables, let’s take a look at the adjusted R2 of Models 1, 2, and 3, which are 0.144, 0.1442, and 0.1465 respectively. The R2 changes computed from the anova() results are of the same small size as the differences in adjusted R2 in the lm() results for each model: 0.1442 – 0.144 = 0.0002 for Model 2 and 0.1465 – 0.1442 = 0.0023 for Model 3 (the adjustment for the number of predictors and rounding account for the gaps).

As we see, adding the new variables contributes very little to the accuracy of the model.

### Let’s visualize the angle of inclination

```{r }
fmodel(modelp4.2, ~ Commute)
fmodel(modelp4.3, ~ Stress)
fmodel(modelp4.3, ~ Commute + Stress)
```

Assignment 2

Midterm-Exam-Part-I

1. Several different types of descriptive statistics are reported in studies, including central tendency, dispersion and association. Provide examples of each type of statistic. What type of information is provided by each type?
1. What considerations should go into the selection of a sample for use in an empirical study? Describe how these considerations might be expected to influence the results of various data analyses and, subsequently, the inferences one can make from the results of this study to broader populations.
1. What is sampling error? What effect does the presence of sampling error have on the results of research studies? How can the effects of sampling error on results be mitigated?
1. What is the standard error? What role does it play in the calculation of confidence intervals? What is the correct interpretation of a confidence interval?
1. What does it mean if the results of a test of the hypothesis H0: m1 – m2 = 0 is statistically significant or if it is not statistically significant? How does this differ from testing the hypothesis H0: m1<m2?
1. Discuss the difference between the t-test for the difference between two means and Cohen’s d (the standardized difference between two means). When or for what purposes would you use these two statistics?
1. What are Type I and Type II errors? How is the level of Type I and Type II error in a study influenced by sample size (or other factors)?
1. What does the term “power” refer to with regard to hypothesis tests? How can you increase the power of a design? What would you do if you determine during the design of a study that it is not possible to construct a design that has adequate power (e.g., you cannot obtain power greater than .60)?
1. Differentiate between a (a) single group post-only design, (b) single-group pre-post design, (c) control group post-only design, and (d) a control group pre- and post-test design. When might each be useful? How does the design influence the validity of inferences about findings from these designs to other studies?
1. How does the choice of design influence both internal and external validity? Why would you favor one over the other for a particular study?
1. What is the difference between statistical significance and substantive significance? When or under what circumstances might the statistical and substantive significance of findings be different and what implications does that have for the interpretation of results?

Solution

1. Several different types of descriptive statistics are reported in studies, including central tendency, dispersion and association. Provide examples of each type of statistic. What type of information is provided by each type?

Types of Descriptive Statistics:

• Measure of central tendency: A measure of central tendency describes the typical or average value of the sample. In descriptive statistics, there are two types of averages: mathematical averages and positional averages (such as the median and the mode).

The mathematical averages are of three types: arithmetic mean, geometric mean, and harmonic mean. The arithmetic mean is the most widely used measure of central tendency; it is obtained by adding all the items of the series and dividing this total by the number of items. The geometric mean is defined as the nth root of the product of all n values of the variable; it is used when values combine multiplicatively, as with growth rates and ratios. The harmonic mean is defined as the reciprocal of the arithmetic mean of the reciprocals of the items. It is useful for averages that involve speed, time, price and ratio.
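The three mathematical averages can be compared on a small example. The values below are made up for illustration, and the variable names are hypothetical.

```r
x <- c(2, 4, 8)  # illustrative values (hypothetical)

arithmetic_mean <- mean(x)             # sum of the values / number of values
geometric_mean  <- exp(mean(log(x)))   # nth root of the product (positive values only)
harmonic_mean   <- 1 / mean(1 / x)     # reciprocal of the mean of the reciprocals

print(c(arithmetic = arithmetic_mean,
        geometric  = geometric_mean,
        harmonic   = harmonic_mean))
```

For positive values that are not all equal, the harmonic mean is the smallest of the three and the arithmetic mean the largest.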

• Measure of dispersion: In descriptive statistics, we can elaborate upon the data further by measuring the dispersion. Usually the range, the standard deviation, and the variance are used to measure dispersion. The range is defined as the difference between the highest and the lowest value. The standard deviation, also called the root mean square deviation, describes the typical distance of the values from the mean. The variance is the square of the standard deviation.
• Measures of association: These refer to a wide variety of coefficients that measure the statistical strength of the relationship between the variables of interest; these measures of strength, or association, can be described in several ways, depending on the analysis. Whatever the coefficient, a value of zero signifies that no relationship exists. In correlation analyses, a coefficient (r) with a value of one signifies a perfect relationship between the variables of interest. In simple regression analyses, a standardized beta weight (β) with a value of one also signifies a perfect relationship. Beyond linear relationships, measures of association also deal with strictly monotonic, ordered monotonic, predictive monotonic, and weak monotonic relationships. The researcher should note that if a relationship is perfect under strict monotonicity, then it is perfect under the other conditions as well. However, one cannot have perfect ordered and perfect predictive monotonicity at the same time.
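A short sketch of the dispersion and association measures named above, using made-up values (the vectors `x` and `y` are hypothetical):

```r
x <- c(2, 4, 4, 5, 7, 9)   # illustrative values (hypothetical)
y <- c(1, 3, 2, 5, 6, 8)

value_range <- max(x) - min(x)   # highest minus lowest value
variance    <- var(x)            # sample variance (n - 1 in the denominator)
std_dev     <- sd(x)             # square root of the variance

pearson_r  <- cor(x, y)                       # strength of the linear relationship
spearman_r <- cor(x, y, method = "spearman")  # strength of the monotonic (rank) relationship

print(c(range = value_range, variance = variance, sd = std_dev,
        pearson = pearson_r, spearman = spearman_r))
```

Pearson's r describes linear association, while the rank-based Spearman coefficient captures any monotonic relationship.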
1. What considerations should go into the selection of a sample for use in an empirical study? Describe how these considerations might be expected to influence the results of various data analyses and, subsequently, the inferences one can make from the results of this study to broader populations.

Empirical research is based on the analysis of empirical data. Empirical evidence, also known as sensory experience, is the knowledge received by means of the senses, particularly by observation and experimentation. Empirical evidence is information that verifies the truth (that which accurately corresponds to reality) or falsity (inaccuracy) of a claim. “In the empiricist view, one can claim to have knowledge only when based on empirical evidence” would not be a truthful statement about empiricists who believe that testable, verifiable information is not the only way of gaining knowledge.

The successful analysis of empirical data depends on the quality of the data as well as on the sample size. Poor-quality data, resulting from ill-considered data preparation, can fail to reveal relationships that actually exist, whereas a hypothesis can be tested validly if the data were correctly prepared. At the same time, small samples may show apparent dependencies that turn out to be false when examined on a larger set of data.

1. What is sampling error? What effect does the presence of sampling error have on the results of research studies? How can the effects of sampling error on results be mitigated?

A sampling error is the difference between a sample statistic and the corresponding population value that arises because only a sample, rather than the entire population, is observed. It is largest when the sample does not represent the population, so that the results found in the sample do not match the results that would be obtained from the entire population. Sampling is an analysis performed by selecting a specific number of observations from a larger population, and this work can produce both sampling errors and nonsampling errors.

Sampling error can be reduced, though never fully eliminated, by increasing the sample size and by ensuring that the sample adequately represents the entire population. Assume, for example, that XYZ Company provides a subscription-based service that allows consumers to pay a monthly fee to stream videos and other programming over the web and that the firm wants to survey homeowners who watch at least 10 hours of programming over the web each week and pay for an existing video streaming service. XYZ wants to determine what percentage of the population is interested in a lower-priced subscription service. If XYZ does not think carefully about the sampling process, several types of sampling errors may occur.
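The effect of sample size on sampling error can be illustrated with a small simulation. This is only a sketch: the population is simulated (hypothetical), and `sampling_error()` is an illustrative helper, not a library function.

```r
set.seed(1)  # reproducible illustration
population <- rnorm(10000, mean = 50, sd = 10)  # hypothetical population

# Typical size of the sampling error of the mean for samples of size n
sampling_error <- function(n, reps = 1000) {
  sample_means <- replicate(reps, mean(sample(population, n)))
  sd(sample_means - mean(population))
}

errors <- sapply(c(25, 100, 400), sampling_error)
names(errors) <- c("n = 25", "n = 100", "n = 400")
print(round(errors, 2))
```

The typical error shrinks as the sample grows, roughly in proportion to one over the square root of the sample size, but it does not reach zero.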

1. What is the standard error? What role does it play in the calculation of confidence intervals? What is the correct interpretation of a confidence interval?

A standard error is the standard deviation of the sampling distribution of a statistic. It measures how precisely a sample statistic estimates the corresponding population value. A sample mean typically deviates from the actual mean of a population; the standard error describes the typical size of this deviation across repeated samples.

The term “standard error” is used to refer to the standard deviation of various sample statistics such as the mean or median. For example, the “standard error of the mean” refers to the standard deviation of the distribution of sample means taken from a population. The smaller the standard error, the more precisely the sample statistic estimates the population value.

A confidence interval is a range of values, computed as the sample estimate plus or minus a critical value times the standard error, that is intended to capture a population parameter. The confidence level can take any value, with the most common being 95% or 99%.

The correct interpretation concerns the procedure rather than any single interval: if the study were repeated many times, about 99% of the 99% confidence intervals constructed this way would contain the true parameter. For example, given a 99% confidence interval of -6.7% to +8.3% for stock XYZ’s mean annual return, the method used to build the interval captures the true mean return 99% of the time. In layman’s terms, you are 99% confident that the true mean return lies between -6.7% and +8.3%; this is not the same as a 99% probability that next year’s return will fall in that range.
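The role of the standard error in a confidence interval can be sketched as follows. The sample is simulated (hypothetical), and the interval is built by hand as estimate plus or minus critical value times standard error.

```r
set.seed(42)
x <- rnorm(50, mean = 100, sd = 15)  # hypothetical sample

n  <- length(x)
se <- sd(x) / sqrt(n)                # standard error of the mean

# 95% confidence interval: estimate +/- critical t value * standard error
t_crit <- qt(0.975, df = n - 1)
ci <- mean(x) + c(-1, 1) * t_crit * se
print(ci)
```

The hand-built interval matches the one returned by `t.test(x)$conf.int`.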

1. What does it mean if the results of a test of the hypothesis H0: m1 – m2 = 0 is statistically significant or if it is not statistically significant? How does this differ from testing the hypothesis H0: m1<m2?

A result is statistically significant when the observed relationship between two or more variables would be unlikely to arise from random chance alone. Statistical hypothesis testing is used to determine whether a result is statistically significant. The test provides a p-value, the probability of obtaining a result at least as extreme as the one observed if the null hypothesis were true; in general, a p-value of 5% or lower is considered to be statistically significant. A result that is not statistically significant does not show that the null hypothesis is true, only that the data do not provide enough evidence to reject it.

For example, Novo Nordisk, the pharmaceutical leader in diabetes medication, reported in June 2016 that there was a statistically significant reduction in type 1 diabetes when it tested its new insulin. The trial consisted of 26 weeks of randomized therapy among diabetes patients, and the reported reduction had a p-value of less than 5%, meaning that a reduction of that size would be unlikely if only random chance were at work.

1. Discuss the difference between the t-test for the difference between two means and Cohen’s d (the standardized difference between two means). When or for what purposes would you use these two statistics?

A t-test is an analysis of two population means through the use of statistical examination; a t-test with two samples is commonly used with small sample sizes, testing the difference between the samples when the variances of two normal distributions are not known.

Calculating d using the t statistic from the paired t test gives you a statistic sometimes labeled d[z], which can be useful for power calculations but not for indicating how big an effect is. The issue is that it strips out the impact of individual differences from the estimate of the standardizer (the denominator of the d calculation).

A further issue is that separate paired t tests will use different standardizers and hence will use different units meaning they are not on the same scale and can’t be compared safely.

One option is to use the unstandardised difference in means in preference to a standardised metric. This is my preferred option.
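The difference between the two statistics can be made concrete with a sketch for two independent groups. The data are simulated (hypothetical), and Cohen's d is computed here with the pooled standard deviation as the standardizer, which is one common choice rather than the only one.

```r
set.seed(7)
group1 <- rnorm(40, mean = 52, sd = 10)  # hypothetical scores
group2 <- rnorm(40, mean = 48, sd = 10)

# t-test: is the mean difference distinguishable from zero?
tt <- t.test(group1, group2, var.equal = TRUE)

# Cohen's d: mean difference expressed in pooled standard deviation units
n1 <- length(group1)
n2 <- length(group2)
sd_pooled <- sqrt(((n1 - 1) * var(group1) + (n2 - 1) * var(group2)) / (n1 + n2 - 2))
cohens_d  <- (mean(group1) - mean(group2)) / sd_pooled

print(c(t = unname(tt$statistic), p = tt$p.value, d = cohens_d))
```

For this design the t statistic equals d divided by sqrt(1/n1 + 1/n2), so t grows with sample size while d does not. That is why the t-test answers "is there a difference?" and d answers "how large is it?".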

1. What are Type I and Type II errors? How is the level of Type I and Type II error in a study influenced by sample size (or other factors)?

A Type I error is a type of error that occurs when a null hypothesis is rejected although it is true. The error accepts the alternative hypothesis, despite it being attributed to chance.  Type I error rejects an idea that should have been accepted. It also claims that two observances are different, when they are actually the same.

A type II error is a statistical term used within the context of hypothesis testing that describes the error that occurs when one accepts a null hypothesis that is actually false. The error rejects the alternative hypothesis, even though it does not occur due to chance. A type II error fails to reject, or accepts, the null hypothesis, although the alternative hypothesis is the true state of nature.

The difference between a type II error and a type I error is a type I error rejects the null hypothesis when it is true. The probability of committing a type I error is equal to the level of significance that was set for the hypothesis test. Therefore, if the level of significance is 0.05, there is a 5% chance a type I error may occur.

Although crucial, the simple question of sample size has no definite answer due to the many factors involved. We expect large samples to give more reliable results and small samples to often leave the null hypothesis unchallenged. Large samples may be justified and appropriate when the difference sought is small and the population variance large. Established statistical procedures help ensure appropriate sample sizes so that we reject the null hypothesis not only because of statistical significance, but also because of practical importance. These procedures must consider the size of the type I and type II errors as well as the population variance and the size of the effect. The probability of committing a type I error is the same as our level of significance, commonly 0.05 or 0.01, called alpha, and represents our willingness to reject a true null hypothesis. This might also be termed a false positive: a positive pregnancy test when a woman is not pregnant. The probability of committing a type II error, or beta (ß), represents not rejecting a false null hypothesis, or a false negative: a negative pregnancy test when a woman is in fact pregnant. Ideally both types of error are minimized. The power of any test is 1 – ß, since rejecting the false null hypothesis is our goal.

1. What does the term “power” refer to with regard to hypothesis tests? How can you increase the power of a design? What would you do if you determine during the design of a study that it is not possible to construct a design that has adequate power (e.g., you cannot obtain power greater than .60)?

Whenever we conduct a hypothesis test, we’d like to make sure that it is a test of high quality. One way of quantifying the quality of a hypothesis test is to ensure that it is a “powerful” test.

The power of a binary hypothesis test is the probability that the test correctly rejects the null hypothesis (H0) when a specific alternative hypothesis (H1) is true. Statistical power ranges from 0 to 1. As statistical power increases, the probability of making a type 2 error decreases: the type 2 error rate is β and statistical power is 1 - β. Therefore, as an example, if experiment 1 has a statistical power of 0.6 and experiment 2 has a statistical power of 0.95, then experiment 1 is more likely to commit a type 2 error than experiment 2, and experiment 2 is more reliable in this respect. Power can be equivalently thought of as the probability of accepting the alternative hypothesis (H1) when it is true, that is, the ability of a test to detect a specific effect if that effect actually exists.
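A brief sketch of how power depends on sample size, using `power.t.test()` from base R's stats package. The design values (a true difference of half a standard deviation, alpha = .05, 30 or 100 cases per group) are hypothetical choices for illustration.

```r
# Power of a two-sample t-test for two hypothetical sample sizes per group
small_n <- power.t.test(n = 30,  delta = 0.5, sd = 1, sig.level = 0.05)
large_n <- power.t.test(n = 100, delta = 0.5, sd = 1, sig.level = 0.05)

# Sample size per group needed to reach power = .80 for the same effect
needed <- power.t.test(power = 0.80, delta = 0.5, sd = 1, sig.level = 0.05)

print(c(power_n30  = small_n$power,
        power_n100 = large_n$power,
        n_for_80   = needed$n))
```

Besides increasing the sample size, power also rises with a larger effect, less noise in the measurements, or a more lenient significance level; if none of these is feasible, the same function shows how small an effect the design could realistically detect.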

1. Differentiate between a (a) single group post-only design, (b) single-group pre-post design, (c) control group post-only design, and (d) a control group pre- and post-test design. When might each be useful? How does the design influence the validity of inferences about findings from these designs to other studies?

One-group pretest-posttest design

A single case is observed at two time points, one before the treatment and one after the treatment. Changes in the outcome of interest are presumed to be the result of the intervention or treatment. No control or comparison group is employed.

Post-Test Only Control Group Design

A type of true experimental design where test units are randomly allocated to an experimental group and a control group. The experimental group is exposed to a treatment and both groups are measured afterwards.

In pre-test/post-test designs, evaluators survey the intervention group before and after the intervention. While evaluators may observe changes in outcome indicators among the intervention participants, they cannot attribute all these changes to the intervention alone using this design because there is no comparison group

An important drawback of pre-experimental designs is that they are subject to numerous threats to their validity. Consequently, it is often difficult or impossible to dismiss rival hypotheses or explanations. Therefore, researchers must exercise extreme caution in interpreting and generalizing the results from pre-experimental studies.

One reason that it is often difficult to assess the validity of studies that employ a pre-experimental design is that they often do not include any control or comparison group. Without something to compare it to, it is difficult to assess the significance of an observed change in the case. The change could be the result of historical changes unrelated to the treatment, the maturation of the subject, or an artifact of the testing.

Even when pre-experimental designs identify a comparison group, it is still difficult to dismiss rival hypotheses for the observed change. This is because there is no formal way to determine whether the two groups would have been the same if it had not been for the treatment. If the treatment group and the comparison group differ after the treatment, this might be a reflection of differences in the initial recruitment to the groups or differential mortality in the experiment.

As exploratory approaches, pre-experiments can be a cost-effective way to discern whether a potential explanation is worthy of further investigation.

Pre-experiments offer few advantages since it is often difficult or impossible to rule out alternative explanations. The nearly insurmountable threats to their validity are clearly the most important disadvantage of pre-experimental research designs.

1. How does the choice of design influence both internal and external validity? Why would you favor one over the other for a particular study?

Multiple experimentation is more typical of science than a once-and-for-all definitive experiment. Experiments need replication and cross-validation at various times and under various conditions before the results can be theoretically interpreted with confidence. An interesting point is that experiments which pit opposing theories against each other probably will not have clear-cut outcomes; in fact, both researchers may have observed something valid which represents the truth.

Factors which jeopardize internal validity:

• History – the specific events which occur between the first and second measurement.
• Maturation – the processes within subjects which act as a function of the passage of time. i.e. if the project lasts a few years, most participants may improve their performance regardless of treatment.
• Testing – the effects of taking a test on the outcomes of taking a second test.
• Instrumentation – the changes in the instrument, observers, or scorers which may produce changes in outcomes.
• Statistical regression – also known as regression to the mean. This threat is caused by the selection of subjects on the basis of extreme scores or characteristics. Give me the forty worst students and I guarantee that they will show immediate improvement right after my treatment.
• Selection of subjects – the biases which may result from the selection of comparison groups. Randomization (random assignment) of group membership is a countermeasure against this threat. However, when the sample size is small, randomization may lead to Simpson's Paradox, which has been discussed in an earlier lesson.
• Experimental mortality – the loss of subjects. For example, a Web-based instruction project entitled Eruditio started with 161 subjects, and only 95 of them completed the entire module. Those who stayed in the project all the way to the end may have been more motivated to learn and thus achieved higher performance.
• Selection-maturation interaction – the selection of comparison groups and maturation may interact, leading to confounded outcomes and the erroneous interpretation that the treatment caused the effect.
• John Henry effect – John Henry was a worker who outperformed a machine under an experimental setting because he was aware that his performance was being compared with that of a machine.
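
The statistical regression threat in the list above can be illustrated with a small simulation. This is only a sketch under made-up assumptions: the score scale, the noise levels, and the variable names (true_ability, pretest, posttest) are hypothetical and not taken from any real study. No treatment is applied at all, yet the forty lowest pretest scorers still "improve" on the posttest.

```r
# Simulated illustration of regression to the mean (all numbers are invented)
set.seed(42)
n <- 1000

true_ability <- rnorm(n, mean = 70, sd = 10)  # stable component of each score
pretest  <- true_ability + rnorm(n, sd = 8)   # ability plus measurement noise
posttest <- true_ability + rnorm(n, sd = 8)   # same ability, fresh noise, NO treatment

# Select the forty subjects with the lowest pretest scores
worst40 <- order(pretest)[1:40]

# Average change for the extreme group versus the whole sample
gain_worst <- mean(posttest[worst40]) - mean(pretest[worst40])
gain_all   <- mean(posttest - pretest)

cat("Mean gain, forty worst pretest scorers:", round(gain_worst, 2), "\n")
cat("Mean gain, whole sample:               ", round(gain_all, 2), "\n")
```

Because the extreme group was selected partly on unlucky measurement noise, its posttest mean drifts back toward the overall mean, while the whole-sample gain stays near zero. A comparison group selected in the same way would show the same drift, which is why a control group helps to rule out this threat.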

Factors which jeopardize external validity:

• Reactive or interaction effect of testing – a pretest might increase or decrease a subject’s sensitivity or responsiveness to the experimental variable. Indeed, the effect of a pretest on subsequent tests has been empirically substantiated (Willson & Putnam, 1982; Lana, 1959).
• Interaction effects of selection biases and the experimental variable
• Reactive effects of experimental arrangements – it is difficult to generalize to non-experimental settings if the effect was attributable to the experimental arrangement of the research.
• Multiple treatment interference – as multiple treatments are given to the same subjects, it is difficult to control for the effects of prior treatments.

1. What is the difference between statistical significance and substantive significance? When or under what circumstances might the statistical and substantive significance of findings be different and what implications does that have for the interpretation of results?

Statistical significance reflects the improbability of findings drawn from samples given certain assumptions about the null hypothesis.

Substantive significance is concerned with meaning, as in, what do the findings say about population effects themselves?

Researchers typically estimate population effects by examining representative samples. Although researchers may invest considerable effort in minimizing measurement and sampling error and thereby producing more accurate effect size estimates, ultimately the goal is a better understanding of real world effects. This distinction between real world effects and researchers’ sample-based estimates of those effects is critical to understanding the difference between statistical and substantive significance.

The statistical significance of any test result is determined by gauging the probability of getting a result at least this large if there were no underlying effect. The outcome of any test is a conditional probability, or p value. If the p value falls below a conventionally accepted threshold (say .05), we might judge the result to be statistically significant.

The substantive significance of a result, in contrast, has nothing to do with the p value and everything to do with the estimated effect size. Only when we know whether we’re dealing with a large or trivial sized effect will we be able to interpret its meaning and so speak to the substantive significance of our results. Note, though, that while the size of an effect will be correlated with its importance, there will be plenty of occasions when even small effects may be judged important.
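
The gap between the two kinds of significance can be seen in a short simulation. This is a sketch with invented data: the sample sizes, the true effects, and the helper cohens_d() are assumptions made for illustration, and Cohen's d is just one common effect size measure. Scenario A has a trivial true effect but huge samples; Scenario B has a large true effect but very small samples.

```r
# Statistical versus substantive significance (simulated data only)
set.seed(1)

# Hypothetical helper: Cohen's d using the pooled standard deviation
cohens_d <- function(x, y) {
  pooled_sd <- sqrt(((length(x) - 1) * var(x) + (length(y) - 1) * var(y)) /
                      (length(x) + length(y) - 2))
  (mean(x) - mean(y)) / pooled_sd
}

# Scenario A: tiny true effect (0.05 SD), 50,000 cases per group
a_treat <- rnorm(50000, mean = 0.05, sd = 1)
a_ctrl  <- rnorm(50000, mean = 0,    sd = 1)

# Scenario B: large true effect (0.8 SD), 8 cases per group
b_treat <- rnorm(8, mean = 0.8, sd = 1)
b_ctrl  <- rnorm(8, mean = 0,   sd = 1)

results <- data.frame(
  scenario = c("A: tiny effect, n = 50000 per group",
               "B: large effect, n = 8 per group"),
  d = c(cohens_d(a_treat, a_ctrl), cohens_d(b_treat, b_ctrl)),
  p = c(t.test(a_treat, a_ctrl)$p.value, t.test(b_treat, b_ctrl)$p.value)
)
print(results)
```

In Scenario A the p value falls far below .05 even though the estimated effect is trivially small, so the result is statistically but perhaps not substantively significant. In Scenario B the estimate is noisy and may or may not reach the .05 threshold depending on the random draw, even though the underlying effect is large. Reporting the effect size alongside the p value lets the reader judge both.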