# Drawing Histograms, QQ Plot, and Box Plot in R for a Variety of Statistical Problems

In this homework solution, we provide detailed answers and insights for a range of statistical questions. Explore our approach to each question and discover how we tackle issues related to descriptive statistics, data visualization, regression analysis, and more using R. This presentation not only provides answers but also offers clear explanations and observations for a thorough understanding of the solutions.

## Problem Description:

This statistical analysis homework covers various statistical questions and tasks. We will provide solutions to these questions and briefly describe the approach taken for each.

### Solution

Question 1

A study with a 70% power rate suggests that only 70 out of 100 statistical tests will actually find any true effects if there are any to be found in the 100 statistical tests. The projected Type II error threshold is 30% with a power of 70%. (1-0.7). The power of a study can be increased by increasing either the effect size, sample size or the significance level.

Question 2

a. Descriptive Statistics

library(haven)

summary(data)

group match trials percent_accuracy
Min. :0.0 Min. : 1.00 Min. : 8 Min. : 12.50
1st Qu.:0.0 1st Qu.: 6.75 1st Qu.: 8 1st Qu.: 75.00
Median :0.5 Median :12.00 Median :16 Median : 87.50
Mean :0.5 Mean :13.43 Mean :16 Mean : 80.65
3rd Qu.:1.0 3rd Qu.:21.00 3rd Qu.:24 3rd Qu.: 91.67
Max. :1.0 Max. :24.00 Max. :24 Max. :100.00

b. The descriptive statistics is not appropriate for the “group” variable because it a categorical variable.

Question 4

a. Code and Plot

 data=read_sav('C:/Users/belak/Desktop/R/Hayden_2005.sav') head(data) hist(data$days, xlab = "Days", main = "Histogram of Days") hist(data$days, breaks=30, xlab = "Days"main="Histogram of Days With breaks=30") 

b. The histogram revealed that the days is normally distributed. It skewed to the left and hence the mean is less than the median.

Question 5

a. Code and Plot

 data=read_sav('C:/Users/belak/Desktop/R/baguley_payne_2000.sav') boxplot(data$percent_accuracy,ylab="Percent Accuracy",main="Boxplot of Percent Accuracy")  b. The box plot indicated that the percent accuracy is not normally distributed and negatively skewed. Hence the percent accuracy mean is less than the median. c. Yes, there are outliers 4 outliers in the percent accuracy. Question 6 a. Code and Plot  eclsk = read.csv("eclsk.csv") head(eclsk) lin.model <- lm (gen~math,eclsk) plot(lin.model,2)  b. The Q-Q plot shows that the model residuals are normally distributed and hence, the normality of residual assumption was not violated. Question 7 a. Code and Plot  eclsk = read.csv("eclsk.csv") head(eclsk) plot(eclsk$math,eclsk\$gen, main='Regression for Math and Gen', xlab='Math',ylab='Gen') abline(lm(gen~math,data=eclsk),col='red') 

b. The scatter plot shows that there is linear relationship between the math and gen. Also, the assumption of homogeneity of variance is not violated.