## Statistical Inference and Probability

The solutions below answer true/false questions on probability and inferential statistics. They cover regression and correlation, confidence intervals and sample size, the types of error in hypothesis testing, measures of association, diagnostic tests, and meta-analysis. Each answer accepts or rejects the statement according to its accuracy.

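Q1.1 The dependent variable in a regression analysis must be a continuous variable.

**False.** Linear regression needs a continuous outcome, but other regression models do not; logistic regression, for example, has a binary dependent variable.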

Q1.2 For a given sample size, the margin of error of an estimate increases with the confidence level.

**True.** A higher confidence level uses a larger critical value, which widens the interval.

Q1.3 When assessing the association of factor X with a disease Y, the relative risk and the odds ratio should be very similar.

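**False.** The odds ratio approximates the relative risk only when the disease is rare; for common outcomes the odds ratio lies further from 1 than the relative risk.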

Q1.4 When assessing the association of a risk factor X with a disease Y in a population, an odds ratio of 2.0 (95% confidence interval 1.3 to 2.7) indicates that exposure to factor X is associated with higher odds of outcome Y.

**True**

Q1.5 It is appropriate to use Pearson's correlation to assess the association between support for secondary use of blood spot cards in medical research (on a Likert scale, where 1 represents strongly disagree and 5 represents strongly agree) and the age of parents (a continuous variable).

**False**

Q1.6 Both type I and type II errors are reduced by increasing the sample size.

**False**

Q1.7 The 95% confidence interval is wider than the 99% confidence interval.

**False**
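The relationship between confidence level and interval width can be checked numerically. The sketch below assumes a z-based interval for a mean; the function name `margin_of_error` and the inputs (standard deviation 10, sample size 100) are hypothetical values chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(sd, n, confidence):
    """Half-width of a z-based confidence interval for a mean."""
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return z * sd / sqrt(n)


# Hypothetical inputs, not taken from the questions above.
sd, n = 10.0, 100
for level in (0.90, 0.95, 0.99):
    moe = margin_of_error(sd, n, level)
    print(f"{level:.0%} CI: margin of error = {moe:.3f}")
# The margin grows with the confidence level: about 1.645, 1.960, 2.576.
```

With the sample size held fixed, the 99% interval comes out wider than the 95% interval, not narrower.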

Q1.8 All confidence intervals should include zero.

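**False.** An interval for a difference that includes zero indicates a non-significant result, and many intervals (for a mean or an odds ratio, for example) have no reason to include zero.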

Q1.9 The area under the curve in a standard normal distribution between mean ± 1 standard deviation is 34%.

**False**
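The area in question can be computed directly rather than read from a table. This is a small sketch using Python's standard library; the helper name `area_within` is an assumption introduced for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1


def area_within(k):
    """Area under the standard normal curve between -k and +k."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


print(f"mean ± 1 SD: {area_within(1):.4f}")               # about 0.6827
print(f"mean to +1 SD: {std_normal.cdf(1) - 0.5:.4f}")    # about 0.3413
```

The 34% figure is the area on one side of the mean only; the full interval of mean ± 1 standard deviation covers about 68%.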

Q1.10 A regression coefficient ranges from -1.0 to +1.0, with +1 indicating a perfect positive linear relation between 2 variables of interest.

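**False.** The range of -1.0 to +1.0 describes a correlation coefficient; a regression coefficient is a slope and can take any value.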

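Q1.11 A useful diagnostic test should have a likelihood ratio close to 1.0 for both a positive and a negative test result.

**False.** A likelihood ratio near 1.0 means the result barely changes the probability of disease; a useful test has a positive likelihood ratio well above 1 and a negative likelihood ratio close to 0.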

Q1.12 A useful diagnostic test is characterized by a high sensitivity and a high specificity.

**True**
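Sensitivity, specificity, and the likelihood ratios all follow from a 2×2 table of test results against true disease status. The sketch below uses hypothetical counts, and the function name `diagnostic_summary` is an assumption made for illustration.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return sensitivity, specificity, LR+ and LR- from 2x2 counts.

    Assumes specificity is strictly between 0 and 1, so that neither
    likelihood ratio divides by zero.
    """
    sensitivity = tp / (tp + fn)   # true positives among the diseased
    specificity = tn / (tn + fp)   # true negatives among the healthy
    lr_positive = sensitivity / (1 - specificity)
    lr_negative = (1 - sensitivity) / specificity
    return sensitivity, specificity, lr_positive, lr_negative


# Hypothetical counts: 100 diseased and 200 healthy people.
sens, spec, lr_pos, lr_neg = diagnostic_summary(tp=90, fp=20, fn=10, tn=180)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
print(f"LR+={lr_pos:.2f}, LR-={lr_neg:.2f}")
```

In this made-up example both sensitivity and specificity are 0.90, giving a positive likelihood ratio of about 9 and a negative likelihood ratio of about 0.11.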

Q1.13 A funnel plot provides information on the combined result for a meta-analysis.

**False**

Q1.14 A receiver operator characteristic curve describes the characteristics of a binary test.

**True**

Q1.15 In a meta-analysis, the I² statistic describes the percentage of variation across studies that is due to heterogeneity rather than chance.

**True**
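I² is commonly computed from Cochran's Q statistic and its degrees of freedom (number of studies minus one) as (Q - df)/Q, truncated at zero. The sketch below assumes that definition; the function name `i_squared` and the example values are hypothetical.

```python
def i_squared(q, df):
    """I² as a percentage, from Cochran's Q and df = number of studies - 1."""
    if q <= 0:
        return 0.0
    # Negative values are truncated to zero by convention.
    return max(0.0, (q - df) / q) * 100


# Hypothetical meta-analysis: Q = 20 across 11 studies (df = 10).
print(i_squared(20.0, 10))  # 50.0
```

When Q is no larger than its degrees of freedom, the observed variation is no more than chance would predict and I² is reported as 0%.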

## Finding Probabilities Using a Table

Suppose the average waiting time at the A&E department is 7.5 hours with a standard deviation of 2.0 hours. Assume the waiting time is normally distributed; using a probability table, find the probability that a randomly selected patient in the department will have a waiting time of

12 hours or more?

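µ = 7.5, σ = 2.0

z = (12 - 7.5)/2 = 2.25

The table gives an area of 0.4878 between z = 0 and z = 2.25, so the upper tail is

P(X ≥ 12) = P(Z ≥ 2.25) = 0.5 - 0.4878 = 0.0122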

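Between 3 and 9 hours?

P(3 < X < 9) = P((3 - 7.5)/2 < Z < (9 - 7.5)/2)

= P(-2.25 < Z < 0.75)

The table areas between 0 and each z value are 0.4878 (z = 2.25) and 0.2734 (z = 0.75). The two limits lie on opposite sides of the mean, so the areas add:

= 0.4878 + 0.2734

= 0.7612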
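The table lookups can be cross-checked in code. This is a minimal sketch using Python's standard library with the mean and standard deviation stated in the problem; the variable names are assumptions introduced for illustration.

```python
from statistics import NormalDist

# Waiting time in hours: mean 7.5, standard deviation 2.0 (from the problem).
waiting = NormalDist(mu=7.5, sigma=2.0)

# Upper tail: probability of waiting 12 hours or more.
p_12_or_more = 1 - waiting.cdf(12)

# Probability of waiting between 3 and 9 hours.
p_between_3_and_9 = waiting.cdf(9) - waiting.cdf(3)

print(f"P(X >= 12)   = {p_12_or_more:.4f}")       # about 0.0122
print(f"P(3 < X < 9) = {p_between_3_and_9:.4f}")  # about 0.7611
```

Small differences in the fourth decimal place against a printed table come from the rounding of table entries.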

## Conducting a Hypothesis Test

A study was carried out to investigate whether LDL cholesterol level at 10 years old was associated with infant feeding during the first 3 months of life (infant formula only or exclusively breastfeeding).

The mean LDL cholesterol level at 10 years old was 82.6 mg/dL (SD 25.1 mg/dL) in children fed on infant formula only (n=100) and 70.6 mg/dL (SD 20.1 mg/dL) in the exclusively breastfed group (n=100).

Assume LDL cholesterol level is normally distributed. The level of significance is set at 0.05.

Identify the study population.

Children aged 10 years who were either fed on infant formula only or exclusively breastfed during the first 3 months of life.

What are the measurement scales of the 2 variables: “type of feeding” and “LDL cholesterol level”?

LDL cholesterol level: ratio scale (continuous)

Type of feeding: nominal

A two-sample independent t-test was used to assess the cholesterol level between formula-fed children and breastfed children. What is the null hypothesis to test? Is it a two-tailed test or a one-tailed test?

Null hypothesis: The mean LDL cholesterol level of formula-fed children is the same as that of breastfed children.

It is a two-tailed test, because the study asks whether the two means differ, without specifying a direction.
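The test statistic can be sketched from the summary statistics given above. This example assumes the pooled (equal-variance) form of the two-sample t-test, which the question does not specify, and it approximates the two-tailed p-value with the normal distribution because the degrees of freedom are large; the function name `pooled_t` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance (assumes equal variances)."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Formula-fed group vs exclusively breastfed group (values from the study).
t_stat, df = pooled_t(82.6, 25.1, 100, 70.6, 20.1, 100)

# Normal approximation to the two-tailed p-value; an exact value
# would use the t distribution with df degrees of freedom.
p_approx = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.2f}, df = {df}")  # t is about 3.73 with df = 198
print(f"approximate two-tailed p = {p_approx:.5f}")
```

Under these assumptions the approximate p-value falls well below the 0.05 significance level, so the null hypothesis of equal means would be rejected.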

Explain type I error. What is the probability of committing type I error in this study?

A type I error, also called a false positive, is the rejection of a null hypothesis that is actually true. In this study it would mean concluding that mean LDL cholesterol differs between formula-fed and breastfed children when the two population means are in fact the same. The probability of committing a type I error equals the significance level, α = 0.05.