# Analyzing Workforce Data for XYZ Company in Excel

In this analysis, we use Excel to examine XYZ Company's workforce data and the employee-related challenges the organization faces after the pandemic. Using descriptive statistics, hypothesis testing, a comparison of salaries by gender, and the relationship between age and salary, we aim to provide data-driven solutions to the obstacles faced by XYZ Company. Let's navigate through the data and discover the insights that will help XYZ Company excel in its commitment to treating its employees equitably.

## Problem Description:

The recent pandemic has had a significant impact on the working environment. Employees are increasingly demanding changes in workplace policies and benefits. As a result, XYZ Company, an organization committed to treating its employees well, is facing challenges in attracting and retaining staff in the post-pandemic climate. These challenges include adapting to remote work arrangements and addressing pay equity issues.

As the Manager of Workforce Analytics in the Human Resources department, your role is to provide data-driven insights to address these challenges. Jane, the Chief People Officer (CPO), is spearheading efforts to ensure the company treats its employees equitably. To achieve this, she has set several objectives:

1. Describe the current state of salary data using measures of central tendency and variability.
2. Provide a point estimate and construct a confidence interval for the number of employees who want to work remotely.
3. Conduct a hypothesis test to compare the mean salary between male and female employees.
4. Test the claim that employee pay is on par with industry standards.
5. Apply a normal distribution to analyze dental insurance plan expenses.
6. Investigate the correlation between seniority and pay.
7. Conduct a regression analysis to understand the relationship between employee age and salary.

## Solution

Let's explore the insights gathered from the workforce data:

### 1. Descriptive Statistics:

XYZ Company's salary data reveals the following statistics:

- Mean Salary: $174,339
- Median Salary: $146,412
- Salary Range: $569,316
- Standard Deviation: $95,378

These statistics indicate that the company's salary distribution is right-skewed, with a mean greater than the median. The high standard deviation suggests significant variability in employee salaries.
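As a rough illustration of how these measures relate to skew, the sketch below computes them in Python for a small, hypothetical list of salaries (the values are invented for the example and are not XYZ Company's data). In Excel, functions such as `AVERAGE`, `MEDIAN`, and `STDEV.S` return the same kinds of quantities.

```python
import statistics

# Hypothetical salaries for illustration only; not XYZ Company's data.
salaries = [98_000, 120_000, 135_000, 146_000, 160_000, 210_000, 420_000]

mean_salary = statistics.mean(salaries)
median_salary = statistics.median(salaries)
salary_range = max(salaries) - min(salaries)
std_dev = statistics.stdev(salaries)  # sample standard deviation (n - 1)

# A mean above the median is a quick indicator of right skew:
# a few large salaries pull the mean up but barely move the median.
right_skewed = mean_salary > median_salary

print(f"mean={mean_salary:,.0f} median={median_salary:,.0f}")
print(f"range={salary_range:,.0f} stdev={std_dev:,.0f} right_skewed={right_skewed}")
```

The single large value in the list plays the role of the high earners that stretch the right tail of a salary distribution.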

### 2. Hypothesis Testing:

To check whether employee pay aligns with industry standards, a hypothesis test is conducted. The null hypothesis states that the mean salary for engineering managers is at least $170,000; the alternative hypothesis is that it is less than $170,000. The calculated t-value of 0.6823 does not exceed the critical value of 1.9706, so the null hypothesis is not rejected: the data provide no evidence that employee pay falls below industry standards.

A 95% confidence interval for the mean salary falls between $161,809 and $186,869.
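The arithmetic behind the t-value and the interval can be sketched from the summary statistics alone. The mean, standard deviation, benchmark, and critical value below come from the text; the sample size `n = 225` is an assumption, since the text does not state it (it is the value that makes the reported interval width consistent with the reported standard deviation).

```python
import math

# Summary statistics reported in the text.
sample_mean = 174_339
sample_sd = 95_378
benchmark = 170_000      # industry-standard mean salary under test
t_critical = 1.9706      # critical value quoted in the text

# Assumption: the sample size is not given in the text; 225 is inferred
# from the width of the reported confidence interval.
n = 225

standard_error = sample_sd / math.sqrt(n)
t_stat = (sample_mean - benchmark) / standard_error

margin = t_critical * standard_error
ci_lower = sample_mean - margin
ci_upper = sample_mean + margin

# Mirrors the comparison made in the text. A strictly left-tailed rule
# would also fail to reject here, because t_stat is positive.
reject_null = abs(t_stat) > t_critical

print(f"t = {t_stat:.4f}, reject null: {reject_null}")
print(f"95% CI: {ci_lower:,.0f} to {ci_upper:,.0f}")
```

Under that assumed sample size, the computed statistic and interval agree with the figures reported above to within rounding.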

### 3. Estimates and Confidence Intervals:

A poll of 1,003 employees reveals that 37.2% prefer working in the office, which corresponds to a point estimate of about 373 respondents. The 95% confidence interval for the percentage of employees who prefer working in the office is 34.2% to 40.2%. Based on the hypothesis test, and consistent with an interval that lies entirely below 50%, it is concluded that the majority of employees (more than 50%) do not prefer returning to the office.
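The point estimate and interval can be reproduced with the usual normal-approximation formula for a proportion. The sketch below uses the poll figures from the text; the one-sample z test against 50% is an assumption about how the hypothesis test was set up, since the text does not give its details.

```python
import math

n = 1003          # poll size from the text
p_hat = 0.372     # share preferring the office, from the text
z_critical = 1.96 # standard normal quantile for a two-sided 95% interval

# Point estimate expressed as a head count.
point_estimate_count = round(p_hat * n)

# Normal-approximation (Wald) confidence interval for a proportion.
standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
margin = z_critical * standard_error
ci_lower = p_hat - margin
ci_upper = p_hat + margin

# Assumed test: H0 p = 0.5 against H1 p < 0.5, using the standard error
# computed under the null hypothesis.
z_stat = (p_hat - 0.5) / math.sqrt(0.5 * 0.5 / n)

print(f"count = {point_estimate_count}")
print(f"95% CI: {ci_lower:.1%} to {ci_upper:.1%}")
print(f"z = {z_stat:.2f}")
```

A strongly negative z statistic supports the conclusion that fewer than half of employees prefer the office.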

### 4. Inferences from Two Samples:

Using the salary data, a two-sample test shows that the mean salaries of male and female employees differ significantly. The reported p-value rounds to 0.000 (that is, p < 0.001), so the difference is very unlikely to be due to sampling variation alone.
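For readers who want to see the mechanics of a two-sample comparison, here is a minimal sketch of Welch's t statistic. The choice of Welch's (unequal-variance) version is an assumption, since the text does not say which variant was used, and the two groups are hypothetical values made up for the example, not XYZ Company's salaries.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    n_a, n_b = len(sample_a), len(sample_b)

    # Squared standard error contributed by each group.
    se2_a, se2_b = var_a / n_a, var_b / n_b

    t_stat = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)

    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (n_a - 1) + se2_b**2 / (n_b - 1))
    return t_stat, df


# Hypothetical salary samples for illustration only.
group_a = [150_000, 162_000, 171_000, 185_000, 199_000, 240_000]
group_b = [120_000, 131_000, 138_000, 149_000, 155_000, 170_000]

t_stat, df = welch_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df:.1f}")
```

The p-value would then come from a t distribution with `df` degrees of freedom; Excel's `T.TEST` function reports it directly.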

### 8. Central Limit Theorem:

The Central Limit Theorem states that, regardless of the shape of the population distribution, the sampling distribution of the sample mean approaches a normal distribution as the sample size increases. Two key properties of this sampling distribution are that its mean equals the population mean (μ) and that its variance shrinks as the sample size grows (Var(x̅) = σ²/n). A minimum sample size of 30 is commonly recommended when the population is not normally distributed; if the population itself is normal, the sample mean is normally distributed for any sample size.
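A small simulation can make these two properties concrete. The sketch below is an illustration under assumed settings (an exponential population with mean 1 and variance 1, a sample size of 30, and 5,000 repeated samples); none of these values come from XYZ Company's data.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable

SAMPLE_SIZE = 30     # n, the commonly recommended minimum
NUM_SAMPLES = 5_000  # number of repeated samples drawn

# Population: exponential with rate 1, which is strongly right-skewed
# and has mean 1 and variance 1.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

mean_of_means = statistics.fmean(sample_means)
variance_of_means = statistics.pvariance(sample_means)

print(f"mean of sample means:     {mean_of_means:.4f} (population mean is 1)")
print(f"variance of sample means: {variance_of_means:.4f} (sigma^2 / n is {1 / SAMPLE_SIZE:.4f})")
```

Even though the population is far from normal, the sample means cluster around the population mean with a variance close to σ²/n.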