## Problem Description:

For this study, we are working with data from the file "wage2.wf1" and focusing on several variables, including monthly earnings (WAGE), IQ score (IQ), years of education (EDUC), years of work experience (EXPER), years with the current employer (TENURE), marital status (MARRIED), residence in the South (SOUTH), and living in an SMSA (SMSA).

**Solutions**

## Question 1: Descriptive Statistics

- We begin by constructing a histogram for WAGE and providing summary statistics, such as the mean, median, and percentiles.
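- The reported mean wage is 958.6516 and the reported median is 705.500; the 25th and 75th percentiles are reported as 216.0 and 603.8. A median above the 75th percentile is not possible for a single distribution, so these figures should be rechecked against the software output.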
- We create a histogram for LOG(WAGE) and compare it to the histogram in part (1).
- The histogram for WAGE is right-skewed, while the histogram for LOG(WAGE) is left-skewed and closer to symmetric. Taking the logarithm compresses the long right tail of WAGE and so reduces the skewness of the distribution.
- We plot LOG(WAGE) against the IQ score, including a least-squares fitted line and discussing patterns.
- The least-squares fitted line has a positive slope, suggesting a positive relationship between LOG(WAGE) and IQ score. However, the data points scatter widely around the line, so IQ alone accounts for only part of the variation in LOG(WAGE).
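The effect of the log transformation on skewness can be illustrated with a minimal sketch. It does not read "wage2.wf1": the `wage` list below is synthetic log-normal data standing in for WAGE, and the helper `skewness` is a name introduced here for illustration only.

```python
import math
import random
import statistics


def skewness(values):
    """Sample skewness: third central moment over the cubed (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return sum((v - mean) ** 3 for v in values) / (len(values) * sd ** 3)


# Synthetic stand-in for WAGE (NOT the wage2.wf1 data): log-normal draws,
# which are right-skewed by construction.
rng = random.Random(0)
wage = [rng.lognormvariate(6.8, 0.4) for _ in range(1000)]
log_wage = [math.log(w) for w in wage]

# Summary statistics of the kind reported above: mean, quartiles, median.
q1, median, q3 = statistics.quantiles(wage, n=4)
summary = {"mean": statistics.fmean(wage), "q1": q1, "median": median, "q3": q3}

skew_wage = skewness(wage)
skew_log_wage = skewness(log_wage)
print(summary)
print(f"skewness WAGE = {skew_wage:.3f}, skewness LOG(WAGE) = {skew_log_wage:.3f}")
```

On right-skewed data of this kind the quartiles bracket the median and the mean sits above it, which is a quick consistency check for any reported summary table.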

## Question 2: The simple regression

- We estimate the simple regression model WAGE = β1 + β2·IQ + u1, interpret the slope, and comment on its sign.
- The estimated slope coefficient (β2) is 8.255828: on average, a one-point increase in IQ corresponds to an increase of about $8.26 in monthly earnings. The positive sign means that higher IQ scores are associated with higher earnings.
- For a 15-point IQ increase, the predicted increase in monthly earnings is 15 × 8.255828 ≈ $123.84.
- The R-squared value of 0.094194 indicates that IQ explains only about 9.4% of the variation in monthly earnings, implying that IQ does not explain most of the variation in wage.
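- We estimate the model LOG(WAGE) = α1 + α2·IQ + u2, interpret the slope, and comment on its sign.
- The estimated slope coefficient (α2) is reported as 0.08740: on average, a one-point increase in IQ corresponds to a 0.08740 increase in the log of monthly earnings, that is, approximately 100·α2 percent higher earnings. The positive sign indicates a positive relationship.
- A 15-point increase in IQ is reported to correspond to a predicted 28.1% increase in monthly earnings. Under the approximation %ΔWAGE ≈ 100·α2·ΔIQ, this figure does not match the reported slope (100 × 0.08740 × 15 ≈ 131%), so one of the two values should be rechecked against the regression output.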
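As a rough illustration of how the slope and R-squared of a simple regression are obtained, here is a minimal sketch of the least-squares formulas. The data are synthetic (not "wage2.wf1"), and `simple_ols` is a hypothetical helper name, not a function from EViews or any library.

```python
import random
import statistics


def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the least-squares line y = b1 + b2 * x."""
    x_bar = statistics.fmean(x)
    y_bar = statistics.fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    ss_total = sum((yi - y_bar) ** 2 for yi in y)
    ss_resid = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return intercept, slope, 1.0 - ss_resid / ss_total


# Synthetic data (NOT wage2.wf1): wage rises with IQ, plus large noise,
# so the slope is positive but R-squared stays low.
rng = random.Random(1)
iq = [rng.gauss(100, 15) for _ in range(500)]
wage = [120 + 8.0 * q + rng.gauss(0, 380) for q in iq]

b1, b2, r2 = simple_ols(iq, wage)
print(f"slope = {b2:.3f}, R-squared = {r2:.3f}")
print(f"predicted change for +15 IQ points = {15 * b2:.2f}")
```

A clearly positive slope together with a small R-squared is the same pattern described above: IQ matters on average, yet most of the variation in wages is left unexplained.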

**We use this model to find the predicted wage increase for a 15-point increase in IQ.**

**We evaluate whether IQ explains most of the variation in wage by examining the R-squared value.**

**We calculate the approximate percentage increase in predicted wage for a 15-point IQ increase.**
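The two kinds of 15-point calculations can be sketched as follows. The level-model slope is the value quoted in the text; the log-level slope used below is a hypothetical illustration value, not an estimate from the study, and both function names are introduced here for illustration.

```python
import math


def level_effect(slope, delta_x):
    """Predicted change in y for a level-level model: delta_y = slope * delta_x."""
    return slope * delta_x


def log_level_effect(slope, delta_x):
    """Percentage change in y for a log-level model.

    Returns (approximate, exact): 100 * slope * delta_x and
    100 * (exp(slope * delta_x) - 1).
    """
    approx = 100.0 * slope * delta_x
    exact = 100.0 * (math.exp(slope * delta_x) - 1.0)
    return approx, exact


# Level model: slope as reported in the text.
wage_change = level_effect(8.255828, 15)
print(f"level model, +15 IQ: {wage_change:.2f} dollars per month")

# Log-level model: the slope below is a HYPOTHETICAL illustration value,
# not an estimate taken from the study.
approx_pct, exact_pct = log_level_effect(0.01, 15)
print(f"log-level model, +15 IQ: approx {approx_pct:.1f}%, exact {exact_pct:.1f}%")
```

The approximate figure is the one usually quoted; the exact figure is always somewhat larger for a positive slope, and the gap widens as the product of slope and change grows.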

## Question 3: The multiple regression

- We estimate the multiple regression model LOG(WAGE) = γ1 + γ2·EDUC + γ3·EXPER + γ4·TENURE + u3 and discuss the estimated coefficients, their interpretations, signs, and statistical significance.
- The estimated coefficients are all positive and statistically significant, indicating a positive relationship between education, experience, tenure, and the log of monthly earnings.
- We test the overall significance of the regression, reporting the F-statistic and its p-value.
- The F-statistic is 57.00936 with a p-value reported as 0.000 (that is, below 0.0005), so we reject the null hypothesis that all slope coefficients are zero: the independent variables are jointly significant in explaining log(wage).
- We calculate the coefficient of determination (R-squared) and the estimate of the residual variance.
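- The R-squared is 0.1559, so the independent variables explain about 15.59% of the variation in log(wage). The residual variance estimate is reported as 0.387851. If that figure is the regression's standard error (the square root of the residual variance) rather than the variance itself, the residual variance estimate is 0.387851² ≈ 0.1504.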
- We state the null hypothesis regarding the effect of another year of general workforce experience compared to tenure with the current employer and perform a t-test.
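- The null hypothesis is that the coefficients of EXPER and TENURE are equal, H0: γ3 = γ4, against the two-sided alternative H1: γ3 ≠ γ4. The test statistic is t = (γ̂3 − γ̂4) / se(γ̂3 − γ̂4). According to the reported t-test, an additional year of general workforce experience significantly impacts log(wage), while an extra year of tenure has little effect.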
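The two test statistics in this question can be sketched as below. The R-squared and the number of regressors come from the text; the sample size `n = 935` is an assumption (the text does not state it), and every input to the equality test is a hypothetical placeholder. The function names are introduced here for illustration.

```python
import math


def overall_f_stat(r_squared, k, n):
    """F statistic for H0: all k slope coefficients are zero, given n observations."""
    return (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))


def t_stat_equal_coefs(b_a, b_b, var_a, var_b, cov_ab):
    """t statistic for H0: beta_a = beta_b, using the coefficient covariance matrix."""
    se_diff = math.sqrt(var_a + var_b - 2.0 * cov_ab)
    return (b_a - b_b) / se_diff


# R-squared and k = 3 regressors come from the text; n = 935 is an ASSUMED
# sample size (the text does not state it).
f_stat = overall_f_stat(0.1559, k=3, n=935)
print(f"F = {f_stat:.2f}")

# All five inputs below are HYPOTHETICAL placeholders, not estimates from the study.
t_stat = t_stat_equal_coefs(b_a=0.015, b_b=0.013, var_a=1.0e-5, var_b=0.7e-5, cov_ab=-0.2e-5)
print(f"t = {t_stat:.3f}")
```

The equality test needs the covariance between the two coefficient estimates, not only their standard errors, which is why the individual t-statistics alone cannot settle whether the EXPER and TENURE effects differ.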

## Question 4: Further Analysis

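We add the squared terms EXPER² and TENURE² (the variables EXPER2 and TENURE2) to the model from Question 3 and test their joint significance at the 10% level, that is, we use an F-test of the null hypothesis that both squared-term coefficients are zero. The results indicate that all coefficients are statistically significant at this level.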
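A joint test of this kind is commonly computed from the R-squared values of the restricted and unrestricted regressions. The sketch below assumes that form; the restricted R-squared is the Question 3 value from the text, while the unrestricted R-squared and the sample size are hypothetical placeholders, and `joint_f_stat` is a name introduced here for illustration.

```python
def joint_f_stat(r2_unrestricted, r2_restricted, q, n, k_unrestricted):
    """F statistic for q exclusion restrictions, R-squared form.

    n is the number of observations and k_unrestricted the number of slope
    coefficients in the unrestricted model.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1.0 - r2_unrestricted) / (n - k_unrestricted - 1)
    return numerator / denominator


# Restricted R-squared (0.1559) is the Question 3 value from the text.
# The unrestricted R-squared (0.1600) and n = 935 are HYPOTHETICAL placeholders.
f_joint = joint_f_stat(
    r2_unrestricted=0.1600,
    r2_restricted=0.1559,
    q=2,
    n=935,
    k_unrestricted=5,
)
print(f"joint F = {f_joint:.3f}")
```

The resulting statistic would be compared with the 10% critical value of the F distribution with 2 and n − 6 degrees of freedom; joint significance of the two squared terms does not require each one to be individually significant.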