## Problem Description:

The data analysis assignment involves conducting statistical analyses using the GSS18 dataset to explore relationships between different sets of variables. Students are required to determine the appropriate hypothesis test (Chi-Square, t-test, ANOVA, or Correlation) for each set of variables and provide detailed interpretations of the statistical outputs. Below are the solutions for the assignment:

### A) Mother's Religion and Religious Preference

**Variables:**

- #478 – MARELKID - Nominal
- #768 – RELIG (R’s religion preference) - Nominal

**Chi-Square Tests**

| | Value | df | Asymp. Sig. (2-sided) |
|---|---|---|---|
| Pearson Chi-Square | 2706.759a | 110 | 0.000 |
| Likelihood Ratio | 932.547 | 110 | 0.000 |
| Linear-by-Linear Association | 150.306 | 1 | 0.000 |
| N of Valid Cases | 1131 | | |

**Solution:** For these variables, a chi-square test was performed.

The p-value is below 0.05, so we reject the null hypothesis (H0) in favor of the alternative (H1): there is a statistically significant relationship between the religion of a respondent's mother during the respondent's childhood and the respondent's own religious preference. The association is moderate, as indicated by Cramér's V (0.489) and lambda (0.429). The crosstabulation must be examined further to understand the precise pattern of the relationship.
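As a rough check, Cramér's V can be recomputed from the reported chi-square statistic. The sketch below assumes the smaller dimension of the crosstab has 11 categories (so that min(rows, columns) - 1 = 10), which is one factorization consistent with df = 110; the table dimensions are not shown in the output, and the function name is illustrative.

```python
import math

def cramers_v(chi2: float, n: int, min_dim: int) -> float:
    """Cramér's V = sqrt(chi2 / (n * (min_dim - 1)))."""
    return math.sqrt(chi2 / (n * (min_dim - 1)))

# chi2 and n come from the Chi-Square Tests table.
# min_dim=11 is an assumption (not shown in the output).
v = cramers_v(chi2=2706.759, n=1131, min_dim=11)
print(round(v, 3))
```

Under that assumption the result agrees with the reported value of 0.489 to three decimals.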

### B) Spending Time at a Bar and Age

**Variables:**

- #869 – SOCBAR (spend evening at bar) - Ordinal (7 groups)
- #28 – AGE (respondent's age) - Ratio

**Solution:** For these variables, an ANOVA test was conducted.

**ANOVA: Age of respondent**

| | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Between Groups | 34702.745 | 6 | 5783.791 | 19.319 | 0.000 |
| Within Groups | 461939.962 | 1543 | 299.378 | | |
| Total | 496642.707 | 1549 | | | |

The p-value is below 0.05, so we reject the null hypothesis (H0) that all group means are equal: the mean age of at least one of the seven SOCBAR groups differs from the others. Specifically, respondents who never spend evenings at a bar are 5 to 17 years younger than those who spend evenings at a bar several times a week.
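The F statistic in the ANOVA table follows directly from the sums of squares and degrees of freedom. A minimal sketch that recomputes it from the reported values (variable names are illustrative):

```python
# Values taken from the ANOVA table for age of respondent.
ss_between, df_between = 34702.745, 6
ss_within, df_within = 461939.962, 1543

# Mean square = sum of squares / degrees of freedom.
ms_between = ss_between / df_between
ms_within = ss_within / df_within

# F is the ratio of between-group to within-group mean squares.
f_stat = ms_between / ms_within

print(round(ms_between, 3), round(ms_within, 3), round(f_stat, 3))
```

The between-groups df of 6 corresponds to the seven SOCBAR categories (7 - 1), and the two sums of squares add up to the reported total.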

### C) Poor Mental Health and Extra Work Hours

**Variables:**

- #524 – MNTLHLTH (days of poor mental health in the past 30 days) - Ratio
- #527 – MOREDAYS (days per month R worked extra hours) - Ratio

**Solution:** A correlation analysis was performed for these variables.

**Correlation**

| | | Days of poor mental health past 30 days | Days per month R work extra hours |
|---|---|---|---|
| Days of poor mental health past 30 days | Pearson Correlation | 1 | 0.026 |
| | Sig. (2-tailed) | | 0.327 |
| | N | 1408 | 1391 |
| Days per month R work extra hours | Pearson Correlation | 0.026 | 1 |
| | Sig. (2-tailed) | 0.327 | |
| | N | 1391 | 1401 |

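The p-value (0.327) is above 0.05, so we fail to reject the null hypothesis (H0): there is no statistically significant linear relationship between the number of days of poor mental health and the number of days per month on which the respondent worked extra hours. The correlation itself is negligible (r = 0.026).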
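The significance of a Pearson correlation can be approximated from r and N alone. The sketch below converts r to a t statistic and, because N is large, uses a normal approximation for the two-sided p-value; that approximation and the function names are assumptions for illustration, not the procedure SPSS uses internally. Since r is reported rounded to three decimals, the result will be close to, but not exactly, the reported 0.327.

```python
import math

def corr_t_stat(r: float, n: int) -> float:
    """t statistic for testing H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1 - r * r))

def two_sided_p_normal(z: float) -> float:
    """Two-sided p-value under a standard normal approximation."""
    return math.erfc(abs(z) / math.sqrt(2))

# r and pairwise N from the Correlation table.
t = corr_t_stat(r=0.026, n=1391)
p = two_sided_p_normal(t)

print(round(t, 3), round(p, 3), "reject H0" if p < 0.05 else "fail to reject H0")
```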

### D) Weeks Worked Last Year and Depression

**Variables:**

- #983 – WEEKSWRK (weeks R worked last year) - Ratio
- #177 – told have depression - Nominal

**Independent Samples Test**

| | | Levene's Test for Equality of Variances: F | Levene's Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| Weeks R worked last year | Equal variances assumed | 18.718 | .000 | -2.683 | 1389 | .007 | -2.106 | .785 | -3.645 | -.566 |
| | Equal variances not assumed | | | -2.362 | 353.816 | .019 | -2.106 | .891 | -3.859 | -.353 |

**Solution:**

For these variables, an independent-samples t-test was conducted. Levene's test is significant (p < 0.05), so equal variances cannot be assumed and the "Equal variances not assumed" row applies (t = -2.362, p = .019). Because the p-value is below 0.05, we reject the null hypothesis (H0): the average number of weeks worked in the past year differs between individuals diagnosed with depression and those who are not. The 95% confidence interval for the mean difference indicates that those without depression worked about 0.35 to 3.86 weeks more than those diagnosed with depression.
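SPSS reports two rows for the independent-samples t-test, and Levene's test determines which one to read. A minimal sketch of that decision rule, using the values from the table (the function name and dictionary layout are illustrative):

```python
def pick_ttest_row(levene_p: float, alpha: float = 0.05) -> str:
    """Choose the t-test row based on Levene's test for equality of variances."""
    if levene_p < alpha:
        return "Equal variances not assumed"
    return "Equal variances assumed"

# Both rows of the Independent Samples Test output for WEEKSWRK.
rows = {
    "Equal variances assumed": {"t": -2.683, "df": 1389, "p": 0.007},
    "Equal variances not assumed": {"t": -2.362, "df": 353.816, "p": 0.019},
}

chosen = pick_ttest_row(levene_p=0.000)  # Levene's Sig. from the table
result = rows[chosen]
reject_h0 = result["p"] < 0.05

print(chosen, result, "reject H0" if reject_h0 else "fail to reject H0")
```

Here both rows lead to the same conclusion, but the confidence interval and p-value to report come from the selected row.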

### E) Height and Weight

**Variables:**

- #305 – HEIGHT - Ratio
- #984 – WEIGHT - Ratio

**Solution:** Correlation analysis was used for these ratio variables.

**Correlation**

| | | R weighs how much | R is how tall |
|---|---|---|---|
| R weighs how much | Pearson Correlation | 1 | 0.457 |
| | Sig. (2-tailed) | | 0.000 |
| | N | 1380 | 1374 |
| R is how tall | Pearson Correlation | 0.457 | 1 |
| | Sig. (2-tailed) | 0.000 | |
| | N | 1374 | 1402 |

The p-value is below 0.05, so the correlation is statistically significant: there is a moderate positive association between a person's height and weight (r = 0.457). About 20.88% of the variance in weight (r² = 0.2088) is explained by height, while the remaining 79.12% is attributable to other factors.
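The percentages quoted above come from squaring the correlation coefficient (the coefficient of determination). A short sketch of that arithmetic:

```python
r = 0.457  # Pearson correlation between height and weight, from the table

# Coefficient of determination: share of variance in one variable
# that is linearly accounted for by the other.
r_squared = r ** 2
explained_pct = round(100 * r_squared, 2)
unexplained_pct = round(100 - explained_pct, 2)

print(explained_pct, unexplained_pct)
```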

### F) Mandatory Overtime and Real Income

**Variables:**

- #532 – MUSTWORK (mandatory to work extra hours) - Nominal
- #722 – REALRINC (R’s income in constant dollars) - Ratio

**Solution:** For these variables, a t-test was conducted.

**Independent Samples Test**

| | | Levene's Test for Equality of Variances: F | Levene's Test: Sig. | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% CI of the Difference: Lower | 95% CI of the Difference: Upper |
|---|---|---|---|---|---|---|---|---|---|---|
| R's income in constant $ | Equal variances assumed | .038 | .844 | 1.174 | 1192 | .241 | 2127.793 | 1812.996 | -1429.225 | 5684.811 |
| | Equal variances not assumed | | | 1.180 | 621.339 | .239 | 2127.793 | 1803.898 | -1414.682 | 5670.268 |

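Levene's test is not significant (p = .844), so the "Equal variances assumed" row applies. Its p-value (.241) is above 0.05, so we fail to reject the null hypothesis (H0): there is no statistically significant difference in mean income in constant dollars between respondents who are required to work overtime and those who are not. No further analysis is needed because the null hypothesis was not rejected.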
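Two quick consistency checks can be run on the reported output: the t statistic is the mean difference divided by its standard error, and a 95% confidence interval that contains zero corresponds to a two-sided p-value above 0.05. A minimal sketch using the "Equal variances assumed" row (variable names are illustrative):

```python
# Values from the "Equal variances assumed" row for REALRINC.
mean_diff = 2127.793
std_err = 1812.996
ci_lower, ci_upper = -1429.225, 5684.811

# t statistic: mean difference scaled by its standard error.
t_stat = mean_diff / std_err

# If the 95% CI for the difference includes zero, a difference of zero
# is plausible and H0 is not rejected at the 0.05 level.
ci_contains_zero = ci_lower <= 0.0 <= ci_upper

print(round(t_stat, 3), ci_contains_zero)
```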