Survival Analysis: Factors Related to Breastfeeding
Survival analysis (also known as time to event analysis) is a field of statistics that studies and analyzes the expected amount of time until an event of interest occurs. In most instances, survival analysis is used to estimate the duration of time for a group of individuals to experience a specific event. It is one of the most commonly used statistical methods for analyzing data on the amount of time left until certain events occur such as device failure, heart attack, death, etc.

### Analysis of Variance (ANOVA)

We analyze the duration of breastfeeding using a dataset of 927 mother-infant pairs and examine whether covariates such as race, poverty, smoking status of the mother, age of the mother, birthyr, education and prenatal care affect that duration.
We use ANOVA, t-tests for two independent samples, a Cox regression model and multiple linear regression to estimate the effects of these covariates on the duration of breastfeeding.

### Variation of duration by race

An ANOVA was conducted to test whether the duration of breastfeeding varied by the race of the mother. The F-value is 3.114, with a p-value of 0.0449 < 0.05. Thus, at the 5% significance level, we reject the null hypothesis that the average duration of breastfeeding is equal across the race groups.

The average breastfeeding duration was highest for White mothers at 17.036 weeks, followed by Black mothers at 15.188 weeks and mothers from Other races at 13.115 weeks.
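As a rough check on the ANOVA result, the p-value can be recomputed from the reported F-value. The sketch below assumes three race groups and that all 927 pairs entered the test, which gives degrees of freedom (2, 924); neither figure is stated in the write-up. With a numerator df of 2 the F tail probability has a simple closed form, so only the standard library is needed. The function name is hypothetical.

```python
def f_upper_tail_df1_2(f_value: float, df2: int) -> float:
    """P(F > f_value) for an F(2, df2) distribution.

    When the numerator df is exactly 2, the upper tail has the closed
    form (1 + 2*f/df2) ** (-df2/2).
    """
    return (1.0 + 2.0 * f_value / df2) ** (-df2 / 2.0)


n_obs = 927      # mother-infant pairs (from the text)
n_groups = 3     # assumption: White, Black, Other
f_value = 3.114  # reported F-value

df_between = n_groups - 1     # numerator df
df_within = n_obs - n_groups  # denominator df (assumes no missing values)

p_value = f_upper_tail_df1_2(f_value, df_within)
print(f"df = ({df_between}, {df_within}), p = {p_value:.4f}")
```

Under these assumptions the result lands close to the reported 0.0449.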

We then conduct t-tests for two independent samples to assess the impact of the binary covariates on the length of breastfeeding.

### Variation of duration by the smoking status of the mother at the time of birth

A t-test for independent samples was conducted to test whether the duration of breastfeeding varied by whether the mother smoked at the time of the birth of the child. The t-statistic is 3.0139, with a p-value of 0.002685 < 0.05. Thus, we reject the null hypothesis and conclude that the two groups have different mean durations of breastfeeding.

Mothers who smoked breastfed their infants for an average of 13.65 weeks, and those who did not smoke for 17.21 weeks.

### Variation of duration by alcohol drinking status of the mother at the time of birth

A t-test for independent samples was conducted to test whether the duration of breastfeeding varied by whether the mother drank at the time of the birth of the child. The t-statistic is 1.382, with a p-value of 0.1701 > 0.05. Thus, we fail to reject the null hypothesis: there is no statistically significant difference in breastfeeding duration between the two groups.

Mothers who drank breastfed their infants for an average of 13.772 weeks, and those who did not drink for 16.40 weeks.

### Variation of duration by the economic status of the mother (poverty)

A t-test for independent samples was conducted to test whether the duration of breastfeeding varied by whether the mother was in poverty at the time of the birth of the child. The t-statistic is -0.98218, with a p-value of 0.327 > 0.05. Thus, we fail to reject the null hypothesis: there is no statistically significant difference in breastfeeding duration between the two groups.

Mothers in poverty breastfed for an average of 15.87 weeks, and those not in poverty for 17.52 weeks.

### Variation of duration by whether the mother sought prenatal care after 3 months of pregnancy

A t-test for independent samples was conducted to test whether the duration of breastfeeding varied by whether the mother sought prenatal care after 3 months of pregnancy. The t-statistic is 0.66, with a p-value of 0.5084 > 0.05. Thus, we fail to reject the null hypothesis: there is no statistically significant difference in breastfeeding duration between the two groups.

Mothers who sought prenatal care after 3 months breastfed for an average of 15.38 weeks, and those who did not for 16.35 weeks.
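The four t-test decisions can be collected in one place. The sketch below takes the t-statistics and p-values reported in the text and applies the 5% decision rule. It also recomputes each p-value with a standard normal approximation, which is an assumption made here to avoid needing the exact degrees of freedom (not reported), so the recomputed values differ slightly from the reported ones.

```python
from statistics import NormalDist

ALPHA = 0.05

# (covariate, reported t-statistic, reported p-value), all taken from the text
reported = [
    ("smoking", 3.0139, 0.002685),
    ("alcohol", 1.382, 0.1701),
    ("poverty", -0.98218, 0.327),
    ("prenatal care after 3 months", 0.66, 0.5084),
]


def two_sided_p_normal(t_stat: float) -> float:
    """Two-sided p-value, using the standard normal as a large-sample
    stand-in for the t distribution."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(t_stat)))


results = []
for name, t_stat, p_reported in reported:
    p_approx = two_sided_p_normal(t_stat)
    reject = p_reported < ALPHA  # decision based on the reported p-value
    results.append((name, p_approx, reject))
    print(f"{name:30s} t = {t_stat:8.4f}  p_reported = {p_reported:.4f}  "
          f"p_normal = {p_approx:.4f}  reject H0: {reject}")
```

Only the smoking comparison rejects the null hypothesis at the 5% level.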

To better understand the impact of the covariates on the length of breastfeeding, we use the Cox regression model.

### Cox regression test

The smoke and race variables are statistically significant in the Cox regression model, with a hazard ratio of 1.3341 for the smoke variable and 1.189 for the race variable.

The Wald statistic for the smoke variable is 3.028 and its p-value is 0.000182 < 0.05, so the smoke variable is a statistically significant covariate. The hazard ratio of 1.3341 means that, at any given time, mothers who smoke are about 1.33 times as likely to stop breastfeeding as mothers who do not, and therefore tend to breastfeed for a shorter time.

The model output also reports 95% confidence intervals for the hazard ratios. For the smoke variable the interval is (1.1472, 1.551), which excludes 1.
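The link between a hazard ratio, its confidence interval and its p-value can be sketched as follows. The code assumes the reported interval is a Wald interval that is symmetric on the log scale, which is the usual form in Cox model output but is not confirmed by the write-up. From that it recovers the standard error of the log hazard ratio and the implied z-statistic and two-sided p-value.

```python
from math import log
from statistics import NormalDist

hr = 1.3341                      # reported hazard ratio for smoke
ci_low, ci_high = 1.1472, 1.551  # reported 95% confidence interval

z_crit = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval

# On the log scale a Wald interval is log(HR) +/- z_crit * SE,
# so its width is 2 * z_crit * SE.
se_log_hr = (log(ci_high) - log(ci_low)) / (2 * z_crit)
z_stat = log(hr) / se_log_hr
p_value = 2 * (1 - NormalDist().cdf(abs(z_stat)))

print(f"SE(log HR) = {se_log_hr:.4f}, z = {z_stat:.2f}, p = {p_value:.6f}")
```

Under this assumption the implied p-value is close to the reported 0.000182.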

The race variable has a hazard ratio of 1.189 and the p-value is 0.000171 < 0.05, making the variable statistically significant.

The overall significance of the Cox regression model is given by the three tests shown at the bottom of the image: the likelihood ratio test, the Wald test and the log-rank test. The p-values of all three tests are less than 0.05. Thus, we conclude that the overall model is statistically significant.

### Regression analysis

We fitted a linear regression model to predict the duration of breastfeeding from the other variables. The F-statistic of the regression model is 6.174 with (9, 917) degrees of freedom, and its p-value is 1.769 × 10^-8 < 0.05. Thus, the overall regression model is statistically significant. However, the adjusted R-squared is only 0.04788, meaning the model explains less than 5% of the variation in duration and is not a good fit.
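The F-statistic and the adjusted R-squared of a linear regression are tied together through the degrees of freedom, so one can be checked against the other. The sketch below uses only the figures reported above and assumes the model includes an intercept, so that n = 9 + 917 + 1 = 927.

```python
f_stat = 6.174                   # reported F-statistic
df_model, df_resid = 9, 917      # reported degrees of freedom
n_obs = df_model + df_resid + 1  # assumes an intercept term

# F = (R^2 / df_model) / ((1 - R^2) / df_resid), solved for R^2
r_squared = f_stat * df_model / (f_stat * df_model + df_resid)

# Adjusted R^2 penalises R^2 for the number of predictors
adj_r_squared = 1 - (1 - r_squared) * (n_obs - 1) / df_resid

print(f"n = {n_obs}, R^2 = {r_squared:.5f}, adjusted R^2 = {adj_r_squared:.5f}")
```

The recovered sample size matches the 927 mother-infant pairs, and the adjusted R-squared agrees with the reported 0.04788 up to rounding.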

In an effort to improve the regression model, the non-significant variables were dropped and a new regression model was run, with the following result: the adjusted R-squared improved to 0.05244 from 0.04788. The new value is still so low that the model cannot be considered a good fit.

### Sample size calculation for smokers

Power = 80%

Alpha = 0.05

Hazard ratio (HR) = 1.3341

Proportion of smokers = 29.12% = q0

Proportion of non-smokers = 70.88% = q1

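Sample size required:

$$ n = \frac{(Z_{\alpha/2} + Z_{\beta})^2}{(\log \mathrm{HR})^2 \, q_0 \, q_1} \approx 458 $$

where $Z_{\alpha/2} \approx 1.96$ and $Z_{\beta} \approx 0.84$ are the standard normal quantiles for a two-sided alpha of 0.05 and a power of 80%.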
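The calculation above can be reproduced in a few lines. The sketch assumes a two-sided alpha, which is what yields the figure of 458, and the function name is hypothetical. The expression has the form of Schoenfeld's formula, which is usually read as the number of events required rather than the number of subjects.

```python
from math import ceil, log
from statistics import NormalDist


def required_size(hr: float, q0: float, q1: float,
                  alpha: float = 0.05, power: float = 0.80) -> float:
    """(z_alpha + z_beta)^2 / (log(HR)^2 * q0 * q1), with a two-sided alpha."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96
    z_beta = NormalDist().inv_cdf(power)           # about 0.84
    return (z_alpha + z_beta) ** 2 / (log(hr) ** 2 * q0 * q1)


raw = required_size(hr=1.3341, q0=0.2912, q1=0.7088)
n_required = ceil(raw)  # round up to a whole number
print(f"raw = {raw:.2f}, required = {n_required}")
```

Rounding up gives 458, in line with the figure quoted above.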

#### Results

The tests conducted above show that mothers who smoked breastfed their infants for a shorter time than mothers who did not, a finding supported by both the t-test and the Cox regression model. Racial disparities exist as well: White mothers had the longest average duration of breastfeeding, followed by Black mothers and mothers from other races.

Other variates failed to have a statistically significant impact on the duration of breastfeeding.