How Various Statistical Methods are Applied in Real Life

Statistical methods are mathematical techniques, models, and formulas used to analyze raw research data. They extract useful information from the data and provide ways to assess the robustness of research results. Commonly used methods include the mean, standard deviation, probability, regression, and hypothesis testing.

Correlation Analysis

The variable educ is endogenous if it is correlated with the error term. The classical linear regression model (CLRM) assumes that there is no correlation between the explanatory variables and the error term; that is, the error u has an expected value of zero given any values of the independent variables. When, for any reason, educ is correlated with u, educ is said to be an endogenous explanatory variable.
For a variable z to serve as an instrument for educ, it must satisfy two requirements. The first requirement is that z must be correlated with educ, i.e.
cov(educ,z)≠0
This means that z must be related to educ, whether positively or negatively.
The second requirement is that z must not be correlated with the error term u, i.e.
cov(z,u)=0
This means that z must be exogenous in the equation: z should have no partial effect on children once educ is controlled for, and it should be uncorrelated with any omitted variables.

Hypothesis Testing

The exogeneity requirement, cov(z,u)=0, cannot generally be tested unless we have proxies for the omitted variables in the error term. The relevance requirement, cov(educ,z)≠0, can be tested with frsthalf as the instrument z by estimating the simple linear regression model
educ=γ_0+γ_1 frsthalf+v
and then testing the null hypothesis
H_0:γ_1=0
If we are able to reject the null hypothesis, the relevance requirement is met.
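The first-stage test can be sketched in a few lines of Python. This is a minimal illustration, not the original analysis: the function name first_stage_test and the four toy observations are made up, and only the regression of educ on frsthalf and the t statistic for γ_1 come from the text.

```python
import math

def first_stage_test(educ, z):
    """Regress educ on z by OLS and return (slope, standard error, t statistic)."""
    n = len(educ)
    mean_z = sum(z) / n
    mean_educ = sum(educ) / n
    # Sum of squared deviations of the instrument
    s_zz = sum((zi - mean_z) ** 2 for zi in z)
    # OLS slope (gamma_1) and intercept (gamma_0) of the first-stage regression
    gamma1 = sum((zi - mean_z) * (ei - mean_educ) for zi, ei in zip(z, educ)) / s_zz
    gamma0 = mean_educ - gamma1 * mean_z
    # Residual sum of squares; the error variance uses n - 2 degrees of freedom
    ssr = sum((ei - gamma0 - gamma1 * zi) ** 2 for zi, ei in zip(z, educ))
    se = math.sqrt(ssr / (n - 2) / s_zz)
    return gamma1, se, gamma1 / se

# Made-up toy data, for illustration only (not the data behind the text)
frsthalf = [0, 0, 1, 1]
educ = [10, 12, 8, 10]
gamma1, se, t_stat = first_stage_test(educ, frsthalf)
print(gamma1, se, t_stat)
```

H_0 is rejected when the absolute value of the t statistic exceeds the critical value of the t distribution with n-2 degrees of freedom (roughly 1.96 at the 5% level in large samples). Four observations are far too few for a real test; the toy data only show the mechanics.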
4
The instrumental variable (IV) estimate of the coefficient on educ is -0.278, which means that 10 more years of education reduce the expected number of children by approximately 3 (10 × 0.278 = 2.78).
5
cov(children,frsthalf)=cov(frsthalf,β_0+β_1 educ+u)
cov(children,frsthalf)=β_0 cov(frsthalf,1)+β_1 cov(frsthalf,educ)+cov(frsthalf,u)
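Since the covariance with a constant is zero, cov(frsthalf,1)=0, and since the instrument is exogenous, cov(frsthalf,u)=0, we have
cov(children,frsthalf)=β_1 cov(frsthalf,educ)
Because cov(frsthalf,educ)≠0, we can divide both sides by it to obtain
cov(children,frsthalf)/cov(frsthalf,educ) =β_1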
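The ratio-of-covariances result can be checked on simulated data. The sketch below is only an illustration under assumed values: the data-generating process, the coefficient values (including the true β_1 of -0.3), and the names cov, iv_slope, ols_slope, and simulate are all hypothetical and do not come from the original data set.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def iv_slope(y, x, z):
    """IV estimate of the slope: cov(y, z) / cov(x, z)."""
    return cov(y, z) / cov(x, z)

def ols_slope(y, x):
    """OLS estimate of the slope: cov(y, x) / var(x)."""
    return cov(y, x) / cov(x, x)

def simulate(n=20000, beta1=-0.3, seed=0):
    """Hypothetical data: an omitted factor raises educ and lowers children."""
    rng = random.Random(seed)
    children, educ, frsthalf = [], [], []
    for _ in range(n):
        z = 1.0 if rng.random() < 0.5 else 0.0   # binary instrument
        omitted = rng.gauss(0, 1)                # unobserved, ends up in u
        e = 12 - 2.0 * z + omitted + rng.gauss(0, 1)
        u = -omitted + rng.gauss(0, 1)           # correlated with educ, not with z
        children.append(6 + beta1 * e + u)
        educ.append(e)
        frsthalf.append(z)
    return children, educ, frsthalf

children, educ, frsthalf = simulate()
beta1_iv = iv_slope(children, educ, frsthalf)
beta1_ols = ols_slope(children, educ)
print(beta1_iv, beta1_ols)
```

In this setup the IV estimate should land close to the assumed β_1, while the OLS estimate is pulled away from it because educ is correlated with u.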
QUESTION 2
1
y=β_0+β_1 x_1+u (1)
ỹ=y+e_0 (2)
Thus
y=ỹ-e_0 (3)
Substituting equation (3) into equation (1), we have
ỹ-e_0=β_0+β_1 x_1+u
ỹ=β_0+β_1 x_1+u+e_0
This is the equation we are able to estimate. The new error term is u+e_0.
2
The estimators of β_0 and β_1 are unbiased under certain conditions. The estimator of β_0 is unbiased if the measurement error e_0 has zero mean. The estimator of β_1 is unbiased if the measurement error in y is statistically independent of x_1.
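A small simulation can illustrate why measurement error in y leaves the slope estimate unbiased when the error is independent of x_1. All numbers below (β_0 = 1, β_1 = 2, the error variances, the sample size) and the function names are assumptions chosen for illustration.

```python
import random

def ols_slope(y, x):
    """OLS slope from a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx

def simulate_y_error(n=20000, beta0=1.0, beta1=2.0, seed=1):
    """Generate x_1, the true y, and the mismeasured y (y plus e_0)."""
    rng = random.Random(seed)
    x1, y, y_tilde = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        u = rng.gauss(0, 1)
        e0 = rng.gauss(0, 2)          # zero-mean error, independent of x_1
        yi = beta0 + beta1 * x + u
        x1.append(x)
        y.append(yi)
        y_tilde.append(yi + e0)       # observed dependent variable
    return x1, y, y_tilde

x1, y, y_tilde = simulate_y_error()
slope_true_y = ols_slope(y, x1)        # regression with the true y
slope_noisy_y = ols_slope(y_tilde, x1) # regression with the mismeasured y
print(slope_true_y, slope_noisy_y)
```

Both slopes should be close to the assumed β_1. The measurement error only enlarges the error variance, which makes the estimate less precise without shifting it.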
3
The error v is given as
v=u-β_1 e_1
x̃_1 will be endogenous in the model because x̃_1 and e_1 are correlated. Since e_1 is part of v, x̃_1 is correlated with the error term v, which makes x̃_1 an endogenous variable.
4
In the model, the OLS estimator of β_1 is unbiased if u and e_1 both have zero means and are uncorrelated with the observed regressor x̃_1. However, the measurement error is uncorrelated only with the unobserved explanatory variable x_1, which implies that x̃_1 and e_1 must be correlated. Therefore, the estimator of β_1 is biased in the model.
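The bias from measurement error in the explanatory variable can also be seen in a simulation. The sketch below assumes classical measurement error (e_1 independent of the true x_1) with made-up values: β_1 = 2 and equal variances for x_1 and e_1. Under those assumptions the standard attenuation result says the OLS slope converges to β_1 × var(x_1)/(var(x_1)+var(e_1)), which is 1 here. The function names are hypothetical.

```python
import random

def ols_slope(y, x):
    """OLS slope from a simple regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    return s_xy / s_xx

def simulate_x_error(n=20000, beta0=1.0, beta1=2.0, seed=2):
    """Generate the true x_1, the mismeasured x_1 (x_1 plus e_1), and y."""
    rng = random.Random(seed)
    x1, x1_tilde, y = [], [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        e1 = rng.gauss(0, 1)          # uncorrelated with the true x_1
        u = rng.gauss(0, 1)
        x1.append(x)
        x1_tilde.append(x + e1)       # observed regressor, correlated with e_1
        y.append(beta0 + beta1 * x + u)
    return x1, x1_tilde, y

x1, x1_tilde, y = simulate_x_error()
slope_true_x = ols_slope(y, x1)            # infeasible: uses the true x_1
slope_observed_x = ols_slope(y, x1_tilde)  # feasible: uses the mismeasured x_1
print(slope_true_x, slope_observed_x)
```

The slope from the mismeasured regressor should be roughly half of the assumed β_1, which is the bias toward zero that the argument above predicts.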

Heteroscedasticity

Yes, we should report standard errors that are robust to heteroscedasticity. The dependent variable is binary, so the linear probability model violates the homoscedasticity assumption. When the dependent variable is binary, the variance conditional on the independent variables is given by
var(y│x)=p(x)[1-p(x)]
where p(x) is the probability of success and p(x)=β_0+β_1 x_1+⋯+β_k x_k. Because the probability depends on the independent variables, the variance is not constant, so there is heteroscedasticity. Under heteroscedasticity, the usual standard errors are not valid, and therefore neither are the usual t and F statistics. The solution is to use heteroscedasticity-robust standard errors.
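To make the difference between the usual and the robust standard errors concrete, here is a minimal sketch for a simple regression with a binary dependent variable. It assumes the basic White (HC0) form of the robust variance, which is one common choice and not necessarily the variant used in the original analysis. The function name and the four toy observations are made up.

```python
import math

def slope_standard_errors(y, x):
    """Return (slope, usual SE, HC0 robust SE) for a simple OLS regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dev = [xi - mean_x for xi in x]
    s_xx = sum(d * d for d in dev)
    slope = sum(d * (yi - mean_y) for d, yi in zip(dev, y)) / s_xx
    intercept = mean_y - slope * mean_x
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Usual SE: assumes one common error variance for all observations
    usual_se = math.sqrt(sum(r * r for r in resid) / (n - 2) / s_xx)
    # HC0 robust SE: lets each observation have its own error variance
    robust_se = math.sqrt(sum(d * d * r * r for d, r in zip(dev, resid))) / s_xx
    return slope, usual_se, robust_se

# Made-up toy data with a binary dependent variable, for illustration only
x = [0, 0, 1, 1]
y = [0, 1, 1, 1]
slope, usual_se, robust_se = slope_standard_errors(y, x)
print(slope, usual_se, robust_se)
```

The two standard errors differ because the residual variance is not the same at x = 0 and x = 1, which is the situation described by var(y│x)=p(x)[1-p(x)].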
2
The coefficient on selfemp is 0.054, which means that the probability of a mortgage application being rejected is 0.054 (5.4 percentage points) higher for self-employed applicants than for applicants who are not self-employed. This difference is statistically significant, as p<0.05.

Probability

The change in the probability of the application being denied for a unit change in lvrat is
0.204+0.302selfemp
Therefore, for a self-employed applicant, the change in the probability of denial for a unit change in lvrat is
0.204+0.302(1)=0.506
Then, for a 0.1 increase in lvrat, the change in the probability of denial for a self-employed applicant is
0.506×0.1=0.0506
That is, a 0.1 increase in lvrat raises the probability of denial for a self-employed applicant by 0.0506, or about 5 percentage points.
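The same arithmetic can be written as a small function. The coefficients 0.204 and 0.302 are the ones quoted above; the function name lvrat_effect and its parameter names are hypothetical.

```python
def lvrat_effect(selfemp, delta_lvrat, b_lvrat=0.204, b_interaction=0.302):
    """Change in the denial probability for a change of delta_lvrat in lvrat.

    selfemp is 1 for a self-employed applicant and 0 otherwise.
    """
    return (b_lvrat + b_interaction * selfemp) * delta_lvrat

effect_selfemp = lvrat_effect(selfemp=1, delta_lvrat=0.1)       # about 0.0506
effect_not_selfemp = lvrat_effect(selfemp=0, delta_lvrat=0.1)   # about 0.0204
print(effect_selfemp, effect_not_selfemp)
```

Setting selfemp to 0 drops the interaction term, so the same 0.1 increase in lvrat changes the denial probability by only 0.204 × 0.1 for applicants who are not self-employed.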
4
The null hypothesis that the regression function is the same for applicants who are and are not self-employed is given as
δ_0=δ_1=δ_2=0
To test this, I will estimate two models. The first is the full (unrestricted) model
β_0+δ_0 selfemp+β_1 pirat+β_2 lvrat+δ_1 pirat×selfemp+δ_2 lvrat×selfemp+u
The second is the restricted model
β_0+β_1 pirat+β_2 lvrat+u
Using the residual sums of squares of both models, I will calculate the F statistic as
F=[(SSR_r-SSR_ur)/q]/[SSR_ur/(n-k-1)]
where SSR_r is the residual sum of squares of the restricted model, SSR_ur is the residual sum of squares of the unrestricted model, q is the number of restrictions (here q=3), k is the number of explanatory variables in the unrestricted model (here k=5), and n is the number of observations.
Then we compare the F statistic with the critical value. If the F statistic is greater, we reject the null hypothesis; otherwise, we do not reject it.
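The F statistic for this kind of test can be computed with a short helper. In the sketch below, q denotes the number of restrictions being tested and k the number of explanatory variables in the unrestricted model. The function name f_statistic and the SSR values and sample size in the example are hypothetical, chosen only to show the calculation.

```python
def f_statistic(ssr_r, ssr_ur, n, k, q):
    """F statistic for q exclusion restrictions.

    ssr_r  : residual sum of squares of the restricted model
    ssr_ur : residual sum of squares of the unrestricted model
    n      : number of observations
    k      : number of explanatory variables in the unrestricted model
    q      : number of restrictions being tested
    """
    numerator = (ssr_r - ssr_ur) / q
    denominator = ssr_ur / (n - k - 1)
    return numerator / denominator

# Hypothetical inputs: 3 restrictions, 5 regressors in the unrestricted model
f_stat = f_statistic(ssr_r=110.0, ssr_ur=100.0, n=106, k=5, q=3)
print(f_stat)
```

The result is compared with the critical value of the F distribution with q and n-k-1 degrees of freedom, taken from a table or a statistics package.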