Using Univariate/Multivariate Analysis to Study Body Strength
Univariate analysis examines a single variable at a time. It is one of the simplest ways to analyze data, and its purpose is to describe, summarize, and find patterns in that variable. Multivariate analysis is more complex and is used when a data set has more than two variables. Both are well-established statistical techniques that have been used to analyze data for decades.
Multivariate analysis
Theoretically, as age increases, the injury index in a woman increases. Also, as medical difficulty increases, the injury index increases.
Coefficients^{a}

| Model | | Unstandardized B | Std. Error | Standardized Beta | t | Sig. | 95.0% CI for B: Lower Bound | 95.0% CI for B: Upper Bound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | -244.230 | 104.141 | | -2.345 | .021 | -450.922 | -37.538 |
| | Age | 3.628 | 1.541 | .225 | 2.353 | .021 | .568 | 6.687 |
| | Medical difficulty index | 284.142 | 95.017 | .286 | 2.990 | .004 | 95.559 | 472.725 |
Because the p-values for both predictors (age and the medical difficulty index) are below 0.05, both are statistically significant. With R² = 0.162, the two predictors together explain 16.2% of the variation in the overall injury index.
The model has the form: Overall injury index (injury) = β0 + β1 × age + β2 × medical difficulty index (medindex) + ei
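The t values in the coefficients table above are simply each coefficient divided by its standard error. As a minimal sketch (the variable and function names are illustrative, not part of the SPSS output), the ratios can be recomputed from the reported B and Std. Error columns:

```python
# B and Std. Error values copied from the first Coefficients table.
coefficients = {
    "(Constant)": (-244.230, 104.141),
    "Age": (3.628, 1.541),
    "Medical difficulty index": (284.142, 95.017),
}


def t_statistic(b, std_error):
    """Return the t ratio used to test H0: coefficient = 0."""
    return b / std_error


# Recompute the t column of the table from B and Std. Error.
t_values = {name: t_statistic(b, se) for name, (b, se) in coefficients.items()}

for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")
```

The recomputed ratios should agree with the table's t column up to rounding of the published B and Std. Error values.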
Regression Analysis
One of the team members, Tom, selected a model retaining four independent variables. His SPSS output is as follows:
The coefficient of 5.668 means that for each additional year of age, the predicted overall injury index increases by 5.668, holding the other predictors constant.
The coefficient of -3.981 means that for each additional unit of strength in the gluteus/hamstrings (gluts), the predicted overall injury index decreases by 3.981, holding the other predictors constant.
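A predicted injury index of -107.321 equals the constant term, so the profile of the independent variables has to satisfy one of the following:
i) 5.668*age + 0.300*quads - 3.981*gluts - 0.620*arms = 0, OR
ii) age = quads = gluts = arms = 0 (the special case of i in which every predictor is zero)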
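To make the two cases concrete, here is a small sketch that evaluates Tom's fitted equation using the coefficients from his output. The function name and the non-zero profile are hypothetical and chosen only for illustration; they are not taken from the data set.

```python
# Intercept and slopes from Tom's model.
INTERCEPT = -107.321
SLOPES = {"age": 5.668, "quads": 0.300, "gluts": -3.981, "arms": -0.620}


def predict_injury(age, quads, gluts, arms):
    """Predicted overall injury index for one profile of the four predictors."""
    values = {"age": age, "quads": quads, "gluts": gluts, "arms": arms}
    return INTERCEPT + sum(SLOPES[name] * values[name] for name in SLOPES)


# Case ii: every predictor is zero, so the prediction is the intercept.
baseline = predict_injury(0, 0, 0, 0)

# Case i: a hypothetical non-zero profile whose weighted sum is zero.
# gluts is chosen so that 5.668*age - 3.981*gluts = 0 with quads = arms = 0.
age = 10
gluts = SLOPES["age"] * age / -SLOPES["gluts"]
balanced = predict_injury(age, 0, gluts, 0)

# Effect of one extra year of age with the other predictors held fixed.
age_effect = predict_injury(1, 0, 0, 0) - baseline

print(baseline, balanced, age_effect)
```

Both profiles return the constant term, and the one-year difference reproduces the age coefficient.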
H0: βarms = 0
H1: βarms ≠ 0
Coefficients^{a}

| Model | | Unstandardized B | Std. Error | Standardized Beta | t | Sig. | 95.0% CI for B: Lower Bound | 95.0% CI for B: Upper Bound |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | (Constant) | -107.321 | 96.170 | | -1.116 | .267 | -298.242 | 83.601 |
| | Age | 5.668 | 1.420 | .351 | 3.990 | .000 | 2.848 | 8.488 |
| | Strength in quadriceps | .300 | .553 | .056 | .543 | .588 | -.797 | 1.398 |
| | Strength in gluteus/hamstrings | -3.981 | .918 | -.441 | -4.336 | .000 | -5.804 | -2.158 |
| | Strength in arms/shoulders | -.620 | .583 | -.101 | -1.063 | .291 | -1.778 | .538 |
As the table above shows, the p-value for βarms is 0.291 (> 0.05), so we fail to reject the null hypothesis. Hence, βarms is not statistically significantly different from zero, and the estimated partial regression coefficient of -0.620 may differ from zero only because of sampling error.
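The decision for this test can be sketched in a few lines using only the numbers reported for strength in arms/shoulders. The variable names are illustrative, and the 0.05 significance level is the one used in the text. With a p-value above the significance level and a 95% confidence interval that contains zero, the null hypothesis is not rejected.

```python
# Values reported for "Strength in arms/shoulders" in Tom's output.
b_arms, se_arms = -0.620, 0.583
p_value = 0.291
ci_lower, ci_upper = -1.778, 0.538

ALPHA = 0.05  # significance level used in the text

# The t ratio is the coefficient divided by its standard error.
t_arms = b_arms / se_arms

# Two equivalent ways to reach the decision at the 5% level.
reject_null = p_value < ALPHA
ci_contains_zero = ci_lower <= 0 <= ci_upper

print(f"t = {t_arms:.3f}, reject H0: {reject_null}, CI contains 0: {ci_contains_zero}")
```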
Although Tom’s model may not be the one you picked, for some theoretical reasons, the team agreed that Tom’s model is the ultimate model to pick anyway. Recognizing that the correct model is never known, nonetheless, before reporting the result, you would like to check the aptness of Tom’s model. Using numbers and/or graphs, show the following diagnostics for Tom’s model. Specify your cut-off criterion when necessary. You do not need to re-run the analysis after checking the diagnostics.
Probability
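We use the Variance Inflation Factor (VIF) to check for multicollinearity. A common cut-off is a VIF above 10 (equivalently, a tolerance below 0.1). The VIFs in the table below are all under 1.4, so multicollinearity is not a concern.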
Coefficients^{a}

| Model | | Tolerance | VIF |
| --- | --- | --- | --- |
| 1 | Strength in quadriceps | .717 | 1.395 |
| | Strength in gluteus/hamstrings | .737 | 1.356 |
| | Strength in arms/shoulders | .829 | 1.206 |

a. Dependent Variable: Age
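VIF is the reciprocal of tolerance, so the two columns of the collinearity table carry the same information. The sketch below recomputes VIF from the reported tolerances and compares it with a cut-off of 10. That cut-off is a common rule of thumb assumed here for illustration, not a value stated in the SPSS output.

```python
# Tolerance values copied from the Collinearity Statistics table.
tolerance = {
    "Strength in quadriceps": 0.717,
    "Strength in gluteus/hamstrings": 0.737,
    "Strength in arms/shoulders": 0.829,
}

VIF_CUTOFF = 10.0  # assumed rule-of-thumb threshold

# VIF = 1 / tolerance for each predictor.
vif = {name: 1.0 / tol for name, tol in tolerance.items()}

# Predictors whose VIF exceeds the cut-off would signal multicollinearity.
flagged = [name for name, value in vif.items() if value > VIF_CUTOFF]

for name, value in vif.items():
    print(f"{name}: VIF = {value:.3f}")
print("Flagged predictors:", flagged)
```

Small differences from the table in the third decimal place come from the tolerances being rounded before the reciprocal is taken.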
The normality assumption of residuals can be checked using normal probability plots for both standardized and unstandardized residuals.
As the above graph shows, residuals are scattered randomly, hence there is no problem of heteroscedasticity in the model.
To check for outliers, measures such as Mahalanobis distance and Cook’s distance can be calculated:
Residuals Statistics^{a}

| | Minimum | Maximum | Mean | Std. Deviation | N |
| --- | --- | --- | --- | --- | --- |
| Predicted Value | 56.25 | 209.17 | 145.80 | 28.151 | 100 |
| Std. Predicted Value | -3.181 | 2.251 | .000 | 1.000 | 100 |
| Standard Error of Predicted Value | 5.778 | 15.924 | 9.752 | 2.371 | 100 |
| Adjusted Predicted Value | 54.56 | 209.42 | 145.77 | 28.055 | 100 |
| Residual | -103.802 | 87.387 | .000 | 43.953 | 100 |
| Std. Residual | -2.313 | 1.948 | .000 | .980 | 100 |
| Stud. Residual | -2.342 | 2.023 | .000 | 1.004 | 100 |
| Deleted Residual | -106.349 | 94.677 | .031 | 46.224 | 100 |
| Stud. Deleted Residual | -2.400 | 2.057 | .000 | 1.014 | 100 |
| Mahal. Distance | .652 | 11.480 | 3.960 | 2.414 | 100 |
| Cook's Distance | .000 | .091 | .010 | .015 | 100 |
| Centered Leverage Value | .007 | .116 | .040 | .024 | 100 |

a. Dependent Variable: Overall injury index
As the table above shows, the maximum Mahalanobis distance in our model is 11.48, while the critical chi-square value for 4 degrees of freedom (one per predictor) at the 0.05 level is 9.49. Because 11.48 exceeds 9.49, outliers are present in the data set (rows 1, 7, 12, and 79).
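The comparison with the chi-square critical value can be checked without a statistics package, because the upper-tail probability of a chi-square variable with 4 degrees of freedom has a simple closed form, exp(-x/2) × (1 + x/2). The sketch below relies on that formula, which holds for 4 degrees of freedom only. The function name is illustrative.

```python
import math


def chi2_upper_tail_df4(x):
    """Upper-tail probability P(X > x) for a chi-square variable with 4 df."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)


ALPHA = 0.05
CRITICAL_VALUE = 9.49       # chi-square critical value quoted in the text
MAX_MAHALANOBIS = 11.480    # maximum from the Residuals Statistics table

# The critical value should leave roughly 5% in the upper tail.
tail_at_critical = chi2_upper_tail_df4(CRITICAL_VALUE)

# Tail probability of the largest observed Mahalanobis distance.
tail_at_max = chi2_upper_tail_df4(MAX_MAHALANOBIS)

has_outlier = tail_at_max < ALPHA

print(f"P(X > 9.49) = {tail_at_critical:.4f}")
print(f"P(X > 11.48) = {tail_at_max:.4f}, outlier flagged: {has_outlier}")
```

A stricter level such as 0.001 is also sometimes used for Mahalanobis distances, and the same function can be compared against that threshold instead.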
Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
| --- | --- | --- | --- | --- |
| 1 | .539^{a} | .291 | .261 | 44.869 |

a. Predictors: (Constant), Strength in arms/shoulders, Age, Strength in gluteus/hamstrings, Strength in quadriceps
ANOVA^{a}

| Model | | Sum of Squares | df | Mean Square | F | Sig. |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | Regression | 78456.806 | 4 | 19614.202 | 9.743 | .000^{b} |
| | Residual | 191257.194 | 95 | 2013.234 | | |
| | Total | 269714.000 | 99 | | | |

a. Dependent Variable: Overall injury index
b. Predictors: (Constant), Strength in arms/shoulders, Age, Strength in gluteus/hamstrings, Strength in quadriceps
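The figures in the Model Summary can be recovered from the ANOVA sums of squares and degrees of freedom. The following sketch uses the standard definitions of R², adjusted R², the F ratio, and the standard error of the estimate. The variable names are illustrative.

```python
import math

# Sums of squares and degrees of freedom from the ANOVA table.
ss_regression, df_regression = 78456.806, 4
ss_residual, df_residual = 191257.194, 95
ss_total, df_total = 269714.000, 99

# Mean squares are sums of squares divided by their degrees of freedom.
ms_regression = ss_regression / df_regression
ms_residual = ss_residual / df_residual

# Share of total variation explained by the four predictors.
r_squared = ss_regression / ss_total

# Adjusted R squared penalises for the number of predictors.
adjusted_r_squared = 1.0 - ms_residual / (ss_total / df_total)

# Overall F test and the standard error of the estimate.
f_ratio = ms_regression / ms_residual
std_error_estimate = math.sqrt(ms_residual)

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted_r_squared:.3f}")
print(f"F = {f_ratio:.3f}, std. error of estimate = {std_error_estimate:.3f}")
```

The results should match the reported values (.291, .261, 9.743, and 44.869) up to rounding.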