How statistical methods are applied in the biological sciences
Distribution for Healthy and Diseased Blood Smear
Histogram for Healthy and Diseased Blood Smear width (Figure 1):
As the figures show, the frequencies in the histogram for healthy blood smear width rise toward the larger width classes, while those in the histogram for diseased blood smear width fall. This indicates that blood cells in the diseased smear are generally narrower than those in the healthy smear.
Descriptive statistics for healthy and diseased blood smear cell width (Table 2):
As the summary above shows, the mean and median widths are larger for the healthy blood smear than for the diseased one. For the healthy smear, the modal class is width > 8 × 10⁻² mm, while for the diseased smear the modal class is width < 1.6 × 10⁻² mm. The standard deviation and variance are also smaller for the healthy smear than for the diseased smear. The width distribution is negatively skewed for the healthy smear, whereas for the diseased smear it is positively skewed.
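The measures discussed above can be computed as in the following minimal sketch. The two lists are made-up illustrative widths (in units of 10⁻² mm), not the data behind Table 2, and the helper name `describe` is hypothetical. Skewness is taken here as the simple moment-based coefficient; other definitions exist and give slightly different values.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    n = len(values)
    mean = statistics.mean(values)
    # Second and third central moments (population form, divided by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "std_dev": statistics.stdev(values),      # sample standard deviation
        "skewness": m3 / m2 ** 1.5,               # < 0 left-skewed, > 0 right-skewed
    }


# Hypothetical example data, for illustration only.
healthy_example = [8.4, 8.1, 7.9, 8.6, 7.2, 8.3, 5.1, 8.0]
diseased_example = [1.2, 1.5, 1.1, 1.4, 2.0, 1.3, 4.8, 1.6]

for label, data in (("healthy", healthy_example), ("diseased", diseased_example)):
    print(label, describe(data))
```

A negative skewness value corresponds to a long tail toward small widths, and a positive value to a long tail toward large widths.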
To identify outliers, we calculate a z-score for each value:
Z-Score = (Value - Mean) / Std. Dev.
If the z-score of a value is > 3 or < -3, the value is treated as an outlier.
In our case, all z-scores for both the healthy and the diseased blood smears lie within the range (-3, 3). Hence, we can say that the data do not contain any outliers.
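The z-score rule above can be sketched as follows. The function names and the sample values are hypothetical, and the sample standard deviation (n - 1 denominator) is assumed because the source does not say which form was used.

```python
import statistics


def z_scores(values):
    """Return (value - mean) / std. dev. for every value."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumption)
    return [(x - mean) / sd for x in values]


def find_outliers(values, threshold=3.0):
    """Return the values whose z-score lies outside (-threshold, threshold)."""
    return [x for x, z in zip(values, z_scores(values)) if abs(z) > threshold]


# Hypothetical sample: 19 similar values plus one extreme value.
sample = [5.0, 5.2, 4.9, 5.1, 5.0, 4.8, 5.3, 5.1, 4.9, 5.0,
          5.2, 4.7, 5.1, 5.0, 4.9, 5.2, 5.0, 4.8, 5.1, 20.0]

print(find_outliers(sample))       # the extreme value is flagged
print(find_outliers(sample[:-1]))  # no value is flagged
```

Note that with the sample standard deviation, the largest possible |z| in a sample of size n is (n - 1) / sqrt(n), so this rule cannot flag any value when n is 10 or less.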
Yes. As the descriptive statistics show, the healthy and diseased blood smears differ markedly from each other. All measures of location and dispersion take clearly different values for the two groups. Hence, quantitative measures can be used to differentiate between them.
T-Test to compare means of two samples:
Null Hypothesis: H0: µH = µD
Alternate Hypothesis: H1: µH ≠ µD
We created a new variable, Diff, which is the difference between the widths of the healthy and diseased blood smears. So we have to test:
Null Hypothesis: H0: µDiff = 0
Alternate Hypothesis: H1: µDiff ≠ 0
The value of the t-statistic for the test is 7.6286 (29 degrees of freedom) and the p-value is 2.0689E-08 (i.e. < 0.05). Hence, the null hypothesis is rejected, and we can say that the mean width differs statistically significantly between healthy and diseased blood smears.
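The test on Diff described above amounts to a one-sample t-test on the pairwise differences. The sketch below computes only the t-statistic and the degrees of freedom with the standard library. The data and the function name are hypothetical, so the output will not reproduce the 7.6286 reported above. To obtain a p-value, one option would be a library routine such as `scipy.stats.ttest_rel`.

```python
import math
import statistics


def paired_t_statistic(a, b):
    """Return (t, degrees of freedom) for H0: mean(a - b) = 0."""
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [x - y for x, y in zip(a, b)]  # the Diff variable
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)      # sample standard deviation of Diff
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical paired widths, for illustration only.
healthy_example = [8.2, 7.9, 8.5, 7.4, 8.1, 7.7]
diseased_example = [3.1, 4.0, 2.8, 3.9, 3.5, 4.2]

t_value, dof = paired_t_statistic(healthy_example, diseased_example)
print(f"t = {t_value:.4f}, d.o.f. = {dof}")
```

H0 is rejected at the 5% level when |t| exceeds the two-sided critical value of the t distribution for the given degrees of freedom (about 2.045 for 29 degrees of freedom).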
To apply a T-test to compare sample means, these 3 conditions need to be satisfied:
1) The population data should be approximately normally distributed. If the sample sizes are equal, moderate non-normality is less of a problem.
2) The skewness of the two populations should be almost the same.
3) No outliers.
In our case, all three conditions are satisfied: the sample size is the same (30) in both groups, the skewness doesn't differ much, and, as we have seen, there are no outliers in the data.
a) Null Hypothesis: H0: µA = µB = µC = µD
Alternate Hypothesis: H1: At least one of the means differs from the others
Where µA: Mean birth weight of infants of non-smokers
µB: Mean birth weight of infants of ex-smokers
µC: Mean birth weight of infants of smokers who smoke < 1/2 pack/day
µD: Mean birth weight of infants of smokers who smoke >= 1/2 pack/day
Analysis of Variance
b) ANOVA results are as follows:
c) As the results show, the p-value is 0.004231142 (< 0.01), so H0 is rejected at the 1% significance level. Hence, we can say that the mean birth weight of at least one smoking status category is statistically significantly different from the others.
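The F-statistic behind a one-way ANOVA like the one above can be sketched as follows. The group values are made up for illustration and are not the birth-weight data from this analysis, and the function name is hypothetical. The p-value is not computed here because it requires the F distribution, which the standard library does not provide.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_values)
    group_means = [statistics.mean(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for the four smoking status categories.
example_groups = [
    [7.6, 7.9, 8.1, 7.4, 7.8],  # non-smokers
    [7.5, 7.7, 7.2, 7.9, 7.6],  # ex-smokers
    [7.1, 7.4, 6.9, 7.3, 7.0],  # smokers, < 1/2 pack/day
    [6.6, 7.0, 6.4, 6.9, 6.7],  # smokers, >= 1/2 pack/day
]

f_stat, df_b, df_w = one_way_anova(example_groups)
print(f"F = {f_stat:.3f} on ({df_b}, {df_w}) degrees of freedom")
```

A large F means the group means vary more than the within-group scatter would explain, which is the evidence against H0.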