Data analysis for decision makers
Descriptive statistics
Details of Data Analysis
N, no. of sample data | 22 |
Mean | 6.432 |
Median | 5.940 |
Sample Standard Deviation | 2.285 |
Range | 10.120 |
Coefficient of Variation | 36% |
The z-scores of the data were checked and all of them fall within the range of -3.0 to +3.0. Therefore, no data points were removed as extreme values.
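The z-score screening described above can be sketched as follows. This is a minimal illustration only: the function name `extreme_values` and the sample durations are hypothetical, since the raw call data sit in the Appendix and are not reproduced here. It assumes the sample standard deviation (n - 1 denominator) is the one used for the z-scores.

```python
from statistics import mean, stdev


def extreme_values(values, threshold=3.0):
    """Return the values whose z-score lies outside +/- threshold."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    return [v for v in values if abs((v - m) / s) > threshold]


# Hypothetical call durations in minutes (NOT the Appendix data).
durations = [4.1, 5.2, 5.9, 6.3, 7.0, 8.4, 9.9]
print(extreme_values(durations))  # nothing lies outside +/-3.0 here
```

Any value returned by the function would be a candidate for removal before the summary statistics are recalculated.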
The mean is the average of the given data set. In Table 1A, the mean is 6.432, which means that staff spend 6.432 minutes on average on a “New Business” call.
The median of the “New Business” calls is 5.940. Since the median is less than the mean, this suggests a right-skewed distribution.
Fig. 1A Distribution curve (positive or right-skewed distribution) for the call duration of the “New Business”.
Call Category 1, “New Business”, displays a right-skewed distribution. In a right-skewed distribution, most call durations are short and are concentrated in the lower portion of the range. A few extremely long calls create a long tail to the right and cause the mean (6.432) to be greater than the median (5.940). Because the skewness statistic for such a distribution is greater than zero, it is also described as positively skewed.
The data analysis of Call Category 3: “New Claim”
From Table 1B (see Appendix II), the analytical data are summarised as follows.
N, no. of sample data | 22 |
Mean | 6.252 |
Median | 6.855 |
Sample Standard Deviation | 2.499 |
Range | 9.700 |
Coefficient of Variation | 40% |
The z-scores of the data were checked and all of them fall within the range of +/-3.0. Therefore, no data points were removed as extreme values.
The mean is the average of the given data set. In Table 1B, the mean is 6.252, which means that staff spend 6.252 minutes on average on a “New Claim” call.
The median of the “New Claim” calls is 6.855. Since the median is greater than the mean, this suggests a left-skewed distribution.
Fig. 1B Distribution curve (negative or left-skewed distribution) for the call duration of the “New Claim”.
Call Category 3, “New Claim”, displays a left-skewed distribution. In a left-skewed distribution, most call durations are long and are concentrated in the upper portion of the range. A few extremely short calls create a long tail to the left and cause the mean (6.252) to be less than the median (6.855). Because the skewness statistic for such a distribution is less than zero, it is also described as negatively skewed.
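The mean-versus-median comparison for the two call categories can be condensed into a single signed number. The sketch below uses Pearson's second skewness coefficient, 3 × (mean - median) / standard deviation. This is a rule-of-thumb measure chosen here for illustration because it needs only the reported summary statistics; it is not the moment-based skewness statistic that would be computed from the raw data. The function name is hypothetical.

```python
def pearson_median_skewness(mean, median, sd):
    """Pearson's second skewness coefficient: 3 * (mean - median) / sd."""
    return 3 * (mean - median) / sd


# Summary statistics as reported for Table 1A and Table 1B.
new_business = pearson_median_skewness(6.432, 5.940, 2.285)
new_claim = pearson_median_skewness(6.252, 6.855, 2.499)

print(f"New Business: {new_business:+.2f}")  # positive sign: right-skewed
print(f"New Claim:    {new_claim:+.2f}")     # negative sign: left-skewed
```

The signs agree with the skew directions described for the two categories.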
In addition, we notice that the Sample Standard Deviation of Table 1A (2.285) is smaller than that of Table 1B (2.499). This means that the “New Business” call durations lie closer to their mean, whereas the “New Claim” call durations lie further from their mean and show higher deviation within the sample.
Comparing coefficients of variation for category 1 (new business) & category 3 (new claim)
Category 1 – new business: 36%
Category 3 – new claim: 40%
It is concluded that, relative to the mean, the call duration of “New Claim” (40%) is somewhat more variable than that of “New Business” (36%).
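The two coefficients of variation can be reproduced from the reported summary statistics, since the coefficient of variation is the sample standard deviation divided by the mean, expressed as a percentage. A minimal sketch (the function name is illustrative):

```python
def coefficient_of_variation(sd, mean):
    """Return the sample standard deviation as a percentage of the mean."""
    return sd / mean * 100


# Standard deviations and means as reported for Table 1A and Table 1B.
cv_new_business = coefficient_of_variation(2.285, 6.432)
cv_new_claim = coefficient_of_variation(2.499, 6.252)

print(f"New Business: {cv_new_business:.1f}%")
print(f"New Claim:    {cv_new_claim:.1f}%")
```

Rounded to whole percentages, these give the 36% and 40% quoted above.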
Part II. In accordance with the data in “Table 2” of the insurance company, taken in April 2015, please find our analysis report for the Claim Value and the Time to Process Claim as follows:
(B) The data analysis of Claim Value
From Table 2A (see Appendix III), the analytical data are summarised as follows:
N, no. of sample data | 44 |
Mean | 622.493 |
Median | 478.170 |
Sample Standard Deviation | 859.626 |
Range | 5,910.670 |
Coefficient of Variation | 138% |
The z-scores of the data were checked. The Claim Value of Claim ID 1011 falls outside the range of +/-3.0. Claim ID 1011 is therefore treated as an extreme case and removed from our statistical analysis. After removing it, the new analytical data are summarised as follows:
N, no. of sample data | 43 |
Mean | 498.704 |
Median | 462.200 |
Sample Standard Deviation | 257.392 |
Range | 957.92 |
Coefficient of Variation | 52% |
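As a consistency check on the two summary tables, the removed claim value can be approximately recovered from the reported sample sizes and means, and its z-score against the original 44-claim sample can then be estimated. The figures below are inferred from rounded summary statistics, not read from the Appendix, so they should be treated as approximate.

```python
# Summary statistics as reported before and after removing Claim ID 1011.
n_before, mean_before, sd_before = 44, 622.493, 859.626
n_after, mean_after = 43, 498.704

# Total before minus total after leaves the single removed value.
removed_value = n_before * mean_before - n_after * mean_after

# z-score of the removed value relative to the original 44-claim sample.
z_removed = (removed_value - mean_before) / sd_before

print(f"removed value (approx.): {removed_value:.2f}")
print(f"z-score (approx.):       {z_removed:.2f}")
```

The estimated z-score is well above +3.0, which is consistent with the decision to treat this claim as an extreme case.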
The mean is the average of the given data set. In Table 2A, the mean is 498.704, which means that the average claim value is €498.704. The median of the Claim Value is 462.200. Since the median is less than the mean, this suggests a right-skewed distribution.
Fig. 2A Distribution curve (positive or right-skewed distribution) for the Claim Value.
The Claim Value displays a right-skewed distribution. In a right-skewed distribution, most claim values are low and are concentrated in the lower portion of the range. A few extremely high claim values create a long tail to the right and cause the mean (498.704) to be greater than the median (462.200). Because the skewness statistic for such a distribution is greater than zero, it is also described as positively skewed.
Random Variables
In the following section, we use discrete random variables and their probability distributions to explain the relationship between the claim-related calls received by the call centre and the claims handled by the related staff.
Since Call Category 1, “New Business”, and Call Category 4, “Service Cancellation”, do not involve claims, the data analysis will focus on Call Category 2, “Query on Existing Claim”, and Call Category 3, “New Claim”.
(A) Data analysis of Call Category 2: “Query on Existing Claim” & Call Category 3: “New Claim”
According to the data in Table 1 (see Appendix V), the call centre received 81 calls in total. Of these, 18 fall under Call Category 2 and 22 under Call Category 3 (as shown in the chart below).
Chart of Probabilities of Call Category Id
(B) Data analysis of Call of Claims
Call Categories 2 and 3 together account for 40 claim-related calls, of which only 19 need follow-up (as shown in the table below):
Category Id | No. of calls per Category Id | Calls needing follow-up (Yes) | Calls needing follow-up (No) |
2 | 18 | 5 | 13 |
3 | 22 | 14 | 8 |
Total | 40 | 19 | 21 |
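The counts in the table can be turned into probabilities for the discrete random variables discussed in this section. The sketch below computes, for each category, its share of all 81 calls and the conditional probability that a call in that category needs follow-up. Variable names are illustrative, and the sketch does not attempt to reproduce the covariance figures from the Appendix, whose underlying tables are not shown here.

```python
TOTAL_CALLS = 81  # all calls received by the call centre, per Table 1

# category id -> (follow-up yes, follow-up no), copied from the table above
follow_up = {2: (5, 13), 3: (14, 8)}

# P(category): share of all calls that fall in each category
p_category = {c: (yes + no) / TOTAL_CALLS for c, (yes, no) in follow_up.items()}

# P(follow-up | category): share of a category's calls that need follow-up
p_follow_up = {c: yes / (yes + no) for c, (yes, no) in follow_up.items()}

for c in follow_up:
    print(f"Category {c}: P(category) = {p_category[c]:.3f}, "
          f"P(follow-up | category) = {p_follow_up[c]:.3f}")
```

On these counts, a “New Claim” call is more than twice as likely to need follow-up as a “Query on Existing Claim” call.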
Referring to Table 2 (Appendix VI), the covariance is -0.394, which indicates a negative relationship between the calls that need follow-up (Yes) and the calls that do not (No).
According to the above data, the call centre
According to the data collected in Table 3 (Appendix VII), there are 44 completed claims in total, of which 18 were authorised by AC and 26 by PC.
Hence, the call category received by the call centre (Appendix VIII, Tables 5A & 5B) has a negative relationship (-2.889) with the claims handled by different staff.
Section 3
3a. In this section we use Claim Value as our continuous random variable and find its 95% confidence interval, which gives the lower and upper limits of the population's average claim value with 95% confidence.
The Q-Q plot for the variable (Appendix IX) is approximately linear, which suggests that the variable is approximately normally distributed.
As mentioned in Section 1, the z-scores of the data were checked and the Claim Value of Claim ID 1011 falls outside the range of +/-3.0. Claim ID 1011 is therefore treated as an extreme case and excluded from our statistical analysis.
Because we need to estimate the population mean when only the sample mean and the sample standard deviation are known, we use the t-statistic for estimation.
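In Excel, the 95% (alpha = 0.05) confidence interval for the variable can be calculated using the formula below. (Excel's CONFIDENCE function is based on the normal distribution; CONFIDENCE.T takes the same arguments and uses the t-distribution.)
[Mean - CONFIDENCE(alpha, standard_dev, size), Mean + CONFIDENCE(alpha, standard_dev, size)]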
Claim Value (€) | |
Sample Size | 43 |
Mean | 498.70 |
Standard Deviation | 257.3923 |
alpha | 0.05 |
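The interval can be sketched outside Excel from the inputs in the table above. The normal-based margin below corresponds to what Excel's CONFIDENCE function returns. For the t-based margin, the critical value for 42 degrees of freedom is an assumption hard-coded from a standard t table (about 2.018) rather than computed, so the second interval is approximate.

```python
from math import sqrt
from statistics import NormalDist

# Inputs as listed in the table above.
n, mean, sd, alpha = 43, 498.70, 257.3923, 0.05

std_error = sd / sqrt(n)

# Normal-based margin of error (two-sided, 95%).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
z_margin = z_crit * std_error

# t-based margin of error with df = n - 1 = 42.
# ASSUMPTION: critical value read from a standard t table, not computed here.
T_CRIT_DF42 = 2.018
t_margin = T_CRIT_DF42 * std_error

print(f"normal-based interval: [{mean - z_margin:.2f}, {mean + z_margin:.2f}]")
print(f"t-based interval:      [{mean - t_margin:.2f}, {mean + t_margin:.2f}]")
```

The t-based interval is slightly wider than the normal-based one, which reflects the extra uncertainty from estimating the standard deviation from the sample.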
Section 4
