# Quantitative analysis

Quantitative analysis is the process of collecting measurable data, such as market shares, revenues, and wages, and evaluating it. It helps people understand how their businesses perform. In the past, decisions relied mainly on experience; today, quantitative analysis is used to measure performance and support decisions.

## Matrix of correlation

| | NCARREG | IMPORTS | RATE |
|---|---|---|---|
| NCARREG | 1 | | |
| IMPORTS | 0.653606 | 1 | |
| RATE | -0.69246 | -0.90112 | 1 |

Imports and new car registrations are positively correlated (0.65): over the period 1992Q1 to 2018Q1, a higher number of registered cars goes together with a higher import value. By contrast, the rate is negatively correlated with both new car registrations (-0.69) and imports (-0.90). This means that when imports or registered cars rise, the rate tends to move in the opposite direction, a pattern that holds over the whole study period.

The time series graphs below provide further evidence of this.
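As a minimal sketch of how a correlation matrix like the one above is computed, the following builds pairwise Pearson coefficients in plain Python. The function names `pearson` and `correlation_matrix` are illustrative, and the `toy` values are hypothetical numbers chosen only to show the signs; they are not the study data.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def correlation_matrix(columns):
    """Return a nested dict with the correlation of every pair of columns."""
    names = list(columns)
    return {r: {c: pearson(columns[r], columns[c]) for c in names} for r in names}

# Hypothetical toy values, NOT the study data.
toy = {
    "NCARREG": [30, 34, 33, 40, 45],
    "IMPORTS": [10, 12, 11, 15, 16],
    "RATE": [9, 8, 8, 5, 4],
}
matrix = correlation_matrix(toy)
for name, row in matrix.items():
    print(name, {k: round(v, 3) for k, v in row.items()})
```

A coefficient close to 1 or -1 indicates a strong linear association; the sign shows whether the two series move together or in opposite directions.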

**Estimate a regression model of the following form for the period 1992Q1 to 2018Q1:**

**Regression Statistics**

Multiple R | 0.69714 |

R Square | 0.486004 |

Adjusted R Square | 0.470736 |

Standard Error | 7687.808 |

Observations | 105 |

Coefficients | Standard Error | t Stat | P-value | |

Intercept | 39302.54 | 9449.291 | 4.159311 | 6.71E-05 |

IMPORTS | 14.2145 | 85.54235 | 0.166169 | 0.868356 |

RATE | -1889.57 | 1068.645 | -1.7682 | 0.080048 |

TREND | 78.22917 | 130.3307 | 0.600236 | 0.549693 |

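**NCARREG = 39302.54 + 14.2145 IMPORTS - 1889.57 RATE + 78.22917 TREND**

Holding the other variables constant, a one-unit increase in imports is associated with 14.2145 more new car registrations, and each additional period of the trend with 78.23 more, while a one-unit increase in the rate is associated with 1889.57 fewer registrations. Only the rate coefficient is statistically significant at the 10 percent level (p = 0.080); imports and the trend are not.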
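The t statistics in the regression output are each coefficient divided by its standard error. The sketch below recomputes them from the table values and lists the terms with a p-value under 10 percent; the dictionary layout and variable names are illustrative choices.

```python
# Coefficient and standard error pairs copied from the regression output above.
coefficients = {
    "Intercept": (39302.54, 9449.291),
    "IMPORTS": (14.2145, 85.54235),
    "RATE": (-1889.57, 1068.645),
    "TREND": (78.22917, 130.3307),
}
# P-values copied from the same table.
p_values = {
    "Intercept": 6.71e-05,
    "IMPORTS": 0.868356,
    "RATE": 0.080048,
    "TREND": 0.549693,
}

# t statistic = coefficient / standard error
t_stats = {name: coef / se for name, (coef, se) in coefficients.items()}

# Terms that are significant at the 10 percent level
significant_at_10 = [name for name, p in p_values.items() if p < 0.10]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.4f}, p = {p_values[name]}")
print("Significant at 10%:", significant_at_10)
```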

**To improve the model, dynamics are introduced. The two variables that explain car registrations are lagged by 4 periods, equivalent to a year. The model to be estimated for the period 1993Q1 to 2018Q1 is now of the form:**

**Regression Statistics**

Multiple R | 0.679677 |

R Square | 0.461961 |

Adjusted R Square | 0.445321 |

Standard Error | 7695.466 |

Observations | 101 |

Coefficients | Standard Error | t Stat | P-value | |

Intercept | 49344.56 | 10777.07 | 4.578662 | 1.39E-05 |

TREND | 41.77199 | 146.8604 | 0.284433 | 0.776685 |

IMPORTS LAG | -25.4673 | 92.8713 | -0.27422 | 0.784497 |

RATELAG | -2799.19 | 1243.621 | -2.25084 | 0.026653 |

**NCARREG = 49344.56 - 25.4673 IMPORTS LAG - 2799.19 RATELAG + 41.77199 TREND**

Lagging the explanatory variables by four periods gives a model in which new car registrations are negatively related to both lagged imports and the lagged rate, and positively related to the trend. Only the lagged rate is statistically significant (p = 0.027).

**Conduct a model control and investigate for the presence of autocorrelation by use of the Durbin Watson test. Save the residuals from this regression and call this variable RESID. Comment on the sign of the included variables. Assume a 10 percent level of significance. Compare also the model found under question B**

SUM OF SQUARED RESIDUALS | 5969341438 |

SUM OF SQUARED DIFF RESIDUALS | 4700260869 |

DURBIN WATSON | 0.78740024 |

The Durbin Watson statistic of 0.787 is well below 2, which indicates positive autocorrelation: successive residuals are strongly similar to one another.
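The Durbin Watson statistic is the sum of squared differences between successive residuals divided by the sum of squared residuals. The sketch below shows the calculation on a residual series and also reproduces the reported value from the two sums in the table; the function name `durbin_watson` is an illustrative choice.

```python
def durbin_watson(residuals):
    """Durbin Watson statistic: sum of squared successive differences
    divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# Reproduce the reported statistic from the two sums in the table above.
sum_sq_diff_residuals = 4700260869
sum_sq_residuals = 5969341438
dw_reported = sum_sq_diff_residuals / sum_sq_residuals
print(round(dw_reported, 6))

# Values near 2 suggest no first-order autocorrelation, values toward 0
# suggest positive autocorrelation, values toward 4 negative autocorrelation.
print(durbin_watson([1, 1, 1, 1]))    # identical residuals
print(durbin_watson([1, -1, 1, -1]))  # alternating residuals
```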

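The lagged model explains about 46% of the variability in the data set (R Square 0.462; Multiple R 0.68), compared with about 49% (R Square 0.486; Multiple R 0.70) for the model in B. The share of variability explained is given by R Square, not by Multiple R.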
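The relationship between Multiple R, R Square, and Adjusted R Square can be checked directly from the reported figures. This is a small sketch using the standard adjustment formula; `adjusted_r_squared` is an illustrative helper name, and k = 3 is the number of explanatory variables in each model.

```python
def adjusted_r_squared(r_squared, n_obs, n_predictors):
    """Adjusted R Square: penalises R Square for the number of predictors."""
    return 1 - (1 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)

# Static model (105 observations) and lagged model (101 observations),
# each with three explanatory variables.
adj_static = adjusted_r_squared(0.486004, 105, 3)
adj_lagged = adjusted_r_squared(0.461961, 101, 3)

# R Square is the square of Multiple R.
r_squared_from_multiple_r = 0.69714 ** 2

print(round(adj_static, 6), round(adj_lagged, 6), round(r_squared_from_multiple_r, 6))
```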

**Expand the initial model above to an autoregressive (AR) model of order 4. This is an AR-4 model for the period 1994Q1 to 2018Q1. Estimate a model of the form**

**SUMMARY OUTPUT**

**Regression Statistics**

Multiple R | 0.838415552 |

R Square | 0.702940638 |

Adjusted R Square | 0.67903931 |

Standard Error | 5443.022059 |

Observations | 95 |

Coefficients | Standard Error | t Stat | P-value | |

Intercept | 47268.57699 | 7963.301 | 5.935802 | 5.83E-08 |

IMPORTS LAG | -58.76740547 | 67.89612 | -0.86555 | 0.389118 |

RATELAG | -2263.925707 | 925.0793 | -2.44728 | 0.016404 |

t-1 | 0.368681017 | 0.096027 | 3.839331 | 0.000234 |

t-2 | 0.025210739 | 0.10695 | 0.235724 | 0.814201 |

t-3 | -0.028588272 | 0.106814 | -0.26765 | 0.789605 |

t-4 | 0.475936119 | 0.097676 | 4.87262 | 4.9E-06 |

Trend | 113.0254443 | 107.3485 | 1.052883 | 0.29531 |

**NCARREG = 47268.57 - 58.767 IMPORTS LAG - 2263.92 RATELAG + 113.025 TREND + 0.368 RESID(t-1) + 0.0252 RESID(t-2) - 0.0285 RESID(t-3) + 0.4759 RESID(t-4)**

## Why is it interesting for this data set to consider a model of order 4 relative to quarterly statistics?

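With quarterly data, an order of 4 covers a full year, so the fourth lag picks up the same quarter of the previous year. The t-4 term is indeed highly significant (coefficient 0.476, p = 4.9E-06), alongside t-1 (0.369, p = 0.0002). The AR-4 model explains about 70% of the variance in the data (R Square 0.703; Multiple R 0.838).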

SUM OF SQUARED DIFFERENCES | 3535804788 |

SUM OF SQUARED RESIDUALS | 2577504555 |

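DURBIN WATSON STAT | 1.37179 |

The Durbin Watson statistic is the sum of squared differences divided by the sum of squared residuals, 3535804788 / 2577504555, which is about 1.37. This is still below 2, so some positive autocorrelation remains, but it is much weaker than in the model without the autoregressive terms (0.787).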

**RESIDUAL PLOT**

**1E**

| | Coefficients | Standard Error | t Stat | P-value |
|---|---|---|---|---|
| Intercept | 9.194577 | 0.452701 | 20.31051 | 1.26E-37 |
| LIMPORTS | 0.306131 | 0.092492 | 3.309828 | 0.001291 |
| LRATE | -0.12427 | 0.028566 | -4.35023 | 3.23E-05 |

**LNCARREG = 9.194577 + 0.306131 LIMPORTS - 0.12427 LRATE**

**Save the residuals from this model and call them ECMRESID. Second, estimate the short-run ECM model.**

**SUMMARY OUTPUT**

| Regression Statistics | |
|---|---|
| Multiple R | 0.034987219 |
| R Square | 0.001224105 |
| Adjusted R Square | -0.028442505 |
| Standard Error | 0.897380716 |
| Observations | 105 |

Coefficients | Standard Error | t Stat | P-value | |

Intercept | 0.608732812 | 0.09918 | 6.137651 | 1.66E-08 |

LRATE4TH DIFF | 0.040354728 | 0.157096 | 0.25688 | 0.797794 |

LIMPORT4TH DIFF | -0.349327418 | 1.501216 | -0.2327 | 0.816468 |

ECMRESID | 0.054029855 | 0.471968 | 0.114478 | 0.909086 |

### How good is the model, and how is the model control?

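The ECMRESID coefficient is positive (0.054) and statistically insignificant (p = 0.909), whereas an error-correction term should be negative for deviations from the long-run relationship to be corrected. With an R Square of only 0.0012, the model is not a good fit.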

#### Regression statistics

**Regression Statistics**

Multiple R | 0.290141 |

R Square | 0.084182 |

Adjusted R Square | 0.056979 |

Standard Error | 10261.89 |

Observations | 105 |

Coefficients | Standard Error | t Stat | P-value | |

Intercept | 36596.22 | 1974.901 | 18.53066 | 2.88E-34 |

D2 | 5444.97 | 2819.659 | 1.931074 | 0.056277 |

D3 | -2616.22 | 2819.659 | -0.92785 | 0.355697 |

D4 | -1299.49 | 2819.659 | -0.46087 | 0.645884 |

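*NCARREG = 36596.22 + 5444.97 D2 - 2616.22 D3 - 1299.49 D4*

**How good is the model? Is deterministic seasonality observed? Is it true that spring is the high season for new car registrations?**

Some deterministic seasonality is observed: the quarterly dummies shift the mean level of registrations relative to the baseline quarter. D2 has the only positive coefficient (5444.97, p = 0.056), which is significant at the 10 percent level but not at the 5 percent level, and this supports spring being the high season for new car registrations. The overall fit is weak, however: with an R Square of 0.084, the seasonal dummies alone explain little of the variation.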
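In a regression on seasonal dummies, the intercept is the mean of the omitted baseline quarter and each dummy coefficient is the difference from that baseline. The sketch below turns the reported coefficients into implied quarterly means; labelling the omitted quarter "baseline" is an assumption about how the dummies were coded.

```python
# Coefficients copied from the quarterly dummy regression above.
intercept = 36596.22
dummy_coefficients = {"D2": 5444.97, "D3": -2616.22, "D4": -1299.49}

# Implied mean registrations per quarter: baseline mean plus the dummy shift.
quarter_means = {"baseline": intercept}
quarter_means.update(
    {name: intercept + coef for name, coef in dummy_coefficients.items()}
)

# The quarter with the highest implied mean.
high_season = max(quarter_means, key=quarter_means.get)

for name, value in quarter_means.items():
    print(f"{name}: {value:.2f}")
print("Highest implied mean:", high_season)
```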

**2c: Examine for deterministic monthly seasonality**

**Regression Statistics**

Multiple R | 0.344883 |

R Square | 0.118944 |

Adjusted R Square | 0.086958 |

Standard Error | 3669.452 |

Observations | 315 |

Coefficients | Standard Error | t Stat | P-value | |

Intercept | 11182.37 | 706.1864 | 15.83487 | 1.44E-41 |

D2 | -114.037 | 998.6984 | -0.11419 | 0.909166 |

D3 | 3163.148 | 998.6984 | 3.167271 | 0.001696 |

D4 | 1898.36 | 1008.256 | 1.882817 | 0.060683 |

D5 | 3208.13 | 1008.256 | 3.181862 | 0.001615 |

D6 | 3387.591 | 1008.256 | 3.359854 | 0.00088 |

D7 | 185.9758 | 1008.256 | 0.184453 | 0.853781 |

D8 | -111.255 | 1008.256 | -0.11034 | 0.91221 |

D9 | 358.1681 | 1008.256 | 0.355235 | 0.72266 |

D10 | 723.5142 | 1008.256 | 0.71759 | 0.473563 |

D11 | 527.1681 | 1008.256 | 0.522852 | 0.60146 |

D12 | 498.9373 | 1008.256 | 0.494852 | 0.621063 |

*NCARREG = 11182.37 - 114.037 D2 + 3163.148 D3 + 1898.36 D4 + 3208.13 D5 + 3387.591 D6 + 185.98 D7 - 111.255 D8 + 358.1681 D9 + 723.5142 D10 + 527.1681 D11 + 498.9373 D12*

### Is it true that March, May, and June are special?

Yes. March, May, and June have the largest dummy coefficients, and all three are statistically significant at the 5 percent level: relative to the baseline month, registrations are higher by about 3163 in March (p = 0.0017), 3208 in May (p = 0.0016), and 3388 in June (p = 0.0009), the largest of the three.

**Compare also with the model estimated on the quarterly data in question A. Is any information lost by using only the model estimated under question A?**

#### Hypothesis of distractive driving

**Ho:** mean distractive driving is equal across cities, land, and motorways

**H1:** mean distractive driving differs for at least one road type

**SUMMARY**

| Groups | Count | Sum | Average | Variance |
|---|---|---|---|---|
| Cities | 16 | 147 | 9.1875 | 2.9665 |
| Land | 16 | 109.8 | 6.8625 | 2.0665 |
| Motorways | 16 | 254.8 | 15.925 | 8.539333333 |

**ANOVA**

| Source of Variation | SS | df | MS | F | P-value |
|---|---|---|---|---|---|
| Between Groups | 708.9517 | 2 | 354.4758333 | 78.35259228 | 2.19355E-15 |
| Within Groups | 203.585 | 45 | 4.524111111 | | |
| Total | 912.5367 | 47 | | | |

The one-way ANOVA gives F = 78.35 with a p-value of 2.19E-15, far below the 0.05 significance level. We therefore reject the null hypothesis of equal means and conclude that the level of distractive driving differs by road type. Motorways have the highest average (15.925, against 9.1875 for cities and 6.8625 for land) and the highest variance (8.54), which is consistent with distractive driving being most likely to occur on motorways.
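The one-way ANOVA table can be rebuilt from the group counts, averages, and variances in the summary. The following is a sketch of that arithmetic; `one_way_anova` is an illustrative helper, not the spreadsheet routine that produced the output.

```python
# (count, average, variance) per group, copied from the SUMMARY table.
groups = {
    "Cities": (16, 9.1875, 2.9665),
    "Land": (16, 6.8625, 2.0665),
    "Motorways": (16, 15.925, 8.539333333),
}

def one_way_anova(groups):
    """Return (SS between, SS within, F) from group summary statistics."""
    n_total = sum(n for n, _, _ in groups.values())
    grand_mean = sum(n * mean for n, mean, _ in groups.values()) / n_total
    ss_between = sum(n * (mean - grand_mean) ** 2 for n, mean, _ in groups.values())
    ss_within = sum((n - 1) * var for n, _, var in groups.values())
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, f_stat

ss_between, ss_within, f_stat = one_way_anova(groups)
print(round(ss_between, 4), round(ss_within, 4), round(f_stat, 4))
```

A large F means the variation between group means is large relative to the variation within groups, which is what drives the very small p-value.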

## ANOVA Analysis

**ANOVA**

Source of Variation | SS | df | MS | F | P-value | F crit |

Sample | 149.4317 | 3 | 49.81056 | 50.51211 | 5.54E-13 | 2.866266 |

Columns | 708.9517 | 2 | 354.4758 | 359.4685 | 1.63E-24 | 3.259446 |

Interaction | 18.65333 | 6 | 3.108889 | 3.152676 | 0.013741 | 2.363751 |

Within | 35.5 | 36 | 0.986111 | |||

Total | 912.5367 | 47 |

**Describe the method and set up the hypotheses behind the test.**

**Ho:** Distractive performance is equal in all the segments (U1 = U2 = U3 = U4)

**H1:** Distractive performance differs for at least one segment (U1, U2, U3, and U4 are not all equal)

Here U1 is the mean percent for distractive performance in Scandinavia, and U2, U3, and U4 are the means for small Western countries, large Western countries, and East European countries respectively.

#### What is the outcome? Do we observe interaction or segmentation among the two factors?

Interaction between the two factors is observed: the interaction term has a p-value of 0.0137 (F = 3.15 against a critical value of 2.36), which is statistically significant at the 5 percent level.

#### What is the interpretation?

Conclusions are drawn from the p-values in the table. The test of equal distractive performance across road types (the Columns row) has a p-value of 1.63E-24, far below the 0.05 significance level, so the hypothesis of equality is rejected. The segment effect (the Sample row, p = 5.54E-13) is significant as well.
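Each F value in the two-way table is the mean square of the effect divided by the within-group mean square, and an effect is significant when F exceeds its critical value. The sketch below recomputes the three F ratios from the sums of squares and degrees of freedom in the table; the data layout is an illustrative choice.

```python
# (SS, df, F crit) per effect, copied from the two-way ANOVA table.
effects = {
    "Sample": (149.4317, 3, 2.866266),
    "Columns": (708.9517, 2, 3.259446),
    "Interaction": (18.65333, 6, 2.363751),
}
ss_within, df_within = 35.5, 36
ms_within = ss_within / df_within

# F = (SS / df) / MS within
f_values = {name: (ss / df) / ms_within for name, (ss, df, _) in effects.items()}

# Reject the hypothesis of no effect when F exceeds the critical value.
rejected = {name: f_values[name] > crit for name, (_, _, crit) in effects.items()}

for name in effects:
    print(f"{name}: F = {f_values[name]:.4f}, significant = {rejected[name]}")
```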

#### Which group(s)/segments of countries are of interest with regard to both factors?

Segm | Cities | Land | Motorways |

1 | 7.1 | 5.3 | 12.4 |

2 | 8.8 | 6.4 | 15.0 |

3 | 10.0 | 6.7 | 17.0 |

4 | 10.9 | 9.0 | 19.4 |

The table displays the average incidence of distractive performance by segment and road type. Segments 3 and 4 are of most interest with regard to both factors because they have the largest mean values.

#### Can ranking be undertaken?

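Yes. Based on the tabulated means, ranking can be undertaken: the averages rise from segment 1 to segment 4 for every road type, and within every segment motorways rank highest, followed by cities and then land.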

**Examine the symmetry of the dataset and conduct the Bowman-Shenton test. Set up the hypotheses, describe, and perform the test.**

**H0:**the data follow a normal distribution.

**H1:**the data does not follow a normal distribution.

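The Bowman-Shenton statistic is defined as B = n[S²/6 + (K - 3)²/24], where S is the skewness, K is the kurtosis, and n = 25 is the number of observations: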

Skewness | -2.13823 |

Kurtosis | 4.629575 |

Bowman-Shenton test. | 21.81636 |

The statistic is compared with a chi-square distribution with 2 degrees of freedom, whose critical value at the 5% level of significance is 5.99. Since the Bowman-Shenton statistic of 21.82 exceeds this value, we reject the null hypothesis and conclude that the data do not follow a normal distribution. The data are skewed, as the skewness value of -2.13823 also shows.
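The reported test statistic can be reproduced from the skewness and kurtosis with the usual Bowman-Shenton (Jarque-Bera) formula. This sketch assumes n = 25 observations and that the reported kurtosis is the raw value rather than excess kurtosis, since that combination reproduces the figure in the table; the critical value 5.991 is the 5 percent point of a chi-square distribution with 2 degrees of freedom.

```python
def bowman_shenton(n_obs, skewness, kurtosis):
    """Bowman-Shenton statistic from sample skewness and raw kurtosis."""
    return n_obs * (skewness ** 2 / 6 + (kurtosis - 3) ** 2 / 24)

# Assumption: n = 25 and kurtosis reported as raw kurtosis (normal = 3).
statistic = bowman_shenton(25, -2.13823, 4.629575)

# 5 percent critical value of the chi-square distribution with 2 d.f.
CHI2_CRITICAL_2DF_5PCT = 5.991
reject_normality = statistic > CHI2_CRITICAL_2DF_5PCT

print(round(statistic, 4), reject_normality)
```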

**Set up a sign-test and examine a hypothesis stating that the median is higher than 13,000. Set up the hypotheses, describe and perform the test.**

**Ho:Median> 13000**

**H1:Median <13000**

**Sign test**

SUCCESS(positive) | 19 |

FAILURE(negative) | 6 |

Q | 0.24 |

P | 0.76 |

BINOMIAL PROBABILITY | 0.184053578 |

The p-value based on the binomial probability is more than 0.05; we therefore fail to reject the null hypothesis. With 19 of the 25 observations above 13,000, the data are consistent with a median higher than 13,000.
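For comparison, a conventional sign test uses a success probability of 0.5 under the null hypothesis that the median equals 13,000, and sums the binomial tail for the alternative that the median is higher. The sketch below is that textbook formulation, which is an assumption and differs from the hypotheses stated above; it also evaluates the binomial probability of exactly 19 successes at the observed share P = 0.76 for comparison with the table. The helper names are illustrative.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

def sign_test_upper_p_value(successes, n):
    """One-sided sign test p-value: P(X >= successes) when p = 0.5."""
    return sum(binomial_pmf(k, n, 0.5) for k in range(successes, n + 1))

# 19 observations above 13,000 and 6 below, as in the table.
p_upper = sign_test_upper_p_value(19, 25)

# Binomial probability of exactly 19 successes at the observed share 0.76.
pmf_at_observed_share = binomial_pmf(19, 25, 0.76)

print(round(p_upper, 6), round(pmf_at_observed_share, 6))
```

Under this formulation the one-sided p-value is about 0.0073, so the hypothesis that the median equals 13,000 would be rejected in favour of a higher median, which points the same way as the 19 positive signs.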

#### Analysis of incomes

**5A**

**DKINCOME**

Mean | 526.6375 |

Standard Error | 17.19441727 |

Median | 528 |

Mode | 600 |

Standard Deviation | 343.8883454 |

Kurtosis | 11.72712285 |

Skewness | 2.674171798 |

Range | 2662 |

Sum | 210655 |

Count | 400 |

Confidence Level (95.0%) | 33.80297294 |

The data set has a mean of 526.6375. The data do not appear to be normally distributed: the skewness is high and positive at 2.6742, and the kurtosis is 11.73. The frequency distribution below confirms this, with a long tail on the right-hand side.
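Several of the descriptive statistics follow from one another, which gives a quick consistency check. The sketch below recomputes the mean from the sum and count, and the standard error from the standard deviation; the implied multiplier for the 95 percent confidence level is derived from the table rather than looked up.

```python
from math import sqrt

# Values copied from the DKINCOME descriptive statistics.
total = 210655
count = 400
standard_deviation = 343.8883454
confidence_half_width = 33.80297294

mean = total / count
standard_error = standard_deviation / sqrt(count)

# Multiplier implied by the reported 95% confidence level;
# it is close to 1.966, in line with a t distribution with 399 d.f.
implied_multiplier = confidence_half_width / standard_error

print(mean, round(standard_error, 8), round(implied_multiplier, 4))
```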

Class | Frequency |

0-160 | 27 |

161-320 | 88 |

321-480 | 73 |

481-640 | 111 |

641-800 | 65 |

801-960 | 20 |

961-1120 | 4 |

1121-1280 | 1 |

1281-1440 | 1 |

1441-1600 | 1 |

1601-1760 | 1 |

1761-1920 | 1 |

1921-2080 | 1 |

2081-2240 | 3 |

2241-2400 | 1 |

2401-2560 | 1 |

2561-2720 | 1 |

The largest class is 481-640 with 111 observations, followed by 161-320 with 88. The remaining classes are distributed as shown in the frequency table, with a thin tail extending up to 2720, which clearly shows skewness to the right.

**Use descriptive statistics and a discussion of the shape of the samples compared to the distribution of the total data set.**

Simple random sample | Stratified/systematic sampling | Total sample | |

Mean | 557.575 | 516.925 | 526.6375 |

Standard Error | 62.92947709 | 32.46425 | 17.19441727 |

Median | 510 | 540 | 528 |

Mode | 600 | 600 | 600 |

Standard Deviation | 398.0009591 | 205.3219 | 343.8883454 |

Sample Variance | 158404.7635 | 42157.1 | 118259.1941 |

Kurtosis | Kurtosis | -0.42397 | 11.72712285 |

Skewness | 2.986920769 | -0.13469 | 2.674171798 |

Range | 2032 | 840 | 2662 |

Minimum | 68 | 120 | 18 |

Maximum | 2100 | 960 | 2680 |

Sum | 22303 | 20677 | 210655 |

Count | 40 | 40 | 400 |

The simple random sample and the total data set are both positively skewed, with skewness coefficients of 2.986920769 and 2.674171798 respectively; their distributions have a long right-hand tail. The sample from the stratified sampling technique, by contrast, has a slightly negative skewness (-0.13469), so it is close to symmetric. In terms of outliers, the total data set (maximum 2680) and the simple random sample (maximum 2100) contain extreme high values, whereas the stratified sample (maximum 960) does not.
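The two sampling schemes compared above can be sketched as follows. The population here is synthetic (right-skewed values generated from a lognormal distribution with arbitrary parameters) and is not the DKINCOME data; the function names and the fixed seed are illustrative assumptions. The skewness helper uses the adjusted sample formula that spreadsheet SKEW functions apply.

```python
import random
from statistics import mean, stdev

def simple_random_sample(population, size, rng):
    """Draw `size` units without replacement, each equally likely."""
    return rng.sample(population, size)

def systematic_sample(population, size, rng):
    """Pick a random start, then take every step-th unit of the ordered list."""
    step = len(population) // size
    start = rng.randrange(step)
    return [population[start + i * step] for i in range(size)]

def skewness(values):
    """Adjusted sample skewness: n/((n-1)(n-2)) * sum(((x - mean)/s)^3)."""
    n = len(values)
    m = mean(values)
    s = stdev(values)
    return n / ((n - 1) * (n - 2)) * sum(((v - m) / s) ** 3 for v in values)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical right-skewed population of 400 values, sorted so that the
# systematic sample spreads evenly across the whole range.
population = sorted(rng.lognormvariate(6.0, 0.6) for _ in range(400))

srs = simple_random_sample(population, 40, rng)
systematic = systematic_sample(population, 40, rng)

for label, sample in [("population", population), ("simple random", srs),
                      ("systematic", systematic)]:
    print(label, round(mean(sample), 1), round(skewness(sample), 3))
```

Because a simple random sample can by chance include or miss the few extreme values, its shape statistics tend to vary more from draw to draw than those of a sample spread evenly over the ordered population.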
