Friday 20 June 2008

Bias in the misspecified AR model under common estimation techniques

Growth data split by country is commonly used in estimating parameters in growth models. The estimation techniques often assume that the model takes a particular form, so it is interesting to find out what happens when the assumed model is incorrect.

I've been running Monte Carlo tests on a misspecified AR model. The true generating process is lny(t) = a(i)*lny(t-1) + b(i) + N(0,V), where a(i) and b(i) are fixed in time but vary by country i, and starting incomes are spaced 1000 apart from 1000 upwards. a is tested under several designs in turn: uniform over (0,1) for each country, uniform over (-1,1), evenly spaced across countries between 0 and 1, and correlated with starting income. b is set either to give a starting growth rate of six percent or to give a growth rate drawn from U(0%,10%). The normal variance V is set so that the noise adds less than an extra 10% variation to the data most of the time (so V equals 0.047). The misspecified model is lny(t) = a*lny(t-1) + b + c(i) + error, with c time invariant and varying by country, and a and b both constants.
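The generating process described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the experiments: the function name `simulate_panel`, its parameters, and the default panel size are all hypothetical, only the U(0,1) design for a and the six percent starting growth rate are shown, and the value 0.047 is treated here as the standard deviation of the shock (an assumption; if it is meant as a variance, pass its square root instead).

```python
import numpy as np

def simulate_panel(n_countries=20, n_periods=30, noise_sd=0.047,
                   growth=0.06, seed=0):
    """Simulate ln y_i(t) = a_i * ln y_i(t-1) + b_i + shock for each country i."""
    rng = np.random.default_rng(seed)
    # Country-specific AR parameters, drawn from U(0,1) (one of the designs in the post).
    a = rng.uniform(0.0, 1.0, size=n_countries)
    # Starting incomes 1000, 2000, 3000, ...
    lny0 = np.log(1000.0 * np.arange(1, n_countries + 1))
    # Pick b_i so the first expected step is a growth rate of `growth`:
    # a_i * lny0_i + b_i = lny0_i + log(1 + growth).
    b = (1.0 - a) * lny0 + np.log1p(growth)
    lny = np.empty((n_countries, n_periods + 1))
    lny[:, 0] = lny0
    for t in range(1, n_periods + 1):
        shock = rng.normal(0.0, noise_sd, size=n_countries)
        lny[:, t] = a * lny[:, t - 1] + b + shock
    return lny, a, b

lny, a, b = simulate_panel()
```

The returned array has one row per country and one column per period, including the starting period, so it can be fed directly to a panel estimator.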

Monte Carlo data sets were generated repeatedly for different numbers of countries and time periods, and the misspecified model was estimated on each panel by GMM-SYS, Arellano-Bond (A-B), and within-group OLS. The model looks good according to the statistics commonly used with each estimation technique, but the estimates of a are high. For the U(0,1) data, where the true values average 0.5, they are near 1 in GMM-SYS, around 0.7-0.8 in A-B, and between 0.8 and 0.85 in within-group OLS.
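Of the three estimators, the within-group one is simple enough to sketch in a few lines. The function name `within_group_ar` is hypothetical and this is only an illustration of the technique, not the code behind the figures above. GMM-SYS and Arellano-Bond are not attempted here. The small check underneath uses a made-up panel in which every country shares a = 0.5, so the pooled model is correctly specified and the estimate should land close to the true value when the panel is long.

```python
import numpy as np

def within_group_ar(lny):
    """Pooled AR(1) coefficient after removing each country's own mean.

    lny: array of shape (countries, periods) holding log income.
    """
    y = lny[:, 1:]    # ln y_i(t)
    x = lny[:, :-1]   # ln y_i(t-1)
    # Demeaning within each country absorbs the constant b and the fixed effect c(i).
    y_dm = y - y.mean(axis=1, keepdims=True)
    x_dm = x - x.mean(axis=1, keepdims=True)
    return float((x_dm * y_dm).sum() / (x_dm ** 2).sum())

# Hypothetical sanity check: a homogeneous panel with a = 0.5 for every country.
rng = np.random.default_rng(1)
n_countries, n_periods = 10, 2000
panel = np.zeros((n_countries, n_periods + 1))
for t in range(1, n_periods + 1):
    panel[:, t] = 0.5 * panel[:, t - 1] + 1.0 + rng.normal(0.0, 1.0, n_countries)
a_hat = within_group_ar(panel)
```

Running the same function on a panel with country-specific a values is what exposes the misspecification.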

What happens is that countries with a low AR parameter approach their maximum income quickly, after which their main variation is random fluctuation, so after a while their data look as if they could have been generated by a pure AR=1 process with no constant. The high-AR countries continue to display variation over a longer period. I suspect this is why the pooled AR estimates are close to one for all the estimation methods.
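The speed of that approach can be made concrete. For 0 < a < 1 and ignoring the noise, the process settles at a log income of b/(1-a), and the gap to that level shrinks by a factor of a each period. The helper names below are hypothetical, and the 95% threshold is an arbitrary choice for illustration.

```python
import math

def steady_state(a, b):
    """Long-run mean of ln y for ln y(t) = a*ln y(t-1) + b + noise, with 0 <= a < 1."""
    return b / (1.0 - a)

def periods_to_close(a, fraction=0.95):
    """Whole periods until the expected gap to the steady state has shrunk by `fraction`.

    The expected gap decays as a**t, so this solves a**t <= 1 - fraction.
    """
    if not 0.0 < a < 1.0:
        raise ValueError("a must lie strictly between 0 and 1")
    return math.ceil(math.log(1.0 - fraction) / math.log(a))

fast = periods_to_close(0.2)   # a low-AR country
slow = periods_to_close(0.9)   # a high-AR country
```

A country with a = 0.2 closes 95% of its gap within a couple of periods, while one with a = 0.9 needs well over twenty, which is consistent with the low-AR countries contributing mostly noise to the pooled estimate.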

Since the standard capital + education Solow model tends to display little variation in its key parameters and comparatively low explanatory power, models based on specifications like lny(t) = a*lny(t-1) + d*country education rate + e*country saving rate + c(i) are actually quite similar to the original misspecified model, with the two rate terms taking the place of the constant term. High-growth countries with much variation in the data will often have high a parameters (being the source of identified growth), and the overall panel estimate of AR should be near 1.

Standard ARIMA estimation gave reasonable parameter estimates for each individual country. Checking the country-level ARIMA estimates for AR stability before using panel data is sensible.
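A rough version of that country-by-country check is sketched below. It is a plain OLS fit of an AR(1) with intercept for each country, used as a simple stand-in for a full ARIMA routine, and the function name `per_country_ar1` is hypothetical. The example panel is noiseless and made up purely to show the call.

```python
import numpy as np

def per_country_ar1(lny):
    """Fit ln y(t) = a*ln y(t-1) + b separately for each row (country) by OLS.

    Returns two arrays: the estimated a and b for each country.
    """
    n = lny.shape[0]
    a_hat = np.empty(n)
    b_hat = np.empty(n)
    for i in range(n):
        # polyfit with degree 1 returns (slope, intercept).
        a_hat[i], b_hat[i] = np.polyfit(lny[i, :-1], lny[i, 1:], 1)
    return a_hat, b_hat

# Hypothetical example: three noiseless countries with different AR parameters.
true_a = np.array([0.3, 0.6, 0.9])
panel = np.zeros((3, 21))
for t in range(1, 21):
    panel[:, t] = true_a * panel[:, t - 1] + 1.0
a_hat, b_hat = per_country_ar1(panel)
```

If the estimated a values differ widely across countries, or any of them sits at or beyond 1 in absolute value, that is a warning that a single pooled AR parameter may not describe the panel well.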
