GDP clustering: A reappraisal


Economics Letters 117 (2012) 837–840


journal homepage: www.elsevier.com/locate/ecolet


Michele Battisti a,b,c, Christopher F. Parmeter d,∗

a Dipartimento di Studi su Politica, Diritto e Societá, Palermo, Italy
b CeLEG LUISS Rome, Italy
c RCEA Rimini, Italy
d Department of Economics, University of Miami, Coral Gables, FL 33124-6520, United States

Article info

Article history: Received 13 August 2012; Accepted 24 August 2012; Available online 11 September 2012

JEL classification: C33, O15, O16, O47

Keywords: Mixture densities; Output; Autoregressive; Isotropic

Abstract

This note explores clustering in cross country GDP per capita using recently developed model based clustering methods for panel data. Previous research characterizing the components of the overall distribution of output either uses ad hoc methods, or methods which ignore/subvert the panel nature of the data. These new methods allow the characterization of the possible autoregressive relationship of output between time points. We show that traditional static clustering decade by decade gives mixed results regarding clustering over time, while the application of longitudinal mixtures presents three distinct clusters at all periods of time.

© 2012 Elsevier B.V. All rights reserved.

1. Introduction

There has been a growing use of mixture models in the growth empirics literature. These methods acknowledge heterogeneity across a variety of dimensions in the growth spectrum, allowing the characterization of club convergence and the measurement of polarization. The primary use of these methods has been either to study direct features of the distribution of growth, such as the number of distinct components, or to determine the variables which influence membership into a specific club. Additionally, within group features have also been investigated, such as club σ-convergence. Overall, mixture methods have provided an array of insights into the cross country growth process.

One shortcoming of the existing methods is that they have been administered in a primarily static setting, even with the use of panel data. For example, Pittau et al. (2010) and Battisti and Parmeter (2011) both investigate the distribution of cross country output with access to panel data, but their main analyses hinge on treating each data period as distinct from one another. However, given the repeated measurement of countries over time, a more rigorous analysis can be conducted by allowing for cross period covariance within the mixture setup.

∗ Corresponding author. Tel.: +1 305 284 4397; fax: +1 305 284 2985. E-mail addresses: mbattisti@unipa.it (M. Battisti), cparmeter@bus.miami.edu (C.F. Parmeter).

0165-1765/$ – see front matter © 2012 Elsevier B.V. All rights reserved. doi:10.1016/j.econlet.2012.08.038

Using recently developed mixture methods for repeatedly measured data, this paper takes a fresh look at clustering in the distribution of cross country output. While we find the number of components to be consistent with previous studies,1 the mean-variance dynamics over time suggest different behavior across the identified components, notably a disappearance of the middle class as measured by the component averages. Further, for our dataset, we see that using static decade by decade clustering does not produce a consistent number of components over time. On the contrary, exploiting the panel nature we show that there exist three components, with the poorest component (mainly sub-Saharan nations) not growing over time, while there is a tendency of within and across club convergence for the other two components.

1 Pittau et al. (2010) find three groups using weighted average workforce GDP per worker PWT data over the period 1960–2000, while we have a longer time frame and different output data.


2. Empirical framework

The basic framework for our empirical analysis is the Gaussian multivariate mixture model:

$$f(y) = \sum_{g=1}^{G} \psi_g \, \phi(y \mid \mu_g, \Sigma_g), \qquad (1)$$

where ψ_g is the probability of membership in group g and φ(y|µ,Σ) is the multivariate Gaussian density with mean vector µ and covariance matrix Σ. The covariance structure may be decomposed to test alternative shape, volume and orientation restrictions across the components (Fraley and Raftery, 2002). Note that a setup ignoring the panel nature of the data would estimate a univariate Gaussian mixture model. The multivariate version emerges since each period of measurement acts like a single variable in the multivariate framework.
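As a minimal sketch of Eq. (1) in its static, univariate special case (one time period, so each Σ_g collapses to a scalar variance), the following evaluates the mixture density and the implied posterior membership probabilities. The function names and all numerical values (weights, means, standard deviations) are hypothetical and chosen only for illustration; they are not estimates from the paper.

```python
import math


def normal_pdf(y, mean, sd):
    """Univariate Gaussian density phi(y | mean, sd^2)."""
    z = (y - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def mixture_pdf(y, weights, means, sds):
    """f(y) = sum_g psi_g * phi(y | mu_g, sigma_g^2), the univariate case of Eq. (1)."""
    return sum(w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds))


def posterior_membership(y, weights, means, sds):
    """Probability that an observation y belongs to each component."""
    parts = [w * normal_pdf(y, m, s) for w, m, s in zip(weights, means, sds)]
    total = sum(parts)
    return [p / total for p in parts]


# Hypothetical three-component mixture on a log GDP per capita scale.
weights = [0.5, 0.3, 0.2]   # assumed mixing proportions psi_g
means = [8.0, 6.5, 7.4]     # assumed component means mu_g
sds = [0.5, 0.5, 0.5]       # assumed component standard deviations

density_at_7 = mixture_pdf(7.0, weights, means, sds)
membership_at_7 = posterior_membership(7.0, weights, means, sds)
print(density_at_7, membership_at_7)
```

In the longitudinal setting the scalar y becomes the vector of a country's observations across periods, and the univariate density is replaced by a multivariate Gaussian with component-specific mean vector and covariance matrix.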

McNicholas and Murphy (2010) allow for an extended Cholesky decomposition of the covariance matrix Σ in order to take into account the relationship between measurements at different points in time. This decomposition separates Σ into generalized autoregressive parameters T and innovation variances D (Pourahmadi, 1999), so that it is possible to distinguish between the different sources of covariance:

$$\Sigma^{-1} = T' D^{-1} T. \qquad (2)$$

This setup for Σ has a very natural interpretation from least squares prediction theory. The least squares predictor of y_t given y_{t−1}, . . . , y_1 is

$$y_t = \mu_t + \sum_{s=1}^{t-1} (-\vartheta_{ts})(y_s - \mu_s) + \sqrt{d_t}\,\varepsilon_t. \qquad (3)$$

The ϑ_ts are the sub-diagonal elements of T, while the d_t are the diagonal elements of D and µ_s is the sth period mean. Taken together, this format for constructing Σ allows for previous observations to influence current observations, something that is missing from static mixture analysis methods. As in static mixture modeling, the optimal solution can be obtained via the EM algorithm (Dempster et al., 1977).
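The decomposition can be illustrated with a small numerical sketch. Assuming a unit lower triangular T and a diagonal D that satisfy T Σ T' = D, the code below recovers both from a given covariance matrix using only the standard library. The helper names and the AR(1)-type example matrix with correlation 0.8 are hypothetical and serve purely as illustration; this is not the estimation routine used in the paper.

```python
import math


def cholesky(sigma):
    """Lower triangular L with sigma = L L^T (sigma must be positive definite)."""
    n = len(sigma)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sigma[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L


def modified_cholesky(sigma):
    """Return (T, d): T unit lower triangular, d the innovation variances,
    such that T sigma T^T = diag(d)."""
    n = len(sigma)
    L = cholesky(sigma)
    d = [L[t][t] ** 2 for t in range(n)]
    # C = L * diag(L)^{-1} is unit lower triangular with sigma = C diag(d) C^T.
    C = [[L[i][j] / L[j][j] if j <= i else 0.0 for j in range(n)] for i in range(n)]
    # T = C^{-1}, obtained column by column with forward substitution.
    T = [[0.0] * n for _ in range(n)]
    for col in range(n):
        for i in range(n):
            rhs = 1.0 if i == col else 0.0
            T[i][col] = rhs - sum(C[i][k] * T[k][col] for k in range(i))
    return T, d


# Hypothetical covariance for three periods: sigma_ij = rho^|i-j| with rho = 0.8.
rho = 0.8
sigma = [[rho ** abs(i - j) for j in range(3)] for i in range(3)]
T, d = modified_cholesky(sigma)

for row in T:
    print([round(x, 4) for x in row])
print([round(x, 4) for x in d])
```

For this first-order autoregressive example only the first sub-diagonal of T is nonzero, so each period depends on the immediately preceding one alone, and the innovation variances after the first period are all equal.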

3. Analysis

Our measure of cross country output is GDP per capita taken from the Groningen Growth and Development Center and measured in constant 1990 prices for 101 countries over the period 1950–2010. Analyzing the data across each decade provides us with seven time points for each country. Prior to estimating the longitudinal clustering of the distribution of world-wide output, we focus on the results from a traditional, static implementation.

Table 1 shows the stylized fact of emerging multimodality (see Quah, 1993). The LR bootstrap test suggests a single group of countries from 1950 until 1970, with the number of components ranging from two to four after 1970. The static clustering makes interpreting and discussing polarization and club σ-convergence difficult given the different numbers of groups emerging in each decade. In Pittau et al.'s (2010) analysis they always find three groups of countries, which makes these types of statements more natural. The use of longitudinal mixture methods will help to alleviate these issues.
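A generic parametric bootstrap likelihood ratio test for the number of mixture components can be sketched as follows. This is an illustration under stated assumptions, not necessarily the procedure behind Table 1: the EM routine, the quantile-based starting values, the variance floor, the function names and the simulated data are all hypothetical, and the number of replications is kept small so the example runs quickly.

```python
import math
import random


def normal_pdf(y, mean, var):
    """Univariate Gaussian density with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_loglik(data, weights, means, variances):
    """Log-likelihood of the data under a univariate Gaussian mixture."""
    return sum(
        math.log(sum(w * normal_pdf(y, m, v)
                     for w, m, v in zip(weights, means, variances)) + 1e-300)
        for y in data
    )


def fit_mixture(data, G, iters=100, var_floor=1e-3):
    """Fit a G-component mixture by EM; return (log-likelihood, parameters)."""
    n = len(data)
    ordered = sorted(data)
    means = [ordered[int((g + 0.5) * n / G)] for g in range(G)]  # quantile starts
    grand_mean = sum(data) / n
    pooled_var = max(sum((y - grand_mean) ** 2 for y in data) / n, var_floor)
    variances = [pooled_var] * G
    weights = [1.0 / G] * G
    for _ in range(iters):
        # E-step: posterior membership probabilities.
        resp = []
        for y in data:
            parts = [w * normal_pdf(y, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(parts) + 1e-300
            resp.append([p / total for p in parts])
        # M-step: update weights, means and variances.
        for g in range(G):
            n_g = max(sum(r[g] for r in resp), 1e-12)
            weights[g] = n_g / n
            means[g] = sum(r[g] * y for r, y in zip(resp, data)) / n_g
            variances[g] = max(
                sum(r[g] * (y - means[g]) ** 2 for r, y in zip(resp, data)) / n_g,
                var_floor,
            )
    return mixture_loglik(data, weights, means, variances), (weights, means, variances)


def sample_mixture(params, n, rng):
    """Draw n observations from a fitted mixture."""
    weights, means, variances = params
    draws = []
    for _ in range(n):
        g = rng.choices(range(len(weights)), weights=weights)[0]
        draws.append(rng.gauss(means[g], math.sqrt(variances[g])))
    return draws


def bootstrap_lrt(data, G, B=39, seed=0):
    """Test H0: G components against H1: G + 1 components."""
    rng = random.Random(seed)
    ll0, params0 = fit_mixture(data, G)
    ll1, _ = fit_mixture(data, G + 1)
    lr_obs = 2.0 * (ll1 - ll0)
    exceed = 0
    for _ in range(B):
        boot = sample_mixture(params0, len(data), rng)  # simulate under H0
        b0, _ = fit_mixture(boot, G)
        b1, _ = fit_mixture(boot, G + 1)
        if 2.0 * (b1 - b0) >= lr_obs:
            exceed += 1
    return lr_obs, (exceed + 1) / (B + 1)


# Simulated, clearly bimodal data (hypothetical, not the GDP sample).
data_rng = random.Random(1)
data = [data_rng.gauss(6.5, 0.3) for _ in range(40)] + \
       [data_rng.gauss(8.5, 0.3) for _ in range(40)]
lr_stat, p_value = bootstrap_lrt(data, G=1, B=39, seed=2)
print(lr_stat, p_value)
```

Applied sequentially for G = 1, 2, ..., the test stops at the first G whose null is not rejected, which is how a single row of Table 1 can be read from left to right.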

Turning our attention to the longitudinal mixture results, we test for the presence of a maximum of six components for each of the eight possible covariance decomposition structures listed in McNicholas and Murphy (2010).2 The eight possible covariance structures for Σ arise over choices concerning the parameters in (3). The within component variance coefficients can be restricted to be equal across time, known as an isotropic constraint (D_g = δ_g I), the across component variance coefficients in each time period can be equal (d_gt = d_mt for g ≠ m) and the across component autoregressive parameters can be equal (ϑ^g_ts = ϑ^m_ts for g ≠ m).

2 We use the longclust R package, see McNicholas and Subedi (2012).

Table 1
LR bootstrap test computed over 1000 replications.

| Year | G = 1 | G = 2 | G = 3 | G = 4 |
|------|-------|-------|-------|-------|
| 1950 | 0.139 |       |       |       |
| 1960 | 0.764 |       |       |       |
| 1970 | 0.894 |       |       |       |
| 1980 | 0.004 | 0.002 | 0.407 |       |
| 1990 | 0.006 | 0.781 |       |       |
| 2000 | 0.004 | 0.049 | 0.181 |       |
| 2010 | 0.004 | 0.030 | 0.043 | 0.481 |

Notes: the table reports the probability of rejection of the null hypothesis of components equal to G segments, with an alternative hypothesis that the number of segments is greater than G.

Table 2
Log GDP per capita: longitudinal cluster characteristics.

|                     | Cluster 1 | Cluster 2 | Cluster 3 |
|---------------------|-----------|-----------|-----------|
| Mean 1950           | 8.017     | 6.544     | 7.430     |
| Mean 1960           | 8.256     | 6.768     | 7.810     |
| Mean 1970           | 8.553     | 6.935     | 8.336     |
| Mean 1980           | 8.851     | 6.942     | 8.754     |
| Mean 1990           | 8.950     | 6.944     | 8.824     |
| Mean 2000           | 9.149     | 7.035     | 8.947     |
| Mean 2010           | 9.329     | 7.355     | 9.259     |
| Δ Variance 1960 (%) | 10.8      | 20.3      | 3.2       |
| Δ Variance 1970 (%) | 5.2       | 23.4      | −15.7     |
| Δ Variance 1980 (%) | −15.2     | −12.5     | −8.0      |
| Δ Variance 1990 (%) | 4.5       | 26.0      | 98.9      |
| Δ Variance 2000 (%) | 3.2       | 48.3      | 79.9      |
| Δ Variance 2010 (%) | −15.6     | 27.0      | −8.3      |
| Overall change (%)  | −5.6      | 213.1     | 172.8     |
| Weight (%)          | 48.0      | 29.9      | 22.1      |

Given that there exists a trade off between a greater description of the clustering process (with more groups) and more coefficients to estimate, we use two criteria to select the optimal solution: the traditional BIC and the ICL (integrated completed log-likelihood), a correction of the BIC that penalizes components that are more spread out.
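The two criteria can be sketched as follows, assuming the "larger is better" convention BIC = 2 log L − k log n and the common approximation in which the ICL equals the BIC plus twice the sum of the log posterior probabilities of each observation's most likely component. Both conventions, the function names and the numbers in the example are assumptions made for illustration; the paper does not spell out its exact formulas.

```python
import math


def bic(loglik, n_params, n_obs):
    """BIC in the 'larger is better' form: 2 log L - k log n (assumed convention)."""
    return 2.0 * loglik - n_params * math.log(n_obs)


def icl(loglik, n_params, posteriors):
    """Approximate ICL: BIC plus 2 * sum_i log(max_g z_ig).

    posteriors[i][g] is the estimated probability that observation i belongs
    to component g. The added term is zero when every assignment is certain
    and becomes more negative as memberships get fuzzier.
    """
    n_obs = len(posteriors)
    uncertainty = sum(math.log(max(row)) for row in posteriors)
    return bic(loglik, n_params, n_obs) + 2.0 * uncertainty


# Hypothetical fit: four observations, log-likelihood -10, six free parameters.
crisp = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
fuzzy = [[0.6, 0.4], [0.7, 0.3], [0.45, 0.55], [0.2, 0.8]]

bic_value = bic(-10.0, 6, 4)
icl_crisp = icl(-10.0, 6, crisp)
icl_fuzzy = icl(-10.0, 6, fuzzy)
print(bic_value, icl_crisp, icl_fuzzy)
```

Under this convention the model with the largest criterion value is preferred, and the ICL never exceeds the BIC for the same fit.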

The best model determined using either the BIC or the ICL is a mixture with three components that restricts the generalized autoregressive parameters across components to be equal while allowing different component innovation variances, without isotropic constraints. The restriction on the generalized autoregressive parameters for Σ_g suggests that the components behave in a similar fashion over time, relative to previous realizations of output, but given different component means, certain clubs will inevitably have higher levels of output on average. The fact that the component innovation variances differ across time and across components also implies that the members of each component react differently to shocks than members in other components.

To study the estimated clusters in more detail, Table 2 presents the cluster means over the seven decades as well as changes in the within group variance across the decades (Pittau et al., 2010, initially propose this idea). Component 1 represents mainly rich countries (containing oil producers, OECD and several Latin American countries), component 2 is primarily made up of the poorest nations (sub-Saharan African and some Asian countries) while component 3 is an intermediate group having some Asian, Eastern European and Middle East/North African countries. These groups are consistent with the univariate results of Pittau et al. (2010) and the multivariate results of Battisti and Parmeter (2011), albeit using static methods.

Fig. 1. Country specific and component mean trajectories.

Comparing these results with the static clustering results we see that the Asian Tigers would have been assigned to the poorest group, which is difficult to conceive given their strong growth after 1970. Further, the growth slowdown of Italy, Israel, and New Zealand placed them in a middle income cluster using the static methods, while the panel clustering clearly places these countries in the richest (as measured by the mean) component, given the fact that prior to 2010 these countries had strong growth profiles. We also notice that component 1 and component 2 do not have an overlap while component 3 has an upward trend, and shows signs of catching up with component 1.

The bottom panel of Table 2 shows club σ-convergence for component 1 and divergence for components 2 and 3. The variance changes inside the components highlight a common break during the 1970s, most likely due to the oil shocks which led to a global growth slowdown. We also see the effects of the current economic slowdown manifest themselves in the large reduction in the within group variance for components 1 and 3. Looking at the relative distance between the means across groups (a raw measure of polarization), the distance between the first and second components was 22.5% in 1950 and grew to almost 30% by 2010. Alternatively, component 3 exhibited catching up to component 1, moving from a relative mean gap of 7% in 1950 to less than 1% by 2010.
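The relative mean gaps can be recomputed from the cluster means in Table 2. The sketch below assumes the gap is the difference between two cluster means divided by the lower of the two; the text does not state the normalization, so this choice and the helper name are assumptions, and figures obtained under a different normalization will differ somewhat.

```python
# Cluster means of log GDP per capita taken from Table 2.
means_1950 = {"cluster1": 8.017, "cluster2": 6.544, "cluster3": 7.430}
means_2010 = {"cluster1": 9.329, "cluster2": 7.355, "cluster3": 9.259}


def relative_gap(higher, lower):
    """Gap between two cluster means relative to the lower mean (assumed normalization)."""
    return (higher - lower) / lower


gap_12_1950 = relative_gap(means_1950["cluster1"], means_1950["cluster2"])
gap_12_2010 = relative_gap(means_2010["cluster1"], means_2010["cluster2"])
gap_13_1950 = relative_gap(means_1950["cluster1"], means_1950["cluster3"])
gap_13_2010 = relative_gap(means_2010["cluster1"], means_2010["cluster3"])

for label, gap in [("1 vs 2, 1950", gap_12_1950), ("1 vs 2, 2010", gap_12_2010),
                   ("1 vs 3, 1950", gap_13_1950), ("1 vs 3, 2010", gap_13_2010)]:
    print(f"{label}: {100 * gap:.1f}%")
```

The same pattern appears under either normalization: the gap between components 1 and 2 widens over the sample, while the gap between components 1 and 3 nearly closes.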

The additional feature of the panel clustering approach is that we are able to see the dynamics of these groups. Fig. 1 presents the within component trajectories over time. The dashed lines are the growth profiles for each country within a component while the bold lines are the overall component means. The lower right figure presents the estimated component means from the cluster analysis. This figure shows clearly the findings of Table 2, where component 1 has a larger mean, an upward trend and lower variation at the end of the sample, while component 2 is characterized by flat growth with more variation throughout the entire period and component 3 grows faster than the other two components (on average) with an increasing spread. It is the complementary information to the polarization we just saw: the intermediate component almost equals the average output of the rich component but it has a substantially larger within component variance than component 1.

4. Conclusions

The analysis described here uses recently developed longitudinal mixture analysis to reassess the changes in the density of GDP per capita over time as well as the behavior of the identified sub-components over time. We find that static methods deliver mixed results regarding the number of components over time, while the panel clustering methods show three distinct components over the full period, consistent with previous research. The ability of the longitudinal clustering methods to model trajectories of each country over time allows a greater analysis of the components of the overall density of cross country output over time.

References

Battisti, M., Parmeter, C.F., 2011. Income polarization, convergence tools and mixture analysis. Working Paper, Miami Business School WP2011/17.

Dempster, A.P., Laird, N.M., Rubin, D.B., 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B 39, 1–38.

Fraley, C., Raftery, A., 2002. Model-based clustering, discriminant analysis, and density estimation. Journal of the American Statistical Association 97, 611–631.

McNicholas, P.D., Murphy, T.B., 2010. Model-based clustering of longitudinal data. The Canadian Journal of Statistics 38 (1), 153–168.


McNicholas, P.D., Subedi, S., 2012. Clustering gene expression time course data using mixtures of multivariate t-distributions. Journal of Statistical Planning and Inference 142 (5), 1114–1127.

Pittau, M.G., Zelli, R., Johnson, P.A., 2010. Mixture models and convergence clubs. Review of Income and Wealth 56 (1), 102–122.

Pourahmadi, M., 1999. Joint mean-covariance models with applications to longitudinal data: unconstrained parameterisation. Biometrika 86 (3), 677–690.

Quah, D.T., 1993. Empirical cross-section dynamics in economic growth. European Economic Review 37, 426–434.
