Economic complexity unfolded: Interpretable model for the productive structure of economies

Economic complexity reflects the amount of knowledge that is embedded in the productive structure of an economy. It resides on the premise of hidden capabilities—fundamental endowments underlying the productive structure. In general, measuring the capabilities behind economic complexity directly is difficult, and indirect measures have been suggested which exploit the fact that the presence of the capabilities is expressed in a country’s mix of products. We complement these studies by introducing a probabilistic framework which leverages Bayesian non-parametric techniques to extract the dominant features behind the comparative advantage in exported products. Based on economic evidence and trade data, we place a restricted Indian Buffet Process on the distribution of countries’ capability endowment, appealing to a culinary metaphor to model the process of capability acquisition. The approach comes with a unique level of interpretability, as it produces a concise and economically plausible description of the instantiated capabilities.


Introduction
According to Lall [1,2], each country has to find its own path towards development, focusing on its learning system in order to add capabilities to the ones it already owns. This line of reasoning, which Lall calls the "capabilities approach", has been further developed in the seminal works of Hidalgo and Hausmann [3][4][5][6][7][8][9], as well as in [10][11][12].
These fundamental endowments describing the productive structure of an economy are at the roots of the economic complexity theory, which leverages tools from network science and econometrics to reflect the amount of knowledge that is embedded in the productive structure of an economy (please see [3][4][5][6] and references therein for a detailed overview on the topic). According to the economic complexity approach, the complexity of the productive structure (more precisely the export structure) of countries is measured by using the information a1111111111 a1111111111 a1111111111 a1111111111 a1111111111 infer the endowment of capabilities, i.e. the level of complexity of a productive system. Exception can be found in the binomial model for capabilities introduced in [5], and the work in [32] which estimates latent factors of endowments and formalizes the concept of latent comparative advantage.
Against this background, we extend over these methods by interpreting the productive structure from the perspective of probabilistic learning [33], where latent features are introduced to capture dependencies between exported products. This is similar in spirit to Latent Semantic Analysis in natural language processing, where documents are related to the words they contain via a set of topics [34]. In this same sense, the extracted features may be associated with the aforementioned capabilities, and may thus be understood as factors underlying the competitive advantage of countries in exported products.
However, in contrast to traditional factor analysis, the approach comes with the unique flavor of balancing predictive accuracy with interpretability, the current "holy grail" in highdimensional data exploration. This interpretation is key for the subsequent exploitation phase which can be put forward as a discriminative model that provides highly accurate predictions. To assure interpretability and to simultaneously accommodate for the observed properties of real trade data (namely, the sparse triangular structure of the country product matrix [4,14]), we propose a Bayesian non-parametric (BNP) approach which naturally encompasses sparse feature analysis when the underlying latent dimension is unknown. BNP models have mainly been used for clustering [35] and sparse feature analysis [36], when the number of clusters or features is a priori unbounded and is also learned from data. Many research disciplines have benefited from BNP models, including psychiatry [37], social sciences [38,39], cancer research [40], or sports [41].
We note that, although being data driven, the method incorporates economic evidence about the relations between the different degrees of diversification in the exports and the expected distribution of the capabilities across the countries. In particular, we place an Indian Buffet Process [36] on the distribution of countries' capability endowment, appealing to a culinary metaphor to model the process of capability acquisition. Such flexible prior allows capturing different kinds of realities, from a world in which countries with few skills focus on different types of products, to a world in which less-developed countries have a strong overlap in export portfolios. Second, we also rely on the restricted Indian buffet process formulation from [42], which allows for a general marginal distribution over the number of active features per country.
The model not only allows us to characterize economic complexity of countries and products, but also to unfold the productive structure by isolating the key features associated with the competitive advantage in each of the exported products. With the capability interpretation, all measures (such as similarity, complexity, etc.) defined in the country or product space now admit a natural representation in the capability space. Moreover, we are able to identify features exclusively associated with less ubiquitous, i.e., more complex products. We refer to Fig 1 to motivate the subsequent discussion and to illustrate the (first order) interpretative power of the model. It depicts the tripartite country-capability-product network, where the capabilities (latent features) relating countries and products are learned from data.
Conceptually, the framework is related to the works in economic development theory which reside on the ideas of implicit capacities, such as [32]. In particular, the authors in [32] perform non-linear principal component analysis (exponential PCA) as a dimensionality reduction technique to extract the latent variables which best explain the export data. However, compared to these approaches, the deployment of the Bayesian non-parametric approach with a flexible prior comes with the unique advantage of combining both interpretability and accuracy. As illustrated in the "Model Properties" section and in S1 Supporting Information, the comparison with the other factorization techniques suggests that our approach significantly enhances interpretability of the latent factors in terms of both conciseness and precision of the clustering of products. Additionally, we show that our findings are quite robust and stable in the sense that the evaluation of the model across different product classifications and years does not produce any significant changes in the results.
The proposed model exhibits further favorable qualitative and quantitative properties. Specifically, it captures the empirical distribution of ubiquity and, importantly, diversity, when compared to other models which fail to capture adequately the tails of the empirical diversity distribution. An important feature of the model is that it is able to incorporate temporal dynamics and thus capture the transitions in the capability space. The temporal dynamics can be put in the context of the studies that examine the path-dependence diversification in countries and regions (see e.g. [3,43]). Consequently, our model indicates that export structures, being capability-dependent and thus difficult to change on short terms, have implications for growth and development. An interesting aspect of our model is the identification of latent features which are exclusively related to more diverse, i.e., developed and wealthier economies, as illustrated in the "Results" section. The network of countries, capabilities and products. A visualization of the tripartite network between countries, capabilities and products. In the middle of the network are the countries with node sizes proportional to their diversity. They are linked to the capabilities they have (the capability nodes are with uniform size). For visualization purposes, we link each capability k to the products for which B pk ! 0.3, see Methods Section for further details. The size of the product nodes is proportional to their ubiquity and they are colored according to the one digit SITC classification. Products for which there is no k such that B pk ! 0.3 are isolated, and thus are not shown in the

Methods
We consider two publicly available trade datasets, the SITC and HS databases (see Section A of S1 Supporting Information for detailed description of the data). The data matrix is binary and represents the Revealed Comparative Advantage (RCA) of countries [13]. Basically, an entry x cp in the country-product matrix X equals one when country c has a relative advantage at exporting product p, and zero otherwise. Even if the matrix is binary, we may use a Poisson likelihood because of the high degree of sparseness in data. Such approximation has already been adopted successfully in the case of recommendation systems [44].
The adjacency matrix M of the country-product network obtained from real trade data presents an approximately triangular sparse structure. This is in contrast to the block-diagonal structure as predicted by the Ricardian paradigm [45], which suggests that the wealthiest countries specialize in economic niches characterized by the production of only a few products with a high degree of specialization. The data thus suggests that countries have different diversity degrees in their export portfolios, and thus different trade strategies and skills. The main objective herein is to find an underlying representation that is easy to understand and able to capture this triangular structure in the input data.

Probabilistic matrix factorization
According to the proposed probabilistic framework, we assume that elements of M are realizations of random variables, whose dependencies are "captured" by latent features, i.e., capabilities which "dictate" relations between products. The elements of M are distributed according to a certain probability distribution f: where the C × K per-country (i.e., country-capability) matrix Z captures feature activation patterns, and the K × P per-product (i.e., capability-product) matrix B represents the effect of each latent feature on every product. For instance, if feature k is active for a certain country, all products having high values in vector B k• will be more likely to be exported by that country. Under a slight abuse of notation, (1) might also refer to point-wise probability distributions, e.g., M cp = f(Z c B p ). The approach may be interpreted as a probabilistic extension of non-negative matrix factorization where the number of latent features is not fixed a priori, both matrices are sparse, and soft-constraints on the expected latent sparsity structure are imposed through the prior.

Capability endowment: A culinary metaphor
In order to model the capability endowment of countries, we place a modified Indian Buffet Process (IBP) prior over the per-country matrix Z [42,46]. The standard IBP may be illustrated using a culinary metaphor which gives the name to the process [36]. Imagine an Indian restaurant whose buffet consists of infinitely many dishes arranged in a line. C customers enter the restaurant sequentially. The first customer takes a serving from each dish, stopping after a Poisson (α) number of dishes, as his plate becomes full. The c-th customer moves along the buffet and samples dishes in proportion to their popularity, serving himself with probability m k c , where m k is the number of previous customers who tried dish k. Having reached the end of all previously sampled dishes, the c-th customer then tries Poisson a c À Á new dishes. A matrix Z results from this experiment, such that z ck = 1 when customer c tries dish k. More generally, we say that a feature k is active for sample c if z ck = 1. This process is denoted by Z * IBP(α), where α is the mass parameter controlling the a priori activation probability of new features.
While an IBP may be appropriate to deal with the a priori unknown number of capabilities, the assumptions underlying the standard IBP are not flexible enough in this case. In fact, economic data suggests different capability distribution across countries, e.g., developed (diversified) countries should in general exhibit a higher number of capabilities. This is different to the standard IBP which implicitly assumes the same distribution Poisson(α) for the number of capabilities per country. We address this limitation via the Restricted IBP (R-IBP) formulation from [42]. The R-IBP allows for a general marginal distribution g over the number of latent variables (capabilities) per sample (country). In particular, we choose g to be a negative binomial in order to account for the over-dispersion of the number of capabilities per country. We further restrict the IBP by using a non-sparse bias term denoted as F0 which is active for all countries. Such term allows for the other latent features to be sparser and more interpretable [37,41]. Finally, we combine the R-IBP with the three-parameter formulation from [46] that allows for different sharing degrees of capabilities across countries (potentially yielding a power-law distribution of the number of hidden features), increasing de facto the potential number of hidden capabilities.

Infinite Doubly-sparse Poisson Factorization
In addition to the prior for the country-capability matrix Z, we specify a Gamma prior with shape parameter α B and mean parameter μ B for each element of the capability-product matrix B. We enforce sparsity in the per-product matrix B by choosing α B much smaller than one. Our model, that we refer to as Sparse Three-parameter Restricted-IBP (S3R-IBP), reads where Z c• denotes the c-th row of Z, and B •p denotes the p-th column of B. This model is fully specified by the a priori distribution g over the number of active features per sample, together with three hyperparameters: i) α, which is the same mass parameter from the standard IBP controlling the a priori total number of non-zero entries in matrix Z; ii) σ 2 [0,1) is the stability exponent which controls the power-law behavior of the model; iii) δ > −σ is the concentration parameter that affects the sharing degree of capabilities across countries. More details about the methodology can be found in Sections B-C of S1 Supporting Information and [47]. Altogether, our model has been designed to find highly-specific and easy-to-interpret latent features involving only a few products and are active for a small number of countries, in consistency with the economic literature [5]. Since exact computation of the posterior distribution for the latent variables is intractable, inference is performed using a Markov Chain Monte Carlo procedure, as described in Section D of S1 Supporting Information.

Results
In the following we present a qualitative evaluation of the model. In particular, we introduce the capability space and discuss the interpretative power of the model. We also address correlations in the capability space and their implications. Finally, we introduce temporal dynamics, with the idea to capture the dynamics in the acquisition of capabilities and thereby associated products over time. Quantitative evaluation is provided in the subsequent Section (Model Properties).

The capability space
By adopting the probabilistic learning framework, we are able to unfold the productive structure by isolating the key features associated with the competitive advantage in each of the exported products. As a result of the decomposition of the bipartite country-product network, complexity-related measures (such as country/product complexity, product similarity etc.) now admit a natural representation in the capability space.
In the first step, through the extraction of capabilities we are able to separate highly-diversified from less-diversified countries. Countries with low diversity are associated with less capabilities, and these capabilities are linked to more ubiquitous products. This observation is in line with [4,14], as it suggests that capabilities have different degrees of "complexity", which plays an important role in the production (export) of products. Moreover, the model produces homogeneous and concise clustering of products, i.e., products of similar ubiquity and SITC 1 digit classification are usually associated to the same capability.
This effect is also captured in Fig 1, where we connect countries to capabilities whenever Z ck = 1, and capabilities to products whenever B kp ! 0.3. The size of the country nodes is proportional to their diversity d c , defined as the number of products in which the country has comparative advantage. Similarly, the product node size is proportional to their ubiquity u p , which corresponds to the number of countries exporting that product having comparative advantage (see Section A of S1 Supporting Information). This implies that the differentiation of capabilities takes place according to both the level of diversification of countries and the elements required in the production of products. Table 1 captures the interpretable power of the model, by listing the capabilities found by the S3R-IBP model in 2010 (SITC classification of products). For each of the instantiated capabilities, we report the averaged number of countries endowed with the respective capability, as well as the top-5 products associated with it. Also, for each capability in the list, we single out a representative country, defined as the one with the smallest number of active capabilities among all countries possessing the capability in question. For each representative country, we also report the average number of active capabilities. Fig 2 presents a graph-based description of the capability space, which aims at capturing correlations between the capabilities. Two capabilities k and j are highly correlated (share an edge) if they co-occur frequently across countries, which can be measured for example by using the Jaccard similarity index between column vectors Z •k and Z •j . In fact, capability coactivation patterns may be potentially useful to define policy recommendations in a subsequent exploitation phase: as capabilities are related to each other, having one capability active increases the likelihood of having other capabilities active. For example, for a country possessing capability k, but not j (in the case when k and j are significantly correlated), the acquisition of j may come at a relatively small cost, thus justifying policy incentives in this direction.
In the following we summarize some general observations for the inferred correlations between capabilities in our model: 1) Capabilities F5 and F12 associated with electronics and chemicals respectively, are not highly correlated with the rest; 2) F4 (synthetic fabrics and fibers) and F2 (clothing items) are highly correlated, which is logical as both relate to the clothing industry, and are almost isolated from the rest of the graph; 3) F1-F5 are representative for developing countries, as explained in the next subsection; 4) F14 (metalworking machinetools) is loosely related to F9 (misc. rotating electric plant parts, control instruments, but also misc. rubber and plastic products) only; 5) F6 is particularly interesting, as it is associated with heterogeneous products such as baked goods and misc. edibles, but also metal containers, misc. articles of paper, and misc. organic surfactants. F6 has the role of a "capability hub" towards more advanced (i.e. less ubiquitous) capabilities such as F11 (vehicles), F9, F8 (iron and wood articles) and F10; 6) F3 and F1 (roughly farming and agriculture) seem to be the starting point of developing countries to acquire some more advanced skills, in particular captured by F6 and F10; 7) F13 (bio-chemical products such as medicines, vaccines and carbonyl compounds) seems to be the most difficult capability to be acquired, as it is disconnected from the "core" of the network and is related only to F7 (specialized electronics, heating and cooling Notes: From left to right, " m k is the averaged number of countries having latent feature k active, we list the top-5 products with highest weights B kp ; a representative country is the country that has the least number of capabilities among those possessing feature k. " J c is the averaged number of active features for each representative country c. https://doi.org/10.1371/journal.pone.0200822.t001 Economic complexity unfolded: Interpretable model for the productive structure of economies machines, and parts of office machines), itself being also advanced and relatively isolated from the rest of the network.

The Meta-Capability Space
In order to further analyze the relationships between the inferred capabilities, we again apply our S3R-IBP procedure over the inferred capability activation matrix Z as input data. Such a deep structure, i.e. using two-layer IBP, has already been explored in [48]. As before, we use a bias term, denoted by M-F0, which is active for all countries. In addition to M-F0, our approach extracts another meta-feature (meta-capability), M-F1, which is active for 46.19 countries on average. The list of countries is provided in Table 2. Both M-F0 and M-F1 assign Economic complexity unfolded: Interpretable model for the productive structure of economies different weights to each capability from the first layer. The countries with an active M-F1 are those that have more capabilities in the original model and, in general, are more diversified. Hence, we interpret M-F1 as the meta-feature that separates countries based on their level of diversification.
These two meta-features furthermore divide the capabilities learned in the original model into three disjoint sets. The first set contains the latent features whose weight is either zero or insignificant for M-F1, F0 to F5 (highlighted in white in Fig 1). These are the features that define countries which deal with goods whose production mostly relies on the presence of physical factors (i.e. natural resources). Hence, it makes sense that more diversified countries do not have stronger weights than less diversified countries for these features. While diversified countries might have a non-zero RCA in the products associated with these features, they are not necessarily exploiting them better than the less diversified countries.
The second set consists of a single feature, F6 (highlighted in light yellow in Fig 1), which has a high weight in both M-F0 and M-F1. This feature is associated with products exported by both groups of countries. However, the diversified countries do trade these products more efficiently than the less diversified ones. We can interpret the products associated with this feature as being on the capability frontier.
The last set includes the remaining features (highlighted in gold in Fig 1), with weights in M-F0 which are negligible compared to their weights in M-F1. These features are associated with less ubiquitous products, for example chemicals and complex machinery (see Table A in S1 Supporting Information), which are only exported by diversified (i.e. economically complex) countries. To put the findings in geographic context, in Fig 3 we plot a heat map of the world according to the presence of features associated with meta feature M-F1.

Countries in the capability space
Given the factorization (1), it is possible to define distances between countries based on their capability endowment, i.e., we may define a similarity measure in the capability space. Effectively, by comparing the latent representations of two countries in the capability space, we can say in which way two countries are similar. In this sense, the framework provides a complementary representation to the approaches which measure similarity directly in the product space.
For illustration of the concept, we use a simple similarity measure based on the weighted Euclidean distance between the vectors representing the capability endowments of individual countries, where the weights are inversely proportional to the capabilities ubiquity. To be precise, weights are proportional to (1 − π k ) where π k is the proportion of countries having capability k. Intuitively, two countries are more similar if they share a less ubiquitous capability. The mapping to the two-dimensional plane is performed via non-classical multidimensional scaling [50]. Fig 4 depicts the computed distances between countries in 1995 and 2010. In both years we observe a core of closely spaced countries with similar underdeveloped productive structure, i.e. with minimal capability endowment. Moreover, in 1995 we notice a relatively strong polarization in the world due to large differences between the most and least diverse economies. Compared to 1995, the situation in 2010 is somewhat less polarized, as some of the less diverse countries gradually acquired capabilities over the years, increased their diversification and consequently narrowed the gap to the most diverse countries.
The results from Fig 4 are in line with a growing body of empirical studies which suggest that knowledge diffuses among neighbors and/or that shared history influences the future development [51][52][53]. The figure also reveals interesting relations between the productive structures of individual countries within regions and across those having different economic policy ideologies. For instance, we can easily separate the Asian economies into two groups, one in the left periphery and another at the core of the Figure. The first group are countries which exhibited significant economic development over the examined years, whereas the countries in the other group stagnated. Typical countries from the first group are the "Asian Tigers" Hong Kong, Singapore and South Korea (note that Taiwan is also in the same group, but it does not satisfy our criteria for entering the dataset), placed in the top left corner of Fig  4a and 4b. A distinct characteristic of these economies was the exploitation of previously unused industrial capabilities for developing information and communication based technologies (ICT) that contributed to their economic growth rates. Fig 4 also presents interesting outliers, such as the Republic of Ireland, whose productive structure differs substantially from those of their geographic neighbors. Interestingly, Ireland was also dubbed as Celtic Tiger due to the economic characteristics and the implementation of policies similar to the Asian Tigers. Fig 5 depicts the positioning of the Tigers in the capability space, based on their inferred capabilities. All four countries share two capabilities, F0 (basic capability present in all countries), and F5, which is exactly associated to ICT intensive products, as shown in Fig 1 and Table 1. This example suggests that our model can be effectively utilized as a discriminative framework to explain country economic dynamics based on neighborhood relationships in the capability space.
In general, it would be relevant to interpret these findings from the perspective of evolutionary economic geography, see e.g. [21,43,54].  Economic complexity unfolded: Interpretable model for the productive structure of economies

Products in the capability space
Hidalgo et al. [6] propose a similarity measure between goods by looking at the probability that they are co-exported. To quantify this similarity it is assumed that if two goods share most of the requisite capabilities (where the reference to the capabilities is implicit), the countries that export one will also export the other. By analogy, it is expected that goods that do not share many capabilities are less likely to be co-exported. The proximity measure intents to infer the similarity between the capabilities required by a pair of goods by capturing the conditional probability that a country that exports product p will also export product p 0 . In the absence of explicit reference to the underlying capabilities, proximity is inferred directly from the entries in the M matrix as where k p,0 is the ubiquity of product p. Since conditional probabilities are not symmetric, in (3) the minimum of the probability (which is inversely related to the maximum of the ubiquity) of exporting product p, given p 0 and the reverse, is considered. In our framework, which provides a full probabilistic description of the trade flows with explicitly instantiated capabilities, product similarity may naturally be defined in the capability (i.e. the feature space). Here, a starting point is the representation of the products by the column vectors of the matrix B in the decomposition (3), which interprets products as points in the capability space. Based on this representation, various "similarity" measures between products may be introduced. In the first approximation, the framework offers an efficient clustering of products by adding a threshold in the vector representation. As depicted in Fig 1, this rather simple methodology produces a compact and consistent representation of products. While products of similar ubiquity and SITC one digit classification are often associated to the same capability, our approach is able to furthermore differentiate products inside this broad classification. For instance, capabilities F1 and F3 separate vegetable and fruits from animal products, which appear together in the SITC 0 as "Food and live animals group". As such, our model may provide alternatives to the SITC classification of products, based on their positioning in the capability space.
A more refined approach relying on a high-dimensional embedding of the per-product matrix B could bring additional insights on the relations between different products.

Temporal dynamics
A large volume of studies provides strong evidence supporting the notion that diversification in countries and regions is path dependent. The authors in [3,43] showed how countries expand their mix of exports around the products in which they already established a comparative advantage. Neffke et al. [21] used information on product portfolios to show that regions tend to diversify into new industries related to existing local industries. Refs. [24][25][26][27], among others, used measures of technological relatedness between patent classes to show that countries and cities develop new technologies related to existing local technologies.
We complement these studies by introducing a temporal dynamics in our probabilistic model. This is performed by applying the S3R-IBP model to the aggregated SITC database between 1964 and 2010. We assume a shared set of latent features over the years, and independent feature activations for each year, e.g., USA in 1965 is counted as a different country from USA in 1986. To speed up mixing, we initialize the features to the values obtained from the analysis of 2010 data, listed in Table 1, and learn the feature activation values for all years. Indonesia. All these countries were able to increase their number of active features over the years, although their dynamics can be attributed to different economic factors. The latent features may be used to infer information for the internal situation of the countries' economic system at different time instants. In particular, the growth of Chile can be attributed to the activation of features associated to natural resources, whereas the growth of Indonesia is due to acquisition of features related to clothing and electronic products. Interestingly, Chile is a well known natural resource producer [6], while the start of Indonesia's growth coincides with the period when the country opened its economy and got an influx of foreign direct investments.
Finally, Egypt is a country that has very unique dynamics. There is a sudden fall in the activity of features F2 and F4 (both associated to clothes) and F1 (vegetables and fruits) at the end of the 1970s, after which the number of active features remained steady at its minimum, until the beginning of the 1990s. The steady state period corresponds to years of political and economic instability [55]. At the beginning of the second growth, which corresponds to a period of political reforms that made Egypt a more open economy, the country regained the activity in F1, F2 and F4 quickly, and even adding vehicles to its export basket (F11).
In addition to monitoring the temporal dynamics of each individual country, it is also interesting to study the feature transitions globally, as a simple transition model. This is possible given the discrete nature of the capability endowments, as features can be either active or inactive (binary vector). To better account for time, we retrained our model assuming a Markovian dependency across capability endowment vectors Z c• (t) for each country c over the years. The Markovian assumption works as a smoothing factor of the country trajectories in the capability space. In fact, Z c• (t) can be interpreted as the latent state of country i at time t, which indicates the set of features that are active for that country at that particular point in time. Let G = {M, E} be a hidden network, where M denotes all possible latent states (all possible values for vector Z c• (t) and E refers to all directed edges connecting any two elements belonging to M (full network). For each country i, the sequence [Z c• (1),Z c• (2), . . .,Z c• (T)] corresponds to a path within such network. By monitoring which are the most common nodes and most visited edges in the network, we can gain new insights on the developing mechanisms of countries. An illustration of the concept is provided in Fig 7 corresponding to the first steps in countries' development.

Possible effects on policy recommendations
The product space analysis in [3] suggests that there is a tendency for countries to develop RCA close to products for which RCA was already developed. According to the terminology in [3], a product for which a country has developed RCA is called an occupied product (O), and unoccupied product (U) otherwise. By considering the transition in the product space which takes unoccupied products to occupied ones (U ! O), the authors in [3] provide evidence that countries perform structural transformations by jumping from occupied products to nearby ones. The analysis involves the definition of a quantity referred to as density, which is defined as a weighted fraction of the space which is occupied from the point of view of a product in a particular country. Similarly, one can consider the probability for a product to develop given that the closest developed product is at a certain proximity, as defined by the authors in [6].
By employing our probabilistic approach, we connect to the observations in [3] and [6]. Specifically, we define a simple quantity that can be defined from the inferred country-capability-product decomposition. In particular, the estimated likelihood that country c can export product p given the country's capability vector Z c• , can be used to define two sets of products. The first set includes those products for which the country has high likelihood according to our model, but no RCA has been developed yet. Conversely, the products for which that country has low likelihood but has already comparative advantage are those that are at risk to be lost in the future, as inferred from the capability decomposition. We note that this line of reasoning aligns with the concept of latent comparative advantage, which has been introduced in [32], and aims to assess how well countries are matching their potential implied by the latent Economic complexity unfolded: Interpretable model for the productive structure of economies endowment, as well as to identify products for which the latent advantage is not yet revealed (extensive margin).
To investigate these hypotheses, and to provide complementary insights to the works in [3], [6] and [32], we monitored the products with RCA = 1 and highest probability of deactivation (transition O ! U as described in [3]), as well as the products with RCA = 0 and highest probability of activation (transition U ! O as described in [3]) in 2010. We thereby estimated the percentage of those products that get activated in the next 4 years. This percentage ranges within 32 − 44% (the variation depends on which threshold parameter is chosen to transform the soft-predictions of our model into binary RCA values), in contrast to a random activation of 7% (when choosing products at random that have RCA = 0). Tables 3 and 4 present the top-5 products with RCA = 1 and highest probability of deactivation, as well as RCA = 0 and highest probability of activation respectively, for the subset of countries studied in the "Temporal dynamics" subsection.
In each country, note that the exported products at risk are those for which the required capabilities are not present (in a probabilistic sense, and with respect to a threshold). For instance, in Indonesia and Egypt these are the exported chemicals (or products used in chemistry) whereas in Chile the products with the highest chance to be deactivated are machineries. On the other hand, the products that the countries are most likely to occupy in the future (i.e. to add to their export baskets) are related to the capabilities already acquired by the respective countries. Note that the most likely new products for Chile are miscellaneous manufactured goods, which are related to feature F8. This is a capability in which Chile had a high weight in the past (see Fig 6), and thus, might be more likely to get it active in the future.
As a final note, we suggest that the temporal model and the finite state machine interpretation might provide a step towards the more fundamental understanding of the typical evolution paths of developing countries, ideally followed by possible policy recommendations in the light of these findings. The evaluation of the effects of specific policy recommendations, which may, for example, come in the form of subsidies or R&D incentives that the countries may provide to stimulate the production of specific products, is not covered by this work.

Model properties
To describe the general properties of our model, in Fig 8a and 8b we depict the empirical and inferred country-product adjacency matrices (M matrix) for 2010. The countries are ordered according to their decreasing diversity d c , whereas products are sorted according to their decreasing ubiquity u p . It is evident that the model successfully reproduces the triangular sparse structure present in the empirical data matrix . Fig 8c and 8d depicts the inferred country-capability matrix Z, and capability-product matrix B for the same year. The countrycapability matrix also exhibits a sparse triangular structure, suggesting that more diversified countries also have more capabilities. On the other hand, the weights in the capability-product matrix are block sparse with varying sparsity among the rows (capabilities), implying that there are certain capabilities which are associated with more products. Furthermore, in order to evaluate the performance of our model, in Fig 8a-8f we compare the inferred cumulative distribution of our S3R-IBP model with two other models. The first model used in the comparison is a simple binomial model described in [4], which assumes binary matrices Z and B, a finite number of capabilities K, and uniform activation probabilities r and q for all country-capability and capability-product combinations. The parameters in this baseline model are chosen to best fit a functional form connecting a country's diversification to the average ubiquity of its products [5]. The second model we compare with is the IBP with sparse Gamma prior on the features with a bias term (S-IBP).
We observe that all models adequately reproduce the distribution of ubiquity. However, there are large differences when comparing the models' ability to explain the distribution of diversity. In particular, the baseline model predicts an almost uniform distribution, whereas both S-IBP and S3R-IBP fail to capture the lowest values of the empirical diversity distribution (the vertical line for the S-IBP and S3R-IBP correspond to countries only having the bias term active defined in the Methods Section). Nevertheless, our model significantly outperforms S-IBP as it is able to capture the distribution for higher diversity values, i.e., S-IBP predicts a lower number of countries with high number of exports.

Quantitative evaluation
We perform a quantitative evaluation of our model in terms of predictive accuracy, interpretability, strength and ability to capture the row marginal distribution of the input, using data from 2010. Simulations are run for 10 different train-test splits with a proportion of 90-10% entries. The burn-in period for the MCMC inference algorithm is 30,000 iterations, and results are averaged using the last 1,000 posterior samples. Table 5 compares our model against probabilistic matrix factorization (MF) [56], non-negative MF (NMF) [57], the standard Indian Buffet Process (IBP), and the sparse IBP (S-IBP) which uses α B < 1 in terms of mean and standard error of the test log-perplexities, and topic coherence.
Accuracy. We use perplexity to measure predictive accuracy, i.e. the harmonic mean of the inverse test log likelihood. All the models present similar perplexity, except the S-IBP model, in which the sparseness restriction degrades its performance significantly. Our S3R-IBP has the same sparse restriction, but it has a more flexible prior that it is able to compensate the penalty in perplexity and perform close to to the non-sparse models, i.e. MF, NMF and IBP. The combination of the negative binomial and the stable-Beta process allows to match the perplexity performance of non sparse methods, but keeping the results interpretable, as we illustrate in the next paragraphs.
Interpretability. We test the robustness of our results by performing comparison with with: i) results from Singular Value Decomposition (SVD) which serves as a baseline factorization method; ii) results corresponding to the year 1995; and iii) results estimated by using the Harmonized System (HS) rev. 1992 classification disaggregated to six digit level (see Section E of S1 Supporting Information for more details). The comparison with the SVD (provided in Table B in S1 Supporting Information), suggests that our approach significantly enhances interpretability of the latent factors in terms of both conciseness and precision of the clustering of products. Additionally, our findings are quite robust and stable in the sense that the evaluation of the model across different classifications and years does not produce any significant changes in the results, as detailed in Section E of S1 Supporting Information.
In order to assess semantic quality, we rely on the coherence [58], which is an often-used metric in topic modeling literature. The coherence C k of a feature is defined as where v k i is the i-th product with highest weight in factor k, and M represents how many top products should be evaluated (here we take M = 20 products). Also R(x) refers to the number of countries exporting product x, and R(x, y) is the number of countries exporting both products x and y. The closer coherence is to zero, the better.
The S3R-IBP outperforms not only NMF, but the IBP and S-IBP as well by far, as shown in Table 5, making it specially suitable for data exploration in high-dimensional count data scenarios. The non-sparse methods present a very low coherence, as expected. Intuitively, coherence measures the degree of homogeneity and agreement within a latent factor, and thus, will always degrade for non-sparse projections.
Sparsity structure. We revisit the evaluation of the Baseline model, the S-IBP and the S3R-IBP model's fitting ability by comparing its Q-Q plot of the predicted diversification and ubiquity distribution versus the empirical distribution. The S3R-IBP and the S-IBP are by far superior than the Baseline model in fitting the distribution of diversity and ubiquity. Moreover, Fig 9a and 9b reveals even more the differences between the S3R-IBP and S-IBP in their performance regarding the diversity distribution-it is clear that the S3R-IBP adequately captures the right tail of the distribution, whereas the S-IBP does not. Finally, both models exhibit promising results modeling the distribution of ubiquity (Fig 9d and 9e). We estimate a twosample Kolmogorov-Smirnov test for each predicted-real distribution combination as a means to statistically test the differences between them. The results are in line with the conclusions from the cumulative distribution and the Q-Q plot comparisons.

Discussion and future work
This work complements the studies in the field of economic complexity by introducing a probabilistic framework which leverages Bayesian non-parametric techniques. The framework Economic complexity unfolded: Interpretable model for the productive structure of economies aims at extracting the dominant features (capabilities) behind the comparative advantage in exported products. Based on economic evidence and trade data, we place a restricted Indian Buffet Process on the distribution of countries' capability endowment, appealing to a culinary metaphor to model the process of capability acquisition. We further extend the approach by introducing temporal dynamics in the acquisition of capabilities and introduction of new products in the countries' export portfolios. The overall approach comes with an adequate level of interpretability, as it produces a concise and economically plausible description of the instantiated capabilities. The introduced framework offers a unifying qualitative and quantitative assessment of a country's productive structure. The factorization of the country-product matrix provides a descriptive view of the internal composition of each economy, and a direct way of comparison between different economies.
Several research directions remain for future work. First, it would be interesting to interpret the implications of the model from the perspective of evolutionary economic geography, and economic learning in general, for example with emphasize on the inter-industry and interregional learning channels. Second, we expect that the proposed framework would also be useful in the context of unrelated diversification in economic development (as in e.g. [59]) or, for example, in the study of industrial dynamics in the Varieties of Capitalism (VoC) framework [22].
Finally, there remains the question of how the predictions of the model may be put forward as policy recomendations. The first step in this direction is the extraction of relevant information than can be used by a particular country for developing industrial policies, for example by focusing on the easiest-to-be-acquired capabilities, or by following another country's Economic complexity unfolded: Interpretable model for the productive structure of economies development path in the capability space. In this context, our findings are only to be interpreted as the first step towards possible policy recommendations. The evaluation of the effects of specific policy recommendations, which may, for example, come in the form of subsidies or R&D incentives that the countries may provide to stimulate the production of specific products, is not covered here, but presents an important direction of future work.