Weighted Multiplex Networks

One of the most important challenges in network science is to quantify the information encoded in complex network structures. Disentangling randomness from organizational principles is even more demanding when networks have a multiplex nature. Multiplex networks are multilayer systems of nodes that can be linked in multiple interacting and co-evolving layers. In these networks, relevant information might not be captured if the single layers were analyzed separately. Here we demonstrate that such partial analysis of layers fails to capture significant correlations between weights and topology of complex multiplex networks. To this end, we study two weighted multiplex co-authorship and citation networks involving the authors included in the American Physical Society. We show that in these networks weights are strongly correlated with multiplex structure, and provide empirical evidence in favor of the advantage of studying weighted measures of multiplex networks, such as multistrength and the inverse multiparticipation ratio. Finally, we introduce a theoretical framework based on the entropy of multiplex ensembles to quantify the information stored in multiplex networks that would remain undetected if the single layers were analyzed in isolation.

In this supplementary material we give background information regarding the manuscript "Weighted Multiplex Networks". The material is organized as follows. In Section II we present additional information about the multiplex datasets analyzed in the manuscript and provide the details of the statistical analysis of the data. In Section III we describe the theory of weighted multiplex ensembles, focusing in particular on the uncorrelated and the correlated weighted multiplex ensembles used in the main body of the manuscript. Finally, in Section IV we provide background information that explains how Fig. 4 in the main text of the manuscript was obtained.

A. The two datasets
We have considered the American Physical Society (APS) research data, which is organized into two main datasets:
• Article metadata: for each article the metadata includes DOI, journal, volume, issue, first page and last page, article id and number of pages, title, authors, affiliations, publication history, PACS codes, table of contents, heading, article type, and copyright information.
• Citing article pairs: this dataset consists of pairs of APS articles in which the first article cites the second. Each pair is represented by a pair of DOIs.
In the APS metadata an author is usually identified by given name, middle name, and surname. In different articles, the same author can appear with his/her full name or with his/her initials. To deal with this issue, we decided to identify a specific author with the initials of his/her given name and middle name and with his/her full surname. We restricted our analysis to the article metadata and citing article pairs that relate only to PRL and PRE. The total number of PRL articles is 95,516 and the total number of PRL authors is 117,412. The total number of PRE articles is 35,944 and the total number of PRE authors is 36,171. The number of authors that published both in PRE and PRL is equal to 17,470.
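The identification rule described above (initials of given and middle names plus full surname) can be illustrated with a short Python sketch. This is only an illustration, not the code used for the analysis: the function name `author_key` and the handling of dots and hyphens are assumptions.

```python
def author_key(given, middle, surname):
    """Build an 'initials + full surname' key for an author.

    Hypothetical helper: the treatment of dots and hyphens is an assumption,
    not a rule taken from the APS metadata documentation.
    """
    initials = []
    for part in (given, middle):
        # A name part may be empty, a full name, or already an initial ("J.").
        for token in part.replace(".", " ").replace("-", " ").split():
            initials.append(token[0].upper() + ".")
    return " ".join(initials + [surname.strip()])


# The same author written with full names or with initials maps to one key.
key_full = author_key("John", "Alan", "Smith")
key_init = author_key("J.", "A.", "Smith")
```

With this rule two distinct authors who share initials and surname are merged into a single node, which is the price paid for joining the different spellings of one author.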
Among the papers published in PRE and PRL, we focused our study only on those with a number of authors $n_p \le 10$. This excludes most of the experimental high-energy collaborations, which are typically characterized by a number of authors of a different order of magnitude. We imposed this cut-off on the maximum number of authors per paper to avoid biases due to very large collaborations. Given the cut-off, our study is limited to 35,766 PRE articles (99.5%) and 35,205 PRE authors (97.3%) on the one hand, and 89,245 PRL articles (93.4%) and 92,436 PRL authors (78.7%) on the other. The intersection of these two datasets includes 16,207 authors (i.e., 92.8% of the previous intersection).
We analyzed two types of interaction between APS authors: scientific collaborations and citations, with weights defined as follows.
• Collaborations: two authors are connected if they co-authored at least one paper. The collaborative interaction between author $i$ and author $j$ is defined as in [1,2], i.e., the undirected adjacency matrix element $a_{ij}$ is given by
$$a_{ij} = \sum_{p \in I} \frac{\delta_i^p \, \delta_j^p}{n_p - 1}, \qquad i \neq j,$$
where the index $p$ indicates an article in the dataset $I$, $n_p$ indicates the number of authors of article $p$, and $\delta_i^p = 1$ if node $i$ is an author of article $p$ and $\delta_i^p = 0$ otherwise. The resulting network is undirected and without self-loops.
• Citations: two authors are connected by a directed link if one author cites the other one. In this case, the element $a_{ij}$ of the directed adjacency matrix, indicating how many times node $i$ cites node $j$, is given by
$$a_{ij} = \sum_{p, p' \in I} \delta_i^p \, b_{p,p'} \, \delta_j^{p'},$$
where $b_{p,p'} = 1$ if article $p$ cites article $p'$, and $b_{p,p'} = 0$ otherwise. Moreover, $\delta_i^p$ is defined as above and indicates whether $i$ is an author of article $p$ ($\delta_i^p = 1$) or not ($\delta_i^p = 0$). The resulting network is directed and with self-loops.

We constructed the following two duplex networks:
1. CoCo-PRL/PRE: collaborations among PRL and PRE authors. The nodes of this multiplex network are the authors who published articles both in PRL and PRE (i.e., 16,207 authors). These nodes are connected in layer [...]

In order to characterize the overlap existing between the links of the multiplex networks, we define the total overlap $O^{\alpha,\alpha'}$ between layer $\alpha$ and layer $\alpha'$ as the total number of pairs of nodes $(i, j)$ connected both in layer $\alpha$ and in layer $\alpha'$, i.e.,
$$O^{\alpha,\alpha'} = \sum_{(i,j)} \theta(a_{ij}^{\alpha}) \, \theta(a_{ij}^{\alpha'}),$$
where $\theta(x) = 1$ if $x > 0$ and $\theta(x) = 0$ otherwise. This definition can be extended to weighted multiplex networks by defining the total weighted overlap $O^{(w)\alpha,\alpha'}$ between layer $\alpha$ and layer $\alpha'$ as [...], where $w^{\alpha}_{\max}$ is the maximal weight in layer $\alpha$. Table I reports details on the total overlap and total weighted overlap, and indeed shows that our multiplex networks are characterized by a significant overlap of links.

The nodes $i = 1, 2, \ldots, N$ of the multiplex networks have degree $k_i^1$ in layer 1 and $k_i^2$ in layer 2. Moreover, we can define the multidegree $k_i^{\vec m}$ of a generic node $i$ as the number of multilinks $\vec m$ incident on it. Since we always have
$$\sum_{\vec m} k_i^{\vec m} = N - 1,$$
we can restrict the analysis to multidegrees with $\vec m \neq \vec 0$. Figs. 1 and 2 show that the degrees and multidegrees of both the CoCo-PRL/PRE and the CoCi-PRE multiplex networks are broadly distributed.
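As an illustration of the quantities introduced in this subsection, the following Python sketch builds the two kinds of weights, counts the total overlap, and computes the multidegrees of an undirected duplex. It is a minimal sketch under stated assumptions, not the code used for the analysis: the function names and input formats are hypothetical, and the collaboration weight assumes the $1/(n_p - 1)$ normalization of Refs. [1,2].

```python
from collections import defaultdict
from itertools import combinations


def collaboration_weights(papers, max_authors=10):
    """papers: list of author lists. Returns {(i, j): weight} with i < j."""
    w = defaultdict(float)
    for authors in papers:
        authors = sorted(set(authors))
        n_p = len(authors)
        if n_p < 2 or n_p > max_authors:
            continue  # single-author papers add no link; large papers are cut off
        for i, j in combinations(authors, 2):
            w[(i, j)] += 1.0 / (n_p - 1)  # assumed normalization, as in [1,2]
    return dict(w)


def citation_weights(papers, citations):
    """papers: {paper_id: author list}; citations: iterable of (citing, cited)."""
    w = defaultdict(int)
    for p, q in citations:
        for i in set(papers[p]):
            for j in set(papers[q]):
                w[(i, j)] += 1  # directed count; self-loops (i == j) are kept
    return dict(w)


def total_overlap(layer_a, layer_b):
    """Number of node pairs linked in both layers (layers share key ordering)."""
    return len(set(layer_a) & set(layer_b))


def multidegrees(layer_1, layer_2, nodes):
    """Multidegrees of an undirected duplex whose link keys satisfy i < j."""
    k = {i: {(1, 0): 0, (0, 1): 0, (1, 1): 0} for i in nodes}
    for i, j in combinations(sorted(nodes), 2):
        m = (int((i, j) in layer_1), int((i, j) in layer_2))
        if m != (0, 0):
            k[i][m] += 1
            k[j][m] += 1
    return k


collab = collaboration_weights([["a", "b"], ["a", "b", "c"]])
cites = citation_weights({1: ["a", "b"], 2: ["b", "c"]}, [(1, 2)])
layer_1 = {("a", "b"): 1.0, ("a", "c"): 1.0}
layer_2 = {("a", "b"): 2.0, ("b", "c"): 1.0}
k = multidegrees(layer_1, layer_2, ["a", "b", "c"])
```

In the toy duplex at the end, node `a` has one multilink (1,1) and one multilink (1,0); together with its zero multilinks (0,0) the multidegrees sum to N − 1 = 2.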
Moreover, in both duplex networks, the degrees each author has in the two layers are positively correlated, as indicated by the Pearson, Spearman, and Kendall correlation coefficients between degrees (see Tables II and III). Finally, multidegrees in the multiplex networks are also correlated, as indicated by their Pearson, Spearman, and Kendall coefficients (see Tables IV and V). These correlation coefficients are calculated both for degrees and for multidegrees. In what follows, we give the definitions in the case of two degree sequences. The extension to multidegree sequences is straightforward. The Pearson correlation coefficient $r$ between two degree sequences $\{k_i^\alpha\}$ and $\{k_i^\beta\}$ is given by
$$r = \frac{\langle k^\alpha k^\beta \rangle - \langle k^\alpha \rangle \langle k^\beta \rangle}{\sigma_\alpha \sigma_\beta},$$
where $\sigma_\alpha = \sqrt{\langle k^\alpha k^\alpha \rangle - \langle k^\alpha \rangle^2}$. The Spearman correlation coefficient $\rho$ between the degree sequences $\{k_i^\alpha\}$ and $\{k_i^\beta\}$ in the two layers $\alpha$ and $\beta$ is given by
$$\rho = \frac{\langle x^\alpha x^\beta \rangle - \langle x^\alpha \rangle \langle x^\beta \rangle}{\sigma(x^\alpha) \, \sigma(x^\beta)},$$
where $x_i^\alpha$ is the rank of the degree $k_i^\alpha$ in the sequence $\{k_i^\alpha\}$, and $x_i^\beta$ is the rank of the degree $k_i^\beta$ in the sequence $\{k_i^\beta\}$.
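The three coefficients reported in Tables II to V can be computed from their definitions with a few lines of Python. The sketch below is illustrative only and uses hypothetical function names. Two choices are assumptions: tied degrees receive the average of their ranks in the Spearman coefficient (one common convention for the degeneracy of ranks), and the Kendall coefficient is the tie-corrected form built from the numbers of concordant and discordant pairs.

```python
import math


def pearson(x, y):
    """Pearson r from the moments <xy>, <x>, <y>, <x^2>, <y^2>."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum(a * b for a, b in zip(x, y)) / n - mx * my
    sx = math.sqrt(sum(a * a for a in x) / n - mx * mx)
    sy = math.sqrt(sum(b * b for b in y) / n - my * my)
    return cov / (sx * sy)


def average_ranks(values):
    """Ranks 1..N; tied values share the average of their positions (assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for idx in order[start:end + 1]:
            ranks[idx] = (start + end) / 2 + 1
        start = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def kendall_tau(x, y):
    """Tie-corrected Kendall tau: (n_c - n_d) / sqrt((n_0 - n_1)(n_0 - n_2))."""
    n = len(x)
    n_c = n_d = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                n_c += 1
            elif s < 0:
                n_d += 1

    def tie_term(v):
        counts = {}
        for a in v:
            counts[a] = counts.get(a, 0) + 1
        return sum(c * (c - 1) / 2 for c in counts.values())

    n_0 = n * (n - 1) / 2
    return (n_c - n_d) / math.sqrt((n_0 - tie_term(x)) * (n_0 - tie_term(y)))


k_alpha = [1, 2, 2, 3]
k_beta = [1, 3, 2, 4]
tau = kendall_tau(k_alpha, k_beta)
```

The pairwise loop in `kendall_tau` costs O(N²) operations, which is fine for an illustration but slow for sequences of tens of thousands of nodes.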
The Spearman coefficient suffers from the problem that the ranks of the degrees in each degree sequence are not uniquely defined, because the degree sequence has in general some degeneracy. Thus, the Spearman correlation coefficient is not a uniquely defined number. The Kendall $\tau$ correlation coefficient between the degree sequences $\{k_i^\alpha\}$ and $\{k_i^\beta\}$ in the two layers $\alpha$ and $\beta$ is a measure that takes into account the sequences of ranks $\{x_i^\alpha\}$ and $\{x_i^\beta\}$. A pair of nodes $i$ and $j$ is concordant if their ranks have the same order in the two sequences, i.e.,
$$(x_i^\alpha - x_j^\alpha)(x_i^\beta - x_j^\beta) > 0,$$
and discordant if the order is reversed. The Kendall $\tau$ is defined in terms of the number $n_c$ of concordant pairs and the number $n_d$ of discordant pairs, and is given by
$$\tau = \frac{n_c - n_d}{\sqrt{(n_0 - n_1)(n_0 - n_2)}},$$
where $n_0 = \frac{1}{2} N (N - 1)$ and the terms $n_1$ and $n_2$ account for the degeneracy of the ranks and are given by
$$n_1 = \frac{1}{2} \sum_n u_n (u_n - 1), \qquad n_2 = \frac{1}{2} \sum_n v_n (v_n - 1),$$
where $u_n$ is the number of nodes in the $n$th tied group of the degree sequence $\{k^\alpha\}$, and $v_n$ is the number of nodes in the $n$th tied group of the degree sequence $\{k^\beta\}$.

Here we report the weighted network properties of the single layers of our multiplex networks. In general, the average strength $s_k^\alpha$ of nodes with degree $k$ in layer $\alpha$ and the average inverse participation ratio $Y_k^\alpha$ of nodes with degree $k$ in layer $\alpha$ are described by the functional behavior
$$s_k^\alpha \propto k^{\beta_\alpha}, \qquad Y_k^\alpha \propto k^{-\lambda_\alpha}.$$
We have considered both the CoCo-PRL/PRE dataset and the CoCi-PRE dataset and fitted $s_k^\alpha$ and $Y_k^\alpha$ according to this expected power-law behavior.

The exponents shown in Table VI and Table VII have been computed with the "regression" function of Matlab [3]. This function performs a multiple linear regression and gives the 95% confidence interval for each coefficient. In the tables, we also show the coefficient of determination $R^2$, which indicates how well the power-law trend fits the data.
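For readers who want to reproduce this kind of fit outside Matlab, a power law can be fitted by ordinary least squares on log-transformed data. The Python sketch below is an illustration only: the function names are hypothetical, it does not compute the 95% confidence intervals returned by the Matlab routine, and the inverse participation ratio is taken in its standard form, the sum over a node's links of $(w_{ij}/s_i)^2$, which is an assumption since the definition is not repeated in this supplement.

```python
import math


def strength_and_participation(weights):
    """Strength s and inverse participation ratio Y of one node.

    weights: the weights of the links incident on the node.
    Y = sum_j (w_ij / s)^2 is the standard definition (assumed here).
    """
    s = sum(weights)
    return s, sum((w / s) ** 2 for w in weights)


def fit_power_law(k, y):
    """Least-squares fit of log y = log c + b log k.

    Returns (b, c, r2): exponent, prefactor, and the coefficient of
    determination of the linear fit in log-log space.
    """
    lx = [math.log(v) for v in k]
    ly = [math.log(v) for v in y]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((a - mx) ** 2 for a in lx)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(lx, ly))
    ss_tot = sum((b - my) ** 2 for b in ly)
    return slope, math.exp(intercept), 1 - ss_res / ss_tot


# Synthetic check: data generated with exponent 1.5 and prefactor 3.
degrees = [1, 2, 4, 8, 16]
strengths = [3 * v ** 1.5 for v in degrees]
beta, c, r2 = fit_power_law(degrees, strengths)
```

A decaying quantity such as the inverse participation ratio is fitted in the same way; the exponent $\lambda$ is then minus the returned slope.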
As shown by Fig. 3 and Table VI, the CoCo-PRL/PRE multiplex network is characterized by a linear behavior of average strength as a function of the degree of nodes. Fig. 4 and Table VII show that the CoCi-PRE multiplex network is characterized by a linear behavior of average strength as a function of the degree of nodes in the collaboration network, and by a super-linear behavior in the citation network.

E. Statistical analysis of the properties of multilinks in the CoCo-PRL/PRE multiplex network
In this subsection, we discuss in detail the results of our statistical analysis of the properties of multilinks in the CoCo-PRL/PRE multiplex network. In particular, we focus on the average multistrength of nodes with a given [...] with exponents $\beta_{\vec m,\alpha} \ge 1$ and $\lambda_{\vec m,\alpha} \le 1$. We have computed these exponents with the "regression" function of Matlab [3]. This function performs a multiple linear regression and gives the 95% confidence interval for each coefficient. We have also computed the coefficient of determination $R^2$, which indicates how well the power-law trend fits the data. For a complete list of the exponents characterizing multistrength and the inverse multiparticipation ratio, see Table VIII. In what follows, we will label the PRL collaboration layer as $\alpha = 1$ and the PRE collaboration layer as $\alpha = 2$.

Statistical analysis of the average inverse multiparticipation ratio in the CoCo-PRL/PRE multiplex network
In the PRL layer, the fitted exponents $\lambda_{\vec m,\alpha}$ are significantly different. The weights regarding multilinks (1, 1) are distributed more heterogeneously than the weights regarding multilinks (1, 0). A similar situation is also found in the PRE layer. The paired Student's t-test is also useful to understand the properties of the average inverse multiparticipation ratio. In addition to the fitted exponents, we can perform a t-test as we did previously, now considering $Y^{\vec m,\alpha}(k^{\vec m})$. This test shows that the inverse multiparticipation ratios regarding multilinks (1, 1) are significantly higher than those regarding multilinks (1, 0) or (0, 1). In the case $Y^{(1,1),1}(k)$ vs. $Y^{(1,0),1}(k)$, the t-test gives a p-value equal to 0.002 and an average value of $\log [Y^{(1,1),1}(k)/Y^{(1,0),1}(k)]$ equal to 0.11. In the case $Y^{(1,1),2}(k)$ vs. $Y^{(0,1),2}(k)$, the p-value is equal to $6.64 \cdot 10^{-6}$, and the average value of $\log [Y^{(1,1),2}(k)/Y^{(0,1),2}(k)]$ is equal to 0.19.

Table VIII: CoCo-PRL/PRE multiplex network: power-law exponents $\lambda_{\vec m}$ and $\beta_{\vec m}$ and parameters $p_{\vec m}$, $q_{\vec m}$ determining the functional behavior of the average multistrength of nodes with a given multidegree, $s^{\vec m,\alpha}(k^{\vec m})$, and of the average inverse multiparticipation ratio of nodes with a given multidegree, $Y^{\vec m,\alpha}(k^{\vec m})$, with $\alpha$ corresponding to the collaboration layer in PRL (1) or in PRE (2). The value of the coefficient of determination $R^2$ for the power-law fits is also reported.
[...] with exponents $\beta_{\vec m,\alpha,(\mathrm{in/out})} \ge 1$ and $\lambda_{\vec m,\alpha,(\mathrm{in/out})} \le 1$. We have computed these exponents with the "regression" function of Matlab [3]. This function performs a multiple linear regression and gives the 95% confidence interval for each coefficient. We have also computed the coefficient of determination $R^2$, which indicates how well the power-law trend fits the data. The complete list of the exponents and the multiplicative constants characterizing multistrength and the inverse multiparticipation ratio can be found in Table IX, together with the corresponding values of $R^2$. In what follows, we will label the PRE collaboration layer as $\alpha = 1$ and the PRE citation layer as $\alpha = 2$.
Based on the fitted parameters and the Student's t-test, the data suggest that both multidegrees for multilinks (1, 1) and multilinks (1, 0) have a linear relation with their own multistrengths in the collaboration layer, and that multistrengths (1,1) are related to multistrengths (1,0) by a multiplicative constant. In the citation layer, the fitted exponents β m,in/out indicate a super-linear scaling, and are significantly different (see Table IX).

A. Definition
A weighted multiplex network is formed by $N$ nodes connected within $M$ weighted networks $G_\alpha = (V, E_\alpha)$, with $\alpha = 1, \ldots, M$ and $|V| = N$. Therefore, we can represent a multiplex network as $\vec G = (G_1, G_2, \ldots, G_\alpha, \ldots, G_M)$. Each network $G_\alpha$ is fully described by the adjacency matrix of elements $a_{ij}^\alpha$, with $a_{ij}^\alpha = w_{ij}^\alpha > 0$ if there is a link of weight $w_{ij}^\alpha$ between nodes $i$ and $j$ in layer $\alpha$, and $a_{ij}^\alpha = 0$ otherwise. In what follows, in order to simplify the treatment of weighted multiplex networks, we will assume that the weight of the link between any pair of nodes $i$ and $j$, $a_{ij}^\alpha = w_{ij}^\alpha$, can only take integer values. This is not a major limitation, because in a large number of weighted multiplex networks the weights of the links can be considered as multiples of a minimal weight.

B. Canonical weighted multiplex ensembles or exponential weighted multiplex ensembles
The canonical network ensembles (also known as exponential random graphs) are a very powerful tool for building null models of networks [4,5]. Here we generalize the formalism developed for unweighted multiplex ensembles [6] to take weighted multiplex ensembles into account.
The construction of the canonical weighted multiplex ensembles, or exponential random multiplex graphs, closely follows the derivation of the exponential random graphs. A weighted multiplex ensemble is defined once the probability $P(\vec G)$ of any possible weighted multiplex is given. We can build a canonical multiplex ensemble by maximizing the entropy $S$ of the ensemble, given by
$$S = -\sum_{\vec G} P(\vec G) \log P(\vec G), \tag{14}$$
under the condition that the soft constraints we want to impose are satisfied. We assume there are $K$ such constraints, determined by the conditions
$$\sum_{\vec G} P(\vec G) \, F_\mu(\vec G) = C_\mu, \tag{15}$$
for $\mu = 1, 2, \ldots, K$, where $F_\mu(\vec G)$ determines one of the structural constraints that we want to impose on the multiplex network. Therefore, the maximal-entropy multiplex ensemble satisfying the constraints given by Eqs. (15) is the solution of the following system of equations
$$\frac{\partial}{\partial P(\vec G)} \left[ S - \sum_{\mu=1}^{K} \omega_\mu \sum_{\vec G} F_\mu(\vec G) P(\vec G) - \Lambda \sum_{\vec G} P(\vec G) \right] = 0,$$
where the Lagrangian multiplier $\Lambda$ enforces the normalization of the probability distribution $P(\vec G)$, and the Lagrangian multiplier $\omega_\mu$ enforces the constraint $\mu$. Therefore, the probability $P(\vec G)$ of a multiplex network in a canonical multiplex ensemble is given by
$$P(\vec G) = \frac{1}{Z} \exp\left[ -\sum_{\mu} \omega_\mu F_\mu(\vec G) \right], \tag{17}$$
where the normalization constant $Z = \exp(1 + \Lambda)$ is called the "partition function" of the canonical multiplex ensemble and is fixed by the normalization condition on $P(\vec G)$. Thus, $Z$ is given by
$$Z = \sum_{\vec G} \exp\left[ -\sum_{\mu} \omega_\mu F_\mu(\vec G) \right].$$
The values of the Lagrangian multipliers $\omega_\mu$ are determined by imposing the constraints given by Eq. (15), assuming for the probability $P(\vec G)$ the structural form given by Eq. (17). From the definition of the partition function $Z$ and Eq. (17), it can easily be shown that the Lagrangian multipliers $\omega_\mu$ can be expressed as the solutions of the following set of equations
$$C_\mu = -\frac{\partial \log Z}{\partial \omega_\mu}.$$
In this ensemble, we can then relate the entropy $S$ (given by Eq. (14)) to the canonical partition function $Z$, and we obtain
$$S = \sum_{\mu} \omega_\mu C_\mu + \log Z.$$
We call the entropy $S$ of the canonical multiplex ensemble the Shannon entropy of the ensemble.
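The relations between the partition function, the constraints, and the entropy can be checked numerically on an ensemble small enough to enumerate. The Python sketch below is a toy example under stated assumptions, not part of the original analysis: it considers a duplex with three node pairs per layer, truncates the integer weights at 2 so that the sum over multiplex networks is finite, and takes as constraints the total weight of each layer. The function name and all parameter values are hypothetical.

```python
import math
from itertools import product


def toy_ensemble(omega, n_pairs=3, max_weight=2):
    """Enumerate a tiny weighted multiplex and return (Z, C, S).

    omega[mu] is the multiplier of F_mu(G) = total weight of layer mu.
    Weights are truncated to 0..max_weight (an assumption for enumeration).
    """
    n_layers = len(omega)
    graphs = list(product(range(max_weight + 1), repeat=n_layers * n_pairs))
    # F[g][mu]: value of constraint mu on configuration g.
    F = [[sum(g[a * n_pairs:(a + 1) * n_pairs]) for a in range(n_layers)]
         for g in graphs]
    boltzmann = [math.exp(-sum(w * f for w, f in zip(omega, fs))) for fs in F]
    Z = sum(boltzmann)
    P = [b / Z for b in boltzmann]
    C = [sum(p * fs[mu] for p, fs in zip(P, F)) for mu in range(n_layers)]
    S = -sum(p * math.log(p) for p in P)
    return Z, C, S


omega = [0.7, 1.3]
Z, C, S = toy_ensemble(omega)

# Entropy from the partition function: S = sum_mu omega_mu C_mu + log Z.
entropy_from_Z = sum(w * c for w, c in zip(omega, C)) + math.log(Z)

# C_0 = -d log Z / d omega_0, estimated with a central finite difference.
h = 1e-5
log_z_plus = math.log(toy_ensemble([omega[0] + h, omega[1]])[0])
log_z_minus = math.log(toy_ensemble([omega[0] - h, omega[1]])[0])
c0_from_derivative = -(log_z_plus - log_z_minus) / (2 * h)
```

Both identities hold for any choice of constraints, so the same check can be reused when the functions $F_\mu$ are replaced by degree or strength sequences.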

C. Uncorrelated and correlated canonical multiplex ensembles
Multiplex ensembles can be classified as uncorrelated or correlated [6]. For uncorrelated multiplex ensembles, the probability $P(\vec G)$ of a multiplex network factorizes into the probabilities $P_\alpha(G_\alpha)$ of the single networks $G_\alpha$ in each layer $\alpha$, i.e.,
$$P(\vec G) = \prod_{\alpha=1}^{M} P_\alpha(G_\alpha). \tag{21}$$
Therefore, the entropy $S$ of any uncorrelated multiplex ensemble, given by Eq. (14) with $P(\vec G)$ given by Eq. (21), is additive in the number of layers, i.e.,
$$S = \sum_{\alpha=1}^{M} S_\alpha = -\sum_{\alpha=1}^{M} \sum_{G_\alpha} P_\alpha(G_\alpha) \log P_\alpha(G_\alpha).$$
For a canonical uncorrelated multiplex ensemble, $P(\vec G)$ has to satisfy both Eq. (21) and Eq. (17). Therefore, in order to have an uncorrelated multiplex ensemble, the functions $F_\mu(\vec G)$ should be equal to a linear combination of constraints $f_{\mu,\alpha}(G_\alpha)$ on the networks $G_\alpha$ in a single layer $\alpha$, i.e.,
$$F_\mu(\vec G) = \sum_{\alpha=1}^{M} f_{\mu,\alpha}(G_\alpha).$$
A special case of this type of constraint arises when each constraint depends on a single network $G_\alpha$ in layer $\alpha$. An example will be discussed in the following subsection, where we focus on the important case in which the constraints are the strength sequence $\{s_i^\alpha\}$ and the degree sequence $\{k_i^\alpha\}$ in every layer $\alpha$.
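Moreover, we can define the marginal probability for a specific value of the element $a_{ij}^\alpha$,
$$\pi_{ij}^\alpha(a_{ij}^\alpha = w) = \sum_{\vec G} P(\vec G) \, \delta(a_{ij}^\alpha, w), \tag{24}$$
where $\delta(x, y)$ is the Kronecker delta. The marginal probabilities $\pi_{ij}^\alpha(a_{ij}^\alpha)$ sum up to one,
$$\sum_{w=0}^{\infty} \pi_{ij}^\alpha(a_{ij}^\alpha = w) = 1.$$
We can also calculate the average weight $\langle a_{ij}^\alpha \rangle$ of links between nodes $i$ and $j$ as
$$\langle a_{ij}^\alpha \rangle = \sum_{w=0}^{\infty} w \, \pi_{ij}^\alpha(a_{ij}^\alpha = w).$$
In layer $\alpha$, a link between two nodes $i$ and $j$ exists with probability $p_{ij}^\alpha$, which collects all the possible weights different from zero:
$$p_{ij}^\alpha = \sum_{w>0} \pi_{ij}^\alpha(a_{ij}^\alpha = w) = 1 - \pi_{ij}^\alpha(a_{ij}^\alpha = 0). \tag{27}$$

D. Multiplex ensemble with given expected strength sequence and degree sequence in each layer
Here we consider the relevant example of the uncorrelated multiplex ensemble in which we fix the expected strength $s_i^\alpha$ and the expected degree $k_i^\alpha$ of every node $i$ in each layer $\alpha$. We have $K = 2MN$ constraints in the system. These constraints are given by
$$\sum_{\vec G} P(\vec G) \sum_{j \neq i} a_{ij}^\alpha = s_i^\alpha, \qquad \sum_{\vec G} P(\vec G) \sum_{j \neq i} \theta(a_{ij}^\alpha) = k_i^\alpha, \tag{28}$$
with $\alpha = 1, 2, \ldots, M$. We introduce the Lagrangian multipliers $w_{i,\alpha}$ for the first set of $N \cdot M$ constraints and the Lagrangian multipliers $\omega_{i,\alpha}$ for the second set of $N \cdot M$ constraints. Therefore, the probability $P(\vec G)$ of a multiplex network in this ensemble, in general given by Eq. (17), in this specific example is given by
$$P(\vec G) = \frac{1}{Z} \exp\left\{ -\sum_{\alpha=1}^{M} \sum_{i<j} \left[ (w_{i,\alpha} + w_{j,\alpha}) \, a_{ij}^\alpha + (\omega_{i,\alpha} + \omega_{j,\alpha}) \, \theta(a_{ij}^\alpha) \right] \right\},$$
where the partition function $Z$ can be expressed explicitly as
$$Z = \prod_{\alpha=1}^{M} \prod_{i<j} \frac{1 + e^{-(w_{i,\alpha} + w_{j,\alpha})} \left( e^{-(\omega_{i,\alpha} + \omega_{j,\alpha})} - 1 \right)}{1 - e^{-(w_{i,\alpha} + w_{j,\alpha})}},$$
and the Lagrangian multipliers are fixed by the conditions given by Eqs. (28). From Eq. (24) we write the marginal probabilities $\pi_{ij}^\alpha(a_{ij}^\alpha)$ for this specific ensemble, which are given by
$$\pi_{ij}^\alpha(a_{ij}^\alpha) = \frac{e^{-(w_{i,\alpha} + w_{j,\alpha}) a_{ij}^\alpha - (\omega_{i,\alpha} + \omega_{j,\alpha}) \theta(a_{ij}^\alpha)} \left( 1 - e^{-(w_{i,\alpha} + w_{j,\alpha})} \right)}{1 + e^{-(w_{i,\alpha} + w_{j,\alpha})} \left( e^{-(\omega_{i,\alpha} + \omega_{j,\alpha})} - 1 \right)}. \tag{30}$$
Moreover, from Eq. (27) the probability $p_{ij}^\alpha$ that the link $(i, j)$ in layer $\alpha$ has weight different from zero is given by
$$p_{ij}^\alpha = \frac{e^{-(w_{i,\alpha} + w_{j,\alpha}) - (\omega_{i,\alpha} + \omega_{j,\alpha})}}{1 + e^{-(w_{i,\alpha} + w_{j,\alpha})} \left( e^{-(\omega_{i,\alpha} + \omega_{j,\alpha})} - 1 \right)}.$$
Finally, the probability of a multiplex network $\vec G$ in this ensemble, characterized by the $M$ adjacency matrices $a^\alpha$, is given by
$$P(\vec G) = \prod_{\alpha=1}^{M} \prod_{i<j} \pi_{ij}^\alpha(a_{ij}^\alpha),$$
with the marginals $\pi_{ij}^\alpha(a_{ij}^\alpha)$ given by Eq. (30). Therefore, the entropy $S$ of this canonical multiplex ensemble is given by
$$S = -\sum_{\alpha=1}^{M} \sum_{i<j} \sum_{a_{ij}^\alpha = 0}^{\infty} \pi_{ij}^\alpha(a_{ij}^\alpha) \log \pi_{ij}^\alpha(a_{ij}^\alpha),$$
with the marginals $\pi_{ij}^\alpha(a_{ij}^\alpha)$ given by Eq. (30).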
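The marginal of Eq. (30) is easy to evaluate for a single node pair in a single layer, which gives a quick numerical check of its normalization and of the link probability derived from it. The Python sketch below is illustrative only. The function names and the multiplier values are hypothetical; the arguments `w_sum` and `omega_sum` stand for the sums $w_{i,\alpha} + w_{j,\alpha}$ and $\omega_{i,\alpha} + \omega_{j,\alpha}$; the infinite sums over weights are truncated; and the closed form used for the expected weight is obtained here from a geometric series rather than quoted from the text.

```python
import math


def link_marginal(a, w_sum, omega_sum):
    """pi(a) of Eq. (30) for integer weight a >= 0, with w_sum > 0."""
    theta = 1 if a > 0 else 0
    num = math.exp(-w_sum * a - omega_sum * theta) * (1 - math.exp(-w_sum))
    den = 1 + math.exp(-w_sum) * (math.exp(-omega_sum) - 1)
    return num / den


def link_probability(w_sum, omega_sum):
    """Probability that the link has nonzero weight, p = 1 - pi(0)."""
    den = 1 + math.exp(-w_sum) * (math.exp(-omega_sum) - 1)
    return math.exp(-w_sum - omega_sum) / den


def expected_weight(w_sum, omega_sum):
    """<a> = p / (1 - exp(-w_sum)); derived from the geometric series."""
    return link_probability(w_sum, omega_sum) / (1 - math.exp(-w_sum))


def link_entropy(w_sum, omega_sum, cutoff=200):
    """Contribution of one pair and layer to S, truncated at `cutoff`."""
    total = 0.0
    for a in range(cutoff):
        p = link_marginal(a, w_sum, omega_sum)
        if p > 0.0:
            total -= p * math.log(p)
    return total


w_sum, omega_sum = 0.8, 0.5  # hypothetical multiplier sums
norm = sum(link_marginal(a, w_sum, omega_sum) for a in range(200))
mean = sum(a * link_marginal(a, w_sum, omega_sum) for a in range(200))
p_link = link_probability(w_sum, omega_sum)
```

Summing `link_entropy` over all node pairs and layers gives the entropy of the uncorrelated ensemble once the multipliers have been fixed by the constraints.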
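E. Multiplex ensemble with given expected multidegree sequence and given expected multistrength sequence
Here we consider the example of the correlated weighted multiplex ensemble, in which we fix the expected multidegree $k_i^{\vec m}$ of each node $i = 1, \ldots, N$, for each multilink $\vec m$. In addition to these constraints, we also impose a given expected multistrength $s_i^{\vec m,\alpha}$ for each node $i = 1, 2, \ldots, N$ and each multilink $\vec m$, in each layer $\alpha$ where $m_\alpha = 1$. The number of constraints is therefore $K = 2^M N + 2^{M-1} M N$. In particular, the constraints we are imposing are
$$\sum_{\vec G} P(\vec G) \sum_{j \neq i} A_{ij}^{\vec m} = k_i^{\vec m}, \qquad \sum_{\vec G} P(\vec G) \sum_{j \neq i} a_{ij}^\alpha A_{ij}^{\vec m} = s_i^{\vec m,\alpha}, \tag{36}$$
where we have now used the multiadjacency matrices $A^{\vec m}$ with elements given by
$$A_{ij}^{\vec m} = \prod_{\alpha=1}^{M} \left[ \theta(a_{ij}^\alpha) \, m_\alpha + \left( 1 - \theta(a_{ij}^\alpha) \right) (1 - m_\alpha) \right].$$
Here we introduce the Lagrangian multipliers $\omega_i^{\vec m}$ for the first set of constraints and the Lagrangian multipliers $w_{i,\alpha}^{\vec m}$ for the second set of constraints. Without loss of generality, if $m_\alpha = 0$ we set $w_{i,\alpha}^{\vec m} = 1/2$. We can do this because the probability of a multiplex network does not depend on any of these values, and we need to define these Lagrangian multipliers only to simplify the notation. Using these expressions for the Lagrangian multipliers, we obtain the following expression for the probability $P(\vec G)$ of the multiplex network in the ensemble:
$$P(\vec G) = \frac{1}{Z} \exp\left\{ -\sum_{i<j} \sum_{\vec m} A_{ij}^{\vec m} \left[ (\omega_i^{\vec m} + \omega_j^{\vec m}) + \sum_{\alpha=1}^{M} (w_{i,\alpha}^{\vec m} + w_{j,\alpha}^{\vec m}) \, a_{ij}^\alpha \right] \right\}.$$
The partition function $Z$ can be expressed explicitly as
$$Z = \prod_{i<j} Z_{ij},$$
where $Z_{ij}$ is given by
$$Z_{ij} = \sum_{\vec m} e^{-(\omega_i^{\vec m} + \omega_j^{\vec m})} \prod_{\alpha=1}^{M} \left[ \frac{e^{-(w_{i,\alpha}^{\vec m} + w_{j,\alpha}^{\vec m})}}{1 - e^{-(w_{i,\alpha}^{\vec m} + w_{j,\alpha}^{\vec m})}} \right]^{m_\alpha}.$$
Finally, the Lagrangian multipliers are fixed by the conditions given by Eqs. (36).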
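We now indicate with $\vec a_{ij}$ the vector $(a_{ij}^1, a_{ij}^2, \ldots, a_{ij}^\alpha, \ldots, a_{ij}^M)$. The probability $P(\vec G)$ of a multiplex network can be rewritten as
$$P(\vec G) = \prod_{i<j} \pi_{ij}(\vec a_{ij}), \tag{41}$$
where the probability of a specific multiweight $\vec a_{ij}$ between nodes $i$ and $j$ is
$$\pi_{ij}(\vec a_{ij}) = \frac{1}{Z_{ij}} \exp\left[ -(\omega_i^{\vec m^{ij}} + \omega_j^{\vec m^{ij}}) - \sum_{\alpha=1}^{M} (w_{i,\alpha}^{\vec m^{ij}} + w_{j,\alpha}^{\vec m^{ij}}) \, a_{ij}^\alpha \right],$$
where $\vec m^{ij} = (m_1^{ij}, \ldots, m_\alpha^{ij}, \ldots, m_M^{ij})$ with $m_\alpha^{ij} = \theta(a_{ij}^\alpha)$. We note here that $\pi_{ij}(\vec a_{ij})$ satisfies the normalization condition
$$\sum_{\vec a_{ij}} \pi_{ij}(\vec a_{ij}) = 1.$$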
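The average weight $\langle a_{ij}^\alpha A_{ij}^{\vec m} \rangle$ of multilink $\vec m$ between nodes $i$ and $j$ in layer $\alpha$ and the probability $p_{ij}^{\vec m}$ of multilink $\vec m$ between nodes $i$ and $j$ are given by
$$\langle a_{ij}^\alpha A_{ij}^{\vec m} \rangle = \frac{m_\alpha \, p_{ij}^{\vec m}}{1 - e^{-(w_{i,\alpha}^{\vec m} + w_{j,\alpha}^{\vec m})}}, \qquad p_{ij}^{\vec m} = \frac{1}{Z_{ij}} \, e^{-(\omega_i^{\vec m} + \omega_j^{\vec m})} \prod_{\alpha=1}^{M} \left[ \frac{e^{-(w_{i,\alpha}^{\vec m} + w_{j,\alpha}^{\vec m})}}{1 - e^{-(w_{i,\alpha}^{\vec m} + w_{j,\alpha}^{\vec m})}} \right]^{m_\alpha}.$$
Finally, since the probability $P(\vec G)$ of a multiplex network is given by Eq. (41), the entropy $S$ defined in Eq. (14) in this ensemble is given by
$$S = -\sum_{i<j} \sum_{\vec a_{ij}} \pi_{ij}(\vec a_{ij}) \log \pi_{ij}(\vec a_{ij}). \tag{46}$$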
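For a duplex ($M = 2$) the correlated ensemble can be explored numerically for a single node pair. The Python sketch below is a minimal illustration under stated assumptions: the closed forms for the pair partition function, the multilink probabilities, and the multiweight probability are the ones that follow from summing the geometric series over the weights of the layers with $m_\alpha = 1$; the function names and all multiplier values are hypothetical; and the dictionaries `omega_sum` and `w_sum` hold the sums of the multipliers of the two nodes.

```python
import math


def multilink_probabilities(omega_sum, w_sum):
    """Return ({m: p_m}, Z_ij) for one node pair.

    omega_sum[m]        = omega_i^m + omega_j^m
    w_sum[m][alpha]     = w_{i,alpha}^m + w_{j,alpha}^m  (must be > 0)
    """
    terms = {}
    for m in omega_sum:
        t = math.exp(-omega_sum[m])
        for alpha, m_alpha in enumerate(m):
            if m_alpha == 1:
                x = math.exp(-w_sum[m][alpha])
                t *= x / (1 - x)  # sum over weights a >= 1 of exp(-w a)
        terms[m] = t
    z_ij = sum(terms.values())
    return {m: t / z_ij for m, t in terms.items()}, z_ij


def multiweight_probability(a, omega_sum, w_sum, z_ij):
    """pi_ij(a) for a multiweight a = (a^1, ..., a^M) of integer weights."""
    m = tuple(1 if x > 0 else 0 for x in a)
    exponent = omega_sum[m] + sum(w_sum[m][al] * x for al, x in enumerate(a))
    return math.exp(-exponent) / z_ij


# Hypothetical multiplier sums; entries with m_alpha = 0 are set to 1/2 + 1/2.
omega_sum = {(0, 0): 0.0, (1, 0): 1.0, (0, 1): 1.5, (1, 1): 0.5}
w_sum = {(0, 0): (1.0, 1.0), (1, 0): (0.9, 1.0),
         (0, 1): (1.0, 1.2), (1, 1): (0.7, 0.8)}

p, z_ij = multilink_probabilities(omega_sum, w_sum)
cut = 80  # truncation of the sums over integer weights
norm = sum(multiweight_probability((a1, a2), omega_sum, w_sum, z_ij)
           for a1 in range(cut) for a2 in range(cut))
p11 = sum(multiweight_probability((a1, a2), omega_sum, w_sum, z_ij)
          for a1 in range(1, cut) for a2 in range(1, cut))
mean_a1_11 = sum(a1 * multiweight_probability((a1, a2), omega_sum, w_sum, z_ij)
                 for a1 in range(1, cut) for a2 in range(1, cut))
```

The values stored for layers with $m_\alpha = 0$ never enter the results, in line with the remark that the probability of a multiplex network does not depend on them.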

IV. BACKGROUND INFORMATION ON FIG. 4 OF THE MAIN TEXT
As an example of a possible application of the indicators $\Psi$ and $\Xi$, we analyze a case inspired by the CoCi-PRE multiplex network. Due to the numerical limitations of the programs that evaluate the entropy of multiplex ensembles, we perform a finite-size analysis of the indicators $\Psi$ and $\Xi$ as a function of the size of the multiplex network, $N = 128, 256, \ldots, 2048$. In particular, we consider the following undirected multiplex ensembles:
• Correlated weighted multiplex ensemble. First, we create the correlated multiplex ensemble with power-law expected multidegree distributions with exponents $\gamma_{(1,m_2)} = 2.6$ for $m_2 = 0, 1$ and $\gamma_{(0,1)} = 1.9$ (for multidegree (0, 1) we impose a structural cut-off). In particular, in order to avoid the effects of fluctuations in the expected multidegree sequence, we rank the multidegrees as $r = 1, 2, \ldots, N$ and take the sequence in which the multidegree $k_r^{\vec m}$ of rank $r$ is defined by [...], where we take the maximal cut-off $K = N^{1/\gamma_{\vec m}}$ for $\gamma_{\vec m} > 2$ and $K = \sqrt{\langle k^{\vec m} \rangle N}$ for $\gamma_{\vec m} < 2$. Note that this is possible because the expected multidegrees are real values. Moreover, the expected multistrengths are assumed to satisfy $s_i^{\vec m,\alpha} = c_{\vec m,\alpha} (k_i^{\vec m})^{\beta_{\vec m,\alpha}}$, with $c_{\vec m,\alpha} = 1$, $\beta_{(1,m_2),1} = 1$ for $m_2 = 0, 1$, $\beta_{(1,1),2} = 1.3$, and $\beta_{(0,1),2} = 1.1$.
• Uncorrelated weighted multiplex ensemble. In this ensemble, we set the expected degree k α i of every node i in every layer α = 1, 2 to be equal to the sum of the expected multidegrees (with m α = 1) in the correlated weighted multiplex ensemble. Moreover, we set the expected strengths s α i of every node i in every layer α to be equal to the sum of the expected multistrengths of node i in layer α in the correlated weighted multiplex ensemble.
We measure the indicator $\Psi$, which compares the entropy $S$ of a weighted multiplex ensemble with the entropy of a weighted multiplex ensemble in which weights are distributed homogeneously. Therefore, $\Psi$ can be defined as
$$\Psi = \frac{\left| S - \langle S \rangle_{\pi(w)} \right|}{\sqrt{\langle (\delta S)^2 \rangle_{\pi(w)}}},$$
where the average $\langle \ldots \rangle_{\pi(w)}$ is calculated over multiplex networks with the same structural properties but with weights distributed homogeneously. In particular, when the weight distribution is randomized, the multiplex networks are constrained in such a way that each link must have a minimal weight (i.e., $w_{ij} \ge 1$), while the remainder of the total weight is distributed randomly across links. When numerically evaluating $\langle \ldots \rangle_{\pi(w)}$, we take the average over 100 weight randomizations. The distribution $P(S)$ of the entropy $S$ calculated over these randomizations, both for the uncorrelated weighted multiplex ensemble and for the correlated weighted multiplex ensemble, is shown in Fig. 5. In both cases, we observe a distribution that can be fitted by a Gaussian function with mean and variance scaling as $\langle S \rangle_{\pi(w)} \propto N \log N$ and $\langle (\delta S)^2 \rangle_{\pi(w)} \propto \sqrt{N}$ (see Fig. 6). We call $\Psi^{corr}$ the indicator $\Psi$ calculated on the correlated multiplex ensemble and $\Psi^{uncorr}$ the indicator $\Psi$ calculated on the corresponding uncorrelated multiplex ensemble. Finally, to quantify the additional amount of information carried by the correlated multiplex ensemble with respect to the uncorrelated multiplex ensemble, we measure the indicator $\Xi$ as
$$\Xi = \frac{\Psi^{corr}}{\Psi^{uncorr}}.$$
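Given the entropy of an ensemble and a sample of entropies obtained from weight randomizations, the two indicators reduce to a few arithmetic operations. The Python sketch below is only an illustration and rests on an explicit assumption: $\Psi$ is taken to be the distance between $S$ and the randomized mean, measured in units of the standard deviation of the randomized entropies, and $\Xi$ is taken to be the ratio of the correlated to the uncorrelated value. The function names and the numbers in the example are hypothetical and are not results from the manuscript.

```python
import math


def psi(entropy, randomized_entropies):
    """|S - <S>| / sqrt(<(dS)^2>) over the randomized sample (assumed form)."""
    n = len(randomized_entropies)
    mean = sum(randomized_entropies) / n
    variance = sum((s - mean) ** 2 for s in randomized_entropies) / n
    return abs(entropy - mean) / math.sqrt(variance)


def xi(psi_corr, psi_uncorr):
    """Ratio of the indicator in the correlated and uncorrelated ensembles."""
    return psi_corr / psi_uncorr


# Made-up numbers, used only to exercise the functions.
psi_corr = psi(10.0, [12.0, 14.0, 16.0])
psi_uncorr = psi(13.0, [12.0, 14.0, 16.0])
ratio = xi(psi_corr, psi_uncorr)
```

In the setting described above, the sample passed to `psi` would contain the 100 entropies obtained from the weight randomizations of the corresponding ensemble.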
The finite-size scaling of $\Psi^{corr}$, $\Psi^{uncorr}$, and $\Xi$ is shown in Fig. 4 of the manuscript.