Sharp Bounds and Normalization of Wiener-Type Indices

Dechao Tian; Kwok Pui Choi

doi:10.1371/journal.pone.0078448

Abstract

Complex networks abound in physical, biological and social sciences. Quantifying a network’s topological structure facilitates network exploration and analysis, and network comparison, clustering and classification. A number of Wiener type indices have recently been incorporated as distance-based descriptors of complex networks, such as the R package QuACN. Wiener type indices are known to depend both on the network’s number of nodes and topology. To apply these indices to measure similarity of networks of different numbers of nodes, normalization of these indices is needed to correct the effect of the number of nodes in a network. This paper aims to fill this gap. Moreover, we introduce an -Wiener index of network , denoted by . This notion generalizes the Wiener index to a very wide class of Wiener type indices including all known Wiener type indices. We identify the maximum and minimum of over a set of networks with nodes. We then introduce our normalized-version of -Wiener index. The normalized -Wiener indices were demonstrated, in a number of experiments, to improve significantly the hierarchical clustering over the non-normalized counterparts.

Citation: Tian D, Choi KP (2013) Sharp Bounds and Normalization of Wiener-Type Indices. PLoS ONE 8(11): e78448. https://doi.org/10.1371/journal.pone.0078448

Editor: Fabio Rapallo, University of East Piedmont, Italy

Received: July 3, 2013; Accepted: September 11, 2013; Published: November 8, 2013

Copyright: © 2013 Tian, Choi. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Funding: This work was supported by funds provided by Ministry of Education (R146-000-134-11). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Recent years witness exponential growth of available biological network data. Thanks to past decades’ breakthrough in biotechnology, researchers now are able to interrogate molecular interactions at systems level. It has since been observed that topological properties of these networks provide important insight into the functions of proteins, and their relationship with one another [1]–[8]. For examples, degree distribution, average clustering coefficient, diameter, centrality, lethality and graphlet distribution have been extensively studied. Hopefully, based on a carefully chosen list of network topological properties and methods in quantifying them, a complex network is adequately summarized in the form of a numerical -dimensional vector where is the number of topological properties in consideration. This representation enables us to take full advantage of a host of classification and clustering techniques to compare complex networks.

A significant step towards this direction is facilitated by the introduction of the R package QuACN by Mueller et al. [9]. QuACN computes the values of different categories of descriptors in a network. One such category is the distance-based descriptors which include Wiener index, Harary index, etc. The use of Wiener index and related type of indices dates back to the seminal work of Wiener in 1947 [10], [11]. Wiener introduced his celebrated index to predict the physical properties, such as boiling point, heats of isomerization and differences in heats of vaporization, of isomers of paraffin by their chemical structures. Viewing the chemical structure of an isomer as a connected graph, the Wiener index is defined as where represent nodes in the graph, the distance between nodes and which is defined as the length of a shortest path between them, and the sum is over all pairs of nodes in the graph. Wiener index has since inspired many distance-based descriptors in Chemometrics. These include Harary index [12], hyper Wiener index [13], q-analog of Wiener index [14], Wiener polynomial [15], Q-index [16], Balaban J index [17], and information indices [18]–[20]. These indices, or commonly called descriptors, play significant roles in quantitative structure-activity relationship/quantitative structure-property relationship (QSAR/QSPR) models [21].

It is known that the Wiener type indices depend both on a network’s number of nodes and its topology. When the numbers of nodes in the networks are equal, as in the applications to isomers, these indices provide informative measures of the branching property of the networks and hence a fair comparison among them. However, when they are used to measure similarities of networks with different numbers of nodes, the intended measure of topological structures will be masked by the sizes of the networks. Normalization of a Wiener type index expectedly minimizes the effect of the network’s number of nodes and hence brings forth its topological structure better. Furthermore, it is also desirable for the normalized index to take value in an absolute scale for better understanding and interpretation. This paper seeks to fill this gap. The normalization introduced in definition 2 below fulfils this purpose. This definition will be of limited practical value if the sharp upper and lower bounds of the index on a graph cannot be found explicitly. The objective of this article is three-fold. First, introduce a very general Wiener type index. We call it -Wiener index, and denote it by for a graph . This definition includes all known Wiener type indices as special cases. Second, identify the maximum and minimum values of over a class of connected networks or a class of connected trees . We are able to derive explicit formulas for these optimal values. Third, propose a normalized version, which takes value in for better interpretation and network comparison.

This paper is organized as follows. We first introduce some standard graph-theoretic notations and recall some special graphs. We then introduce the functional analog of Wiener index, , and our proposed normalized versions of this functional Wiener index in the method section. In the result section, we provide our main results Theorems 1 to 4. Theorem 1 gives the maximum and the minimum of over the set of connected graphs of nodes, and characterization of graphs achieving the maximum or the minimum. Theorem 2 gives a parallel result when the maximum and minimum are taken over the set of connected trees of nodes. Theorem 3, (respectively Theorem 4) identifies the maximum of over the set of connected graphs (respectively connected trees) of nodes with specified maximum degree. We also give a brief description of related works in next section. Then, we consider special cases of in to provide explicit expressions of the maximum and the minimum of Wiener, Harary, hyper Wiener, generalized Wiener indices. In the experiment section, we report the performance of hierarchical clustering based on the usual Wiener type indices and the normalized version of these in our experiments. We end with conclusions section of this paper.

Methods

Definitions and Terminologies

Let be a simple (that is, no self-loops nor multiple edges) connected graph on nodes where and . Denote by as the number of nodes in . Let denote the set of all simple, connected graphs with nodes. A graph having no cycles is called a tree, and we let denote the set of all connected trees with nodes. The distance between any pair of nodes, and , in is the number of edges in a shortest path from to . Let be the distance matrix. We denote the maximum degree of by .

Figure 1 shows some special graphs we frequently refer to in this paper. A path graph, , is a graph that can be drawn so that all of its vertices and edges lie on a straight line. Figure 1(a) shows . A star, , is a tree with one internal node and leaves. is shown in Figure 1(b). A complete graph, , is a graph with nodes in which every pair of distinct nodes is connected by an edge. A caterpillar, , is a tree with a central path with number of nodes where at most one end node of the central path has less than leaves, each of the other nodes in the central path has leaves. Figures 1(d) and 1(e) show caterpillars and respectively. A broom is a tree joining a star and a path by attaching a pendant node (or leaf) in to a pendant node of . For examples, brooms and are shown in Figures 1(f) and 1(g) respectively. A kite is a graph obtained from connecting two end nodes one from a complete graph and one from a path . Figure 1(h) shows a kite .

Download:

Figure 1. Some special graphs. Figure 1 (a) to (g) are trees.

https://doi.org/10.1371/journal.pone.0078448.g001

Throughout this paper, denotes a monotone function defined on nonnegative integers. We define a functional-analog Wiener index below. Our definition contains the Wiener index, Harary index, hyper Wiener index, compactness, average efficiency, generalized Wiener index, Wiener polynomial, -index, -analogy of Wiener index as special cases. For detail, see subsection Important special cases. We abbreviate it as -Wiener index. Thanks to an anonymous reviewer of this article, this definition has also been independently introduced by Schmuck et al. [22].

Definition 1.The -Wiener index of is defined by

Here denotes the shortest distance between nodes and .

The number of nodes of has a very strong effect on Wiener type indices (see Results section). In order to apply -Wiener index for comparing networks, which often differ in the numbers of nodes, we are led to propose a normalized version for graphs and a normalized version for trees for better interpretation of the index.

Definition 2. (a) The normalized -Wiener index for a graph is defined as

Here and .

(b) The normalized -Wiener index for a tree is similarly defined where the maximum and the minimum are taken over instead.

These normalized versions will be of limited practical value if one cannot compute nor . Our main results, stated in Theorems 1 and 2, show that these optimal upper and lower bounds can be easily computed. Moreover, they characterize those graphs which attain the maximum or the minimum.

By definition, takes values in . When is a non-decreasing function, Theorem 1 below shows that if and only if is a path graph, and if and only if is a complete graph. So (respectively, ) suggests looks like a path graph (respectively, a complete graph). And hence the numerical value of provides an indication how is like.

Effect of Number of Nodes on Wiener Type Indices

It is known that the Wiener index for a connected graph with nodes ranges from to (see Corollary 5 below or [23]–[25] ). This wide range can be undesirable if it is used for comparing similarity of graphs with different number of nodes. For example, consider two path graphs, and , with 4 nodes and 5 nodes respectively, and a star graph with 5 nodes, . Values of the Wiener index for and are respectively 10, 20 and 16, giving the false impression that and are more similar than that of and . However, values of the normalized Wiener index are 0 for and , and 1 for . This example is far from being an isolated case, it can be shown that if the number of nodes of a path graph is at least 26% more than the number of nodes in another path graph, there exists a star graph whose Wiener index is closer to that of the path graph with smaller number of nodes.

The normalized Wiener index of , star with nodes, is , suggesting stars of sufficiently large , based on the normalized Wiener index, is very similar to a complete graph. This is concordant with the fact that a is the line graph of [26].

Main Idea

A key ingredient in our proofs is a matrix majorization (see Supporting information file Text S1 for definition) argument. Given a connected graph , we can transform it to another graph such that the distance matrix of , majorizes the corresponding distance matrix of . Since Wiener index of , or its generalization -Wiener index for increasing function , is the sum of the upper diagonal entries in the distance matrix, it follows that . The construction of is fairly straightforward as can be seen in the proofs. The construction of such that is majorized by requires delicate and judicious pruning and regrafting. However, the essential idea remains the same. Technical details of proofs are given in supporting information file Text S1.

Results

We provide explicit expressions for the maximum and minimum of over , and over in Theorems 1 and 2 below. We also characterize those graphs or trees attaining the extremum. Theorems 3 and 4 concern trees or graphs with a specified maximum degree. For simplicity of presentations, we shall only state our results for non-decreasing function . Analogous results for non-increasing can be deduced easily by replacing by .

Theorem 1 Let be a non-decreasing function on nonnegative integers, and , then

The lower bound is attained if and only if is . The upper bound is attained if and only if is .

Theorem 2 Let be a non-decreasing function on nonnegative integers, and , then

The lower bound is attained if and only if is . The upper bound is attained if and only if is .

Theorem 3 Let be a non-decreasing function on nonnegative integers. Then, for any with , we have

The upper bound is attained if and only if is a broom .

Theorem 4 Let be a non-decreasing function on nonnegative integers. Then, for any with , we have

Moreover,

Equality holds if and only if is .

Related Work

The proofs of Theorems 1 to 4 will be given in supporting information file Text S1. Theorem 2 has also been independently obtained by Wagner et al. (see Theorem 2.7 and Corollary 4.1 in [27]). Special cases of Theorems 1 to 4 for particular Wiener type index are known in the literature. For examples, the complete graph (respectively, the path graph) is shown to be the minimizer (respectively, maximizer) of the Wiener index among simple connected graphs with the same number of nodes in [23]–[25]. Similar conclusions are proved to hold for the hyper Wiener index in [25], and the Harary index in [28]. The results in Theorems 1 to 4 in its full generality as -Wiener index are novel to the best knowledge of the authors. Moreover, we have provided a unifying methodology for the proofs.

Important Special Cases

Since its introduction, Wiener index has inspired many variants and thoroughly studied in a sizeable literature [29]. By choosing appropriate functions , the -Wiener index can be reduced to a number of commonly used descriptors as follows.

If we take , written as is the well-studied descriptor introduced by Wiener in 1947 [10], [11].

Taking , the -Wiener index is the Harary index [12], denoted by which is shown to be more discriminating than the Wiener index [12]. Latora and Marchiori in 2001 [30], used a scaled version of the Harary index (more precisely, ) to measure a network’s efficiency in information exchange.

Taking , where can be positive or negative, the -Wiener index is called generalized Wiener index, denoted by [31].

If , the -Wiener index is known as the hyper Wiener index [13], denoted by .

Taking , where is regarded as a parameter, the -Wiener index is called the Hosoya polynomial or Wiener polynomial [15]. With an additional factor 2, the Hosoya polynomial is called -index and denoted by in [16].

The -analog of the Wiener index, introduced by Zhang et al. (2012) in [14] is simply the -Wiener index by choosing .

Applications

By specializing to various forms in Theorems 1 and 2, we provide below explicit sharp upper bounds and sharp lower bounds for the Wiener index , the Harary index , the hyper Wiener index , and the generalized Wiener index for and .

Corollary 5 Let be a simple, connected graph with nodes (that is, ), we havewhen when α>0,

Corollary 6 Let be a tree with nodes (that is, ), we havewhen α<0,when α>0,

Experiments

We describe below three experiments to compare the hierarchical clustering using normalized -Wiener indices with the hierarchical clustering using non-normalized -Wiener indices. Each experiments consists of 3 main steps.

Step 1: A collection of networks (or graphs) or trees, , are chosen to be clustered. The collection is detailed in each experiment below.

Step 2: Seven functions are chosen to form the -Wiener indices. In all our experiments, we choose

and

The first four functions chosen are increasing and the -Wiener indices correspond to the usual index, Wiener index, the hyper Wiener index and the compactness index. The remaining 3 functions chosen are decreasing and correspond to the index, the Harary index and the index. Hopefully these indices collectively capture some essential characters of networks and useful for clustering. For , we construct two characteristic vectors,

Step 3: We adopt a clustering algorithm to cluster using and then produce a dendrogram. We do the same using . Minimum variance method algorithm due to Ward [32] which is made available in R base package [33], was used in all the experiments. The computed the Adjusted Rand Index (ARI) in all the experiments are summarized in Table 1 below.

Download:

Table 1. Adjusted Rand Index (ARI) for clustering (or classification) of networks in our three experiments.

https://doi.org/10.1371/journal.pone.0078448.t001

Experiment 1: Hierarchical Clustering of Random Networks

The collection of networks chosen for this experiment is the networks generated by some commonly used random network models, namely, Erdos-Renyi (ER) model [34], [35], scale-free (SF) network model [36] and 3-D geometric model (GE) [37]. Each of these random network models is applied to generate 10 random networks with the number of nodes ranging from 500 to 950 with step of increment 50. Experiment 1 consists of 5 small, but similar, experiments. We enumerate these 5 small experiments as 1.1,…, 1.5. Subsection after experiments provides more details on how to generate these random networks. We then apply Steps 2 and 3 above to form two dendrograms: one using -Wiener indices without normalization (Figure 2A) and the other dendrogram using normalized -Wiener indices (Figure 2B). To quantify the classification of the two methods: with and without normalization, we adopt the commonly used Adjusted Rand Index (ARI) [38] for classification validation. ARI measures the accuracy of classification, and takes values between −1 and 1. The larger the ARI is, the better is the classification. The ARI for Figures 2A and 2B are respectively 0.18 and 0.56 for Experiment 1.5. Using normalized -Wiener indices lead to a substantial improvement in the classification. We repeat Experiments 1.1 to 1.5 1000 times each. The boxplots of the ARI are shown in Figure 3. The means and standard deviations for these experiments are given in Table 1. They clearly demonstrate the superiority of classification using normalized -Wiener indices.

Download:

Figure 2. Hierarchical clustering of random networks.

30 networks with 10 each generated by the Erdos-Renyi (ER), scale-free (SF) and geometric (GE) random network models. Panel (A) shows the hierarchical clustering based on the -Wiener indices (see Step 1 on page 8 for functions used). The adjusted rand index (ARI) for this clustering is 0.24. Panel (B) is the hierarchical clustering based on the normalized versions of the same -Wiener indices. The ARI of this clustering is 0.67. Number of nodes chosen are 500, 550, …, 950, and is 0.05 in the Erdos-Renyi model. A scale-free network with 500 nodes is denoted by . The others are denoted in a similar way.

https://doi.org/10.1371/journal.pone.0078448.g002

Download:

Figure 3. Boxplots of adjusted rand index for measuring the extent of agreement of clustering of the random networks using non-normalized

-Wiener indices versus normalized -Wiener indices.

https://doi.org/10.1371/journal.pone.0078448.g003

Experiment 2: Hierarchical Clustering of Trees

The collection of trees to be classified consists of 10 paths (), 10 stars (), 10 brooms (), 20 caterpillars ( which is like a path, and which is like a star), and for ranging from 500 to 950 with step of increment 50.

Figure 4 shows the two dendrograms. The ARI for Figures 4A and 4B are respectively 0.10 and 1.00. This demonstrates that using normalized -Wiener indices provides much better accuracy for classification purposes. The result in this experiment is consistent with that of experiment 1.

Download:

Figure 4. Hierarchical clustering of trees.

Panel (A) shows the hierarchical clustering based on the -Wiener indices (see Step 1 on page 6 for functions used). The adjusted rand index (ARI) is 0.1. Panel (B) shows the hierarchical clustering based on normalized -Wiener indices. The ARI is 1. Trees used in the clustering consist of paths (), stars (), caterpillar-like trees (), kites (). Number of nodes .

https://doi.org/10.1371/journal.pone.0078448.g004

Experiment 3: Hierarchical Clustering of Random Networks and Trees

The collection of networks consists of (i) networks generated by three random network models, namely, ER model, SF Model and 3-D geometric model; (ii) some trees such as paths, brooms, caterpillars, stars. Figure 5 shows the two dendrograms formed. And the ARI for Figures 5A and 5B are respectively 0.04 and 0.86.

Download:

Figure 5. Hierarchical clusters of trees and graphs.

Panel (A) shows the hierarchical clustering based on the -Wiener indices (see Step 1 on page 6 for functions used). The adjusted rand index (ARI) is 0.04. Panel (B) shows the hierarchical clustering based on normalized -Wiener indices, and ARI = 0.86. Trees used are paths (), stars (), caterpillar-like trees (), kites (). Graphs are generated by Erdos-Renyi (), scale-free () and geometric () random network models. The parameter, , in the Erods-Renyi random graph equals to 0.05, number of nodes = 500,550,…,950.

https://doi.org/10.1371/journal.pone.0078448.g005

Details on Generating Random Networks

We describe here in details on how to choose the networks generated by the three random network models in experiments 1 and 3.

Experiment 1 consists of 5 small, but similar, experiments which we label as Experiment 1.1, …, Experiment 1.5 which correspond to respectively. Now we describe Experiment 1.5 in details.

ER Model

There are two parameters in the ER model, namely, , the number of nodes, and , the probability that an edge is formed between a pair of nodes. All edges are formed independently of each other. In Experiment 1.5, where , we choose ranging from 500 to 950 with step of increment 50. We generate an ER network using ‘erdos.renyi.game’ function available in the R package igraph [39]. If the network is connected, we keep it in and denote it as . If not, then we repeat the function ‘erdos.renyi.game’ until a connected network is obtained. Similarly, are generated.

SF Model

We also construct ten SF networks by the function ‘barabasi.game’ available in the R igraph package. We shall describe how to grow a SF network with 500 nodes for a given , say . The other 9 SF networks with nodes are constructed in a similar manner. In ‘barabasi.game’ function, we set number of vertices 500, number of edges to be added in each time step rounded to the nearest integer, and the option to create a directed graph false.

Geometric Model

We generate ten 3-D geometric networks with nodes. We shall describe how to construct one with 500 nodes as follows. The rest are constructed similarly. We first place 500 nodes in a unit cube uniformly and independently, then we compute all the pairwise distances and rank these distances in ascending order. We choose the top of these pairwise distances and connect their corresponding nodes. If this network is connected, then we keep it in and denote it by . Otherwise, we discard it, and repeat the above procedure until we get a connected network. The other networks are constructed similarly.

Conclusions

Wiener index and other Wiener type indices have been commonly applied in Chemometrics to associate structures and physicochemical properties of molecules. Recently, these indices are incorporated in quantifying complex networks as in QuACN [9] and NetCAD [40]. In this article, we first generalize Wiener index to a general functional form, called -Wiener index. This -Wiener index contains all well-known Wiener type indices as special cases such as Wiener index, Harary index, hyper Wiener index, compactness, and average efficiency. We provide a unifying method to identify the maximum and minimum over the set of simple connected graphs with nodes, or the set of simple connected trees with nodes (Theorems 1 and 2). Explicit sharp upper and lower bounds for Wiener index, Harary index, hyper Wiener index and the generalized index are deduced over networks (Corollary 5) and over trees (Corollary 6). Moreover, the maximizer and minimizer are characterized in Theorems 1 and 2. We believe these results are general and of independent interests.

Armed with these maximum and minimum values, we propose a normalized version of -Wiener index over networks, and a similar version over trees. These normalized versions provide better interpretation of indices over networks of varying number of nodes than the non-normalized one. We conduct a number of experiments to compare the clustering performance using normalized -Wiener indices with that of the non-normalized -Wiener indices. The results of these experiments consistently demonstrate that using normalized versions improved clustering substantially. The normalized versions capture similar topological structures among networks with different number of nodes better. Our method of optimizing can be easily extended to index of the form where and are monotone functions. For example, taking and leads to which measures small-world behvaior of network [8]. For other descriptors, it is of interest to study whether normalization is needed; if so, how best to normalize them; and to what extent normalization improve network comparison.

Observe that where we assume , denotes the number of pairs of nodes in with distance equals , and the number of pairs of nodes in with distance greater than . Since in most biological networks the number of nodes is large, one may normalize a scaled-version of in terms of the asymptotic distribution of the ’s under the assumption that the observed network is generated by a given random network model . This will enable us to determine the likelihood that the observed network is generated by . Currently a fair amount of information about shortest paths in some network models is available in [41], [42]. How to make use of these results seems like a worthwhile future project.

Supporting Information

Figure S1.

Illustrating the choices of and in Lemma 2. Here has 5 nodes, 3 nodes. We choose and . Tree is constructed by joining and while by joining and . and are matrices where the first 5 columns correspondent to the 5 nodes in , and the last 3 rows correspondent to the 3 nodes in .

https://doi.org/10.1371/journal.pone.0078448.s001

(TIF)

Figure S2.

Illustration of Lemma 3. Here . From the counts of the distances above, it is clear that and .

https://doi.org/10.1371/journal.pone.0078448.s002

(TIF)

Figure S3.

Illustration of the subtree pruning and regrafting algorithm. Here is obtained from first by deleting the edge and then connecting and . is proved to satisfy these properties: (i) ; (ii) ; and (iii) number of pendant nodes is one less than that of .

https://doi.org/10.1371/journal.pone.0078448.s003

(TIF)

Text S1.

Detailed proof for Theorems 1–4.

https://doi.org/10.1371/journal.pone.0078448.s004

(PDF)

Acknowledgments

The authors thank the anonymous reviewers for a careful reading of this article, helpful suggestions and bringing to their attention of some additional references.

Author Contributions

Conceived and designed the experiments: DT KPC. Performed the experiments: DT KPC. Analyzed the data: DT KPC. Wrote the paper: DC KPC. Performed the mathematical analysis and statistical analysis: DT KPC.

References

1. Vidal M, Cusick ME, Barabasi AL (2011) Interactome networks and human disease. Cell 144: 986–998.
- View Article
- Google Scholar
2. Barabasi AL, Gulbahce N, Loscalzo J (2011) Network medicine: a network-based approach to human disease. Nature Reviews Genetics 12: 56–68.
- View Article
- Google Scholar
3. Hu L, Huang T, Shi X, Lu WC, Cai YD, et al. (2011) Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties. PLoS One 6: e14556.
- View Article
- Google Scholar
4. Delprato A (2012) Topological and functional properties of the small GTPases protein interaction network. PLoS One 7: e44882.
- View Article
- Google Scholar
5. Resendis-Antonio O, Hernández M, Mora Y, Encarnación S (2012) Functional modules, structural topology, and optimal activity in metabolic networks. PLoS Computational Biology 8: e1002720.
- View Article
- Google Scholar
6. Milenković T, Memišević V, Bonato A, Pržulj N (2011) Dominating biological networks. PloS One 6: e23016.
- View Article
- Google Scholar
7. Junker BH, Schreiber F (2008) Analysis of biological networks. Wiley-Interscience, volume 2: 31–59.
- View Article
- Google Scholar
8. Newman ME (2002) The structure and function of networks. Computer Physics Communications 147: 40–45.
- View Article
- Google Scholar
9. Mueller L, Kugler K, Dander A, Graber A, Dehmer M (2011) QuACN: an R package for analyzing complex biological networks quantitatively. Bioinformatics 27: 140–141.
- View Article
- Google Scholar
10. Wiener H (1947) Structural determination of paraffin boiling points. Journal of the American Chemical Society 69: 17–20.
- View Article
- Google Scholar
11. Wiener H (1947) Correlation of heats of isomerization, and differences in heats of vaporization of isomers, among the paraffin hydrocarbons. Journal of the American Chemical Society 69: 2636–2638.
- View Article
- Google Scholar
12. Plavšić D, Nikolić S, Trinajstić N, Mihalić Z (1993) On the harary index for the characterization of chemical graphs. Journal of Mathematical Chemistry 12: 235–250.
- View Article
- Google Scholar
13. Randić M (1993) Novel molecular descriptor for structureproperty studies. Chemical Physics Letters 211: 478–483.
- View Article
- Google Scholar
14. Zhang Y, Gutman I, Liu J, Mu Z (2012) q-analog of wiener index. Match: Communications in Mathematical and Computer Chemistry 67: 347.
- View Article
- Google Scholar
15. Hosoya H (1988) On some counting polynomials in chemistry. Discrete Applied Mathematics 19: 239–257.
- View Article
- Google Scholar
16. Brückler F, Došlić T, Graovac A, Gutman I (2011) On a class of distance-based molecular structure descriptors. Chemical Physics Letters 503: 336–338.
- View Article
- Google Scholar
17. Balaban A (1982) Highly discriminating distance-based topological index. Chemical Physics Letters 89: 399–404.
- View Article
- Google Scholar
18. Dehmer M, Mowshowitz A (2011) A history of graph entropy measures. Information Sciences 181: 57–78.
- View Article
- Google Scholar
19. Dehmer M, Varmuza K, Borgert S, Emmert-Streib F (2009) On entropy-based molecular descriptors: Statistical analysis of real and synthetic chemical structures. Journal of Chemical Information and Modeling 49: 1655–1663.
- View Article
- Google Scholar
20. Dehmer M (2008) Information processing in complex networks: Graph entropy and information functionals. Applied Mathematics and Computation 201: 82–94.
- View Article
- Google Scholar
21. Todeschini R, Consonni V (2009) Molecular descriptors for chemoinformatics. Wiley-VCH.
22. Schmuck NS, Wagner SG, Wang H (2012) Greedy trees, caterpillars, and wiener-type graph invariants. Match-Communications in Mathematical and Computer Chemistry 68: 273.
- View Article
- Google Scholar
23. Soltés L (1991) Transmission in graphs: a bound and vertex removing. Math Slovaca 41: 11–16.
- View Article
- Google Scholar
24. Dobrynin A, Entringer R, Gutman I (2001) Wiener index of trees: theory and applications. Acta Applicandae Mathematicae 66: 211–249.
- View Article
- Google Scholar
25. Gutman I, Linert W, Lukovits I, Dobrynin A (1997) Trees with extremal hyper-wiener index: Mathematical basis and chemical applications. Journal of Chemical Information and Computer Sciences 37: 349–354.
- View Article
- Google Scholar
26. Resendis-Antonio O, Hernández M, Mora Y, Encarnación S (1931) Gráfok és mátrixok. Matematikai és Fizikai Lapok 38: 116–119.
- View Article
- Google Scholar
27. Wagnera S, Wangb H, Zhangc XD (2013) Distance-based graph invariants of trees and the harary index. Filomat 27: 41–50.
- View Article
- Google Scholar
28. Gutman I (1997) A property of the wiener number and its modifications. Indian journal of chemistry Sect A: Inorganic, Physical, Theoretical & Analytical 36: 128–132.
- View Article
- Google Scholar
29. Todeschini R, Consonni V (2008) Handbook of molecular descriptors. Wiley-Vch.
30. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393: 440–442.
- View Article
- Google Scholar
31. Gutman I, Popović L, et al. (1998) Graph representation of organic molecules Cayley’s plerograms vs. his kenograms. Journal of the Chemical Society, Faraday Transactions 94: 857–860.
- View Article
- Google Scholar
32. Ward Jr JH (1963) Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58: 236–244.
- View Article
- Google Scholar
33. R Development Core Team (2012) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/. ISBN 3-900051-07-0.
34. Erdős P, Rényi A (1959) On random graphs. Publicationes Mathematicae Debrecen 6: 290–297.
- View Article
- Google Scholar
35. Erdős P, Rényi A (1960) On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5: 17–61.
- View Article
- Google Scholar
36. Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286: 509–512.
- View Article
- Google Scholar
37. Pržulj N, Corneil DG, Jurisica I (2004) Modeling interactome: scale-free or geometric? Bioinformatics 20: 3508–3515.
- View Article
- Google Scholar
38. Rand WM (1971) Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66: 846–850.
- View Article
- Google Scholar
39. Csardi G, Nepusz T (2006) The igraph software package for complex network research. InterJournal, Complex Systems 1695: 38.
- View Article
- Google Scholar
40. Ren G, Liu Z (2013) NetCAD: a network analysis tool for coronary artery disease-associated PPI network. Bioinformatics 29: 279–280.
- View Article
- Google Scholar
41. Barbour AD, Reinert G (2011) The shortest distance in random multi-type intersection graphs. Random Structures & Algorithms 39: 179–209.
- View Article
- Google Scholar
42. Fronczak A, Fronczak P, Ho lyst JA (2004) Average path length in random networks. Physical Review E 70: 056110.
- View Article
- Google Scholar

[ref1] 1. Vidal M, Cusick ME, Barabasi AL (2011) Interactome networks and human disease. Cell 144: 986–998.
View Article
Google Scholar

[2] View Article

[3] Google Scholar

[ref2] 2. Barabasi AL, Gulbahce N, Loscalzo J (2011) Network medicine: a network-based approach to human disease. Nature Reviews Genetics 12: 56–68.
View Article
Google Scholar

[5] View Article

[6] Google Scholar

[ref3] 3. Hu L, Huang T, Shi X, Lu WC, Cai YD, et al. (2011) Predicting functions of proteins in mouse based on weighted protein-protein interaction network and protein hybrid properties. PLoS One 6: e14556.
View Article
Google Scholar

[8] View Article

[9] Google Scholar

[ref4] 4. Delprato A (2012) Topological and functional properties of the small GTPases protein interaction network. PLoS One 7: e44882.
View Article
Google Scholar

[11] View Article

[12] Google Scholar

[ref5] 5. Resendis-Antonio O, Hernández M, Mora Y, Encarnación S (2012) Functional modules, structural topology, and optimal activity in metabolic networks. PLoS Computational Biology 8: e1002720.
View Article
Google Scholar

[14] View Article

[15] Google Scholar

[ref6] 6. Milenković T, Memišević V, Bonato A, Pržulj N (2011) Dominating biological networks. PloS One 6: e23016.
View Article
Google Scholar

[17] View Article

[18] Google Scholar

[ref7] 7. Junker BH, Schreiber F (2008) Analysis of biological networks. Wiley-Interscience, volume 2: 31–59.
View Article
Google Scholar

[20] View Article

[21] Google Scholar

[ref8] 8. Newman ME (2002) The structure and function of networks. Computer Physics Communications 147: 40–45.
View Article
Google Scholar

[23] View Article

[24] Google Scholar

[ref9] 9. Mueller L, Kugler K, Dander A, Graber A, Dehmer M (2011) QuACN: an R package for analyzing complex biological networks quantitatively. Bioinformatics 27: 140–141.
View Article
Google Scholar

[26] View Article

[27] Google Scholar

[ref10] 10. Wiener H (1947) Structural determination of paraffin boiling points. Journal of the American Chemical Society 69: 17–20.
View Article
Google Scholar

[29] View Article

[30] Google Scholar

[ref11] 11. Wiener H (1947) Correlation of heats of isomerization, and differences in heats of vaporization of isomers, among the paraffin hydrocarbons. Journal of the American Chemical Society 69: 2636–2638.
View Article
Google Scholar

[32] View Article

[33] Google Scholar

[ref12] 12. Plavšić D, Nikolić S, Trinajstić N, Mihalić Z (1993) On the harary index for the characterization of chemical graphs. Journal of Mathematical Chemistry 12: 235–250.
View Article
Google Scholar

[35] View Article

[36] Google Scholar

[ref13] 13. Randić M (1993) Novel molecular descriptor for structureproperty studies. Chemical Physics Letters 211: 478–483.
View Article
Google Scholar

[38] View Article

[39] Google Scholar

[ref14] 14. Zhang Y, Gutman I, Liu J, Mu Z (2012) q-analog of wiener index. Match: Communications in Mathematical and Computer Chemistry 67: 347.
View Article
Google Scholar

[41] View Article

[42] Google Scholar

[ref15] 15. Hosoya H (1988) On some counting polynomials in chemistry. Discrete Applied Mathematics 19: 239–257.
View Article
Google Scholar

[44] View Article

[45] Google Scholar

[ref16] 16. Brückler F, Došlić T, Graovac A, Gutman I (2011) On a class of distance-based molecular structure descriptors. Chemical Physics Letters 503: 336–338.
View Article
Google Scholar

[47] View Article

[48] Google Scholar

[ref17] 17. Balaban A (1982) Highly discriminating distance-based topological index. Chemical Physics Letters 89: 399–404.
View Article
Google Scholar

[50] View Article

[51] Google Scholar

[ref18] 18. Dehmer M, Mowshowitz A (2011) A history of graph entropy measures. Information Sciences 181: 57–78.
View Article
Google Scholar

[53] View Article

[54] Google Scholar

[ref19] 19. Dehmer M, Varmuza K, Borgert S, Emmert-Streib F (2009) On entropy-based molecular descriptors: Statistical analysis of real and synthetic chemical structures. Journal of Chemical Information and Modeling 49: 1655–1663.
View Article
Google Scholar

[56] View Article

[57] Google Scholar

[ref20] 20. Dehmer M (2008) Information processing in complex networks: Graph entropy and information functionals. Applied Mathematics and Computation 201: 82–94.
View Article
Google Scholar

[59] View Article

[60] Google Scholar

[ref21] 21. Todeschini R, Consonni V (2009) Molecular descriptors for chemoinformatics. Wiley-VCH.

[ref22] 22. Schmuck NS, Wagner SG, Wang H (2012) Greedy trees, caterpillars, and wiener-type graph invariants. Match-Communications in Mathematical and Computer Chemistry 68: 273.
View Article
Google Scholar

[63] View Article

[64] Google Scholar

[ref23] 23. Soltés L (1991) Transmission in graphs: a bound and vertex removing. Math Slovaca 41: 11–16.
View Article
Google Scholar

[66] View Article

[67] Google Scholar

[ref24] 24. Dobrynin A, Entringer R, Gutman I (2001) Wiener index of trees: theory and applications. Acta Applicandae Mathematicae 66: 211–249.
View Article
Google Scholar

[69] View Article

[70] Google Scholar

[ref25] 25. Gutman I, Linert W, Lukovits I, Dobrynin A (1997) Trees with extremal hyper-wiener index: Mathematical basis and chemical applications. Journal of Chemical Information and Computer Sciences 37: 349–354.
View Article
Google Scholar

[72] View Article

[73] Google Scholar

[ref26] 26. Resendis-Antonio O, Hernández M, Mora Y, Encarnación S (1931) Gráfok és mátrixok. Matematikai és Fizikai Lapok 38: 116–119.
View Article
Google Scholar

[75] View Article

[76] Google Scholar

[ref27] 27. Wagnera S, Wangb H, Zhangc XD (2013) Distance-based graph invariants of trees and the harary index. Filomat 27: 41–50.
View Article
Google Scholar

[78] View Article

[79] Google Scholar

[ref28] 28. Gutman I (1997) A property of the wiener number and its modifications. Indian journal of chemistry Sect A: Inorganic, Physical, Theoretical & Analytical 36: 128–132.
View Article
Google Scholar

[81] View Article

[82] Google Scholar

[ref29] 29. Todeschini R, Consonni V (2008) Handbook of molecular descriptors. Wiley-Vch.

[ref30] 30. Watts DJ, Strogatz SH (1998) Collective dynamics of ‘small-world’ networks. Nature 393: 440–442.
View Article
Google Scholar

[85] View Article

[86] Google Scholar

[ref31] 31. Gutman I, Popović L, et al. (1998) Graph representation of organic molecules Cayley’s plerograms vs. his kenograms. Journal of the Chemical Society, Faraday Transactions 94: 857–860.
View Article
Google Scholar

[88] View Article

[89] Google Scholar

[ref32] 32. Ward Jr JH (1963) Hierarchical grouping to optimize an objective function. Journal of the American Statistical Association 58: 236–244.
View Article
Google Scholar

[91] View Article

[92] Google Scholar

[ref33] 33. R Development Core Team (2012) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. URL http://www.R-project.org/. ISBN 3-900051-07-0.

[ref34] 34. Erdős P, Rényi A (1959) On random graphs. Publicationes Mathematicae Debrecen 6: 290–297.
View Article
Google Scholar

[95] View Article

[96] Google Scholar

[ref35] 35. Erdős P, Rényi A (1960) On the evolution of random graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences 5: 17–61.
View Article
Google Scholar

[98] View Article

[99] Google Scholar

[ref36] 36. Barabási AL, Albert R (1999) Emergence of scaling in random networks. Science 286: 509–512.
View Article
Google Scholar

[101] View Article

[102] Google Scholar

[ref37] 37. Pržulj N, Corneil DG, Jurisica I (2004) Modeling interactome: scale-free or geometric? Bioinformatics 20: 3508–3515.
View Article
Google Scholar

[104] View Article

[105] Google Scholar

[ref38] 38. Rand WM (1971) Objective criteria for the evaluation of clustering methods. Journal of the American Statistical Association 66: 846–850.
View Article
Google Scholar

[107] View Article

[108] Google Scholar

[ref39] 39. Csardi G, Nepusz T (2006) The igraph software package for complex network research. InterJournal, Complex Systems 1695: 38.
View Article
Google Scholar

[110] View Article

[111] Google Scholar

[ref40] 40. Ren G, Liu Z (2013) NetCAD: a network analysis tool for coronary artery disease-associated PPI network. Bioinformatics 29: 279–280.
View Article
Google Scholar

[113] View Article

[114] Google Scholar

[ref41] 41. Barbour AD, Reinert G (2011) The shortest distance in random multi-type intersection graphs. Random Structures & Algorithms 39: 179–209.
View Article
Google Scholar

[116] View Article

[117] Google Scholar

[ref42] 42. Fronczak A, Fronczak P, Ho lyst JA (2004) Average path length in random networks. Physical Review E 70: 056110.
View Article
Google Scholar

[119] View Article

[120] Google Scholar

Correction

Figures

Abstract

Introduction

Methods

Definitions and Terminologies

Effect of Number of Nodes on Wiener Type Indices

Main Idea

Results

Related Work

Important Special Cases

Applications

Experiments

Experiment 1: Hierarchical Clustering of Random Networks

Experiment 2: Hierarchical Clustering of Trees

Experiment 3: Hierarchical Clustering of Random Networks and Trees

Details on Generating Random Networks

ER Model

SF Model

Geometric Model

Conclusions

Supporting Information

Figure S1.

Figure S2.

Figure S3.

Text S1.

Acknowledgments

Author Contributions

References