Sharp Bounds and Normalization of Wiener-Type Indices

Complex networks abound in the physical, biological and social sciences. Quantifying a network's topological structure facilitates network exploration and analysis, as well as network comparison, clustering and classification. A number of Wiener type indices have recently been incorporated as distance-based descriptors of complex networks, for example in the R package QuACN. Wiener type indices are known to depend on both the number of nodes and the topology of a network. To apply these indices to measure the similarity of networks with different numbers of nodes, they must be normalized to correct for the effect of the number of nodes. This paper aims to fill this gap. Moreover, we introduce an $f$-Wiener index of a network $G$, denoted by $W_f(G)$. This notion generalizes the Wiener index to a very wide class of Wiener type indices that includes all known Wiener type indices. We identify the maximum and minimum of $W_f(G)$ over a set of networks with $n$ nodes. We then introduce our normalized version of the $f$-Wiener index. In a number of experiments, the normalized $f$-Wiener indices significantly improved hierarchical clustering over their non-normalized counterparts.


Introduction
Recent years have witnessed exponential growth in available biological network data. Thanks to breakthroughs in biotechnology over the past decades, researchers are now able to interrogate molecular interactions at the systems level. It has since been observed that topological properties of these networks provide important insight into the functions of proteins and their relationships with one another [1][2][3][4][5][6][7][8]. For example, degree distribution, average clustering coefficient, diameter, centrality, lethality and graphlet distribution have been extensively studied. Ideally, given a carefully chosen list of network topological properties and methods for quantifying them, a complex network can be adequately summarized as a numerical $d$-dimensional vector, where $d$ is the number of topological properties under consideration. This representation enables us to take full advantage of a host of classification and clustering techniques to compare complex networks. A significant step in this direction is the introduction of the R package QuACN by Mueller et al. [9]. QuACN computes the values of different categories of descriptors of a network. One such category is the distance-based descriptors, which include the Wiener index, the Harary index, etc. The use of the Wiener index and related indices dates back to the seminal work of Wiener in 1947 [10,11]. Wiener introduced his celebrated index to predict physical properties of paraffin isomers, such as boiling points, heats of isomerization and differences in heats of vaporization, from their chemical structures. Viewing the chemical structure of an isomer as a connected graph, the Wiener index is defined as $\sum_{i<j} d(i,j)$, where $i,j$ represent nodes in the graph, $d(i,j)$ is the distance between nodes $i$ and $j$, defined as the length of a shortest path between them, and the sum is over all pairs of nodes in the graph. The Wiener index has since inspired many distance-based descriptors in chemometrics.
These include the Harary index [12], the hyper Wiener index [13], the q-analog of the Wiener index [14], the Wiener polynomial [15], the Q-index [16], the Balaban J index [17], and information indices [18][19][20]. These indices, commonly called descriptors, play significant roles in quantitative structure-activity relationship/quantitative structure-property relationship (QSAR/QSPR) models [21].
It is known that Wiener type indices depend on both a network's number of nodes and its topology. When the numbers of nodes in the networks are equal, as in applications to isomers, these indices provide informative measures of the branching of the networks and hence a fair comparison among them. However, when they are used to measure similarities of networks with different numbers of nodes, the intended measure of topological structure is masked by the sizes of the networks. Normalizing a Wiener type index is expected to minimize the effect of the network's number of nodes and hence to bring out its topological structure better. Furthermore, it is desirable for the normalized index to take values on an absolute scale for better understanding and interpretation. This paper seeks to fill this gap. The normalization introduced in Definition 2 below fulfils this purpose. This definition would be of limited practical value if the sharp upper and lower bounds of the index on a graph could not be found explicitly. The objective of this article is three-fold. First, we introduce a very general Wiener type index. We call it the $f$-Wiener index, and denote it by $W_f(G)$ for a graph $G$. This definition includes all known Wiener type indices as special cases. Second, we identify the maximum and minimum values of $W_f(G)$ over the class of connected networks or the class of connected trees. We are able to derive explicit formulas for these optimal values. Third, we propose a normalized version, $W^*_f(G)$, which takes values in $[0,1]$ for better interpretation and network comparison. This paper is organized as follows. We first introduce some standard graph-theoretic notation and recall some special graphs. We then introduce the functional analog of the Wiener index, $W_f(G)$, and our proposed normalized versions of this functional Wiener index in the method section. In the result section, we provide our main results, Theorems 1 to 4.
Theorem 1 gives the maximum and the minimum of $W_f(G)$ over the set of connected graphs with $n$ nodes, and characterizes the graphs achieving the maximum or the minimum. Theorem 2 gives a parallel result when the maximum and minimum are taken over the set of trees with $n$ nodes. Theorem 3 (respectively, Theorem 4) identifies the maximum of $W_f(G)$ over the set of trees (respectively, connected graphs) with $n$ nodes and a specified maximum degree. We also give a brief description of related work. Then, we consider special cases of $f$ in $W_f(G)$ to provide explicit expressions for the maximum and the minimum of the Wiener, Harary, hyper Wiener and generalized Wiener indices. In the experiment section, we report the performance of hierarchical clustering based on the usual Wiener type indices and on their normalized versions. We end with the conclusions.

Definitions and Terminologies
Let $G=(V,E)$ be a simple (that is, with no self-loops and no multiple edges) connected graph on $n$ nodes, where $V=\{1,\ldots,n\}$ and $E\subseteq V\times V$. Denote by $N(G)$ the number of nodes in $G$. Let $\mathcal{G}_n$ denote the set of all simple, connected graphs with $n$ nodes. A connected graph having no cycles is called a tree, and we let $\mathcal{T}_n$ denote the set of all trees with $n$ nodes. The distance $d(i,j)$ between any pair of nodes $i$ and $j$ in $G$ is the number of edges in a shortest path from $i$ to $j$. Let $D(G)=[d(i,j)]_{1\le i,j\le n}$ be the distance matrix. We denote the maximum degree of $G$ by $\Delta(G)$. Figure 1 shows some special graphs we frequently refer to in this paper. A path graph, $P_n$, is a graph that can be drawn so that all of its vertices and edges lie on a straight line. Figure 1(a) shows $P_8$. A star, $S_n$, is a tree with one internal node and $n-1$ leaves. $S_8$ is shown in Figure 1(b). A complete graph, $K_n$, is a graph with $n$ nodes in which every pair of distinct nodes is connected by an edge. A caterpillar, $C_{n,k}$, is a tree with a central path whose number of nodes lies in $[n/(k+1),\,(n+k)/(k+1)]$, where at most one end node of the central path has fewer than $k$ leaves and each of the other nodes in the central path has $k$ leaves. Figures 1(d) and 1(e) show caterpillars $C_{12,2}$ and $C_{8,3}$ respectively. A broom $B_{n,k}$ is a tree joining a star $S_{k+1}$ and a path $P_{n-k-1}$ by attaching a pendant node (or leaf) of $P_{n-k-1}$ to a pendant node of $S_{k+1}$. For example, brooms $B_{8,4}$ and $B_{8,5}$ are shown in Figures 1(f) and 1(g) respectively. A kite $K_{n,\ell}$ is a graph obtained by connecting a node of a complete graph $K_\ell$ to an end node of a path $P_{n-\ell}$. Figure 1(h) shows a kite $K_{8,4}$.
Throughout this paper, $f$ denotes a monotone function defined on the nonnegative integers. We define a functional analog of the Wiener index below, which we abbreviate as the $f$-Wiener index. Our definition contains the Wiener index, Harary index, hyper Wiener index, compactness, average efficiency, generalized Wiener index, Wiener polynomial, Q-index and q-analog of the Wiener index as special cases. For details, see the subsection Important Special Cases. As an anonymous reviewer of this article pointed out, this definition has also been independently introduced by Schmuck et al. [22].
Definition 1 The $f$-Wiener index of $G$ is $W_f(G)=\sum_{1\le i<j\le n} f(d(i,j))$. Here $d(i,j)$ denotes the shortest distance between nodes $i$ and $j$.
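As an illustration of the definition, the following minimal Python sketch computes the $f$-Wiener index as the sum of $f(d(i,j))$ over unordered node pairs, with distances obtained by breadth-first search. The adjacency-dictionary representation and the helper names (`bfs_distances`, `f_wiener`, `path_graph`, `star_graph`) are assumptions made for this example and are not part of the paper or of QuACN.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path lengths (edge counts) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def f_wiener(adj, f):
    """Sum f(d(i, j)) over unordered pairs i < j of a connected graph."""
    nodes = sorted(adj)
    total = 0
    for i in nodes:
        dist = bfs_distances(adj, i)
        if len(dist) != len(nodes):
            raise ValueError("graph must be connected")
        total += sum(f(dist[j]) for j in nodes if j > i)
    return total


def path_graph(n):
    """Path P_n on nodes 0..n-1 (hypothetical helper for the example)."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def star_graph(n):
    """Star S_n with centre 0 and leaves 1..n-1 (hypothetical helper)."""
    adj = {0: list(range(1, n))}
    for i in range(1, n):
        adj[i] = [0]
    return adj


# Taking f(k) = k gives the classical Wiener index.
print(f_wiener(path_graph(4), lambda k: k))  # 10
print(f_wiener(star_graph(5), lambda k: k))  # 16
```

Any other choice of $f$, for example `lambda k: 1 / k`, can be passed in the same way.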
The number of nodes of G has a very strong effect on Wiener type indices (see Results section). In order to apply f -Wiener index for comparing networks, which often differ in the numbers of nodes, we are led to propose a normalized version for graphs and a normalized version for trees for better interpretation of the index.
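Definition 2 For $G\in\mathcal{G}_n$, the normalized $f$-Wiener index is $W^*_f(G)=\dfrac{M_f-W_f(G)}{M_f-m_f}$, where $M_f$ and $m_f$ are the maximum and the minimum of $W_f$ over $\mathcal{G}_n$. For a tree $T\in\mathcal{T}_n$, the normalized version is defined in the same way with the maximum and the minimum taken over $\mathcal{T}_n$. These normalized versions would be of limited practical value if one could not compute $M_f$ and $m_f$. Our main results, stated in Theorems 1 and 2, show that these optimal upper and lower bounds can be easily computed. Moreover, they characterize the graphs which attain the maximum or the minimum.
By definition, $W^*_f(G)$ takes values in $[0,1]$. When $f$ is a nondecreasing function, Theorem 1 below shows that a value of $W^*_f(G)$ close to 0 (respectively, 1) suggests that $G$ looks like a path graph (respectively, a complete graph). Hence the numerical value of $W^*_f(G)$ gives an indication of what $G$ looks like.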

Effect of Number of Nodes on Wiener Type Indices
It is known that the Wiener index of a connected graph with $n$ nodes ranges from $n(n-1)/2$ to $n(n-1)(n+1)/6$ (see Corollary 5 below or [23][24][25]). This wide range can be undesirable when the index is used for comparing graphs with different numbers of nodes. For example, consider two path graphs, $P_4$ and $P_5$, with 4 and 5 nodes respectively, and a star graph with 5 nodes, $S_5$. The values of the Wiener index for $P_4$, $P_5$ and $S_5$ are respectively 10, 20 and 16, giving the false impression that $P_5$ and $S_5$ are more similar than $P_4$ and $P_5$. However, the values of the normalized Wiener index (normalized over trees) are 0 for $P_4$ and $P_5$, and 1 for $S_5$. This example is far from being an isolated case. It can be shown that if the number of nodes of a path graph is at least 26% more than the number of nodes in another path graph, there exists a star graph whose Wiener index is closer to that of the path graph with the smaller number of nodes.
The normalized Wiener index of $S_n$, the star with $n$ nodes, is $1-3/n$. This suggests that, based on the normalized Wiener index, a star $S_n$ with sufficiently large $n$ is very similar to a complete graph. This is concordant with the fact that $K_n$ is the line graph of $S_{n+1}$ [26].
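The numbers quoted in this section can be checked with a short Python sketch. It assumes the normalization $(M - W)/(M - m)$, with $M$ and $m$ the largest and smallest Wiener index over the class considered; this form is an assumption chosen because it reproduces the values 0, 1 and $1-3/n$ stated above. The closed-form pair counts used for paths, stars and complete graphs are elementary, and the function names are hypothetical.

```python
from fractions import Fraction


def w_path(n, f):
    """W_f(P_n): the path has n - k node pairs at distance k."""
    return sum((n - k) * f(k) for k in range(1, n))


def w_complete(n, f):
    """W_f(K_n): all n(n-1)/2 pairs are at distance 1."""
    return Fraction(n * (n - 1), 2) * f(1)


def w_star(n, f):
    """W_f(S_n): n-1 pairs at distance 1 and (n-1)(n-2)/2 pairs at distance 2."""
    return (n - 1) * f(1) + Fraction((n - 1) * (n - 2), 2) * f(2)


def normalize(value, lowest, highest):
    """Assumed normalization: 0 at the maximizer, 1 at the minimizer."""
    return Fraction(highest - value) / Fraction(highest - lowest)


def identity(k):
    return k  # f(k) = k gives the Wiener index


# Raw Wiener indices of P_4, P_5 and S_5.
print(w_path(4, identity), w_path(5, identity), w_star(5, identity))  # 10 20 16

# Normalized over trees (minimum at the star, maximum at the path).
print(normalize(w_star(5, identity), w_star(5, identity), w_path(5, identity)))  # 1

# Normalized over connected graphs (minimum at K_n, maximum at P_n).
for n in (5, 10, 100):
    print(n, normalize(w_star(n, identity), w_complete(n, identity), w_path(n, identity)))
```

Exact rational arithmetic is used so that the comparison with $1-3/n$ is not affected by rounding.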

Main Idea
A key ingredient in our proofs is a matrix majorization argument (see supporting information file Text S1 for the definition). Given a connected graph $G$, we can transform it into another graph $G'$ such that the distance matrix of $G$, $D(G)=[d(i,j)]_{1\le i,j\le n}$, majorizes the corresponding distance matrix of $G'$. Since the Wiener index of $G$, or its generalization the $f$-Wiener index for an increasing function $f$, is the sum of ($f$ applied to) the upper-diagonal entries of the distance matrix, it follows that $W_f(G)\ge W_f(G')$. The construction of $G'$ is fairly straightforward, as can be seen in the proofs. The construction of $G''$ such that $D(G)$ is majorized by $D(G'')$ requires delicate and judicious pruning and regrafting. However, the essential idea remains the same. Technical details of the proofs are given in supporting information file Text S1.

Results
We provide explicit expressions for the maximum and minimum of $W_f(G)$ over $\mathcal{G}_n$ and over $\mathcal{T}_n$ in Theorems 1 and 2 below. We also characterize the graphs or trees attaining the extrema. Theorems 3 and 4 concern trees or graphs with a specified maximum degree. For simplicity of presentation, we state our results only for non-decreasing functions $f$. Analogous results for non-increasing $f$ can be deduced easily by replacing $f$ by $-f$.
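Theorem 1 Let $f$ be a non-decreasing function on the nonnegative integers, and $G\in\mathcal{G}_n$. Then $$\frac{n(n-1)}{2}\,f(1)\;\le\;W_f(G)\;\le\;\sum_{k=1}^{n-1}(n-k)\,f(k).$$ The lower bound is attained if and only if $G$ is $K_n$. The upper bound is attained if and only if $G$ is $P_n$.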
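Theorem 2 Let $f$ be a non-decreasing function on the nonnegative integers, and $T\in\mathcal{T}_n$. Then $$(n-1)\,f(1)+\frac{(n-1)(n-2)}{2}\,f(2)\;\le\;W_f(T)\;\le\;\sum_{k=1}^{n-1}(n-k)\,f(k).$$ The lower bound is attained if and only if $T$ is $S_n$. The upper bound is attained if and only if $T$ is $P_n$.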
Theorem 3 Let $f$ be a non-decreasing function on the nonnegative integers. Then, for any $T\in\mathcal{T}_n$ with $\Delta(T)=k$, we have $W_f(T)\le W_f(B_{n,k+1})$. The upper bound is attained if and only if $T$ is a broom $B_{n,k+1}$.
Theorem 4 Let $f$ be a non-decreasing function on the nonnegative integers. Then, for any $G\in\mathcal{G}_n$ with $\Delta(G)=k$, we have $W_f(G)\le W_f(B_{n,k+1})$. Moreover, equality holds if and only if $G$ is $B_{n,k+1}$.
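The extremal statement for connected graphs (minimum at the complete graph, maximum at the path) can be checked by brute force on a small case. The Python sketch below enumerates every labelled connected graph on 5 nodes and uses $f(k)=k^2$ as one example of a strictly increasing function. It is an illustrative check under these assumptions, not part of the proofs in Text S1, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def w_f(n, edges, f):
    """Return W_f of the graph on nodes 0..n-1, or None if it is disconnected."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    total = 0
    for s in range(n):
        dist = [-1] * n
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if dist[v] < 0:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if min(dist) < 0:
            return None  # some node was not reached
        total += sum(f(dist[t]) for t in range(s + 1, n))
    return total


def extremes(n, f):
    """Scan all labelled connected graphs; return (min, minimizers, max, maximizers)."""
    all_pairs = list(combinations(range(n), 2))
    values = []
    for m in range(n - 1, len(all_pairs) + 1):  # a connected graph needs >= n-1 edges
        for edges in combinations(all_pairs, m):
            w = w_f(n, edges, f)
            if w is not None:
                values.append((w, edges))
    lo = min(w for w, _ in values)
    hi = max(w for w, _ in values)
    return (lo, [e for w, e in values if w == lo],
            hi, [e for w, e in values if w == hi])


def is_path(n, edges):
    """A connected graph with n-1 edges and maximum degree <= 2 is a path."""
    degree = [0] * n
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return len(edges) == n - 1 and max(degree) <= 2


n = 5
lo, minimizers, hi, maximizers = extremes(n, lambda k: k * k)
print(lo, len(minimizers))  # minimum value and number of labelled minimizers
print(hi, len(maximizers), all(is_path(n, e) for e in maximizers))
```

For $n=5$ the scan covers 848 candidate edge sets, so it runs in well under a second; the approach does not scale beyond very small $n$.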

Related Work
The proofs of Theorems 1 to 4 are given in supporting information file Text S1. Theorem 2 has also been independently obtained by Wagner et al. (see Theorem 2.7 and Corollary 4.1 in [27]). Special cases of Theorems 1 to 4 for particular Wiener type indices are known in the literature. For example, the complete graph (respectively, the path graph) is shown to be the minimizer (respectively, maximizer) of the Wiener index among simple connected graphs with the same number of nodes in [23][24][25]. Similar conclusions are proved to hold for the hyper Wiener index in [25], and for the Harary index in [28]. To the best of the authors' knowledge, the results in Theorems 1 to 4 in their full generality for the $f$-Wiener index are novel. Moreover, we provide a unifying methodology for the proofs.

Important Special Cases
Since its introduction, the Wiener index has inspired many variants and has been thoroughly studied in a sizeable literature [29]. By choosing appropriate functions $f$, the $f$-Wiener index reduces to a number of commonly used descriptors as follows.
If we take $f(k)=k$, then $W_f(G)$, written as $W(G)$, is the well-studied descriptor introduced by Wiener in 1947 [10,11].
Taking $f(k)=1/k$, the $f$-Wiener index is the Harary index [12], denoted by $H(G)$, which is shown to be more discriminating than the Wiener index [12]. In 2001, Latora and Marchiori [30] used a scaled version of the Harary index (more precisely, $f(k)\propto\frac{1}{n(n-1)k}$) to measure a network's efficiency in information exchange.
Taking $f(k)=k^{\alpha}$, where $\alpha$ can be positive or negative, the $f$-Wiener index is called the generalized Wiener index, denoted by $W_{\alpha}(G)$ [31].
Taking $f(k)=\lambda^{k}$, where $\lambda$ is regarded as a parameter, the $f$-Wiener index is called the Hosoya polynomial or Wiener polynomial [15]. With an additional factor 2, the Hosoya polynomial is called the Q-index and denoted by $Q(\lambda)$ in [16].
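The special cases listed above differ only in the choice of $f$, which the following Python sketch makes concrete. It evaluates each index on the path $P_4$ from its distance counts (3 pairs at distance 1, 2 at distance 2, 1 at distance 3). The dictionary keys, the parameter values $\alpha=2$ and $\lambda=2$, and the helper name are illustrative assumptions.

```python
from fractions import Fraction

# Each descriptor corresponds to one choice of f.
SPECIAL_CASES = {
    "wiener": lambda k: k,                     # f(k) = k
    "harary": lambda k: Fraction(1, k),        # f(k) = 1/k
    "generalized_alpha_2": lambda k: k ** 2,   # f(k) = k^alpha with alpha = 2
    "hosoya_lambda_2": lambda k: 2 ** k,       # f(k) = lambda^k with lambda = 2
}


def f_wiener_from_counts(counts, f):
    """W_f computed from counts[r] = number of node pairs at distance r."""
    return sum(c * f(r) for r, c in counts.items())


# Distance counts of the path P_4: n - k pairs at distance k.
p4_counts = {1: 3, 2: 2, 3: 1}

for name, f in SPECIAL_CASES.items():
    print(name, f_wiener_from_counts(p4_counts, f))
```

Working from distance counts rather than from the graph itself keeps the example short and highlights that every index in this family depends on the graph only through these counts.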

Applications
By specializing $f$ to various forms in Theorems 1 and 2, we provide below explicit sharp upper and lower bounds for the Wiener index $W(G)$, the Harary index $H(G)$, the hyper Wiener index $WW(G)$, and the generalized Wiener index $W_{\alpha}(G)$ for $\alpha>0$ and $\alpha<0$.
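Corollary 5 Let $G$ be a simple, connected graph with $n$ nodes (that is, $G\in\mathcal{G}_n$). Then
$$\frac{n(n-1)}{2}\le W(G)\le\frac{n(n-1)(n+1)}{6},$$
$$\sum_{k=1}^{n-1}\frac{n-k}{k}\le H(G)\le\frac{n(n-1)}{2},$$
$$\frac{n(n-1)}{2}\le WW(G)\le\frac{n(n-1)(n+1)(n+2)}{24},$$
where the hyper Wiener index corresponds to $f(k)=k(k+1)/2$. Moreover, when $\alpha<0$,
$$\sum_{k=1}^{n-1}(n-k)\,k^{\alpha}\le W_{\alpha}(G)\le\frac{n(n-1)}{2},$$
and when $\alpha>0$,
$$\frac{n(n-1)}{2}\le W_{\alpha}(G)\le\sum_{k=1}^{n-1}(n-k)\,k^{\alpha}.$$
Corollary 6 Let $T$ be a tree with $n$ nodes (that is, $T\in\mathcal{T}_n$). Then
$$(n-1)^2\le W(T)\le\frac{n(n-1)(n+1)}{6},$$
$$\sum_{k=1}^{n-1}\frac{n-k}{k}\le H(T)\le\frac{(n-1)(n+2)}{4},$$
$$\frac{(n-1)(3n-4)}{2}\le WW(T)\le\frac{n(n-1)(n+1)(n+2)}{24}.$$
Moreover, when $\alpha<0$,
$$\sum_{k=1}^{n-1}(n-k)\,k^{\alpha}\le W_{\alpha}(T)\le(n-1)+2^{\alpha-1}(n-1)(n-2),$$
and when $\alpha>0$,
$$(n-1)+2^{\alpha-1}(n-1)(n-2)\le W_{\alpha}(T)\le\sum_{k=1}^{n-1}(n-k)\,k^{\alpha}.$$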
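As a worked check of the upper end of the Wiener index range $n(n-1)(n+1)/6$ quoted earlier, the following derivation evaluates the Wiener index of the path $P_n$ directly. It uses only the elementary fact that $P_n$ has exactly $n-k$ node pairs at distance $k$; the intermediate steps are supplied here for illustration and are not taken from Text S1.

```latex
\begin{align*}
W(P_n) &= \sum_{k=1}^{n-1} (n-k)\,k
        = n\sum_{k=1}^{n-1} k \;-\; \sum_{k=1}^{n-1} k^{2} \\
       &= n\cdot\frac{n(n-1)}{2} \;-\; \frac{(n-1)\,n\,(2n-1)}{6} \\
       &= \frac{n(n-1)}{6}\,\bigl(3n-(2n-1)\bigr)
        = \frac{n(n-1)(n+1)}{6}.
\end{align*}
% Sanity check: n = 4 gives 4*3*5/6 = 10 and n = 5 gives 5*4*6/6 = 20,
% matching the values of W(P_4) and W(P_5) quoted in the text.
```

The lower end of the range, $n(n-1)/2$, is immediate, since all $n(n-1)/2$ pairs of nodes in $K_n$ are at distance 1.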

Experiments
We describe below three experiments that compare hierarchical clustering using normalized $f$-Wiener indices with hierarchical clustering using non-normalized $f$-Wiener indices. Each experiment consists of 3 main steps.
Step 1: A collection of networks (or graphs) or trees, C, are chosen to be clustered. The collection is detailed in each experiment below.
Step 2: Seven functions $f$ are chosen to form the $f$-Wiener indices, and the same seven functions are used in all our experiments. Each network $G$ in $C$ is then represented by the vector $v_G$ of its seven $f$-Wiener indices and by the vector $v^*_G$ of the corresponding normalized indices.
Step 3: We apply a clustering algorithm to cluster $C$ using $v_G$ and produce a dendrogram. We do the same using $v^*_G$. Ward's minimum variance method [32], as made available in the R base package [33], was used in all the experiments. The Adjusted Rand Index (ARI) values computed in all the experiments are summarized in Table 1.

Experiment 1: Hierarchical Clustering of Random Networks
The collection of networks chosen for this experiment consists of networks generated by some commonly used random network models, namely, the Erdős-Rényi (ER) model [34,35], the scale-free (SF) network model [36] and the 3-D geometric (GE) model [37]. Each of these random network models is used to generate 10 random networks with the number of nodes ranging from 500 to 950 in steps of 50. Experiment 1 consists of 5 small, but similar, experiments, which we enumerate as 1.1, ..., 1.5. The subsection Details on Generating Random Networks explains how these random networks are generated. We then apply Steps 2 and 3 above to form two dendrograms: one using $f$-Wiener indices without normalization (Figure 2A) and the other using normalized $f$-Wiener indices (Figure 2B). To quantify the classification obtained with and without normalization, we adopt the commonly used Adjusted Rand Index (ARI) [38] for classification validation. The ARI measures the accuracy of a classification and takes values between $-1$ and 1. The larger the ARI, the better the classification. The ARI values for Figures 2A and 2B are respectively 0.18 and 0.56 for Experiment 1.5. Using normalized $f$-Wiener indices leads to a substantial improvement in the classification. We repeat each of Experiments 1.1 to 1.5 1000 times. The boxplots of the ARI are shown in Figure 3. The means and standard deviations for these experiments are given in Table 1. They clearly demonstrate the superiority of classification using normalized $f$-Wiener indices.
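For readers unfamiliar with the ARI, the following Python sketch computes it from two label vectors using the standard pair-counting formula of Hubert and Arabie. It is a minimal stand-alone illustration, not the code used in the experiments (which were run in R); the function name and the toy label vectors are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index of two partitions given as equal-length label lists."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    n = len(labels_true)
    if n < 2:
        raise ValueError("need at least two items")
    # Pairs of items that are together in both partitions.
    together_both = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs together in each partition separately.
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = together_true * together_pred / comb(n, 2)
    maximum = (together_true + together_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are a single cluster
    return (together_both - expected) / (maximum - expected)


# Identical partitions (up to renaming of the labels) score 1.
print(adjusted_rand_index(["ER", "ER", "SF", "SF"], [1, 1, 2, 2]))  # 1.0
# A clustering that cuts across the true classes scores below 0.
print(adjusted_rand_index(["ER", "ER", "SF", "SF"], [1, 2, 1, 2]))  # -0.5
```

In the setting of the experiments, `labels_true` would hold the generating model of each network and `labels_pred` the cluster assigned by cutting the dendrogram.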

Experiment 2: Hierarchical Clustering of Trees
The collection of trees to be classified consists of 10 paths ($P_n$), 10 stars ($S_n$), 10 brooms ($B_{n,n/2}$) and 20 caterpillars ($C_{n,2}$, which is like a path, and $C_{n,(n-10)/10}$, which is like a star), for $n$ ranging from 500 to 950 in steps of 50. Figure 4 shows the two dendrograms. The ARI values for Figures 4A and 4B are respectively 0.10 and 1.00. This demonstrates that using normalized $f$-Wiener indices provides much better accuracy for classification purposes. The result of this experiment is consistent with that of Experiment 1.

Experiment 3: Hierarchical Clustering of Random Networks and Trees
The collection of networks consists of (i) networks generated by three random network models, namely, the ER model, the SF model and the 3-D geometric model; and (ii) some trees, namely paths, brooms, caterpillars and stars. Figure 5 shows the two dendrograms formed. The ARI values for Figures 5A and 5B are respectively 0.04 and 0.86.

Details on Generating Random Networks
We describe here in detail how the networks generated by the three random network models in Experiments 1 and 3 are chosen. Experiment 1 consists of 5 small, but similar, experiments, which we label Experiment 1.1, ..., Experiment 1.5 and which correspond to $p=0.01,\ldots,0.05$ respectively. We now describe Experiment 1.5 in detail.

ER Model
There are two parameters in the ER model, namely, $n$, the number of nodes, and $p$, the probability that an edge is formed between a pair of nodes. All edges are formed independently of each other. In Experiment 1.5, where $p=0.05$, we choose $n$ ranging from 500 to 950 in steps of 50. We generate an ER network with 500 nodes using the 'erdos.renyi.game' function available in the R package igraph [39]. If the network is connected, we keep it in $C$ and denote it by $ER_{500}$. If not, we call 'erdos.renyi.game' again until a connected network is obtained. Similarly, $ER_{550},\ldots,ER_{950}$ are generated.
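The generate-and-retry procedure described above can be sketched in plain Python as follows. This is a stand-in for illustration only: the paper uses the igraph function 'erdos.renyi.game' in R, whereas the functions below (`er_graph`, `is_connected`, `connected_er_graph`) and the `max_tries` safeguard are assumptions introduced for this example.

```python
import random
from collections import deque


def er_graph(n, p, rng):
    """G(n, p): each of the n(n-1)/2 possible edges is present independently with probability p."""
    adj = [[] for _ in range(n)]
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].append(v)
                adj[v].append(u)
    return adj


def is_connected(adj):
    """Breadth-first search from node 0 reaches every node."""
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(adj)


def connected_er_graph(n, p, rng, max_tries=1000):
    """Regenerate until a connected network is obtained (max_tries is a safeguard)."""
    for _ in range(max_tries):
        adj = er_graph(n, p, rng)
        if is_connected(adj):
            return adj
    raise RuntimeError("no connected network found; p may be too small for this n")


rng = random.Random(2013)  # fixed seed so the example is reproducible
network = connected_er_graph(500, 0.05, rng)
print(len(network), sum(len(neighbours) for neighbours in network) // 2)  # nodes, edges
```

Note that discarding disconnected draws conditions the sample on connectivity, so the retained networks follow the ER model restricted to connected graphs.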

SF Model
We also construct ten SF networks using the function 'barabasi.game' available in the R package igraph. We describe how to grow an SF network with 500 nodes for a given $p$, say $p=0.05$. The other 9 SF networks with $550,\ldots,950$ nodes are constructed in a similar manner. In the 'barabasi.game' function, we set the number of vertices to 500, the number of edges added in each time step to $np/2$ rounded to the nearest integer, and the option to create a directed graph to false.

Geometric Model
We generate ten 3-D geometric networks with $500, 550, \ldots, 950$ nodes. We describe how to construct one with 500 nodes; the rest are constructed similarly. We first place 500 nodes in a unit cube uniformly and independently. We then compute all pairwise distances and rank them in ascending order. We choose the shortest $100p\%$ of these pairwise distances and connect the corresponding pairs of nodes. If this network is connected, we keep it in $C$ and denote it by $GE_{500}$. Otherwise, we discard it and repeat the above procedure until we get a connected network. The other networks $GE_{550},\ldots,GE_{950}$ are constructed in the same way.
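The construction of a 3-D geometric network can be sketched in Python as below. The sketch assumes that the retained pairs are the shortest ones after sorting in ascending order and that the number of retained pairs is $p$ times the number of all pairs, rounded to the nearest integer; both the rounding rule and the function names are assumptions made for this illustration.

```python
import random
from collections import deque
from itertools import combinations
from math import dist


def geometric_edges(n, p, rng):
    """Place n points uniformly in the unit cube and link the closest 100p% of pairs."""
    points = [(rng.random(), rng.random(), rng.random()) for _ in range(n)]
    pairs = sorted(combinations(range(n), 2),
                   key=lambda e: dist(points[e[0]], points[e[1]]))
    keep = round(p * len(pairs))  # assumed rounding rule
    return pairs[:keep]


def is_connected(n, edges):
    """Breadth-first search from node 0 reaches all n nodes."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    seen = {0}
    queue = deque([0])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == n


def connected_geometric_edges(n, p, rng, max_tries=1000):
    """Discard disconnected draws and repeat (max_tries is a safeguard)."""
    for _ in range(max_tries):
        edges = geometric_edges(n, p, rng)
        if is_connected(n, edges):
            return edges
    raise RuntimeError("no connected network found; p may be too small for this n")


rng = random.Random(37)  # fixed seed so the example is reproducible
edges = connected_geometric_edges(100, 0.05, rng)
print(len(edges))  # number of retained pairs
```

A smaller network (100 nodes) is used here only to keep the run time short; the procedure is unchanged for 500 to 950 nodes.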

Conclusions
The Wiener index and other Wiener type indices have been commonly applied in chemometrics to relate the structures of molecules to their physicochemical properties. Recently, these indices have been incorporated in tools for quantifying complex networks, such as QuACN [9] and NetCAD [40]. In this article, we first generalize the Wiener index to a general functional form, called the $f$-Wiener index. This $f$-Wiener index contains all well-known Wiener type indices as special cases, such as the Wiener index, Harary index, hyper Wiener index, compactness, and average efficiency. We provide a unifying method to identify its maximum and minimum over the set of simple connected graphs with $n$ nodes, or the set of trees with $n$ nodes (Theorems 1 and 2). Explicit sharp upper and lower bounds for the Wiener index, Harary index, hyper Wiener index and generalized Wiener index are deduced over networks (Corollary 5) and over trees (Corollary 6). Moreover, the maximizer and minimizer are characterized in Theorems 1 and 2. We believe these results are general and of independent interest.
Armed with these maximum and minimum values, we propose a normalized version of the $f$-Wiener index over networks, and a similar version over trees. These normalized versions provide a better interpretation of the indices over networks with varying numbers of nodes than the non-normalized ones. We conduct a number of experiments to compare the clustering performance of normalized $f$-Wiener indices with that of non-normalized $f$-Wiener indices. The results of these experiments consistently demonstrate that using the normalized versions improves clustering substantially. The normalized versions better capture similar topological structures among networks with different numbers of nodes. Our method of optimizing $W_f(G)$ can be easily extended to indices of the form $\Omega(W_f(G))$, where $\Omega$ and $f$ are monotone functions. For example, taking $\Omega(x)=1/x$ and $f(k)=\frac{2}{n(n-1)k}$ leads to $$\Omega(W_f(G))=\left(\frac{2}{n(n-1)}\sum_{i<j}\frac{1}{d(i,j)}\right)^{-1},$$ which measures the small-world behavior of network $G$ [8]. For other descriptors, it is of interest to study whether normalization is needed; if so, how best to normalize them; and to what extent normalization improves network comparison.
Observe that $W_f(G)=\sum_{r=1}^{\infty} f(r)\,n_r(G)=\sum_{r=0}^{\infty}[f(r+1)-f(r)]\,N_r(G)$, where we assume $f(0)=0$, $n_r(G)$ denotes the number of pairs of nodes in $G$ at distance exactly $r$, and $N_r(G)$ the number of pairs of nodes in $G$ at distance greater than $r$. Since in most biological networks the number of nodes is large, one may normalize a scaled version of $W_f(G)$ in terms of the asymptotic distribution of the $N_r$'s under the assumption that the observed network $G$ is generated by a given random network model $M$. This will enable us to determine the likelihood that the observed network is generated by $M$. Currently a fair amount of information about shortest paths in some network models is available in [41,42]. How to make use of these results seems like a worthwhile future project.

Supporting Information
Figure S1 Illustrating the choices of $u_1$, $u_2$ and $u_3$ in Lemma 2. Here $T_1$ has 5 nodes and $T_2$ has 3 nodes. We choose $u_1=3$, $u_2=5$ and $u_3=6$. Tree $T$ is constructed by joining $u_1$ and $u_3$, while $T'$ is constructed by joining $u_2$ and $u_3$. $D(T)$ and $D(T')$ are $8\times 8$ matrices where the first 5 columns correspond to the 5 nodes in $T_1$, and the last 3 rows correspond to the 3 nodes in $T_2$. (TIF)
Figure S2 Illustration of Lemma 3. Here $n=10$, $i=j=5$, $\ell=3$, $k=7$. From the counts of the distances above, it is clear that