## Figures

## Abstract

Biology presents many examples of planar distribution and structural networks having dense sets of closed loops. An archetype of this form of network organization is the vasculature of dicotyledonous leaves, which showcases a hierarchically-nested architecture containing closed loops at many different levels. Although a number of approaches have been proposed to measure aspects of the structure of such networks, a robust metric to quantify their hierarchical organization is still lacking. We present an algorithmic framework, the *hierarchical loop decomposition*, that allows mapping loopy networks to binary trees, preserving in the connectivity of the trees the architecture of the original graph. We apply this framework to investigate computer generated graphs, such as artificial models and optimal distribution networks, as well as natural graphs extracted from digitized images of dicotyledonous leaves and vasculature of rat cerebral neocortex. We calculate various metrics based on the asymmetry, the cumulative size distribution and the Strahler bifurcation ratios of the corresponding trees and discuss the relationship of these quantities to the architectural organization of the original graphs. This algorithmic framework decouples the geometric information (exact location of edges and nodes) from the metric topology (connectivity and edge weight) and it ultimately allows us to perform a quantitative statistical comparison between predictions of theoretical models and naturally occurring loopy graphs.

**Citation: **Katifori E, Magnasco MO (2012) Quantifying Loopy Network Architectures. PLoS ONE 7(6):
e37994.
https://doi.org/10.1371/journal.pone.0037994

**Editor: **Jérémie Bourdon, Université de Nantes, France

**Received: **February 7, 2012; **Accepted: **May 1, 2012; **Published: ** June 6, 2012

**Copyright: ** © 2012 Katifori, Magnasco. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

**Funding: **This work was supported in part by the National Science Foundation under Grant PHY-1058899. http://www.nsf.gov/. EK would like to acknowledge support from the Burroughs-Wellcome Career at the Scientific Interface Award and the Raymond and Beverly Sackler Fellowship. http://www.bwfund.org/. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

**Competing interests: ** The authors have declared that no competing interests exist.

## Introduction

Among the many different classes of complex systems that can primarily be described as “networks”, an important subclass concerns physical networks devoted to transportation of various entities, such as fluids or energy. To some extent, structural load-bearing networks can also be considered in this category, as their job is the distribution of stress-strain. Besides their evident technological importance, these networks are central to the function of living beings; because of their concrete physicality they are sometimes far more accessible to experimental analysis than other important biological networks, and hence offer an important window into the organization and function of naturally evolved large-scale networks.

Many biological distribution and structural networks contain dense numbers of reentrant loops. The venation of angiosperm leaves (Fig. 1) [1], the structural veins of insect wings, the continuously adapting foraging networks of some fungi and slime molds [2], the vasculature of animal organs such as the adrenal glands, the brain [3] and the liver are just a few of a large number of examples where physical networks developed loops in living organisms. These networks perform functions crucial to the survival of the organisms that use them. The hierarchical organization and the intricacies of the architecture of these highly interconnected networks dictate the efficacy in providing support or distributing load under varying conditions. In some cases the function of closed loops and how many there should be is intuitively obvious; the webbing-like veins of a dragonfly wing have cross-bracings that serve to maintain rigidity and resistance with a minimum of weight. In other cases it is not self-evident why there are as many loops as observed.

(a), (b) Leaf vasculature of two dicotyledonous species. (c) Detail of leaf collected from the same plant as leaf (a). The venation of (a) and (c) is predominately reticulate, (b) is percurrent. In general, leaves from the same plant (or species) share statistically similar architectural properties, as compared to leaves from different species. The scale is 1 cm.

In many cases, such as leaf venation, loopy networks evolved gradually from a tree architecture [4]. Various reasons for the evolution of loopiness in biological distribution networks have been proposed [5]–[7]. These networks are the result of developmental processes that frequently dictate not the exact position of each network edge but the overall organization in a statistical sense. For example, one can frequently determine by mere inspection of the leaf venation patterns if two leaves are specimens from two different species (Fig. 1). Similarly, networks produced *in silico* by optimization routines or developmental simulations that incorporate the effects of biological noise exhibit architectures that are to some extent random: each simulation repeat will produce statistically similar, but never identical, networks [8]–[12]. To compare naturally occurring networks with the computer simulated models we therefore need to be able to test the null hypothesis that the two networks in question have been drawn from the same distribution.

Some of the distribution and structural networks in question are planar, i.e. their edges are (or can be) all confined to the same plane and meet only at vertices (no two edges can cross each other). Examples of naturally occurring planar networks include the veins of leaves and insect wings, the loopy arterial network of the mammalian neocortex and many others.

Despite the importance of these planar loopy networks, the arsenal of specialized tools and techniques that can sufficiently capture the architecture is still limited. Instead, so far the scientific focus has been on quantifying and describing the topology of networks with a tree architecture or the connectivity of non-planar complex networks, such as the internet. In particular, work developed since the fifties to describe river networks and dendritic architectures helped establish some powerful measures to describe the topological properties of tree structures. The Horton-Strahler stream ordering system [13], [14] and the asymmetry [15] are two such measures that played a crucial role in understanding the laws that dictate network growth and organization. Invaluable though these measures might be for rivers and dendrites, their definition and usage presuposes a tree architecture and loops destroy their consistency.

Although measures developed for general, non-planar complex networks such as the degree distribution and the community structure in principle work for planar graphs, frequently they are not fine tuned to capture many aspects of the 2-d network organization [16]–[22]. Methods to extract the hierarchical organization of complex networks have focused primarily on the node connectivity [23]. Similarly, other more specialized metrics such as the distribution network entropy [24] are not very informative with regard to quantities of interest in this work, and in particular the hierarchical organization of graphs. Some specialized schemes have been developed to quantify the loopy architecture of dicotyledonous leaves (see e.g. [25]), and though they can reveal important information about leaf physiology and function [26] these methods do not explicitly characterize the nestedness of the topology. Similarly, predominately geometrical methods [27] do not fully capture the hierarchical organization of the graph.

To achieve a meaningful and elegant quantification of highly interconnected and loopy biological networks we need a sufficiently nuanced metric that captures certain important aspects of the topology and architecture of the loopy network without relying on the exact value of the bond strength or geometrical location. Such a metric would allow phenotypic parameter reduction and assignment of numeric values to the level of loop nestedness and other aspects of the architectural organization that are not represented by descriptions that rely on local, scalar quantities (such as histograms of the vein density). More importantly, it will allow a quantitative and topologically based comparison between natural loopy networks and the prediction of optimization models.

In this paper we present an algorithmic framework that allows us to map the architectural organization of a planar graph to that of a binary tree. We then use three metrics widely used for binary trees and examine their properties with regard to the original graph. These metrics are the asymmetry [15], the cumulative size distribution [28], [29] and the Strahler bifurcation ratio [14]. We present results from three classes of networks: computer generated networks (whose building rules are predetermined), networks optimized for known functionals and naturally occurring networks such as leaf veins and the arterial vasculature of the rat neocortex. We finally discuss the advantages and disadvantages of each approach and present future directions and applications.

## Results

We have developed an algorithmic framework that maps a predominately loopy architecture to a dichotomously branching tree. This framework hierarchically decomposes the loopy architecture by succesively deleting edges and joining contiguous loops, and represents this hierarchical decomposition as a tree, termed the *nesting tree*.

In what follows, the term *link* will refer to a graph element that connects two nodes, and the term *edge* will refer to a chain of links, connecting nodes. Each node in an edge is connected to exactly two other nodes, except the nodes at the boundaries of the edge, which can be connected to only one other node (when that edge is the “leaf” of a tree), or three or more other nodes (see inset of Fig. 2). The “edge strength” is a quantity that parametrizes the weight of the edge *J*. If an edge *J* is composed of a chain of links, then can be set to be the edge strength of the weakest of the chain links, the median value, or any other quantity that is of interest. The term *loop* is used to refer to the graph cycles, and the *terminal* or *ultimate* loops are the cycles that do not contain other loops.

The first step consists of pruning all tree-like components of the graph. In the second step we order the list of graph edges based on their width. Here we mark the 5 thinnest edges, ordered based on their weight. In the third step, we remove the weakest edge from the graph. Here this step will result in joining the green with the red loop, to form the yellow facet. The loops are represented as color coded nodes in the nesting tree. We then repeat steps 2 and 3 iteratively, to sequentially remove every edge, and as a consequence, gradually join every loop.

### Algorithm

The hierarchical decomposition framework if graphically represented in Fig. 2. The algorithm begins with a pruning step, where all tree-like components rooted on the loopy graph backbone, if present, are removed from the graph. This step eliminates all vertices that belong only to one edge, and produces a graph where each edge either separates or connects two loops (Fig. 3(a)).

(a) Deletion of an edge in a loopy graph. (i) The deletion of the edge joins two adjacent loops. (ii) The deletion of the edge disconnects the graph. (b) Hierarchical decomposition of a planar graph. Boundary loops sequentially join the outside space, marked as . Left: Nesting tree of the hierarchical decomposition. Right, top to bottom: hierarchically decomposed graph. The bottom right panel corresponds to the full graph, the rest represents the network at different levels of decomposition (the corresponding cutoff level of the tree representation is marked with a gray dashed arrow). As edges of the graph are hierarchically deleted, based on their thickness, the original loops (A–E) are joined to form derived loops (–). (c) Hierarchical decomposition of a planar graph. Phantom boundary loops surround the graph perimeter. Loops contiguous to the perimeter of the graph join a ring of phantom boundary loops. The decomposition proceeds as in (b), but the phantom loops – appear among the loops of the original graph in the tree representation. (d) Building blocks of a loopy architecture. The two basic building blocks of the loopy architecture can be identified using the tree representation of the graph. (i_{1}),(i_{2}): multiplicative nestedness. Nested loops merge hierarchically. (i_{3}): This architecture is represented by “tall” trees. (ii_{1}),(ii_{2}): additive nestedness. Ordered loops join consecutively. (ii_{3}): The tree representation is that of “short” trees. Graphs (i_{1}) and (i_{2}) map equivalently to (i_{3}), similarly graphs (ii_{1}) and (ii_{2}) map equivalently to (ii_{3}). (e) Cumulative size distributions of additive and multiplicative models of nestedness. (i_{1}) Nesting tree for additive nestedness. The degree of each node is is shown. (i_{2}) Degree (size) distribution for additive nestedness. (i_{3}) Cumulatize size distribution for additive nestedness. (ii_{1}) Nesting tree, (ii_{2}) Degree (size) distribution and (ii_{3}) Cumulatize size distribution for multiplicative nestedness.

In the second step of this hierarchical decomposition, we order the list of graph edges based on their width (if the graph edge is composed of a single link) or their edge strength, and identify the edge with the smallest . In this step we assume that the edges can be ordered according to their weight in a strictly monotonic fashion, namely that for every pair of edges *J*,*K*. This is a requirement that can easily be implemented by infinitesimally randomly perturbing or when .

In the third step, we remove the edge with the smallest edge strength from the graph. When an edge separates two contiguous loops, as in Fig. 3(a)(i), then its removal will result in joining the two loops to form a larger one, the area of which is the sum of the areas of the two initial loops. In most cases, this step will also result in joining the remaining edges of the contiguous loops. For example, in Fig. 3(a)(ii), the links AB and BC will be joined to form the edge AC.

We then repeat steps 2 and 3 iteratively, to sequentially remove every edge, and as a consequence, gradually join every loop, and perform what we have termed a *hierarchical decomposition* of the graph. We can represent this procedure with a dichotomously branching tree, as follows. The “leaves” of the tree are the original loops of the full graph, loops A–E in Fig. 3(b), and each node downstream of the leaf nodes represents a larger loop that is formed by joining two upstream loops through the removal of an edge. The location of the downstream nodes on the vertical axis of the branching tree represents the edge strength that was removed to join these two loops. Loops are being hierarchically combined until they break to the outer region, termed *exterior* (and labeled ). The exterior is treated as a separate loop.

This algorithm will hierarchically decompose the original graph and will register this hierarchical decomposition as a binary, nesting tree. The branching patterns of this nesting tree contain information about important topological properties of the original graph. Thus, the nesting tree allows us to adapt and use metrics traditionally defined for trees, to quantify the architecture and topology of loopy graphs.

Examples of graphs and their corresponding nesting trees are shown in Fig. 4. The underlying geometry, link connectivity and point-wise link weight distribution are identical in every example shown. The architecture is solely defined by the building rule according to which the link weight values are assigned on the network. In the *gradient* model in Fig.4, the link weights are distributed according to the link center Euclidean distance from the left-most vertex, creating a smooth gradient of link weight. The model *random links* is produced by random assignment of the weights to the links and exhibits no log-range order. In the *nested* model, the straight lines defined by the underlying link connectivity are ordered based on a self similar subdivision scheme: the lines on the boundaries and center are assigned order , the lines bisecting order lines are assigned order etc. The link edges are similarly ordered according to weight, and then distributed to the ordered straight lines so that higher thickness links occupy lower order lines. This produces a hierarchical self-similar pattern, characterized by long range order in the link weights. Finally, the *random lines* model is produced by a random permutation of all the lines.

In these examples the nesting trees have been truncated for clarity. Note that in the “random links” nesting tree frequently low order nodes connect directly to high order nodes. This feature is absent from the “random lines” nesting trees, which are statistically self-similar.

The ordered arrangement of the edge weights in the gradient and nested models is reflected on the nesting tree structure. The nesting tree of the gradient model is a purely additive tree of the type shown in Fig. 3(e)(ii_{1}). The nesting tree of the nested model is similar to Fig. 3(e)(i_{1}), however, the iterated building unit is composed of four sequentially joining elements, rather than just two joining nodes.

The random models *random links* and *random lines* translate to disordered trees, with characteristic bifurcation statistics. This is visually apparent by the frequent direct connections of low order nodes to high order nodes in the nesting tree of the *random links* model. Such discrepancies can be captured with metrics that quantify the topology of trees, such as the asymmetry or the Cumulative area distribution, discussed in Section “Hierarchical decomposition of generated networks”.

The hierarchical decomposition and the nesting tree contain no explicit information about the geometry of each edge and element of the graph, other than the fact that the two joining loops need to be adjacent. Nodes of the nesting tree thus correspond to *neighborhoods* of the original graph - the nesting subtree rooted at node *j* represents the architecture of the subgraph enclosed in the loop represented by node *j*.

When edges at the graph perimeter are removed and loops at the boundary merge with the exterior, the neighborhood information is lost. We can retain that information by appropriately fragmenting the exterior region. Instead of having a single exterior loop, where every boundary loop sequentially merges to, we define a multitude of exterior loops as follows. We consider an exterior phantom loop that encompasses the original graph in its entirety. We then connect the vertices on the perimeter of the original graph with the perimeter of the phantom loop as shown on Fig. 3(c). Thus defines *n* boundary phantom loops, labeled , where *n* is the number of loops in the original graph that are adjacent to the perimeter. The added phantom exterior loop and links are assigned infinite weights and will never be removed during the hierarchical decomposition. After the addition of the phantom loops to the original graph, we proceed to iteratively decompose the graph as before, and represent the decomposition with a binary nesting tree, like the one shown in Fig. 3(b). In this way, the neighborhood information at the boundaries is preserved and will be reflected in the architecture of the nesting tree.

The nesting tree facilitates straightforward identification of the two basic building blocks of the organization of a planar graph. We will denote these building blocks as *multiplicative* (Fig. 3(i_{1})(i_{2})), and *additive* (Fig. 3(ii_{1})(ii_{2})). The multiplicative building blocks consist of events where the small loops are joined in an iterative, self-similar fashion. It maps to a tall binary tree, such as the one shown in Fig. 3(i_{3}). The additive building block is characterized by sequential joining of minor loops to an encompassing major loop. It maps to a short binary tree, as in Fig. 3(ii_{3}).

This mapping to a nesting tree is not a bijection. Any information about the geometric organization (shape and location) of the loops is lost. Only topological information is retained. For example, networks Fig. 3(i_{1}) and (i_{2}) both map to Fig. 3(i_{3}), and Fig. 3(ii_{1}) and (ii_{2}) both map to Fig. 3(ii_{3}). The connection between the nesting tree the a spanning tree on the dual graph are discussed in Supporting Information S1.

Elements of the architectural organization, such as loop area or aspect ratio can be retained by assigning related values to the nodes of the tree *j* and defining quantities that reflect their distribution. For example, the cumulative size distribution is based on measurements of the loop areas assigned to the nodes *j* of the nesting tree.

When an edge connects, rather than separates, two loops, its deletion will disconnect the graph (Fig. 3(a)(ii)). There is a number of ways to incorporate such an event to the hierarchical decomposition algorithm. In the example cases that we consider in this work, such events are rare, so for simplicity we chose to discard them in our implementation. In particular, we replaced the weight value of the disconnecting edges with the maximum edge width value of the disconnected loopy components, this way ensuring that the loop will not disconnect from the graph before it is hierarchically merged to the encompassing loop (Fig. 3(a)(ii)).

The nesting tree allows the unique assignment of a number to properties of the hierarchical organization of the graph and decouples geometry from topology. In this paper we consider and adapt three metrics that have been traditionally used on trees: the asymmetry, the cumulative size distribution and the Strahler bifurcation ratio. These metrics are presented in the Methods section of this work - here we apply them to the nesting tree and consider what they mean for the organization of the original graph.

### Hierarchical Decomposition of Generated Networks

In this section we will consider various classes of architectural models. These computer generated networks were produced according to various predetermined rules. The gradient, random links, nested and random lines models are shown in Fig. 4, and are discussed in the previous section. The *nested5* model is produced by the nested model, choosing five lines at random and randomly permuting their order. Similarly, the model *nested10*, is derived from the nested model by swapping 10 lines at random. Finally, in the *peaks* model, the thick links are concentrated around seven equidistant peaks. These models are shown at the insets of Fig. 5 and 6.

The graphs were constructed to share identical underlying topology (N = 817 vertices, triangular lattice) and edge width distribution, as shown in Fig.3. (a) The asymmetry of the every subgraph of rooting node *n* is plotted as a function of the base 2 logarithm of the degree , for the nested (red circles), gradient (green squares), and peaks model (blue diamonds). For the peaks and random lines model, instances of the graph are plotted with highlighted subgraphs of degree 2^{3} and 2^{7} (nested) and (peaks). Note the quasi-periodicity of the asymmetry of the nested model (a signature of the self similar structure of the nested model) and the change of monotonicity of the peaks model (indicating a qualitative change in the architecture of the graph at that level of organization). (b) The asymmetry of the random lines model (red) and random links model (cyan). The x-axis is the logarithm of the degree of the vertex or the nesting tree. Red line: averaged asymmetry of subgraphs of degree , random lines model. Cyan line: averaged asymmetry of subgraphs of degree , random links model. Inset: Density plots: The overlap of the two distributions is plotted in white. (c) The averaged asymmetry of the nested (blue), nested5 (orange), nested10 (light blue), random lines (red) and random links model (cyan) as a function of the base 2 logarithm of the degree *d*. The colored area indicates the standard error of 20 realizations.

These graphs were constructed to share identical underlying topology (N = 817 vertices, triangular lattice) and edge width distribution. (a) The asymmetry of the random lines model (red) and random links model (cyan). The x-axis is the logarithm of the degree of the vertex or the nesting tree. Red line: averaged asymmetry of subgraphs of degree , random lines model. Cyan line: averaged asymmetry of subgraphs of degree , random links model. Inset: Density plots: The overlap of the two distributions is plotted in white. (b) The averaged asymmetry of the nested (blue), nested5 (orange), nested10 (light blue), random lines (red) and random links model (cyan) as a function of the base 2 logarithm of the degree *d*. (c) Cumulative size distribution of generated models. Random links model (green), nested (blue), gradient (magenta), peaks (green). The total area of the graphs has been normalized to 1. Discontinuities or near discontinuities in the slope of cumulative size distribution indicate lengthscales where potentially the architectural organization changes qualitatively. (f_{1}). Adjusted cumulative size distribution, random links model. (f_{2}) The Adjusted cumulative size distribution is plotted for the nested (blue), nested5 (orange), nested10 (light blue) and random lines model (red).The Adjusted cumulative size distribution of the self-similar networks (nested, nested5, and random lines) can be approximated by a straight line of slope zero. Notice the periodicity in the nested lines model. The colored area indicates the standard error of 20 realizations.

We use the hierarchical decomposition and associated metrics to quantify various aspects of the architecture, demonstrate what the metrics reveal about the graph organization and understand the effects of the finite size, boundaries and of noise.

In Fig. 5(a) we plot the asymmetry (defined in the Methods section) of the architectural models termed nested (blue), gradient (magenta) and peaks (green). We have analytically calculated the asymmetry for the infinite gradient and nested models. For the gradient model, it can be trivially found to be:(1)where is the degree of the subtree , defined as the total number of leaf nodes of .

The analytical, closed form expression of the nested model is more complicated and is presented in Supporting Information S1. To demonstrate the finite-size, boundary effects in the asymmetry of the nested and gradient model, we overlay the theoretical predictions on the finite size numerical results of Fig. 5(a). For the gradient model, where a large number of low order loops break directly to the boundary, we notice a deviation of the actual measured finite size asymmetry from the theoretical one. This deviation is mostly noticeable at small degrees *d*. In the nested model case, there are no low level loops that join the exterior during the initial stages of the decomposition, so the finite size effects produce a deviation from the theoretical graph only at high degrees *d*. The damped fluctuations in the asymmetry of the nested model are a signature of the self similarity of the model. The asymptotic relaxation value of these fluctuations depends on the topology of the iterative building block of this architecture.

The asymmetry plot of the peaks model follows closely the one of the gradient model, but, at approximately there is a marked change of monotonicity. This is the characteristic scale where the architecture of the model changes qualitatively. Until that scale, the architecture was predominately additive, with smaller loops sequentially joining larger ones, and the asymmetry curve followed qualitatively that of the gradient model. The asymmetry decreases when the six separate, large size segments, represented in the inset graph with different colors, join. After those events take place during the hierarchical decomposition smaller loops with stronger edges continue to sequentially join creating a pattern in the asymmetry plot that is again reminiscent of the gradient model. The asymmetry can be used to identify characteristic length scales in graphs where major changes in the architecture take place.

All the three models shown on Fig. 5(a) are deterministic, with relatively simple architectures. Models such as the random links or the random lines model exhibit a much more complex asymmetry profile, as shown in Fig. 5(b). The asymmetry values in that case are drawn from a distribution the properties of which reflect the architecture in question. Calculating mutual information and comparing density maps such as the ones shown in the inset of Fig. 4(b) can provide a statistically meaningful way to examine the null hypothesis if two random graphs belong to the same architectural class. An extensive statistical comparison of the different architectural models is beyond the scope of this work.

Alternatively, we calculate the moving average of the asymmetry , as described in the methods section and plot in Fig. 5(b) with the red and cyan solid lines. The exact average asymmetry of each realization of the random models depends on the details of the noise. In Fig. 5(c) we plot the mean over 20 realizations of the nested (blue line), nested5 (orange), nested10 (light blue), random lines (red) and random links (cyan) models. The colored area represents the standard error.

The nested5 and nested10 models represent intermediate models between the nested and the random lines architecture, with progressively increasing disorder (and asymmetry) as the number of lines that have been swapped becomes greater. The nested, nested5, nested10 and random lines model are architectures with long range order in the link strength, qualitatively significantly different than the random links model, in which the link strength is uncorrelated. This difference in reflected in the asymmetry values of the random lines and random links models (Fig. 5(c)).

The cumulative size distribution is the cumulative distribution over the areas *A* associated with the nesting tree nodes. The cumulative size distribution of the generated models is presented in Fig. 6(a) and Fig. 6(b,c). In particular, in Fig. 6(a) we plot the cumulative size distribution of the peaks (green), gradient (magenta), nested (blue) and random links model (cyan). As shown in Fig. 3(e), the gradient model follows a straight line of slope 1/2 (a small deviation for small *a* is due to boundary effects). Kinks and discontinuities in the slope, like the ones seen in the peaks model curve, are indicative of qualitative changes in the architecture. The random lines and nested model curves are significantly different from the gradient model. We can robustly test for scale invariance by defining the adjusted cumulative size distribution . Since for self similar graphs, such as the nested model, we expect to fluctuate around a constant value. The nature of the fluctuations depends on the topology of the iterative building block of the nested model.

In Fig. 6(b) we plot for the random links model, and in Fig. 6(c) for the various nested and random lines models. As expected, the curves for all realizations of the self similar models fluctuate around a straight line. The periodicity of the curve can reveal the size of the architectural unit of the self similar network. The deviation from a straight line for large *a* is due to boundary effects. As the disorder increases, the periodicity becomes less pronounced, and disappears at the random lines model.

### Hierarchical Decomposition of Optimized Networks

In this section we use the hierarchical decomposition and the nesting tree to analyze the output of the optimization routines presented in [5]. Here, unlike the architectural models presented earlier, the building rules according to which the networks were constructed are not a-priori known. However, the functional purpose of the networks is known, as they are the (local) minima of global energy functions. The two models under consideration are a robustness to damage (broken bond) and fluctuations in the load (sink) model.

Modeled as electrical (or equivalently water distribution) grids, the networks transport load from the root (bottom center vertex in the networks of Fig. 7(a)) to other nodes in the network. In the “bond” model, the root has to distribute the load evenly to all the vertices, even if a random single bond is removed (robustness to damage). In the fluctuating sink model, instead of a uniform distribution of sinks there is a single sink, the position of which moves across the network. The cost to build the network is determined by a function and is set to a constant in each case. The parameter quantifies the “economy of scale”, i.e. how relatively expensive is a high conductivity edge compared to a smaller edge. The link thickness of the graphs shown in Fig. 7(a) represents the bond conductivities, which are determined by optimizing for the total network power dissipation (results are shown for and 0.7).

(a) Optimized networks, fluctuations in the load (sink model). Instances of optimized graphs () when the load is concentrated at a single, moving, point. (b) Optimized networks, robustness to damage (bond model). Instances of optimized graphs () when robustness is required under the presence of random damage. (c) Asymmetry of sink model. (d) Asymmetry of bond model. The average asymmetry is plotted as a function of the normalized subtree degree . Red line: . Green line: . Blue line: . Black dashed line: random links model. The colored area represents the standard error after averaging over 20 realizations of each model. (e) Adjusted cumulative size distribution, sink models. The gray line overlayed on the blue, line is the random links model. (f) Adjusted cumulative size distribution, bond models. The adjusted cumulative size distribution is plotted for (red, green, blue respectively) The adjusted cumulative size distribution is averaged over 20 realizations for the bond, sink and random edges model. The colored area represents the standard error after averaging over 20 realizations of each model.

The asymmetry plots demonstrate the strong statistical similarity of the sink and models with the random links model at intermediate and large scales (Fig. 7(c)). For the bond models, the follows closely the optimum, and they both exhibit a marked change in monotonicity at larger scales. The overall asymmetry increases with in the sink model, whereas there appears to be a significant qualitative change in the architecture between the and of the bond model. Here it should be noted that the asymmetry metric, as defined here, does not depend on the actual numerical value of the bond strengths, just the absolute ordering on the lattice. The sink model network for , 0.7 appears uniform as the smaller conductivity values are similar in value, however, architecturally the network is similar to the random model of Fig. 4.

The adjusted cumulative size distribution shown in Fig.7(e),(f), overall qualitatively reproduces the findings of the asymmetry. The bond model for exhibits a small size plateau. The sink model follows a similar curve as the one of the random links model. Note the change of monotonicity in the bond and model. This indicates a change of architecture from primarily additive to primarily multiplicative nestedness.

### Hierarchical Decomposition of Natural Networks

In this section we apply the hierarchical decomposition for two real examples, a leaf from *Bursera tecomaca* and a leaf from *Protium heptaphyllum*, show on Fig.8. The leaves have been cleared and stained by the group of D. Daly in the New York Botanical Gardens, who provided us with high resolution images of the specimens. We reconstructed and digitized the vasculature of leaves using custom made software that we have developed to translate the pixel values information to a collection of nodes and edges on which we can perform hierarchical decomposition.

(a) Segments of digitized leaf vasculature. The image of the skeletonized leaf has been overlayed with the digitized portion of interest. (a_{1}) Bursera tecomaca, (a_{2}) Protium heptaphyllum. Images courtesy of Douglas Daly, New York Botanical Gardens. (b) Hierarchical decomposition of Bursera and Protium. (b_{1}) Bursera, (b_{2}) Protium. Top to bottom: remaining loops at three different, progressively higher thickness cutoffs. Notice the persistent minor loops at the proximity of the major veins. (c) Segmentation of Protium heptaphyllum and associated tree representation. The protium intercostal area area has been separated to six color-coded sectors, as identified by hierarchical decomposition. The associated tree representation for that sector is shown for the green and red sector. The non-colored (white) areas of the graph and associated gray links on the tree representation correspond to high asymmetry nodes of the tree representation. Note how the high asymmetry areas are concentrated near major leaf veins.

In Fig. 8 we show the reconstructed portion of the leaves, overlayed on the digital image from which it was acquired. A non-uniform staining or illumination of the specimen can introduce bias to the reconstruction algorithm and certain neighborhoods of the reconstructed graph might appear to have spuriously large weights. In particular, executing an initial decomposition step on the two networks of in Fig. 8(b_{1}) and (b_{1}), we can easily see that unlike the Bursera, the Protium sample appears to have strong loops of smaller size concentrated around major veins. A careful inspection of the actual specimen is necessary to determine whether the origin of this bias is due to differential staining or this effect is of true biological origin. Although problems like this can be dealt before the digitization step in a number of ways (such as a variable threshold), here we will not follow this approach. In fact, the Protium sample was chosen to illustrate a case where non-uniform staining can result in spurious data and we will use it to discuss how we can use the hierarchical decomposition framework to perform data cleaning post the digitization stage.

A hierarchical decomposition of the intercostal area of Bursera allows us to identify high level nodes of the nesting tree that correspond to major loops. We use the nesting tree to identify a natural segmentation of the graph to six major areas which we plot in Fig. 8(c) along with the corresponding nesting subtrees for two of those sections. The histogram of the partition asymmetry *q* defined on the nodes of the nesting tree has a local minimum at approximately . This value can serve as a natural cutoff for data cleaning, In the nesting trees of Fig. 8(c), we color the links of the subtree upstream of the nodes with partition asymmetry higher than 0.97 with gray. The corresponding high asymmetry loops are colored white in the original graph. We see that indeed the high symmetry loops are consistently concentrated around major veins.

The asymmetry curve of the intercostal area of Bursera, shown in Fig. 9(a), reaches a plateau. On the contrary, the Protium asymmetry does not approach a constant value. However, if we clean the sample by disregarding the high asymmetry nodes with , we see that the Protium asymmetry curve similarly reaches a plateau, which is nevertheless higher than Bursera, indicating an architectural model based on more additive than multiplicative building blocks compared to Bursera. We can calculate the asymmetry for each individual segment of Protium in Fig. 8 and see that, as expected, the different segments exhibit the same architecture and the asymmetry curves relax to a value of approximately , significantly different than the value of 0.45 of the Bursera (Fig. 9(b)).

(a) Asymmetry of Bursera and Protium intercostal areas. The average asymmetry is plotted as a function of the normalized subtree degree *d*. Red solid line: Protium, cleaned. Red dashed line: Protium, full graph. Blue line: Bursera. Dark diamonds: random edges model. Dark circles: nested model. (b) Asymmetry of Protium intercostal segments. is plotted as a function of the normalized subtree degree *d*. Black dashed line: Protium, cleaned. Red, blue, green, magenta, cyan, yellow lines: Protium segments, colorcoded as in Fig. 8. Gray squares: average of segment asymmetry with standard error. (c) Adjusted cumulative size distribution, Bursera and Protium. Red solid line: Protium, cleaned. Red dashed line: Protium, full graph. Blue line: Bursera.

The cumulative size distributions of Fig. 9(c) qualitatively follow the asymmetry plots. The cleaned Protium curve, as well as the Bursera curve, both reach a plateau, however the cumulative size distribution cannot effectively distinguish between the two speciments.

### Strahler Bifurcation Ratio

The Strahler bifurcation ratio (10) (discussed in the Methods section), when computed on the nesting tree can provide a metric to quantify the overall nestedness of graphs. It is defined as the ratio of the number of streams of order to the number of streams of order . Since the Strahler law of stream numbers is an inevitable reality for most trees, it is possible to fit the plots versus with a straight line the slope of which will determine the logarithm of the Strahler bifurcation ratio for the whole graph. Examples of this fit are shown in the inset of Fig. 10. The best fit is found in the least squares sense, and it is forced to pass through ( is equal to the total number of ultimate loops, or leaf nodes in the nesting tree). The data point for is discarded, as it is very sensitive to noise.

Red error bar: standard error of the linear regression fit (represents goodness of linear fit). Black error bar: standard deviation of the logarithm of the bifurcation ratio (average over 20 realizations). Insets: Number of Strahler streams of order as a function of for the random lines, nested and gradient model and the Bursera leaf. Note that in each case, the follows closely an inverse geometric progression with (shown with the red dashed line).

In Fig. 10 we plot the Strahler bifurcation ratios for all the graphs presented in this paper. For the architecture or the optimization models that are not deterministic each realization of the graph will produce a different . In those cases, we plot , the average bifurcation ratio over 20 realizations, with the black error bar being the standard deviation. The red error bar represent the (average) goodness of the linear fit. Notice the extent of the red error bar for the gradient and peaks models.

The Strahler bifurcation ratio can clearly distinguish between the strongly multiplicatively nested Bursera and additively nested gradient model, but, with our current implementation it could not sufficiently distinguish between many of the models presented in this work. A major drawback of is that it is a single number which is inherently unsufficient to capture the complexity of networks whose architectural properties do not necessarily remain the same over all lengthscales.

### The Rat Brain

The analysis and framework presented in this work can be useful not only for leaves, but any other, biological or man made, planar graphs. A notable example is the arterial vasculature of the rodent neocortex which forms a planar network with multiple loops [3]. We extracted the diameters of the arterial blood vessels from a composite rat brain image provided to us by the Kleinfeld group in UCSD and augmented the connectivity information in [3] to obtain a weighted map of the arterial vasculature of the rat brain, as seen in Fig. 11(a). Although the resolution of the image in our disposal does not allow us to determine the vein widths with absolute confidence, we were able to identify major vascular sectors and determine that, according to the data at hand and the corresponding nesting trees shown in Fig. 11(b), the architecture of the network in question is primarily additive. Five sectors in Fig. 11(a) and their associated nesting subtrees are shown in color.

(a)The arterial network forms a planar graph. Different segments of the network, as identified by hierarchical decomposition are represented by different colors. (b) Nesting tree of the digitized network. the highlighted segments of the network are color-coded.

## Discussion

We have presented a framework that allows us to quantify the hierarchical organization of predominately loopy architectures. Our *hierarchical decomposition* consists of three iteratively repeated steps:

- pruning of the tree-like components
- ordering of the edges
- removal of the thinnest edge

This framework relies on the mapping of loopy planar graphs and their hierarchical decomposition to binary nesting trees. The nesting tree is subsequently used to quantify the architectural organization of the original graph. A number of quantities that reflect various aspects of the graph organization can be defined on the nesting tree, each with each own advantages and disadvantages. In this work we presented results for three such quantities, the asymmetry (and average asymmetry ), the cumulative size distribution and the Strahler bifurcation ratio. The asymmetry is a bottom-up approach that assigns a number to every composite loop at each scale. The value is a weighted average of the nestedness of the architecture of the portion of the graph enclosed in the *j* loop, corresponding to node *j* of the nesting tree. This metric can be degenerate as, depending on the averaging window, two different architectures of a high degree loop can map to the same value. On the contrary, the cumulative size distribution performs better in differentiating architectures at the high levels of organization. The larger number of low level loops frequently results in washing out interesting features of the structures at smaller scales.

These observations are demonstrated in the sink and bond model Asymmetries and cumulative distributions of Fig. 7. For example, the asymmetry of the sink and bond models (Fig. 7(c), (d)) has a local maximum, a feature that is absent from the adjusted cumulative size distribution (Fig. 7(e), (f)). Similarly, the asymmetry of all bond models is indistinguishable for large scales, whereas the cumulative size distribution can statistically distinguish these models.

Depending on the weight function , the asymmetry can be used to define a single number that encompasses information about the whole architectural organization (e.g. by calculating ). Such a number would be meaningful only for graphs with some degree of self-similarity. Some examples are shown in Supporting Information S1.

The Strahler bifurcation ratio can be used to describe the overall architecture, but it does not perform well for complex architectures. We have examined the Strahler bifurcation ratio as a function of the Strahler order and degree *d* in an attempt to extract information about the scale dependent organization of the graph. We have found that the result is very sensitive to noise, especially at high .

The metrics presented in this paper focus on the metric topology of the structure but they do not explicitly capture any information about the geometry of the network. The cumulative area distribution depends on the area of the terminal loops (the areoles of a leaf vein network). The cumulative area distribution follows closely the cumulative degree distribution provided that the terminal loops are not substantially polydisperse. It is evident we can supplement the descriptions presented here with more detailed geometrical analysis, in which some aspects of the geometry of the closed loops is kept, such as e.g. an approximating SVD ellipsoid, which can be used to define a major axis and an eccentricity. We can then incorporate such geometrical information into the analysis of nesting, i.e., relationships defining what is the average orientation of subloops in relation to the parent loops. Such detailed geometrical analysis, however, will evidently be subordinate to the coarser topological analysis we have presented here.

A big part of our extensive understanding of fluvial networks is due to the development of metrics to characterize and quantify tree architectures. Accordingly, progress in understanding loopy networks, which are ubiquitous in both natural and man made structures, is contingent on our ability to measure their hierarchical architecture. The hierarchical decomposition framework presented in this work provides a robust mathematical description of the network architecture, applicable to leaf venation and other loopy distribution (and structural) structures. It can be used to characterize the *in silico* networks obtained from computer simulations as well as to perform quantitative statistical comparisons between theory and experiment. As such, it can provide an invaluable tool in deciphering the functional significance of the loopy networks and possibly their developmental origin.

## Methods

In this section we present in more detail the three metrics that, applied on the nesting tree, characterize various aspects of the hierarchical organization of the original graph.

### Asymmetry

The asymmetry is a metric that characterizes the topological structure of a binary tree. It was first developed mainly in the context of neuronal branching patterns, such as dendritic trees and was defined as the weighted mean value of the asymmetry of its partitions. Adjusting the definition and notations of [15], we define the partition asymmetry of a bifurcation vertex *j* as:(2)with and . The parameters and are the degrees of the two subtrees at partition *j*. The degree of a (sub)tree is defined here as the total number of the leaf nodes (terminal segments) of that (sub)tree. Note that Eq. 2 differs slightly from the definition in [15].

The asymmetry of a subtree rooted at node *n* can now be defined as the weighted average of the partition asymmetry of the nodes :(3)where *j* runs over all bifurcating vertices of the subtree ( is the degree of the subtree), and is the weight of the partition *j*. In the results shown in this paper we use a weighted averaging window that includes all nodes of the subtree, . Results derived using other weight functions are discussed in Supporting Information S1.

Finally, the normalization factor is defined as:(4)

The averaged asymmetry of trees of degree is defined as:(5)where is the number of nodes with degree . In this work, we adjust this definition to be the mean of the asymmetry for all the nodes whose degree is within a distance from :

(6)Calculated on the nesting trees of the hierarchical decomposition, the asymmetry is an metric that quantifies the nestedness of the original graph. High asymmetry values correspond to a graph that is primarily composed from *additive* building blocks, and low asymmetry values correspond to a graph that is made from *multiplicative* building blocks.

The actual correspondence between asymmetry values and level of nestedness depends on the choice of weight function . Different choices of weight functions amplify different aspects of the graph architecture, and comparisons of asymmetry plots of different graphs should only be done when the weight function choice is consistent. In Supporting Information S1 we present results acquired by considering no averaging and by averaging over a shallow averaging window.

### Cumulative Size Distribution

The cumulative size distribution [28], [29] is the cumulative distribution over the areas associated with the nesting tree nodes. It is calculated by assigning an area value to each node *j* of the nesting tree, and then calculating the probability that an area drawn at random will exceed a certain value *a*. In general, we can associate the nesting tree nodes with any quantity that reflects a property of the original graph that is of interest, such as the total number of terminal loops nested in loop *j* of the original graph (equal to the degree of node *j* of the nesting tree, if the terminal loops are of equal size).

The cumulative size distribution reflects the overall architecture of the original graph, as the smaller degree nodes of an aggressive subdivision, like the one in Fig. 3(e)(ii), will be overepresented in the degree and cumulative degree distribution. It is easy to show that the cumulative degree distribution of iterative, self similar architectures is inversely proportional to the area(7)Conversely, the cumulative degree distribution of an architecture with additive nestedness (Fig. 2(e)(i)) is a straight line with slope:(8)

### Strahler Bifurcation Ratio

The Horton-Strahler stream-ordering system has been an invaluable tool in quantifying aspects of river topology and architecture since its inception in the fifties by Horton and Strahler [13], [14]. It has since been used with considerable success in describing the topology of a wide class of natural and man-made networks.

According to the Horton-Strahler stream-ordering system, the terminal nodes of the network (the leaves) are assigned Strahler order 1. The order of every non-leaf node is determined by the following rule: when two edges are connected to two nodes of Strahler order upstream, the node downstream is assigned an order(9)

The Strahler numbers (or the related Horton numbers) can be used to quantify the tree topology in a number of ways. In this work we focus in particular on the Strahler bifurcation ratio, defined as:(10)where is the number of streams of Strahler order . A stream is defined as a maximal path of branches connecting vertices of Strahler order , ending in a vertex of higher order.

The law of stream numbers states that the stream numbers approximate an inverse geometric progression with the order , a statement that implies . However, it is not possible to use this law as evidence of self-similarity of a distinctive architecture, as it is followed by the vast majority of binary trees [30].

The Horton-Strahler stream-ordering system cannot be directly used to describe loopy networks, as there can be no unique assignment of the stream order in a redundant graph. The hierarchical decomposition and the nesting tree provide a mapping that allows assignment of Strahler numbers to a loopy graph, as the loops of the original graph map to the vertices of the nesting tree and the Strahler number of node *j* depends on the nestedness of the graph segment enclosed by the loop *j*.

We now analyze examples from three classes of graphs: models generated by specific, prescribed building rules, outputs of optimization routines and natural graphs (in particular the venation of two dicotyledonous leaves and the arterial vasculature of the rat neocortex).

## Acknowledgments

The authors would like to thank D. Daly from the New York Botanical Garden for kindly providing us with cleared leaf images, and P. Blinder and the Kleinfeld group at University of California San Diego, for making available the rat neocortex vasculature data to us. EK would like to thank the Burroughs-Wellcome Fund and the Raymond and Beverly Sackler Foundation.

During preparation of this manuscript the authors became aware of the work of Mileyko, Edelsbrunner, Price, and Weitz (arXiv:1110.1413v1), concurrently submitted and accepted for publication at PLoS1. This manuscript discusses a generalization of the Horton-Strahler order to weighted planar reticular networks and presents a sensitivity analysis of the ordering.

## Author Contributions

Conceived and designed the experiments: EK MM. Performed the experiments: EK MM. Analyzed the data: EK MM. Contributed reagents/materials/analysis tools: EK MM. Wrote the paper: EK MM.

## References

- 1.
Ellis B, Daly DC, Hickey LJ, Johnson KR, Mitchell JD, et al. (2009) Manual of Leaf Architecture. Cornell University Press,Ithaca, NY.
- 2. Tero A, Takagi S, Saigusa T, Ito K, Bebber DP, et al. (2010) Rules for biologically inspired adaptive network design. Science 327: 439–442.
- 3. Blinder P, Shih AY, Rafie C, Kleinfeld D (2010) Topological basis for the robust distribution of blood to rodent neocortex. Proc Natl Acad Sci USA 107: 12670–12675.
- 4. Melville R (1969) Leaf venation patterns and the origin of the angiosperms. Nature 224: 121–125.
- 5. Katifori E, Szollosi GJ, Magnasco MO (2010) Damage and uctuations induce loops in optimal transport networks. Phys Rev Lett 104: 048704.
- 6. Corson F (2010) Fluctuations and redundancy in optimal transport networks. Phys Rev Lett 104: 048703.
- 7. Roth-Nebelsick A, Uhl D, Mosbrugger V, Kerp H (2001) Evolution and function of leaf venation architecture: A review. Ann Bot 87: 553–566.
- 8. Couder Y, Pauchard L, Allain C, Adda-Bedia M, Douady S (2002) The leaf venation as formed in a tensorial field. Eur Phys J B 28: 135–138.
- 9. Dimitrov P, Zucker SW (2006) A constant production hypothesis guides leaf venation patterning. Proc Natl Acad Sci USA 103: 9363–9368.
- 10. Fujita H, Mochizuki A (2006) The origin of the diversity of leaf venation pattern. Dev Dynam 235: 2710–2721.
- 11. Rinaldo A, Banavar JR, Maritan A (2006) Trees, networks, and hydrology. Water Resour Res 42: W06D07.
- 12. Corson F, Adda-Bedia M, Boudaoud A (2009) In silico leaf venation networks: Growth and reorganization driven by mechanical forces. J Theor Biol 259: 440–448.
- 13. Horton R (1945) Erosional development of streams and their drainage basins - hydrophysical approach to quantitative morphology. Geol Soc Am Bull 56: 275–370.
- 14. Strahler A (1952) Hypsometric (area-altitude) analysis of erosional topography. Geol Soc Am Bull 63: 1117–1141.
- 15. VanPelt J, Uylings HBM, Verwer RWH, Pentney RJ, Woldenberg MJ (1992) Tree asymmetry – a sensitive and practical measure for binary topological trees. B Math Biol 54: 759–784.
- 16. Rinaldo A, Rodriguez-Iturbe I, Rigon R (1998) Channel networks. Annu Rev Earth Pl Sc 26: 289–327.
- 17. Albert R, Barabasi A (2002) Statistical mechanics of complex networks. Rev Mod Phys 74: 47–97.
- 18. Newman MEJ (2003) The structure and function of complex networks. SIAM Review 45: 167–256.
- 19. Barrat A, Barthelemy M, Pastor-Satorras R, Vespignani A (2004) The architecture of complex weighted networks. Proc Natl Acad Sci USA 101: 3747–3752.
- 20. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU (2006) Complex networks: Structure and dynamics. Phys Rep 424: 175–308.
- 21. da F Costa L, Rodrigues FA, Travieso G, Boas PRV (2007) Characterization of complex networks: A survey of measurements. Adv Phys 56: 167–242.
- 22. Shao J, Buldyrev SV, Braunstein LA, Havlin S, Stanley HE (2009) Structure of shells in complex networks. Phys Rev E 80: 036105.
- 23. Sales-Pardo M, Guimera R, Moreira AA, Amaral LAN (2007) Extracting the hierarchical organization of complex systems. Proc Natl Acad Sci USA 104: 15224–15229.
- 24. Ang WK, Jowitt P (2005) Some new insights on informational entropy for water distribution networks. Eng Optimiz 37: 277–289.
- 25. Rolland-Lagan AG, Amin M, Pakulska M (2009) Quantifying leaf venation patterns: two-dimensional maps. Plant J 57: 195–205.
- 26. Blonder B, Violle C, Bentley LP, Enquist BJ (2011) Venation networks and the origin of the leaf economics spectrum. Ecol Lett 14: 91–100.
- 27. Perna A, Kuntz P, Douady S (2011) Characterization of spatial networklike patterns from junction geometry. Phys Rev E 83: 066106.
- 28. Takayasu H, Nishikawa I, Tasaki H (1988) Power-law mass-distribution of aggregation systems with injection. Phys Rev A 37: 3110–3117.
- 29. Paik K, Kumar P (2007) Inevitable self-similar topology of binary trees and their diverse hierar- chical density. Eur Phys J B 60: 247–258.
- 30. Kirchner J (1993) Statistical inevitability of horton laws and the apparent randomness of stream channel networks. Geology 21: 591–594.