Modular structure in fish co-occurrence networks: A comparison across spatial scales and grouping methodologies

doi:10.1371/journal.pone.0208720

Fig 1.

Workflow diagram of the network and cluster analyses.

For each of the 10 nested river basins included in the study, a species’ presence-absence matrix was first compiled, then converted to a species’ edge list or a species × species Jaccard dissimilarity matrix. Edge lists were used to build unipartite networks, followed by modularity analysis through simulated annealing. (Pairwise effect sizes from co-occurrence analyses were estimated for use in subsequent tests of nest associate species and are not utilized within the workflow diagram.) Dissimilarity matrices were used in PAM cluster analyses. Comparisons of network modules and PAM clusters included congruence (C) analysis and ratios of mean distances within and among groups (MD_within:among) when the number of PAM clusters (k) was equal to the number of modules (k = no. modules). However, network and cluster comparisons were limited to MD_within:among when an optimal number of PAM clusters was independently identified with the gap statistic (k = optimal no.). R packages used in each step of the workflow are shown in curly brackets within gray bubbles (e.g., ‘igraph’).
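The first step of this workflow, converting a presence-absence matrix into an edge list and a Jaccard dissimilarity matrix, can be sketched in a few lines. This is a minimal illustration in Python rather than the R packages the study used. The species and site names are hypothetical, and the edge rule (two species are linked if they share at least one site) is an assumption made for the sketch, not necessarily the criterion used in the study.

```python
from itertools import combinations

# Hypothetical presence-absence data: species -> set of sites where it occurs.
presence = {
    "sp_a": {"site1", "site2", "site3"},
    "sp_b": {"site2", "site3"},
    "sp_c": {"site4"},
}

def jaccard_dissimilarity(sites_a, sites_b):
    """Return 1 - |A intersect B| / |A union B| for two sets of sites."""
    union = sites_a | sites_b
    if not union:
        # Convention chosen for this sketch: two empty ranges are identical.
        return 0.0
    return 1.0 - len(sites_a & sites_b) / len(union)

def build_outputs(presence):
    """Build an edge list and pairwise Jaccard dissimilarities."""
    edges = []
    dissim = {}
    for a, b in combinations(sorted(presence), 2):
        dissim[(a, b)] = jaccard_dissimilarity(presence[a], presence[b])
        # Assumed edge rule: the pair co-occurs at one or more sites.
        if presence[a] & presence[b]:
            edges.append((a, b))
    return edges, dissim

edges, dissim = build_outputs(presence)
```

In this toy input only sp_a and sp_b share sites, so they form the single edge, while sp_c is fully dissimilar from both.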


Fig 2.

Illustration of the process used to quantify congruence (C) among network modules and PAM clusters.

Hypothetical results are shown at the top of the diagram for network modules and PAM clusters: the same 12 species (A–L) were first partitioned among three network modules (grey boxes), then among three PAM clusters (black boxes). Partitioning of species among network modules and PAM clusters was conducted independently, though the number of PAM clusters was determined by (i.e., equivalent to) the number of network modules identified by the simulated annealing algorithm. Note that the number of species assigned to each module and cluster may vary and is determined by the annealing and clustering algorithms respectively. In this hypothetical example, species numbers are variable among modules, with five, three, and four species assigned to the first, second, and third modules respectively, but four species are assigned to each of the three clusters. Congruence between network modules and PAM clusters is based on the number of instances in which species are grouped together in the same module and cluster. For example, in the first pair of columns shown at the lower-left side of the diagram, a total of eight species are assigned to the same module and cluster groups, as indicated by gray arrows. Congruence in this instance is calculated as eight matches divided by the total number of species (i.e., C = 8 ÷ 12 = 0.67). To aid in interpretation, modules and clusters are identified by the number of ‘tabs’ assigned to each; one, two, or three tabs per module or cluster are shown and are consistent among the upper and lower parts of the diagram (e.g., the black cluster with two tabs consistently contains species C, G, H, and I). When assessing C, it is critical to recognize that the labels used to identify network modules and PAM clusters (one, two, or three tabs in this illustration) are arbitrary. Only the shared identities of the species’ lists within each module or cluster are important.
Therefore, a complete test of C cannot be performed by simply comparing module 1 vs. cluster 1, module 2 vs. cluster 2, etc. Rather, the level of congruence between modules and clusters must account for multiple module vs. cluster combinations. This is achieved by ‘rotating’ the clusters in a combinatorial manner, as shown in the 2nd through 6th pairs of columns in the lower part of the diagram. The goal is to investigate all possible combinations of modules and clusters while searching for the highest possible level of C, given the constraint of the observed species’ assignments within modules and clusters (shown at top of diagram). Note, however, that the network modules do not need to be rotated during the combinatorial comparisons with the PAM clusters, as the objective is to assess the degree to which species’ assignments within clusters match species’ assignments within modules. Thus, with a system of three modules and three clusters, six cluster rotations are needed to explore all possible module vs. cluster combinations. The observed C value for each of the six rotations is shown at the bottom of the diagram. Hence, the first combination of modules and clusters (i.e., first two columns at lower-left) in this illustration leads to the highest possible C value which is then taken as the overall amount of congruence between the modules and clusters.
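The rotation procedure described in this caption amounts to trying every one-to-one pairing of cluster labels with module labels and keeping the pairing with the most matches. The sketch below illustrates that idea in Python. The species assignments are hypothetical: they were chosen only to satisfy the constraints stated in the caption (modules of five, three, and four species; clusters of four species each; one cluster containing C, G, H, and I) and are not taken from the figure itself. The function name and label strings are likewise illustrative.

```python
from itertools import permutations

def congruence(modules, clusters):
    """Return (best C, best module->cluster pairing, C for every rotation).

    modules and clusters map each species to a group label. Module labels
    stay fixed while cluster labels are permuted ('rotated').
    """
    species = sorted(modules)
    module_labels = sorted(set(modules.values()))
    cluster_labels = sorted(set(clusters.values()))
    if len(module_labels) != len(cluster_labels):
        raise ValueError("C requires equal numbers of modules and clusters")

    best_c, best_map, all_c = -1.0, None, []
    for perm in permutations(cluster_labels):
        pairing = dict(zip(module_labels, perm))
        # A match is a species whose cluster is the one paired with its module.
        matches = sum(1 for s in species if pairing[modules[s]] == clusters[s])
        c = matches / len(species)
        all_c.append(c)
        if c > best_c:
            best_c, best_map = c, pairing
    return best_c, best_map, all_c

# Hypothetical assignments consistent with the caption's constraints.
modules = {s: "m1" for s in "ABCDE"}
modules.update({s: "m2" for s in "FGH"})
modules.update({s: "m3" for s in "IJKL"})
clusters = {s: "c1" for s in "ABDF"}
clusters.update({s: "c2" for s in "CGHI"})
clusters.update({s: "c3" for s in "EJKL"})

best_c, best_map, all_c = congruence(modules, clusters)
print(round(best_c, 2))  # 0.67
```

With three modules and three clusters the loop visits 3! = 6 pairings, matching the six rotations in the diagram. Exhaustive permutation grows factorially with the number of groups, so this approach is only practical for small k.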


Fig 3.

Hypothetical maps of fish co-occurrence networks.

These maps demonstrate spatial clustering and the absence of spatial clustering. Each map is centered on the Ohio River Basin with 8-digit hydrologic units delineated by grey lines. Panel a illustrates the process used to interpolate range centroids for individual species. In this example, the native range of the Variegate Darter (Etheostoma variatum) is indicated by shaded grey polygons (see main text for source data). The center or ‘centroid’ of each range polygon, interpolated in a two-dimensional Cartesian plane, is indicated by a black triangle. The master centroid of the species’ native range, calculated as the grand mean of the x and y coordinates for individual range polygon centroids, is shown as a black circle. Panel b illustrates a hypothetical network of 12 fish species, partitioned into three distinct network modules (red, blue, and green circles). In this instance, strong spatial clustering is evident. Panel c illustrates a similar fish network, but one that is characterized by a lack of spatial clustering.
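The master centroid calculation in panel a can be sketched as follows. This is an illustrative Python example, not the study's GIS workflow: the polygons are hypothetical squares, coordinates are assumed to be already projected onto a planar (x, y) system, and the per-polygon centroid uses the standard shoelace formula for a simple polygon, which is an assumption about how each range polygon centroid might be obtained.

```python
def polygon_centroid(vertices):
    """Centroid of a simple polygon given as a list of (x, y) vertices."""
    twice_area = 0.0
    cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]  # wrap around to close the ring
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if twice_area == 0:
        raise ValueError("degenerate polygon with zero area")
    # Shoelace formula: C = sum / (6 * A), and 6 * A == 3 * twice_area.
    return cx / (3.0 * twice_area), cy / (3.0 * twice_area)

def master_centroid(polygons):
    """Grand mean of the x and y coordinates of the polygon centroids."""
    centroids = [polygon_centroid(p) for p in polygons]
    mean_x = sum(x for x, _ in centroids) / len(centroids)
    mean_y = sum(y for _, y in centroids) / len(centroids)
    return mean_x, mean_y

# Two hypothetical range polygons of different sizes.
small_square = [(0, 0), (1, 0), (1, 1), (0, 1)]
large_square = [(2, 0), (4, 0), (4, 2), (2, 2)]
master = master_centroid([small_square, large_square])
```

Because the master centroid is described as a grand mean of polygon centroids, each polygon contributes equally in this sketch regardless of its area; here the two centroids (0.5, 0.5) and (3, 1) average to (1.75, 0.75).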


Table 1.

Congruence (C) and spatial clustering results for fish network modules and partitioning around medoids (PAM) clusters.


Fig 4.

The Kanawha River Basin (HU-4 scale) fish network.

Panel a shows the complete network of 94 fish species, with co-occurrences among species indicated by light grey edges and the four distinct modules indicated by node colors. The network was plotted with the Kamada-Kawai force-directed layout, which positions the most highly connected nodes near the center and weakly connected nodes along the periphery. (Note that this network does not incorporate species’ centroids in the layout.) Panel b magnifies the Kanawha River network module (green nodes) that includes the Bluehead Chub (‘BhC’; Nocomis leptocephalus) and its six known nest associates: White Shiner (‘WS’; Luxilus albeolus), Rosyside Dace (‘RsD’; Clinostomus funduloides), Mountain Redbelly Dace (‘MRD’; Chrosomus oreas), Rosefin Shiner (‘RfS’; Lythrurus ardens), Crescent Shiner (‘CS’; Luxilus cerasinus), and Central Stoneroller (‘CSr’; Campostoma anomalum). Edge widths within the green module are proportional to the effect sizes estimated with the probabilistic model of co-occurrence (see main text), and nest associate pairs are highlighted by black edges. Panel c illustrates density functions (kernel estimates) for three groups of co-occurrence effect sizes within the Kanawha River fish network: all edges between species in different modules (‘among module’), all edges between species in the same module (‘within module’), and a group that is exclusive to the six edges between Bluehead Chub and each of its nest associates.
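The grouping behind panel c can be illustrated with a short Python sketch that sorts edge effect sizes into the three groups and evaluates a kernel density estimate for a group. Everything numeric here is invented for illustration: the effect sizes, the module labels, and the extra species code ‘other’ are hypothetical, and only three of the six nest associates are included to keep the example short. The Gaussian kernel and the fixed bandwidth are assumptions, since the caption does not state which kernel or bandwidth was used. The sketch also assumes that nest associate edges are counted in the ‘within module’ group as well as in their own group.

```python
import math

def classify_edges(edges, module_of, focal, associates):
    """Split edge effect sizes into among-module, within-module, and
    focal-species-to-nest-associate groups."""
    groups = {"among module": [], "within module": [], "nest associate": []}
    for (u, v), effect in edges.items():
        key = "within module" if module_of[u] == module_of[v] else "among module"
        groups[key].append(effect)
        # Assumption: nest associate edges are also kept in their module group.
        if (u == focal and v in associates) or (v == focal and u in associates):
            groups["nest associate"].append(effect)
    return groups

def gaussian_kde(values, grid, bandwidth):
    """Gaussian kernel density estimate of values, evaluated on grid."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))
    return [
        norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        for x in grid
    ]

# Hypothetical module labels and effect sizes (not data from the study).
module_of = {"BhC": "green", "WS": "green", "RsD": "green",
             "CS": "green", "other": "red"}
edges = {
    ("BhC", "WS"): 0.9, ("BhC", "RsD"): 0.8, ("BhC", "CS"): 0.7,
    ("WS", "RsD"): 0.4, ("BhC", "other"): 0.1, ("WS", "other"): 0.2,
}
groups = classify_edges(edges, module_of, "BhC", {"WS", "RsD", "CS"})

# Evaluate the density of one group on an evenly spaced grid.
step = 0.01
grid = [-3.0 + i * step for i in range(701)]
density = gaussian_kde(groups["nest associate"], grid, bandwidth=0.3)
```

Plotting one such density curve per group on a shared axis would give a figure in the style of panel c.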


Table 2.

Representation of the Bluehead Chub (Nocomis leptocephalus) nest associate complex at three spatial scales.
