A network-based approach to identifying correlations between phylogeny, morphological traits and occurrence of fish species in US river basins

doi:10.1371/journal.pone.0287482

Fig 1.

Procedure for construction of basin and species networks.

(a) The diagram shows the procedure for constructing a basin network based on species co-occurrence, using eight representative basins. Each basin is assumed to have a fixed number of species. The number of species common to two basins determines the weight of the edge between them, represented here by the thickness of the edge. On the right is the corresponding adjacency matrix for the network on the left, with each entry giving the number of species common to the corresponding pair of basins. (b) The schematic diagram shows the procedure for constructing a network of species based on their nearest neighbours (NNs). The panel on the right shows an example network of five species, with each node connected to its two NNs. This main network is formed by aggregating the sub-networks shown in the panel on the left: the first sub-network shows species S₁ connected to S₂ and S₃, the second shows S₂ connected to S₁ and S₄, and so on. An adjacency matrix is shown alongside each sub-network and the main network, where 1 (0) in a cell indicates the presence (absence) of a connection between the nodes. The network on the right contains all the connections in the sub-networks, and its adjacency matrix is the sum of all the sub-network adjacency matrices. Solid edges are those explicitly shown in the sub-networks on the left; dashed edges come from sub-networks that are not displayed on the left.
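The two constructions described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names `basin_adjacency` and `nn_adjacency`, the dictionary input format, and the zero diagonal (no self-loops) are assumptions. The NN matrix is the literal sum of the per-species sub-network matrices, so a pair of mutual nearest neighbours receives the value 2; the caption does not say whether the summed matrix is later binarized.

```python
from itertools import combinations


def basin_adjacency(basin_species):
    """Weighted basin network: entry [i][j] counts species shared by basins i and j.

    basin_species maps a basin name to the set of species found in it
    (hypothetical input format). The diagonal is left at 0 (assumption).
    """
    names = sorted(basin_species)
    n = len(names)
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = len(basin_species[names[i]] & basin_species[names[j]])
        adj[i][j] = adj[j][i] = shared
    return names, adj


def nn_adjacency(dist, k):
    """Species network: sum of sub-network matrices, one sub-network per species.

    dist is a square matrix of pairwise distances (e.g. CD or PD);
    each species is linked to its k nearest neighbours by an undirected edge.
    """
    n = len(dist)
    total = [[0] * n for _ in range(n)]
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for j in others[:k]:
            # Sub-network of species i contributes the undirected edge i-j.
            total[i][j] += 1
            total[j][i] += 1
    return total
```

Keeping the sum rather than a 0/1 matrix preserves the information that two species chose each other as neighbours; thresholding the result at 1 would recover an unweighted network.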


Fig 2.

Identification of clusters of Hydrologic Unit Code 8 (HUC8) basins based on the number of co-occurring species.

(a) The figure shows a network of the US watersheds, with nodes plotted at the centroids of 2073 HUC8s and colored according to the cluster they belong to; the edges are lightened for visualization. Only basins in the coterminous United States are shown here. The clusters are the modules identified by applying a community detection algorithm to the network of HUC8s, where the network is constructed based on the number of common species between HUC8s. The map was made with Natural Earth base map data (public domain) and the USGS WBD, Watershed Boundary Dataset (U.S. Public Domain). The map also shows the major drainage basin boundaries (USGS), represented by the bold black lines that separate watersheds flowing into different oceans. (b) The figure shows the mean cosine distance based on morphological traits (〈CD〉) versus the mean phylogenetic distance (〈PD〉) between the species in the basins of each HUC8 cluster. The colors indicate the module index. (c) The figure shows 〈CD〉 versus 〈PD〉 for basins selected at random from across the network. The number of randomly selected basins is the same as in the actual modules shown in part (b). The blue lines in parts (b) and (c) are the best linear fits to the data.
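As an illustration of the 〈CD〉 quantity plotted in parts (b) and (c), the sketch below computes the mean pairwise cosine distance over a set of species trait vectors. It assumes the usual definition of cosine distance as one minus cosine similarity and averages over all unordered species pairs; both choices, the function names, and the input format are assumptions rather than details taken from the paper.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """Cosine distance, 1 - cos(angle between u and v).

    Assumes neither vector is all zeros (division by zero otherwise).
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_cd(traits):
    """Mean cosine distance over all unordered pairs of species.

    traits maps a species name to its vector of morphological traits
    (hypothetical input format); at least two species are required.
    """
    pairs = list(combinations(traits.values(), 2))
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)
```

The same averaging applied to a matrix of phylogenetic distances would give the corresponding 〈PD〉 value for a cluster.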


Table 1.

Phylogenetic signal of morphological traits.

For each morphological trait, the table presents the phylogenetic signal in terms of two metrics, Blomberg’s K [11] and Pagel’s λ [60], along with their p-values. The ten morphological traits are relative eye size (EdHd), oral gape position (MoBd), relative maxillary length (JlHd), vertical eye position (EhBd), body elongation (BlBd), body lateral shape (HdBd), pectoral fin size (PFiBd), pectoral fin vertical position (PFlBl), caudal peduncle throttling (CFdCP) and maximum body length (Length).


Fig 3.

Relationship between cosine distance (CD), phylogenetic distance (PD) and number of basins of co-occurrence based on nearest neighbour (NN) species networks.

For the network constructed using the cosine distance between species, the top row shows the relationship between the number of basins containing at least two of the species of a module (x-axis) and the corresponding means 〈CD〉 (right y-axis) and 〈PD〉 (left y-axis) of the species within modules, for the NN = 10 (a) and NN = 50 (b) networks. Similarly, the bottom row shows the corresponding results for the species network constructed using the PD between species, for NN = 10 (c) and NN = 50 (d).
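The x-axis quantity, the number of basins containing at least two species of a module, can be counted as in the following sketch. The function name `basins_with_cooccurrence`, the `min_shared` parameter, and the dictionary input are hypothetical names introduced for illustration.

```python
def basins_with_cooccurrence(module_species, basin_species, min_shared=2):
    """Count basins that contain at least min_shared species of the module.

    module_species: iterable of species names in one module.
    basin_species: maps a basin name to the set of species found in it.
    """
    module = set(module_species)
    return sum(
        1 for species in basin_species.values()
        if len(module & species) >= min_shared
    )
```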


Table 2.

Summary statistics of linear regression models showing relation among three variables: Number of basins of co-occurrence of at least two species (NBS), phylogenetic distance (PD) and cosine distance (CD) between species.

x and y represent the independent and dependent variables, respectively, in the regression analysis. Results are shown for four settings, defined by the metric used for species network construction and the number of nearest neighbours. The R² coefficient and the p-values shown here are for a single experiment.
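For reference, the slope, intercept and R² of a simple linear regression of y on x can be computed as sketched below. This is an ordinary least-squares fit written out by hand under the assumption that the table reports a single-predictor model; it does not compute p-values, and the function name `linear_fit` is hypothetical.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = intercept + slope * x.

    Returns (slope, intercept, r_squared). Assumes x and y have the same
    length and that neither is constant.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares)
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot
```

Where SciPy is available, `scipy.stats.linregress` returns the same slope and intercept together with the correlation coefficient and a p-value for the slope.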


Fig 4.

(a-b) Variation of R² with the number of nearest neighbours (NN) for cosine distance (CD) and phylogenetic distance (PD) based networks, with y as the dependent variable and x as the independent variable. For the CD based network, mean CD (〈CD〉) is the independent variable, and for the PD based network, mean PD (〈PD〉) is the independent variable. The dependent variables in the former case (CD based network) are the number of basins of co-occurrence of at least two species (NBS) and 〈PD〉, and in the latter case (PD based network) they are NBS and 〈CD〉. Filled circles indicate that the corresponding p-value < 0.05, and hollow circles indicate p-value ≥ 0.05. (c-d) The corresponding plots for the random module analysis.
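One way to build the random modules used as a baseline in parts (c-d) is to draw, for each real module, a random group of the same size. The sketch below does this under two assumptions that the caption does not settle: each random module is drawn independently (so an item may appear in more than one random module), and items within a module are distinct. The function name `random_modules` and the `seed` parameter are hypothetical.

```python
import random


def random_modules(items, module_sizes, seed=0):
    """Draw one random module per real module, matching its size.

    items: list of species (or basins) to sample from.
    module_sizes: sizes of the real modules, in order.
    Each module is sampled without replacement, independently of the others.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    return [rng.sample(items, size) for size in module_sizes]
```

Repeating the draw with different seeds and recomputing R² each time would show how much of the observed relationship survives when module membership is randomized.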
