Large-scale metabarcoding analysis of epipelagic and mesopelagic copepods in the Pacific.

A clear insight into the large-scale community structure of planktonic copepods is critical to understanding the mechanisms controlling diversity and biogeography of marine taxa in terms of their high abundance, ubiquity, and sensitivity to environmental changes. Here, we applied a 28S metabarcoding approach to large-scale communities of epipelagic and mesopelagic copepods at 70 stations across the Pacific Ocean and three stations in the Arctic Ocean. Major patterns of community structure and diversity, influenced by water mass structures, agreed with results from previous morphology-based studies. However, a large-scale metabarcoding approach could detect community changes even under stable environmental conditions, including changes in the north/south subtropical gyres and east/west areas within each subtropical gyre. There were strong effects of the epipelagic environment on mesopelagic communities, and community subdivisions were observed in the environmentally stable mesopelagic layer. In each sampling station, higher operational taxonomic unit (OTU) numbers and lower phylogenetic diversity were observed in the mesopelagic layer than in the epipelagic layer, indicating a recent rapid increase in species numbers in the mesopelagic layer. The phylogenetic analysis utilizing representative sequences of OTUs revealed trends of recent emergence of cold-water OTUs, which are mainly distributed at high latitudes with low water temperatures. Conversely, the high diversity of copepods at low latitudes was suggested to have been formed through long evolution under high water temperature conditions. The metabarcoding results suggest that evolutionary processes have strong impacts on current patterns of copepod diversity, and support the "out of the tropics" theory explaining latitudinal diversity gradients of copepods. Diversity patterns in both epipelagic and mesopelagic copepods was highly correlated to sea surface temperature; thus, predicted global warming may have a significant impact on copepod diversity in both layers.


Introduction
layer formed a distinct group, separated from other warm-water groups, including Tropical, North and South subtropical, and Kuroshio regions. 163 The clustered groups of copepod compositions in the warm-water group clearly 164 corresponded to the respective geographical areas in each sampling layer, including transition, 165 California Current, Kuroshio, subtropical, and tropical regions (Fig 3). The latitudinal subdivisions 166 of community compositions were especially evident in the epipelagic layer even within subtropical 167 regions (Fig 3a; P < 0.005), including differences between the North and South Pacific subtropical 168 regions. These community changes were reflected in the best model, which showed that the average 169 water temperature in the epipelagic layer was the factor most strongly related to epipelagic copepod 170 compositions, followed by latitude ( Table 1). The other geographical factor of longitude was also 171 included in the best model for the epipelagic layer.  The epipelagic environment had a strong effect on the compositions in the mesopelagic layer (Table   189 1). The compositions in the upper mesopelagic layer were explained by average salinity, water 190 temperature, and dissolved oxygen content of the epipelagic layer as well as longitude in the best 191 model. In the lower mesopelagic layer, latitude was the variable explaining most of the variation in mass structures in the Pacific, as represented by the profiles of T-S diagrams at 0-1,000 m (Fig 4d).

202
The latitudinal and longitudinal changes through water column were also observed in the best model 203 to explain changes of copepod composition. The variable explaining most of the variation in 204 community compositions at 0-1,000 m was average water temperature in the epipelagic layer, 205 followed by longitude. Among cluster groups through the water column, there is a clear difference 206 between warm and cold waters in the taxonomic compositions in each layer (Fig 4a). There are no 207 clear differences with respect to family-level taxonomic compositions within warm waters, and 208 proportions of unclassified copepods were high in the mesopelagic layer.

210
Copepod community based on sequence reads 211 The distribution peaks of major OTUs largely affected large-scale copepod community based on 212 quantitative data using relative proportions of sequence reads. In addition to compositions based on 213 presence/absence of OTUs, overall cluster analysis based on sequence reads also showed 214 significantly different groups, which corresponded with sampling depth and geographical regions (P 215 < 0.005; Fig 5). The key variables that explained copepod communities based on sequence reads 216 were similar to those explaining the presence/absence of OTUs; in particular, environmental 217 conditions in the epipelagic layer affected copepod communities based on sequence reads both in the 218 epipelagic and mesopelagic layers (S4 Table). A total of 36 OTUs were selected as major OTUs to 219 contribute to differences among cluster groups (Fig 5). These major OTUs showed a distinct 220 difference between cold and warm waters, and a small number of specific OTUs dominated the arctic 221 and subarctic regions in the cold-water group as well as in the transition-California region. These

222
OTUs were mainly restricted to high latitudes, although some OTUs were present in warm-water sequence reads were different in each OTU even in closely-related taxa. Kuroshio region compared with tropical and subtropical regions. These spatial patterns of diversity 243 were also observed in the rarefaction curves for each layer (Fig 2). The variable explaining most of 244 the variation in the spatial patterns of OTU numbers in all layers was sea surface temperature (SST)

245
(  There was a regional pattern of high diversity at low latitudes for the Simpson diversity 261 index (Fig 7b) and phylogenetic diversity (Fig 7c). Unlike the OTU numbers, the Simpson index  The regional differences of phylogenetic diversity were especially evident in the epipelagic and 266 upper mesopelagic layers, and significantly high phylogenetic diversity (P < 0.05) was observed 267 especially in the North subtropical region compared with other regions (S5 Table).

268
In addition to horizontal patterns, a vertical gradient of copepod diversity was observed, geographic region (Fig 7c). The phylogenetic diversity was significantly higher (P < 0.05) in the 275 epipelagic layer than in lower mesopelagic layers in all regions except the subarctic (S5 Table).

276
Different spatial and vertical distribution patterns of OTUs were associated with taxonomic groups 277 of copepods (Fig 8). High proportions of OTU numbers with distributions in the mesopelagic layer Mormonilloida also had high proportions of OTUs with the main distribution in mesopelagic layers.

283
These taxonomic groups essentially showed large OTU numbers, and these taxonomic groups were 284 also abundant with high proportions of OTU compositions in the mesopelagic layer (Fig 4a).  noted that we mainly used a standardized method for sampling using the same sampling gear, with some exceptions in the Arctic and Kuroshio area (S1 Table). Although the mesh size of the plankton 332 net, which largely affect results of metabarcoding analysis of zooplankton [33], were same in all 333 stations, different size mouth openings for the sampling gear may have affected regional differences 334 of copepod communities in this study.

335
The major groups, based on both quantitative (sequence reads) and non-quantitative  and all sequence reads were aligned using the add-fragments option in Multiple Alignment using 503 Fast Fourier Transform (MAFFT) with the default setting [59]. Sequences that were not aligned with reference data were removed. We performed a single-linkage pre-clustering, removal of a singleton read, and chimera removal by UCHIME both with and without a reference dataset for the alignment 506 sequences [60]. The final copepod sequence reads were subsampled again, and sequence differences 507 were calculated among sequence reads without considering indels. OTU clustering was performed    The spatial patterns of the OTU numbers in each sample were compared at each layer and throughout 554 the sampling layers. The evenness for each sample was evaluated using the Simpson diversity index.
The phylogenetic diversity was calculated based on the average genetic distance between OTUs 556 based on Kimura's two-parameter nucleotide substitution model, which is a commonly used genetic 557 distance for zooplankton including copepods [23]. Spatial differences in copepod diversity were 558 evaluated using non-parametric Kruskal-Wallis tests and Dunn's tests using SPSS 21.0 (IBM 559 Corporation). The effect of environmental variables on OTU numbers were investigated using 560 generalized linear models (GLMs) in R 3.5.0 [64]. We used the same environmental variables as in 561 the DistLM analyses except for water temperature. SST was used as the temperature parameter in the 562 epipelagic layer, because SST is more closely correlated to OTU numbers than average temperature.  OTUs are other copepod OTUs without high similarity to target species. Note that seven species in 602 mock community 2 were incubated after sampling to remove gut contents, but 33 species without 603 incubation were used for mock community 1. The analyses were based on the preliminary analyses 604 only using data of mock communities.