Predicted Deep-Sea Coral Habitat Suitability for the U.S. West Coast

Regional scale habitat suitability models provide finer scale resolution and more focused predictions of where organisms may occur. Previous modelling approaches have focused primarily on local and/or global scales, while regional scale models have been relatively few. In this study, regional scale predictive habitat models are presented for deep-sea corals for the U.S. West Coast (California, Oregon and Washington). Model results are intended to aid in future research or mapping efforts and to assess potential coral habitat suitability both within and outside existing bottom trawl closures (i.e. Essential Fish Habitat (EFH)) and identify suitable habitat within U.S. National Marine Sanctuaries (NMS). Deep-sea coral habitat suitability was modelled at 500 m×500 m spatial resolution using a range of physical, chemical and environmental variables known or thought to influence the distribution of deep-sea corals. Using a spatial partitioning cross-validation approach, maximum entropy models identified slope, temperature, salinity and depth as important predictors for most deep-sea coral taxa. Large areas of highly suitable deep-sea coral habitat were predicted both within and outside of existing bottom trawl closures and NMS boundaries. Predicted habitat suitability over regional scales are not currently able to identify coral areas with pin point accuracy and probably overpredict actual coral distribution due to model limitations and unincorporated variables (i.e. data on distribution of hard substrate) that are known to limit their distribution. Predicted habitat results should be used in conjunction with multibeam bathymetry, geological mapping and other tools to guide future research efforts to areas with the highest probability of harboring deep-sea corals. Field validation of predicted habitat is needed to quantify model accuracy, particularly in areas that have not been sampled.


Introduction
Predictive habitat suitability modelling is a cost effective approach to assist scientific research, conservation and management of vulnerable marine ecosystems (VMEs) in the deep sea. To date, the majority of studies using predictive models in the deep sea have focused on deep-sea corals, mostly due to the conservation status of this group and the relative abundance of data compared to other VMEs (e.g. [1,[2][3][4][5][6][7]). These models identify areas with the highest probability of containing deep-sea corals and enhance our knowledge of the factors that control the distribution of these organisms. Whitmire and Clarke [8] reviewed the state of deep coral ecosystems in the waters of California, Oregon, and Washington and reported 101 species of corals from six cnidarian orders have been identified in the region. These included 18 species of stony corals (Class Anthozoa, Order Scleractinia) from seven families, seven species of black corals (Order Antipatharia) from three families, 36 species of gorgonians (Order Gorgonacea) from 10 families, eight species of true soft corals (Order Alcyonacea) from three families, 27 species of pennatulaceans (Order Pennatulacea) from eleven families, and five species of stylasterid corals (Class Hydrozoa, Order Anthoathecatae, Family Stylasteridae). The U.S. West Coast has been relatively well sampled for deep-sea corals in comparison to many other regions of the world's oceans, but the spatial distribution of deep-sea corals in unsurveyed areas within the EEZ remains largely unknown.
Predictive habitat models work by extrapolating potential species' distributions from presence data and a range of environmental variables. These two components are critical, as incomplete or erroneous data can reduce confidence in the approach and can potentially lead to predictions of limited conservation or management value [9]. When considering the utility of a model, one further consideration is the selection of an appropriate spatial scale. For example, poor spatial resolution of environmental data continues to hinder the spatial accuracy of deep-sea habitat modelling at global scales [3,5,7]. To address this, several studies have focused on improving smaller-scale, local models (i.e. 10 to 100 km2) by integrating terrain variables derived from multibeam bathymetry (e.g. [10,11,12]). Whilst smaller-scale modelling produces valuable data on species distributions in localised areas, it often requires intensive sampling effort and is of limited use in the identification of unknown habitat for cruise planning, management and conservation initiatives. Regionalscale models are needed to predict habitat suitability for corals in areas that have not been surveyed and have to be accurate enough to guide a research vessel towards a clearly defined area where sampling can be targeted [7]. Recent approaches have investigated the overlap between areas of protection and models of the distribution of vulnerable marine species [13,14]. This manuscript presents a predictive habitat suitability modelling effort for deep-sea corals within U.S. Exclusive Economic Zone (EEZ) waters off the coasts of California, Oregon and Washington. The objectives of this manuscript are to; 1) develop predictive habitat suitability models at the highest possible spatial and taxonomic resolution for deep-sea corals, 2) use model results, in addition to other tools and data, to help guide field research efforts to areas with the highest probability of harboring deep-sea corals, and 3) integrate model results with existing bottom trawl closures (i.e. essential fish habitat (EFH) area closures) and National Marine Sanctuary boundaries to determine high probability habitat areas that remain at risk from human activity.

Coral presence data
Coral distribution data were gathered from several sources including: Monterey Bay Aquarium Research Institute (MBARI), NOAA Fisheries, NOAA National Marine Sanctuaries, Smithsonian Institute's National Museum of Natural History, and Washington State University. These records were obtained from a variety of gear types: remotely operated vehicles (ROVs), manned submersibles, cameras, grabs and bottom trawls. Over 90,000 coral records were collected for the U.S. West Coast region. However, only of a fraction of these could be retained for use in the habitat suitability models. Coral observations were eliminated if they matched the following criteria: 1) records were collected as bycatch in bottom trawls as they have inherent spatial and taxonomic accuracy issues, creating uncertainties that stem from both the method in which they were collected and the taxonomic knowledge of observers on fishing vessels. Bottom trawls can be several kilometers in length and it can be difficult, if not impossible, to determine the position of the actual coral occurrence [15]. 2) Records were located in waters of less than 50 m depth. This depth cutoff was based on the fact that most zooxanthellate corals are found in shallow waters and this study is focused on deep-sea azooxanthellate corals, which tend to occur in waters deeper than 50 m. 3) The taxonomy of coral records was uncertain at the family level. 4) If more than one coral record of the same taxon (order or suborder) was located within the same 500 m grid cell. The spatial resolution of the bathymetry, environmental data, and model results was 500 m6500 m. If more than one coral record from the same taxon occurred in the same 500 m grid cell, it was treated as a 'spatial duplicate' and removed. Spatial duplicates skew models towards the environmental conditions found in those cells resulting in distorted model predictions. Some sampling approaches, such as ROVs, drop cameras or manned submersibles, document numerous coral records along relatively short distances and can introduce significant spatial bias into the analysis if all records are retained.
There were several issues which prevented models from being performed at the species level: 1) taxonomic disagreement, 2) varying degrees of taxonomic knowledge among observers and collectors, and 3) many coral presences are documented without a sample being collected to conclusively determine coral taxonomy to species. These are concerns that have been similarly noted in global models for octocoral habitat suitability [16]. For these reasons coral records were binned and modelled at the Suborder and Order levels. Suborders for which coral presence data were obtained included Alcyoniina, Calcaxonia, Filifera, Holaxonia, Scleraxonia, Stolonifera. Order level data included Antipatharia and Scleractinia. A total of 2,120 coral records were retained for analysis (Table 1 and Figure 1). Predictive models were not performed for Suborders Filifera (n = 12) and Stolonifera (n = 30) due to a paucity of coral records. It should be noted that nine of the 203 scleractinian presence records used in the predictive models were habitat-forming scleractinians (e.g. Lophelia pertusa and Oculina profunda). All other scleractinian records were solitary, nonbranching corals. Most scleractinian records used in this analysis were not structure forming, but habitat suitability was modelled due to the high level of research interest for this taxon.

Bathymetry
The bathymetry of the U.S. West Coast shelf consists of a complex series of canyons, ridges and seamounts [8]. There has been significant effort in the acquisition of reliable bathymetry in this region and several data products are available. The most prominent is the Coastal Relief Model (CRM) generated by NOAA's National Geophysical Data Centre (NGDC). This is a publically available dataset with a stated cell resolution of 3-arc seconds (,90 m). The bathymetric component of the model is constructed from soundings obtained from National Oceanographic Service (NOS) hydrographic soundings, the NGDC multibeam database, and recently digitized soundings from NOS [17]. Soundings are gridded into a continuous grid for much of the shelf using the Generic Mapping Tools program Surface. However, the final CRM output is highly smoothed, omits smaller-scale features, and is of limited extent due to the high density of soundings in the shallower waters of the shelf (Figure 2a).
A custom bathymetry with a resolution of 500 m6500 m was produced from NOS hydrographic soundings, the NGDC multibeam database, and Trackline data [17][18][19]. Raw soundings were extracted into XYZ for the target area of interest ( Figure 2b) using MB System [20]. As with the CRM dataset, the raw sounding data was not corrected to the same vertical and horizontal datum, but this has little effect on the accuracy of the final grid output [21]. Null values and erroneous soundings were removed using either a PERL script or were removed manually. The final grid was created using the spline interpolation program MB Grid. In total, 35% of the area of interest was covered by sounding data with additional data used to infill areas with sparse soundings from Smith and Sandwell's global seafloor topography version 14.1 [22]. The final resolution of the custom grid was 500 m6500 m; smaller cell sizes showed little improvement in the quality of the bathymetry as it is limited by the spatial coverage and density of soundings. The custom grid was highly correlated with CRM data (Pearson's correlation = 0.999, p,0.001 based on 500 random points within the extent of CRM), spanned the entire study region, and retained more topographic complexity than CRM. However, considering this is the first development of a custom bathymetry, care should be taken when interpreting data in areas that contain sparsely distributed or no soundings as these areas rely on satellite derived altimetry for the bathymetry (Figure 2b).

Environmental, physical and chemical data
Environmental layers were collated and constructed from sources that included ship-based CTD casts, satellites and global climatologies such as World Ocean Atlas ( Table 2). The majority of source data was available as gridded datasets partitioned into standardized depth-bins ranging from 0 to 5500 m. Other data were available only as single layers from the surface (e.g. surface primary productivity) ( Table 2). For depth-binned datasets, it was assumed that the conditions found at a specific gridded depth were representative of conditions on the seafloor. This allowed for the creation of continuous representations of seafloor conditions by extrapolating each depth-bin to the corresponding area of seafloor at that depth. This approach was initially developed for the creation of global environmental, physical, and chemical datasets [7]. Converting depth-binned datasets into representations of seafloor conditions involved several computer intensive processes that were conducted within a series of Python scripts. Firstly, each depth-bin of the gridded data is extracted into a single layer and interpolated at a higher spatial resolution (usually 0.1u) using inverse distance weighting. The interpolation was required to reduce gaps that appear between adjacent depth bins due to a lack of overlap when extrapolated to the bathymetry. Each of these layers was then resampled to match the extent and resolution of the bathymetry with no further interpolation. Secondly, these layers were resampled to match the extent and cell resolution of the bathymetry. Thirdly, each resampled depth-bin was clipped by the area of seafloor that was available at that particular depth. Each bin did not overlap and all were merged to produce a continuous representation of the variable on the seafloor.
This approach essentially develops a model of potential conditions for each variable. It uses the best available data, but makes several assumptions: 1) environmental conditions from the gridded CTD data are representative of the conditions at the seafloor. The majority of CTD casts do not normally reach the seafloor as they are usually stopped between 5 and 10 m from the bottom to reduce the chance of damage to the CTD system  through impact, and 2) seafloor conditions are relatively stable.
Annual mean values were used to maximize the amount of environmental data incorporated. While much of the deep sea is relatively stable below 200 m, there is still significant temporal variability in shelf and coastal areas and caution should be taken when interpreting predictions in shallow-water areas. However, the longevity of many deep-sea coral species far exceeds the measuring period of most oceanographic variables. Surface datasets were not up-scaled by the above process, as they were only available as a single depth-bin. Surface variables were interpolated to a higher spatial resolution using the datainterpolating variational analysis approach (DIVA; [23]) that is written into Ocean Data View version 4. For this study, we selected particulate organic carbon flux to the seafloor from Lutz et al. [24] as the productivity variable. Slope was calculated within ArcGIS Spatial Analyst using a moving window to extrapolate both fine scale slopes (1 km, 5 km) and broad-scale slopes (10 km, 20 km) using Horn's algorithm [25]. The accuracy of the up-scaled environmental variables was tested using quality controlled water bottle data obtained from the 2009 version of the World Ocean Database (WOD) [26]. Only points collected post-2001 were used in the statistical comparison and null values were removed as these were not used in the development of the temperature or salinity grids. WOD data were filtered to ensure 1) values contained a bottom depth meta-data flag, and 2) data values were within 5% of the total depth from the custom bathymetry for a cast location. Four variables contained adequate data for statistical comparison with environmental layers: temperature, salinity, phosphate and dissolved oxygen ( Figure 3). The four up-scaled environmental variables that were assessed with WOD water bottle data were highly correlated at each sampling location (Pearson's correlation, R2, temperature = 0.98 (n = 108), salinity = 0.93 (n = 105), dissolved oxygen = 0.91 (n = 100) and phosphate = 0.88 (n = 101), all values significant at p,0.001). The phosphate comparison showed an artifact at bottle concentrations above 3.5 mg l-1 ( Figure 3). This occurred in regions that have low topographic relief, resulting in environmental variables not being upscaled by the bathymetry and the original resolution of the environmental variable being visible (in this case 1u). Several other CTD datasets from the U.S. West Coast were assessed for suitability, but many did not penetrate into the deep sea and did not include bottom depth as meta-data making it impossible to determine whether a cast went near the seafloor.

Variable selection
Variables were selected based on a literature search of environmental, physical, and chemical factors known or thought to influence deep-sea coral growth and survival. Temperature, salinity, aragonite saturation state, and topographic complexity have been shown to be strong predictors of bioherm forming scleractinian coral distribution in recent deep-sea modelling efforts [3,5,7,27]. Calcite saturation state was chosen over aragonite saturation state for use in this study as the majority of coral taxa that were modelled use calcite to build their calcium carbonate spicules and structures [16]. Scleractinians have aragonitic skeletons, but the vast majority of scleractinian corals used in this analysis were solitary, non-reef forming species. Living specimens of these solitary species have been collected in highly undersaturated waters with respect to aragonite, which led to the deduction that aragonite saturation state would not be a strong predictor for determining their potential distribution at a regional scale [28]. Slope was calculated at a variety of spatial resolutions ranging from 1 km-20 km and is a useful proxy for current acceleration and mixing, which are known to influence coral distribution and abundance [29][30][31].
Covariation between environmental datasets is a complication that must be addressed in many predictive modelling efforts. Environmental datasets used in this analysis were assessed for covariation in correlation matrices (See Figures S1-S10). Although Maxent is reasonably robust with respect to covariation, an a priori variable selection process was used to reduce covariation by removing variables that were highly correlated and likely to adversely affect final predictions. Covariation was assessed using correlation matrices in R. Strong correlations between variables (. 0.7) were addressed by omitting one of the environmental variables (except for calcite saturation state, temperature, and depth; see Results and Discussion). The importance of each variable in the model was assessed using a jack-knifing procedure that compared the contribution of each variable (when absent from the model) with a second model that included the variable. The final habitat suitability maps were produced by applying the calculated models to all cells in the study region, using a logistic link function to yield a habitat suitability index (HSI) between zero and one [32].  [5,33]. Presence-only modelling is one of the only methods available for modelling species distributions in the deep sea because documented absence data is sparse and when available can be unreliable. Maxent's underlying assumption is the best way to determine an unknown probability distribution is to maximise entropy based on constraints derived from environmental variables [32]. Default model parameters were used as they have performed well in other studies (a convergent threshold of 10 25 , maximum iteration value of 500 and a regularisation multiplier of 1 [34]).
Model accuracy between the test data and the predicted suitability models was assessed using a threshold-independent procedure that used a receiver operating characteristic curve with area under curve (AUC) for the test localities and a thresholddependent procedure that assessed misclassification rate. With presence-only data, Phillips et al. [32] define the AUC statistic as the probability that a presence site is ranked above a random background site. In this situation, AUC scores of 0.5 indicate that the discrimination of the model is no better than random, with the maximum achievable AUC value being 1, which implies perfect discrimination of validation data. To develop the models in this study, coral presence data was spatially partitioned into four regions to calculate validation metrics and assess whether or not spatial sampling bias of coral records was influencing model performance (regions depicted in Figure 1). Four Maxent models were performed for each coral taxon (order/suborder) so models could be spatially cross-validated. For example, model 1 for any given taxon used coral records from regions 2, 3, and 4 as training records and region 1 coral records as test data to assess model performance using AUC. Model 2 used coral records from regions 1, 3, and 4 as training records and region 2 coral records as test records. The same procedure was performed for models 3 and 4. The cross-validation of models across the four regions was necessary due to the high number of coral presence records provided by MBARI for Monterrey Canyon and Davidson seamount and has the benefit of testing models with spatially independent data (see regions 3 and 4 in Figure 1). In this study, spatially cross-validated models with AUC scores .0.7 were retained for further analysis and used in the production of thresholded predictions, models scoring lower than this were omitted.
There is ongoing debate regarding the interpretation of Maxent's logistic prediction values (0-1) for habitat suitability [35,36], but it represents the best metric at present. Several studies have defined a binary threshold, which states that a species is likely to be found in an area with a habitat suitability value above a given threshold, but not likely to be found below it [37][38][39]. To create usable predictions from the cross-validations we used a 0.5 logistic presence value threshold for each taxa and all taxa to provide a cut-off point for suitability in this study (below which was considered unsuitable and above suitable), we also used a 0.75 logistic presence value threshold to further constrain the model output for EFH management applications. These values are higher than used in previous studies (i.e. the 10 th percentile training presence used in [7]) because the main application of this modelling effort is to use predictions to help target areas for future field research and provide an assessment of EFH area closures. Summary maps were generated for each order and suborder by creating thresholded predicted outputs for each of the four regions (using 0.5 or 0.75 as the cutoff presence/absence value). If predicted logistic suitability was greater than 0.5 or 0.75 for any given cell, that cell was assigned a value of 1. If predicted habitat suitability was less than the cutoff value, the cell was assigned a value of 0. The binary models were summed for each coral taxon resulting in final consensus grids that had cell values ranging 0-4 depending on the number of retained spatiallypartitioned models (maximum = 4).

Fishing intensity data
To estimate the amount of fishing activity within the U.S. West Coast, a map of fishing intensity was acquired that showed the relative intensity of commercial bottom trawling from 12 June 2006 to 31 December 2010 [40]. The map was developed from a commercial logbook program administered by coastal states and records for bottom-contact fishing gear (e.g., ''small'' footrope, ''large'' footrope, flatfish, selective flatfish, and roller trawl) collated by the Pacific Fisheries Information Network (PacFIN). This is not a fully comprehensive dataset, as some states do not submit full data, state-managed fisheries such as pink shrimp, ridgeback prawn and sea urchin are not included in PacFIN and cells with data from less than three fishing vessels were omitted from available maps to protect privacy. The final layer represents the total bottom trawl lines that fall within a 3 km radius neighbourhood centered on cells within a 5006500 m grid similar to the custom bathymetry (represented as km of trawl per km 2 ). These data were contrasted against a thresholded prediction for all taxa, with the threshold raised to 0.75 to locate areas that are highly suitable habitat for cold-water corals.

Substrate data
Substrate data was obtained for a limited subset of the model domain that mostly covered the shallower shelf of the US West Coast [41]. This layer was built from a variety of archived data including limited multibeam echosounder bathymetry and backscatter and was provided as a 25 m625 m grid [42]. The substrate types described included, probable soft sediment, probable rock (including predicted rock based on expert knowledge) and a mixture of soft sediment and rock. This layer was also accompanied with a confidence layer that shows that for the majority of the data, the probable substrate type was of low confidence with only shallower water being granted medium and high confidence levels. This data layer and any output from it must be considered with substantial caution especially given that a wide variety of mapping approaches were used in its creation and these sources were interpreted into the three coarse categories. Due to the low confidence in the substrate data, an exploratory analysis was undertaken to determine how the habitat suitability models for each species fared in light of the available substrate in the region.

Species' niche
From the available environmental data, an a priori variable selection process that took into account closely related and highly correlated variables, identified seven variables that were likely to influence the probability of species presence (temperature, salinity, particulate organic carbon, depth, dissolved oxygen, calcite saturation state, slope 1 km) ( Tables 3, 4 and S1-S10 in File S1). The jack-knife of variable contribution showed slope 1 km, temperature, and salinity were the strongest predictors for Suborder Alcyoniina, Order Antipatharia, Suborder Calcaxonia, and Suborder Scleraxonia. Temperature and salinity were consistently strong predictors in models for Suborder Holaxonia, whereas, salinity and depth were the strongest predictors for Order Scleractinia. For all taxa combined the strongest predictors were salinity, temperature and depth. Three highly correlated variables (depth, calcite saturation state, and temperature) were retained due to ecophysiological importance and the strength of their contributions. This must be interpreted with caution as these layers covary and may contain similar information, which can artificially inflate variable contribution scores. However, the test AUC scores for models generated with a single variable reinforced that these variables were top predictor variables regardless of covariation (Tables 3 and 4). Suborder Holaxonia was the only group to have calcite saturation state as one of the top three predictor variables (two of the four models) indicating some species within this Suborder could be sensitive to changes in carbonate chemistry.
It was possible to gain insight into the species niches and the factors that are most important in driving their distribution by intersecting the distribution of coral records with the environmental, physical, and chemical layers ( Figure 4). For V CALC , all coral records were found in waters supersaturated with respect to calcite (V CALC .1) with the majority (82%) being found in waters with V CALC between 1 and 2. Most coral records were found in waters with a temperature range of 1.5-8uC and salinity in the range of 33.5-34.7. Coral records were found in depths ranging from 50-4,129 m, but the majority (88%) were found between 50 and 2500 m. Slope 1 km values were widely distributed across taxa, but was an important predictor variable (see jack-knife of variable importance in Tables 3 and 4) for all taxa combined, Alcyoniina, Antipatharia, Calcaxonia, and Scleraxonia. Slope 1 km was not in the top three predictor variables for Holaxonia and Scleractinia. Dissolved oxygen values ranged from 0.3-5.9 ml l-1 with 89% of records having values in the 0.3-3.1 ml l-1 range. Particulate organic carbon values were widely distributed across taxa with Antipatharia having notably lower POC values (80% of Antipatharian records were found in waters with POC,7 g Corg m-2 yr-1).

Model evaluation
The coral habitat models performed well across all the metrics used to validate the modeled outputs. All, bar two AUC scores, were .0.7 and were significantly different from that of a random prediction of AUC = 0.5 (Wilcoxon rank-sum test, p,0.01). High AUC scores were supported by high test gains and low omission rates across many of the modeled taxa indicating most presences  were accounted for in the predictions (Tables 3 and 4). Models generated for the spatial partitions of Antipatharia (Model 1) and Scleractinia (Model 4) were excluded from the summed grids for each taxon as the AUC scores for these partitions were ,0.7, so suitability was ranked between 0-3, rather than 0-4 as for all other taxa (Figures 5 and 6).    Figure S2e). In contrast, suitability probabilities and areal extent were low in these areas for Antipatharia ( Figure S2b) and Scleraxonia ( Figure S2f). High probability areas were modeled to the west of MBNMS's northwest boundary. This area of highly suitable habitat extends approximately 50 km to the west of MBNMS's NW boundary ( Figure S2).
Channel Islands National Marine Sanctuary (CINMS): Habitat suitability probabilities and areal extent of predicted habitat were generally low across taxa within the CINMS boundary when compared to surrounding waters. Predicted habitat probabilities and areal extent of suitable habitat were low for Alcyoniina ( Figure  S3a Figure S3f). Modeled habitat probabilities and areal extent of predicted habitat were high in waters surrounding all islands within the sanctuary for Scleractinia ( Figure S3e). It is worth noting that probabilities and areal extent were high in the waters surrounding the islands of San Nicolas, Santa Catalina and San Clemente for Alcyoniina, Calcaxonia, Holaxonia, Scleractinia and to a lesser extent Scleraxonia. Probabilities and areal extent were very low in all island waters, both within and outside the sanctuary boundaries, for Antipatharia ( Figure S3).
To evaluate the effectiveness of each NMS in encompassing predicted coral habitat, the total area of suitable habitat within each was calculated (Table 5). Overall, a large proportion of total area within each sanctuary encompassed habitat that was classed as suitable (a habitat suitability value greater than zero) for multiple taxa. In some NMS', certain taxa dominated. For example, suitable habitat for Holaxonia was predicted to occur within 45% of the area of OCNMS, whereas in the Gulf of Farallones, Antipatharia was predicted to occur in 60% of the area. To avoid over-prediction skewing the effectiveness, the highly constrained model for all taxa (cut-off above 0.75 logistic suitability) showed that the most effective NMS may be the Gulf of Farallones, which contained 16% suitable habitat, followed by Monterey Bay that contained 14%, Olympic Coast at 12% and the Channel Islands at 9%. The least effective may be Cordell Bank that encompassed 6% suitable habitat.

Spatial distribution of predicted habitat suitability and bottom trawl closures: Essential Fish Habitat (EFH) and Cowcod Conservation Area West (CCA-West)
Coral habitat suitability maps for selected taxa and all taxa combined are depicted in Figure 8 (all taxa are shown in Figure  S4) with overlays of essential fish habitat area closures and Cowcod Conservation Areas sourced from the Pacific Fishery Management Council. These area closures depict areas with fishing gear restrictions off Washington, Oregon, and California. Gear restrictions were established under NMFS' Final Rule to implement Amendment 19 to the Pacific Coast Groundfish Fishery Management Plan (71 FR 27408; May 11, 2006). Fishing with bottom trawl gear within these areas was prohibited to minimise adverse effects from fishing. All bottom contact gear was prohibited in waters surrounding Thompson Seamount, President Jackson Seamount, and several sites in the Channel Islands and Cordell Bank National Marine Sanctuaries. All bottom contact gear and any gear deployed deeper than 500 fathoms (914 m) was prohibited in the waters surrounding Davidson Seamount. There are two Cowcod Conservation Areas (East and West), but only CCA-East is designated as EFH. CCA-West is large in size and has high levels of habitat suitability for many of the coral taxa modelled. The EFH and the CCA-West areas are not the only bottom trawl closures present on the U.S. West Coast. There are also Rockfish Conservation Areas (RCAs) that extend along the entire length of the U.S. West Coast, but their boundaries can change throughout the year and are based on approximate depth contours between ,100-150 fathoms (183-274 m). Both of these factors make quantitative assessment of RCA closures with predicted habitat suitability highly uncertain. In addition, California and Washington prohibit bottom trawling within their state territorial seas (out to 3 nautical miles). These trawl closures were not included in this analysis as the majority of suitable coral habitat is found in deeper areas outside state territorial waters.

Northern Region (42uN to 48uN, Washington and Oregon)
Significant areas of high probability coral habitat for Alcyoniina were predicted off the coast of Washington state (Figure 5a and Figure S4). The areas with high probabilities that remain open to bottom contact gear include the entire western boundary of the OCNMS. These include regions adjacent to existing EFH area closures Biogenic 1 and Biogenic 2. Highly suitable habitat also was identified between existing EFH area closures Grays Canyon and Nehalem Bank/Shale Pile, along the western boundary of Hecata Bank, along the western boundary of Bandon High Spot, and areas between Bandon High Spot and Rouge Canyon EFH. The majority of Antipatharian predicted habitat was located in  existing EFH area closures in this region; the exception to this are the predicted areas located between Grays Canyon and Nehalem Bank/Shale Pile, and within the OCNMS ( Figure S4b). The most significant predicted habitat areas for Calcaxonia, which are not currently contained in EFH area closures, include waters north and south of Biogenic 1 and 2 and areas between Grays Canyon and Nehalem Bank/Shale Pile ( Figure S4c). The predicted habitat pattern for Holaxonia was similar to that of Alcyoniina with two additional areas being highly suitable. These areas are located directly east of Nehalem Bank/Shale Pile and Heceta Bank ( Figure  S4d). Predictions for Scleractinia were limited in this region to a narrow depth band almost all of which is in areas open to bottom trawl gear ( Figure S4e). High probability areas for Scleractinia were identified within the OCNMS and in a narrow depth band between Biogenic 2, Grays Canyon, and Nehalem Bank/Shale Pile. Areas west of Heceta Bank and Bandon High Spot were also identified, in addition to the area east of Rogue Canyon. The majority of predicted habitat for Scleraxonia that remains in open trawling areas is located within the OCNMS and between Grays Canyon and Nehalem Bank/Shale Pile ( Figure S4f).

Central Region (36uN to 42uN, Northern California)
The majority of high probability areas predicted for Alcyoniina occurred in areas currently open to bottom trawl gear ( Figure  S5a). These include areas between Rogue Canyon and Eel River Canyon and between Blunts reef and Pt. Arena South Biogenic Area. High probability areas for Antipatharia are almost completely contained within existing EFH area closures in this region ( Figure S5b

Bottom-trawl intensity
The bottom-trawl intensity data obtained from Whitmire [40] only covered the northern region between 33uN and 45uN and became fragmented below 37uN (Figure 9). The layer covers the shelf, and is largely constrained to depths shallower than 1,000 m. Much of this area is open for trawling, with EFH closures generally being deeper, however, as the intensity data does not cover the whole shelf, it is difficult to draw parallels between protection and trawling activity. There are areas where trawling activity overlaps with predicted suitable habitat (Figures 9 and 10), some areas are enclosed within closed areas but the majority falls outside areas of protection. Using the 0.75 logistic threshold-all-taxa model provided a spatially constrained prediction that focuses on areas with the highest predicted suitability. In the Northern region, and the Olympic Coast National Marine Sanctuary, there are several areas with high trawling intensity and high suitability, the majority of which falls outside of any designated area (Figure 9b). Within the Central region there are again areas of high suitability that fall outside of EFH areas such as adjacent to Heceta Bank, Brandon High Spot and Rogue Canyon (Figure 9c and see Figure 8 for locations of EFH areas). In the Southern region, Mendocino Ridge captures an area of high suitability with no trawling intensity data but further south adjacent to Delgada Canyon and Tolo Bank there are large areas of suitability with moderate trawling intensity (Figure 9d). Intersecting the trawling intensity grid with the habitat suitability classes for the whole region shows that there are areas of high suitability that are trawled at moderate levels compared to lower suitability classes, but the level of trawling for habitat suitability classes 1 and 4 are similar in contrast to 2 and 3, indicating that it is possible that trawlers are focusing both on areas that are not likely to contain corals (i.e. suitability values of 1) and are also targeting areas that do contain corals (i.e. suitability values of 4).

Substrate data
The substrate data showed that for the area available, 91% of the shelf was described as probable soft sediment, 1% was probable mixed hard and soft sediment and 8% was classified as hard substrate (Figure 10a). To determine how much suitable habitat fell within areas of each probable substrate type, the modeled suitability layers for each taxa were spatially intersected with each substrate class and the proportion of area enumerated. By contrasting this, it was possible to provide an estimation of the level of over-prediction within the habitat suitability models ( Figure 10 and Figure S7). In general, the majority of predicted habitat was found to fall within areas of probable soft sediment for all taxa (Table 6). However, for areas that were predicted as higher suitability (.2 on the habitat suitability scale), the proportion of predicted suitable habitat that fell within hard substrate area

Discussion
This study is the first attempt to model the potential distribution of deep-sea coral habitat for the U.S. West Coast EEZ. The approaches presented here are a significant improvement over previous regional efforts such as those in the North East Atlantic [3] and Canadian Shelf [1,2]. In recent studies there have been significant advancements in parallel with the data presented in this study, especially with respects to identifying usable datasets for regional-scale habitat suitability modelling in the deep-ocean [12][13][14]. However, there still remain several limitations that must be considered when interpreting broad-scale model results.

Unincorporated model variables and model accuracy
There are several variables that are important for coral settlement, growth and survival that were not included in the model because they do not exist at sufficient resolutions, a problem shared with all habitat suitability efforts [43]. These variables include benthic hard substrata, high-resolution current direction/ velocity, and the distribution of mobile or benthic sediments. Many corals require hard substrata for colonisation and like depth; substrate tends to be highly variable over small spatial scales. Model results presented here will overpredict the amount of suitable habitat in some areas because fine-scale and moderate scale bathymetric features (109s of metres to 300 metres), substrate, and current data are not available. It is likely that model results indicate suitable coral habitat in areas that are known soft bottom regions with high sediment loadings where corals are likely or known to be absent, as indicated by 91% of the shelf being classed as probable soft sediment [41]. By contrasting the amount of available hard substrate and the predicted suitability for many taxa, the level of over-prediction can be estimated (Table 6). For example, of the area of habitat predicted as unsuitable (habitat suitability of 0) for Alcyoniina, 6.8% fell in areas classed as probable hard substrate. This increased as habitat suitability increased to the highest value (4), where 14.2% of area fell in areas classed as probable hard and mixed substrate. This shows that the model for this species may overpredict in approximately 85.8% of the area, assuming that the hard substrate layer is accurate and that the distribution of this taxa is entirely dependent upon hard substrate that can be mapped at a 25 m625 m resolution.
Data on the distribution of sediments is unfortunately scarce for much of the world's seafloor, a fact that urgently needs to be addressed by mapping programs around the world. In this study, the IOOS Surficial Geologic Habitat maps for the Washington and Oregon continental shelves (Version 2.2) were initially explored to determine whether or not this information could be used in the habitat suitability models to refine taxon niches but these data were not suitable due to incomplete coverage of the study area. The National Marine Fisheries Service produced a composite substrate dataset as part of their 5-year Essential Fish Habitat Review for the West Coast [42]. This dataset aggregated many sources into a standardised classification (hard, mixed and soft substrate) layer at 25 m625 m resolution, but covered only the shallower parts of the shelf limiting its utility in this modelling effort. In addition, the layer was also provided with a confidence layer that stated low confidence for most of the area, with only shallower water having medium or high confidence in the probable substrate type. This layer was used to constrain the output of the habitat suitability models for each taxa to produce a focused dataset that highlights areas where the habitat is suitable for coral and areas of probable hard substrate overlap ( Figure 10 and Figure S7).
Model results for all taxa combined undoubtedly overpredict as the suborders and orders modeled separately occupy different niches, depth ranges, and caution should be exercised when using all taxa combined model results. However, we introduced a constrained model by increasing the threshold to the 0.75 logistic suitability for all taxa, producing a model that was far more focused on areas of very high suitability. Assessment of model accuracy will be dependent on field operations to validate model predictions. In addition to several unincorporated datasets, the extent, quality, and availability of environmental, chemical and physical data are continually improving and should be incorporated in an iterative process with field surveys to refine predictions and reduce the number of false positives and negatives in habitat suitability models.

Presence records
The limited number of coral presence records used to model habitat distribution for some coral taxa highlights the need for more targeted sampling to document coral locations. For example, very few presence localities for Suborders Filifera and Stolonifera were obtained and preliminary models suffered from significant overprediction and artificially high AUC scores. Low presence numbers could be due to coral rarity among these taxa and/or undersampling. The lack of coral records for these suborders resulted in the omission of these models from the analysis. Several recent studies have investigated the effectiveness and reliability of habitat suitability models constructed with low numbers of presences, a common problem for difficult to detect species (i.e. deep-sea corals) and those that have had limited systematic survey effort such as records from museum collections [44]. This does not preclude the possibility of modelling species distributions with low sample numbers, as Maxent is capable of producing acceptable models with relatively limited numbers of presences [37]. However, Maxent does appear to overpredict suitable habitat when using small presence datasets compared with other methods [37,45]. In addition, grouping coral records at the order and suborder level undoubtedly combines coral taxa (family, genus, species) with different environmental niches. This is a recognised limitation of the approach, but one that is necessary due to taxonomic uncertainty and total number of coral records available.

Model validation and targeting areas for field operations
Field validation of modeled habitat is needed to 1) Assess the accuracy of model predictions. 2) Refine models by identifying false positives and negatives. 3) Gauge the utility of these modelling methods for identifying deep-sea coral habitat in unsurveyed areas. The predicted habitat suitability results presented here are not meant to identify coral occurrences with pin point accuracy and are unlikely to achieve this based on currently available data. They are more useful for directing research effort to areas that have the highest probability of supporting deep-sea corals and identifying low probability areas that could be avoided to maximise time spent in high probability areas. Broad-scale predictive habitat results should be used in conjunction with multibeam surveys, geologic substrate maps and other tools to determine the most likely areas for harboring deep-sea corals. One additional complication for field validation efforts using these predictions is the current technological limitation of survey vehicles and equipment (i.e. ROVs, submersibles, drop cameras, etc.). The distribution of deep-sea corals within a single grid cell of these models (500 m6500 m) could be patchy [46] and could be missed on vehicle transects with limited range and narrow fields of view. To address this limitation and to improve the probability of locating undiscovered coral areas, research ships should first use multibeam surveys (in high probability areas) to identify substrate characteristics that can support deep-sea coral growth or identify corals (e.g. emergent hard substrata, coral rubble) and then move towards visual detection methodologies.

Assessment of closures and trawl intensity
Predictive models have been used to assess the suitability of existing protected areas in several areas including the North East Atlantic [14] and South Pacific [47]. Our findings are broadly similar, showing that the boundaries of U.S. National Marine Sanctuaries contained suitable habitat for corals above the average proportions of predicted suitable habitat throughout the entire study area (Table 5). Significant areas of highly suitable deep-sea coral habitat were modeled both within and outside existing NMS, EFH, and CCA-West closure boundaries. However, the majority of suitable habitat for Suborder Holaxonia and Order Scleractinia was predicted in areas outside of existing area closure boundaries. We also, however, identified numerous areas where areas of suitable habitat, from the most constrained model (75 th percentile) fell outside of current protection initiatives. This was particularly evident in the EEZ waters off of Washington and Oregon. Overlaying the spatially limited trawling intensity layer from Whitmire [40] revealed that the majority of intense trawling was outside of current closed areas (Figure 11), as expected, but many areas did not have intensity data available. In previous studies, data from vessel monitoring systems have shown that vessels will enter closed areas occasionally and that vessel behaviour may be linked to the establishment of a closed area [48]. There were several high suitability areas that had a higher trawling intensity than other areas (Figure 11), implying that there may be some overlap in areas that are being fished and that contain suitable habitat for corals. Trawling has been shown to be highly damaging to corals, especially reef-forming scleractinans [49]. Emergent epifauna including octocorals will be adversely affected particularly in areas with repeated, high intensity trawling [50].

Conclusion
The U.S. West Coast has been relatively well researched with respect to the distribution of deep-sea coral species when compared to other regions of the world's oceans. However, significant spatial bias in sampling effort exists in the region and future field research efforts should be directed to unsampled areas to improve habitat predictions. Target areas for future field operations should include high probability areas identified in this study in regions of the U.S. West Coast EEZ that have not been visited. Predictive habitat model results are the only data available for areas that have not been sampled and should be used in in conjunction with other tools, data (i.e. geologic maps, multibeam bathymetry, etc.), and field surveys (where available) to help managers identify potential coral areas that remain at risk from human activity.  (75% threshold). For abbreviations, see Figure 8 in the manuscript. (TIF) Figure S6 Predicted habitat suitability in the Southern Region with EFH area closures (stippled areas) and CCA-West closures (hatched areas) for a) Alcyoniina, b) Antipatharia, c) Calcaxonia, d) Holaxonia, e) Scleractinia, f) Scleraxonia, g) all taxa (50% threshold), h) all taxa (75% threshold). For abbreviations, see Figure 8 in the manuscript. (TIF) Figure S7 Predicted habitat suitability for areas identified as probable hard substrate for a) Alcyoniina, b) Antipatharia, c) Calcaxonia, d) Holaxonia, e) Scleractinia, f) Scleraxonia, g) all taxa (50% threshold), h) all taxa (75% threshold).

(TIF)
File S1 Contains Tables S1-S10. Table S1: Correlation matrix for 10000 randomly placed points within the model domain. Table S2: Correlation matrix for points where the taxon Alcyoniina was found (n = 791). Table S3: Correlation matrix for points where the taxon Antipatharia was found (n = 128). Table  S4: Correlation matrix for points where the taxon Calcaxonia was found (n = 413). Table S5: Correlation matrix for points where the taxon Filifera was found (n = 11). Table S6: Correlation matrix for points where the taxon Holaxonia was found (n = 308). Table S7: Correlation matrix for points where the taxon Scleractinia was found (n = 203). Table S8: Correlation matrix for points where the taxon Scleraxonia was found (n = 277). Table S9: Correlation matrix for points where the taxon Stolonifera was found (n = 30).