Underwater hyperspectral classification of deep sea corals exposed to a toxic compound

Tropical corals are routinely monitored from satellites and aeroplanes using remote sensing techniques, revealing the health of coral reefs. Notably, coral bleaching is continuously monitored using multi- or hyperspectral imagery from satellites and aeroplanes. For deep-water corals, however, no established remote sensing technique exists, and for this reason, much less is known about the status of their habitats over time. In this study, we examine the feasibility of similar ecosystem health monitoring by the use of underwater hyperspectral imagery. The purpose of the present work was to evaluate the use of underwater hyperspectral imaging to detect changes in health status of both orange and white color morphs of the coral species L. pertusa. A total of 66 coral samples were exposed to 2-methylnaphthalene concentrations from 0 mg L−1 to 3.5 mg L−1, resulting in corals of varying health condition. Using a machine learning model for classification of reflectance spectra, we were able to classify exposed corals according to lethal concentration (LC) levels LC5 (5 % mortality) and LC25 (25 % mortality). This is a first step in developing a remote sensing technique able to assess environmental impact on deep-water coral habitats over larger areas underwater.

depending on their color and pigmentation providing a spectral signature of that object. The Underwater Hyperspectral Imager (UHI) used in the present work represents a new system for identification, mapping and monitoring of objects of interest (OOI) at the seabed [25][26][27]. Underwater hyperspectral imaging is constrained to the visible part of the spectrum, as both ultraviolet and infrared radiation are attenuated in water [28]. Hyperspectral imaging generates large amounts of data which require sophisticated data analysis and machine learning methods [29]. Multivariate data analysis and machine learning have been used successfully in several marine environmental studies for the interpretation of large data sets, for example in integrated environmental monitoring [30] and for the analyses of photos to assess the potential impact of water-based drill cuttings on deep-water rhodolith-forming calcareous algae [31].

The purpose of the present work was to evaluate the use of UHI and multivariate data analysis to detect changes in health condition of the coral species L. pertusa. Corals were exposed to 2-methylnaphthalene in laboratory experiments in order to provide corals with health condition varying from unaffected to dead. Hyperspectral images of exposed and control corals were then recorded after a recovery period. Finally, classification of these images using machine learning shows in a visual way which spatial areas are affected by exposure to toxic compounds.

The experimental work presented in this study consists of the following activities: collecting and rearing coral samples, exposing the corals to the toxicant 2-methylnaphthalene, monitoring the corals to determine polyp mortality, and imaging the corals using UHI. An overview of the timeline is given in Table 1.

Samples of L. pertusa were collected at Stokkbergneset in Trondheimsfjorden (Norway, 63.47°, 9.91°) on September 1st 2015 in collaboration with the Norwegian University of Science and Technology (NTNU) onboard R/V Gunnerus. The site is characterized by a steep rock wall with L. pertusa occurring in colonies from 100 m to 500 m depth [32]. The coral samples were collected from four different colonies and included both white [...]

The set-up comprised five treatment groups: one control group (C0) and four exposed groups (C1, C2, C3, and C4 [...] branch with 3 to 9 polyps. Briefly, after 24 h acclimatization, L. pertusa were used in a 96 h acute toxicity test with 2-methylnaphthalene (see Table 1 for a timeline). During acclimatization and exposure, each coral replicate was kept in a square glass aquarium containing 1.5 L sea water. The animals were not fed during the toxicity testing.

[...] were kept in the same tank as C0-C4 during the recovery period. Hence, reference alive corals were exposed to minimal handling compared to C0-C4 before they were scanned by the UHI. The reference alive group was included as an additional control group in [...]

Prior to exposure to 2-methylnaphthalene (at time −24 h, see Table 1 for a timeline), the number of alive polyps on each coral sample was counted. Mortality was assessed after recovery and UHI scans (at time 1440 h). This approach was chosen because it was not possible to determine whether polyps were alive or dead immediately after exposure ended.
By keeping the coral samples in a recovery tank over weeks, the samples and individual polyps could be monitored visually for the presence of soft tissue on skeleton and polyps. A polyp was classified as dead when soft tissue was no longer present within the polyp's calyx. Illustrations of live and dead polyps are given in Figure 2. The fraction of dead polyps (i.e., the number of dead polyps divided by the number of alive polyps before toxicant exposure) is presented in results and statistics as polyp mortality.

The underwater hyperspectral imager is a line camera (sometimes referred to as a "push broom" sensor), consisting of a thin slit, a spectrograph, and a monochrome camera, [...]

By comparing the measured spectra of the Spectralon, $I_\mathrm{spec}(x, \lambda)$, with its calibrated reflectance spectrum, $R_\mathrm{spec}(\lambda)$, a conversion factor, $A(x, \lambda)$, was found for every spatial pixel (at position $x$) covered by the Spectralon.
The reflectance of the PVC, $R_\mathrm{PVC}(\lambda)$, was then calculated by applying the conversion factor, $A_\mathrm{spec}(x, \lambda)$, to the spatial pixels in the UHI slit covering the Spectralon, when imaging the PVC at 3 cm. Finally, conversion factors could then be found for all spatial pixels in the slit at any altitude, and be applied to acquire the reflectance at the altitude of the corals for the full image, where $I(x, y, \lambda)$ is the recorded intensity, i.e., the hyperspectral image.

[...] were set up on the bottom of a tank and imaged using an underwater hyperspectral imager (UHI). The UHI was attached to a linear scanning mechanism and operated in a "push-broom" fashion. The setup also includes a Spectralon reference plate and an inclined reference plate to account for changes in the spectrum due to the presence of water.

Data used in spectral classification

Spectra were recorded for 60 coral samples representing the C0 to C4 corals used in the exposure study, as well as for the 6 reference alive corals. Due to poor transmission of ultraviolet and near-infrared wavelengths even in extremely clean water [28], poor signal quality was experienced at both ends of the spectrum. Therefore, only data from the spectral range 400 nm to 750 nm are included in this study. The recorded spectra are placed in a matrix $X$, where each column corresponds to one wavelength, and each row corresponds to one observation, i.e., a pixel in the hyperspectral image.
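The conversion from recorded intensity to reflectance, and the arrangement of spectra into the matrix X, can be illustrated with a minimal sketch. It assumes that the conversion factor is the ratio of calibrated reference reflectance to measured reference intensity, and it ignores the intermediate PVC step and any altitude dependence. The function names, array shapes and all numerical values are hypothetical and are not taken from the study.

```python
import numpy as np

def conversion_factor(i_ref, r_ref):
    """Assumed per-pixel conversion factor A(x, lambda) = R_ref(lambda) / I_ref(x, lambda).

    i_ref: measured reference intensity, shape (n_x, n_bands)
    r_ref: calibrated reference reflectance, shape (n_bands,)
    """
    return r_ref[np.newaxis, :] / i_ref

def to_reflectance(cube, a):
    """Apply A(x, lambda) to every scan line of a cube with shape (n_y, n_x, n_bands)."""
    return cube * a[np.newaxis, :, :]

def spectra_matrix(reflectance, wavelengths, pixel_mask, lo=400.0, hi=750.0):
    """Stack the selected pixels as rows of X, keeping only bands within [lo, hi] nm."""
    keep = (wavelengths >= lo) & (wavelengths <= hi)
    return reflectance[pixel_mask][:, keep]

# Synthetic example; every value below is made up for illustration.
rng = np.random.default_rng(0)
wavelengths = np.arange(380.0, 801.0, 10.0)                  # 43 bands, 380 to 800 nm
n_y, n_x, n_bands = 5, 8, wavelengths.size
illumination = rng.uniform(0.5, 1.5, size=(n_x, n_bands))   # lamp, water and sensor response
r_ref = np.full(n_bands, 0.99)                               # hypothetical reference reflectance
i_ref = r_ref * illumination                                 # intensity recorded on the reference
true_reflectance = np.full((n_y, n_x, n_bands), 0.3)
cube = true_reflectance * illumination[np.newaxis]           # recorded intensity I(x, y, lambda)

a = conversion_factor(i_ref, r_ref)
reflectance = to_reflectance(cube, a)

pixel_mask = np.zeros((n_y, n_x), dtype=bool)
pixel_mask[1:4, 2:6] = True                                  # pretend these pixels cover a coral
X = spectra_matrix(reflectance, wavelengths, pixel_mask)     # rows: pixels, columns: wavelengths
```

In this toy setting the illumination cancels exactly, so the recovered reflectance equals the value used to generate the data. Real measurements would additionally include noise and the altitude effects described above.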

The matrix $Y_h$, used for Projection to Latent Structures (PLS) [34] in the preprocessing stage only, consists of the following columns: 2-methylnaphthalene concentration and polyp mortality. Each row of the matrix $Y_h$ corresponds to the same row in $X$, i.e., spectra associated with the same coral sample.

The vector $Y_c$, used for classification, consists of one categorical variable, namely, the exposure category of the organism from which the spectrum came. The exposure category is determined by the LC5 and LC25 concentrations (lethal concentration for 5 % and 25 % of polyps) of 2-methylnaphthalene: the non- or low exposure category was exposed to a concentration C ≤ LC5; the intermediate exposure category was exposed to LC5 < C ≤ LC25; and the high exposure category was exposed to C > LC25.

The stages consist of standardization, PLS and transformation, followed by a ν-Support Vector Machine (νSVM) classification algorithm [35][36][37]. Note that linear classification algorithms were found to be unsuitable for the problem at hand, due to the nonlinear separation of points in the PLS latent variable space. The scikit-learn software package was used for both the PLS and SVM algorithms [35].

Standardization of spectra

Before attempting classification, the model inputs were standardized:

$$ z_i = \frac{x_i - \mu_i}{\sigma_i}, $$

where $z_i$ is the scaled value, $x_i$ is the original value, and $\mu_i$ and $\sigma_i$ are the mean and standard deviation of feature $i$ (i.e., the spectrum intensity at wavelength $i$), respectively. Following standardization, all input variables to the model have zero mean and unit variance. This avoids the problem of some variables being weighted more heavily due to, e.g., different units: the importance of a variable should not depend on whether it is measured in milligram or kilogram. Each feature corresponds to a wavelength in the spectrum, and a column of the input data matrix $X$ [...]

Following standardization of inputs, a PLS model relating spectra ($X$) to 2-methylnaphthalene concentration and polyp mortality ($Y_h$) is constructed:

$$ Y_h = X B + E, $$

where $E$ is the error term. In the process of constructing the regression coefficient matrix, $B$, the input matrix $X$ is transformed to a lower-dimensional (latent) subspace; hence the name PLS. This is done in such a way that the covariance between the $X$ and $Y_h$ matrices is maximized. Furthermore, PLS selects variables that extract the [...]

Subsequently, classification is improved both in quality and speed, as noise is removed and the dimensionality of input data is reduced significantly. Also, separation of spectra is improved as the PLS algorithm maximizes covariance between spectra and 2-methylnaphthalene concentration and polyp mortality.
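Returning to the standardization step, the claim that the weight of a variable should not depend on its unit can be checked numerically. The sketch below uses made-up data with two columns and shows that expressing one column in kilogram instead of milligram leaves the standardized values unchanged; the helper name `standardize` is hypothetical.

```python
import numpy as np

def standardize(X):
    """Column-wise z-score: subtract the column mean, divide by the column standard deviation."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(1)
# Two made-up features, both expressed in milligram.
X_mg = rng.normal(loc=[5.0, 200.0], scale=[1.0, 40.0], size=(100, 2))

# Same data, but the second feature is now expressed in kilogram.
X_mixed = X_mg.copy()
X_mixed[:, 1] /= 1e6

Z_mg = standardize(X_mg)
Z_mixed = standardize(X_mixed)
```

Both `Z_mg` and `Z_mixed` have zero mean and unit variance per column and agree with each other up to floating-point error, so a downstream model sees the same inputs regardless of the unit.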

Classification of spectra and samples

The input to the classification stage is, as described above, spectra ($X$) transformed [...]

[...] nominal concentrations for C1 to C4, respectively (see Table 2 and S1 [...] sticking to the tube walls. It is unlikely that bioaccumulation of 2-methylnaphthalene in coral tissue was of significance as the biomass of corals in each replicate beaker was low (ranging from 9 g to 21 g coral, skeleton and soft tissue, in 1.5 L sea water). In all beakers, the concentration of 2-methylnaphthalene was lower at T0 than measured at [...]

[...] [39]. Although no firm conclusion can be given, we believe that there has been no influence of intercommunication between polyps affecting our mortality data.

The variation in polyp mortality was relatively high, with a polyp mortality ranging from 0 % to 50 % for concentrations from 0.9 mg L−1 to 2.3 mg L−1, and a polyp [...]

• Non-exposed and lowest exposed corals: C ≤ 1.25 mg L−1
• Intermediate exposed corals: 1.25 mg L−1 < C ≤ 2.30 mg L−1
• Highest exposed corals: C > 2.30 mg L−1

What we here refer to as low, intermediate and high doses is based on the doses used in the toxicity test to ensure mortality was obtained, and does not refer to what we consider as low, medium, and high concentrations in situ and associated with, e.g., oil spills.
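The concentration limits listed above translate directly into a labelling rule. The sketch below encodes them as a small function; the function name, constant names and label strings are hypothetical, and the two limits are the values quoted in the list.

```python
# Concentration limits in mg/L, as quoted in the list above.
LOW_MAX = 1.25
INTERMEDIATE_MAX = 2.30

def exposure_category(concentration_mg_per_l):
    """Map a 2-methylnaphthalene concentration to an exposure category label."""
    if concentration_mg_per_l <= LOW_MAX:
        return "no_or_low"
    if concentration_mg_per_l <= INTERMEDIATE_MAX:
        return "intermediate"
    return "high"

# Boundary values fall in the lower category, matching the "≤" in the list.
labels = [exposure_category(c) for c in (0.0, 1.25, 1.26, 2.30, 3.5)]
```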

As such, the doses of 2-methylnaphthalene used in this study were not primarily chosen to be environmentally relevant. Compared to measured doses of naphthalenes in sea water samples during oil spills, e.g., Deepwater Horizon, the doses used in the present study are considerably higher [40]. Concentrations of oil compounds, including naphthalenes, in water samples collected during or shortly after an oil spill are generally below the lowest levels of exposure reported in this study. In a study by Guyormach et al. [41], dissolved PAH concentration was monitored during a field trial in the North [...]

[...] polyp mortality. Several spectral features, i.e., peaks and dips, can be observed across the spectrum, especially for higher order loadings. Loadings 5 and 6 exhibit high-frequency noise, indicating that one is approaching an appropriate cutoff in the number of latent variables. Some features appear to be present in both color morphs (e.g., the peak/dip at 650 nm for loading 5 in both color morphs), whereas others are observed only in one color morph (e.g., for white corals, a dip is observed in loading 3 at 540 nm, but no such dip is observed for orange corals). Each spectral dip or peak (note that the sign of loading vectors is arbitrary) can potentially be interpreted as absorption by a chemical compound in the coral. It is of interest to examine further the sources of these spectral features, as it would enable the creation of more robust interpretation models for hyperspectral imagery.

After fitting the PLS model to training data, $X$ scores were extracted. The scores correspond to coordinates in the latent variable space. The first three score components for white coral data can be seen in [...]

[...] (a) shows scores for X latent variable 1 vs latent variable 2, (b) shows scores for X latent variable 1 vs latent variable 3. One dot corresponds to one spectrum (i.e., one hyperspectral image pixel) taken from a coral.

[...] whereas spectra from corals exposed to higher values of 2-methylnaphthalene form scattered clusters outside of the central cluster. We attribute this clustering partly to the 2-methylnaphthalene exposure levels being highly clustered. When rotated in a 3-dimensional view, one can more easily see that the spectra of high exposed corals form a "shell" outside of the central cluster consisting mainly of low exposed corals, and do not form a simple linear structure. With this in mind, we note that a linear discriminant will not be able to separate the spectra from the lowest and highest concentrations, nor will a linear regression algorithm (including PLS regression) satisfactorily predict 2-methylnaphthalene exposure level. However, as we observe good separation between the non-exposed to lowest exposed corals, the intermediately exposed corals and the highest exposed corals, we have performed spectral classification using the nonlinear [...]

The quality of the classification of exposure category is assessed using the metrics precision, $P$, and recall, $R$, which are frequently used in the context of classification. The metrics are defined as

$$ P = \frac{T_p}{T_p + F_p} $$

and

$$ R = \frac{T_p}{T_p + F_n}. $$

Here, $T_p$ is the fraction of true positive classifications, and $F_p$ and $F_n$ are the fractions of false positive and false negative classifications, respectively. Finally, we give the $F_1$-score, which takes both precision and recall into account as their harmonic mean:

$$ F_1 = \frac{2 P R}{P + R}. $$

Note that all quantities $T_p$, $F_n$, $F_p$, $P$, $R$, and $F_1$ are defined on the interval [0, 1], and that a score of 1 is optimal for $P$, $R$, and $F_1$.
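As a worked illustration of these metrics, the sketch below computes precision, recall and the F1-score for one class from a small set of made-up labels. It uses counts rather than fractions, which gives the same ratios; the function name and the example labels are hypothetical.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall and F1 for one class treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

# Made-up labels for eight spectra.
y_true = ["low", "low", "low", "mid", "mid", "high", "high", "high"]
y_pred = ["low", "low", "mid", "mid", "mid", "high", "low", "high"]

# For "high": 2 true positives, 0 false positives, 1 false negative.
p_high, r_high, f1_high = precision_recall_f1(y_true, y_pred, positive="high")
```

Here every spectrum predicted as "high" is indeed "high" (precision 1), but one of the three "high" spectra is missed (recall 2/3), and the F1-score of 0.8 lies between the two.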

The input data is split 80/20 into a training data set (80 %) and a test data set (20 %). The split is performed using random stratified sampling, giving each class (no or low, medium, and high exposure) a number of training data samples proportional to the size of the class. The classification model is fit to the training data set, and all classification results shown below are for the test data set.
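A stratified 80/20 split of this kind can be sketched with scikit-learn's `train_test_split`. Whether this particular function was used in the study is not stated, so it is an assumption; the class sizes and feature values below are made up.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
y = np.array([0] * 500 + [1] * 300 + [2] * 200)   # imbalanced class labels (made up)
X = rng.normal(size=(y.size, 4))                   # stand-in for latent-space scores

# stratify=y keeps the class proportions equal in the training and test sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

train_counts = np.bincount(y_train)
test_counts = np.bincount(y_test)
```

With stratification, the smallest class contributes the same 20 % of its samples to the test set as the largest class, so no class is left under-represented in either set by chance.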

Results for the three-class case are given in Table 3. The overall picture is that an accuracy in the range 78 % to 97 % is achieved using SVM classification. The poorest performance is found for high exposed corals. This is possibly due to the fact that fewer spectra were collected for this class (see the "pixels" column of Table 3).

Per-organism classification

Clearly, it is not meaningful to assign a separate exposure level to each pixel of a coral. Hence, following classification of spectra, i.e., hyperspectral image pixels, we classify the corals at the sample level. This is done by a majority vote: the coral sample is assigned the exposure category to which most of its pixels belong. In this manner, spatial [...]

The spectra taken from the orange reference corals classify with a precision of 1.00 and a recall of 0.95, whereas spectra from the white reference corals classify with a precision of 1.00 and a recall of 0.88. This is consistent with the classification results for single spectra above, and verifies that the experimental conditions have not systematically modified the spectra, at least with respect to classification results.

Finally, at the organism level, 100 % of reference coral samples were classified in the category of non-exposed or lowest exposed corals. This is correct, as the reference corals were kept separately from the coral samples exposed to 2-methylnaphthalene, and should thus be representative of healthy corals in the wild.
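The majority vote used for per-organism classification can be sketched in a few lines. The sample names, label strings and pixel counts below are made up, and the handling of ties is an assumption of this sketch, since the text does not say how ties were resolved.

```python
from collections import Counter

def classify_sample(pixel_labels):
    """Return the most common per-pixel label for one coral sample.

    Ties are resolved in favour of the label seen first, which is an
    arbitrary choice made for this sketch.
    """
    if not pixel_labels:
        raise ValueError("a sample needs at least one classified pixel")
    return Counter(pixel_labels).most_common(1)[0][0]

# Hypothetical per-pixel classification results for two coral samples.
pixels_by_sample = {
    "coral_a": ["low"] * 70 + ["medium"] * 20 + ["high"] * 10,
    "coral_b": ["low"] * 15 + ["medium"] * 25 + ["high"] * 60,
}
sample_labels = {name: classify_sample(labels) for name, labels in pixels_by_sample.items()}
```

Because a single label is produced from many pixels, a sample can be classified correctly even when a substantial minority of its pixels is misclassified.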

Underwater hyperspectral imaging has been shown capable of classifying the cold-water coral L. pertusa according to individual exposure to the toxic petroleum compound 2-methylnaphthalene, when categorized according to lethal concentration levels LC5 and LC25 (5 % and 25 % mortality, respectively). A classification model consisting of projection to latent structures followed by a support vector machine classifier achieves a classification score of 73 % to 100 % correctness for single spectra. When exploiting the spatial information in the hyperspectral images, a full 100 % of L. pertusa samples are classified correctly under laboratory conditions. The model has been verified with hyperspectral images of reference coral samples not exposed to any toxic compound.

Fig 9. Classification of organisms. Two example corals from each exposure category: no- or low exposure (left), medium exposure (center), and high exposure (right). Note that the oddly colored patches in the lower left corner (left image) and upper left corner (center image) are caused by overexposure of a mechanical part holding the corals. The lower row shows classification into no- or low exposure (orange), medium exposure (red), and high exposure (right).

Future work

Scientists have often struggled to provide an integrated and non-invasive assessment of coral health status. In that respect, this study represents the first step towards a non-invasive, automated method for in situ mapping of deep-water coral condition. In order to develop exposure mapping using underwater hyperspectral imaging into a field-ready method, several challenges remain. [...]

Table 5 shows the same results, but for the exposure beakers. The exposure was performed with 3 corals in each exposure beaker, across 4 replicates for each of the 5 treatment groups.

Regarding the analyses performed on the 2-MN water samples, the limit of quantification is 0.1 mg L−1, and the uncertainty is 20 % at the limit of quantification. This uncertainty decreases above the limit of quantification. Additional uncertainty is expected as the magnetic stirring mechanism cannot operate at high frequency, as this would disturb the coral samples.

[...] The bottom three rows indicate average, standard deviation, and percentage of nominal concentration for each column, respectively. The missing value in column C4 II was not analyzed.