Acoustic traits of bat-pollinated flowers compared to flowers of other pollination syndromes and their echo-based classification using convolutional neural networks

Bat-pollinated flowers have to attract their pollinators in absence of light and therefore some species developed specialized echoic floral parts. These parts are usually concave shaped and act like acoustic retroreflectors making the flowers acoustically conspicuous to the bats. Acoustic plant specializations only have been described for two bat-pollinated species in the Neotropics and one other bat-dependent plant in South East Asia. However, it remains unclear whether other bat-pollinated plant species also show acoustic adaptations. Moreover, acoustic traits have never been compared between bat-pollinated flowers and flowers belonging to other pollination syndromes. To investigate acoustic traits of bat-pollinated flowers we recorded a dataset of 32320 flower echoes, collected from 168 individual flowers belonging to 12 different species. 6 of these species were pollinated by bats and 6 species were pollinated by insects or hummingbirds. We analyzed the spectral target strength of the flowers and trained a convolutional neural network (CNN) on the spectrograms of the flower echoes. We found that bat-pollinated flowers have a significantly higher echo target strength, independent of their size, and differ in their morphology, specifically in the lower variance of their morphological features. We found that a good classification accuracy by our CNN (up to 84%) can be achieved with only one echo/spectrogram to classify the 12 different plant species, both bat-pollinated and otherwise, with bat-pollinated flowers being easier to classify. The higher classification performance of bat-pollinated flowers can be explained by the lower variance of their morphology.


Introduction
The more relevant information about the environment an animal can extract from the continuous sensory information flow, the better and more efficiently it can navigate, and the more resources it may access [1]. This applies to visual information but also to other perceptual modalities as animals largely benefit from information acquired through all sensory channels. In order to obtain relevant information, the stream of information is filtered, evaluated and compared with known sensory templates. This is commonly referred to as a classification task [2]. Besides classification, it might also be informative to assess how certain variables in the stream of information change over time. In the case of a nectar-feeding bat searching for flowers, this stream of information will not only consist of visual [3] and olfactory information [4] but also of echo acoustic information. The latter is particularly important for detection but also during the approach of flowers [5][6][7][8]. It has been debated to which extent plant and flower echoes contain information about the physical form [9,10] and whether echolocating bats can identify and classify plants and flowers by their echoes alone [9,11,12].
Insectivorous bats are able to detect and categorize flying insects echo-acoustically by evaluating small modulations in the amplitude and frequency domain of an echo called "acoustic glints". These glints are caused by reflections of the impinging sound waves at the fluttering wings of flying insects [13][14][15]. Similarly, it has been shown for nectar-feeding and omnivorous phyllostomid bats that they can classify and categorize objects of different form independent of their size, which implies that they are capable of acoustic generalizations of objects (Glossophaga soricina [16]; Phyllostomus discolor [17]. Such capabilities were further confirmed by neurophysiological studies (Phyllostomus discolor [18]). Those bats are able to correctly classify different degrees of "impulse response roughness" in the echoes of ultrasound calls reflected by different types of trees (e.g. deciduous trees and conifers; [19]). Impulse response roughness is an echo feature that describes the time/amplitude structure of a reflected signal. A time signal with equally high, closely spaced peaks (as expected from conifers for example) would be regarded as smooth, whereas a signal with peaks of different height and larger spaced peaks (as expected from deciduous trees) is coarse. Moreover, echo-acoustic measurements of trees and shrubs show that their echoes are complex and highly variable [12]. Echo-acoustic classification based on the magnitude of the echoes' spectrograms, a feature obviously perceived by bats, are thus likely to permit species-specific assignment [11]. Being able to acoustically recognize and categorize plant species might be beneficial for insectivorous bats as they can be used as landmarks [20] and/or to select feeding sites. For nectarivorous and frugivorous bats, however, recognizing plants by their echoes is of paramount importance as they are their primary food source. There is some evidence that characteristic differences in flower morphology lead to distinctive echoes [9] which could be recognized by bats. Moreover, for two plant species it was shown that they have specialized reflective floral parts which make the flowers acoustically conspicuous and facilitate detection [5,7]. Nectar-feeding bats foraging on these flowers depend on these signals and have difficulties to find flowers when these signals are missing [5]. This raises the question why there are not more bat-pollinated plant species that developed floral acoustic adaptations. Moreover, how do echo features of batpollinated flowers compare to echo features of flowers belonging to other pollination syndromes such as entomophilous (insect-pollinated) or ornithophilous (bird-pollinated) flowers?
To approach these questions, we conducted a series of morphological and echo acoustic measurements of 168 flowers from 12 different neotropical plant species belonging to 6 different families. For each family we chose a bat-pollinated and a hummingbird-or insect-pollinated plant that were both flowering sympatrically. We acquired echo recordings of these flowers, from different angles, and analyzed their spectral target strength for different frequency bands. We trained a deep learning classifier (a convolutional neural network, CNN) with echo spectrograms of the flowers and classified a test set of flower echoes. We predicted that echo-acoustic properties (such as target strength) of bat-pollinated flowers differ from those of flowers belonging to other pollination syndromes and that bat-pollinated flowers are easier to classify.

Results
The flowers we investigated differed considerably in size and form and consequently showed differences in their spectral directional characteristics (Fig 1). Bat-pollinated flowers did not share a common spectral signature or a feature that was different from that of flowers not pollinated by bats. We took different morphological measurements of each of the flowers such as the length of the corolla, the length of the calyx, and the inner as well as outer diameter of the flowers opening in the horizontal and vertical plane. With these measurements, we estimated the surface area of the flowers and compared them to the average target strength of the directional spectra. The surface area of the flowers had an influence on overall target strength, TS (40 kHz-160 kHz). Flowers with larger surface area had a higher mean target strength (y = 3.488 ln (x) -32.422; F 1,166 656.6; R 2 = 0.797; p < 0.0001; Fig 2A). A linear mixed effect model including an interaction between surface and pollination explained more of the variance in overall target strength than a model with only surface as fixed factor (LMM plant family as random factor, χ 2 (2) = 113.29, p < 0.0001). After removing the effects of flower surface area, bat-pollinated flowers had a significantly higher overall TS than insect-and bird-pollinated flowers (Independent samples T-test with residuals: t = 4.716, df = 122.64; p < 0.0001, Fig 2B). The influence of the surface area on the overall target strength was also evident when Images and directional spectra of flowers from 6 different plant families. As a visual representation of the flowers' echo characteristics, the echo spectra were plotted over the angle of sound incidence, which resulted in spectral different frequency bands were analysed separately (see S1-S4 Figs). Interestingly, the overall TS values for all flowers of Macrocarpea macrophylla for the highest frequency band formed a cluster and had values well above the predicted values for this flower size (average residual 5.0 dB ± 0.6).
In addition to size, flower shape had a major influence on the directional spectra. Similarshaped flowers, whether on the species or the individual level, had similar directional spectra. For example, all bell-shaped flowers (Macrocarpaea macrophylla, Symbolanthus calygonus, Merinthopodium neuranthum, Crecentia cujete, Paragonia pyramidata and Cobaea scandens) shared similar patterns in their directional spectra (Fig 1). They were characterized by frequent occurrence of interference patterns, i.e. reinforced frequencies in close vicinity to cancelled frequencies. The positions of these interference patterns within the spectrograms varied strongly, depending on the angle of sound incidence. This, in turn, resulted in a "spotted" pattern of the directional spectral plot. Moreover, it revealed that there is a relation between the size of the bell-shaped flowers and the occurrence of interferences (left column of Fig 1, Gentianaceae, Solanaceae, Bignoniaceae, Polemoniaceae; flowers in this column are arranged in from directional plots. The angle of sound incidence represented the position from which the echo was acquired, 0r epresents the opening of the flower, 90˚represents the side, images were taken from 45˚. The target strength, or echo intensity, is given in a colour gradation (see colour bar below). The images and directional spectra on the left are from the chiropterophilous plant species, at the right, the flowers are from insect-or bird-pollinated species, as indicated by the white symbols in the images. The bar in the upper left image represents 10 mm, all flower images are on the same scale.
https://doi.org/10.1371/journal.pcbi.1009706.g001 smallest to largest). The larger the flower, the more interferences occurred. This led to a more blurred or noisier appearance of the directional spectral plots.

Morphological flower features
For all individual flowers we deduced morphological features, such as length of the corolla and calyx, depth, and inner and outer diameter of the corolla in the horizontal as well as in vertical plane. As the flowers from different species differed considerably in size, we chose the coefficient of variation (c V ) for an interspecific comparison of variability for different morphological features. c V is the ratio of standard deviation (σ) to the mean (μ), and hence is dimensionless.
To quantify the morphological variation within the 14 flowers we collected for every species, we determined the c V for each of the morphological features, such as length of corolla, length of calyx, depth, and inner as well as outer diameter of the flowers opening in the horizontal and vertical plane. In all taxa, coefficients of variation were always higher for insect-and bird-pollinated species than for bat-pollinated species (Fig 3).
The c V of bat-pollinated flowers was significantly lower than the c V of insect-and bird-pollinated species (Linear Mixed Model [LMM] with pollination syndrome as fixed factor and plant family as random factor, χ 2 (1) = 24.085, p < 0.0001). The mean c V of bat-pollinated flowers was 0.08 whereas flower morphology of insect-and bird-pollinated species was more variable, as revealed by a c V of 0.16.

Classification with a convolutional neural network
Our CNN classifier was evaluated on a testing set derived from individual flowers which had not been used at any point in time during training ( Fig 4A). In all cases models trained on echoes sampled randomly performed better than respective implementations trained with the interval sampling, especially when classifying between 3 to 7 echoes, with 10 echoes the difference being negligible. The CNN was generally able to distinguish between any of the 12 classes well. Across the different number of inputs classification performance of plant species based on the echoes was between 84 and 98%. There was a positive correlation between the number of inputs and classification accuracy, with the largest gain when increasing the number of PLOS COMPUTATIONAL BIOLOGY echoes from 1 to 3. When the input to the CNN was just a single echo, the classification accuracy was 84%, and rose as high as 98% with 10 echoes (Fig 4B).
The classifier's confusion matrix ( Fig 4C) showed some compelling characteristics. The convolutional neural network performed considerably better in recognizing classes representing species pollinated by bats, than those pollinated by insects and hummingbirds, and this difference showed especially in earlier stages of training. When evaluated on 10 echoes the species classification accuracy for bat and non-bat-pollinated flowers was as follows: After 10 epochs 73% for bat-pollinated and 51% for non-bat-pollinated (95% confidence intervals, using binomial method for classification correct detection rate [21]: 69.3-77.5% versus 46.9-55.9%). After 40 epochs 97% for bat-pollinated and 93% for non-bat-pollinated (confidence intervals 95.8-99.0% versus 91.1-95.7%). The network's convergence depends on amount of inputs and takes longer the more inputs are processed simultaneously. With 1-5 inputs, the network achieved good results only after approximately 20 epochs, and around 40 epochs with 10 inputs.

Discussion
Nectar-feeding bats have to find and revisit hundreds of flowers every night to fulfil their energetic needs. Species-specific acoustic information may help them to decide which flowers they should exploit. It will facilitate recognition of suitable, bat-pollinated and thus nightly nectar secreting flowers. Moreover, species-specific echo sequences will inform the bats about flower morphology and how to get to the nectar. Therefore, acoustic information on floral shape, in addition to the olfactory and visual cues, provides crucial support for efficient foraging.

How much information do nectar-feeding bats acquire during their approach?
Nectarivorous and frugivorous phyllostomid bats emit echolocation signals during search flight at a rate of 15 ± 5 calls per second, which increases up to 32 ± 11 calls per second when approaching a target [22,23]. Our own observations with the species Glossophaga soricina showed that during behavioural experiments, when bats had to accomplish echo-acoustic discrimination tasks, repetition rate increases to up to 51 calls per second. As flying bats continuously change their position between echolocation call emissions they will receive target echoes from many different directions. Moreover, bats are known to investigate flowers of interest by hovering briefly in front of them [6]. They may actively acquire important echo information from the respective object and are likely to perform an "auditory scene analysis" where echo information is integrated over several calls [24]. Thus, spectral information similar to the information of the directional spectra, which consisted of 101 single spectra acquired from different directions (Fig 1), should be available to the bats within 2 to 3 seconds of an approach flight assuming repetition rates of 32 to 51 calls per second.
Such an "auditory scene analysis" can be based on different echo features, but is a spectrogram based auditory scene, like we assumed it for our CNN, realistic? Bats are able to deduce many aspects of the impulse response from the echoes and thus can use highly resolved temporal information i.e. range profile information [25]. Bats can clearly perceive spectral echo information [26], and they are known to use both spectral and temporal information [27,28] and may have spectrogram-like representations [10]. This is supported by many target discrimination tasks where spectral echo information was found to be a decisive cue for bats [16,26,[29][30][31][32][33]. Moreover, our results suggest that spectral echo sequences acquired from different directions are particularly informative, and even single flower echoes or small sequences contain species-specific information.

Are echoes of bat-pollinated flowers louder?
After removing effects of flower surface area, we found a higher overall target strength for batpollinated flowers than for flowers pollinated by insects or birds. The maximum distance at which an object can be detected echo acoustically by a bat depends on the target strength of the object [34]. Increased target strength of bat-pollinated flowers will increase echo acoustic detection distance. Furthermore, separation of target echoes and clutter echoes should also be favoured when the target echo is of higher intensity. What could be the reason for the increased target strength of bat-pollinated flowers? Sound waves are reflected at the border of two media, for example at the border between water and air. In case of flowers it would be the border of air and the waxy cuticle of the flowers' petals. Sound waves will be mainly reflected when the two media have different acoustic impedances [35,36] and will enter the medium if the acoustic impedances are equal or similar. The greater the difference in acoustic impedance of the two media, the greater is the magnitude of the reflected echo [35]. The acoustic impedance is determined by the product of speed of sound and the density of the medium, thus the magnitude of the reflected echo (i.e. target strength) increases when density of the medium increases. The stiff and waxy surfaces of many bat-pollinated flowers [37,38] might be an adaptation to increase density and thereby their acoustic impedance with respect to the acoustic impedance of air. This would be a mechanism to enhance flowers' target strength and increase their echo acoustic detection distance.
Bat-pollinated flowers show less variance in their morphological features than entomophilous or ornithophilous flowers. Although the present study, with a total of 32320 flower echoes, from 168 individual flowers, is one of the most extensive studies on echoacoustic adaptions of flowers, the amount of species covered is still rather limited (n = 12). However, as we consistently found a more conserved flower morphology and a better classification performance by our CNN for bat-pollinated flowers, we suggest that the conserved flower morphology is an acoustic adaption. Conserved morphology will lead to less variant echo characteristics, which in turn favours classification performance. To test if the lower variance in morphological features is directly resembled in a lower variability of echo features we determined the coefficient of variation (c V ) for the target strength (TS) values of the different frequency bands.
We found that the c V of the TS of bat-pollinated flowers was significantly lower than the c V of the TS of insect-and bird-pollinated species (Linear Mixed Model [LMM] with pollination syndrome as fixed factor and plant family as random factor, χ 2 (1) = 33.821, p < 0.0001, Fig 5). This indicates that a high morphological variance is directly related to higher acoustic variance. An explanation could be that bird-or insect-pollinated flowers are less constrained regarding their morphology as they lure their pollinators by conspicuous colours and slight differences in morphology will not influence their detectability. For bat-pollinated flowers the morphological fit between pollinator and flower is more important, moreover, the lower morphological variance facilitates echo acoustic detection.
Our deep learning study is in a similar spirit as prior work which used machine learning to classify objects based on their acoustic reflections. Support vector machines (SVM) have been used to classify plants [11], and multi-layer perceptron (MLP) to classify Platonic shapes [39]. In these prior works, the acoustic excitations were bat-inspired but synthetic frequency sweeps, whereas we used recorded bat echolocation calls (synthetically, via impulse responses). Perhaps more importantly, our deep learning method has a particular advantage over the machine learning algorithms previously used: a CNN can use a high-dimensional "image" such as an echolocation spectrogram as direct input to the algorithm, and does not require the data to be preprocessed by feature extraction or dimension reduction. The machine learning algorithm is thus able to make use of subtle discriminative cues in the signal which might otherwise be discarded. Further, we designed a custom CNN architecture that integrates the information from multiple echoes processed in parallel, allowing that the information can be combined holistically and non-linearly to produce the final decision. In these ways, our machine learning method moves closer to being a proxy for the type of inferences that an echolocating bat may be making. Moreover, our method might be an approach to classify artificial shapes and objects. Acoustic reflector shapes inspired by floral reflectors have already been shown to be useful as sonar landmarks for autonomous navigation [40]. Therefore, our approach might also be relevant for autonomous systems which are equipped with technical sonar systems.
Our CNN is not directly modelled on biological neural organisation, and the analogy is still highly heuristic. In particular, our CNN input is simply a set of acoustic echo spectrograms, presented in parallel, without contextual information such as (relative) angle. Bats do not emit 10 calls at once but in sequence, and may produce/analyse them dependently on each other: previous echoes can influence the interpretation of the current ones. Bats also presumably make use of timing and (relative) angle of echoes in their decisions. These observations could perhaps facilitate future work in implementing an 'angle aware' network, which, in addition to receiving an echo, would receive the angle from which this echo comes from and process the echoes sequentially (e.g. via a recurrent neural network). The production of further calls, or even the next choice of observation angle, could depend on the previous measurements. This online problem-solving setting would then be of the kind studied in reinforcement learning. These avenues offer future ways to study echolocation as an information-gathering strategy, by using machine learning to emulate real-time echolocation behaviour.

Plant species
A total of 168 flowers from six families and twelve species were collected and measured. For each family, we collected 14 flowers of a bat-pollinated species and 14 flowers of a species https://doi.org/10.1371/journal.pcbi.1009706.g005 pollinated by insects or birds and we tried to collect flowers not only from one individual plant but from at least 3 and up to 14 different plants (Table 1). With the exception of the species pair belonging to the family Polemoniaceae, all pairs of species occurred and flowered syntopically (in the same habitat). The Polemoniaceae pair flowered sympatrically (in the same geographic region) around 100 km apart. Flowers were collected in the field, kept fresh with wet cellulose tissue and brought to the laboratory at nearby field stations for immediate measurement. Studies were conducted at two locations in the mountain rainforest of south-east Ecuador at the Estación Científica San Francisco (ECSF) situated at 1820 m, halfway between Loja and Zamora (Longitude: 79˚4'41.

Echo measurements
The setup to measure ultrasonic echoes reflected from targets, here flowers, consisted of two main components: the biomimetic sonar head (Fig 6A) and the unit where objects could be fixed and rotated. Both were installed on aluminum guides made out of aluminum profiles (item Industrietechnik GmbH, Solingen, Germany). To turn the objects and ensonify them from different directions, we used a high precision stepping motor (Faulhaber GmbH, Schönaich, Germany). We used a U-formed metal frame, which allowed fixing the objects from their hind side and freely adjusting the center of rotation. The high precision stepping motor was controlled by a computer and turned the flowers in 1.8˚steps. The biomimetic sonar head (representing a "bat head") consisted of a loudspeaker and microphone, which were installed in a solid aluminum body on a vertical guide (Fig 6B). To mimic the movement of a bat's head during approaches to a target, the aluminum body had two fixtures for a microphone to the left and right (45˚) of the centrally located loudspeaker at a distance of 25 mm between the centre of the microphone and the centre of the loudspeaker (Fig 6A). For most of the measurements, we only used the right microphone which was a ¼" free-field microphone type 40BF in combination with the preamplifier 26AB connected to the power module 12AA (all from G.R. A.S. Sound & Vibration, Holte, Denmark). We used a custom-built EMFit (Electro Mechanical  For the measurements the distance between the front edge of the blossom and the front edge of the transducer was adjusted to 20 cm. Flowers were fixed at their back with a thin metal needle. This needle was mounted on the U-shaped holder (see Fig 6B), which was fixed on the extended axis of a stepper motor that permitted exact and reproducible rotation with defined increments. As the U-shaped holder was adjustable, the center of rotation for the objects could be varied. We always chose the front edge of the flowers' corolla as center of rotation so that the sound propagation direction was perpendicular to the plane of flower opening (0˚).
We ensonified the flowers with a constantly repeated MLS (maximum length sequence), which is a periodic pseudorandom binary sequence of zeros and ones (for example: 0001101001110. . .). It is basically a predetermined noise signal and an important property of any MLS is that its autocorrelation is a very short impulse (Dirac impulse). This means that if the recorded echo is combined with the original MLS in a deconvolution process (similar to a cross-correlation), the impulse response (IR) from the object is obtained. The impulse response can be thought of as an echo of a very short, broadband click and it fully characterizes the transmission properties of any linear and time-invariant (LTI) system. Actual echoes of bat echolocation calls can be derived from the impulse responses by convolution with calls. The advantage of using MLS instead of short impulses is that MLS has a longer duration and thus contains more energy. This results in a much better signal-to-noise ratio (for more information regarding MLS method see Cho, 1990). We used a MLS containing 16 383 samples, which was continuously played at a sampling rate of 500 kHz resulting in a duration of 33 ms. The analogous microphone signal was digitized with a sampling rate of 500 kHz at 16 bit resolution.
Acoustic measurements or scans were taken over a 180˚angle from -90˚to 90˚with regard to the center of the corolla in 1.8˚increments. This led to 101 measurements per flower. We scanned most the 168 flowers both in the horizontal as well as in the vertical plane. Thus, we received 202 individual echoes (to recover 202 impulse responses) for each flower acquired from different directions. 15 individual flowers from various species were only measured in one plane, or measurements had to be discarded respectively because the measurement was interrupted mainly due to an overheated amplifier. Replay and recording were sample synchronous. We made sound recordings of the MLS-signal which was backscattered at the objects surface. The raw recordings had a duration of two seconds each and we deduced the impulse response (IR) by deconvolution with the original MLS as described above. As the sequence length of the MLS was 33 ms the 2 s recording consisted 60 identical IRs. To further increase the signal to noise ratio of the IRs we averaged over 50 sequences. In this averaged IRs we searched for the peaks originated from the flowers, we cut them out (rectangular window, 1024 samples) and calculated the power spectral density.
To obtain spectra independent of the frequency response of the loudspeaker, an additional measurement was performed for calibration. We installed a 25 cm x 25 cm plexiglass plate in place of the flowers and measured its echo. To obtain the flowers' spectral characteristics regardless of the frequency response of the loudspeaker we used this 'plate measurement' to calculate the difference between its spectrum and the spectrum of the recorded flower. This permitted us to assess the flowers'spectral target strength. The values we obtained with our method resemble the spectral target strength (TS) at a reference distance of 10 cm. We analyzed the spectral target strength for 5 different frequency bands. One frequency band (40 kHz-160 kHz) was very broad, thus covering the typical bandwidth of a glossophagine bat's echolocation call. The others had a lower bandwidth: we choose 45 kHz, 68 kHz, 102 kHz and 153 kHz as center frequencies, upper and lower boundary of the bands where ±20% around these center frequencies. To determine TS values for each flower for these frequency bands, we averaged TS (after delog) for the azimuth and also for the elevation measurements (each consisted of 101 measurements) and then used the mean of azimuth and elevation for statistical comparisons between flower types.
Classification with a convolutional neural network (CNN): For each species we had 14 distinct scans of different individual flowers. The distinct scans of each class were separated into training, validation and testing subsets, stratified per individual flower to avoid the risk of overestimating classification performance due to any "overfitting" in the training process. The validation and testing set covered all 12 species, but contained echoes only from individual flowers that had not been used for training. Two instances of each class were put in the test set, eight instances in the train and two in the validation set, equivalent to a split of approximately 58% / 21% / 21%. We derived flower echoes by convolution of a typical echolocation call of Glossophaga soricina with the impulse responses we measured from the different angles (see Fig 7). The call was filtered prior to the convolution with a high pass filter with a cutoff of 5kHz, to ensure that no low frequency noise leakage was present. The spectrograms of each echo were calculated with a Hann window of size of 256 samples and a hop size of 90%. This allowed for a good trade-off between time and frequency resolution for this task [50].
Our classifier would use a small number of synthetic echoes to make a prediction, designed as a simplified analogy of a bat using multiple calls on approach to a flower. Only the echo spectrograms were used as input, without any information on the chosen angles or their relation. We studied groups of 1, 3, 5, 7 and 10 echoes. Further, for multi-echo situations we evaluated two ways to select groups of echoes, following different assumptions about a bat's physical approach.
1) Random split (assumption: bats randomly receive echoes from different positions during approach): Impulse responses from each single instance of all flowers were randomly shuffled along the angle axis, and separated into sets of 3, 5, 7 and 10. This procedure resulted in 4 different sets-each containing all available data in chunks of respectively 3, 5, 7, and 10.
2) Interval split (assumption: bats approach in a narrow trajectory and receive echoes from certain consecutive sequences): Again impulse responses were taken from each instance in chunks of X = (3; 5; 7; 10). This time however, the IRs were selected in structured sequences. A random interval n between 2 and 8 was selected and every nth impulse response was pickedup to length X. This allowed for a more realistic approximation of an actual bat's approach, which would include multiple echoes, coming in succession from one direction. Note that this option might result in less diversity within a chunk (increased redundancy of information between echoes) and thus is expected to reduce recognition performance.
The architecture of the CNN is inspired by experiments on classification of plant appearance qualities for flower grading [51]. 1, 3, 5, 7 or 10 echo spectrograms together constitute one simultaneous input sample to the network. Independently of the number of inputs, each echo goes through a 'preprocessing layer'-consisting of a convolutional layer with 16 channels and a max pooling layer, in which the weights are shared, i.e. the same preprocessing for each spectrogram in the chunk. The results of this preprocessing are then concatenated into one tensor and processed further through another set of convolutional (32 channels) and max pooling layers. Then, three dense layers, and finally output of probabilities for each of the 12 classes. Each convolutional layer has a kernel size of 3x3, with stride 1, and each max pooling layer a kernel of size 3x3 with stride 2. A detailed overview of the architecture can be seen in Fig 3. Note that one scan of one flower in the dataset had been mis-identified during manual labelling: it was labelled as Cc_e_01 but should have been Cs_e_10 (both bat-pollinated). The data of this scan has been removed from plots and analysis. It was included in training of the classifier but not in testing; hence classification results are not over-estimated.
We followed standard good practice to train a CNN deep learning classifier, as follows. Cross-entropy was used as the loss function to measure model performance during training. The Adam optimizer [52] was used for training, with a learning rate of .0001, and batch size of 16. During each epoch training was done over the entire train split in shuffled batches, and then the model was run with validation data, in not shuffled batches, to monitor performance. The final evaluation was done over the test set, after training was finished. Completion of training was decided by early stopping, as well as manual inspection of loss values for train and validation splits. Early stopping is a technique that prevents overfitting by stopping the training process if performance on the train set improves, but performance on the validation set stagnates, or degrades, over a predefined amount of epochs. The creation of an echo spectrogram occurred on-demand in the deep learning system, by convolving a selected impulse response with a bat call and then calculating the resulting spectrogram. This approach slows down processing to some degree, since the spectrograms are not pre-calculated, but allows for quick and convenient changes to the window/hop sizes during training and hyperparameter tuning. An example input to the CNN can be seen on Fig 6C-6E.

Morphological data
We collected morphological data for each flower including length of the corolla, inner depth, calyx length, inner and outer diameter of the corolla opening for both, the horizontal and vertical axis, see Fig 8. These characteristics were measured with a metal ruler (0.5 mm precision). To estimate the reflective flower surface area F sur , we calculated the area of a bell like form consisting of a circle surface (resembling the corolla opening) and a paraboloid surface (resembling the corolla chalice). We used half of the mean outer corolla diameter as radius r c to calculate the area of the corolla opening and we used half of the mean inner corolla diameter as radius r p and the corolla length as height h to calculate the outer corolla surface area. The mean corolla diameter was the average of the vertical and horizontal measurements.
As Witheringia sp. had a morphology that was very different from the rest of the flowers (Fig 1), we took the following measures for this species: length and width of the petals, length of the calyx, length and width of the anthers. For the species Condaminea corymbosa we did not measure calyx length, because it is reduced in this species and hard to measure. As flowers of Cantua quercifolia showed radial symmetry we measured inner and outer diameter of the corolla opening only in one plane rather than both horizontal and vertical. For the two latter species we adapted the surface calculation accordingly.