Assemblage of Focal Species Recognizers - AFSR: A technique for decreasing false positive rates of acoustic automatic identification in a multiple species context

Passive acoustic monitoring (PAM) coupled with automated species identification is a promising tool for species monitoring and conservation worldwide. However, high false positive rates are still an important limitation and a crucial factor for acceptance of these techniques in wildlife surveys. Here we present the Assemblage of Focal Species Recognizers - AFSR, a novel approach for decreasing false positives and increasing model precision in multispecies contexts. AFSR focusses on decreasing false positives by excluding unreliable sound file segments that are prone to misidentification. We used MatlabHTK, a hidden Markov model interface for bioacoustic analyses, to illustrate the AFSR technique by comparing two approaches: 1) a multispecies recognizer, where all species are identified simultaneously, and 2) an assemblage of focal species recognizers (AFSR), where several recognizers that each prioritise a single focal species are summarised into a single output according to a set of rules designed to exclude unreliable segments. Both approaches (the multispecies recognizer and AFSR) used the same training dataset of sound files but different processing workflows. We applied these recognisers to PAM recordings from a remote island colony with five seabird species and compared their outputs with manual species identifications. False positive rates and precision improved for all five species when using AFSR, achieving 0% false positives and 100% precision for three of the five seabird species, and <6% false positive rates and >90% precision for the other two species. AFSR’s output was also used to generate daily calling activity patterns for each species. Instead of attempting to extract useful information from every fragment in a sound recording, AFSR prioritises more trustworthy information from sections with better quality data. AFSR can be applied to automated species identification from multispecies PAM recordings worldwide.

Introduction

Recent technical advances in sound-recording technologies and analyses have considerably enlarged the potential application of bioacoustics in conservation studies. Acoustic automated identification has been applied to numerous taxa including insects [1,2],

Automated identification studies commonly focus on increasing detection rates in order to maximise the number of identified target calls in a sound file. However, in PAM, the amount of sound being recorded can easily reach terabytes of data. Potamitis et al. [21], for example, rejected 90-95% of their recordings because they did not meet their target specifications in the signal pre-processing stage. It is probably impossible to extract useful information from every single sound segment when monitoring long term. Instead, we advocate focus should be given to extracting more precise and trustworthy information from the segments with the highest

Acoustic Recordings

We used active recordings made by researchers in the field, and PAM recordings made

could be identified to species by seabird experts, and had good sound quality. We used these selected calls to create a preliminary species recogniser (S1 Supporting Information), that we

The workflow within each approach runs from left to right (columns) and the workflow from one modelling approach to the next runs from top to bottom.

Five independent species-specific recognisers were built using exactly the same data set of sound files previously described. In each case the sound files were associated with different annotation text files. For example, in the Little shearwater independent recogniser, all of this species' calls were assigned in the annotation files as 'Little Shearwater', while the calls of the other four species were assigned as 'Other Species'. This framework was applied to all five independent recognisers, one for each of our five seabird species.

In this way, we obtained five species-specific outputs, which we then compared to detect and remove unreliable sections of the recordings prone to misidentification. We did this by creating a script named "AFSR_summarizing" that applies a set of rules to summarize the independent outputs into one final annotation text file. Whenever the five recognisers disagreed about the species identification of any segment of the sound recording, that segment was labelled as 'Unidentified'. Only the recording segments that showed consistent species identifications by all five independent recognisers were considered a valid indicator of species presence. The data accessibility information for the "AFSR_summarizing" script and the link to the MatlabHTK package are presented in the S2 Supporting Information.
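The agreement rule described above can be illustrated with a minimal Python sketch. It is not the "AFSR_summarizing" script itself, and it rests on assumptions that are not stated in the text: that the recognisers' outputs are aligned on the same segments, that each recogniser labels a segment either with its focal species name or with 'Other Species', and that "consistent" means exactly one recogniser claims the segment while all the others assign 'Other Species'. The function names and the placeholder species "Species B" and "Species C" are hypothetical.

```python
OTHER = "Other Species"
UNIDENTIFIED = "Unidentified"


def summarize_segment(labels_by_recognizer):
    """Combine the labels that all focal recognisers gave to one segment.

    labels_by_recognizer maps each focal species name to the label its
    recogniser assigned: the focal species name or OTHER (assumed format).
    """
    # Recognisers that claim the segment for their own focal species.
    claimed = [
        focal for focal, label in labels_by_recognizer.items()
        if label == focal
    ]
    # Any label outside the assumed two-label scheme is treated as unreliable.
    all_valid = all(
        label in (focal, OTHER)
        for focal, label in labels_by_recognizer.items()
    )
    # Assumed agreement rule: exactly one claim, every other recogniser
    # says OTHER. Anything else counts as disagreement.
    if all_valid and len(claimed) == 1:
        return claimed[0]
    return UNIDENTIFIED


def summarize_recording(outputs):
    """outputs maps focal species -> list of labels, one per aligned segment."""
    lengths = {len(labels) for labels in outputs.values()}
    if len(lengths) != 1:
        raise ValueError("recogniser outputs must cover the same segments")
    n_segments = lengths.pop()
    return [
        summarize_segment({focal: labels[i] for focal, labels in outputs.items()})
        for i in range(n_segments)
    ]


# Hypothetical outputs of three focal recognisers over three segments.
example_outputs = {
    "Little Shearwater": ["Little Shearwater", OTHER, "Little Shearwater"],
    "Species B": [OTHER, "Species B", "Species B"],
    "Species C": [OTHER, OTHER, OTHER],
}
summary = summarize_recording(example_outputs)
```

In this example the third segment is claimed by two recognisers at once, so under the assumed rule it is labelled 'Unidentified' rather than attributed to either species.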

We named this processing approach AFSR (Assemblage of Focal Species Recognizers): from a single data set of sound files, independent recognizers are created and run, and their results are then summarized into a single output following a specific set of rules.
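The first half of this approach, deriving one focal annotation set per species from a single labelled data set, can be sketched as follows. This is an illustration only: the tuple layout of the annotations, the function names, and the placeholder species "Species B" are assumptions, and the sketch assumes that every annotation is a species call (any non-call labels would need separate handling).

```python
OTHER = "Other Species"


def focal_annotations(annotations, focal_species):
    """Relabel one annotation set for a single focal recogniser.

    annotations is a list of (start_s, end_s, species) tuples (assumed layout).
    Calls of the focal species keep their name; all other calls become OTHER.
    """
    return [
        (start, end, species if species == focal_species else OTHER)
        for start, end, species in annotations
    ]


def build_focal_sets(annotations, species_list):
    """Return one relabelled annotation set per focal species."""
    return {sp: focal_annotations(annotations, sp) for sp in species_list}


# Hypothetical annotations shared by all recognisers.
shared = [
    (0.0, 1.2, "Little Shearwater"),
    (1.5, 2.4, "Species B"),
]
focal_sets = build_focal_sets(shared, ["Little Shearwater", "Species B"])
```

Because the sound files and call boundaries are shared, only the labels differ between the focal sets, which matches the description of identical sound files paired with different annotation text files.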

The set of rules used to summarize the five annotation text files (label format) into one is

similar to the daily pattern for calls assigned to the Unidentified category (Fig 4F).

seabirds at natural colonies. The first and last peaks of activity (Fig 4)

It is important to highlight that the activity pattern of calls assigned to the Unidentified category (Fig 4F) follows the overall acoustic activity pattern (Fig 5). This confirms our assumption that when more birds are calling, there is a higher chance of recording overlapping calls, and hence more misidentifications and false positives. These coincident patterns show that our AFSR approach was effective in categorising calls that are problematic to identify by assigning them to the Unidentified category, thereby reducing false positives.
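A daily activity pattern of the kind compared here can be derived from a summarised annotation file by counting detections per label and hour of day. The sketch below is one possible way to do this, not the procedure used in the study: the input layout (offsets in seconds from the start of the recording), the function name, and the example date and detections are all hypothetical.

```python
from collections import Counter
from datetime import datetime, timedelta


def hourly_activity(recording_start, detections):
    """Count detections per (label, hour of day).

    recording_start is the datetime at which the recording began.
    detections is a list of (offset_seconds, label) pairs (assumed layout).
    """
    counts = Counter()
    for offset_s, label in detections:
        # Convert the offset within the recording to a clock hour.
        hour = (recording_start + timedelta(seconds=offset_s)).hour
        counts[(label, hour)] += 1
    return counts


# Hypothetical recording starting at 21:30 with four labelled segments.
start = datetime(2020, 1, 1, 21, 30)
detections = [
    (0, "Little Shearwater"),
    (1800, "Unidentified"),
    (2000, "Little Shearwater"),
    (9000, "Unidentified"),
]
activity = hourly_activity(start, detections)
```

Keeping 'Unidentified' as a label in its own right makes it possible to plot its hourly counts next to the per-species counts and compare the shapes of the curves.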

Nevertheless, all the seabird species were detected by AFSR in this study, emphasising the success of this approach and its utility for multispecies monitoring.