Cleaner wrasse pass the mark test. What are the implications for consciousness and self-awareness testing in animals?

The ability to perceive and recognise a reflected mirror image as self (mirror self-recognition, MSR) is considered a hallmark of cognition across species. Although MSR has been reported in mammals and birds, it is not known to occur in any other major taxon. A factor potentially limiting the ability to test for MSR is that the established assay for MSR, the mark test, shows an interpretation bias towards animals with the dexterity (or limbs) required to touch a mark. Here, we show that the cleaner wrasse fish, Labroides dimidiatus, passes through all phases of the mark test: (i) social reactions towards the reflection, (ii) repeated idiosyncratic behaviours towards the mirror (contingency testing), and (iii) frequent observation of their reflection. When subsequently provided with a coloured tag, individuals attempt to remove the mark in the presence of a mirror but show no response towards transparent marks, or to coloured marks in the absence of a mirror. This remarkable finding presents a challenge to our interpretation of the mark test – do we accept that these behavioural responses in the mark test, which are taken as evidence of self-recognition in other species, mean that fish are self-aware? Or do we conclude that these behavioural patterns have a basis in a cognitive process other than self-recognition? If the former, what does this mean for our understanding of animal intelligence? If the latter, what does this mean for our application and interpretation of the mark test as a metric for animal cognitive abilities?

recognition in a fish. Importantly, this species allows us to ask whether the criteria that are accepted as evidence for mirror self-recognition in mammals and birds can be applied to other taxa, and if they fulfil these criteria, what it means for our interpretation of the test itself.

In applying the mirror test, transitions among three behavioural phases after initial exposure to a mirror are typically observed [1,4,5,6]; these transitions among behavioural phases are interpreted as additional evidence of self-recognition, although in themselves they do not constitute passing the mirror test [1,4]. We first tested whether the cleaner wrasse passed through all three behavioural phases upon exposure to a mirror placed in an experimental tank (Fig. 1A), and if so, we describe the phases in cleaner fish. The first phase (i) is a social reaction towards the mirror, apparently as a consequence of the reflection being perceived as an unknown conspecific. In phase (ii), animals begin to repetitively perform idiosyncratic behaviours that are rarely observed in the absence of the mirror. These behaviours are interpreted as contingency testing between their own actions and the behaviour of the reflection [e.g. 1,4]. In phase (iii), the animal begins to gaze at and examine its reflection as if it is a representation of the self, and uses the mirror to explore its own body in the absence of aggression and mirror-testing behaviour [1,4,5]. If the fish passed through these phases, we applied the mark test.
Results and discussion

Progression of behaviours in response to the mirror

Prior to starting the experiments, the focal fish swam around the tank and showed no unusual reactions to the covered mirror. Immediately after initial exposure to the mirror, 7 of 10 fish responded aggressively to their reflection, attacking it and exhibiting mouth-to-mouth fighting [45,46] (Fig. 1A,B, Supplementary Movie S1), suggesting that the focal fish viewed the reflection as a conspecific rival. The frequency of mouth fighting was highest on day 1 and decreased rapidly thereafter, with zero occurrences by day 7 (Fig. 1Ca; cf. the similar decrease in aggression seen in chimpanzees, shown in Fig. 2 of [1]) and hardly any aggression over the following month.

As mouth fighting towards the mirror reflection decreased, the incidence of unusual and atypical behaviours (e.g. 'upside-down approach' and 'dashing along mirror'; Table 1, Movies S2, S3) significantly increased and was highest on days 3-5 (Fig. 1Ca). On days 3 and 4, the estimated average frequency of these atypical behaviours among the seven individuals was 36 times per hour during the daytime. Each of these atypical behaviour types was of short duration (≤1 s), often consisting of rapid actions that occurred suddenly within 5 cm of the mirror. At the end of each movement, the fish remained near the mirror and appeared as if they were viewing their reflection (Movies S2, S3). These atypical behaviours could be loosely grouped into five types: dashing along the mirror, not against the reflection, without and with the head touching the surface (atypical behaviours a and b, respectively); dashing towards the reflection but stopping before touching it (c); and idiosyncratic postures and actions of short duration performed in front of the mirror: upside-down approach (d) and quick dance (e) (Table 1). While it is possible to interpret these behaviours as a different form of aggression or social communication, they have not been recorded in previous studies of social behaviour in this species [46] and were not part of a courtship display, as all of the subject fish were females.
These atypical behaviours were highly repeatable within an individual, with each fish performing one or two types of behaviour more than 400 times a day on average during days 3 and 4 (Table 2; Fisher's exact probability test for count data with simulated P-value based on 2,000 replicates, P = 0.0005). Crucially, these behaviours occurred only upon exposure to the mirror, and were not observed in the absence of the mirror (i.e. before mirror presentation). Almost all of the behaviours ceased by day 10 (Fig. 1Ca), and thereafter were [...]

These behaviours are different from the previously documented contingency-testing behaviours of great apes, elephants and magpies [1,4,7], but given the taxonomic distance between them, this could hardly be otherwise. While primates and elephants may perform more anthropomorphic behaviours such as changing facial expression, or moving the hands, legs or trunk in front of the mirror, wrasse and other fishes cannot perform behaviour that is so easily interpreted by a human observer. Nevertheless, behaviours such as rapid swimming and other spontaneous actions could represent alternative indices of contingency that are within the behavioural repertoire of the study species (Table 1).
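The P-value reported for Table 2 can be read against how a simulated Fisher's exact test is commonly computed. The sketch below is an illustration under stated assumptions, not the analysis used in this study: the count table is hypothetical (it is not the data in Table 2), the function names are hypothetical, and the P-value is estimated as (1 + number of simulated tables at least as extreme as the observed one) / (replicates + 1), a common Monte Carlo convention. Under that convention, 2,000 replicates cannot give an estimate below 1/2001 ≈ 0.0005, so a value of 0.0005 would correspond to no simulated table being as extreme as the observed one.

```python
import math
import random


def fisher_statistic(rows, cols, n_rows, n_cols):
    """Return -sum(log(n_ij!)), which orders tables by their probability
    under fixed row and column totals (smaller = less probable)."""
    counts = [[0] * n_cols for _ in range(n_rows)]
    for r, c in zip(rows, cols):
        counts[r][c] += 1
    return -sum(math.lgamma(n + 1) for row in counts for n in row)


def simulated_fisher_p(table, replicates=2000, seed=0):
    """Monte Carlo P-value for independence of rows and columns."""
    rng = random.Random(seed)
    # Expand the table into one (row label, column label) pair per event.
    rows, cols = [], []
    for i, row in enumerate(table):
        for j, n in enumerate(row):
            rows.extend([i] * n)
            cols.extend([j] * n)
    n_rows, n_cols = len(table), len(table[0])
    observed = fisher_statistic(rows, cols, n_rows, n_cols)
    extreme = 0
    for _ in range(replicates):
        # Shuffling column labels keeps both margins fixed.
        rng.shuffle(cols)
        if fisher_statistic(rows, cols, n_rows, n_cols) <= observed + 1e-9:
            extreme += 1
    return (1 + extreme) / (replicates + 1)


# Hypothetical counts: rows = individual fish, columns = behaviour types.
hypothetical_counts = [
    [40, 2, 0],
    [1, 35, 3],
    [0, 4, 30],
]
p_value = simulated_fisher_p(hypothetical_counts)
print(f"simulated P = {p_value:.4f}")
```

In this hypothetical table each fish concentrates on its own behaviour type, which is the kind of individual-by-behaviour association the test is meant to detect.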

In summary, the atypical movements observed in cleaner wrasse were characterised by almost every aspect of contingency-testing behaviour documented previously [1,4,5,7]: 1) atypical and idiosyncratic, 2) occurring repeatedly, 3) occurring in front of a mirror, 4) not occurring in the absence of a mirror, 5) occurring after a phase of initial social behaviour, 6) occurring over a short period of time and 7) distinct from aggressive behaviour. Fulfilment of these conditions supports the contingency-testing hypothesis. Although we reserve judgement as to whether these behaviours should be interpreted as evidence that the fish examine and perceive the reflection as a representation of self, we nevertheless conclude that these behaviours are consistent with phase (ii) of MSR as presented for other taxa.

In phase (iii), which is difficult to clearly distinguish from phase (ii), species that pass the mark test increase the amount of time spent in front of the mirror in non-aggressive postures while viewing the mirror image [1,4,5,7]. This interpretation is again rife with pitfalls, as it [...]

We observed an increase in the amount of time spent in non-aggressive postures while close to the mirror (distance of < 5 cm), peaking on day 5 after mirror presentation and remaining consistently elevated (Fig. 1Ca; 107.0 sec ± 21.2 [SD]/10 min) versus days 1-4 and the several days prior to mirror presentation (37.0 sec ± 11.5; Wilcoxon signed-rank test, T = 36, P = 0.008); this behaviour was consistent with phase (iii) of MSR. We did not observe the specific viewing behaviour that is seen in chimpanzees and elephants, e.g. trying to look at body parts, such as inside the mouth or between the legs. It is inherently difficult to distinguish such looking behaviours from other behaviours, primarily because gaze direction could not be determined in this species. Technological developments that allow eye tracking in free-swimming fish may alleviate this difficulty in future studies.
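The reported signed-rank statistic can be checked by hand under one assumption. If the comparison used n = 8 paired observations (an assumption; the sample size for this test is not stated in this passage), the largest possible signed-rank sum is 1 + 2 + ... + 8 = 36, reached only when every pair changes in the same direction. The exact two-sided P-value is then 2/2^8 ≈ 0.008, which agrees with the reported T = 36, P = 0.008. A minimal Python sketch of that exact calculation follows; the function name is hypothetical.

```python
def exact_signed_rank_p(n_pairs, t_plus):
    """Exact two-sided P-value for a Wilcoxon signed-rank sum.

    Assumes no ties and no zero differences, so the ranks are 1..n_pairs
    and every sign pattern is equally likely under the null hypothesis.
    """
    max_sum = n_pairs * (n_pairs + 1) // 2
    # ways[s] = number of sign patterns whose positive ranks sum to s
    ways = [0] * (max_sum + 1)
    ways[0] = 1
    for rank in range(1, n_pairs + 1):
        # Iterate downwards so each rank is used at most once.
        for s in range(max_sum, rank - 1, -1):
            ways[s] += ways[s - rank]
    total = 2 ** n_pairs
    upper = sum(ways[t_plus:]) / total
    lower = sum(ways[: t_plus + 1]) / total
    return min(1.0, 2 * min(upper, lower))


# Assumed sample size of 8 pairs; T = 36 is the value reported in the text.
p_value = exact_signed_rank_p(n_pairs=8, t_plus=36)
print(f"T = 36, n = 8: exact two-sided P = {p_value:.4f}")
```

With a different number of pairs the same T would give a different P-value, so the agreement only holds under the assumed n = 8.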

Similar to other studies, not all individuals we tested passed through each phase of the test. After the initial presentation of the mirror, three fish (#4, #5, #6) showed low levels of aggression and rarely performed atypical behaviours during period E1 (Fig. 1Cb). Instead, these three individuals spent relatively longer periods in front of the mirror, as is typically observed during phase (iii) in other focal fish (Fig. 1Cb). Applying the same criteria as used in other instances of the test, we conclude these fish failed the test. However, an alternative explanation is that these fish had already passed through the initial phases; at the start of the experiment, the glass wall on the opposite side of the mirror in the tanks of these three fish was slightly reflective due to differences in lighting in the room, and the focal fish were observed to occasionally remain in front of the glass wall. These observations suggest that these three fish may have already passed through phases (i) and (ii) during the acclimation phase before the start of the experiment. As discussed below, these three fish exhibited good responses to the mark test.

Species with MSR distinguish their own reflection from real animals viewed behind glass [e.g. 29]. When we exposed naïve cleaner wrasse to conspecifics behind glass, we observed responses fundamentally different from those towards their mirror image. The frequency of aggressive behaviour towards real fish was generally low, yet did not diminish appreciably during the 2-week testing period (Fig. 1D). Time spent within 5 cm of the glass in the presence of conspecifics was also higher than that in the presence of the mirror. Importantly, no atypical or idiosyncratic behaviour (i.e. contingency-testing) was exhibited towards conspecifics.

[...] amount of coloured gel using a fine needle, a procedure that has been repeatedly shown not to affect fish behaviour [51-54, NMTI] and is widely used in fish behavioural studies.

Fish were marked at night while under anaesthesia, and they swam normally early the next morning under a no-mirror condition (Kohda, pers. obs.). After the initial settlement period "E1" (i.e. the initial 2 weeks of phases i-iii), we evaluated behaviour during periods "E2" (no mark), "E3" (injection with invisible sham mark), "E4" (injection with coloured mark with no mirror present) and "E5" (coloured mark with mirror present) during a further 2-week period.

Posturing behaviours against the marked sites during periods E2 and E3 were infrequent and were not different between the two periods (Fig. 2Ca), a pattern driven by all fish except fish [...] marking procedure itself had minimal effect on fish behaviour. In contrast, time spent posturing while viewing the marked sites was significantly higher in the coloured- (E5) versus no- (E2) and sham-marked (E3) periods (Fig. 2Ca), as predicted. This pattern held for all individuals except fish #2, regardless of the sites marked (Table 3). Note that no comparisons to E4 can be made with respect to observations of reflections, as no mirror was present during that period. Moreover, the time spent in postures reflecting the two remaining unmarked sites (e.g. right side of head and throat, for a fish marked on the left side of head) for each fish was not different among periods (Fig. 2Cb). Taken together, these findings demonstrate that cleaner wrasse spend significantly longer in postures that would allow them to observe colour-marked sites in the mirror reflection. These reactions also demonstrate that tactile stimuli alone are insufficient to elicit these behaviours, as they were only observed in the colour mark/mirror condition. Rather, direct visual cues, or a combination of visual and tactile stimuli, are essential for posturing responses in the mirror test. In previous studies on dolphins, similar patterns of activity were considered to constitute self-directed behaviour [5].
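The logic of the control periods can be summarised as a small design table. The sketch below is a reading of the text rather than the study protocol: the class and field names are hypothetical, and mirror presence in E2 and E3 is inferred from the statement that only E4 lacked a mirror. It encodes the prediction that mark-directed posturing should increase only when a visible mark and a mirror coincide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Period:
    name: str
    mark: str     # "none", "sham" (transparent) or "coloured"
    mirror: bool  # whether the mirror was present


PERIODS = [
    Period("E2", "none", True),       # mirror presence inferred from the text
    Period("E3", "sham", True),       # mirror presence inferred from the text
    Period("E4", "coloured", False),
    Period("E5", "coloured", True),
]


def predicts_mark_directed_behaviour(period):
    """Prediction if the fish respond to a mark they can see in the mirror."""
    return period.mark == "coloured" and period.mirror


for period in PERIODS:
    print(period.name, predicts_mark_directed_behaviour(period))
```

Under a purely tactile account, E4 and E5 would be expected to look alike, because the coloured mark is the same in both periods; the design separates the two accounts by changing only the visibility of the mirror.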
2) Scraping of the colour-marked throat after viewing it in the mirror

Although they cannot touch their own bodies directly, many species of fish scrape their bodies on a substrate to remove irritants and/or ectoparasites from the skin surface [48,49]. When we marked fish with brown-pigmented elastomer on the lateral body surfaces in locations that could be viewed directly, the fish increased scraping behaviour of the mark sites, indicating they regard the colour dots as ectoparasites to be removed (Supplementary Figure S1). Similar scraping of colour-marked areas during the mark test is interpreted as an indicator of self-directed behaviour for some mammal species that do not have hands [29,50]. Accordingly, we hypothesised that the cleaner wrasse would scrape their bodies in an attempt to remove coloured marks from body parts not directly visible after observing them in the mirror (and crucially, that they did not scrape invisible sham marks, nor coloured [...] that is accepted to be functionally equivalent to a similar behaviour in mammals (in this case scraping), and that behaviour is accepted as being self-directed in those mammals [29,50], then it raises the question whether this behaviour may be similarly considered self-directed in the fish. If this position is accepted, then any scraping behaviour of coloured marks in the mirror condition would constitute compelling evidence that fish use mark-directed behaviour to remove visually perceived coloured marks from their bodies. By extension and comparison to similar mirror test studies, this would raise the question of whether fish are therefore aware that the mirror reflection is a representation of their own body.
Like many natural behaviours, some scraping of the body flanks was observed outside the mirror condition in our studies. This body scraping behaviour was also difficult to distinguish from head scraping. For these reasons we took throat scraping, and not face scraping, as the only evidence of a putative self-directed behaviour, as it was never observed outside period E5 in any of the subject fish. It is also important to note that fish marked on the head laterally scraped the body flank/facial region, but never the throat region, during period E5, providing further evidence that marking itself does not induce throat scraping. Three of the four throat-marked fish frequently scraped their throats against the substrate after being exposed to the mirror during period E5 (Fig. 3A in fish #21). These three fish attempted to scrape their throats but this was occasionally executed awkwardly, probably because they were not accustomed to performing this behaviour. As the marks were identical in periods E4 and E5, with the only change being the visibility of the mirror, the difference in throat scraping provides further strong evidence that the colour injection itself did not cause direct physical stimulation that would lead to the observed behaviours (e.g. itching or pain).
These results accord with the increased amount of time spent in postures indicating observation of the coloured marks in the reflection only during period E5 (Fig. 2Ca). The motivation for scraping the mark is potentially to remove a perceived ectoparasite, which these wild-caught fish would have experienced previously. In all cases (n = 37), the scraping behaviours followed soon after the fish had assumed a posture that reflected the throat mark, with an average latency between observation in the mirror and scraping of the substrate of [...]

The majority of the throat scraping behaviour was immediately followed by another frontal- [...] interpretation. Indeed, this type of behaviour is similar to that of chimpanzees, which examine and smell their fingers after touching a paint mark [1,8], and which is considered intentional self-directed behaviour in that species.

Three of the four throat-marked individuals in this study passed the mark test, a success ratio comparable to other species tested previously: one of three Asian elephants passed the test [4], as did two of five magpies [6]. Fish #20 in our study was throat-marked but did not perform throat scraping (Fig. 3B). However, this fish exhibited intensive contingency-testing behaviours (a) and (b) during period E1, prior to colour-marking, similar to the other fish (Table 2), and assumed postures (self-directed behaviour) that reflected the throat more frequently during E5 after colour marking (Table 3). According to the mark test criteria used for dolphins [5], these results suggest that this wrasse recognised the reflection as self, but "fell at the last hurdle". Nevertheless, given the controversial nature of the mark test in non-primates, and questions over the interpretation of these behaviours [8], we do not take this result as conclusive evidence for MSR in this individual. We do point out, however, that by the same criteria used for e.g. dolphins, we would conclude that all four throat-marked fish recognised themselves in the mirror.

In this study we applied the mark test, a controversial assessment of animal cognition [8], to a fish, a taxonomic group often considered to have lower cognitive abilities than other vertebrate taxa. We find compelling evidence that cleaner wrasse pass through all stages of the mark test, ultimately attempting to remove the mark when it can be viewed in the mirror (Figure 3). We further find the parsimonious conclusion to be that the behaviours displayed by this fish are equivalent to behaviours taken as evidence for self-recognition in other taxa (contingency testing, self-directed behaviour, observation and exploration of the body in a reflected image, and removal attempts; Figures 1,2,3). We consider these behavioural responses to be a consequence of the particular feeding ecology, generally high cognitive capacity, and problem-solving skills of the cleaner wrasse [14-16,37,38]. This is the first report of successful passing of the mark test in vertebrates outside of mammals and birds, suggesting that if mirror tests are applied considering the cognitive capacities and ecology of focal species outside of primates, they too may pass the test. Our study further supports previous theories postulating that recognition and cognitive capacities are more closely related to social and behavioural ecology than relative brain size or phylogenetic proximity to humans [14,16,32].

The results we present here will by their nature lead to controversy and dispute, and we welcome this discussion. We consider three possible interpretations of our results and their significance for understanding the mark test: i) the behaviours we document are not self-directed and so the cleaner wrasse does not pass the mark test, ii) cleaner wrasse pass the mark test and are therefore self-aware, or iii) cleaner wrasse do pass the mark test but this does not mean they are self-aware. If one takes position i), rejecting the interpretation that these behaviours are self-directed, it is necessary to demonstrate grounds for this rejection. As noted above, touching or scraping behaviour is taken as evidence of a self-directed behaviour in mammals, and so if these behaviours are not similarly considered self-directed in fish, the question must be asked why. For a test to be applicable across species, an objective standard is required. Without such a standard, behaviours assessed in the mark test can be differently assessed depending on the taxon being investigated. This introduces an impossible, and unscientific, standard for comparison, and we must therefore either reject this conclusion or reject the validity of the mark test entirely.

We therefore consider the most parsimonious conclusion to be that the behaviours we observe here in cleaner wrasse are equivalent to those in other taxa during the mirror test. This, together with the original interpretation of the mark test by its inventor Gallup, who suggested that species that pass the mark test are self-conscious and have a true theory of mind [1,57], would lead us to take position (ii), that cleaner wrasse are self-aware. However, we are more reserved about the interpretation of these behaviours during the mark test with respect to self-awareness in animals. We do not consider that the successful behavioural responses to all phases of the mark test should be taken as evidence of self-awareness in the cleaner wrasse, but rather that these fish come to understand that the mirror reflection represents their own body. From the behaviour we observe, we consider the interpretation that makes fewest assumptions to be that these fish undergo a process of self-referencing, whereby the fish use the mirror to see their own body, but without this involving theory of mind or self-awareness [32]. This interpretation is supported by a supplementary experiment (Supplementary Figure S1) that showed fish marked on the body in places they could directly see also performed scraping on those regions.

If we therefore accept position (iii), that cleaner wrasse show behavioural responses that fulfil the criteria of the mark test, but that this result does not mean they are self-aware, a question naturally arises. Can passing the mark test be taken as evidence of self-awareness in one taxon but not another? A position that holds the same results can be interpreted in different ways depending on where they are gathered is logically untenable, and so must be rejected.

[...] Fig. 1A) and each fish was kept for at least 1 month prior to beginning the experiments to ensure acclimation to captivity and the testing conditions, and that they were eating and behaving normally. Fish were 51-68 mm in length; this is smaller than the minimum male size, thus strongly suggesting that these individuals [...]

[...] of E4. After confirming that all marks were of the same size (1 × 2 mm²), the fish were returned to the tank. Given the location of the tags relative to the field of view of cleaner wrasse, direct observation of the marks on the head was unlikely, and was definitely impossible for throat marks. To standardise the testing procedure, the brown-coloured mark was injected at the throat near the transparent marked site. Even with both marks applied, the total volume of the tag was lower than the minimum recommended amount, even for [...]

Behavioural analyses. Videos of the fish behaviours were used for all behavioural analyses. Fish performed mouth-to-mouth fighting frequently during period E1, and the duration of this behaviour was recorded (Fig. 1B, Movie S1). Unusual behaviours performed in front of the mirror, which have never been observed before in a mirror presentation task, nor in the presence of a conspecific, were often observed during the first week of E1, and the type and frequency of these behaviours were recorded. [...]

Scraping behaviour, including the location on the body that was scraped, was observed during periods E2-E5 in the eight subject fish. During period E5, when the fish were colour-marked and exposed to the mirror, individuals often displayed the marked site to the mirror immediately prior to and following a scraping behaviour. Therefore, we also recorded the time interval between displaying and scraping during E5.

Responses towards real fish. A potential alternative explanation of behaviour in mark tests (and one that is rarely tested for in other vertebrates) is that the focal individual perceives their reflection not as the self, but rather as another individual behind a glass divide. Although many behaviours seen in the mark test suggest that this is not the case (e.g. contingency-testing, body exploration), and a growing body of evidence shows that fish perceive mirror reflections in a fundamentally different way to conspecifics behind glass [59,60], we directly controlled for this possibility by comparing the behaviour of fish confronted with a reflection to that when another individual was across a glass divide.

We tested the responses of eight fish (55-59 mm in size) in size-matched pairs. Two fish were introduced into a tank (45 × 30 × 26 cm³). After the fish became acclimated, the cover was removed from the divider to allow them to see one another. We then recorded behavioural responses in the same manner as described for period E1 in the mirror test, for 2 weeks. The results of these observations are presented in Fig. 1D. After 3 weeks, we marked these fish on the throat and recorded whether they scraped their throat regions; however, we did not observe any throat-scraping behaviour, although they must have observed the 'parasite' on the throat of the conspecifics. This indirectly supports the view that the fish were attempting to remove the mark from their own bodies when presented with the mirror during period E5 of the actual mark test.

[...] were analysed separately (Fig. 2Ca, b). Individual-level statistics on postures that reflected the marked sites are shown in [...]

Visual and tactile stimuli by the colour mark. We further considered whether the elastomer tag could provide a tactile stimulus that, when paired with visual information, may lead to individuals passing the mark test [8,55]. We can effectively rule out that tactile stimulation alone was sufficient to induce a self-directed behaviour [...]

Movie S5. Fish #1 tried to scrape a throat mark on the sandy bottom immediately after viewing the mark in the mirror, but did not look at its throat in the mirror after scraping. However, the fish failed to scrape its throat on the sandy bottom, although the sand moved as the fish shook its head. The fish may not have checked its throat in the mirror, possibly because it had not been scraped.

Movie S6. The fish rapidly approached the mirror after scraping its throat on the sandy substrate, stopped at a distance of about 1 cm from the mirror, and remained stationary for 1 s; during this time the fish assumed a position that reflected the scraped throat in the mirror.

Table 1: Description of five types of atypical (mirror- or contingency-testing) behaviours frequently observed during days 3-5 after presentation of the mirror.

(a) Dashing along the mirror: rapid dashing along the mirror surface in a single direction for 10-30 cm. Fish did not swim directly against or make contact with their mirror reflection.

(b) Dashing along the mirror with the head in contact with the mirror: the head of the fish was always in contact with the mirror during dashing.

(c) Dashing and stopping: fish rapidly dashed towards the mirror reflection but stopped before contact with the mirror.

[...]

Table 2. Number of atypical (mirror-testing) behaviours shown by seven fish during the 20-min observation period in the first 5 days after presenting the mirror. See Table 1 [...]

* Time in E2 > time in E3 in fish #1, but E2 < E3 in fish #7.