Face recognition and memory in congenital amusia

Congenital amusia, commonly known as tone deafness, is a lifelong impairment of music perception and production. It remains a matter of debate whether the impairments in the musical domain observed in congenital amusia are paralleled in non-musical perceptual abilities. Using behavioral measures in two experiments, the current study explored face perception and memory in congenital amusics. Both congenital amusics and matched controls performed a face perception task (Experiment 1) and an old/novel object memory task (for both faces and houses, Experiment 2). The results showed that the congenital amusic group had significantly slower reaction times than the matched control group when judging whether two faces presented together were the same or different. For different face-pairs, the deficit was greater for upright faces than for inverted faces. In the object memory task, the congenital amusic group also showed worse memory performance than the control group. The results of the present study suggest that the impairment attributed to congenital amusia is not limited to music, but also extends to the domains of visual perception and visual memory.


Introduction
Congenital amusia is characterized as a lifelong musical impairment in pitch discrimination and recognition [1], or in beat synchronization [2]. Congenital amusia is estimated to affect about 4% of the general population in the United Kingdom [3] and 3.4% of the population in China [4]. The most recent report puts the prevalence of congenital amusia at 1.5% [5]. It is generally believed that the musical impairment that characterizes congenital amusia cannot be explained by prior brain damage, hearing impairment, level of education, lack of normal musical exposure or general cognitive deficits [6,7]. Previous studies showed that people with amusia have impaired recognition of fine-grained pitch changes in melodies, such as when another person sings out of tune [8]; they also have deficiencies in experiencing emotional prosody [9], although some of them are still sensitive to musical emotion [10]. Amusics are also impaired in musical memory, singing ability, and tapping along to a beat [11].
An important question underlying congenital amusia is whether the observed impairment is restricted to musical domains. Studies have found that congenital amusics are often impaired in the recognition of fine-grained pitch changes but not in the processing of other non-musical auditory stimuli [6,12,13], such as human voices or environmental sounds [14]. Other studies, however, have put forward the possibility that congenital amusics' deficits may extend to nonverbal sound attributes like timbre [15,16]. Amusics can also show deficits in the processing and acquisition of tonal and atonal language in early life [4]. This suggests that individuals with congenital amusia also experience pitch deficits with language stimuli [4][16][17][18][19][20][21]. Thus, music processing and language processing might share similar cognitive mechanisms [9,22,23].
Associations between pitch processing and visuospatial processing are controversial. Douglas and Bilkey (2007) examined spatial abilities in congenital amusics and found that deficiencies went beyond the musical domain [24]. Specifically, they demonstrated that people with congenital amusia performed worse than controls on a mental rotation task. Along the same lines, our recent behavioral studies showed that, compared with controls, people with congenital amusia perform worse on spatial representation and mental rotation tasks [25]. Similar studies also suggested that pitch processing may rely on cognitive mechanisms that are similar to those used for spatial processing [26,27], although other researchers failed to find an association between congenital amusia and deficits in spatial processing [28,29]. To date, investigations into whether congenital amusics have other kinds of non-musical deficiencies have been limited to the examination of language ability and spatial ability.
Recently, several studies have explored the link between amusia and face recognition deficits [30][31]. For example, Paquette et al. (2018) found that some tone-deaf individuals also had a deficit in face recognition [30]. Another study found that 25% of individuals with developmental prosopagnosia were impaired in fine pitch discrimination [31]. About 30% of individuals with congenital amusia were diagnosed with dyslexia [32]. These studies showed that a deficit might not be limited to one modality.
It seems reasonable to speculate that congenital amusics could have a deficiency in face processing, given the similarities in the processes underlying face and melodic processing. An infant study by Trehub (2001) suggested that melodies are processed in a holistic manner [33]. For instance, dividing well-known melodies into small segments before scrambling them impaired listeners' recognition performance significantly. This showed that the systematic arrangement of melodic elements is essential for recognition [34]. In particular, a previous study demonstrated that meter might be a complex acoustic-holistic process which requires processing of sound intensity and rhythmical periodicity, while rhythm processing requires combining relationships between a series of durations into a whole [35]. With respect to face processing, a considerable number of studies have suggested that holistic grouping is also essential [36][37][38][39][40][41]. It therefore seems that both faces and melodies are processed holistically. Thus, musical processing may very well share similar cognitive mechanisms with face processing.
Studies of the neural mechanisms of congenital amusia have shown that amusia may be a disorder of connectivity between an intact auditory perceptual system and the frontal cortex [42]. Stewart (2011) proposed a simplified model of melodic processing which has three processing stages, meaning that the impairments could take place at any or all of these processing stages [43]. Notably, this problem in connectivity shares similarity with the neural mechanism of congenital prosopagnosia (failure to develop normal face recognition) [44]. Both deficits represent altered connectivity between an intact core perceptual system and the frontal cortices [45]. Both music and face processing are complex cognitive tasks that might engage distributed networks, linking together distant cortical regions. When the connectivity fails, selective cognitive impairments arise. Since music and face processing share similar neural mechanisms, it should be worthwhile to examine whether people with congenital amusia have a deficit in face perception.
In addition to perception, memory deficiencies have also been demonstrated in amusics. Findings of several studies on deficiencies in congenital amusics suggest that the brain may possess pitch-specific memory [46][47][48] along with timbre memory [15]. The memory tasks in these studies all used auditory stimuli and required participants to complete a recognition task by judging whether a pair of stimuli were identical or different based on certain auditory characteristics (pitch, tone, timbre, or verbal material). As with the perceptual deficit, it is important to determine whether the potential memory deficits identified in amusics represent general memory deficits independent of the modality of the sensory input. To our knowledge, little research has been undertaken to explore whether this disorder is also accompanied by memory impairments outside of the music domain.
The current study investigated whether people with congenital amusia exhibit deficits in face perception and face memory compared with matched controls. Such a study potentially provides new evidence as to whether congenital amusia is a domain-specific deficit. Two experiments were performed to test for performance differences in visual stimulus processing between congenital amusics and matched controls. In Experiment 1, both congenital amusic participants and matched control participants were required to make a "same" or "different" judgment in a face perception task before they completed the Montreal Battery of Evaluation of Amusia (MBEA). Both upright faces and inverted faces were used as stimuli. The inverted face stimuli served as control stimuli which contained the same low-level visual components. Discrimination of upright faces would require expertise in face processing, whereas discrimination of inverted faces would reflect the processing of low-level visual information. In Experiment 2, a new group of amusic participants and matched controls were required to complete a recognition task (indicating old/new) for newly learned materials (pictures of faces and pictures of houses) after they completed the first three subscales of the MBEA. House pictures served as control stimuli in order to assess whether the deficit (if any) would be specific to face stimuli.

Materials and methods
Participants. Fifty-four potentially congenital amusic participants, who self-reported that they feared singing in public, sang out of tune, and had difficulty keeping rhythm when singing or dancing, were recruited from Zhanjiang Normal University. All participants were assessed individually via the Montreal Battery of Evaluation of Amusia (MBEA) [49]. Individuals with scores of less than 65 on the first three subtests (i.e. scale, contour, and interval subscales) were considered true congenital amusics [19]. Using these criteria, 14 congenital amusic participants were identified from the sample of 54 self-reported amusics, forming a sample of 1 male and 13 females with an average age of 19.9 years. Meanwhile, another 14 participants who self-reported that they were good at singing were recruited to participate in this experiment and served as the control group. Control participants (1 male and 13 females with an average age of 19.7 years) were matched to the amusic group according to age (t(26) = 0.784, p > 0.05), sex (each group had 1 male and 13 females), and years of education (all participants were undergraduate students). All participants had normal or corrected-to-normal vision and self-reported right-hand dominance. This study was approved by the IRB of the School of Teacher Education at Huzhou Normal University. All participants gave written informed consent and were paid 20 Yuan for their participation.
Stimuli. The Montreal Battery of Evaluation of Amusia and the face perception task were compiled using E-Prime 1.1 (Psychology Software Tools, Pittsburgh, Pennsylvania, USA). For the face perception task, 30 neutral greyscale faces of Chinese men were selected from a standard Chinese face database developed by Bai, Ma, Huang, & Luo (2005) [50]. The stimuli were displayed on a 17-inch SVGA monitor with a resolution of 1024 × 768 pixels, and each face was displayed at 160 × 240 pixels, resulting in a visual angle of 2.05° × 3.08° at a viewing distance of 75 cm. Faces were presented simultaneously in pairs, side by side, and the participants were required to make a "same" or "different" judgment. The face perception task followed a 2 (Group: amusic, control) × 2 (Match: matched pair, non-matched pair) × 2 (Orientation: upright face, inverted face) design. Both self-reported amusic participants and matched controls completed the face perception task before completing the MBEA (30 items in each subscale); both accuracy and reaction time were recorded.
Procedure. In the face perception task, each trial began with a black central fixation point, a "+" symbol shown for 1 second. Following this, two faces were presented on the screen at the same time, one on the left and one on the right. The faces remained on the screen until the participant made a "same" or "different" response, as quickly and as accurately as possible, by pressing the "1" key (with the right-hand index finger) when the two faces were identical and the "2" key (with the right-hand middle finger) when the two faces were different. After that, the text "Waiting for the Next Trial" was presented for 1 second. The reaction time and the error rate were recorded. For each participant, the practice block consisted of 20 face pairs and the formal experiment consisted of 80 face pairs. Both the amusic and control groups completed the subtests of the MBEA that measured scale, contour, interval, rhythm, meter, and memory.

Results
Trials with RTs larger than two standard deviations from each group's mean or less than 200 ms (4.55% of trials for the congenital amusic group, 5.17% of trials for matched controls) were excluded from the analysis. Only the mean RTs for correct response trials were analyzed. There was no significant correlation between RT and error rate (r = 0.012, p = 0.861), suggesting that there was no speed/accuracy trade-off.
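The trimming rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis script: the function name `trim_rts` and its parameters are hypothetical, and it assumes that "larger than two standard deviations from each group's mean" means above the group mean plus two sample standard deviations, computed over that group's pooled trials.

```python
from statistics import mean, stdev


def trim_rts(rts_ms, floor_ms=200.0, n_sd=2.0):
    """Apply an RT trimming rule to one group's pooled trials.

    Assumption (not confirmed by the source): the upper cutoff is the
    group mean plus n_sd sample standard deviations. RTs below floor_ms
    are treated as anticipatory responses and dropped as well.
    Returns the retained RTs and the proportion of trials excluded.
    """
    upper = mean(rts_ms) + n_sd * stdev(rts_ms)
    kept = [rt for rt in rts_ms if floor_ms <= rt <= upper]
    excluded = 1 - len(kept) / len(rts_ms)
    return kept, excluded


# Hypothetical RTs in milliseconds, invented for illustration only.
example_rts = [150, 400, 420, 450, 480, 500, 520, 2000]
kept_rts, prop_excluded = trim_rts(example_rts)
```

In this made-up example the 150 ms trial falls below the floor and the 2000 ms trial exceeds the upper cutoff, so two of the eight trials are excluded. The function would be applied separately to each group, since the cutoff is defined relative to each group's own mean.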
Accuracy scores for each subtest of the MBEA (scale, contour, interval, rhythm, meter, memory) were significantly lower for the amusic group than for the control group (see Table 1).
The pattern of results for RT can be found in Fig 1. RTs were analyzed using a 2 × 2 × 2 mixed-effects ANOVA that was conducted with Group (amusics, normal) × Match

Discussion
In Experiment 1, the error rate did not show a significant difference between the amusic and control groups, but this was most likely due to very low error rates found in both groups. More interesting results came from reaction times which showed that, for face perception, people with congenital amusia were slower compared to the matched controls. One reason for the slower face perception in amusics could be that the impairment extends to non-musical stimuli.
A more interesting possibility is that this deficit represents a specific deficit for complex and over-learned stimuli like faces. Human faces share some basic features, being composed of eyes, noses, and mouths, and their perception has been found to depend mainly on holistic information or configurational representation [38][39][40][41]. Music is an acoustic signal that requires people to segment, interpret and anticipate elements that unfold over time. Due to the nature of the information processing, face and music processing might share a similar mechanism for holistic processing. Therefore, it is reasonable to speculate that the amusic group would show some deficit in face processing. A recent study explored the coexistence of tone-deafness and prosopagnosia and supported this speculation [30]. If slower perception in the amusic group represents a deficit in holistic processing, the deficit should be greater for upright faces than for inverted faces. When processing upright faces, normal controls might use holistic information while amusic participants might use feature-based processing. Holistic processing is believed to be faster than processing feature-based information [36]. For inverted faces, however, it has been suggested that processing is achieved through feature consolidation [51]. Therefore, for inverted faces, one would not expect much difference between the amusic and control groups. Our data showed a clear trend toward a discrepancy in group differences for upright vs inverted faces, but the group difference was more pronounced for non-matched faces (see right panel of Fig 1). For non-matched faces, our results showed a greater difference between the amusic and control groups for upright than for inverted faces. The interaction between group and orientation was not statistically significant when we ran an ANOVA with all the data included. However, when we conducted a separate ANOVA for the non-matched face condition only (corresponding to the data in the right panel of Fig 1), the interaction between group and orientation was significant (F(1,26) = 4.86, p < 0.05). Non-matched faces might offer a more sensitive test, as the reaction time in that condition might contain less noise because participants could produce a response as soon as they identified any feature difference. For matched faces, however, participants had to compare all of the features between two faces before they could reach a decision.
Although for non-matching faces, there was a group difference between upright and inverted faces, suggesting that the deficit in amusia was in the holistic processing of faces, no such pattern was seen in matched faces, in which the amusic group showed a deficit regardless of the face orientation. This suggests a general perceptual deficit in amusics. The fact that a comparable magnitude of deficit was seen in both upright and inverted faces suggests that this was not due to any lack of statistical power in the experiment. Rather, there was a robust group difference regardless of the orientation.
Given the deficit in amusia at the perceptual level, it would be necessary to examine other possible deficiencies in non-music domains. A second question related to the generalization of congenital amusics' deficits is whether the deficits extend to face memory. Experiment 2 explored whether a difference between congenital amusics and normal controls exists in face memory.

Materials and methods
Participants. Another 14 new amusics (12 females, 2 males, ages 18-22, with an average age of 20.1) and 14 new matched controls (12 females, 2 males, ages 18-22, with an average age of 19.5) took part in Experiment 2. All participants had normal or corrected-to-normal vision. Both amusics and controls were undergraduates without formal musical training and none of them had any previous neurological or psychiatric history. This study was approved by the IRB of the School of Teacher Education at Huzhou Normal University. All participants self-reported right-hand dominance. Informed written consent was obtained from all participants, and each participant received 20 Yuan RMB for his or her participation. All participants were assessed with the first three subtests of the MBEA [50]. Participants whose scores were lower than 65 for the first three subtests (scale, contour, and interval) were considered congenital amusics [22], and the amusic participants scored significantly below the matched control participants on these three subtests (t(26) = -11.40, p < 0.001; see Table 2).
Stimuli. The experiment comprised a learning phase and a subsequent test phase. For the learning phase, 20 faces (10 male and 10 female) were selected from the standard Chinese faces database [50], and 20 houses (10 houses and 10 apartments) were selected from the Park Aging Mind Laboratory (http://agingmind.utdallas.edu/other-stimulus/). In the test phase, in addition to these 20 old faces and 20 old houses, another 20 novel faces and 20 novel houses were selected. All the stimuli were displayed on a 19-inch SVGA monitor with a resolution of 1024 × 768 pixels at a viewing distance of 75 cm. Each face was 260 × 300 pixels (visual angle 4.07° × 3.51°) and each house was 320 × 240 pixels (visual angle 5.00° × 2.81°).
Procedure. In the learning phase, the participants were presented with 20 faces and 20 houses in randomized order. Before the learning phase began, participants were informed that they would be tested on their memory of the studied faces and houses following the learning phase. Each trial began with a black central fixation "+" symbol, shown for 1 s, followed by a picture (a face or a house) displayed for 2 s. After a 3-minute break following the learning phase, participants performed an old/new judgement. Participants made a response by pressing the "1" key (with the dominant right-hand index finger) if the picture had been previously presented in the learning phase (the "old" condition) or pressing the "2" key (with the dominant right-hand middle finger) if the picture was novel (the "new" condition). The experiment displayed 40 pictures in the learning phase and 80 pictures (40 old and 40 new) in the test phase. All the participants were instructed to respond as accurately as possible.

Table 2. Scores of the amusic and control groups on the Montreal Battery of Evaluation of Amusia in Experiment 2. (Columns: Group; Accuracy, mean (SD).)

Results
The "old" and "new" responses were collected and combined into the sensitivity index d-prime (hit rates were the correct answers to the novel responses; false alarm rates were the wrong answers to the old responses). Hit and false alarm rates of 0 or 1 were adjusted to eliminate infinite z values. Larger d-prime values indicate greater discriminability. A two-way mixed-model repeated-measures ANOVA of d-prime with group (congenital amusics or controls) as a between-subjects factor and object type (face or house) as a within-subjects factor indicated a main effect of group (F(1,26) = 10.13, p = 0.004, η² = 0.28), with lower d-prime for the congenital amusic group (M = 1.02, SD = 0.13) compared with controls (M = 1.60, SD = 0.13) (see Fig 2), as well as a main effect of object type (F(1,26) = 8.69, p = 0.007, η² = 0.25), with lower d-prime for houses (M = 1.13, SD = 0.13) compared to faces (M = 1.48, SD = 0.09) (see Fig 2).
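For readers unfamiliar with the sensitivity index, d-prime is the difference between the z-transformed hit rate and the z-transformed false alarm rate. The sketch below illustrates the computation under stated assumptions: the function name `d_prime` is hypothetical, and because the text does not say which adjustment was applied to rates of 0 or 1, the common 1/(2N) correction is assumed here purely for illustration. The function takes hit and false-alarm counts as given, so whichever response class is treated as the signal follows the definition in the text.

```python
from statistics import NormalDist


def d_prime(hits, n_signal, false_alarms, n_noise):
    """Compute d' = z(hit rate) - z(false alarm rate).

    Assumption (the source does not name its correction): a rate of 0
    is replaced by 1/(2N) and a rate of 1 by 1 - 1/(2N), where N is the
    number of trials of that type, so that the z-transform stays finite.
    """
    def adjusted_rate(count, n):
        rate = count / n
        lower, upper = 1 / (2 * n), 1 - 1 / (2 * n)
        # With integer counts, only rates of exactly 0 or 1 are changed.
        return min(max(rate, lower), upper)

    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(adjusted_rate(hits, n_signal)) - z(adjusted_rate(false_alarms, n_noise))


# Hypothetical counts for one participant and one object type
# (20 signal trials and 20 noise trials), invented for illustration.
example_d = d_prime(hits=15, n_signal=20, false_alarms=5, n_noise=20)
```

A participant who responds at chance (equal hit and false alarm rates) obtains a d-prime of zero, and the correction keeps the index finite even for a participant with a perfect record.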

Discussion
The present experiment investigated face memory in a congenital amusic group and a matched control group. The results showed that object discriminability (for both faces and houses) was lower among the amusic participants than among the matched controls. This supports the hypothesis that the deficiencies of congenital amusics extend to non-musical domains, including visual memory for faces and houses.
The results also demonstrated that, for both congenital amusics and matched controls, the performance for recognizing faces was better than that for recognizing houses, suggesting better memory for faces than houses. Prior reports have provided evidence that faces are distinct from other objects and that configurational information is critical in recognizing faces [51,52]. Additionally, face perception is a fundamental skill for daily communication in human society, and most people are remarkably proficient when recognizing faces [53][54][55].

General discussion
For both upright and inverted faces in matched pairs, reaction times for the amusic group were significantly slower than those for the matched controls. In addition, people with congenital amusia had lower performance than the control group for both face and house memory. These findings showed that the amusic participants' deficiencies were not confined to their musical ability (accuracy and reaction times on the MBEA), but also extended to the visual domain (faces and houses). It is therefore possible that the impairment of congenital amusia is exhibited not only in the musical domain but also in the non-musical domains of face and object processing.
Furthermore, the findings from non-matched faces in Experiment 1 are in line with the possibility that music processing and face processing might share a common neural mechanism. Similarly, voice, which has been called the "auditory face" also has an inversion effect in an inverted audio paradigm, where participants have a more difficult time recognizing speech whose frequency or verbal dialogue was inverted. These findings are analogous to those of the classical face inversion effect, where individuals have a more difficult time processing and recognizing inverted over upright faces [56].
While previous studies showed that the impairments of congenital amusia were limited to the music domain [6,12,13,43], our results were more in line with the hypothesis of cross-domain deficits. Such shared deficits across several neurodevelopmental disorders have been reported. Several tone-deaf individuals also had a disorder of face recognition [30], and 25% of individuals with developmental prosopagnosia were impaired in fine pitch discrimination [31]. These studies showed that a deficit in one modality was not necessarily limited to that modality. Stewart (2011) proposed a model of melodic processing with four processing stages, each related to different cortical areas, such as the primary auditory cortex, the secondary auditory cortex, and the inferior frontal gyrus. The inferior frontal gyrus in particular might be heavily involved in processing information from other modalities [43]. Work on the neurobiology of amusia showed that the connectivity between the right inferior frontal gyrus and the superior temporal gyrus is reduced [57].
There is evidence that people who listen to music activate a complex bilateral network of temporal, frontal, parietal, and subcortical brain areas, with the activation of the temporal lobes being highly involved in the perception of music [58][59][60]. The neuropsychological and neuroimaging literature has highlighted the fact that face processing produces additional activation of the temporal regions [61,62], while studies of brain injuries have provided information about cases of impaired face processing, usually following bilateral occipito-temporal damage [38]. These findings suggest the possibility of an overlapping neural mechanism for face and music processing that is associated with activity in the temporal regions.
Another study also found that object recognition tasks elicited activation primarily in the left occipito-temporal region [63]. However, for processing of man-made objects, such as tools, activation was also found in the left posterior middle temporal cortex [61]. Neuroimaging studies have also consistently indicated that object processing involves the lateral occipital complex (LOC) [64,65]. From these studies, a hypothesis could be proposed that music processing and object processing share a similar mechanism involving the temporal cortex. In the current study, the amusics had worse performance than the controls on the face memory task. A possible explanation, which would need to be assessed in detail in future studies, could be that the deficits associated with congenital amusia may be reflected both in music processing and in face memory. Neuroimaging studies revealing differential activations in the temporal lobes of amusic participants compared with controls across different musical and nonmusical recognition tasks could also be performed in order to create a more unified neurocognitive model of congenital amusia.
Poor performance in face perception and memory in people with congenital amusia does not support the modularity of mind hypothesis, which is the view that certain specialized domains of cognition are functionally distinct from each other. Although recent studies have suggested the domain specificity of congenital amusia, this is still disputed, and many researchers have suggested that amusia might involve some visual processing deficits, such as in face recognition and face memory. To that effect, some work has shown that amusia is not restricted to music, but is also related to language processing and spatial processing [24][66][67][68]. For confirmation purposes, further studies should explore whether congenital amusia extends to additional aspects of face processing and face memory.
The current results suggest a link between congenital amusia and deficits in the processing of visual objects (face perception and face memory). The results suggested that amusics' deficits might not be restricted to musical domains. It is important to point out that congenital amusia might be divided into several subtypes, such as vocal amusia, instrumental amnesia, musical agraphia, musical amnesia, disorders of rhythm, receptive amusia [69], expressive amusia [70] and musical anhedonia [71]. Prior studies have shown that expressive amusics have no problem in perceiving pitch, but are poor singers when singing familiar melodies [72] or when imitating unfamiliar pitch patterns [73]. Although the MBEA is currently widely used for the diagnosis of congenital amusia, some studies have shown that using the results of the first three subtests (scale, contour, and interval) to examine pitch processing performance is not sufficient to distinguish congenital amusia from the other aforementioned amusia subcategories. The present study did not control for musical education background or memory abilities in either group. It is not clear to which categories our amusic participants belonged; a more discriminating diagnostic instrument should be developed and more confounding variables should be controlled. More broadly, congenital amusia has been reported to correlate with general cognitive deficits [74][75][76]. Specifically, those results showed that music perception ability was associated with other cognitive abilities, such as attention, memory, or executive function, which may contribute to the appearance of overlapping processing deficits. Finally, the participants in the current study were mostly young women. Future studies should recruit equal numbers of male and female participants to examine any potential sex differences among those with congenital amusia.

Conclusion
In summary, the findings of this study suggest that there might be a new aspect of the congenital amusia impairment, i.e., reduced face perception and face memory abilities in congenital amusics compared with a control group. In addition, the findings provide some tentative evidence that the impairment in congenital amusia might extend to the visual processing domain.