Abstract
Unbalanced bipolar stimulation, delivered using charge balanced pulses, was used to produce “Phantom stimulation”, stimulation beyond the most apical contact of a cochlear implant’s electrode array. The Phantom channel was allocated audio frequencies below 300 Hz in a speech coding strategy, conveying energy some two octaves lower than the clinical strategy and hence delivering the fundamental frequency of speech and of many musical tones. A group of 12 Advanced Bionics cochlear implant recipients took part in a chronic study investigating the fitting of the Phantom strategy and speech and music perception when using Phantom. The evaluation of speech in noise was performed immediately after fitting Phantom for the first time (Session 1) and after one month of take-home experience (Session 2). A repeated-measures analysis of variance (ANOVA) with the within-subject factors strategy (Clinical, Phantom) and interaction time (Session 1, Session 2) revealed a significant effect for the interaction time and strategy. Phantom obtained a significant improvement in speech intelligibility after one month of use. Furthermore, a trend towards better performance with Phantom (48%) than with F120 (37%) after one month of use failed to reach significance after type I error correction. Questionnaire results show a preference for Phantom when listening to music, likely driven by an improved balance between high and low frequencies.
Citation: Nogueira W, Litvak LM, Saoji AA, Büchner A (2015) Design and Evaluation of a Cochlear Implant Strategy Based on a “Phantom” Channel. PLoS ONE 10(3): e0120148. https://doi.org/10.1371/journal.pone.0120148
Academic Editor: Bernd Sokolowski, University of South Florida, UNITED STATES
Received: July 8, 2014; Accepted: January 19, 2015; Published: March 25, 2015
Copyright: © 2015 Nogueira et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited
Data Availability: All relevant data are within the paper.
Funding: This work was supported by the DFG Cluster of Excellence EXC 1077/1 “Hearing4all”. Advanced Bionics LLC, provided support in the form of salaries for authors [Leonid M. Litvak and Aniket A. Saoji], but did not have any additional role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript. The specific roles of these authors are articulated in the “author contributions” section.
Competing interests: We would like to mention the following conflicts of interest: Leonid M. Litvak and Aniket A. Saoji are employees of the cochlear implant manufacturer Advanced Bionics LLC. Waldo Nogueira is a former employee of the same company. This does not alter the authors’ adherence to PLOS ONE policies on sharing data and materials.
Introduction
The minimum acceptable telephone bandwidth specified by the Comité Consultatif International Télégraphique et Téléphonique (CCITT) requires lower and upper cutoff frequencies of 300 Hz and 3,400 Hz respectively [1]. This bandwidth was determined using subjective listening tests. However, listening experiments have shown that an increase of the acoustic bandwidth significantly improved not only the perceived speech quality but also speech intelligibility [2, 3].
Cochlear implant (CI) processors from Advanced Bionics have been designed to only encode spectral information above 250 Hz, mimicking the bandwidth used in telephony. Sound processors manufactured by Cochlear only transmit signals above 188 Hz in their default configuration. MED-EL processors encode down to 100 Hz by default, extendable to 70 Hz. Recent evidence in the CI field suggests also that speech cues provided by frequencies below 300 Hz can improve implant outcomes [4, 5]. In combined electrical and acoustic hearing (EAS), a CI and a hearing instrument (HI) are used in the same ear, with the HI typically amplifying the residual low-frequency acoustic hearing. It has been shown that with EAS subjects the addition of low frequency acoustic stimulation often enhances speech understanding [6]. These findings, along with recently reported ways to encode low frequency information through electric stimulation [7, 8] motivated this work. We present an implementation and clinical validation of a sound coding strategy that transmits low frequency information for unilateral CIs.
One can convey low frequency information simply by extending the lowest cut-off frequency associated with the most apical electrode contact. However, the place-pitch percept for the most apical electrode contact will be higher than the frequency information mapped to it [7, 9]. For example, Marel et al. [10] have shown that the HiFocus1J obtains a mean insertion depth of 480 degrees, which corresponds approximately to a frequency of 480 Hz [11]. Adding low frequency information to the most apical electrode will result in further deviations from the subject’s spiral ganglion map. Some studies [12–14] suggest that implant users can adapt over time to spectrally shifted speech-frequency mappings. However, correcting the allocation of acoustic components to individual electrode contacts may be important for music and indeed for improved speech recognition [15–19].
Typically, monopolar electrode coupling is used to deliver electrical stimulation to the auditory neurons in today’s CI systems. Current flows between a primary intracochlear electrode and a remote extracochlear ground contact. For simultaneous current steering, stimulation is delivered to an adjacent pair of contacts using the same phase [20]. Shaping of the electrical field can be achieved by applying reverse phase compensating currents to the neighboring contacts. More recently, Saoji & Litvak [7] presented a technique first reported by Wilson [21] based on biphasic pulses presented in partial bipolar mode. Saoji & Litvak called this Phantom electrode stimulation. Phantom stimulation is illustrated in Fig. 1. Here a cathodic-anodic biphasic pulse was presented on the primary (apical) electrode contact 1, while a reverse phase pulse (anodic-cathodic) was presented on the compensating contact 2. The ratio of the current between the compensating and the primary electrode is termed σ. When σ = 0, all stimulation is on the primary electrode resulting in monopolar stimulation mode. When σ = 1, stimulation is equal between the primary and compensating electrode contact resulting in bipolar stimulation mode. When a partial return current (e.g., σ = 0.625) is presented on the neighboring basal contact, the spread of electrical excitation toward the high-frequency basal end of the cochlea is reduced [22], moving the electrical excitation more apically (Fig. 2). When Saoji & Litvak [7] applied this technique to contacts in the middle of the electrode array, 10 Advanced Bionics cochlear implant users reported a lowering of pitch perception equivalent to 0.5 to 2 electrodes when compared to monopolar stimulation.
Here, the amount of current compensation is σ = 0.625.
The electrical field is simulated using triangular functions and assuming linear superposition of the electrical field produced by each electrode. The center of mass of the electrical field is assumed to be related to the pitch elicited by the stimulation. Using Phantom stimulation it is possible to push the electrical field away from the most apical electrode.
It has been suggested that accurate pitch perception may depend on a match between place and temporal cues, and that the mismatch between these two cues in CIs may limit discrimination performance in CI listeners [23–25]. The Advanced Bionics CI systems use the HiFocus1J electrode [26] which is typically inserted approximately 1.25 turns [10]. Using Phantom stimulation the insertion depth can virtually be increased by about 0.5 to 2 electrodes [7], which represents about 0.5 to 2 mm of additional insertion depth. In MED-EL CI systems the Standard and the FLEXSOFT electrode arrays are typically inserted approximately 1.75 to 2 full turns into the cochlea [27, 28]. Comparing the Standard and FLEXSOFT electrode arrays with the HiFocus1J, the most apical 1J electrode coincides approximately with the 3rd and 4th most apical electrode contacts of the MED-EL Standard electrode array. Using these long arrays, Schatzer et al. [25] investigated pitch perception in a group of CI users with normal hearing in their non-implanted ear. They asked CI users to match rates of unmodulated pulse trains presented on one of their six most apical electrodes to pure tones at frequencies ranging from 100 to 450 Hz presented to their normal-hearing ear. They found reliable electrical pitch percepts when rate and electrode place of stimulation were reasonably matched. Most subjects achieved pitch matches to pure tones up to 300 Hz only on electrodes at insertion angles larger than 360 degrees. Based on these findings they suggest that coding strategies that aim at representing low-frequency temporal fine structure via pulse rate modulations should map those fine-structure channels to electrodes placed in the second turn of the cochlea. However, it has also been shown [26, 29] that deeper insertion can lead to more insertion trauma, increasing the possibility of some apical contacts translocating from scala media into scala vestibuli, with a negative impact on speech perception scores.
Phantom permits stimulation of the auditory nerve beyond the most apical electrode without using deeper electrode insertions, and therefore can be used to encode a lower frequency and to extend the range of stimulation sites and represented frequencies.
Using MED-EL Standard electrode arrays, Arnoldner et al. [30] found that speech intelligibility can be improved by increasing the electrode distance, and therefore decreasing channel interaction, in the apical part of the cochlea. It has been shown that Phantom stimulation produces a narrower electrical field than monopolar stimulation [22], and thus Phantom might be able to transmit electrical stimulation in the apical part of the cochlea more effectively.
Although some CI users can make use of temporal information at relatively high rates [31], the majority cannot perceive any difference for temporal modulations above 300 Hz [32, 33]. However, this rate limitation does not preclude coding of temporal envelope cues up to approximately 300 Hz, where rate or modulation rate of electrical pulse trains can be used to convey a percept of pitch [34]. Additionally, some strategies such as the FSP/FSP4 strategies used in MED-EL devices are intended to provide temporal fine structure (TFS) by stimulating the 1 to 4 most apical electrodes at a variable rate that corresponds to the fine structure of the signal in the frequency range from 100 to 710 Hz. It has been shown that transmitting fine structure in the low frequencies enhances speech perception in noise [35]. The Phantom channel is designed to convey temporal information to the most apical region of the cochlea. The idea is to transmit temporal fluctuations using high stimulation rates of around 1000 pulses per second to encode low frequency sounds from 65 Hz to 300 Hz.
Another pitch lowering technique has recently been proposed by Macherey et al. [8]. Here pseudomonophasic pulses, consisting of a short, high amplitude phase followed by a longer, much lower amplitude opposite-polarity phase, are presented in bipolar mode. This work showed that rate pitch could be extended beyond the 300 Hz limit of monopolar stimulation when using pseudomonophasic pulses delivered to the most apical electrode contacts, likely because neurons can phase lock to higher rates when neural information originates from more apical sites of the cochlea. They also hypothesized that more focused stimulation in the cochlea, as provided by bipolar mode, is needed to convey better temporal information. Despite the above work, no sound coding strategy has yet been implemented with pseudomonophasic pulses in a commercial CI sound processor to allow chronic evaluation of speech intelligibility and music perception.
This study evaluates a new sound processing strategy that uses an additional Phantom channel to convey low frequency information in a slightly more apical region. The goals of this work were first, to investigate the applicability of such a strategy in a clinical sound processor; second, to investigate the fitting of the strategy; and third, to investigate whether the additional low frequency channel provided better speech intelligibility and sound quality than the clinical strategy.
Materials and Methods
The Phantom strategy is based on the HiRes with Fidelity 120 (F120) strategy [36] from Advanced Bionics but adds an additional low frequency channel. To represent this new channel, a virtual electrode is created using partial bipolar stimulation (Phantom). Fig. 2 (left) shows the primary stimulating current being delivered from electrode contact 1. The electrical field produced by each electrode is modelled using a triangular function. In Fig. 2b an additional smaller compensating current with opposite phase is delivered from the adjacent contact 2. We assumed linear superposition of the individual electrical fields to simulate the overall electrical field created by Phantom stimulation. Using this simple model, it can be observed that the compensating current makes the working phase of the primary current less effective on the compensating (basal) side, resulting in an apical shift in field and hence neural recruitment. The Phantom channel thus provides a lower pitch sensation than that of the most apical electrode contact alone, making this channel suitable to convey low frequency information.
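The simple field model described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the electrode positions, the field half-width, and the choice to take the centroid of the positive part of the summed field are hypothetical and are not values from the study.

```python
def triangular_field(x, center, amplitude, half_width):
    """Triangular field: peak `amplitude` at `center`, zero beyond `half_width`."""
    return amplitude * max(0.0, 1.0 - abs(x - center) / half_width)

def phantom_field(x, sigma, primary_pos=0.0, compensating_pos=1.0, half_width=3.0):
    """Linear superposition of the primary field and the opposite-phase
    compensating field scaled by sigma (positions in electrode spacings)."""
    primary = triangular_field(x, primary_pos, 1.0, half_width)
    compensating = triangular_field(x, compensating_pos, -sigma, half_width)
    return primary + compensating

def center_of_mass(positions, sigma):
    """Centroid of the positive part of the field (assumed proxy for pitch)."""
    weights = [max(0.0, phantom_field(x, sigma)) for x in positions]
    return sum(x * w for x, w in zip(positions, weights)) / sum(weights)

# Negative positions are apical of contact 1, positive positions are basal.
positions = [i / 100.0 for i in range(-600, 601)]
monopolar_com = center_of_mass(positions, 0.0)
phantom_com = center_of_mass(positions, 0.625)
print(monopolar_com, phantom_com)
```

With σ = 0 the centroid sits on contact 1. With σ = 0.625 it moves to a negative position, that is, apically, which is the qualitative behaviour the model is meant to show.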
Fig. 3 presents the basic block diagram of the Phantom strategy. The microphone signal is first digitized using a sampling frequency Fs of 17400 samples per second. Next the front-end implements a dual-action automatic-gain control (AGC) to remove the large variations in the acoustic environment [37, 38]. The resulting signal is sent through a filter bank based on a Fast Fourier Transform (FFT) of length L = 256 samples. The linearly spaced FFT bins then are grouped into 16 bands. Table 1 presents the number of FFT bins assigned to each analysis band and its associated center frequency. For each analysis band, the Hilbert envelope is computed from that channel’s FFT bins [36]. Non-linear amplification is used to compress the output of the envelope detector into the range between the threshold (T) level and most comfortable (M) level of a recipient’s electrical hearing range using a logarithmic compression function. For the lowest frequency analysis band, the envelope is used to amplitude modulate a pair of partial bipolar biphasic pulses like the one presented in the left side of Fig. 2. The partial bipolar channel is configured with a fixed value of σ that is set during the fitting session. The low frequency channel provides mostly temporal information because the spectral bandwidth associated with this channel is relatively large and no adaptive current steering is applied to this channel.
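As a rough sketch of the analysis chain, the snippet below computes the FFT bin spacing implied by the stated sampling rate and FFT length, combines the bins of one band into an envelope, and maps that envelope logarithmically into the range between T and M. The envelope formula (root of the summed bin powers), the envelope limits `env_min` and `env_max`, and the exact form of the compression are assumptions for illustration. The actual bin-to-band allocation is given in Table 1 and is not reproduced here.

```python
import math

FS = 17400      # sampling rate in samples per second (from the text)
FFT_LEN = 256   # FFT length in samples (from the text)

def bin_frequency(k):
    """Center frequency in Hz of FFT bin k."""
    return k * FS / FFT_LEN

def band_envelope(bin_magnitudes):
    """Assumed envelope estimate: root of the summed bin powers."""
    return math.sqrt(sum(m * m for m in bin_magnitudes))

def map_to_current(envelope, t_level, m_level, env_min=1e-3, env_max=1.0):
    """Hypothetical logarithmic map from envelope to a current between T and M."""
    clipped = min(max(envelope, env_min), env_max)
    fraction = math.log(clipped / env_min) / math.log(env_max / env_min)
    return t_level + fraction * (m_level - t_level)

print(bin_frequency(1))                       # bin spacing in Hz
print([bin_frequency(k) for k in range(1, 5)])  # bins that lie below 300 Hz
print(map_to_current(band_envelope([0.03, 0.04]), t_level=100.0, m_level=300.0))
```

The bin spacing is about 68 Hz, so only bins 1 to 4 lie below 300 Hz. This is consistent with the remark that the low frequency channel carries mostly temporal rather than spectral detail.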
The Phantom strategy incorporates a low frequency analysis band used to deliver information to a partial bipolar (Phantom) channel. The rest of the analysis bands are exactly the same as in the commercial F120 strategy.
z is the band number, Nz is the number of bins in the zth band and fcenter is the center frequency in Hertz.
For the remaining analysis bands, the standard F120 strategy processing is used [36]. For each analysis band the Hilbert Envelope is computed from that channel’s FFT bins (Table 1). Additionally, in order to improve the spectral resolution of the audio signal analysis, an interpolation is performed, based on a spectral peak locator within each analysis band. The spectral peak locator estimates the most important frequency in each analysis channel. A frequency weight map converts this frequency into a current weighting and carrier synthesis for the current steered electrode pair in each channel (see [36] for more details). For each stimulation cycle, the electrode pair associated with one analysis band is stimulated simultaneously. However, different channels are stimulated sequentially in order to reduce undesired channel interactions. Furthermore, the order of stimulation is selected to maximize the distance between stimulation pairs to further reduce channel interaction. Fig. 4 illustrates the stimulation pattern for a complete Phantom strategy stimulation cycle. Because partial bipolar stimulation requires a larger charge per phase to produce the equivalent loudness of monopolar stimulation, a longer phase duration is allocated to reduce the risk of stimulating at levels that are out of compliance. The Phantom phase duration was configured to be 6 times longer than that for the remaining electrode contacts, thus reducing the overall stimulation rate by 29% in comparison to a F120 strategy. No significant reduction in speech perception was expected from this rate reduction [39] given the relatively large stimulation rates used by the clinical strategies of the subjects (Table 2).
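The 29% figure can be checked with a short calculation. The sketch assumes that the F120 cycle consists of 15 sequential channel slots of equal duration and that the Phantom channel adds one slot 6 times as long. The count of 15 is inferred from the 16 analysis bands minus the Phantom band and is not stated explicitly above.

```python
def rate_reduction(n_standard_channels, phantom_duration_factor):
    """Fractional drop in per-channel rate when one longer Phantom slot is added."""
    baseline_cycle = n_standard_channels  # slots per cycle without Phantom
    phantom_cycle = n_standard_channels + phantom_duration_factor
    return 1.0 - baseline_cycle / phantom_cycle

print(round(100 * rate_reduction(15, 6), 1))  # 28.6
```

Under these assumptions the cycle grows from 15 to 21 slots, so the channel stimulation rate falls to 15/21 of its original value, a reduction of about 29%.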
CSR is the Channel Stimulation Rate for one electrode. In the Phantom strategy, the electrode contacts 1 and 2 are stimulated simultaneously but out of phase prior to stimulating the rest of analysis bands using the standard in phase current steering technique.
The only difference between the Phantom and the subject’s clinical strategy (F120) is the addition of the low frequency channel (transmitted through Phantom stimulation) and the consequent reduction of stimulation rate. All other aspects of the strategy remain the same for both strategies.
Subjects
The study protocol was reviewed and approved by a registered board (Freiburger Ethik-Kommission International). Only adult CI users participated in the study. After explanation of the study protocol and the risks and benefits of participating, all subjects signed a consent form before participating. The consent form was approved by the ethics board.
Twelve postlingually deafened, German-speaking users of the Advanced Bionics CII or HiRes90k implants and the F120 strategy participated in the study. Only Advanced Bionics devices were used because of the need to simultaneously stimulate several electrodes in or out of phase, requiring multiple independent current sources. All subjects were able to score at ceiling on the HSM [40] sentence test delivered without background noise. All subjects had experience in previous clinical evaluations. Subject demographics, including age, duration of deafness and implant experience, can be found in Table 2.
Study Design
The Phantom and the commercial F120 strategy were evaluated in two sessions. In session one, impedances were measured for the 16 electrode contacts using a phase duration of 32.32 μs/phase. The amplitude of the test pulse was 16 μA. The impedance values were used to set the upper limit of the current (in μA) that could be used to stimulate individual electrodes, assuming a maximum compliance voltage of 8 V for each current source. In addition, a maximum charge density limit of 100 μC/cm² was used to ensure that this upper current limit stayed within the safety limits applied for research studies by the United States Food and Drug Administration.
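The two constraints on the upper current limit can be illustrated as follows. This is a minimal sketch assuming a purely resistive load (Ohm's law) for the compliance limit and a charge per phase equal to current times phase duration. The function names and the example impedance, phase duration and contact area are hypothetical and are not values from the study.

```python
def compliance_limit_ua(impedance_kohm, compliance_v=8.0):
    """Largest current in µA before the source exceeds its compliance voltage."""
    return compliance_v / impedance_kohm * 1000.0  # V / kΩ = mA, then to µA

def charge_density_limit_ua(phase_us, area_cm2, max_density_uc_cm2=100.0):
    """Largest current in µA keeping charge per phase per area under the limit."""
    max_charge_uc = max_density_uc_cm2 * area_cm2
    return max_charge_uc / phase_us * 1e6  # µC / µs = A, then to µA

def upper_current_limit_ua(impedance_kohm, phase_us, area_cm2):
    """The stricter of the compliance and charge density limits."""
    return min(compliance_limit_ua(impedance_kohm),
               charge_density_limit_ua(phase_us, area_cm2))

# Hypothetical contact: 5 kΩ impedance, 100 µs phase, 0.001 cm² surface area.
print(compliance_limit_ua(5.0))
print(charge_density_limit_ua(100.0, 0.001))
print(upper_current_limit_ua(5.0, 100.0, 0.001))
```

In this hypothetical case the charge density limit is the binding one. Longer phase durations lower the charge density limit but leave the compliance limit unchanged.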
The Phantom strategy was fitted and stored on a Harmony research processor together with the F120 strategy. Next, speech tests based on the HSM sentence test were conducted to assess speech intelligibility with both strategies. The participants were instructed to use the Phantom strategy during the following 4 weeks.
Fitting
The BEPSnet software (Advanced Bionics LLC) was used to fit both the F120 and Phantom strategies. The Phantom strategy was fitted based on the F120 clinical map, with only the fitting for the Phantom channel being different. Pulse trains at the same rate as used for the remaining channels, but with a phase duration 6 times longer, were delivered using partial bipolar stimulation to the two most apical electrode contacts. The Phantom channel fitting required two stages: 1) the most comfortable level for the Phantom channel (Mph) in μAmperes was found and 2) the maximum value of σ that elicited a lower pitch sensation than electrode contact 1 in monopolar mode was found. Initially, Phantom was configured with a value of σ set to 0.625 because this value should be the one eliciting the lowest pitch perception while at the same time minimizing the risk of pitch reversal [7]. Given this value of σ, the initial M level on the Phantom channel (Mph) was set using the following empirical equation: (1) , where MEL1 denotes the M level on electrode 1 and N denotes the pulse width increase factor for the Phantom channel with respect to the remaining channels. Here N was always set to 6. Using σ = 0.625 means that 62.5% of the current flows from the primary electrode contact to the compensating contact and thus, does not contribute to loudness. Next, the level on the Phantom channel was modified until the subject perceived it to produce the same loudness sensation as electrode contact 1 stimulated at MEL1 in monopolar mode.
The subject was then asked whether Phantom sounded lower in pitch than electrode contact 1 (using monopolar stimulation). The BEPSnet software allowed us to switch contact 1 back and forth between Phantom and monopolar mode. The subject was asked to identify which of the two stimuli (electrode 1: monopolar or Phantom) was lower in pitch. This task was repeated four times in a randomized order with the subject blind to the stimulation mode. If a subject selected Phantom as lower in pitch 100% of the time, that value of σ was allocated. Otherwise the value of σ was reduced to 0.5, the Mph was re-estimated and the experiment repeated until Phantom was perceived to sound lower in pitch than electrode 1 in 100% of the trials. Finally, the strategy was switched on and the subject was asked about the sound balance produced by Phantom. If the sound quality was dominated too strongly by low frequencies, the Mph was slightly reduced.
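The σ selection procedure can be summarized as a small search loop. The sketch below is an illustration only: the text specifies the starting value 0.625 and the fallback 0.5, while any further candidate values are hypothetical. The callback `phantom_sounds_lower` stands in for one blinded pitch comparison by the subject, and the re-estimation of Mph at each step is omitted.

```python
def fit_sigma(phantom_sounds_lower,
              candidate_sigmas=(0.625, 0.5, 0.375, 0.25),
              n_trials=4):
    """Return the largest candidate sigma judged lower in pitch on every trial.

    Values after 0.5 in `candidate_sigmas` are hypothetical. Returns None if
    no candidate passes, in which case Phantom would not be fitted.
    """
    for sigma in candidate_sigmas:
        if all(phantom_sounds_lower(sigma) for _ in range(n_trials)):
            return sigma
    return None

# Simulated subject who hears a pitch reversal above sigma = 0.5.
print(fit_sigma(lambda sigma: sigma <= 0.5))  # 0.5
```

Requiring all four trials to agree is a strict criterion: a subject guessing at random would pass a given σ with a probability of only 1 in 16.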
Speech Tests
Each condition was evaluated using two lists of the HSM sentence test [40]. Sentences were mixed with speech-shaped CCITT noise (according to ITU-T Rec. G.227 (11/88), Conventional telephone signal) [41]. The signal-to-noise ratio (SNR) was selected individually depending upon the performance of each subject. Testing was conducted in a sound treated room. The HSM sentences were played through a loudspeaker placed at 0 degrees azimuth using a presentation level of 65 dB(A). The distance between the participant and the loudspeaker was 1 meter. The HSM sentence test uses balanced lists of 20 sentences with a fixed SNR. Performance is measured as the percentage of words correctly repeated. After completion of the chronic phase (4 weeks), the subject returned for a second session, at which time two HSM lists were presented with both programs (Phantom and F120).
Music Perception
At the end of the first session, the CI users listened to different musical pieces and were asked about their impression of music while using Phantom and F120. The first music piece was composed of 3 movements of the Orchestersuite Nr. 2 H-Moll (BWV 1067) by Johann Sebastian Bach, where the section without basso continuo was removed (it was assumed that this passage should sound very similar using both strategies). The second music piece was Serendipity by the jazz/fusion bass player Tal Wilkenfeld, which contains a strong rhythm and a lot of low frequency content. The music pieces were presented through a standard loudspeaker and a subwoofer at 0 degrees azimuth to ensure that low frequency components were properly transmitted. The subwoofer was calibrated to present the music pieces at the same level of 60 dB SPL as the other loudspeaker at a distance of 1 meter. For the second session, the selection of music pieces was extended with the piece “We only get what we give” by the “New Radicals”. This is a typical song from the rock/pop genre and contains vocals. Using these music pieces we developed a music questionnaire, based on the questionnaire from [42], that allowed for a direct comparison of the two programs. The questionnaire was completed only at session two. The following questions were asked while the music piece given in brackets was played:
- How easy is it to follow the music? (Music piece: New Radicals)
- How natural does the music sound? (Music piece: Bach)
- How good/natural is the tonal balance of the music? (Music piece: Tal Wilkenfeld)
- What is your overall impression of music? (All music pieces)
- With which program do you prefer to listen to music?
Results
Fitting of Phantom Channel
The fitting of the Phantom strategy was successful for all participants, meaning that all subjects were able to use Phantom in their daily life during a 4 week period. With the initial setting of σ = 0.625, 9 of 12 participants found that the Phantom channel sounded lower in pitch than electrode 1 in monopolar mode. The value of σ and the corresponding M level for each participant are presented in Table 3. For two subjects, we hypothesized that the Phantom channel caused a pitch reversal and for this reason, the value of σ was reduced. Subject ID 4 could not perceive a lower pitch sensation when using Phantom with any value of σ and therefore was eliminated from the study. This case shows that there may be a small proportion of patients for which Phantom is inappropriate as a strategy. The ratio between the M level on Phantom Mph and the M level on electrode 1 for each participant is presented in Fig. 5. The description of the sound produced by Phantom was very different among participants. One subject could not perceive any difference between the two programs, while the remainder were surprised by the dominant low frequency sound. For example, participant ID 1, who was fitted with a very high M level ratio between electrode 1 and Phantom, described the sound to be massively dominated by the low frequencies. For the remaining participants, the M level on the Phantom channel was slightly reduced after switching on the strategy, in order to get a better balance between high and low frequencies. From the informal music test at the end of the first test session, most of the subjects reported that they liked the bassy sound produced by the Phantom strategy to listen to music.
Study participants 3, 9, 11 and 12 were refitted during the second session reducing the Mph level.
At the beginning of the second session, the participants were again asked about their impression when listening to both programs during the 4 week take home trial. Here again there was a divided opinion. Around half of the participants responded positively about experience with the Phantom program for speech and music. In general, they were satisfied with the improved sound quality and the more natural sound perception. 6 participants showed a strong preference for Phantom, and they reported that they would like to use this strategy as a main program. The other 6 participants showed no preference or were even dissatisfied with Phantom, mostly because the Phantom channel sounded too loud. The sound was described as unclear and with echo on their own voice. However, these subjects did not report any negative effect from Phantom during the fitting phase, probably because they were fitted in a studio room with low noise. For these subjects, the Mph levels were reduced for the Phantom electrode. After this refit (Tables 3 and 4), all these subjects obtained an improved sound quality.
Speech Understanding
Speech perception was evaluated using the HSM sentence test in noise with both the F120 and the Phantom strategy. The sentence test was performed at an SNR of 5, 10 or 15 dB depending on the performance of each participant. The amount of noise was selected such that the percentage of correct words remained between 25% and 75%. Table 5 shows the SNR level at which the HSM sentence test was performed. The results for session 1 and 2 are presented in Figs. 6 and 7 respectively. Fig. 8 presents the difference in HSM speech performance between session 1 and 2 for each strategy. In the figures, the median value is indicated by a horizontal line and the mean value is indicated by an asterisk. A repeated-measures analysis of variance (ANOVA) with the within-subject factors strategy (F120, Phantom) and interaction time (Session 1, Session 2) revealed a significant effect for the interaction time and strategy [F(1.00) = 6.476; p = 0.029]. Post-hoc paired t-tests were corrected for type I error using Bonferroni correction, which requires p < 0.0125. For the first session, no significant difference between F120 and Phantom was observed; the mean values for F120 and Phantom were 37.47% and 40.56% respectively (paired t-test t(10) = 0.777, p = 0.455). For the second session, no significant difference between Phantom and F120 was observed after Bonferroni correction, although there appears to be a trend towards better performance for Phantom (48.07%) than for F120 (36.96%) after 1 month of use (paired t-test t(10) = 2.449, p = 0.034). Phantom obtained a significant improvement in speech intelligibility from the first session to the second session (40.56% vs 48.07%, paired t-test t(10) = 3.270, p = 0.008). No significant difference was observed between session 1 and 2 for F120 (37.47% vs 36.96%, paired t-test t(10) = 0.163, p = 0.874).
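The Bonferroni decision for the four post-hoc comparisons can be reproduced with the reported p-values. The sketch assumes a family-wise significance level of 0.05 divided across four comparisons, which yields the stated threshold of 0.0125. The dictionary labels are descriptive names chosen here.

```python
def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance threshold under Bonferroni correction."""
    return alpha / n_comparisons

# p-values of the four paired t-tests reported above.
p_values = {
    "Session 1: F120 vs Phantom": 0.455,
    "Session 2: F120 vs Phantom": 0.034,
    "Phantom: Session 1 vs Session 2": 0.008,
    "F120: Session 1 vs Session 2": 0.874,
}

threshold = bonferroni_threshold(0.05, len(p_values))
significant = [name for name, p in p_values.items() if p < threshold]
print(threshold)
print(significant)
```

Only the improvement of Phantom from session 1 to session 2 falls below the corrected threshold. The session 2 comparison between strategies (p = 0.034) would pass an uncorrected 0.05 criterion but not the corrected one.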
No significant difference was observed (paired t-test p = 0.455). The horizontal line in the box indicates the median value and the asterisk indicates the mean value.
No significant difference was observed after Bonferroni correction (paired t-test p = 0.034).
A repeated-measures ANOVA with the within-subject factors strategy (F120, Phantom) and interaction time (Session 1, Session 2) revealed a significant effect for the interaction time and strategy [F(1.00) = 6.476; p = 0.029].
Participant ID 6 obtained a remarkable improvement when using the Phantom strategy. This participant reported that with Phantom he could perceive the intonation of voices much better, which allowed him to understand speech better.
Music Perception
Music was assessed in a controlled comparison condition via our own questionnaire. Study participants ID1 and ID10 were not selected to conduct the music questionnaire because they were not able to perform reliable music assessments. The responses were analyzed using a wilcoxon signed-rank test to assess their significance. Fig. 9 presents the results for the question “how easy is to follow the music”. No significant difference between F120 and Phantom could be observed for the question “easy to follow” (wilcoxon signed rank test p = 0.886). Some participants reported that because they were used to the sound produced by F120, it was easier for them to follow the sound using the F120 strategy (only 4 weeks of accommodation time using Phantom). Fig. 10 presents the results for the question “how natural music sounds”. No significant difference could be observed between both strategies (wilcoxon signed rank test p = 0.091). Fig. 11 shows the results for the question about the perceived sound balance. The results for F120 are situated above the natural region, whereas Phantom was rated much lower in balance than F120. This question revealed that Phantom sounds significantly more balanced than F120 (wilcoxon signed rank test p = 0.001). Fig. 12 shows the results for the question about the overall impression of music. The overall impression of music with Phantom was rated higher than with F120. This difference was significant (wilcoxon signed rank test p = 0.037) and shows the overall preference of the CI users to listen to music using the Phantom strategy as presented in Table 6.
No significant difference was observed between F120 and Phantom (Wilcoxon signed-rank test, p = 0.886).
No significant difference could be observed between the two strategies (Wilcoxon signed-rank test, p = 0.091).
Phantom produces a significantly (Wilcoxon signed-rank test, p = 0.01) more balanced sound between the low and high frequencies than F120.
The Phantom strategy was rated to sound significantly better than F120 (Wilcoxon signed-rank test, p = 0.037).
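The paired questionnaire comparisons above rely on the Wilcoxon signed-rank test. As a minimal sketch of how such a test works for a small group, the code below computes the sum of positive ranks and an exact two-sided p-value by enumerating all sign assignments. It is not the authors' analysis, which may have used a normal approximation or a statistics package. The function names and the ratings are hypothetical placeholders, not the study data.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def wilcoxon_signed_rank(x, y):
    """Return (W+, exact two-sided p) for paired samples x and y.

    Zero differences are dropped. The p-value enumerates all 2**n sign
    patterns, so this is only practical for small n.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    centre = sum(ranks) / 2  # expected W+ under the null hypothesis
    observed = abs(w_plus - centre)
    extreme = 0
    for signs in product((0, 1), repeat=len(diffs)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - centre) >= observed - 1e-12:
            extreme += 1
    return w_plus, extreme / 2 ** len(diffs)


# Hypothetical ratings from ten listeners (illustration only).
f120_ratings = [5, 6, 4, 7, 5, 6, 3, 5, 6, 4]
phantom_ratings = [7, 6, 6, 8, 5, 7, 5, 6, 8, 6]
w_plus, p_value = wilcoxon_signed_rank(phantom_ratings, f120_ratings)
```

With ten participants the enumeration covers at most 1024 sign patterns, so an exact p-value is cheap and avoids the normal approximation.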
The correlations between the differences (F120-Phantom) in speech intelligibility and the responses to the music questionnaire were evaluated for statistical significance by means of Spearman’s rank order correlation coefficients. No significant relationships were found.
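Spearman’s rank order correlation is the Pearson correlation computed on the ranks of the two variables, with tied values sharing their mean rank. The sketch below shows that calculation under those standard definitions. The function names and the input values are hypothetical and are not taken from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their ranks."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 + (ordered.count(v) - 1) / 2 for v in values]


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    ss_x = sum((a - mx) ** 2 for a in rx)
    ss_y = sum((b - my) ** 2 for b in ry)
    return cov / (ss_x * ss_y) ** 0.5


# Hypothetical values (illustration only): per-listener difference in speech
# score between strategies, and difference in one questionnaire rating.
speech_difference = [-12.0, 3.5, -20.1, -4.0, 1.2, -9.8]
rating_difference = [-2, 0, -1, -3, 1, -2]
rho = spearman_rho(speech_difference, rating_difference)
```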
Discussion
This study has presented a comparison between the clinical F120 strategy and a new strategy termed Phantom that adds an additional low frequency channel. The low frequency channel transmits frequencies below 300 Hz and presents them to the auditory nerve by stimulating the two most apical electrodes using a partial bipolar configuration. The low frequency channel aims to convey mostly temporal information to a region of the cochlea that is more apical than the most apical physical electrode contact. Research in the field of combined electric and acoustic hearing (EAS) has shown that low frequency information (below 300 Hz) perceived through the residual hearing can improve speech intelligibility and sound quality in general [6]. Based on this finding, we investigated whether the coding of low frequencies can be improved through electrical stimulation only. The Phantom strategy was implemented in a commercial Harmony processor, which allowed us to investigate the fitting procedure as well as the perception of speech and music in a 1-month take-home trial. The participants were selected from the database of the Medical University Hannover. These users were selected because of their good performance with their cochlear implant (near 100% speech intelligibility in the HSM sentence test without background noise), allowing meaningful subjective feedback when comparing the sound sensations with both strategies.
Fitting
The subject’s impression of Phantom while listening in quiet was primarily used to create a fitting. The fitting of the Phantom channel is challenging because small changes in the M level produce large effects on the overall sound perception. One reason for this is that the new channel is used to transmit around two additional octaves in the low frequency region, which the subjects have not been able to perceive since their implantation. In this study, a fitting procedure has been proposed. First, for each subject, the optimal value of σ has to be determined, where the optimal σ is defined as the σ that produces a maximum pitch shift. For that value of σ the corresponding M level has to be fitted. Second, the M level on Phantom is fitted such that it elicits the same loudness perception as that perceived when electrode 1 is stimulated in monopolar mode at its M level. An empirical equation to match the amplitude of the Phantom channel as a function of σ has been used to set the initial value for the fitting. However, many participants reported that with this value the Phantom strategy sounded too loud and was dominated by low frequencies when all the channels were activated. For this reason, the M level on the Phantom channel was slightly reduced to optimize the sound balance between the low and high frequencies.
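The fitting steps described above can be summarized in a short sketch: select the σ with the largest pitch shift, then slightly lower the loudness-balanced M level. This is only an outline of the decision logic under stated assumptions. The function names, the pitch-shift scores, the M level units and the size of the reduction are hypothetical, and the empirical amplitude equation mentioned in the text is not reproduced because it is not given in this section.

```python
def choose_sigma(pitch_shift_by_sigma):
    """Step 1: return the sigma whose Phantom channel gave the largest pitch shift.

    pitch_shift_by_sigma maps each tested sigma to a pitch-shift score
    (hypothetical scale; larger means lower pitch relative to electrode 1).
    """
    return max(pitch_shift_by_sigma, key=pitch_shift_by_sigma.get)


def adjust_m_level(loudness_balanced_m, reduction_fraction):
    """Final step: lower the loudness-balanced M level by a small fraction.

    loudness_balanced_m is the M level matched in loudness to electrode 1 in
    monopolar mode; reduction_fraction is a hypothetical tuning parameter.
    """
    if not 0.0 <= reduction_fraction < 1.0:
        raise ValueError("reduction_fraction must be in [0, 1)")
    return loudness_balanced_m * (1.0 - reduction_fraction)


# Hypothetical measurements for one listener (illustration only).
pitch_shift_by_sigma = {0.25: 0.4, 0.5: 1.1, 0.75: 0.8}
sigma = choose_sigma(pitch_shift_by_sigma)
m_level = adjust_m_level(loudness_balanced_m=200.0, reduction_fraction=0.05)
```

In practice both steps depend on the listener's judgments, so the values passed in here would come from psychophysical measurements rather than from a formula.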
Effects of Rate.
The new partial bipolar channel was configured to use a longer pulse width (6 times longer) than the remaining channels. This caused the Phantom strategy to produce a 29% slower stimulation rate in comparison to F120 (Table 2), which, in theory, can negatively affect the temporal information on each electrode. This aspect needs to be further investigated in a follow-up study. This configuration was chosen in order to reduce the amount of current needed to produce a comfortable loudness sensation on the Phantom channel, at the expense of reducing the overall rate of the strategy. A benefit of the longer pulse width is that the Phantom strategy did not cause any additional power consumption on the device.
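The size of the rate reduction can be checked with a rough calculation. The sketch below assumes that the channels are stimulated one after another, that F120 cycles through 15 channels of equal pulse width, and that Phantom adds one channel whose pulse is 6 times longer. The channel count and the sequential timing model are assumptions not stated in this section; Table 2 remains the authoritative source for the actual rates.

```python
def relative_rate_reduction(n_channels, phantom_width_factor):
    """Fractional drop in per-channel rate when one longer channel is added.

    Cycle durations are expressed in units of one regular pulse width,
    assuming strictly sequential, non-overlapping stimulation.
    """
    base_cycle = float(n_channels)
    phantom_cycle = base_cycle + phantom_width_factor
    return 1.0 - base_cycle / phantom_cycle


reduction = relative_rate_reduction(n_channels=15, phantom_width_factor=6)
print(f"{reduction:.1%}")  # prints 28.6%
```

Under these assumptions the cycle grows from 15 to 21 pulse widths, a reduction of about 29%, which is consistent with the figure quoted above but does not confirm the timing model.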
Effects of increasing bandwidth.
Phantom can be considered as a 16 electrode strategy with an additional apical electrode. We observed clear differences in speech and music perception. With a correct fitting, such as those used during the second session, most of the subjects perceived a fuller and more natural sound. Using Phantom, music perception was rated to sound significantly better than using the clinical strategy, most probably because the overall sound balance was rated to sound significantly more neutral when using the Phantom strategy. We think that the optimization of the fitting procedure can have a positive impact on speech intelligibility in noise, as observed with participant ID 6.
One could argue that the improvements observed with Phantom are produced by the increased low frequency bandwidth transmitted by this strategy. Vermeire et al. 2010 [35] reported that an extended low frequency bandwidth does not provide a significant improvement in speech intelligibility in noise. However, they could show a significant improvement in speech intelligibility when adding temporal fine structure in the extended low frequency spectrum, as provided by the FSP strategy through very apically stimulated electrodes. Unlike the study of Vermeire et al. 2010, the Phantom strategy does not transmit temporal fine structure explicitly, and all subjects participating in the Phantom study received a functionally deeper insertion than they were used to. We think that the small shift in apical stimulation provided by an additional Phantom channel might be the reason why CI users seem to obtain benefits in speech intelligibility and music perception.
Using Phantom we could not observe a significant difference in speech intelligibility with respect to F120, although the data seem to show a trend towards better performance with Phantom and some CI users obtained clear benefits from the strategy. It is possible that differences in performance are just caused by the addition of frequencies below 300 Hz. However, Vermeire et al. [35] reported that low frequency bandwidth does not provide a significant improvement in speech intelligibility in noise. Instead, they could show a significant improvement in speech intelligibility when adding temporal fine structure in the low frequencies through deeply inserted electrodes. Therefore, it seems that the small shift in apical stimulation produced by Phantom, or the fact that the stimulation is provided with an extra channel, might explain the improvements observed in individual CI users.
Speech Tests
Overall, the speech intelligibility performance of the participants was very good at SNRs of 5 and 10 dB. In this study, it was not possible to show a significant improvement in speech intelligibility for Phantom with respect to the commercial strategy. The HSM sentence test was presented in noise at a fixed SNR that was adapted to the performance of each participant. For each condition, 2 HSM lists were presented. For the first session, the mean scores for F120 and Phantom were 37.47% and 40.56% respectively (paired t-test t(10) = 0.777, p = 0.455). For the second session, Phantom showed a trend towards an improvement in speech intelligibility with respect to the F120 strategy; however, the results were not statistically significant after Bonferroni correction (36.96% F120 and 48.07% Phantom, t(10) = 2.449, p = 0.034). The performance with Phantom increased significantly by almost 8% after one month of use (paired t-test t(10) = 3.270, p = 0.008). We think that speech intelligibility with Phantom could be further improved by increasing the take-home period, because other studies have shown increasing performance in speech intelligibility even three months after using a low frequency strategy for the first time [35, 44].
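The relation between the reported t statistics, their p-values and the Bonferroni correction can be illustrated with a short calculation. The sketch below integrates the Student t density numerically to obtain two-sided p-values for the three paired t-tests quoted above, then compares them with a Bonferroni-adjusted threshold. The number of comparisons (three) and the 0.05 significance level are assumptions made for illustration; this section does not state the values used by the authors. The function names are hypothetical.

```python
import math


def t_pdf(t, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)


def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability of |T| <= |t|
    return 1 - central_area


# t statistics with 10 degrees of freedom, as reported in the text.
reported_t = {
    "Session 1, Phantom vs F120": 0.777,
    "Session 2, Phantom vs F120": 2.449,
    "Phantom, Session 2 vs Session 1": 3.270,
}
ALPHA = 0.05          # assumed significance level
N_COMPARISONS = 3     # assumed number of comparisons for the correction
threshold = ALPHA / N_COMPARISONS

for label, t_stat in reported_t.items():
    p = two_sided_p(t_stat, df=10)
    verdict = "significant" if p < threshold else "not significant"
    print(f"{label}: p = {p:.3f} ({verdict} at {threshold:.4f})")
```

Under these assumptions the Session 2 comparison between strategies falls above the corrected threshold while the improvement with Phantom over time falls below it, which matches the pattern described in the text.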
Music Questionnaire
Music perception in CI users is limited by the poor pitch perception obtained with these devices, which causes difficulties in instrument identification, melody recognition and harmonicity [45–47]. However, results from recent studies have shown that CI users perceive music well enough to make reliable subjective comparisons [44, 48]. In this study, a novel questionnaire has been presented that allows a direct comparison between two programs. CI subjects reported enjoyment while completing the questionnaire. It seems that CI technology allows for music enjoyment and, for this reason, we think that more effort has to be made to create new strategies specifically designed to improve music perception.
Additionally, the music questionnaire has shown that the sound balance with Phantom was significantly more neutral than with the clinical F120 strategy. It appears that, in general, music with CIs sounds too high pitched. This result is supported by [49], who suggest enhancing M levels at the low frequencies to improve music perception in CIs. In our study, the sound balance was compensated by introducing the Phantom channel, and this might explain, at least partially, the significant preference of the participants for listening to music with Phantom. However, it remains unclear whether music perception is preferred with Phantom because this strategy makes better use of the temporal pitch mechanism, which is available up to at least 300 Hz. Further research is needed to better understand the mechanism that provides improved music perception with Phantom.
The music questionnaire has been shown to be a successful method to assess the sound quality differences between Phantom and F120. In two categories of the music questionnaire it was possible to show a significant result. Furthermore, completing the music questionnaire was fluent and pleasant for the participants. We think that for future studies the use of subjective questionnaires with well-founded questions can be very useful for the evaluation and development of new strategies.
Outlook
The Phantom strategy introduces a new low frequency channel and extends the low frequency bandwidth transmitted. A side effect of the addition of the low frequency channel is an overall reduction of stimulation rate. According to previous studies, no significant effect on performance is expected from this moderate reduction in stimulation rate. A follow-up study should investigate which of these factors has more impact on the promising trends in performance provided by Phantom. Psychophysical experiments should give more insight into the functioning of the Phantom channel. For example, the possible benefits for pitch perception using more apical stimulation and its correlation with music and speech perception should be investigated more deeply.
The design of the Phantom strategy has to be further optimized. So far, the Phantom channel has been used to transmit low-pass filtered sound signals below 300 Hz. The cut-off frequency that gives the best performance should be investigated. Another topic of research is the type of information that has to be transmitted through this channel. If the fundamental frequency is the cue that causes the potential improvements when using Phantom, possibly only the fundamental frequency should be transmitted through the low frequency channel. However, there might be other cues, such as amplitude modulations of the fundamental frequency, or onsets and offsets of sounds, that might also help to perceive sound. Additionally, the Phantom channel is currently used to transmit a relatively wide bandwidth of 2 octaves. The bandwidth allocated to this channel is much larger than that allocated to the remaining channels. We hypothesize that a possible improvement for Phantom could be the creation of additional Phantom channels using different values of σ. Each Phantom channel could be used to transmit a different low frequency band, to make use not only of the temporal information but also of place information in this low frequency region. However, additional Phantom channels would interact with each other because of the current spread in the cochlea, and this could smear the temporal information delivered by a single Phantom channel. To solve this issue one could use channel compensation techniques to reduce the negative effects of channel interaction [50].
Conclusion
The clinical F120 strategy and a new strategy termed Phantom, which is identical to F120 except that it includes an additional channel to transmit low frequencies, were evaluated in 12 CI users in a 4-week take-home trial. The fitting of the Phantom channel was crucial to obtain good sound quality. Eleven out of 12 CI patients obtained a pitch perception with the partial bipolar (Phantom) channel lower than the pitch perceived when stimulating electrode 1 in monopolar mode. Speech performance with F120 and Phantom was evaluated immediately after fitting Phantom for the first time and after one month of take-home experience. No significant differences could be observed between the group mean performances with Phantom and F120. Moreover, no significant difference could be observed between the two strategies for the questions “how easy is to follow music” and “how natural does the music sound”. However, the sound produced by Phantom was reported to be significantly more neutral than with F120. This result probably explains why the overall impression of music was rated higher when listening with Phantom, as well as the significant preference for Phantom when listening to music.
Acknowledgments
The authors would like to thank the subjects who have participated in the experiments and the two anonymous reviewers for their comments on different versions of this manuscript.
Author Contributions
Conceived and designed the experiments: WN AB. Performed the experiments: WN AB. Analyzed the data: WN AB. Contributed reagents/materials/analysis tools: WN LML AS AB. Wrote the paper: WN AB.
References
- 1. ITU (1988) 60 MHz systems on standardized 2.6/9.5 mm coaxial cable pairs. International Telecommunication Union (ITU) ITU-T Rec. G. 333.
- 2. Moore BCJ, Tan CT (2003) Perceived naturalness of spectrally distorted speech and music. Journal of the Acoustical Society of America 114: 408–419. pmid:12880052
- 3. Vary P, Martin R (2006) Digital Speech Transmission: Enhancement, Coding and Error Concealment. West Sussex, England: John Wiley and Sons Ltd, ISBN-13 978-0-471-56018-9, pp. 239–240.
- 4. Dorman M, Spahr A, Loizou P, et al (2005) Acoustic simulations of combined electric and acoustic hearing (EAS). Ear and Hear 26: 371–380.
- 5. Turner CW, Reiss LA, Gantz BJ (2008) Combined acoustic and electric hearing: Preserving residual acoustic hearing. Hearing Research 242: 164–171. pmid:18164883
- 6. Brown CA, Bacon SP (2009) Low-frequency speech cues and simulated electric-acoustic hearing. Journal of the Acoustical Society of America 125: 1658–1665. pmid:19275323
- 7. Saoji A, Litvak L (2010) Use of phantom electrode technique to extend the range of pitches available through a cochlear implant. Ear and Hear 31: 693–701.
- 8. Macherey O, Deeks JM, Carlyon RP (2011) Extending the limits of place and temporal pitch perception in cochlear implant users. Journal of the Association for Research in Otolaryngology 12: 233–251. pmid:21116672
- 9. Boex C, Baud L, Cosenadi G, Sigrist A, Kós M-I, et al. (2006) Acoustic to electric pitch comparisons in cochlear implant subjects with residual hearing. Journal of the Association for Research in Otolaryngology 7: 110–124. pmid:16450213
- 10. Marel KSVD, Briaire JJ, Wolterbeek R, Snel-bongers J, Verbist BM, et al. (2013) Diversity in Cochlear Morphology and Its Influence on Cochlear Implant Electrode Position. Ear and Hearing 35: 9–20.
- 11. Stakhovskaya O, Sridhar D, Bonham B, Leake P (2007) Frequency map for the human cochlear spiral ganglion: Implications for cochlear implants. JARO: Journal of the Association for Research in Otolaryngology 8: 220–233. pmid:17318276
- 12. Rosen S, Faulkner A, Wilkinson L (1999) Adaptation by normal hearing listeners to upward spectral shifts of speech: implications for cochlear implants. Journal of the Acoustical Society of America 106: 3629–3636.
- 13. Fu Q, Nogaki G, Galvin J (2008) Auditory training with spectrally shifted speech: implications for cochlear implant patient auditory rehabilitation. J Assoc Res Otolaryngol 6: 180–189.
- 14. Reiss LAJ, Turner C, Gantz B (2007) Changes in pitch with a cochlear implant over time. J Assoc Res Otolaryngol 8: 241–257. pmid:17347777
- 15. Skinner MW, Holden LK, Holden TA (1995) Effect of frequency boundary assignment on speech recognition with the SPEAK speech coding strategy. Ann Otol Rhinol Laryngol Suppl 166: 307–311. pmid:7668683
- 16. Dorman M, Loizou P, Rainey D (1997) Simulating the effect of cochlear-implant electrode insertion depth on speech understanding. Journal of the Acoustical Society of America 102: 2993–2996. pmid:9373986
- 17. Dorman M, Ketten D (2003) Adaptation by a cochlear-implant patient to upward shifts in the frequency representation of speech. Ear and Hear 102: 457–460.
- 18. Baskent D, Shannon RV (2005) Interactions between cochlear implant electrode insertion depth and frequency-place mapping. Journal of the Acoustical Society of America 117: 1405–1416. pmid:15807028
- 19. Middlebrooks JC, Snyder RL (2010) Selective electrical stimulation of the auditory nerve activates a pathway specialized for high temporal acuity. J Neurosci 30: 2225–2236.
- 20. Koch DB, Downing M, Osberger MJ, Litvak L (2007) Using current steering to increase spectral resolution in CII and HiRes90K users. Ear and Hearing 28: 38S–41S. pmid:17496643
- 21. Wilson B (1993) Chapter 2: Signal processing. In: Tyler RS, editor. Cochlear Implants: Audiological Foundations. San Diego, Calif: Singular Pub Group. pp. 35–84.
- 22. Saoji A, Landsberger D, Padilla M, Litvak L (2013) Masking patterns for monopolar and phantom electrode stimulation in cochlear implants. Hearing Research 298: 109–116. pmid:23299125
- 23. Oxenham AJ, Bernstein JG, Penagos H (2004) Correct tonotopic representation is necessary for complex pitch perception. Proc Natl Acad Sci USA 101: 1421–1425. pmid:14718671
- 24. Moore BCJ, Carlyon RP (2005) Perception of pitch by people with cochlear hearing loss and by cochlear implant users. Pitch: Neural Coding and Perception, Springer 101: 234–277.
- 25. Schatzer R, Vermeire K, Visser D, Krenmayr A, Kals M, et al. (2014) Electric-acoustic pitch comparisons in single-sided-deaf cochlear implant users: Frequency-place functions and rate pitch. Hearing Research 309: 26–35. pmid:24252455
- 26. Skinner MW, Holden TA, Whiting BR, Voie AH, Brunsden B, et al. (2007) In vivo estimates of the position of advanced bionics electrode arrays in the human cochlea. Ann Otol Rhinol Laryngol Suppl 197: 2–24. pmid:17542465
- 27. Baumann U, Nobbe A (2006) The cochlear implant electrode-pitch function. Hearing Research 213: 34–42. pmid:16442249
- 28. Vermeire K, Nobbe A, Schleich P, Voormolen HM, de Heyning PHV (2008) Neural tonotopy in cochlear implants: An evaluation in unilateral cochlear implant patients with unilateral deafness and tinnitus. Hearing Research 245: 98–106. pmid:18817861
- 29. Finley C, Skinner MW (2008) Role of electrode placement as a contributor to variability in cochlear implant outcomes. Otol Neurotol 29: 920–928. pmid:18667935
- 30. Riss CAD, Baumgartner WD, Kaider A, Hamzavi JS (2007) Cochlear implant channel separation and its influence on speech perception implications for a new electrode design. Audiol Neurootol 12: 313–24. pmid:17536200
- 31. Kong YY, Carlyon RP (2010) Temporal pitch perception at high rates in cochlear implants. The Journal of the Acoustical Society of America 127: 3114–3123. pmid:21117760
- 32. Zeng FG (2002) Temporal pitch in electric hearing. Hearing Research 174: 101–106. pmid:12433401
- 33. Shannon R (1983) Multichannel electrical stimulation of the auditory nerve in man. I. Basic psychophysics. Hearing Research 11: 157–189.
- 34. McKay CM, McDermott HJ, Clark GM (1994) Pitch percepts associated with amplitude-modulated current pulse trains in cochlear implantees. Journal of the Acoustical Society of America 96: 2664–2673. pmid:7983272
- 35. Vermeire K, Punte AK, de Heyning PV (2010) Better speech recognition in noise with the fine structure processing coding strategy. ORL J Otorhinolaryngol Relat Spec 72: 26–35.
- 36. Nogueira W, Litvak LM, Edler B, Ostermann J, Buechner A (2009) Signal processing strategies for cochlear implants using current steering. EURASIP Journal on Advances in Signal Processing 2009 ID 531213.
- 37. Moore BCJ, Glasberg BR, Stone MA (1991) Optimization of a slow-acting automatic gain control system for use in hearing aids. Br J Audiol 25: 171–182. pmid:1873584
- 38. Boyle P, Buechner A, Stone M, Lenarz T, Moore B (2009) Comparison of dual-time-constant and fast-acting automatic gain control (AGC) systems in cochlear implants. International Journal of Audiology 48: 211–221. pmid:19363722
- 39. Friesen L, Shannon R, Cruz R (2005) Effects of stimulation rate on speech recognition with cochlear implants. Journal of the Acoustical Society of America 169: 169–184.
- 40. Hochmair-Desoyer I, Schulz E, Moser L, Schmidt M (1997) The HSM sentence test as a tool for evaluating the speech understanding in noise of cochlear implant users. American Journal of Otology 18: 83.
- 41. (ITUa) ITU (1988). Attenuation performance, in Blue Book, vol. fascicle III.1, general characteristics of international telephone connections and circuits. ITU-T Rec. G. 132.
- 42. Nardo WD (2011) Improving melody recognition in cochlear implant recipients through individualized frequency map fitting. European Archives of Oto-Rhino-Laryngology 268: 27–39. pmid:20635091
- 43. Likert R (1932) A technique for the measurement of attitudes. Archives of Psychology 140: 1–55.
- 44. Magnusson L (2011) Comparison of the fine structure processing (FSP) strategy and the CIS strategy used in the MED-EL cochlear implant system: Speech intelligibility and music sound quality. International Journal of Audiology 50: 279–287. pmid:21190508
- 45. Gfeller K, Knutson J, Woodworth G, Witt S, DeBus B (1998) Accuracy of cochlear implant recipients on pitch perception, melody recognition, and speech reception in noise. J Am Acad Audiol 9: 1–19. pmid:9493937
- 46. Gfeller K, Witt S, Woodworth G, Mehr M, Knutson J (2002) Effects of frequency, instrumental family, and cochlear implant type on timbre recognition and appraisal. Ann Otol Rhinol Laryngol 29: 349–356.
- 47. Gfeller K, Turner C, Oleson J, Zhang X, Gantz B, et al. (2007) Accuracy of cochlear implant recipients on pitch perception, melody recognition, and speech reception in noise. Ear and Hearing 28: 412–423. pmid:17485990
- 48. Looi V, McDermott H, McKay C, Hickson L (2008) Music perception of cochlear implant users compared with that of hearing aid users. Ear Hear 29: 421–434. pmid:18344870
- 49. Rosslau K, Saalfeld H, Westhofen M (2012) Emotional and analytic music perception in cochlear implant users after optimizing the speech processor. Acta Oto-laryngologica 132: 64–71. pmid:22026456
- 50. Townshend B, White RL (1987) Reduction of electrical interaction in auditory prostheses. IEEE Transactions on Biomedical Engineering BME-34(11): 891–897.