J Am Acad Audiol 2020; 31(07): 506-512
DOI: 10.3766/jaaa.19025
Research Article

Effect of Microphone Location and Beamforming Technology on Speech Recognition in Pediatric Cochlear Implant Recipients

Jourdan T. Holder, Adrian L. Taylor, Linsey W. Sunderhaus, René H. Gifford
Department of Hearing and Speech Sciences, Vanderbilt University Medical Center, Nashville, TN
Funding Funding was provided by NIDCD R01 DC009404 (investigator effort) and an unrestricted research grant from Advanced Bionics (participant remuneration).
 

Abstract

Background Despite improvements in cochlear implant (CI) technology, pediatric CI recipients continue to have more difficulty understanding speech than their typically hearing peers in background noise. A variety of strategies have been evaluated to help mitigate this disparity, such as signal processing, remote microphone technology, and microphone placement. Previous studies regarding microphone placement used speech processors that are now dated, and most studies investigating the improvement of speech recognition in background noise included adult listeners only.

Purpose The purpose of the present study was to investigate the effects of microphone location and beamforming technology on speech understanding for pediatric CI recipients in noise.

Research Design A prospective, repeated-measures, within-participant design was used to compare performance across listening conditions.

Study Sample A total of nine children (aged 6.6 to 15.3 years) with at least one Advanced Bionics CI were recruited for this study.

Data Collection and Analysis The Basic English Lexicon Sentences and AzBio Sentences were presented at 0° azimuth at 65-dB SPL at a +5-dB signal-to-noise ratio, with noise presented from seven speakers using the R-SPACE system (Advanced Bionics, Valencia, CA). Performance was compared across three omnidirectional microphone configurations (processor microphone, T-Mic 2, and processor + T-Mic 2) and two directional microphone configurations (UltraZoom and auto UltraZoom). The two youngest participants were not tested in the directional microphone configurations.

Results No significant differences were found between the various omnidirectional microphone configurations. UltraZoom provided significant benefit over all omnidirectional microphone configurations (T-Mic 2, p = 0.004, processor microphone, p < 0.001, and processor microphone + T-Mic 2, p = 0.018) but was not significantly different from auto UltraZoom (p = 0.176).

Conclusions All omnidirectional microphone configurations yielded similar performance, suggesting that a child's listening performance in noise will not be compromised by choosing the microphone configuration best suited for the child. UltraZoom (adaptive beamformer) yielded higher performance than all omnidirectional microphones in moderate background noise for adolescents aged 9 to 15 years. The implications of these data suggest that for older children who are able to reliably use manual controls, UltraZoom will yield significantly higher performance in background noise when the target is in front of the listener.



Introduction

Cochlear implants (CIs) can significantly improve communication in adults and children with moderate-to-profound sensorineural hearing loss. However, it is widely known that CI recipients continue to experience difficulty understanding speech in the presence of background noise. This is of importance in the pediatric population because elementary school–aged children spend up to 90% of their awake hours listening to speech in noise (e.g., the classroom) (Crukley et al[5]; Fidêncio et al[10]).

Speech recognition degrades with the introduction of background noise for listeners with CIs. Previous reports indicate that children with CIs score 63–75% at +5 dB signal-to-noise ratio (SNR) on tasks of sentence recognition in noise (e.g., Wolfe et al[27]; Eisenberg et al[8]; Gifford et al[14]), whereas their normal-hearing peers as young as 5 years old achieve ceiling-level performance even at 0 dB SNR (Lewis et al[18]; McCreery et al[19]; Holder et al[16]). Given these comparisons, children with CIs on average are not yet performing on par with their peers.

CI manufacturers and researchers have sought to improve listening performance for CI users via many methods. These include front-end processing such as directional microphones, noise reduction techniques, and the addition of compatible accessories such as FM systems. The benefit of FM systems for all children and adults has been clearly demonstrated in the literature (e.g., Wolfe et al[28]; De Ceulaer et al[6]); however, FM systems are not always available in environments outside the classroom, and therefore, other solutions for improving access to the target speaker such as microphone location and directional microphones should be considered.

Effects of Microphone Location

Microphone location has been shown to affect SNR for adults wearing hearing aids. Festen and Plomp[9] demonstrated a 2-dB improvement in SNR when the microphone was placed at the entrance of the ear canal compared with placement at the top of the pinna where a traditional behind-the-ear (BTE) microphone would be placed. This finding was also replicated by Pumford et al.[21] This benefit is thought to be related to the microphone's placement near the ear canal, which benefits from the frequency-specific shaping effects of the pinna (e.g., Shaw[24]).

In 2002, Advanced Bionics released the T-Mic auxiliary microphone, which places an omnidirectional microphone at the opening of the ear canal. In an effort to understand the difference in Advanced Bionics' T-Mic and BTE microphone placements, Aronoff et al[2] measured speech reception thresholds (SRTs) in listeners with normal hearing using head-related transfer functions for each microphone placement and found a 2-dB benefit from using the T-Mic placement, which was found to be statistically similar to the pinna effect. Gifford and Revit[13] replicated this finding in adult Advanced Bionics CI users listening in diffuse noise, reporting a significant effect of the T-Mic location with a 4.2-dB advantage over the BTE microphone location.

The studies referenced up to this point measured effects of microphone location for adults using Advanced Bionics Harmony processors; however, newer generation processors (i.e., Naida Q70 and Q90) have a different processor microphone design. Specifically, the Naida CI processor microphone is only partially recessed within the casing, as compared with the Harmony processor which had a deeply recessed microphone. Dwyer et al[7] evaluated the effect of microphone placement for a group of 11 adult CI recipients using Naida CI Q90 processors. With speech originating from 0° and restaurant noise originating from 45 through 315°, they found no significant difference in sentence recognition scores obtained with any of the omnidirectional microphone configurations (T-Mic 2, processor microphone (previously referred to as BTE mic), and processor microphone + T-Mic 2). Furthermore, they found the physical output for the processor microphone was equal to or greater than T-Mic 2—as measured from ear level on a KEMAR—consistent with an improved processor microphone design as compared with previous generation processors (Kolberg et al[17]). These data support the supposition that previous performance differences observed between the processor/BTE microphone and the T-Mic were largely attributed to a deeply recessed processor/BTE microphone with the Harmony processor which is not an issue with the current processor design.

All previous studies investigating the effect of microphone placement were conducted with adults. Furthermore, all studies either used an adaptive speech reception threshold measure (Gifford and Revit[13]; Aronoff et al[2]) or an individually determined SNR to drive unilateral CI-alone performance to approximately 50% (Dwyer et al[7]). Thus, it is possible that there may be differences for pediatric CI recipients, particularly at SNRs most commonly encountered in real-world scenarios, such as classrooms, cafeterias, or playgrounds (Crukley et al[5]). In such cases, it is unclear whether pediatric CI recipients may derive benefit from placement of the microphone at the entrance of the ear canal.



Microphone Directionality

Directional microphone technology has been widely used in hearing aids for many years. More recently, this technology has made its way to CIs. Several previous studies have shown that directional microphone technology can improve speech recognition in noise similar to that expected with hearing aids (Spriet et al[26]; Gifford and Revit[13]; Brockmeyer and Potts[3]; Hersbach et al[15]; Buechner et al[4]; Mosnier et al[20]). Geißler et al[12] showed that the use of an adaptive beamformer, marketed as UltraZoom, with the Naida CI processor resulted in significant improvement in speech perception when compared with the T-Mic 2 for adult CI recipients. Dwyer et al[7] also found a significant advantage of UltraZoom in R-SPACE proprietary restaurant noise (mean = 69% correct) compared with all omnidirectional microphone configurations (mean = 62% correct). All-day use of an UltraZoom program, however, may not be recommended as this would result in attenuation of desirable sounds (i.e., speech or audible alerts) originating from behind the listener. By contrast, auto UltraZoom is a microphone setting that automatically switches from omnidirectional to adaptive directionality within a single program. A 2015 white paper demonstrated that auto UltraZoom was as effective as UltraZoom in canteen noise presented from eight surrounding speakers (including in front of the listener) with the speech presented at 0° azimuth (Advanced Bionics[1]). The impact of these technologies for the pediatric population, however, has not been investigated.

The aims of this study were to quantify speech recognition performance between (a) different microphone placements (T-Mic 2, processor microphone, and processor microphone + T-Mic 2) and (b) different microphone modes (omnidirectional, adaptive directional [UltraZoom], and automatic adaptive directional [auto UltraZoom]) for pediatric CI recipients in a realistic listening environment using Advanced Bionics Naida Q90 processors. Based on previous studies mentioned herein, the hypotheses of the current study were as follows: (a) beamforming (UltraZoom and auto UltraZoom) would yield significantly higher performance over all omnidirectional microphone configurations and (b) T-Mic 2 would yield highest speech recognition in noise scores as compared with all other omnidirectional microphone sources.



Methods

Participants

For this experiment, ten pediatric CI recipients were consented and enrolled in accordance with the ethical standards of the institutional review board at Vanderbilt University (IRB approval: 130229). Participants were recruited from the Vanderbilt Bill Wilkerson CI clinic patient pool. All received at least one Advanced Bionics HiRes 90K CI. Five were using a bimodal hearing configuration (CI plus contralateral hearing aid), and four used bilateral CIs. Participant 1 was excluded following data collection because it was found that he used one CI and was not aided in the contralateral ear. Therefore, for this study, nine participants were included for analysis. The children ranged in age from 6.6 to 15.3 years, with a mean age of 10.8 years. Mean duration of CI use was 4.5 years and mean age of implantation was 6.4 years (ranging from 1.1 to 12.4 years). [Table 1] lists age and device information for each participant.

Table 1

Demographics and Device Information are Shown for Each Participant

Participant | Condition | Age at Implant (years) | Age at Testing (years) | CI Use (years) | CI Internal Device(s) HiRes 90K | Binaural Word Score (%) | Pure-Tone Average in dB (500, 1000, and 2000 Hz) in Hearing Aid Ear | Everyday Mic Setting
2 | Bimodal | 12.4 | 14.5 | 2.1 | HiFocus Mid-Scala | 88 | 98.3 | Processor
3 | Bimodal | 12.3 | 15.3 | 3.0 | HiFocus Mid-Scala | 92 | 73.3 | T-Mic
4 | Bilateral CI | 1.1 | 10.2 | 9.1 | HiFocus 1J | 92 |  | T-Mic
5 | Bilateral CI | 5.6 | 8.6 | 3.0 | HiFocus 1J | 84 |  | Processor
6 | Bimodal | 4.4 | 10.1 | 5.7 | HiFocus 1J | 84 | 61.6 | T-Mic
7 | Bimodal | 2.7 | 6.6 | 3.9 | HiFocus 1J | 60 | 58.3 | Processor
8 | Bimodal | 6.6 | 9.2 | 2.7 | HiFocus Mid-Scala | 76 | 73.3 | T-Mic
9 | Bilateral CI | 10.5 | 12.2 | 1.7 | HiFocus Mid-Scala (Left); HiFocus 1J (Right) | 80 |  | T-Mic
10 | Bilateral CI | 1.7 | 11.2 | 9.5 | HiFocus Helix | 82 |  | T-Mic
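
The summary statistics quoted in the Participants paragraph can be cross-checked against the values listed in Table 1. The following is a minimal sketch, not part of the original analysis; the variable and function names are illustrative assumptions. Because the table lists ages rounded to one decimal place, recomputed means may differ slightly from the reported ones (for example, the mean age at testing computed from the table is about 10.9 years, against the reported 10.8 years).

```python
from statistics import mean

# Values transcribed from Table 1 (years):
# (participant, age at implant, age at testing, CI use)
TABLE_1 = [
    (2, 12.4, 14.5, 2.1),
    (3, 12.3, 15.3, 3.0),
    (4, 1.1, 10.2, 9.1),
    (5, 5.6, 8.6, 3.0),
    (6, 4.4, 10.1, 5.7),
    (7, 2.7, 6.6, 3.9),
    (8, 6.6, 9.2, 2.7),
    (9, 10.5, 12.2, 1.7),
    (10, 1.7, 11.2, 9.5),
]


def summarize(rows, column):
    """Return (mean, min, max) for one numeric column of the table."""
    values = [row[column] for row in rows]
    return mean(values), min(values), max(values)


# Column indices: 1 = age at implant, 2 = age at testing, 3 = CI use
implant_mean, implant_min, implant_max = summarize(TABLE_1, 1)
test_mean, test_min, test_max = summarize(TABLE_1, 2)
use_mean, _, _ = summarize(TABLE_1, 3)

print(f"Age at implant: mean {implant_mean:.1f} (range {implant_min} to {implant_max})")
print(f"Age at testing: mean {test_mean:.2f} (range {test_min} to {test_max})")
print(f"CI use: mean {use_mean:.1f}")
```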



Procedure

Aided audiometric detection thresholds were assessed with the participants wearing one CI; aided detection was assessed for each ear individually if the participant was bilaterally implanted. Thresholds were obtained in the range of 15–30-dB HL using warble tones before testing to verify audibility of low-level speech across the frequency range from 250 through 6000 Hz. All testing was completed using an Advanced Bionics Naida CI Q90 processor programmed using the patient's clinical upper and lower stimulation levels. Clear Voice was set to medium, and Wind Block, Sound Relax, and Echo Block were not active. For the bimodal listeners, personal hearing aids in their everyday setting were used. All five bimodal patients used Phonak Naida digital hearing aids. Hearing aid settings were verified to ensure that DSL v5 (Scollie et al[23]) prescriptive targets were met for 65-dB SPL speech. A precise match to DSL targets was obtained for 250–6000 Hz for participants 3, 6, and 7. Because of severity of hearing loss, participant 1's hearing aid met targets for 250–1500 Hz and participant 2's hearing aid met targets for 250–4000 Hz. The microphone settings that the participants used in their everyday program are provided in [Table 1]. Of note, three participants had an UltraZoom program in their CI processor; however, data logging indicated that none of the participants used their UltraZoom program before enrollment in the study.

Calibration was completed using a Larson Davis LxT sound level meter (Depew, NY) placed at the level of the listener's head. Using the R-SPACE sound simulation system, sentences were presented at 65-dB SPL at 0° azimuth and the R-SPACE proprietary restaurant noise was presented continuously (i.e., noise did not start and stop for each sentence) from the remaining seven speakers (45–315°) at 60-dB SPL. Participants were seated in the sound booth and instructed to repeat as much of each sentence as possible and were encouraged to guess when necessary. Frequent breaks were provided. Two different types of sentence stimuli were used, the Basic English Lexicon (BEL) (Rimikis et al[22]) and AzBio sentences (Spahr et al[25]), and both were presented at a +5-dB SNR. Two lists were presented for each condition; the two lists were averaged resulting in one score per condition. All nine participants were assessed using the BEL sentences and seven of the nine participated in experimentation with the AzBio sentences. AzBio sentences were not age appropriate for participants 5 and 7. The BEL sentences were presented in three different omnidirectional microphone configurations for all nine participants: T-Mic 2 only, processor microphone only, and processor microphone + T-Mic 2. The AzBio sentences were then presented in all five microphone conditions for seven of the nine participants: T-Mic 2 only, processor microphone only, processor microphone + T-Mic 2, UltraZoom, and auto UltraZoom. Test conditions were randomized for each participant, and program changes between conditions were made by the audiologist to ensure that the participant was listening to the correct program.
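
The level arithmetic in this procedure can be illustrated with a short sketch. The SNR follows directly from the stated speech and noise levels, and the per-condition score is the average of two list scores. The per-loudspeaker calculation is an assumption added for illustration only: it supposes that 60-dB SPL is the overall noise level and that the seven loudspeakers contribute equal, uncorrelated noise, neither of which is stated in the source. All function names are hypothetical.

```python
import math


def snr_db(speech_db_spl, noise_db_spl):
    """Signal-to-noise ratio in dB is the difference of the two levels."""
    return speech_db_spl - noise_db_spl


def per_source_level(total_db_spl, n_sources):
    """Level of each of n equal-level, uncorrelated sources whose powers
    sum to total_db_spl (ASSUMPTION: not stated in the source)."""
    return total_db_spl - 10 * math.log10(n_sources)


def condition_score(list_scores):
    """Average the percent-correct scores of the lists in one condition."""
    return sum(list_scores) / len(list_scores)


print(snr_db(65, 60))                    # speech at 65 dB SPL, noise at 60 dB SPL
print(round(per_source_level(60, 7), 1))  # illustrative per-loudspeaker level
print(condition_score([50.0, 60.0]))      # hypothetical list scores
```

Under these assumptions, each of the seven noise loudspeakers would be roughly 8.5 dB below the overall noise level.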



Analyses

A Friedman test (nonparametric alternative to the one-way repeated measures analysis of variance) was conducted to compare the effect of omnidirectional microphone configuration on BEL sentence performance. Nonparametric analysis was chosen because of the small sample size. The same analysis was repeated to compare the effect of microphone configuration and processing on AzBio sentence performance.
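
For readers unfamiliar with the Friedman test, the sketch below shows how the statistic is formed: scores are ranked within each participant, ranks are summed per condition, and the rank sums are compared with a chi-square distribution with k − 1 degrees of freedom. This is a minimal standard-library illustration, not the authors' analysis code. The scores are hypothetical, no tie correction is applied, and the p-value helper uses a closed form valid only for even degrees of freedom. Statistical packages may instead compute exact p-values for small samples, so their output need not match the chi-square approximation.

```python
import math


def midranks(values):
    """Rank values from 1..k, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for m in range(i, j + 1):
            ranks[order[m]] = average_rank
        i = j + 1
    return ranks


def friedman_statistic(scores):
    """scores: one row per participant, one column per condition.
    Returns (Q, rank sums per condition); no tie correction is applied."""
    n, k = len(scores), len(scores[0])
    rank_rows = [midranks(row) for row in scores]
    rank_sums = [sum(row[c] for row in rank_rows) for c in range(k)]
    q = 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)
    return q, rank_sums


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form implemented for even df only")
    half = x / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# Hypothetical percent-correct scores: 4 participants x 3 conditions
scores = [[70, 65, 72], [80, 78, 85], [60, 55, 66], [75, 70, 74]]
q, rank_sums = friedman_statistic(scores)
p = chi2_sf_even_df(q, df=len(scores[0]) - 1)
print(q, rank_sums, round(p, 3))
```

As a usage note, applying `chi2_sf_even_df` to the AzBio statistic reported in the Results, χ²(4) = 15.429, gives approximately 0.004, in line with the reported p-value.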



Results

BEL Sentences: Effect of Omnidirectional Microphone Location

[Figure 1] displays individual sentence recognition in each of the omnidirectional microphone conditions for all nine participants. Mean BEL sentence recognition at +5-dB SNR was 75.0%, 72.3%, and 75.4% for the T-Mic 2 only, processor microphone only, and processor microphone + T-Mic 2 conditions, respectively. A nonparametric Friedman test of differences among repeated measures showed that there was not a significant effect of microphone configuration, χ²(2) = 0.75, p = 0.794; however, participant 9 did show significantly higher performance for the processor microphone + T-Mic 2 condition.

Fig. 1 Individual and mean performance (dashed, black line) on the BEL sentences presented at a +5-dB SNR for nine participants. Individual data points are connected to illustrate individual differences between conditions. Dotted connecting line indicates bilateral participants; continuous connecting line indicates bimodal participants.


AzBio Sentences: Comparison between Omnidirectional and Directional Microphone Settings

[Figure 2] shows individual and mean (bold dashed line) AzBio sentence recognition scores in five different microphone conditions for seven participants. The first three conditions (T-Mic 2 only, processor microphone only, and processor microphone + T-Mic 2 condition) replicate the BEL sentence data shown in [Figure 1]. The fourth and fifth conditions assessed two beamforming options, UltraZoom and auto UltraZoom. Mean AzBio sentence recognition for the omnidirectional microphone configurations was 48.9%, 48.2%, and 51.7% for the T-Mic 2, processor microphone, and processor microphone + T-Mic 2 conditions, respectively. Mean AzBio sentence recognition for the beamforming conditions was 64.0% and 56.7% for the UltraZoom and auto UltraZoom conditions, respectively. A nonparametric Friedman test of differences among repeated measures showed that there was a significant effect of microphone configuration, χ²(4) = 15.429, p = 0.004. Post hoc analyses using Dunn's multiple comparisons test revealed that there were no differences in sentence recognition across any of the omnidirectional microphone settings (p > 0.05), consistent with BEL sentence recognition findings ([Figure 2]). Although the difference was not significant at the group level, there were some individual differences as shown in [Figure 2]. For example, participant 2 performed 18 percentage points better with T-Mic 2 than T-Mic 2 + processor microphone, and participants 3 and 10 performed 19 and 25 percentage points better, respectively, with T-Mic 2 + processor microphone than with T-Mic 2. Multiple comparisons revealed that UltraZoom yielded significantly higher sentence recognition than T-Mic 2 (p = 0.004), processor microphone (p < 0.001), and T-Mic 2 + processor microphone (p = 0.018), but not significantly different from auto UltraZoom (p = 0.176).
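
The source does not describe how Dunn's multiple comparisons test was computed. The sketch below shows one common rank-sum form used after a Friedman test, in which each pairwise difference in rank sums is divided by sqrt(n·k·(k+1)/6) and the two-sided normal p-value is multiplied by the number of comparisons. Both the formula choice and the Bonferroni-style adjustment are assumptions, the rank sums are hypothetical, and the function name is illustrative.

```python
import math
from itertools import combinations


def dunn_after_friedman(rank_sums, n):
    """Pairwise comparisons on Friedman rank sums.

    rank_sums: sum of within-participant ranks for each condition.
    n: number of participants.
    Returns {(i, j): (z, adjusted_p)} with a Bonferroni-style adjustment
    (ASSUMPTION: the source does not state the adjustment used).
    """
    k = len(rank_sums)
    standard_error = math.sqrt(n * k * (k + 1) / 6.0)
    pairs = list(combinations(range(k), 2))
    results = {}
    for i, j in pairs:
        z = abs(rank_sums[i] - rank_sums[j]) / standard_error
        p_two_sided = math.erfc(z / math.sqrt(2))  # equals 2 * (1 - Phi(z))
        results[(i, j)] = (z, min(1.0, p_two_sided * len(pairs)))
    return results


# Hypothetical rank sums for 3 conditions from 4 participants
comparisons = dunn_after_friedman([9.0, 4.0, 11.0], n=4)
for pair, (z, p_adj) in comparisons.items():
    print(pair, round(z, 3), round(p_adj, 3))
```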

Fig. 2 Individual and mean performance (dashed, black line) on the AzBio sentences presented at a +5-dB SNR for seven participants. Individual data points are connected to illustrate individual differences between conditions. Dotted connecting line indicates bilateral participants; continuous connecting line indicates bimodal participants.


Discussion

The current study investigated the effects of omnidirectional microphone location as well as an adaptive beamformer (UltraZoom) and an automatic adaptive beamformer (auto UltraZoom) for pediatric CI recipients listening to sentences in noise at +5-dB SNR. For omnidirectional microphone configurations, the current dataset showed no differences in sentence recognition in noise for any of the three omnidirectional microphone configurations for the Naida CI sound processor in pediatric CI users. This is contrary to previous reports from our center and others with previous generation processors (e.g., Gifford and Revit[13]; Aronoff et al[2]; Kolberg et al[17]) demonstrating that a microphone located at the level of the ear canal led to improved speech perception in noise for adult CI users. However, the current results are consistent with the recent data presented by Dwyer et al[7] demonstrating no difference in performance across the same three omnidirectional microphone configurations for adult CI recipients (Naida CI Q90) with stimuli randomly presented from 0, 90, or 270 degrees. The most likely reason for the differences in outcomes with past studies is that both the current dataset and Dwyer et al[7] investigated effects of omnidirectional microphone location for CI recipients using the newest generation CI sound processor (Naida CI Q90), whereas previous reports had all used previous generation sound processors (Harmony and Auria). Kolberg et al[17] published microphone output statistics for the omnidirectional microphones on the Harmony demonstrating significantly higher output for the Harmony T-Mic than the Harmony processor microphone, particularly in the 1500- to 4500-Hz region. Specifically, output for the processor microphone decreased by 5 dB in a spectral region known to be heavily weighted for recognition of speech (e.g., French and Steinberg[11]). By contrast, Dwyer et al[7] replicated the methods of Kolberg et al[17] with the Naida CI Q90 and found that the processor microphone no longer has a decrease in the 1500- to 4500-Hz region. Indeed, it is likely that the recessed processor microphone in the Harmony case was largely responsible for this effect.

The results of the omnidirectional microphone comparison suggest similar performance across microphones. This finding has clinical implications for audiologists when selecting which microphone configuration to use for a child. Aside from speech recognition in noise, the T-Mic 2 has potential benefits such as optimal microphone placement for typical use of a telephone receiver or circumaural headphone placement. For example, an audiologist may want to select the T-Mic 2 for older children so that they are able to take advantage of these benefits; however, for younger children in the age range not tested here, audiologists will likely still want to select processor microphone or processor microphone + T-Mic 2 as these younger children may be more prone to damaging the T-Mic 2 and/or less reliable reporters should the T-Mic 2 be compromised. Given the current results, selecting the processor microphone or processor microphone + T-Mic 2—as presented in the second scenario for younger children—would result in similar performance to the T-Mic 2 for understanding speech in noise.

The second aim of the study was to investigate the effect of directional beamforming technology on speech understanding in noise in the pediatric population. In the current study, UltraZoom yielded significantly higher performance than all omnidirectional microphone configurations in moderate background noise for children aged 9 to 15 years. On average, listeners experienced a 15-percentage point improvement with UltraZoom, and no scores declined while using UltraZoom in this listening environment. This finding is in agreement with previous adult studies which showed the benefit of using UltraZoom in a sound booth setting and in a natural environment setting (Geißler et al[12]; Mosnier et al[20]; Dwyer et al[7]). This finding suggests that children who are mature enough to appropriately switch between omnidirectional and directional programs may benefit from an UltraZoom program for use in noise.

For the auto UltraZoom condition, five of seven children showed improved speech recognition compared with the T-Mic 2 condition (8 percentage points, on average); however, this improvement was not statistically significant at the group level. When comparing auto UltraZoom to UltraZoom, UltraZoom yielded scores that were 7 percentage points higher on average, but this difference was also not statistically significant (p = 0.176). On an individual level, six of seven children performed better using UltraZoom than auto UltraZoom. Of note, bilateral participants reported that their processors switched into UltraZoom at different times from one another while in the auto UltraZoom program. This may be one explanation for slightly poorer speech understanding in auto UltraZoom than UltraZoom in a real-life scenario; however, it should be noted that this was not formally measured. These data also hold clinical implications for the pediatric CI population as we would not expect negative outcomes for speech understanding in noise with auto UltraZoom for speech originating from 0°. However, additional research is needed to determine whether head position and attentional effects common in the pediatric population may impact outcomes. As such, we cannot yet generalize these data to the larger pediatric CI population. Older children, however, would be expected to derive significant benefit from use of UltraZoom and possibly derive benefit from the use of auto UltraZoom over omnidirectional microphone settings.
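
The percentage-point differences discussed above can be recovered from the group mean AzBio scores reported in the Results. The sketch below simply subtracts those means; the dictionary keys and function name are illustrative assumptions, and differences of group means need not equal averages taken over a subset of participants.

```python
# Group mean AzBio scores (% correct) as reported in the Results section
MEAN_AZBIO = {
    "T-Mic 2": 48.9,
    "processor microphone": 48.2,
    "processor microphone + T-Mic 2": 51.7,
    "UltraZoom": 64.0,
    "auto UltraZoom": 56.7,
}


def point_difference(condition_a, condition_b):
    """Difference in percentage points between two condition means."""
    return round(MEAN_AZBIO[condition_a] - MEAN_AZBIO[condition_b], 1)


print(point_difference("UltraZoom", "T-Mic 2"))         # about 15 points
print(point_difference("auto UltraZoom", "T-Mic 2"))    # about 8 points
print(point_difference("UltraZoom", "auto UltraZoom"))  # about 7 points
```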

Limitations

The current study was limited by small sample size and speech presentation at 0° azimuth using a fixed SNR. Future studies should consider more difficult SNRs and roving speaker locations to account for more difficult listening conditions. With the development of StereoZoom, a binaural beamformer, future research should also investigate this signal processing in the pediatric population.



Conclusion

We investigated the effects of omnidirectional microphone configurations as well as adaptive (UltraZoom) and automatic adaptive (auto UltraZoom) beamformers on sentence recognition in noise for pediatric CI recipients. The results of the current study can be summarized as follows.

  • The three omnidirectional microphone configurations (T-Mic 2, processor microphone, and T-Mic 2 + processor microphone) resulted in similar sentence recognition in noise performance for pediatric CI recipients at the group level.

  • The beamformer, UltraZoom, resulted in significantly higher sentence recognition in noise than all omnidirectional microphone configurations. The addition of an UltraZoom program for children who are independently able to regulate their programs will be beneficial for speech recognition in noise.

  • Auto UltraZoom was not statistically significantly different from the T-Mic 2 or UltraZoom programs. Further research is warranted to determine whether implementation of the auto UltraZoom program is appropriate in the pediatric population.



Abbreviations

BEL: Basic English Lexicon
BTE: behind the ear
CI: cochlear implant
SNR: signal-to-noise ratio


Conflict of Interest

None declared.

Notes

Portions of the following data were presented at the 14th International Conference on Cochlear Implants, Toronto, ON, May 11–14, 2016, and at the 15th Symposium on Cochlear Implants in Children, San Francisco, CA, July 26–29, 2017.



Address for correspondence

Jourdan T. Holder
Department of Hearing and Speech Sciences, Vanderbilt University Medical Center
Nashville, TN 37232

Publication History

Publication Date:
02 September 2020 (online)

© 2020 by the American Academy of Audiology. All rights reserved.

Thieme Medical Publishers
333 Seventh Avenue, New York, NY 10001, USA.

