J Am Acad Audiol 2022; 33(03): 170-180
DOI: 10.1055/a-1678-3381
Research Article

Influence of Audibility and Distortion on Recognition of Reverberant Speech for Children and Adults with Hearing Aid Amplification

Marc A. Brennan
1   Department of Special Education and Communication Disorders, University of Nebraska-Lincoln, Lincoln, Nebraska
Ryan W. McCreery
2   Center for Hearing Research, Boys Town National Research Hospital, Omaha, Nebraska
John Massey
3   Florida Ear and Sinus Center, Silverstein Institute, Sarasota, Florida
Funding Portions of this work were funded by the Nebraska Tobacco Settlement Biomedical Research Development Fund (principal investigator Marc A. Brennan), NIDCD grants T35 DC008757 (Boys Town National Research Hospital), P30 DC4662 (Boys Town National Research Hospital), P20 GM109023 (principal investigator Marc A. Brennan), R01 DC013591 (principal investigator Ryan W. McCreery).
 

Abstract

Background Adults and children with sensorineural hearing loss (SNHL) have trouble understanding speech in rooms with reverberation when using hearing aid amplification. While the use of amplitude compression signal processing in hearing aids may contribute to this difficulty, there is conflicting evidence on the effects of amplitude compression settings on speech recognition. Less clear is the effect of a fast release time for adults and children with SNHL when using compression ratios derived from a prescriptive procedure.

Purpose The aim of the study is to determine whether release time impacts speech recognition in reverberation for children and adults with SNHL and to determine if these effects of release time and reverberation can be predicted using indices of audibility or temporal and spectral distortion.

Research Design This is a quasi-experimental cohort study. Participants used a hearing aid simulator set to the Desired Sensation Level algorithm m[i/o] for three different amplitude compression release times. Reverberation was simulated using three different reverberation times.

Participants Participants were 20 children and 16 adults with SNHL.

Data Collection and Analyses Participants were seated in a sound-attenuating booth and then nonsense syllable recognition was measured. Predictions of speech recognition were made using indices of audibility, temporal distortion, and spectral distortion and the effects of release time and reverberation were analyzed using linear mixed models.

Results While nonsense syllable recognition decreased in reverberation, release time did not significantly affect nonsense syllable recognition. Participants with lower audibility were more susceptible to the negative effect of reverberation on nonsense syllable recognition.

Conclusion We have extended previous work on the effects of reverberation on aided speech recognition to children with SNHL. Variations in release time did not impact the understanding of speech. An index of audibility best predicted nonsense syllable recognition in reverberation and, clinically, these results suggest that patients with less audibility are more susceptible to the negative effect of reverberation on nonsense syllable recognition.



Introduction

Purpose

Improving the ability of adults and children with SNHL to communicate is of paramount importance. It is well established that adults and children with SNHL correctly recognize fewer words than adults and children with “normal” hearing (NH), even with the provision of hearing aid amplification.[1] [2] Children, regardless of hearing status, recognize fewer words in sentences than adults in conditions with noise or reverberation.[2] [3] Most environments encountered by adults with hearing loss have a favorable signal-to-noise ratio (SNR) with minimal reverberation, but listening situations rated as difficult include those with a less favorable SNR and greater reverberation.[4] [5] [6] [7] Reverberation is a particularly problematic environment for both adults and children, in that their ability to correctly repeat back spoken words degrades.[3] [8] This negative effect of reverberation on understanding is particularly problematic for children, who commonly receive academic instruction in reverberant environments,[9] but it is also problematic for adults who wish to communicate in restaurants or understand a sermon at a house of worship.[7] [10] [11] Consequently, improving the ability of children and adults with SNHL to correctly perceive speech in environments with reverberation is a necessary step toward improving their outcomes in real-world listening environments. Toward that end, this experiment examined the influence of simulated room reverberation and three different hearing aid release times for amplitude compression amplification on speech recognition and indices of audibility, temporal distortion, and spectral distortion. Predictability of speech recognition from the indices was also assessed. Participants were children and adults with SNHL.



Effects of Amplitude Compression Parameters on Acoustic Cues and Speech Recognition

Amplitude compression is a signal processing method used by hearing aids and different amplitude compression parameters have been documented to influence acoustic cues used to recognize speech. The attack and release times refer to the time required for the amplitude compression circuit to adjust to increases and decreases, respectively, in the input level. The compression ratio is the change in output level that occurs as the input level varies, with higher compression ratios resulting in smaller changes in output level. Input levels above the knee point are compressed and input levels below the knee point are linearly or expansively amplified.[12] Higher compression ratios in conjunction with fast attack and release times and greater numbers of channels improve audibility for low-input levels, maintain loudness comfort for high-input levels, and improve the SNR for negative SNRs. However, higher compression ratios are also associated with increased temporal and spectral distortion of the speech signal.[13] [14] [15] [16] [17] [18] These temporal and spectral distortions include reduced modulation depth,[14] increased correlation of independent sound sources,[19] reduced spectral contrast,[15] and decreased SNR for positive SNRs.[17] [20]
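As an illustrative sketch of the static input/output behavior described above (not code from any of the cited studies), the Python function below maps an input level to an output level for a single channel. The function name, the gain value, and the choice of linear rather than expansive amplification below the knee point are assumptions made for the example.

```python
def output_level_db(input_db, gain_db, knee_db, ratio):
    """Static input/output function for one compression channel.

    Assumptions for this sketch: levels are in dB SPL, amplification is
    linear (not expansive) below the knee point, and no output limiting
    is applied.
    """
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    if input_db <= knee_db:
        # Below the knee point: every 1 dB of input gives 1 dB of output.
        return input_db + gain_db
    # Above the knee point: every `ratio` dB of input gives 1 dB of output.
    return knee_db + gain_db + (input_db - knee_db) / ratio


if __name__ == "__main__":
    for level in (40.0, 50.0, 70.0, 90.0):
        print(level, output_level_db(level, gain_db=20.0, knee_db=50.0, ratio=2.0))
```

With a 50 dB SPL knee point and a 2:1 ratio, a 20 dB rise in input above the knee point yields a 10 dB rise in output, which is the sense in which higher ratios produce smaller changes in output level.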

Perhaps because of the complex relationship between amplitude compression parameters and acoustic cues, differing results between studies on the effect of different amplitude compression parameters on speech recognition have been observed. A common approach has been to systematically vary an amplitude compression setting, such as the compression ratio.[21] Typically, fast attack and release times were used, in conjunction with a fixed knee point. Sometimes the gain across frequency was set to a prescriptive procedure but otherwise a high and audible presentation level was used. For adults, such a procedure has been documented to degrade speech recognition for consonants, vowels, words, and sentences—both in quiet and in noise.[15] [21] [22] The lowest compression ratio that resulted in poorer performance varied across studies and ranged from 3:1 to 5:1.

A problem with studies that used a high knee point or compression ratio is that audibility and distortion varied across subjects in a way that does not normally occur with hearing aids fitted in clinical settings. Instead, the compression parameters are often set as specified by a prescriptive procedure, such as the National Acoustic Laboratories' nonlinear fitting procedure versions 1 and 2 (NAL-NL1, NAL-NL2),[23] CAM2,[24] or the Desired Sensation Level multistage input/output algorithm (DSL m[i/o]).[25] Knee points and compression ratios reported for CAM2, DSL m[i/o], and the nonlinear versions of NAL have ranged from 30 to 70 dB SPL and 1.0 to 3.1, respectively, with higher knee points and compression ratios being prescribed for children, higher frequencies, and greater hearing loss.[25] [26]

Another approach is to systematically vary the compression speed, but otherwise use knee points, compression ratios, and gains as prescribed by a prescriptive procedure.[2] [20] [27] [28] [29] [30] With such an approach, the use of a fast compression speed can improve speech recognition in quiet, especially for low-level inputs.[12] [31] In noise, Moore et al[32] observed better sentence recognition with fast than slow compression when using 8 or 16 channels of compression. No difference between compression speeds was observed when using four channels of compression. Paradoxically, Alexander and Masterson[20] observed better sentence recognition with fast compression for four channels but not for 8 or 16 channels. Several other studies did not observe a change in speech recognition between different compression speeds.[2] [27] [28] [29] Any number of methodological differences may have contributed to the varying results across studies, including the use of adaptive compression by Rallapalli and Alexander[28] or adjusting the gain prior to instead of after compression amplification.[27] Regardless of the reason for differing results across studies, it appears that fast compression when fit using a prescriptive procedure can improve speech recognition relative to a slower compression speed or linear amplification.

It is necessary to assess the effect of different compression speeds on speech recognition for children with SNHL. Children with SNHL have poorer temporal resolution,[33] children are more susceptible to temporal distortions than adults,[34] and children place greater reliance on dynamic acoustic cues.[35] Consequently, children with SNHL might be more susceptible to the temporal distortion introduced by a faster release time than adults with SNHL. As noted above, children with SNHL are prescribed higher compression ratios than adults with SNHL. The use of higher compression ratios for children with SNHL could have an unintended consequence. Specifically, the use of faster release times could have a greater decrement on speech recognition when using a prescriptive procedure derived for children.

While studies are mixed, only one study found a detriment of fast-acting compression relative to linear amplification for children's perception of acoustic patterns in speech.[36] Most studies observed better perception for a variety of speech stimuli, including sentences in quiet and words in quiet and noise.[37] [38] While Liu et al[39] observed better sentence recognition in noise with adaptive than fast-acting wide-dynamic range compression (WDRC), a difference in sentence recognition in noise was not observed between slow and fast compression by Brennan et al.[2] Possibly the use of a faster-acting WDRC or adaptive compression by Liu et al contributed to the divergent result relative to Brennan and colleagues.



Effect of Reverberation

Due to the importance placed on being able to communicate in reverberant environments,[7] [9] it is necessary to understand the effect of different release times on speech recognition in reverberation. Reverberation results in temporal smearing of both the desired signal and any undesired noise, resulting in decreased modulation depth of both speech and noise[40] and increased temporal distortion.[41] Reverberation also results in a smaller difference in SNR between different release times.[16]

Several studies have examined the influence of different amplitude compression parameters on speech recognition in reverberation.[16] [28] [29] [41] [42] [43] [44] Only Rallapalli and Alexander[28] and Reinhart et al[44] included noise and none examined children. The studies by Reinhart and colleagues examined the recognition of nonsense syllables and sentences in simulated rooms with different reverberation times (0, 0.5, 1, 2, and 4 s). They used a hearing aid simulator with six frequency filter bands, 45 dB SPL kneepoint threshold, 2.1 or 3:1 compression ratio, 10 milliseconds attack time, and four different release times (12, 90, 800, and 1,500 ms). Participants recognized more words[41] and sentences[43] in reverberation as release time was increased. In contrast to the findings of Reinhart and colleagues, studies by Novick et al,[29] Shi and Doherty,[42] and Rallapalli and Alexander[28] observed that sentence recognition in several reverberant environments did not differ for the tested amplitude compression release times (40, 90, 160, 320, 640, and 1,500 ms). Although Rallapalli and Alexander used a fixed 2:1 compression ratio, the compression knee points were set to the levels of the long-term average speech spectrum, and Novick et al and Shi and Doherty set the compression knee points and ratios to either the manufacturer's (Oticon, Smørum, Denmark) proprietary algorithm or to the NAL-NL1 prescription.

Thus, it appears that studies to date that examined the effect of reverberation and used a common clinical procedure observed no difference in the ability to correctly repeat sentences across a variety of release times. In contrast, studies that used a high knee point and compression ratio observed an effect of the release time on speech recognition. However, prescriptive procedures, such as DSL m[i/o], prescribe lower knee points and higher compression ratios for children,[25] and children are more susceptible to temporal distortions than adults.[34] Consequently, even when fit using a prescriptive procedure, an effect of a faster release time on speech recognition in a reverberant environment may be more apparent for children with SNHL compared with previous research with adults.



Indices of Audibility, Temporal Distortion, and Spectral Distortion

Due to the complex relationship of amplitude compression parameters on acoustic cues and speech recognition, there is interest in developing indices of audibility, temporal distortion, and spectral distortion that correlate with behavioral data.[45] [46] [47] The validation of such indices would be useful for two reasons. First, such indices could assist clinicians in selecting appropriate release times for their patients. Second, these indices can be used to assess the contribution of differing effects of audibility, temporal distortion, and spectral distortion to changes in speech recognition with reverberation and different release times.

Procedures include indices of audibility, temporal distortion, and SNR.[19] [45] [48] [49] Indices of audibility, such as the speech intelligibility index (SII), can predict differences in speech recognition across listeners in quiet and noisy conditions and can account for variations in speech recognition between different amplitude compression settings.[50] To accurately predict speech recognition in reverberation, indices that quantify temporal distortion, such as the envelope difference index (EDI) or Hearing-Aid Speech Perception Index (HASPI), are hypothesized to better predict performance than indices that do not quantify temporal distortion. The EDI quantifies temporal distortion to the waveform envelope, with higher values indicating greater temporal distortion. Along with temporal distortion, the HASPI also quantifies spectral distortion and audibility of the speech signal. In general, increased audibility increases HASPI values; however, changes to the temporal envelope or nonlinear deviations from the original spectrum decrease the values. Unlike the SII, HASPI does not account for the reduction in speech recognition associated with excessively high presentation levels (i.e., level distortion factor). Consequently, regardless of the output level or degree of hearing loss, the HASPI predicts 100% recognition of speech once a sufficient SNR is achieved.[51]

Without considering the effects of reverberation, the EDI and HASPI predict poorer speech recognition with faster release times relative to slower release times.[13] [45] [52] Both the EDI and HASPI can accurately predict changes in speech recognition for different amplitude compression parameters.[20] [45] [46] However, Alexander and Masterson[20] also observed that the amplitude compression parameters that resulted in the best sentence recognition (four channels with a fast release time) did not result in the most favorable EDI. EDI and HASPI values decrease in the presence of reverberation due to temporal distortion and HASPI values appear to accurately predict decreased speech recognition with longer reverberation times.[41] [53] [54] The decreased nonsense syllable recognition with faster release times and longer reverberation times observed by Reinhart et al[41] was accurately captured by the EDI. Most studies that evaluated the EDI or HASPI used fixed amplitude compression parameters[41] [45] [46] [54] or did not measure speech recognition in reverberation with amplitude compression.[52] [53] Only Alexander and Masterson[20] used amplitude compression parameters set to a prescriptive procedure. Thus, it is unclear the extent to which predictions of speech recognition with the EDI or HASPI extend to conditions with knee points and compression ratios set based on a prescriptive procedure.



Summary, Purpose, and Hypotheses

The first purpose of this study was to document, using hearing aid and reverberation simulators, the effect of different release and reverberation times on the recognition of nonsense syllables for children and adults with SNHL. The following hypotheses were made:

  1. Nonsense-syllable recognition will increase as the reverberation time is decreased.

  2. Due to less temporal distortion, nonsense-syllable recognition will increase with a slower release time.

  3. Because the pediatric version of DSL m[i/o] prescribes higher compression ratios,[25] the difference in nonsense-syllable recognition by release time will be larger with the pediatric than with the adult version of DSL m[i/o].

  4. Because children rely more on acoustic cues than adults,[34] children will show a greater benefit with a slower release time than the adult participants.

The second purpose was to determine the contributions of audibility, temporal distortion, and spectral distortion to changes in nonsense syllable recognition with changes in the release and reverberation times. It was hypothesized that an index that incorporated audibility, temporal distortion, and spectral distortion (HASPI) would better predict nonsense-syllable recognition than indices that only incorporated audibility (SII) or envelope distortion (EDI).



Method

Participants

The data were collected at Boys Town National Research Hospital under approval from the Institutional Review Board. Assent or consent was obtained, and participants received $15 an hour, with a typical 1- to 2-hour study visit duration. Participants underwent otoscopy followed by pure-tone threshold audiometry at all octave frequencies and 6 kHz—following the procedures of American Speech-Language-Hearing Association.[55] Participants were split into two groups: 20 children with SNHL (age in years: range = 6–17, mean = 11) and 16 adults with SNHL (age in years: range = 51–68, mean = 62). Audiometric thresholds are shown in [Fig. 1]. A repeated measures analysis of variance indicated that audiometric threshold differed significantly by test frequency (F[6,476] = 9.4, p < 0.001) but did not differ significantly by test ear (F[1,476] = 0.04, p = 0.845) or age group (F[1,476] = 0.2, p = 0.660) and none of the higher order interactions were significant (p > 0.05). Age of identification for the children ranged from birth to 7 years (mean = 2.3 years) and all the children with SNHL were hearing aid users. In the adult group, age of onset for hearing loss ranged from birth to 63 years (mean = 42 years) and eight of the adults were hearing aid users. All used spoken English as their primary language and all of the children were in mainstream classrooms.

Fig. 1 Audiometric thresholds for the participants. Age group is indicated by each title. For this and remaining box plots, box boundaries represent the 25th and 75th percentiles, and error bars represent 2.7 standard deviations or the most extreme value that is not an outlier, whichever is lower. Horizontal lines represent the medians, and pluses represent outliers (greater than 2.7 standard deviations).


Equipment and Stimuli

Equipment consisted of a double-walled sound-attenuating room (where all experimental testing took place), Knowles Electronics Manikin for Acoustic Research (KEMAR) with an IEC 711 coupler (GRAS Sound & Vibration, Holte, Denmark), personal computer, RME Babyface sound card (Haimhausen, Germany), PreSonus HP4 headphone distribution amplifier (Baton Rouge, LA), and Sennheiser HD-25 (Wedemark, Germany) headphones.

Stimuli were 693 consonant-vowel-consonant nonsense syllables from McCreery and Stelmachowicz[56] with mean duration of 0.74 seconds (range = 0.51–1.1). Following the procedure of Reinhart et al,[41] these nonsense syllables were convolved with impulse responses for a 5.7 m by 4.3 m by 2.3 m room with the sound source 1.4 m from the subject. The impulse responses were derived using the real-time spatial audio processor (Spat, https://forum.ircam.fr/projects/detail/spat/) implemented in the Max 6 programming language. The reverberation was simulated with two layers. The first used an image-source model calculation to present first- and second-order reflections. The second layer used a randomized comb filter tail to generate the decay of the late reflections. These two layers were combined to simulate reverberation times of 0, 0.5, and 1 second. The 0-second reverberation time condition corresponds to an anechoic chamber, and these times encompass the range of reverberation times measured previously in classrooms.[57] [58] [59]
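The convolution step can be illustrated with a short Python sketch. It is not the Spat processing chain: instead of an image-source model and comb filter tail, it substitutes a synthetic impulse response made of a direct path followed by random noise whose amplitude decays by 60 dB over the reverberation time. The function names, the 0.1 tail scaling, and the sample rate in the example are all assumptions.

```python
import random


def synthetic_impulse_response(rt60_s, fs_hz, seed=0):
    """Toy impulse response: direct path plus exponentially decaying noise.

    A reverberation time of 0 s returns a unit impulse (anechoic case).
    """
    if rt60_s <= 0:
        return [1.0]
    rng = random.Random(seed)
    n_samples = int(rt60_s * fs_hz)
    ir = [1.0]  # direct sound
    for i in range(1, n_samples):
        t = i / fs_hz
        # Amplitude factor 10**-3 equals -60 dB at t == rt60_s.
        decay = 10.0 ** (-3.0 * t / rt60_s)
        ir.append(0.1 * decay * rng.uniform(-1.0, 1.0))
    return ir


def convolve(signal, ir):
    """Direct-form linear convolution of two sequences."""
    out = [0.0] * (len(signal) + len(ir) - 1)
    for i, s in enumerate(signal):
        if s == 0.0:
            continue
        for j, h in enumerate(ir):
            out[i + j] += s * h
    return out


if __name__ == "__main__":
    dry = [0.0, 1.0, 0.5, 0.0]
    wet = convolve(dry, synthetic_impulse_response(0.5, fs_hz=100))
    print(len(dry), len(wet))
```

The reverberant output is longer than the dry input by the length of the impulse response minus one sample, which is the temporal smearing that fills the gaps between speech sounds.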



Amplification

Stimuli were amplified bilaterally using a previously described hearing aid simulator implemented in MATLAB 2015b.[2] Thresholds in hearing level were converted to sound pressure level using a conversion factor for KEMAR and entered into the DSL m[i/o] program. The stimulus used to measure hearing aid output consisted of the “carrot passage” from the Verifit electroacoustic system (Audioscan, Dorchester, Ontario) set to the long-term average speech spectrum (LTASS),[60] with an overall level of 60 dB SPL. Hearing aid output was measured for one-third octave bands.[61] Hearing aid output was estimated by first measuring the one-third octave band output levels with the hearing aid simulator (i.e., the input levels to the headphones) and then adding a previously measured conversion factor. This process estimated the hearing aid output levels for Sennheiser HD-25 headphones attached to KEMAR. When DSL m[i/o] prescribed less output than the input level, the output was instead set to the LTASS. Otherwise, the estimated output for the LTASS on KEMAR was set to within 5 dB of the DSL m[i/o] target from 250 to 8,000 Hz for most participants. Hearing aid output was set individually for each ear, using the pediatric and adult versions of the algorithm for the children and adults with SNHL, respectively. Due to the severity of hearing loss for several participants and because gain was limited to 65 dB, output deviated by more than 5 dB for some of the participants. The mean absolute fit-to-target differences were 2.7, 2.2, 1.0, 0.7, 1.1, 0.4, and 2.4 dB for 0.25, 0.5, 1, 2, 4, and 8 kHz, respectively.

Following a similar procedure to that of Reinhart et al,[41] three different release times (12, 90, and 1,200 milliseconds) were used. The attack time was always 10 milliseconds. The compression ratio used in each filterbank channel was that prescribed by DSL m[i/o], and the mean compression ratios for the frequencies 0.5, 1, 2, and 4 kHz were, respectively, 1.2, 1.3, 1.4, and 1.6 (adults) and 1.5, 1.8, 1.9, and 1.9 (children). Otherwise, the hearing aid settings were the same as those used by Brennan et al[2] and included the same eight-channel filterbank and 10:1 output limiter.
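The role of the attack and release times can be sketched with a simple level tracker. This is a minimal illustration, not the hearing aid simulator used in the study: the one-pole smoothing, the treatment of each time as an exponential time constant, and the function name are assumptions, and standardized definitions of attack and release time differ from this simplification.

```python
import math


def smooth_level(levels_db, fs_hz, attack_ms, release_ms):
    """Track a level contour (dB) with separate attack and release smoothing.

    The attack coefficient is used while the input exceeds the current
    estimate; the release coefficient is used otherwise.
    """
    a_att = math.exp(-1.0 / (fs_hz * attack_ms / 1000.0))
    a_rel = math.exp(-1.0 / (fs_hz * release_ms / 1000.0))
    estimate = levels_db[0]
    tracked = []
    for x in levels_db:
        coeff = a_att if x > estimate else a_rel
        estimate = coeff * estimate + (1.0 - coeff) * x
        tracked.append(estimate)
    return tracked


if __name__ == "__main__":
    # Hypothetical contour sampled at 1 kHz: 50 dB, a burst at 80 dB, then 50 dB.
    contour = [50.0] * 100 + [80.0] * 500 + [50.0] * 200
    for release_ms in (12, 90, 1200):
        tracked = smooth_level(contour, 1000, attack_ms=10, release_ms=release_ms)
        # Level estimate 100 ms after the burst ends.
        print(release_ms, round(tracked[699], 2))
```

With the shortest release time the level estimate (and therefore the gain) recovers within a few tens of milliseconds after the burst, whereas with the longest release time the estimate is still close to the burst level 100 ms later.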



Procedure

The nonsense syllables were presented bilaterally at 60 dB SPL to the input of the hearing aid simulator. The presentation level was calibrated in a 6 cc coupler to the root-mean-square of the concatenated nonsense syllables. For each participant, 450 nonsense syllables were randomly assigned to one of nine conditions (three reverberation times × three release times), for 50 nonsense syllables per condition. An additional 10 randomly selected nonsense syllables were used for practice. Practice was always with the 0-second reverberation time. Practice for each participant was conducted using a randomly chosen release time. The examiner, who was the third author, was seated adjacent to the participant. Responses for each nonsense syllable were then scored as incorrect or correct by the examiner. Feedback was not provided to the participants and the order of conditions was randomized for each participant.
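The stimulus assignment described above can be sketched as follows. The function name and seed handling are hypothetical, and the sketch assumes that syllables were drawn without replacement, that practice items did not overlap with test items, and that conditions were presented as blocks in random order; the text does not state these details.

```python
import random


def assign_conditions(n_stimuli=693, n_practice=10, per_condition=50, seed=None):
    """Randomly assign stimulus indices to the nine listening conditions."""
    rng = random.Random(seed)
    reverb_times_s = [0.0, 0.5, 1.0]
    release_times_ms = [12, 90, 1200]
    conditions = [(rt, rel) for rt in reverb_times_s for rel in release_times_ms]

    # Draw 10 practice + 9 x 50 test syllables without replacement.
    n_needed = n_practice + per_condition * len(conditions)
    ids = rng.sample(range(n_stimuli), n_needed)
    practice = ids[:n_practice]
    test_ids = ids[n_practice:]

    blocks = {
        cond: test_ids[i * per_condition:(i + 1) * per_condition]
        for i, cond in enumerate(conditions)
    }

    # Randomize the order in which the nine conditions are presented.
    order = list(conditions)
    rng.shuffle(order)
    return practice, blocks, order


if __name__ == "__main__":
    practice, blocks, order = assign_conditions(seed=1)
    print(len(practice), len(blocks), order[0])
```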



Speech Intelligibility Index, Envelope Difference Index, and Hearing Aid Speech Perception Index

Using MATLAB 2019a, the SII, EDI, and HASPI were calculated for each individual stimulus, within each reverberation and release time condition. The maximum value across the two ears was selected and then averaged across stimuli to compute an overall mean for each subject. For SII, participant thresholds in dB SPL were linearly interpolated to the center frequencies for one-third octave filters, adjusted to account for the internal noise spectrum, and transformed to one-third octave band levels. The SII was then calculated following the one-third octave band procedure, using the standard band importance function and level distortion factor, described by the American National Standards Institute.[48] The EDI was calculated following the procedure detailed by Jenstad and Souza.[45] Each stimulus was rectified, low-pass filtered using a Butterworth filter at 50 Hz, downsampled to 1 kHz, and scaled by the mean amplitude. The absolute difference in amplitude between the compressed and uncompressed stimulus was calculated for each sample, summed across samples, and then divided by twice the total number of samples. The uncompressed stimulus was each nonsense syllable amplified with linear amplification in the 0-second reverberation time condition. HASPI version 2 values were calculated using MATLAB code provided by the developers.[54] The reference stimulus was each unaided nonsense syllable (60 dB SPL) for the 0-second reverberation time. SII, EDI, and HASPI values can range from 0 to 1. Higher values with the SII and HASPI indicate better audibility and intelligibility, respectively. Lower values with the EDI indicate less temporal distortion.
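The final step of the EDI calculation can be sketched in Python as below. This is an illustration rather than the authors' MATLAB code: it assumes the two envelopes have already been extracted (rectified, low-pass filtered, and downsampled) and are time-aligned and equal in length, and the function name is hypothetical.

```python
def envelope_difference_index(env_reference, env_test):
    """EDI from two pre-extracted, time-aligned amplitude envelopes.

    Each envelope is scaled by its own mean amplitude; the summed absolute
    sample-by-sample difference is then divided by twice the number of
    samples, so the result lies between 0 (identical) and 1.
    """
    n = len(env_reference)
    if n == 0 or n != len(env_test):
        raise ValueError("envelopes must be non-empty and equal in length")
    mean_ref = sum(env_reference) / n
    mean_test = sum(env_test) / n
    if mean_ref == 0 or mean_test == 0:
        raise ValueError("envelopes must have a non-zero mean")
    ref = [v / mean_ref for v in env_reference]
    test = [v / mean_test for v in env_test]
    return sum(abs(a - b) for a, b in zip(ref, test)) / (2 * n)


if __name__ == "__main__":
    # Hypothetical envelopes: the second is a flattened version of the first.
    reference = [0.1, 0.8, 1.0, 0.3, 0.1]
    flattened = [0.3, 0.6, 0.7, 0.4, 0.3]
    print(envelope_difference_index(reference, flattened))
```

Because each envelope is scaled by its mean, a uniform change in gain leaves the index at 0; only changes in envelope shape, such as reduced modulation depth, raise it.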



Analyses

Outcome variables included proportion correct nonsense syllable recognition, SII, EDI, and HASPI values. A series of linear mixed models with random intercepts for each subject were then completed to answer each of the following research questions:

  • (1) Does release time, reverberation time, or age group affect nonsense syllable recognition?

  • (2) Do indices that incorporate audibility, temporal distortion, and spectral distortion (HASPI) better predict, relative to audibility (SII) or temporal distortion alone (EDI), individual changes in nonsense syllable recognition for different release or reverberation times?

As recommended by Richardson,[62] the order of conditions was included in each statistical model; however, for simplicity, it is not otherwise reported. Model fits and statistical significance were evaluated by comparing Akaike's information criterion (AIC) and computing the χ2 change, respectively. The AIC is a goodness-of-fit measure that accounts for the number of parameters in a model, with smaller values representing better fit. The χ2 change was computed by subtracting the log-likelihood of the statistical model with values from an index (e.g., HASPI) from that of a statistical model without the values from an index. For simplicity, only significant interactions were reported. For reviews and examples of the application of linear mixed-modeling in speech and hearing science, see Oleson et al[63] and Walker et al.[64]
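The two model-comparison quantities can be sketched in Python as follows. This is a generic illustration under stated assumptions, not the authors' analysis code: it uses the conventional likelihood-ratio form (twice the difference in log-likelihoods), the function names are hypothetical, the log-likelihoods in the example are made up, and the closed-form p-value shown is valid only for an even number of degrees of freedom.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike's information criterion; smaller values indicate better fit."""
    return 2 * n_params - 2 * log_likelihood


def chi2_change(loglik_with_index, loglik_without_index):
    """Likelihood-ratio statistic for two nested models."""
    return 2.0 * (loglik_with_index - loglik_without_index)


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability, closed form for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    total = sum(half ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * total


if __name__ == "__main__":
    # Hypothetical log-likelihoods for nested models differing by 6 parameters.
    ll_without, ll_with = 300.0, 312.0
    stat = chi2_change(ll_with, ll_without)
    print(stat, chi2_sf_even_df(stat, 6))
    print(aic(ll_without, 10), aic(ll_with, 16))
```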



Results

[Fig. 2] depicts proportion correct nonsense syllable recognition for the adults (top panel) and children (bottom panel). The potential effects of reverberation time, release time, and age group were evaluated using a linear mixed effects model, and the results are shown in [Table 1]. Relative to the 0-second reverberation time condition, mean nonsense syllable recognition for the adults decreased significantly by 0.13 (SD = 0.07) and 0.14 (SD = 0.06) for the 0.5- and 1-second reverberation time conditions, respectively. Nonsense syllable recognition did not change significantly from the 12-millisecond release time to the 90- or 1,200-millisecond release times (p = 0.971 and 0.456, respectively) and did not differ significantly between the child and adult participants (p = 0.671). While not hypothesized, mean nonsense syllable recognition from the 0- to 0.5- and 0- to 1-second reverberation time decreased by less for the children (M = 0.7) than the adults, and these smaller decreases in nonsense syllable recognition with the addition of reverberation for the children relative to the adults were statistically significant (p ≤ 0.045). None of the interactions of the reverberation time with release time conditions were significant.

Table 1

Linear mixed effect model evaluating effects of age group, reverberation time, and release time on speech recognition. Correct recognition of nonsense syllables decreased in reverberation.

Main effects and interactions | Estimate | Standard error | t-Value | p-Value
Intercept | 0.621 | 0.047 | 13.285 | <0.001
Adult vs. child | 0.026 | 0.062 | 0.425 | 0.671
0- vs. 0.5-s reverberation time | 0.149 | 0.026 | 5.667 | <0.001
0- vs. 1-s reverberation time | 0.162 | 0.026 | 6.172 | <0.001
12- vs. 90 ms release time | −0.001 | 0.026 | −0.036 | 0.971
12- vs. 1,200 ms release time | −0.020 | 0.026 | −0.746 | 0.456
Child × 0.5-s reverberation time | 0.071 | 0.035 | 2.015 | 0.045
Child × 1-s reverberation time | 0.086 | 0.035 | 2.455 | 0.015

Abbreviations: Bold, p < 0.05; NH, “normal” hearing; SNHL, sensorineural hearing loss.


Fig. 2 Nonsense syllable recognition for the adults (top panel) and children (bottom panel). The release time is indicated by the legend. Nonsense syllable recognition decreased from the 0-second to the 0.5- and 1-second reverberation times. Box plots are shown in the same manner as in [Fig. 1].

[Table 2] contains the bivariate relationships of the predictor and independent variables. Due to the lack of a significant effect of the release time condition, index values and nonsense syllable recognition were averaged across the three release times. Nonsense syllable recognition was significantly correlated with SII and HASPI for all three reverberation time conditions, but there was no relationship with the EDI. [Fig. 3] compares the SII, EDI, and HASPI values to nonsense syllable recognition. Lines depict the predicted nonsense syllable recognition, generated from each linear mixed model. SII, EDI, and HASPI values were similar for the adult and child participants. While SII values did not differ by reverberation time, HASPI values decreased with the addition of reverberation by a mean of 0.27 and EDI values increased with reverberation by a mean of 0.16. EDI and HASPI values varied by release time, with a maximum mean change of 0.05.

Fig. 3 Relationship of audibility, temporal distortion, and spectral distortion to nonsense syllable recognition. Each column depicts the 0-, 0.5-, 1-second reverberation time conditions, respectively. Each row depicts the raw values for the Speech Intelligibility Index (SII), Envelope Difference Index (EDI), and Hearing-Aid Speech Perception Index (HASPI), respectively. Child participants are represented by the diamond symbols and adults by the circle symbols. Release time is indicated by the insert. Lines indicate the linear-mixed effect model prediction for nonsense syllable recognition. Increases in the SII were associated with increases in nonsense syllable recognition.
Table 2 Correlation matrix

         0-s reverberation time        0.5-s reverberation time      1-s reverberation time
         Score     SII       EDI       Score     SII       EDI       Score     SII       EDI
SII      0.872***                      0.851***                      0.835***
EDI      0.290     0.362*              0.164     0.257               0.063     0.050
HASPI    0.713***  0.679***  0.350*    0.850***  0.868***  0.224     0.832***  0.878***  0.012

Abbreviations: Bold, p < 0.05; EDI, Envelope Difference Index; HASPI, Hearing-Aid Speech Perception Index; Score, nonsense syllable recognition; SII, Speech Intelligibility Index. Note: Correlations are shown by reverberation time condition. *p < 0.05; ***p < 0.001.


Models were compared by computing the χ2 change, based on the difference between the log-likelihood of a linear mixed-effects model with the SII, EDI, or HASPI and that of a linear mixed-effects model without the additional predictor variable. Because there was not a significant effect of release time, release time was not included as a fixed effect. Including the SII significantly improved the fit of the model (χ2[6] = 77.4, AIC = −666.2, p < 0.001). As SII increased, nonsense syllable recognition increased by a model-estimated 0.155 for each 0.1-unit increase in SII (95% CI: 0.102–0.209, p < 0.001). There was also a significant main effect of reverberation time and a significant interaction of the 0.5-second reverberation time with SII (95% CI: 0.034–0.772, p = 0.032). The significant interaction was due to the slope between SII and nonsense syllable recognition being significantly steeper for the 0.5-second than for the 0-second reverberation time. Specifically, participants with a lower SII had a larger decrease in nonsense syllable recognition from the 0- to the 0.5-second reverberation time than participants with a higher SII. None of the other interactions were significant. In contrast to the SII values, nonsense syllable recognition did not vary with individual variations in EDI values, and including the EDI did not significantly improve the fit of the model (χ2[6] = 5.5, AIC = −594.2, p = 0.483).

While including the HASPI (χ2[6] = 24.9, AIC = −613.6, p < 0.001) significantly improved the fit of the model relative to the model without the HASPI, the resultant AIC was higher (poorer) than that of the model with SII. As HASPI values increased, nonsense syllable recognition increased by a model-estimated 0.158 for each 0.1-unit increase in HASPI (95% CI: 0.061–0.332, p = 0.005). However, as observed in [Fig. 3], nonsense syllable recognition increased with HASPI values only when proportion correct was less than approximately 0.5. None of the other main effects or interactions were statistically significant.
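The model comparisons reported above can be illustrated with a short Python sketch. It assumes the standard likelihood-ratio form, in which the χ2 change is twice the difference in log-likelihood between the full and reduced models, and it uses the closed-form upper-tail probability of the chi-square distribution that holds for even degrees of freedom (here, 6). The function names and the dictionary of reported values are illustrative only and are not taken from the authors' analysis scripts.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Accumulate sum_{i=0}^{df/2 - 1} (x/2)^i / i!
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def likelihood_ratio_chi2(loglik_reduced, loglik_full):
    """Chi-square change between nested models (assumed standard form: 2 * difference)."""
    return 2.0 * (loglik_full - loglik_reduced)


# (chi-square change, AIC) as reported in the text for each added predictor.
reported = {"SII": (77.4, -666.2), "EDI": (5.5, -594.2), "HASPI": (24.9, -613.6)}

# p value for each chi-square change on 6 degrees of freedom.
p_values = {name: chi2_sf_even_df(chi2, 6) for name, (chi2, _) in reported.items()}

# A lower AIC indicates the better-fitting model.
best_by_aic = min(reported, key=lambda name: reported[name][1])

for name, p in p_values.items():
    print(f"{name}: p = {p:.3g}")
print("Lowest AIC:", best_by_aic)
```

Under these assumptions, the p value recomputed for the EDI falls close to the reported 0.483 (small differences are expected because the χ2 value is rounded), and the model with the SII has the lowest AIC.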



Discussion

The present study evaluated the effect of three different reverberation (0, 0.5, 1 s) and release (12, 90, 1,200 ms) times on nonsense syllable recognition for children and adults. Contributions of audibility (SII), temporal distortion (EDI), and combined audibility, temporal distortion, and spectral distortion (HASPI) to nonsense syllable recognition were estimated. These effects were evaluated because reverberation is known to be challenging for listeners[3] and fast release times are argued to impair perception in reverberation.[41] Prior work has focused on the effects of reverberation and release time for adults; however, children commonly receive academic instruction in reverberant environments and children might be more susceptible to temporal distortion. This work extended prior work by documenting the contributions of audibility, temporal distortion, and spectral distortion to individual differences in nonsense syllable recognition in reverberation.

For these participants with SNHL, nonsense syllable recognition was 11 percentage points lower with reverberation than without, and this difference was statistically significant. Children with SNHL were less affected by reverberation than adults with SNHL. There was no significant effect of release time on nonsense syllable recognition for either the child or adult participants, nor was there a significant interaction of reverberation time with release time. Individual differences in audibility (SII) strongly contributed to individual differences in nonsense syllable recognition, and individuals with less audibility experienced a larger decrement in nonsense syllable recognition from the 0- to the 0.5-second reverberation time condition.

Reverberation

The findings of the present study are consistent with previous work suggesting that listeners with SNHL are significantly impacted by reverberation.[8] [65] Nonsense syllable recognition decreased by approximately 0.14 for the adults but only by 0.09 for the children. While children are more susceptible to temporal distortion during a psychoacoustic task,[34] this finding seemingly does not support the notion that children with SNHL are more susceptible to temporal distortion when perceiving speech with reverberation. Here we consider the possible contributions of audibility and presbycusis. Because higher audibility positively impacts speech recognition in reverberation,[8] greater audibility with amplification for the children may have reduced the negative impact of reverberation and allowed for greater access to the speech signal, even in the presence of reverberation. However, the mean SII was not higher for the children than for the adults. Nor was this age difference predictable from the EDI or HASPI, as the mean change in EDI and HASPI values with reverberation was similar for the children and adults. Instead, aging is known to cause systemic changes, including less inhibition throughout the auditory system, undersampling due to loss of neurons, and less regulation of outer hair cell function.[66] Changes in the neural coding of temporal and spectral information with age have been observed for simple stimuli, including synthesized /da/, /b/, and /p/.[67] [68] Some of these changes in neural coding are thought to underlie deficits in echoic memory and contribute to poorer recognition of phonemes for older adults.[67] [69] Possibly, these effects of aging on the auditory system contributed to the older adults showing a greater decline in performance with reverberation than the children. Replication of this effect is necessary, however, before these findings can be extrapolated to the general population.

The changes in nonsense syllable recognition as a function of the reverberation condition were paralleled by the changes in the EDI and HASPI values. Specifically, nonsense syllable recognition and HASPI values decreased, and EDI values increased, from the 0- to the 0.5-second reverberation time condition, but all three showed minimal difference between the two reverberant conditions. In contrast, as expected, the SII did not vary across the three reverberation time conditions. Reverberation is known to temporally and spectrally smear the individual phonemes in speech. Therefore, the EDI appears to capture these temporal distortions and the HASPI appears to capture both the temporal and spectral distortions caused by reverberation. The decrease in HASPI values with reverberation observed in this study is consistent with the HASPI measurements made by Muralimanohar et al.[53] A modification of the SII that accounts for early relative to late reflections may have better captured individual differences in the relationships between audibility and speech recognition.

Although the changes in EDI and HASPI values between the different reverberation conditions coincided with the changes in nonsense syllable recognition, the SII best predicted individual nonsense syllable recognition. Increases in SII values corresponded with increased nonsense syllable recognition for each reverberation time condition. In contrast, individual variability in EDI values did not correspond to individual variability in nonsense syllable recognition in any of the three reverberation time conditions. Because HASPI values were at or near ceiling (>0.8) for the 0-second reverberation time condition, HASPI values overpredicted performance for this condition. This overprediction may have been due to the lack of a level distortion factor. Even for the conditions with reverberation, individual changes in HASPI values did not correspond well to individual differences in nonsense syllable recognition, except perhaps for the participants with nonsense syllable recognition less than approximately 0.5. A more consistent relationship of HASPI values to sentence recognition was observed by Kates and Arehart.[54] Numerous methodological differences could account for these disparate results. These include the fact that the HASPI was updated to better model their reverberation data and, consequently, would not be expected to model a new set of data as well, and the use by Kates and Arehart of a reverberation time that was longer by 2 seconds. To better account for the nonlinear relationship of HASPI values to nonsense syllable recognition, we considered log transforming the HASPI values. However, because one purpose of this study was to examine the predictive value of the HASPI relative to the SII and EDI, this transformation was not completed. These data, then, suggest that while indices of temporal or spectral distortion can capture the effects of reverberation on temporal and spectral cues, individual susceptibility to reverberation was best predicted by SII. Thus, the results presented herein are consistent with the notion that reduced audibility (due to greater hearing loss) is associated with a poorer ability to extract temporal and spectral cues from speech,[70] [71] making such individuals more susceptible to the deleterious effects of reverberation on temporal and spectral cues.



Release Time

In the present study, changing the release time of amplitude compression did not impact nonsense syllable recognition, and the effect of release time on this measure did not differ by reverberation time. This finding did not support the hypothesis that children are more susceptible than adults to temporal distortion created by a fast release time. Note that the children were fit to DSL-Child, which prescribed higher compression ratios than the DSL-Adult procedure used for the adults. To the extent that higher compression ratios negatively impact speech recognition, a larger effect of manipulating the release time on nonsense syllable recognition might have been expected for the children than for the adults. Instead, release time did not have an effect for either the child or the adult participants. Such a finding does not support the notion that the increased compression ratios associated with DSL-Child relative to DSL-Adult lead to greater temporal distortion, which, in turn, causes a greater negative effect of a fast release time on speech recognition. Instead, the changes in EDI and HASPI values from the slowest to the fastest release time were similar for the children and adults. Inconsistent with the effect of release time on nonsense syllable recognition, EDI values decreased and HASPI values increased as the release time increased. However, the changes in values were a maximum of 0.05 and, based on prior work,[45] not likely to impact perception. The changes in the EDI and HASPI values as the release time changed are consistent with the designs of the EDI and HASPI. Specifically, deviations from the original temporal envelope, especially for low and mid levels with the HASPI, are penalized.
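To make concrete how an envelope-based index penalizes deviations from the original temporal envelope, the following Python sketch computes an envelope difference index under a commonly used formulation: each envelope is scaled by its own mean, and the summed absolute difference is divided by twice the number of samples, giving a value between 0 (identical envelopes) and 1. This is an assumption for illustration and may differ in detail from the computation used in this study. Envelope extraction (for example, rectification and low-pass filtering) is omitted, and the sample envelopes are hypothetical.

```python
def envelope_difference_index(env_a, env_b):
    """Envelope difference between two equal-length, non-negative envelopes.

    Assumed formulation: scale each envelope to a mean of 1, then take
    sum(|a - b|) / (2 * N). Returns 0 for envelopes with identical shape.
    """
    if len(env_a) != len(env_b) or len(env_a) == 0:
        raise ValueError("envelopes must be non-empty and of equal length")
    mean_a = sum(env_a) / len(env_a)
    mean_b = sum(env_b) / len(env_b)
    if mean_a <= 0 or mean_b <= 0:
        raise ValueError("envelopes must have a positive mean")
    total = sum(abs(a / mean_a - b / mean_b) for a, b in zip(env_a, env_b))
    return total / (2 * len(env_a))


# Hypothetical envelopes: the second has reduced peak-to-valley contrast,
# as might occur when fast-acting compression flattens the envelope.
reference = [0.1, 1.0, 0.1, 1.0]
flattened = [0.4, 0.7, 0.4, 0.7]

print(envelope_difference_index(reference, reference))  # identical shape
print(envelope_difference_index(reference, flattened))  # reduced contrast
```

Because each envelope is scaled by its mean, an overall change in level alone does not change the index; only a change in envelope shape does.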

Although this experiment did not explicitly test the effect of varying the compression ratios on nonsense syllable recognition, the lack of a change in nonsense syllable recognition with release time when using individual compression ratios is consistent with extant work. Specifically, the data were consistent with previous work in adults,[29] [42] where the compression ratios were prescribed for each participant based on the degree of hearing loss, and contrasted with work that used the same high 2.1:1 or 3:1 compression ratio for all participants.[41] [44] While using the same compression ratio for all participants induces greater temporal and spectral distortions, leading to poorer SNR in reverberation[16] and greater variance in audibility across listeners, studies that have used compression ratios prescribed for each participant suggest that these distortions caused by a fast release time are insufficient to affect perception.

Additional considerations include the choice of stimuli, presentation level, and number of compression channels. The use of nonsense words may have limited the effect of changes in release time on audibility and distortion. Due to being longer in duration, sentences provide more opportunities to engage the compressor and therefore larger differences in performance might be expected. However, the fact that Shi and Doherty[42] and Novick et al[29] used sentences but did not observe a negative effect of fast compression suggests otherwise. Instead, their findings support the argument that, when using compression ratios set to a prescriptive procedure, the negative effect of a fast release time on perception is at best negligible.

Several studies have compared amplitude compression to linear amplification. As the presentation level decreased, larger increases in the correct recognition of words with amplitude compression amplification relative to linear amplification occurred.[12] [72] This benefit of amplitude compression amplification is presumably due to greater gain at lower relative to higher input levels, which in turn improves low-level audibility.[45] Consequently, differences in performance across various release times might be observed when using a lower presentation level than the 60 dB SPL presentation level used in the current study.
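The level dependence of gain described above can be sketched with a simple static input/output rule for a single compression channel. The sketch below is illustrative only: the linear gain, compression threshold, and compression ratio are hypothetical values, not the DSL m[i/o] prescriptions used in this study, and attack and release dynamics are ignored.

```python
def wdrc_gain_db(input_db, linear_gain_db=30.0, threshold_db=45.0, ratio=2.0):
    """Static gain (dB) of a single-channel compressor at a given input level.

    Below the compression threshold the gain is constant; above it, each
    1-dB increase in input yields only 1/ratio dB of additional output.
    All default parameter values are hypothetical.
    """
    if ratio < 1.0:
        raise ValueError("compression ratio must be at least 1")
    if input_db <= threshold_db:
        return linear_gain_db
    return linear_gain_db - (input_db - threshold_db) * (1.0 - 1.0 / ratio)


# A linear amplifier matched to the compressor's gain at a 60 dB SPL input.
linear_matched_gain = wdrc_gain_db(60.0)

for level in (40.0, 60.0, 80.0):
    extra = wdrc_gain_db(level) - linear_matched_gain
    print(f"{level:.0f} dB SPL input: compressor gain {wdrc_gain_db(level):.1f} dB, "
          f"{extra:+.1f} dB relative to the matched linear amplifier")
```

With these hypothetical settings, the compressor provides more gain than the matched linear amplifier for a 40 dB SPL input and less for an 80 dB SPL input, which is the mechanism by which compression is presumed to improve low-level audibility.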



Conclusion

The present study advances our understanding of the influence that reverberation and release time have on speech recognition. Relative to recent work, this study used compression ratios derived from the DSL m[i/o] prescriptive procedure. Due to the use of lower compression ratios, it was hypothesized that the negative effect of release time on performance that was previously observed would be smaller. Consistent with this view, an effect of the release time on proportion-correct recognition was not observed. While reverberation decreased nonsense syllable recognition, the lack of an interaction with release time indicates that a fast release time, when using a prescriptive procedure, should not impact speech recognition for children or adults with SNHL. The work herein also suggests that differences in audibility contributed the most to individual variability in nonsense syllable recognition and that reverberation has a larger negative impact for individuals with a lower SII value.



Conflict of Interest

None declared.

Acknowledgment

The authors thank Dawna Lewis for helpful discussion about the project; research assistants Sarah Garvey, Joslyn Parsons, and Manami Shah for assistance with data analyses; Joshua Alexander for providing the hearing aid simulator; and Tim Vallier for providing the room-reverberation simulator.

Authors' note

This work was presented at the Annual Meeting of the American Auditory Society in March of 2018.


Disclaimer

Any mention of a product, service, or procedure in the Journal of the American Academy of Audiology does not constitute an endorsement of the product, service, or procedure by the American Academy of Audiology.


  • References

  • 1 McCreery RW, Walker EA, Spratford M. et al. Speech recognition and parent ratings from auditory development questionnaires in children who are hard of hearing. Ear Hear 2015; 36 (Suppl. 01) 60S-75S
  • 2 Brennan M, McCreery R, Kopun J, Lewis D, Alexander J, Stelmachowicz P. Masking release in children and adults with hearing loss when using amplification. J Speech Lang Hear Res 2016; 59 (01) 110-121
  • 3 Wróblewski M, Lewis DE, Valente DL, Stelmachowicz PG. Effects of reverberation on speech recognition in stationary and modulated noise by school-aged children and young adults. Ear Hear 2012; 33 (06) 731-744
  • 4 Smeds K, Wolters F, Rung M. Estimation of signal-to-noise ratios in realistic sound scenarios. J Am Acad Audiol 2015; 26 (02) 183-196
  • 5 Wolters F, Smeds K, Schmidt E, Christensen EK, Norup C. Common sound scenarios: a context-driven categorization of everyday sound environments for application in hearing-device research. J Am Acad Audiol 2016; 27 (07) 527-540
  • 6 Smeds K, Gotowiec S, Wolters F, Herrlin P, Larsson J, Dahlquist M. Selecting scenarios for hearing-related laboratory testing. Ear Hear 2020; 41 (Suppl. 01) 20S-30S
  • 7 Wagener KC, Hansen M, Ludvigsen C. Recording and classification of the acoustic environment of hearing aid users. J Am Acad Audiol 2008; 19 (04) 348-370
  • 8 McCreery RW, Walker EA, Spratford M, Lewis D, Brennan M. Auditory, cognitive, and linguistic factors predict speech recognition in adverse listening conditions for children with hearing loss. Front Neurosci 2019; 13: 1093
  • 9 Crukley J, Scollie S, Parsa V. An exploration of non-quiet listening at school. J Educ Audiol 2011; 17: 23-35
  • 10 Culling JF. Speech intelligibility in virtual restaurants. J Acoust Soc Am 2016; 140 (04) 2418-2426
  • 11 Quartieri J, D'Ambrosio S, Guarnaccia C, Iannone G. Experiments in room acoustics: modelling of a church sound field and reverberation time measurements. WSEAS Trans Signal Process 2009; 5: 126-135
  • 12 Brennan M, Souza P. Effects of expansion on consonant recognition and consonant audibility. J Am Acad Audiol 2009; 20 (02) 119-127
  • 13 Kates JM. Understanding compression: modeling the effects of dynamic-range compression in hearing aids. Int J Audiol 2010; 49 (06) 395-409
  • 14 Stone MA, Moore BCJ. Syllabic compression: effective compression ratios for signals modulated at different rates. Br J Audiol 1992; 26 (06) 351-361
  • 15 Bor S, Souza P, Wright R. Multichannel compression: effects of reduced spectral contrast on vowel identification. J Speech Lang Hear Res 2008; 51 (05) 1315-1327
  • 16 Reinhart P, Zahorik P, Souza PE. Effects of reverberation, background talker number, and compression release time on signal-to-noise ratio. J Acoust Soc Am 2017; 142 (01) EL130-EL135
  • 17 Naylor G, Johannesson RB. Long-term signal-to-noise ratio at the input and output of amplitude-compression systems. J Am Acad Audiol 2009; 20 (03) 161-171
  • 18 Alexander JM, Rallapalli V. Acoustic and perceptual effects of amplitude and frequency compression on high-frequency speech. J Acoust Soc Am 2017; 142 (02) 908-923
  • 19 Stone MA, Moore BCJ. Quantifying the effects of fast-acting compression on the envelope of speech. J Acoust Soc Am 2007; 121 (03) 1654-1664
  • 20 Alexander JM, Masterson K. Effects of WDRC release time and number of channels on output SNR and speech recognition. Ear Hear 2015; 36 (02) e35-e49
  • 21 Boike KT, Souza PE. Effect of compression ratio on speech recognition and speech-quality ratings with wide dynamic range compression amplification. J Speech Lang Hear Res 2000; 43 (02) 456-468
  • 22 van Buuren RA, Festen JM, Houtgast T. Compression and expansion of the temporal envelope: evaluation of speech intelligibility and sound quality. J Acoust Soc Am 1999; 105 (05) 2903-2913
  • 23 Keidser G, Dillon H, Flax M, Ching T, Brewer S. The NAL-NL2 prescription procedure. Audiology Res 2011; 1 (01) e24
  • 24 Moore BCJ, Glasberg BR, Stone MA. Development of a new method for deriving initial fittings for hearing aids with multi-channel compression: CAMEQ2-HF. Int J Audiol 2010; 49 (03) 216-227
  • 25 Scollie S, Seewald R, Cornelisse L. et al. The desired sensation level multistage input/output algorithm. Trends Amplif 2005; 9 (04) 159-197
  • 26 Johnson EE, Dillon H. A comparison of gain for adults from generic hearing aid prescriptive methods: impacts on predicted loudness, frequency bandwidth, and speech intelligibility. J Am Acad Audiol 2011; 22 (07) 441-459
  • 27 Salorio-Corbetto M, Baer T, Stone MA, Moore BCJ. Effect of the number of amplitude-compression channels and compression speed on speech recognition by listeners with mild to moderate sensorineural hearing loss. J Acoust Soc Am 2020; 147 (03) 1344-1358
  • 28 Rallapalli VH, Alexander JM. Effects of noise and reverberation on speech recognition with variants of a multichannel adaptive dynamic range compression scheme. Int J Audiol 2019; 58 (10) 661-669
  • 29 Novick ML, Bentler RA, Dittberner A, Flamme GA. Effects of release time and directionality on unilateral and bilateral hearing aid fittings in complex sound fields. J Am Acad Audiol 2001; 12 (10) 534-544
  • 30 Gatehouse S, Naylor G, Elberling C. Linear and nonlinear hearing aid fittings—2. Patterns of candidature. Int J Audiol 2006; 45 (03) 153-171
  • 31 McCreery RW, Venediktov RA, Coleman JJ, Leech HM. An evidence-based systematic review of amplitude compression in hearing aids for school-age children with hearing loss. Am J Audiol 2012; 21 (02) 269-294
  • 32 Moore BCJ, Peters RW, Stone MA. Benefits of linear amplification and multichannel compression for speech comprehension in backgrounds with spectral and temporal dips. J Acoust Soc Am 1999; 105 (01) 400-411
  • 33 Brennan MA, McCreery RW, Buss E, Jesteadt W. The influence of hearing aid gain on gap-detection thresholds for children and adults with hearing loss. Ear Hear 2018; 39 (05) 969-979
  • 34 Hall JW, Buss E, Grose JH, Roush PA. Effects of age and hearing impairment on the ability to benefit from temporal and spectral modulation. Ear Hear 2012; 33 (03) 340-348
  • 35 Nittrouer S, Crowther CS, Miller ME. The relative weighting of acoustic properties in the perception of [s] + stop clusters by children and adults. Percept Psychophys 1998; 60 (01) 51-64
  • 36 Boothroyd A. Perception of speech pattern contrasts from auditory presentation of voice fundamental frequency. Ear Hear 1988; 9 (06) 313-321
  • 37 Marriage JE, Moore BCJ. New speech tests reveal benefit of wide-dynamic-range, fast-acting compression for consonant discrimination in children with moderate-to-profound hearing loss. Int J Audiol 2003; 42 (07) 418-425
  • 38 Marriage JE, Moore BCJ, Stone MA, Baer T. Effects of three amplification strategies on speech perception by children with severe and profound hearing loss. Ear Hear 2005; 26 (01) 35-47
  • 39 Liu H, Liu Y, Li Y. et al. Effect of adaptive compression and fast-acting WDRC strategies on sentence recognition in noise in mandarin-speaking pediatric hearing aid users. J Am Acad Audiol 2018; 29 (04) 273-278
  • 40 Houtgast T, Steeneken HJM. A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. J Acoust Soc Am 1985; 77: 1069-1077
  • 41 Reinhart PN, Souza PE, Srinivasan NK, Gallun FJ. Effects of reverberation and compression on consonant identification in individuals with hearing impairment. Ear Hear 2016; 37 (02) 144-152
  • 42 Shi L-F, Doherty KA. Subjective and objective effects of fast and slow compression on the perception of reverberant speech in listeners with hearing loss. J Speech Lang Hear Res 2008; 51 (05) 1328-1340
  • 43 Reinhart PN, Souza PE. Intelligibility and clarity of reverberant speech: effects of wide dynamic range compression release time and working memory. J Speech Lang Hear Res 2016; 59 (06) 1543-1554
  • 44 Reinhart P, Zahorik P, Souza P. Effects of reverberation on the relation between compression speed and working memory for speech-in-noise perception. Ear Hear 2019; 40 (05) 1098-1105
  • 45 Jenstad LM, Souza PE. Quantifying the effect of compression hearing aid release time on speech acoustics and intelligibility. J Speech Lang Hear Res 2005; 48 (03) 651-667
  • 46 Kates JM, Arehart KH. The hearing-aid speech perception index (HASPI). Speech Commun 2014; 65: 75-93
  • 47 Davies-Venn E, Souza P, Brennan M, Stecker GC. Effects of audibility and multichannel wide dynamic range compression on consonant recognition for listeners with severe hearing loss. Ear Hear 2009; 30 (05) 494-504
  • 48 American National Standards Institute. Methods for Calculation of the Speech Intelligibility Index (s3.5). New York, NY: ANSI; 1997
  • 49 Hagerman B, Olofsson Å. A method to measure the effect of noise reduction algorithms using simultaneous speech and noise. Acta Acust United Acust 2004; 90: 356-361
  • 50 Souza PE, Turner CW. Quantifying the contribution of audibility to recognition of compression-amplified speech. Ear Hear 1999; 20 (01) 12-20
  • 51 Kates JM, Arehart KH, Anderson MC, Kumar Muralimanohar R, Harvey Jr LO. Using objective metrics to measure hearing aid performance. Ear Hear 2018; 39 (06) 1165-1175
  • 52 Rasetshwane DM, Raybine DA, Kopun JG, Gorga MP, Neely ST. Influence of instantaneous compression on recognition of speech in noise with temporal dips. J Am Acad Audiol 2019; 30 (01) 16-30
  • 53 Muralimanohar RK, Kates JM, Arehart KH. Using envelope modulation to explain speech intelligibility in the presence of a single reflection. J Acoust Soc Am 2017; 141 (05) EL482-EL487
  • 54 Kates JM, Arehart KH. The hearing-aid speech perception index (HASPI) version 2. Speech Commun 2020; 131: 35-46
  • 55 American Speech-Language-Hearing Association. Guidelines for manual pure-tone threshold audiometry. Rockville, MD: ASHA; 2005
  • 56 McCreery RW, Stelmachowicz PG. Audibility-based predictions of speech recognition for children and adults with normal hearing. J Acoust Soc Am 2011; 130 (06) 4070-4081
  • 57 Knecht HA, Nelson PB, Whitelaw GM, Feth LL. Background noise levels and reverberation times in unoccupied classrooms: predictions and measurements. Am J Audiol 2002; 11 (02) 65-71
  • 58 Bradley JS. Speech intelligibility studies in classrooms. J Acoust Soc Am 1986; 80 (03) 846-854
  • 59 Sato H, Bradley JS. Evaluation of acoustical conditions for speech communication in working elementary school classrooms. J Acoust Soc Am 2008; 123 (04) 2064-2077
  • 60 Cox RM, Moore JN. Composite speech spectrum for hearing and gain prescriptions. J Speech Hear Res 1988; 31 (01) 102-107
  • 61 American National Standards Institute. American National Standard Specification for Octave-Band and Fractional-Octave-Band Analog and Digital Filters (ANSI S1.11–1986). New York, NY: ASA; 1986
  • 62 Richardson JTE. The use of Latin-square designs in educational and psychological research. Educ Res Rev 2018; 24: 84-97
  • 63 Oleson JJ, Brown GD, McCreery R. The evolution of statistical methods in speech, language, and hearing sciences. J Speech Lang Hear Res 2019; 62 (03) 498-506
  • 64 Walker EA, Redfern A, Oleson JJ. Linear mixed-model analysis to examine longitudinal trajectories in vocabulary depth and breadth in children who are hard of hearing. J Speech Lang Hear Res 2019; 62 (03) 525-542
  • 65 Xia J, Xu B, Pentony S, Xu J, Swaminathan J. Effects of reverberation and noise on speech intelligibility in normal-hearing and aided hearing-impaired listeners. J Acoust Soc Am 2018; 143 (03) 1523-1533
  • 66 Parthasarathy A, Bartlett EL, Kujawa SG. Age-related changes in neural coding of envelope cues: peripheral declines and central compensation. Neuroscience 2019; 407: 21-31
  • 67 Tremblay KL, Piskosz M, Souza P. Aging alters the neural representation of speech cues. Neuroreport 2002; 13 (15) 1865-1870
  • 68 Anderson S, Parbery-Clark A, White-Schwoch T, Kraus N. Aging affects neural precision of speech encoding. J Neurosci 2012; 32 (41) 14156-14164
  • 69 Bartha-Doering L, Deuster D, Giordano V, am Zehnhoff-Dinnesen A, Dobel C. A systematic review of the mismatch negativity as an index for auditory sensory memory: from basic research to clinical and developmental perspectives. Psychophysiology 2015; 52 (09) 1115-1130
  • 70 Davies-Venn E, Nelson P, Souza P. Comparing auditory filter bandwidths, spectral ripple modulation detection, spectral ripple discrimination, and speech recognition: normal and impaired hearing. J Acoust Soc Am 2015; 138 (01) 492-503
  • 71 Gifford RH, Bacon SP, Williams EJ. An examination of speech recognition in a modulated background and of forward masking in younger and older listeners. J Speech Lang Hear Res 2007; 50 (04) 857-864
  • 72 Larson VD, Williams DW, Henderson WG. et al. NIDCD/VA Hearing Aid Clinical Trial Group. Efficacy of 3 commonly used hearing aid circuits: a crossover trial. JAMA 2000; 284 (14) 1806-1813

Address for correspondence

Marc A. Brennan, PhD

Publication History

Received: 14 September 2020

Accepted: 21 October 2021

Accepted Manuscript online:
25 October 2021

Article published online:
10 October 2022

© 2022. American Academy of Audiology. This article is published by Thieme.

Thieme Medical Publishers, Inc.
333 Seventh Avenue, 18th Floor, New York, NY 10001, USA

  • 32 Moore BCJ, Peters RW, Stone MA. Benefits of linear amplification and multichannel compression for speech comprehension in backgrounds with spectral and temporal dips. J Acoust Soc Am 1999; 105 (01) 400-411
  • 33 Brennan MA, McCreery RW, Buss E, Jesteadt W. The influence of hearing aid gain on gap-detection thresholds for children and adults with hearing loss. Ear Hear 2018; 39 (05) 969-979
  • 34 Hall JW, Buss E, Grose JH, Roush PA. Effects of age and hearing impairment on the ability to benefit from temporal and spectral modulation. Ear Hear 2012; 33 (03) 340-348
  • 35 Nittrouer S, Crowther CS, Miller ME. The relative weighting of acoustic properties in the perception of [s] + stop clusters by children and adults. Percept Psychophys 1998; 60 (01) 51-64
  • 36 Boothroyd A. Perception of speech pattern contrasts from auditory presentation of voice fundamental frequency. Ear Hear 1988; 9 (06) 313-321
  • 37 Marriage JE, Moore BCJ. New speech tests reveal benefit of wide-dynamic-range, fast-acting compression for consonant discrimination in children with moderate-to-profound hearing loss. Int J Audiol 2003; 42 (07) 418-425
  • 38 Marriage JE, Moore BCJ, Stone MA, Baer T. Effects of three amplification strategies on speech perception by children with severe and profound hearing loss. Ear Hear 2005; 26 (01) 35-47
  • 39 Liu H, Liu Y, Li Y. et al. Effect of adaptive compression and fast-acting WDRC strategies on sentence recognition in noise in Mandarin-speaking pediatric hearing aid users. J Am Acad Audiol 2018; 29 (04) 273-278
  • 40 Houtgast T, Steeneken HJM. A review of the MTF concept in room acoustics and its use for estimating speech intelligibility in auditoria. J Acoust Soc Am 1985; 77: 1069-1077
  • 41 Reinhart PN, Souza PE, Srinivasan NK, Gallun FJ. Effects of reverberation and compression on consonant identification in individuals with hearing impairment. Ear Hear 2016; 37 (02) 144-152
  • 42 Shi L-F, Doherty KA. Subjective and objective effects of fast and slow compression on the perception of reverberant speech in listeners with hearing loss. J Speech Lang Hear Res 2008; 51 (05) 1328-1340
  • 43 Reinhart PN, Souza PE. Intelligibility and clarity of reverberant speech: effects of wide dynamic range compression release time and working memory. J Speech Lang Hear Res 2016; 59 (06) 1543-1554
  • 44 Reinhart P, Zahorik P, Souza P. Effects of reverberation on the relation between compression speed and working memory for speech-in-noise perception. Ear Hear 2019; 40 (05) 1098-1105
  • 45 Jenstad LM, Souza PE. Quantifying the effect of compression hearing aid release time on speech acoustics and intelligibility. J Speech Lang Hear Res 2005; 48 (03) 651-667
  • 46 Kates JM, Arehart KH. The hearing-aid speech perception index (HASPI). Speech Commun 2014; 65: 75-93
  • 47 Davies-Venn E, Souza P, Brennan M, Stecker GC. Effects of audibility and multichannel wide dynamic range compression on consonant recognition for listeners with severe hearing loss. Ear Hear 2009; 30 (05) 494-504
  • 48 American National Standards Institute. Methods for Calculation of the Speech Intelligibility Index (ANSI S3.5). New York, NY: ANSI; 1997
  • 49 Hagerman B, Olofsson Å. A method to measure the effect of noise reduction algorithms using simultaneous speech and noise. Acta Acust United Acust 2004; 90: 356-361
  • 50 Souza PE, Turner CW. Quantifying the contribution of audibility to recognition of compression-amplified speech. Ear Hear 1999; 20 (01) 12-20
  • 51 Kates JM, Arehart KH, Anderson MC, Kumar Muralimanohar R, Harvey Jr LO. Using objective metrics to measure hearing aid performance. Ear Hear 2018; 39 (06) 1165-1175
  • 52 Rasetshwane DM, Raybine DA, Kopun JG, Gorga MP, Neely ST. Influence of instantaneous compression on recognition of speech in noise with temporal dips. J Am Acad Audiol 2019; 30 (01) 16-30
  • 53 Muralimanohar RK, Kates JM, Arehart KH. Using envelope modulation to explain speech intelligibility in the presence of a single reflection. J Acoust Soc Am 2017; 141 (05) EL482-EL487
  • 54 Kates JM, Arehart KH. The hearing-aid speech perception index (HASPI) version 2. Speech Commun 2020; 131: 35-46
  • 55 American Speech-Language-Hearing Association. Guidelines for manual pure-tone threshold audiometry. Rockville, MD: ASHA; 2005
  • 56 McCreery RW, Stelmachowicz PG. Audibility-based predictions of speech recognition for children and adults with normal hearing. J Acoust Soc Am 2011; 130 (06) 4070-4081
  • 57 Knecht HA, Nelson PB, Whitelaw GM, Feth LL. Background noise levels and reverberation times in unoccupied classrooms: predictions and measurements. Am J Audiol 2002; 11 (02) 65-71
  • 58 Bradley JS. Speech intelligibility studies in classrooms. J Acoust Soc Am 1986; 80 (03) 846-854
  • 59 Sato H, Bradley JS. Evaluation of acoustical conditions for speech communication in working elementary school classrooms. J Acoust Soc Am 2008; 123 (04) 2064-2077
  • 60 Cox RM, Moore JN. Composite speech spectrum for hearing and gain prescriptions. J Speech Hear Res 1988; 31 (01) 102-107
  • 61 American National Standards Institute. American National Standard Specification for Octave-Band and Fractional-Octave-Band Analog and Digital Filters (ANSI S1.11-1986). New York, NY: ASA; 1986
  • 62 Richardson JTE. The use of Latin-square designs in educational and psychological research. Educ Res Rev 2018; 24: 84-97
  • 63 Oleson JJ, Brown GD, McCreery R. The evolution of statistical methods in speech, language, and hearing sciences. J Speech Lang Hear Res 2019; 62 (03) 498-506
  • 64 Walker EA, Redfern A, Oleson JJ. Linear mixed-model analysis to examine longitudinal trajectories in vocabulary depth and breadth in children who are hard of hearing. J Speech Lang Hear Res 2019; 62 (03) 525-542
  • 65 Xia J, Xu B, Pentony S, Xu J, Swaminathan J. Effects of reverberation and noise on speech intelligibility in normal-hearing and aided hearing-impaired listeners. J Acoust Soc Am 2018; 143 (03) 1523-1533
  • 66 Parthasarathy A, Bartlett EL, Kujawa SG. Age-related changes in neural coding of envelope cues: peripheral declines and central compensation. Neuroscience 2019; 407: 21-31
  • 67 Tremblay KL, Piskosz M, Souza P. Aging alters the neural representation of speech cues. Neuroreport 2002; 13 (15) 1865-1870
  • 68 Anderson S, Parbery-Clark A, White-Schwoch T, Kraus N. Aging affects neural precision of speech encoding. J Neurosci 2012; 32 (41) 14156-14164
  • 69 Bartha-Doering L, Deuster D, Giordano V, am Zehnhoff-Dinnesen A, Dobel C. A systematic review of the mismatch negativity as an index for auditory sensory memory: from basic research to clinical and developmental perspectives. Psychophysiology 2015; 52 (09) 1115-1130
  • 70 Davies-Venn E, Nelson P, Souza P. Comparing auditory filter bandwidths, spectral ripple modulation detection, spectral ripple discrimination, and speech recognition: normal and impaired hearing. J Acoust Soc Am 2015; 138 (01) 492-503
  • 71 Gifford RH, Bacon SP, Williams EJ. An examination of speech recognition in a modulated background and of forward masking in younger and older listeners. J Speech Lang Hear Res 2007; 50 (04) 857-864
  • 72 Larson VD, Williams DW, Henderson WG. et al. NIDCD/VA Hearing Aid Clinical Trial Group. Efficacy of 3 commonly used hearing aid circuits: a crossover trial. JAMA 2000; 284 (14) 1806-1813

Fig. 1 Audiometric thresholds for the participants. Age group is indicated by each title. For this and remaining box plots, box boundaries represent the 25th and 75th percentiles, and error bars represent 2.7 standard deviations or the most extreme value that is not an outlier, whichever is lower. Horizontal lines represent the medians, and pluses represent outliers (greater than 2.7 standard deviations).
Fig. 2 Nonsense syllable recognition for the adults (top panel) and children (bottom panel). The release time is indicated by the legend. Nonsense syllable recognition decreased from the 0-second to the 0.5- and 1-second reverberation times. Box plots are shown in the same manner as in [Fig. 1].
Fig. 3 Relationship of audibility, temporal distortion, and spectral distortion to nonsense syllable recognition. The columns depict the 0-, 0.5-, and 1-second reverberation time conditions, respectively. The rows depict the raw values for the Speech Intelligibility Index (SII), Envelope Difference Index (EDI), and Hearing-Aid Speech Perception Index (HASPI), respectively. Child participants are represented by the diamond symbols and adults by the circle symbols. Release time is indicated by the inset. Lines indicate the linear mixed-effects model prediction for nonsense syllable recognition. Increases in the SII were associated with increases in nonsense syllable recognition.