J Am Acad Audiol 2019; 30(05): 363-369
DOI: 10.3766/jaaa.17125
Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

Comparison of Children’s Double Dichotic Digits and SCAN-3 Competing Words Free Recall Scores

Kairn Stetler Kelley
*   Program in Clinical and Translational Science, University of Vermont, Burlington, VT
Benjamin Littenberg
*   Program in Clinical and Translational Science, University of Vermont, Burlington, VT
†   Robert Larner, MD College of Medicine, University of Vermont, Burlington, VT

Corresponding author

Kairn Stetler Kelley
Clinical and Translational Science, University of Vermont
Burlington, VT 05405

Publication History

06 October 2018

10 October 2018

Publication Date:
26 May 2020 (online)




Background:

Practice guidelines do not specify which test recordings are best for assessing dichotic deficit or interaural asymmetry. Dichotic Digits and SCAN-3 Competing Words Free Recall are among the most widely used dichotic tests, but it is not known if the choice of test results in important differences in the identification of children with deficits or if they can be used interchangeably.


Purpose:

To determine whether two commonly used dichotic tests, SCAN-3 Competing Words Free Recall (CW) and Musiek’s Dichotic Digits (DD), agree on interaural asymmetry and dichotic deficit in children.

Research Design:

CW and DD tests were administered to all participants. Each participant had a single study visit.

Study Sample:

Sixty volunteers aged 7–14 years with normal hearing sensitivity participated in the study.

Data Collection and Analysis:

Hearing sensitivity, CW, and DD performance were measured at a single study visit. We used Spearman’s rho (ρ) to assess associations between rank ordering of participants by each test and the kappa statistic (κ) to assess decision consistency between tests.


Results:

Participants were rank-ordered similarly by CW and DD for the right ear (ρ = 0.58), left ear (ρ = 0.51), and total (ρ = 0.73) scores, but not for interaural asymmetry (ρ = 0.18). The two tests agreed no better than chance on direction of ear advantage (κ = 0.01, p = 0.93) and had poor agreement on which children scored below cut-scores (κ = 0.22, p < 0.01). DD identified significantly more participants with deficits (n = 18) than CW (n = 3) (p < 0.001).


Conclusions:

Although children with high scores on one test tend to have high scores on the other, CW and DD do not agree on ear advantage or the presence of deficit. They are not interchangeable for clinical use. Additional research is needed to determine whether either is appropriate for identifying children who would benefit from treatment for dichotic listening deficits.



Free recall dichotic listening tests, in which a listener is asked to repeat everything heard when different stimuli are presented simultaneously to each ear, are a staple of auditory processing evaluation of children in the United States ([Chermak et al, 2007]; [American Academy of Audiology, 2010]; [Emanuel et al, 2011]). Dichotic tests may give insight into the organization and capacity of the auditory central nervous system ([American Speech-Language Hearing Association, 2005]; [American Academy of Audiology, 2010]; [Hugdahl, 2011]). Although dichotic speech test scores vary somewhat with listener characteristics (e.g., age, cognition, and handedness), stimulus type (i.e., syllables, words, digits, or sentences), and stimulus complexity (e.g., length, degradation, or linguistic load), normal listeners of all ages tend to show a small right-ear advantage (REA) when comparable stimuli are presented to each ear ([Cullen et al, 1974]; [Musiek, 1983a]; [Noffsinger et al, 1994]; [Noffsinger et al, 1996]; [Wilson and Jaffe, 1996]; [Wilson and Leigh, 1996]). In people with documented lesions, bilaterally low scores on dichotic listening speech tests are associated with damage to the auditory cortex and asymmetric right- and left-ear scores are consistent with unilateral lesions or damage to the corpus callosum ([Kimura, 1961]; [Musiek, 1983b]; [Musiek et al, 1991]). Low and asymmetric dichotic listening scores in children have been associated with reading and language disorders ([Abigail and Johnson, 1976]; [Moncrieff and Musiek, 2002]; [Agnew, 2004]; [Dlouha et al, 2007]; [Moncrieff and Black, 2008]).

The American Academy of Audiology Clinical Practice Guidelines for Diagnosis, Treatment, and Management of Children and Adults with Central Auditory Processing Disorder (APD) ([American Academy of Audiology, 2010]) advise that interpretation of dichotic listening test results should include both interaural asymmetry (i.e., difference between right- and left-ear performance by a listener) and listener performance relative to normative cutoff criteria [i.e., two standard deviations (SDs) below normal-listeners’ mean]. The guidelines state that “a child with a typically developing auditory system should (…) have greater right-ear score than left-ear score on dichotic speech tasks. This REA diminishes and left-ear performance improves as the child matures. Findings other than these, such as an exaggerated REA or a left-ear advantage, have implications for the diagnosis of (central) auditory processing disorder.” They recommend avoiding diagnosis of APD in the face of conflicting test findings such as “right-ear deficit on one task combined with a left-ear deficit on another similar task within the same individual.” However, the recommendation to cross-check the direction of ear advantage among dichotic tasks has not been validated across the many dichotic tests available, nor across the many different strategies used to quantify and compare interaural asymmetry ([Harshman and Lundy, 1988]; [Kelley and Littenberg, 2018]). If audiologists administer only one dichotic measure to save time and prevent fatigue, it is important to know whether different tests yield similar results. Adults’ performance on dichotic tests using syllables, words, digits, and sentences has been compared ([Musiek, 1983a]; [Noffsinger et al, 1994]; [Wilson and Jaffe, 1996]), but no similar comparison is available for children.

The purpose of this study was to compare children’s performance on two free recall dichotic listening tests that are among the most commonly used for contributing to clinical diagnosis of APD ([Emanuel, 2002]; [Chermak et al, 2007]; [Emanuel et al, 2011]): SCAN-3 Competing Words Free Recall (CW) ([Keith, 2009a],[b]) and Double Dichotic Digits (DD) ([Musiek, 1983a]). Performance on CW is considered abnormal if the total score (right plus left) falls below the age-specific cut-score or if interaural asymmetry is greater than that expected for age. DD is evaluated against age-specific cut-scores for right and left ears. Failing in only one ear on DD implies asymmetry, but there are no guidelines for how to interpret the size of asymmetry in the presence of bilaterally low scores on DD. Scores and classification of individual children by free recall CW and DD have not previously been directly compared in a single sample. We sought to determine whether children who had high scores on one test also scored well on the other test. If the tests are measuring the same construct, they should agree on rank ordering of listeners. We also wanted to determine whether CW and DD could be used interchangeably to describe ear advantage and to identify children with abnormal dichotic listening.



This is a cross-sectional, concurrent, within-subjects assessment of agreement of children’s performance on CW and DD tests measured at a single study visit in the spring of 2014. The data were collected as part of a study of test–retest reliability that demonstrated no systematic change in CW or DD on retest and no difference in within-subject SD between CW and DD scores based on 40 items ([Kelley and Littenberg, 2019]). In the present article, we compare scores from only the first administration of each test. The study was approved by the University of Vermont Committee on Human Research. Child consent (age 11 years and older) or assent (age 10 years and younger) and guardian consent were documented at the start of the visit. K.S.K. administered all aspects of the protocol, including scheduling, testing, and follow-up as needed.


English-speaking volunteers between the ages of 7 and 14 years, with normal hearing sensitivity and able to complete the study visit, were eligible to participate. Participants were recruited using flyers, word of mouth, and social media postings. Participants were provided with a report of test performance and coupons donated by local businesses (e.g., arcade card and free beverage) to thank them for their time and cooperation. Demographic information (gender, race, ethnicity, education of adults in household, and household income) and participants’ characteristics (handedness; difficulties with academics, attention, or development; special services received; medications; and musical training) were captured on a questionnaire completed by the adult(s) accompanying the participant to the study visit.


Environment and Equipment

Testing was conducted between April and June of 2014 in an Industrial Acoustics Company double-walled audiology booth using a GSI-61 clinical audiometer (Grason-Stadler, Eden Prairie, MN) and the participant’s choice of EAR-Tone 3A insert earphones or Telephonics TDH-50P headphones (Telephonics, Huntington, NY). All were calibrated to the ANSI standard ([ANSI, 1996]; [1999]). Hearing sensitivity was measured in each ear at 500, 1000, 2000, and 4000 Hz using a modified Hughson–Westlake method ([Carhart and Jerger, 1959]). Participants were eligible for dichotic testing if hearing sensitivity was better than 20-dB HL in each ear.

Dichotic stimuli were presented from a commercially available CD changer (Denon, Japan). Dichotic tests were administered at a dial setting of 50-dB HL on the audiometer. For CW, this matched the recommended presentation level of 50-dB HL. For DD, it corresponded to 59-dB HL, which we considered acceptably close to the recommended 50-dB SL re: spondee threshold (which we did not measure for this study). Dichotic tests were presented after hearing sensitivity was documented. The order of the dichotic tests was determined for each participant by block randomization.
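
A block-randomized test order could be generated along the following lines. This is a minimal sketch, not the study's actual procedure: the article states only that order was determined by block randomization, so the block size of 4, the seed, and the function name `block_randomized_orders` are assumptions made for illustration.

```python
import random


def block_randomized_orders(n_participants, block_size=4, seed=0):
    """Assign each participant a test order ("CW first" or "DD first").

    Within every complete block the two orders occur equally often.
    block_size and seed are illustrative assumptions, not study values.
    """
    if block_size % 2 != 0:
        raise ValueError("block_size must be even to balance two orders")
    rng = random.Random(seed)
    orders = []
    while len(orders) < n_participants:
        block = ["CW first", "DD first"] * (block_size // 2)
        rng.shuffle(block)  # randomize the sequence within this block
        orders.extend(block)
    return orders[:n_participants]


orders = block_randomized_orders(60)
```

With 60 participants and blocks of 4, every block is complete, so exactly 30 participants receive each order.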


SCAN-3 Competing Words (CW)

The stimuli for the Competing Words (CW) subtest of the SCAN-3 ([Keith, 2009a],[b]) were 40 single-syllable English words (20 dichotic pairs) spoken by a male voice. Recorded instructions preceding the test materials directed participants to repeat both words of each pair. Each word pair was presented once during the test. For clinical use, there is a right- and a left-ear list. In this study, the routing of the stereo channels was randomized, so approximately half the children had the “right list” directed to their left ear. Reliability for CW is reported as Pearson’s product-moment correlation coefficient of r = 0.59 for children and r = 0.69 for adolescents and adults ([Keith, 2009a],[b]). [Kelley and Littenberg (2019)] measured an average within-subject SD of 5.3% for total (40-item) scores and 7.6% for ear-specific (20-item) measures.

Participants’ responses were scored as correct or incorrect for each target word (i.e., child did or did not repeat the target word). Participants were classified using the diagnostic cut-scores for their specific ages as published in the test manuals for SCAN-3 for Children ([Keith, 2009b]) and SCAN-3 for Adolescents and Adults ([Keith, 2009a]). Cut-scores for SCAN-3 were derived from a normative sample stratified by age, gender, race/ethnicity, geographic region, and educational level of the primary caregiver. It is important to note that SCAN-3 is a fully normed diagnostic test distinct from the earlier SCAN Screening Test for APD ([Keith, 1986]).

Dichotic deficit on CW was defined as having a total score (right plus left) more than two SDs below the mean for age or interaural asymmetry (right minus left) more extreme than 96% of the normative sample for age (2% at each tail of the distribution).


Double Dichotic Digits (DD) Test

The double-pair stimuli of Musiek’s Dichotic Digits (DD) test ([Hurley and Musiek, 1997]) were audio recordings of single-syllable digits (1–10, excluding 7) spoken in English by a male voice. Each of the 20 trials included two consecutive dichotic pairs of digits (two digits to each ear, four digits total per trial). The test administrator told the listener that she/he would hear different numbers in each ear at the same time and should repeat all of the numbers heard, regardless of the order. As per Musiek’s recommended protocol ([Musiek, 1983a]), the recording was paused if participants needed more time to respond. [Musiek et al (1991)] reported test–retest reliability of DD as r = 0.77. [Kelley and Littenberg (2019)] measured an average within-subject SD of 5.2% for ear-specific 40-item scores. Audiologists are advised to collect local norms for DD ([Musiek, 1983a]), but sample cutoff criteria for DD right- and left-ear scores are available ([Bellis, 2002]; [Rosenberg, 2011]). We used the cut-scores published in [Bellis (2002)] because peers indicate these criteria are in common use.

Dichotic deficit on DD was defined as having either right-ear score or left-ear score below the age-specific cutoffs published in [Bellis (2002)].
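
The two decision rules use different inputs, which the following sketch makes explicit. It is an illustration only: the function names are hypothetical, the strictness of the comparisons (`<` and `>`) is an assumption, and the numeric cutoffs in the example are invented placeholders, not the published SCAN-3 or Bellis (2002) norms.

```python
def cw_deficit(total_score, asymmetry, total_cutoff, asym_lower, asym_upper):
    """CW rule: low total score OR asymmetry outside the normative range."""
    low_total = total_score < total_cutoff
    extreme_asymmetry = asymmetry < asym_lower or asymmetry > asym_upper
    return low_total or extreme_asymmetry


def dd_deficit(right_score, left_score, right_cutoff, left_cutoff):
    """DD rule: right-ear OR left-ear score below its age-specific cutoff."""
    return right_score < right_cutoff or left_score < left_cutoff


# Hypothetical cutoffs for illustration only (NOT published norms).
example_cw = cw_deficit(total_score=24, asymmetry=2,
                        total_cutoff=20, asym_lower=-4, asym_upper=8)
example_dd = dd_deficit(right_score=36, left_score=28,
                        right_cutoff=34, left_cutoff=30)
```

In this invented case the same child passes the CW rule but fails the DD rule on the left ear, which shows how rules built on different scores can diverge.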



CW and DD right ear, left ear, and total scores are presented as proportion correct (number of correct responses divided by number of stimuli presented). Interaural asymmetry is presented as the number of items different (right-ear number correct minus left-ear number correct). We used Wilcoxon’s sign-rank test to compare scores between tests and between ears.
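
The scoring described above can be sketched as follows. The function names and the example counts are hypothetical. The items-per-ear values (20 for CW, 40 for DD) follow the test descriptions in this article.

```python
def proportion_correct(n_correct, n_presented):
    """Number of correct responses divided by number of stimuli presented."""
    return n_correct / n_presented


def score_test(right_correct, left_correct, items_per_ear):
    """Return ear, total, and asymmetry scores for one dichotic test."""
    return {
        "right": proportion_correct(right_correct, items_per_ear),
        "left": proportion_correct(left_correct, items_per_ear),
        "total": proportion_correct(right_correct + left_correct,
                                    2 * items_per_ear),
        # Asymmetry is kept in items, not proportions: right minus left.
        "asymmetry": right_correct - left_correct,
    }


# Hypothetical responses: CW has 20 items per ear, DD has 40 items per ear.
cw = score_test(right_correct=13, left_correct=11, items_per_ear=20)
dd = score_test(right_correct=36, left_correct=34, items_per_ear=40)
```

Because asymmetry is expressed in items, a difference of 2 covers 10% of a CW ear list but only 5% of a DD ear list.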

We sought to determine if CW and DD were measuring the same underlying phenomenon (efficiency of right- and left-auditory pathways) by comparing the association between each test’s raw (right and left) and calculated scores (total and difference). Because the distribution of scores violated the assumptions required to interpret linear correlation, we used the Spearman correlation coefficient (ρ) to quantify the association between rank ordering by each score. The null hypothesis was “no association” for each of the four comparisons (right, left, total, and asymmetry).
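
For readers unfamiliar with rank correlation, a minimal pure-Python sketch follows. It computes Spearman's ρ as the Pearson correlation of average ranks, which is the standard definition when ties are present. The function names and the five pairs of scores are hypothetical, and this is not the software used in the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while (end + 1 < len(order)
               and values[order[end + 1]] == values[order[start]]):
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if x or y is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical total scores (items correct) for five listeners.
cw_total = [22, 30, 25, 35, 28]
dd_total = [60, 72, 70, 78, 65]
rho = spearman_rho(cw_total, dd_total)
```

Only the ordering of listeners matters here, so the result is unaffected by the ceiling-skewed score distributions that make linear correlation hard to interpret.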

Decision consistency of CW and DD pass/fail classification and classification of interaural asymmetry (REA present or absent) was evaluated using the kappa coefficient (κ). Kappa compares observed agreement to the amount of chance agreement expected, given the distribution of the two variables compared ([Viera and Garrett, 2005]). We classified each participant as having REA on a test if the right-ear score minus the left-ear score was greater than zero.
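
The REA classification and the kappa calculation can be sketched as below. This is an illustrative implementation of Cohen's kappa for two binary classifications, not the statistical package used in the study. The function names and the four pairs of ear scores are hypothetical.

```python
def has_rea(right_correct, left_correct):
    """REA is present when right-ear minus left-ear score is greater than zero."""
    return right_correct - left_correct > 0


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two True/False classifications of the same people.

    Undefined (division by zero) when chance-expected agreement equals 1.
    """
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    p_a = sum(labels_a) / n  # proportion classified True by the first test
    p_b = sum(labels_b) / n  # proportion classified True by the second test
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    return (observed - expected) / (1 - expected)


# Hypothetical (right, left) items correct for four listeners.
cw_scores = [(15, 12), (10, 13), (14, 14), (16, 9)]
dd_scores = [(38, 35), (36, 30), (33, 37), (40, 31)]
cw_rea = [has_rea(r, l) for r, l in cw_scores]
dd_rea = [has_rea(r, l) for r, l in dd_scores]
kappa = cohen_kappa(cw_rea, dd_rea)
```

Under this rule a tie between ears, such as the third CW listener, counts as REA absent.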



We enrolled 60 volunteer participants aged 7–14 years with normal hearing sensitivity. Characteristics of participants are summarized in [Table 1]. About one-third (n = 22) were receiving support in school for developmental, educational, or emotional difficulties; 10 were reported as having attention deficit hyperactivity disorder. Adult-completed questionnaires of 13 participants endorsed “concerns about hearing, listening, or ability to understand.”

Table 1

Characteristics of 60 Participants

Characteristic | Mean or n | (SD or %)
Age (years), mean (SD) | |
Female, no. (%) | |
Right handed, no. (%) | |
Homeschool, no. (%) | |
Non-Hispanic white (%) | |
Highest parental education | |
 High school, no. (%) | |
 College, no. (%) | |
 Graduate, no. (%) | |
Annual household income | |
 <$25K, no. (%) | |
 $25K–$74K, no. (%) | |
 $75K–$99K, no. (%) | |
 >$99K, no. (%) | |

Dichotic test scores are summarized in [Table 2]. Participants had higher proportions of correct responses on DD than CW (p < 0.001) and higher right-ear than left-ear scores on both CW (p < 0.001) and DD (p < 0.001). Mean interaural asymmetry was 1.6 words on CW and 2.2 digits on DD. Having a higher score on the right ear was associated with having a higher left-ear score on both CW (ρ = 0.43, p < 0.001) and DD (ρ = 0.52, p < 0.001).

Table 2

Mean (SD) and Range of 60 Participants’ Competing Words (CW) and Dichotic Digits (DD) Scores

Test | Right Ear | Left Ear | Total Score | Interaural Asymmetry
CW | 64% (14) [30, 90] | 55% (19) [0, 90] | 59% (14) [22, 90] | 1.7 words (3.7) [−5, 10]
DD | 90% (9) [65, 100] | 84% (15) [28, 100] | 87% (11) [47, 100] | 2.2 digits (4.1) [−5, 16]

Association between CW and DD Scores

Participants were rank-ordered similarly (see [Figure 1]) by DD and CW right-ear scores (ρ = 0.58, p < 0.0001) and left-ear scores (ρ = 0.51, p < 0.0001). The association was even stronger (see [Figure 2]) between CW and DD total correct scores (ρ = 0.73, p < 0.0001). The association of rankings by CW and DD interaural asymmetry scores (right minus left) was weak (ρ = 0.18) and not statistically significant (p = 0.18) (see [Figure 2]).

Figure 1 Association between right- and left-ear scores on Competing Words and Dichotic Digits tests (n = 60).
Figure 2 Associations between Competing Words and Dichotic Digits total scores and interaural asymmetry.


Interaural Asymmetry Measured by CW and DD

Thirty-eight participants had REA on CW. DD identified 34 participants with REA. The two tests agreed on the presence of REA for only 31/60 participants (52%), a result easily explained by chance (see [Table 3]; κ = 0.01; p = 0.93).

Table 3

REA Classification by Competing Words and Dichotic Digits Tests


















Listener Performance Relative to Normative Cutoff Criteria on CW and DD

DD identified significantly more participants with dichotic deficits (n = 18) than CW (n = 3) (p < 0.001) (see [Table 4]). The two tests agreed on classification of 45/60 participants (75%, κ = 0.22, p < 0.01). Based on the proportion of participants classified as passing by each test, agreement of 68% is expected by chance. The observed agreement of 75% is only 7 percentage points higher than the agreement expected by chance and is, therefore, classified as “poor agreement.” When there was disagreement between the tests (n = 15), participants were always classified as normal by CW but abnormal by DD. Among the children who were classified as having abnormal dichotic listening by both tests (n = 3), one had bilaterally low scores on DD, low CW total, and normal interaural asymmetry; one had bilaterally low scores on DD, low CW total, and abnormal CW interaural asymmetry; and one had a low DD left-ear score with abnormally large CW REA and abnormal CW total score.
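
The chance-corrected figure can be checked by hand. The derivation below is a sketch that assumes the 2 × 2 cell counts implied by the text: 3 children abnormal on both tests, 15 abnormal on DD only, none abnormal on CW only, and 42 normal on both. The symbols p_o (observed agreement) and p_e (agreement expected by chance) are notation introduced here.

```latex
\begin{align*}
p_o &= \frac{3 + 42}{60} = 0.75 \\
p_e &= \frac{3}{60}\cdot\frac{18}{60} + \frac{57}{60}\cdot\frac{42}{60}
     = \frac{54 + 2394}{3600} = 0.68 \\
\kappa &= \frac{p_o - p_e}{1 - p_e} = \frac{0.75 - 0.68}{1 - 0.68} \approx 0.22
\end{align*}
```

Because 57 of 60 children pass CW and 42 of 60 pass DD, most of the raw agreement arises from both tests passing the majority of children, which is why 75% raw agreement yields such a low kappa.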

Table 4

Participants’ Normal/Abnormal Classification of Dichotic Performance by Competing Words and Dichotic Digits Tests

 | DD Abnormal | DD Normal | Total
CW abnormal | 3 | 0 | 3
CW normal | 15 | 42 | 57
Total | 18 | 42 | 60


Current theoretical models of APD underpinning practice guidelines for audiologists ([American Speech-Language Hearing Association, 2005]; [American Academy of Audiology, 2010]) posit that dichotic listening is one of several processes that could be impaired by disruption of auditory neural pathways. Consistently low or asymmetric dichotic listening scores could reflect APD. Inconsistent patterns of responses (e.g., right-ear deficit on one task combined with a left-ear deficit on another, similar task) may be a result of attention or cognitive processes and would not contribute to supporting a diagnosis of APD. Audiologists are given flexibility to select among normed dichotic tests, with the implied expectation that similar tests will reveal similar patterns of results if abnormal scores are in fact because of impaired dichotic processing.

The purpose of this study was to determine if children were classified similarly by two dichotic tests in common clinical use. Because the two tests present the listener with similar tasks, we expected agreement between the two tests would be high. Instead, we observed poor agreement on both direction of ear advantage and classification of children as normal or disordered. These data demonstrate that the two tests cannot be used interchangeably to describe interaural asymmetry or to identify children with dichotic listening deficit. Both tests showed an average interaural asymmetry that favored the right ear as expected, but CW and DD had only chance agreement on which individuals had REA. The association between CW and DD rankings of participants by size and direction of asymmetry (right minus left) was also indistinguishable from chance. This lack of association could be due to homogeneity in the study sample, poor precision of interaural asymmetry estimates, or the two tests measuring fundamentally different phenomena. Few children in our sample had significant asymmetry, and noise in the measurements could obscure small differences. However, if poor precision prevents the detection of association between interaural asymmetry measures by CW and DD, it is unclear how audiologists could use direction of ear advantage to cross-check individuals’ results.

In our sample, DD identified six times as many children as abnormal as did CW (30% versus 5%). Using published criteria, 25% of participants (n = 15) were diagnosed as disordered by DD but not by CW. Because the cutoff criteria use different scores (DD right and left versus CW total and difference), it is not clear whether poor agreement is caused by different sensitivity and specificity or whether the scores represent measurements of different constructs. However, whatever the source of disagreement, the tests were clearly not interchangeable in this sample. If an audiologist is using only one test to evaluate dichotic listening, the test selected will have an important impact on the likelihood that a child is identified as having a dichotic deficit. If an audiologist is using CW and DD in parallel for a cross-check protocol, a third test will often be needed to evaluate the children on whom the first two tests disagree ([Jerger and Hayes, 1976]; [Turner, 2003]). The findings of the present study add to the growing body of literature that questions whether a diagnosis of APD is viable and demonstrate the importance of explicitly reporting exactly the tests and criteria used in the diagnosis of individual children ([Wilson and Arnott, 2013]; [Vermiglio, 2014]; [DeBonis, 2015]). It is possible that the fundamental assumption that all dichotic tests measure similar phenomena is wrong. At the very least, differences in test selection may introduce an important outcome: a lack of consistency of results among audiologists.


This study is limited to a sample of mostly typically developing volunteers with few participants identified by either test as abnormal. The very narrow range of interaural asymmetry scores (about −5 to +15) observed in this sample may not reveal an association that could be apparent in a clinical sample. Differences between the two tests in maturational effects on size of ear advantage could also potentially decrease the apparent association between scores across the age range of the children tested. In addition, the reduced power of nonparametric statistics might have contributed to the apparent lack of association in ear difference between the two tests, but our data violated the assumptions necessary to interpret linear correlation.

Because there is no gold standard for dichotic listening deficit, we cannot address questions about which test is more accurate or why the sensitivity and specificity of the two tests appear to be different.



Scores on Dichotic Digits and Competing Words Free Recall tests are associated, but the two tests do not agree on interaural asymmetry or on which children should be identified with dichotic listening deficits. The tests are not interchangeable for clinical use. Additional research is urgently needed to determine if either of the tests is appropriate for identifying children who would benefit from treatment for dichotic listening deficits. Clinicians should be transparent about this uncertainty to patients, families, and referring providers before administering dichotic listening tests.



Abbreviations

APD: auditory processing disorder
CW: SCAN-3 Competing Words Free Recall
DD: Musiek’s Dichotic Digits
REA: right-ear advantage
SD: standard deviation


No conflict of interest has been declared by the author(s).

A preliminary analysis of these data was presented at the 2015 Scientific and Technology Meeting of the American Auditory Society in Scottsdale, AZ, March 5.
