Keywords
feedback - motor learning - training - speech
Learning Outcomes: As a result of this activity, the reader will be able to:
- Define and describe principles of motor learning.
- Explain the differences between two types of feedback, knowledge of performance, and
knowledge of results.
- Provide examples of how to implement different feedback types, knowledge of performance,
and knowledge of results, in a therapy session.
Motor learning is defined as a relatively permanent change in the ability to execute
a motor skill due to practice and/or experience (Schmidt & Lee 2005). It allows us
to develop skills, such as mastering a volleyball serve or fluently speaking a foreign
language, and also safeguards the accuracy of simpler reflexive behaviors, such as
ducking your head when something is suddenly coming your way (Cullen & Mitchell 2017).
Researchers interested in motor learning seek to understand how people best acquire
new motor skills and relearn or rehabilitate impaired movements.
Decades of research focused on limb motor learning has led to the identification of
practice and feedback conditions shown to enhance the learning of trained movements
(Schmidt et al. 2019). Together, these practice and feedback conditions are known
as the principles of motor learning (PML; Schmidt 1988). Practice conditions include
variables such as practice amount (large vs. small), distribution (massed vs. distributed),
variability (variable vs. constant), and schedule (blocked vs. random), as well as
attentional focus (external vs. internal) and target complexity (single vs. complex
or part vs. whole). Feedback conditions include feedback type (knowledge of performance
[KP] vs. knowledge of results [KR]), frequency (frequent vs. reduced), and timing
(immediate vs. delayed) (Schmidt & Lee 2005; also see Bislick et al. 2012; Maas et
al. 2008 for a brief review of the PML in speech). Overall, the extant literature
suggests that these principles promote the acquisition, transfer, and retention of
trained skills when practice consists of a large number of trials and is distributed
over time, when the training stimuli are varied and randomized, and when feedback consists
of KR, is less frequent, and is delayed (e.g., Baddeley & Longman 1978; Park & Shea
2003, 2005; Shea et al. 2000; Wright et al. 2004; Wulf & Schmidt 1997). Schema theory
of motor control (Schmidt 1975; Schmidt & Lee 2005) provides support for the positive
impact of these principles on limb and speech motor learning.
Schema Theory
Schema theory, a prominent theory of motor control and learning, can be used to describe
the process by which the limb and speech motor systems adapt and learn (Schmidt 1975,
2003; Schmidt & Lee 2005). Schema theory provides a framework, encompassing generalized
motor programs (GMPs) and parameters, for the learning and execution of movements.
Movement, as described by schema theory, includes the retrieval and sequencing of
a stored set of generalized motor commands to form motor programs (i.e., GMPs; Keele
1968; Schmidt 1975). GMPs represent the relative timing and force of muscle commands
necessary for carrying out an action for a given class of movement (e.g., throwing
a ball), whereas the parameters assigned to a GMP represent the details of motor execution,
such as the absolute timing, force, and muscle selection (e.g., speed or distance
a ball is thrown). As mentioned earlier, true motor learning occurs when permanent
changes are made to GMPs and/or parameters (Schmidt 1975)—and the PMLs are thought
to facilitate these changes for both limb movement and speech production.
In speech, it is not clear which aspects are considered GMPs and which are considered
parameters (Ballard et al. 2000; Maas et al. 2008). A GMP may represent the motor
commands associated with a phoneme, a syllable, a word, or even a phrase, whereas
speech rate, volume, and precision may be considered the parameters (Bislick et al.
2013; Maas et al. 2008; Varley et al. 2006). The speech difficulties observed in persons
with motor speech disorders (MSDs) can also be described using schema theory. For
example, the speech characteristics of apraxia of speech (AOS), that is, distorted
sounds and sound substitutions, slowed speech rate, and abnormal prosody, are hypothesized
to result from deficits in activating and/or parameterizing GMPs (Ballard et al. 2000;
Clark & Robin 1998). Research suggests that motor programming may also be impaired
in persons with Parkinson's disease (PD) and cerebellar disease (Spencer & Rogers
2005). Specifically, persons with hypokinetic dysarthria from PD demonstrate deficits
in the ability to maintain activation of motor programs and/or quickly switch between
motor programs (Spencer & Rogers 2005), as evidenced by abnormally placed pauses during
speech production and trouble with speech initiation and progression through an utterance.
Speech characteristics observed in persons with ataxic dysarthria due to cerebellar
disease, such as impaired prosody and irregular articulatory breakdown, may be attributed
to problems with activating GMPs prior to the initiation of speech (Spencer & Rogers
2005). Thus, if motor programming is indeed disrupted in these populations, the use
of PML may positively impact rehabilitation outcomes by facilitating learning, retention,
and transfer of trained skills. However, the implementation of PML for speech learning
(and relearning after brain injury) is understudied, particularly in comparison to
the study of limb motor learning.
Many researchers and rehabilitation experts suggest that the key to understanding
how to improve disordered movement can be found through investigations examining how
normal movement is controlled (e.g., Levin & Demers 2021). The investigation of PML
began with neurologically healthy individuals, both young and older adults, and has
been extended to persons with physical and neurological damage. The extant literature
provides strong support for the application of PML to enhance limb motor learning
in neurologically healthy individuals (e.g., Levin & Demers 2021; Schmidt & Bjork
1992; Schmidt & Lee 2011). The benefit for persons with neurological injury is also
supported, yet it is not as straightforward given the heterogeneity in these populations.
Findings suggest that the use of PML has probable benefit to the rehabilitation of
limb movement for populations with neurological injury or disease. In particular,
the application of PML to limb motor learning (or relearning) has shown promising
outcomes in individuals with a history of stroke (Jonsdottir et al. 2010; Molier et
al. 2010; Woldag et al. 2010), traumatic brain injury (Croce et al. 1996), Alzheimer's
disease (AD) (Rice et al. 2008), PD (Onla-or & Winstein 2008), cerebral palsy (Hemayattalab
& Rostami 2010), and developmental delay (Rice & Hernandez 2006). These studies, however,
include small sample sizes, are limited in range of severity (typically mild-moderate),
and do not address all the PML. Thus, while it is accepted that the PML enhance the
training of limb motor skills in healthy individuals and can be extended to some individuals
with neurological damage, continued research is warranted to further explore the benefit
of all PML in populations with neurological injury. It is especially important to
continue these investigations in impaired populations to address the impact of individual
variables that may influence patient performance, such as severity of impairment,
cognitive resources (i.e., attention and memory), and time post-onset of injury or diagnosis.
Importantly, findings from the limb motor learning literature have raised awareness
of the PML across disciplines and led researchers to explore the application of the
PML to speech motor learning in neurologically healthy adults and persons with acquired
AOS and dysarthria.
When applying the PML to speech production, it is important to consider the similarities
and differences of the speech and limb motor systems. As discussed by Bislick and
colleagues (2012), speech articulation is a highly complex and varied motor skill
that is performed at an exceptionally rapid rate, without visual feedback of all the
speech structures, and, unlike some limb movements, speech movements require symmetric
and synchronous movements of bilaterally innervated structures that do not involve
joint action. However, as reported by Weir-Mayta et al. (2019, 2022), the similarities
between the two motor systems in their requirements for movement planning, trajectory,
timing, coordination, sequencing, and biomechanics (Grimme et al. 2011) provide support
for applicability of PMLs to facilitate motor learning in speech as well. These similarities
have motivated the investigation of the application of PMLs to speech in healthy young
adults (Adams & Page 2000; Jones & Croot 2016; Kim et al. 2012; Lowe & Buchwald 2017;
Steinhauer & Grayhack 2000; Scheiner et al. 2014) and healthy older adults (Kaipa et
al. 2017; Weir-Mayta et al. 2019, 2022). Many of these investigations have employed
a foreign language task as the targeted motor skill (e.g., Korean phrases), while
others addressed the modification of speech features (e.g., speech rate, nasality)
or production of novel words (non-words) using combinations of English phonemes. Findings
from most of the studies examining PML in young adults suggest that the application
of PML benefits speech motor learning much as it benefits limb motor learning. Investigations
in older adults have yielded inconsistent results, with some showing outcomes similar
to limb studies and others diverging from them (Weir-Mayta et al. 2022).
Investigation of PML in persons with acquired MSDs, including acquired AOS from stroke
(Austermann Hula et al. 2008; Ballard et al. 2007; Bislick 2020; Bislick et al. 2013,
2014; Katz et al. 2010; Knock et al. 2000; Van der Merwe 2011; Wambaugh et al. 2013,
2014) and hypokinetic dysarthria from PD (Adams et al. 2002; Spielman et al. 2007),
has also been conducted. In particular, a small but growing body of literature, primarily
focused on AOS, suggests that the implementation of the PML during speech training
and/or treatment may enhance (re)learning and retention of trained speech skills (Austermann
Hula et al. 2008; Katz et al. 2010; Knock et al. 2000; Wambaugh et al. 2013, 2014).
In a recent systematic review of the AOS treatment literature, Ballard and colleagues
(2015) identified 14 treatment studies, out of the 26 treatment studies reviewed,
that included PML. Several published studies have explicitly assessed specific PML
in persons with AOS (e.g., Austermann Hula et al. 2008; Bislick et al. 2013; Katz
et al. 2010; Knock et al. 2000; Wambaugh et al. 2013, 2014), while others have incorporated
PML into their treatment protocols (e.g., Bislick 2020; Bislick et al. 2014; Van der
Merwe 2011).
Speakers without Impairment
As discussed by Lowe and Buchwald (2017) and others (e.g., Weir-Mayta et al. 2019),
findings from studies that examine the effects of the structure of practice and/or
nature of feedback on acquisition and retention in speakers without impairment can
inform translation to clinical populations. The benefits of working with neurologically
healthy populations include the opportunity to engage a larger sample size, perform
more group analyses, and collect data from a relatively homogeneous group—in terms
of ability level and typically functioning cognitive processes required for learning.
From here, we can then extend investigation to individuals with neurological impairment,
assess similarities and differences, and better understand how differences within
clinical populations (e.g., attention) may impact performance. A challenge, when working
with neurologically healthy populations, is that participants tend to learn new speech
motor behaviors quickly and with a high degree of accuracy, thereby leading to ceiling
effects which can make it difficult to observe potential effects of an experimental
manipulation (Lowe & Buchwald 2017, p. 1713). It is, therefore, necessary to employ
stimuli that provide enough of a challenge to task the intact speech motor system
(e.g., Lisman & Sadagopan 2013; Sasisekaran et al. 2010). To achieve this goal, studies
have employed the use of novel nonnative stimuli, such as nonnative sounds, words,
and phrases, which can challenge speakers without impairment with novel sequencing
of the articulators, defy common phonotactic principles found in the speaker's native
language, and include nuances in prosodic aspects of speech production. Thus, the
production of novel nonnative stimuli will require more explicit motor learning than
real or nonword stimuli from a speaker's native language (Lowe & Buchwald 2017).
Feedback Type
Past investigations (see Bislick et al. 2012; Maas et al. 2008) have examined the
effects of practice variability, stimulus complexity, attentional focus, and feedback
frequency and timing on speech motor learning in neurologically healthy adults and
persons with MSDs, whereas only one PML study has compared the different types of
feedback information, KP and KR, on speech performance (Ballard et al. 2012). Feedback
type, or the type of augmented feedback provided during limb or speech training/treatment,
typically by a trainer or therapist, is of particular interest as it has important
implications for clinical application. There are two main types of feedback, KP and
KR (Schmidt & Lee 2005; van Vliet & Wulf 2006). Feedback in the form of KP provides
specific information about how a movement should be modified to successfully achieve
the target (e.g., “move your tongue forward”; Lauber & Keller 2014). KP may also include
biofeedback, such as using a mirror to watch the movement or other visual or auditory
feedback about movement accuracy in relation to the target. Thus, there is some variability
in how KP feedback is provided and the amount of detail that can be obtained from
that feedback (e.g., biofeedback vs. auditory instruction, or a combination of the
two). Feedback in the form of KR, however, consists of information about the general
outcome of a movement (e.g., “close, but not quite right,” “You've got it!”) after
a task has been completed and may refer to a deviation from a spatial or temporal
goal (Lauber & Keller 2014; Maas et al. 2008). Although each of these feedback types
can serve as a basis for error correction, the limb motor learning literature indicates
that they may address learning in different ways (Maas et al. 2008; Maier et al. 2019).
Specifically, KP is thought to facilitate learning during the acquisition phase, whereas
KR is thought to enhance retention of trained skills (Schmidt & Lee 2005; Young &
Schmidt 1992). It is important to note, however, that KR is often inherent in KP
feedback. In other words, when KP is provided, general movement accuracy is also conveyed
(Knock et al. 2000).
While KR is often associated with superior performance post-training, results of limb
learning studies that have compared the benefits of KR and KP on the retention of
trained skills are equivocal (Kaipa 2013; Sharma et al. 2016) and suggest that the
influence of feedback type on motor learning may be dependent on the task (Newell
& Carlton 1987; Newell et al. 1987, 1983; Ronsse et al. 2011; Sharma et al. 2016).
As discussed by Sharma and colleagues (2016; see also Newell & Carlton 1987), KP may be superior
to KR when (1) skill execution requires specified movement characteristics (e.g.,
gymnastics); (2) skills that require complex coordination must be improved or corrected
(e.g., playing the saxophone, speech production); (3) the focus is on the movement
or specific muscle activity involved in mastering the skill (e.g., tennis serve);
or (4) KR is redundant with intrinsic (vs. augmented) feedback. In contrast, KR may
be superior to KP when (1) learners use KR to compare with their own intrinsic feedback
about task performance; (2) learners cannot determine the outcome of performing a
skill via intrinsic feedback; (3) KR motivates the learner (especially when KR is
positive; e.g., Saemi et al. 2012); and/or (4) when the goal is to create a discovery
learning practice environment (i.e., trial and error method of skill performance).
Additionally, although yet to be examined systematically, other motor learning variables
may impact the benefit of feedback type on skill learning, for both limb and speech
motor learning. For example, studies have shown interactions between feedback frequency
and practice schedule (e.g., Adams & Page 2000) and feedback frequency and task complexity
(e.g., Sidaway et al. 2012) on (re)learning. Thus, these same variables may also impact
the influence of feedback type on speech (re)learning.
A small number of studies have examined feedback type as it relates to speech motor
learning; however, only one study has compared the effect of different types of feedback
on speech learning. Ballard and colleagues (2012) examined the impact of two feedback
conditions, (1) a combined feedback KR and KP condition (KP in the form of biofeedback
via electropalatography) and (2) a KR only (no KP) condition, on the acquisition and
retention of a trilled Russian /r/ in monolingual, neurologically healthy English
speakers. Participants in each group received a high amount of feedback (on all trials).
Participants in the biofeedback plus KR group received both types of feedback simultaneously,
after every trial. The authors report no between-group differences in accuracy at
1 day post-training (1 day after the last training session); however, the 100% KR-only
group (no KP) demonstrated superior performance at 1 week post-training (1 week after
the last training session) compared to the KR plus KP group. These results suggest
that KR feedback facilitates retention of a trained novel speech task better than
the combined KR and KP (biofeedback) condition. The authors attribute the lower retention
rates in the combined KR and KP group to the continuous delivery of biofeedback (i.e.,
KP) during skill acquisition.
In numerous everyday learning or relearning environments, KP is used over KR when
instructors want to direct the attention of the learner to the essential elements
of the movement pattern or of the context in which that pattern occurs (Nunes et al.
2014). In clinical practice, professionals often take the approach of providing more
detailed feedback early on during therapy, while the learner develops an understanding
of task expectations. As therapy progresses, feedback often becomes less detailed,
more like KR, as the learner is encouraged to self-monitor, identify, and self-correct
errors that occur. Given their common use, further research comparing the effects
of KP and KR feedback is critical to better understand the influence of these two
feedback types on different aspects of speech learning. Specifically, examining KP
feedback and KR feedback independently, and in sequential application, starting with
KP and moving to KR (with no KP), may provide insight as to which type of augmented
feedback, or a combination of the two, is most beneficial for the learning and rehabilitation
of speech skills.
The primary aim of this pilot study was to examine the effect of three feedback conditions
on novel speech learning in neurologically healthy adults, as measured by listener
ratings of intelligibility, precision, and naturalness at 1 day and 1 week post-training.
The three feedback conditions consisted of (1) KP only, (2) KR only, and (3) a combined
condition (KP + KR), moving from KP initially to only KR. Given that KP and KR
are thought to assist learning in different ways (KP supporting skill acquisition and KR
supporting skill retention; Schmidt & Lee 2005), we predicted that speech learning,
and therefore listeners' perception of the participant's speech, would be enhanced
in the combined KP + KR condition. This study serves as pilot work to inform procedures
for future investigations with older adults in a larger sample size. The long-term
goal of this work is to inform the use of different feedback types to assist with
speech rehabilitation in speakers with MSDs.
Method
This research received ethics approval from the Institutional Review Board of the
University of Central Florida.
Participants
Twenty-four neurologically healthy female college students participated in this study.
Participants were included in the study if they met the following criteria: (1) monolingual
English speakers; (2) 18 to 40 years old; (3) passed an audiometric pure-tone, air-conduction
screening at 25-dB HL at 500, 1,000, 2,000 and 4,000 Hz in both ears; and (4) performed
within normal limits on the Montreal Cognitive Assessment (MOCA; version 7.1; Nasreddine
et al. 2005). Participants were excluded from the study if they had a positive history
of developmental or acquired communication impairment and/or previous exposure to
the Hindi language; each was determined via participant self-report. Please see Table 1 for
participant demographic information organized by group.
Table 1
Participant demographics
| Pt. ID | Group | Age | MOCA |
| 7 | KP | 20 | 29 |
| 8 | KP | 21 | 29 |
| 9 | KP | 19 | 27 |
| 10 | KP | 22 | 26 |
| 17 | KP | 22 | 29 |
| 20 | KP | 20 | 27 |
| 23 | KP | 21 | 29 |
| 26 | KP | 23 | 29 |
| 1 | KR | 19 | 29 |
| 2 | KR | 21 | 27 |
| 3 | KR | 40 | 30 |
| 6 | KR | 20 | 27 |
| 14 | KR | 20 | 30 |
| 19 | KR | 22 | 27 |
| 21 | KR | 21 | 28 |
| 25 | KR | 20 | 30 |
| 12 | KP + KR | 22 | 30 |
| 13 | KP + KR | 20 | 29 |
| 15 | KP + KR | 19 | 27 |
| 16 | KP + KR | 24 | 30 |
| 18 | KP + KR | 22 | 30 |
| 22 | KP + KR | 20 | 27 |
| 24 | KP + KR | 21 | 28 |
| 27 | KP + KR | 19 | 28 |
|  | Mean (SD) | 21.70 (4.19) | 28.42 (1.28) |
Note: MOCA, Montreal Cognitive Assessment (Nasreddine et al. 2005); MOCA scores of
26 and up are considered within normal limits. Pt., participant; SD, standard deviation.
The bottom row of the table gives the means and standard deviations for
participant age and performance on the MOCA.
Twenty native-Hindi speakers, blind to study conditions, acted as expert raters and
judged participants' productions of the trained stimuli, at 1 day and 1 week post-training,
using measures of intelligibility, precision, and naturalness via a 7-point rating
scale. Raters consisted of 9 males and 11 females and ranged in age from 20 to 48
years (M = 33 years; standard deviation [SD] = 11.21) and had 14 to 20 years of education
(M = 16; SD = 3.01). Included raters spoke the same Hindi dialect as that taught to
the study participants (also spoken by the second author) and reported adequate hearing
acuity, no history of developmental or acquired communication impairments, and access
to reliable internet and computer audio. Details regarding the raters' scoring procedures
are described in the “Data Analysis” section.
Design
In the context of an experimental group design, across three sessions, the influence
of feedback type on the intelligibility, precision, and naturalness of a novel speech
task was explored at 1 day and 1 week post-training. This design replicated that of
Kim and colleagues (2012). The 24 participants were randomly assigned to one of three
feedback groups: KP-only group (n = 8), KR-only group (n = 8), and KP + KR group (n = 8). All participants completed one 1-hour training session, followed by two testing
sessions. Testing occurred at 1 day post-training and 1 week post-training. This training
and testing schedule followed that of previous studies examining the application of
PML to limb and speech learning (e.g., Anderson et al. 2001; Bislick et al. 2013; Kim
et al. 2012; Schmidt & Bjork 1992; Weir-Mayta et al. 2019). All sessions were audio
recorded and assessed for fidelity.
Stimuli
The training stimuli consisted of 10 Hindi phrases, ranging from three to four words.
The phrases varied in frequency of occurrence and phrase type. For example, a highly
frequent phrase trained in this study included the question “How are you today?” (आप
कैसे हो)/ɑp kæse hθ/. A less frequent phrase trained in this study was the statement
“Beauty of nature” (प्रकृति की सुंदरता)/prək̮rʌtɪ kɪ sʊnd̪ərʌt̪ɑ/. Trained phrases
contained a spectrum of phonetic sounds and sound combinations native to the Hindi
language (Samudravijaya et al. 2000). As previously mentioned, using a foreign language,
rather than participants' native language or sounds native to participants, can increase
the complexity of the motor task, and help facilitate new learning. Since the focus
of this study was on speech, not language, the meaning of the phrases was not shared
with the participants until the study had been completed.
Procedures
All participants were administered the MOCA (Nasreddine et al. 2005) and had their
hearing screened within 2 weeks prior to the start of the training. The assessment,
training, and two testing sessions took place in a quiet office space at the University
of Central Florida. During the training session, 10 Hindi phrases were verbally presented
to each participant by the primary research assistant, a native-Hindi speaker (second
author and second-year graduate student of speech-language pathology). The same native-Hindi
speaker worked with each participant for all three sessions to maintain consistency
in the delivery of stimuli. The Hindi phrases were presented live to simulate clinical
practice and allow the second author to accurately implement each unique protocol
and respond appropriately to the participant's responses. A written procedural protocol,
specific to each feedback condition, was used in every training session to ensure
fidelity of the training protocol for each participant and feedback condition. Participants
were instructed to “repeat each phrase after the model as accurately as possible.”
All participants completed 100 trials. Phrases were pseudorandomized into blocks
of 10, for a total of 10 productions of each phrase during the training session. Regardless
of feedback group, all participants received low-frequency feedback, 20%, during the
training phase, following that of previous studies on speech motor learning (Adams
& Page 2000; Adams et al. 2002). Order of stimuli presentation was randomized within
and across conditions for each participant at all three timepoints (training, 1-day
testing, 1-week testing). Participants were requested to repeat each phrase after
the second author provided a model of the target phrase. More specifically, the second
author would provide a model and the participant would follow with one repetition
of the target. Stimuli were presented in serial fashion, with a brief pause after
each production, unless feedback was being provided. See below for more details regarding
feedback.
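The trial structure described above (100 trials, each of the 10 phrases produced 10 times in blocks of 10, with feedback on 20% of trials) can be sketched as follows. This is an illustration only, not the authors' procedure: the text does not state how feedback trials were selected, so the sketch assumes two randomly placed feedback trials per block, and the name `build_schedule` and the integer phrase IDs are hypothetical.

```python
import random

N_PHRASES = 10       # phrases trained (from the text)
N_BLOCKS = 10        # blocks of 10 trials each (from the text)
FEEDBACK_RATE = 0.2  # feedback on 20% of trials (from the text)


def build_schedule(seed=None):
    """Return a list of 100 (phrase_id, gets_feedback) tuples.

    Assumption: every block contains each phrase once in shuffled order,
    and feedback falls on two randomly chosen trials per block. The
    source does not specify how feedback trials were chosen.
    """
    rng = random.Random(seed)
    per_block = round(N_PHRASES * FEEDBACK_RATE)  # 2 feedback trials per block
    schedule = []
    for _ in range(N_BLOCKS):
        order = list(range(1, N_PHRASES + 1))
        rng.shuffle(order)
        feedback_slots = set(rng.sample(range(N_PHRASES), per_block))
        for slot, phrase_id in enumerate(order):
            schedule.append((phrase_id, slot in feedback_slots))
    return schedule
```

Under these assumptions, any schedule produced contains 100 trials, 20 of which carry feedback, and each phrase appears exactly 10 times.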
Knowledge of Performance Group
During the training phase, the eight participants in this group received KP feedback on 20% of trials.
Specifically, the second author used pictures of the mouth and articulators to provide
detailed feedback about placement of the articulators, along with verbal instruction, such
as “Your tongue needs to go here to make that sound” or “The nasality of that sound
was very good.”
Knowledge of Results Group
During the training phase, the eight participants in this group received KR (no KP)
feedback on 20% of trials. In particular, the second author provided general feedback
about production accuracy, such as “You're doing great,” “We sound the same,” or “That
is not quite right.”
KP + KR Group
During the training phase, the eight participants in this group received KP + KR feedback
on 20% of trials. In this condition, the first 50 trials practiced received KP feedback
and the second 50 trials received KR (no KP) feedback. The same phrases were practiced
in each condition (5 trials of each of the 10 phrases in each feedback condition,
for a total of 10 trials of each phrase). The order of type of feedback, KP first
and then KR, was chosen for two reasons. First, as discussed in the “Introduction,”
there is theoretical support that KP is most beneficial during the initial learning
of a task, whereas KR promotes retention and generalization. Second, in clinical practice,
clinicians often begin with more detailed feedback and then fade that level of detail
away, as the task becomes known, to encourage self-monitoring.
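The three feedback conditions differ only in which type of feedback a feedback trial carries. A minimal sketch of that rule is given below; the function name `feedback_type` and the group labels used as strings are hypothetical, and the split point for the combined group follows the description above (KP for the first 50 trials, KR for the second 50).

```python
def feedback_type(group, trial_number, total_trials=100):
    """Return the feedback type used if this trial is a feedback trial.

    group: "KP", "KR", or "KP+KR" (labels taken from the text).
    trial_number: 1-based position within the training session.
    """
    if not 1 <= trial_number <= total_trials:
        raise ValueError("trial_number out of range")
    if group == "KP":
        return "KP"
    if group == "KR":
        return "KR"
    if group == "KP+KR":
        # First half of the session uses KP, second half KR (no KP).
        return "KP" if trial_number <= total_trials // 2 else "KR"
    raise ValueError(f"unknown group: {group}")
```

For example, in the combined group trial 50 would carry KP feedback and trial 51 would carry KR feedback, whereas the KP-only and KR-only groups keep a single type throughout.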
Testing
During the 1-day and 1-week post-training testing sessions, the same 10 Hindi utterances used
during the training session were verbally presented by the second author. Stimulus
presentation was again randomized within and across conditions and phases. Participants
were asked to repeat the utterances after the model provided by the second author
as accurately as possible; a delay was not imposed on the speaker. This method of
eliciting a speech sample and assessing speech motor learning is consistent with speech
motor learning studies in neurologically healthy speakers and speakers with MSDs and
is one method in which speech learning is measured in clinical practice. Feedback
was not provided to the participants during the testing sessions. All phrases produced
by the participants during the testing phase were recorded using an Olympus digital
voice recorder (WS-852) with an external headset microphone (Audio-Technica ATM75).
Data Analysis
A total of 480 de-identified audio recordings of participant productions of trained
target phrases from 1-day and 1-week testing were randomized within and across conditions
and presented to the 20 expert listeners (i.e., Hindi-native raters) via Qualtrics
software (https://www.qualtrics.com), an online survey platform. The recorded phrases, the participants' productions
only, were auditorily presented along with the associated written target phrase, written
in both English and Hindi. Prior to listening and scoring the audio recordings of
participant responses, raters were provided with written and auditory instructions
for how to complete the ratings, as well as three practice trials. Feedback was not
provided to the raters about their selections. Once the raters completed the practice
trials they were permitted to move on to the experimental stimuli and complete the
study. Presentation of stimuli was randomized within Qualtrics. Given the large number
of stimuli and to avoid listener fatigue, the data were presented to listeners via
two separate Qualtrics surveys that could be completed at different timepoints. Raters
were asked to complete the two surveys within a 72-hour period and were encouraged
to only complete ratings in a quiet environment when they were able to focus. Raters
were provided with contact information and encouraged to contact the authors if they
had questions or difficulty with the protocol. Following the procedures of Kim et
al. (2012), who also assessed the impact of specific PML on listener ratings of a
novel speech task, raters in the current study were asked to judge each recording
on three constructs (intelligibility, naturalness, and precision) using a 7-point rating
scale. Each scale offered seven different options to choose from, ranging from low
to high performance (e.g., 1 = very unnatural; 7 = very natural). The three constructs
were defined and described to each rater as follows:
- Intelligibility: Intelligibility was defined as how clearly a person speaks so that his or her speech
is comprehensible to the listener (Leddy 1999). Raters were asked to rate intelligibility
based on the degree to which they understood the speaker (Duffy 2013, p. 78).
- Precision: Precision, referring to articulatory precision, was defined as how clearly a person
articulates their spoken productions (Lubold et al. 2019). Raters were asked to rate
precision based on the degree to which the speaker accurately produced the sounds
in the words.
- Naturalness: Naturalness was defined as how one's speech conformed to the “listener's standard
of rate, rhythm, intonation, and stress patterning, and if it conforms to the syntactic
structure of the utterance being produced” (Yorkston et al. 1999, p. 464). Raters
were asked to rate naturalness based on the degree to which the speaker sounded like
a native Hindi speaker.
Procedural Fidelity
To assess fidelity of the training phase for the three feedback conditions, a trained
research assistant listened to each of the recorded training sessions offline, using
the written protocol for each condition as a checklist. No deviations from the planned
protocol for each feedback condition were observed.
Reliability
Cronbach's coefficient alpha was run to determine the internal consistency of ratings
across the three listener rating scales. Values for Cronbach's coefficient alpha for
the listener ratings across the 24 participants for each speech-dependent variable
(intelligibility, precision, naturalness) were ≥ 0.93 at 1 day and ≥ 0.86 at 1 week
post-training. These scores indicate a high level of internal consistency within raters
for the three scales with this specific sample at these two timepoints. Kendall's
coefficient of concordance, W (Gibbons & Chakraborti 2011; Laerd Statistics 2016), was run for > 30% of the data
to determine inter-rater agreement on ratings of intelligibility, precision, and naturalness
for the 20 participants. Raters showed statistically significant agreement in their ratings,
W = 0.55, p < 0.001.
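As a rough illustration of the two reliability statistics named above, the following Python sketch computes Cronbach's coefficient alpha and Kendall's W from first principles. The ratings are invented for illustration and are not the study data, the function names are hypothetical, and the W formula shown omits the correction for tied ranks that statistical packages typically apply.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha; each row holds one recording's scores on the k scales."""
    k = len(rows[0])
    item_vars = [variance([row[i] for row in rows]) for i in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover every value tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks

def kendalls_w(ratings_by_rater):
    """Kendall's W without tie correction; one list of ratings per rater."""
    m = len(ratings_by_rater)        # number of raters
    n = len(ratings_by_rater[0])     # number of rated recordings
    ranked = [average_ranks(r) for r in ratings_by_rater]
    rank_sums = [sum(r[i] for r in ranked) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical 7-point ratings (NOT the study data).
# Each tuple: (intelligibility, precision, naturalness) for one recording.
scale_scores = [(5, 5, 4), (6, 6, 5), (3, 4, 3), (7, 6, 6), (4, 4, 4)]
# Each list: one rater's scores for the same five recordings.
rater_scores = [[5, 6, 3, 7, 4], [4, 6, 3, 7, 5], [5, 7, 2, 6, 4]]

print(f"alpha = {cronbach_alpha(scale_scores):.3f}")
print(f"W = {kendalls_w(rater_scores):.3f}")
```

Alpha summarizes how consistently the three scales move together, whereas W summarizes how similarly different raters rank the same recordings, which is why the two statistics are reported separately.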
Statistical Analysis
Two-way repeated measures analyses of variance (ANOVAs) were run to assess the impact
of feedback type (KP group, KR group, or KP + KR group) on ratings of intelligibility,
precision, and naturalness for the two timepoints, 1 day and 1 week post-training.
Post hoc multiple comparisons with a Bonferroni correction were conducted to further
examine significant findings.
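With three feedback groups there are three pairwise comparisons, so a Bonferroni correction multiplies each unadjusted p-value by 3 and caps the result at 1 (equivalently, each unadjusted p-value is compared with 0.05/3). The Python sketch below illustrates the arithmetic only; the p-values are made up and the function name is hypothetical.

```python
def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1.0."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical unadjusted p-values for the three pairwise comparisons
# (NOT values from the study).
comparisons = ["KP vs KR", "KP vs KP + KR", "KP + KR vs KR"]
raw_p = [0.0004, 0.031, 0.41]

for label, p_adj in zip(comparisons, bonferroni_adjust(raw_p)):
    verdict = "significant" if p_adj < 0.05 else "not significant"
    print(f"{label}: adjusted p = {p_adj:.4f} ({verdict})")
```

The cap explains why an adjusted p-value can be reported as exactly 1.000: any unadjusted value above one third reaches the ceiling after multiplication by 3.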
Results
All 24 participants completed a 1-day training session, followed by a testing session
at 1 day and 1 week post-training. A total of 480 de-identified speech samples, 10
from each participant for each retention testing phase, were used to examine the effect
of feedback condition on learning and retention of a novel speech task. Twenty native
Hindi-speaking raters judged the quality of all 480 recordings using the three rating
scales for intelligibility, precision, and naturalness.
Training Data
While the purpose of this study was not to assess participant performance during the
training session, it is important to demonstrate that there was an impact of training
on initial acquisition during the training session. Changes in performance during
the training session were determined offline by a native Hindi speaker, blind to the
study conditions. Specifically, participant productions were transcribed and scored
for accuracy of word production offline. For each participant, percent accuracy of
the first 10 productions was compared to percent accuracy of the last 10 productions
([Fig. 1]). All participants demonstrated positive changes in word accuracy during their training
session. Participants in the KP group demonstrated a large effect of immediate training
(d = 2.82) when comparing accuracy of initial productions (M = 45.16%; SD = 14.43) to
accuracy of final productions (M = 82.66%; SD = 12.06), a difference of 37.50% accuracy.
Participants in the KR group demonstrated a large effect of training (d = 2.13) when comparing accuracy of initial productions (M = 54.30%; SD = 12.31) to
accuracy of final productions (M = 78.50%; SD = 10.34), a difference of 24.19% accuracy.
Finally, participants in the KP + KR group demonstrated a large effect of training
(d = 2.44) when comparing accuracy of initial productions (M = 52.42%; SD = 10.73) to
accuracy of final productions (M = 80.65%; SD = 12.31), a difference of 28.23% accuracy.
These data indicate that participants across all three groups showed improvement in
the accuracy of their productions of the trained Hindi phrases from the beginning
to the end of the training session. The main findings, reported later, focus on the
effects of feedback type on the perceptions of native speakers of Hindi at 1 day and
1 week post-training.
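The effect sizes for the training data can be checked from the reported means and standard deviations. The Python sketch below assumes that d was computed as the mean difference divided by the root mean square of the two standard deviations; the text does not state the formula, so this is an assumption, although it does reproduce the three reported values to two decimals. The function name is hypothetical.

```python
from math import sqrt

def cohens_d(mean_first, sd_first, mean_last, sd_last):
    """Cohen's d with the pooled SD taken as the root mean square of both SDs."""
    pooled_sd = sqrt((sd_first ** 2 + sd_last ** 2) / 2)
    return (mean_last - mean_first) / pooled_sd

# Percent accuracy for the first and last 10 productions, as reported in the
# text: (M first, SD first, M last, SD last).
groups = {
    "KP": (45.16, 14.43, 82.66, 12.06),
    "KR": (54.30, 12.31, 78.50, 10.34),
    "KP + KR": (52.42, 10.73, 80.65, 12.31),
}

for name, stats in groups.items():
    print(f"{name}: d = {cohens_d(*stats):.2f}")
```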
Figure 1 Participant accuracy during training session: comparing the first 10 productions
to the last 10 productions.
Results of Listener Ratings
Two-way repeated measures ANOVAs were run to compare the effect of feedback type
(KP, KR, and KP + KR) on listener ratings of intelligibility, precision, and naturalness
across two timepoints, 1 day and 1 week post-training. Please see [Fig. 2] for the group statistics obtained for the dependent variables (intelligibility,
precision, and naturalness) across the three feedback conditions and assessment timepoints.
Figure 2 Listener ratings for 1 day and 1 week post-training across feedback type. *p < 0.05, **p < 0.005, ***p < 0.0005.
Intelligibility
A two-way repeated measures ANOVA was run to determine the effect of different feedback
conditions on listener ratings of intelligibility at two timepoints. Analysis of the
studentized residuals showed that there was normality for the KR and KP conditions
at both timepoints (p > 0.05), but not the KP + KR condition (p = 0.026), as assessed by the Shapiro–Wilk test of normality. There were no outliers
for all feedback conditions at both timepoints, as assessed by no studentized residuals
greater than ± 3 SDs. The assumption of sphericity was violated, as assessed by Mauchly's
test of sphericity, χ²(2) = 15.023, p = 0.001. Therefore, the Greenhouse–Geisser correction was applied (ε = 0.851). There was a statistically significant interaction between feedback type
and time on listener ratings of intelligibility, F(1.702, 134.447) = 3.658, p = 0.035. Therefore, simple main effects were run to further examine effects of feedback
type on listener ratings at 1 day and 1 week post-training.
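The fractional degrees of freedom that accompany a Greenhouse–Geisser correction come from multiplying the uncorrected degrees of freedom by ε. The sketch below assumes uncorrected values of 2 and 158 for the feedback type × time interaction, which are the values reported for the precision analysis, where sphericity was met. The function name is hypothetical, and the result may differ from the reported degrees of freedom in the later decimals, presumably because ε is reported to only three places.

```python
def gg_corrected_df(df_effect, df_error, epsilon):
    """Scale both degrees of freedom by the Greenhouse-Geisser epsilon."""
    return df_effect * epsilon, df_error * epsilon

# Assumed uncorrected df for the interaction: (3 - 1) * (2 - 1) = 2 and 158.
df_effect, df_error = gg_corrected_df(2, 158, epsilon=0.851)
print(f"F({df_effect:.3f}, {df_error:.3f})")
```

Because ε is at most 1, the correction can only shrink the degrees of freedom, which makes the F test more conservative when sphericity is violated.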
One day post-training. Statistically significant differences in mean intelligibility ratings were found
for feedback type at 1 day post-training, F(1.835, 144.943) = 15.946, p < 0.0005. Post hoc analysis with a Bonferroni adjustment revealed that intelligibility
ratings were statistically significantly higher for KP compared to KR (0.630 [95%
CI, 0.318–0.942], p < 0.0005) and KP + KR compared to KR (0.413 [95% CI, 0.138–0.687], p = 0.001), but not KP compared to KP + KR (0.218 [95% CI, −0.22 to 0.457], p = 0.088).
One week post-training. Statistically significant differences in mean intelligibility ratings were found
for feedback type at 1 week post-training, F(1.298, 11.663) = 26.94, p < 0.0005. Post hoc analysis with a Bonferroni adjustment revealed that intelligibility
ratings were statistically significantly higher for KP compared to KR (0.616 [95%
CI, 0.356–0.876], p < 0.0005) and KP compared to KP + KR (0.448 [95% CI, 0.166–0.730], p = 0.001), but not KP + KR compared to KR (0.168 [95% CI, −0.091 to 0.427], p = 0.348).
Timepoint. Finally, statistically significant differences in mean intelligibility ratings were
found for time for the KP + KR condition only, F(1, 79) = 6.835, p = 0.011. Mean intelligibility ratings were 0.188 (95% CI, 0.045–0.332) higher at 1
day compared to 1 week post-training. For the KR condition, mean intelligibility ratings
were 0.056 (95% CI, −0.209 to 0.096) lower at 1 day post-training as opposed to 1
week post-training, a difference that was not statistically significant, F(1, 79) = 0.541, p = 0.464. For the KP condition, mean intelligibility ratings were 0.042 (95% CI, −0.191
to 0.06) lower at 1 day post-training as opposed to 1 week post-training, a difference
that was not statistically significant, F(1, 79) = 0.320, p = 0.573.
Precision
A two-way repeated measures ANOVA was run to determine the effect of different feedback
conditions on listener ratings of precision at two timepoints. Analysis of the studentized
residuals showed that there was normality for all conditions at both timepoints (p > 0.05) except for the KP condition at 1 week post-training (p = 0.013), as assessed by the Shapiro–Wilk test of normality. There were no outliers
for all feedback conditions at both timepoints, as assessed by no studentized residuals
greater than ± 3 SDs. The assumption of sphericity was met, as assessed by Mauchly's
test of sphericity, χ²(2) = 4.235, p = 0.120. There was a statistically significant interaction between feedback type
and time on listener ratings of precision, F(2, 158) = 3.658, p = 0.028. Therefore, simple main effects were run to further examine effects of feedback
type on listener ratings at 1 day and 1 week post-training.
One day post-training. Statistically significant differences in mean precision ratings were found for feedback
type at 1 day post-training, F(1.826, 144.255) = 10.834, p < 0.0005. Post hoc analysis with a Bonferroni adjustment revealed that listener ratings
of precision were statistically significantly higher for KP compared to KR (0.539
[95% CI, 0.206–0.872], p < 0.0005) and KP + KR compared to KR (0.383 [95% CI, 0.111–0.654], p = 0.003), but not KP compared to KP + KR (0.156 [95% CI, −0.108 to 0.420], p = 0.465).
One week post-training. Statistically significant differences in mean precision ratings were found for feedback
type at 1 week post-training, F(2, 158) = 12.459, p < 0.0005. Post hoc analysis with a Bonferroni adjustment revealed that ratings of
precision were statistically significantly higher for KP compared to KR (0.592 [95%
CI, 0.301–0.883], p < 0.0005) and KP compared to KP + KR (0.384 [95% CI, 0.060–0.708], p = 0.014), but not KP + KR compared to KR (0.208 [95% CI, −0.057 to 0.473], p = 0.177).
Timepoint. Finally, mean precision ratings did not differ statistically significantly
between 1 day and 1 week post-training for any feedback condition.
For the KR condition, mean precision ratings were 0.044 (95% CI, −0.197 to 0.109)
lower at 1 day post-training as opposed to 1 week post-training, F(1, 79) = 0.324, p = 0.571. For the KP condition, mean precision ratings were 0.097 (95% CI, −0.244
to 0.050) lower at 1 day post-training as opposed to 1 week post-training, F(1, 79) = 0.374, p = 0.573. For the KP + KR condition, mean precision ratings were 0.131 (95% CI, −0.011
to 0.273) higher at 1 day post-training as opposed to 1 week post-training, F(1, 79) = 0.688, p = 0.069.
Naturalness
A two-way repeated measures ANOVA was run to determine the effect of different feedback
conditions on listener ratings of naturalness at two timepoints. Analysis of the studentized
residuals showed that there was normality for all conditions at both timepoints (p > 0.05) except for the KP condition at 1 week post-training (p = 0.008), as assessed by the Shapiro–Wilk test of normality. There were no outliers
for all feedback conditions at both timepoints, as assessed by no studentized residuals
greater than ± 3 SDs. The assumption of sphericity was violated, as assessed by Mauchly's
test of sphericity, χ²(2) = 23.025, p = 0.0005. Therefore, the Greenhouse–Geisser correction was applied (ε = 0.796). There
was a statistically significant interaction between feedback type and time on listener
ratings of naturalness, F(1.593, 125.835) = 18.546, p = 0.0005. Therefore, simple main effects were run to further examine effects of feedback
type on listener ratings at 1 day and 1 week post-training.
One day post-training. Statistically significant differences in mean naturalness ratings were found for
feedback type at 1 day post-training, F(2, 158) = 13.181, p = 0.0005. Post hoc analysis with a Bonferroni adjustment revealed that listener ratings
of naturalness were statistically significantly higher for KP compared to KR (0.511
[95% CI, 0.239–0.783], p < 0.0005) and KP + KR compared to KR (0.324 [95% CI, 0.104–0.544], p = 0.002), but not KP compared to KP + KR (0.188 [95% CI, −0.057 to 0.432], p = 0.194).
One week post-training. Statistically significant differences in mean naturalness ratings were found for
feedback type at 1 week post-training, F(2, 158) = 13.502, p < 0.0005. Post hoc analysis with a Bonferroni adjustment revealed that ratings of
naturalness were statistically significantly higher for KP compared to KR (0.557 [95%
CI, 0.272–0.842], p < 0.0005) and KP compared to KP + KR (0.482 [95% CI, 0.169–0.794], p = 0.001), but not KP + KR compared to KR (0.075 [95% CI, −0.177 to 0.327], p = 1.000).
Timepoint. Finally, statistically significant differences in mean naturalness ratings for time
were found for the KR and KP conditions. For the KR condition, mean naturalness ratings
were 0.280 (95% CI, −0.429 to −0.131) lower at 1 day compared to 1 week post-training,
F(1, 79) = 13.986, p = 0.0005. For the KP condition, mean naturalness ratings were 0.325 (95% CI, −0.460
to −0.191) lower at 1 day post-training as opposed to 1 week post-training, F(1, 79) = 23.194, p = 0.0005. For the KP + KR condition, mean naturalness ratings were 0.031 (95% CI,
−0.176 to 0.113) lower at 1 day post-training as opposed to 1 week post-training,
a difference that was not statistically significant, F(1, 79) = 0.187, p = 0.667.
Discussion
This pilot study sought to examine the effect of three feedback conditions on novel
speech learning in neurologically healthy adults, as measured by listener ratings
of intelligibility, precision, and naturalness at 1 day and 1 week post-training.
This work serves as a foundation for future investigations with larger and more diverse
samples of participants that vary in age and neurological diagnosis, to further investigate
the benefits of feedback type during speech learning. In general, the results of this
investigation suggest the type of feedback provided during a 1-hour training session
for a novel speech task (Hindi phrases) may influence listener ratings of intelligibility,
precision, and naturalness at 1 day and 1 week post-training. On average, listener
ratings were highest across all three perceptual scales and at both timepoints for
the group that received KP feedback, whereas listener ratings were lowest for the
group that received KR feedback. Relative to listener ratings for the KP group, listener
ratings for the KP + KR group were more variable across timepoints. These findings
are discussed in more depth below.
Influence of Feedback Type on Listener Ratings
One day post–speech training. Listener ratings at 1 day post-training suggest that KP and KP + KR feedback are
superior to KR feedback in promoting intelligibility, precision, and naturalness of
trained speech skills in college-aged neurologically healthy adults when learning
a novel speech task. Listener ratings for individuals in the KP condition were, on
average, higher than those for individuals in the KP + KR condition. These results
are partially consistent with our hypothesis, which predicted that the combined feedback
condition would enhance performance compared to the other conditions. These findings
may suggest that KP is an essential component in training complex novel speech tasks
to young healthy adults, especially in the earlier stages of learning. As discussed
previously, the nature of the task may impact the benefit of feedback type. Researchers
conducting limb motor learning studies have identified guidelines thought to influence
the benefit of one feedback type over another (Magill 2004; Sharma et al. 2016). These
guidelines may also apply to speech motor learning (Kaipa 2013; Maas et al. 2008).
In the current study, the task required specified movement characteristics (e.g.,
production of new patterns of resonance with new and known articulatory postures)
and complex coordination of the articulators and subsystems involved in speech production.
According to the guidelines, these specified and complex task characteristics warrant
KP over KR (Sharma et al. 2016). Furthermore, KP has proven to be superior to KR when
the goal of the task is unknown (Newell et al. 1990). In this study, learners did
not have a reliable internal representation of the movement goal because the task
was novel, and, therefore, could not use KR to compare with their own intrinsic feedback
or determine the outcome of their performance independently.
The results of this study differ from those of Ballard and colleagues (2012), which
showed greater benefit of KR feedback compared to a combined KR and KP approach. The
current study differs from that of Ballard and colleagues (2012) in several ways, including
sequential versus simultaneous KP + KR feedback, low (20%) versus high (100%) feedback
frequency, clinician-delivered feedback versus biofeedback, and differing levels of
task complexity (Hindi phrases vs. trilled Russian /r/). In the current study, we
were specifically interested in examining each type of feedback on its own (KP only
and KR only) and a combined condition thought to be more reflective of clinical care
(sequential delivery, first KP and then KR). We chose to incorporate low-frequency
feedback as lower rates of feedback have been shown to facilitate learning relative
to higher rates of feedback (Maas et al. 2008; Schmidt & Lee 2005). Additionally,
there is evidence to suggest that high-frequency feedback may negatively impact the
benefits of feedback type (Wulf et al. 2002). The type of KP feedback also differed
between the two studies, with the current study providing clinician-delivered KP feedback
with verbal instruction and pictures of the articulators, and the study by Ballard
and colleagues (2012) using biofeedback via electropalatography. While clinician-delivered
feedback can address speech errors or the inaccurate movements of the articulators
(e.g., “your lips need to come together to make that sound”), tongue positions and
movements, which are not highly visible, can be challenging to verbally cue and describe.
Therefore, KP in the form of visual feedback of one's own tongue positions and movements
has the potential to bolster motor learning (Preston et al. 2013). Finally, the training
task employed in this study was complex and may have driven the need for KP feedback,
rather than KR. In contrast, the training task in the study of Ballard et al. (2012)
was relatively less complex, and likely did not require the same level of detailed
instruction to promote learning. The combination of task complexity, frequency of
feedback, and type of KP in the study of Ballard et al. (2012) may have negatively
impacted learning. Biofeedback may be more appropriate when training complex tasks;
however, research is needed to better understand when biofeedback may be preferable to
clinician-delivered feedback to promote speech learning.
One-week post–speech training. Listener ratings at 1 week post-training indicate that KP feedback remained superior
to KP + KR and KR feedback in promoting intelligibility, precision, and naturalness
of a novel speech task. These findings support the idea that the benefit of feedback
type is determined by the nature of the task, and further suggest that low-frequency
KP feedback may be superior to other feedback types when the task is complex and the
training phase is short. In the current study, the novelty and complexity of the task
coupled with the short training phase most likely increased reliance on more detailed
feedback. Future work should examine the benefit of feedback type when implementing
a performance criterion (e.g., training to 80% accuracy).
Timepoint
The results by time of testing show a similar pattern of benefit of feedback condition
across the two timepoints. On average, however, ratings increased from 1-day to 1-week
testing for both the KP and KR conditions, although the increase was statistically
significant only for the naturalness measure. The motor learning literature suggests
that consolidation of trained skills occurs during periods of rest (Robertson et al.
2004; Schmidt & Lee 2005). Thus, the slight increase in listener ratings may indicate
improvements in aspects of nonnative speakers' speech abilities from 1-day to 1-week
testing. In the KP + KR feedback condition, a statistically significant difference
was found for listener ratings of intelligibility. In contrast to the other two groups,
ratings were higher at 1 day post-training compared to 1 week for the combined feedback
group. A similar pattern was observed for listener ratings of precision; however,
this finding was not statistically significant. Overall, the pattern of results for
the KP + KR feedback condition differed from that for the KP and KR feedback conditions and
does not support the hypothesis that KP + KR would enhance learning at 1 week post-training.
As mentioned earlier, the results of the KP + KR condition may indicate that more
of only one type of training was needed during the short training phase.
Considerations for MSDs. While this study included neurologically healthy adults without MSDs, this work may
have implications for clinical practice with individuals with MSDs. Our findings suggest
that KP is an important initial step in training a novel speech task; this may also
suggest that KP is an important initial step in the speech rehabilitation process,
especially if the speech impairment results from damage to the motor plans for speech
production (e.g., acquired AOS; Van der Merwe 2009) and/or impaired auditory or perceptual
feedback (e.g., hypokinetic dysarthria; Mollaei et al. 2016). Furthermore, KP may
be an important component in speech learning in children with developmental MSDs,
such as childhood AOS or dysarthria resulting from cerebral palsy. For example, McKechnie
and colleagues (2020) found that a group of speakers with CAS who received KP showed
a significant change from pre- to immediately post-treatment compared to those who
received KR; however, the group differences dissipated by 1-month post-treatment.
Regarding task complexity, our findings may suggest that adults and children with
MSDs rely on KP more frequently or for a longer period if the speech task is complex
(e.g., multisyllabic words with clusters or phrase production) compared to more simplified
tasks (e.g., syllable production). Similarly, persons with MSDs with more severe deficits
may also rely on KP for a longer period (prior to moving on to KR) than those with
milder deficits. Continued research is needed to better understand the benefit of
PML on treatment outcomes for persons with MSDs and the impact of individual (e.g.,
severity) and task (e.g., complexity) factors on the application of each PML.
Study Limitations
This study aimed to better understand the impact of feedback type on learning speech
skills in the hope of providing guidance for clinical practice. While the findings of
this investigation are informative, there are limitations that restrict their application
to current clinical practice. First and foremost, the lack of a baseline measure provided
by native listeners is a significant limitation of this study. Future work will address
this issue by recording participant responses prior to training and then asking native
listeners to rate performance. This will allow for a direct comparison of listener
ratings of participant performance pre- and post-training. It is important to note,
however, that the training data show similar baseline performance across all three
feedback groups. Although the length of the training phase and timing of testing in
the current study match previous investigations examining the PML, the expansion of
these variables would greatly enhance the clinical applicability of these findings
for speech (re)learning in individuals with MSDs, such as AOS, as well as neurologically
healthy individuals with articulation difficulties. More specifically, the short training
phase (i.e., small amount of practice) coupled with the novelty and complexity of
the task may have influenced the benefit of feedback type. Lengthening the training
phase and/or training to a specified criterion would better mirror clinical practice.
Additionally, extending testing to weeks and/or months post-training will provide
insight into the long-term maintenance and generalization effects of feedback type
on speech motor learning. Another limitation of the study is the length of the
rating protocol for the expert raters. To avoid listener fatigue and encourage a break
after rating half of the audio recordings, the data were presented to listeners via
two separate Qualtrics surveys. The presentation of the data in two separate surveys
appeared to invite some raters to complete ratings for only one of the two data sets.
As a result, some listeners completed more ratings than others with four native listeners
rating both data sets (480 items) and 16 native listeners rating one data set (240
items); however, data were randomized and all raters were blind to the testing timepoints
(e.g., 1 day and 1 week). Finally, this study did not examine intelligibility, precision,
and naturalness during the training phase, only at 1 day and 1 week post–speech training.
While we have data to support learning during the training phase, intelligibility,
precision, and naturalness were not specifically addressed because this project
focused on post-training speech perception. Future work might focus on pre- and post-training
comparisons to capture the degree and rate of change over a longer period.
Conclusions and Future Directions
There is much to learn about the application of motor learning theory to speech motor
learning and management of acquired and developmental MSDs. While some literature
supports the implementation of the PML into treatment protocols, the evidence is mixed
and warrants further investigation. In particular, understanding the type and/or combination
of feedback required to optimize treatment outcomes is valuable to everyday therapeutic
activities. The results of this investigation suggest that KP feedback may be superior
to KR and KP + KR for enhancing speech production of a novel and complex speech task
during a short training phase, as measured by listener ratings. Future studies, however,
should further examine the impact of task complexity on feedback type (e.g., Do more
complex speech tasks require more detailed feedback compared to simpler tasks?) and
the relationship between practice amount and the feedback type (e.g., Is KP feedback
superior to KR feedback when the practice amount is small vs. large?). Finally, additional
research should also explore training to a specified criterion (e.g., 80% accuracy),
as this approach is more applicable to clinical practice.