Keywords
usability testing - clinical decision support - falls - primary care
Background and Significance
Among community-dwelling older adults, falls are a leading cause of disability and
loss of independence.[1] They are widespread and persistent, affecting 29.5% of rural older adults and 27%
of urban older adults.[2] The importance of preventing falls in the community has been well established in
the literature. However, the role of computerized clinical decision support (CCDS)
in the adoption of fall prevention guidelines in primary care practice represents
a gap. CCDS is defined as technology that provides timely, patient-specific information
to improve care quality, typically by invoking automation at the point of care.[3][4]
A recent randomized controlled trial found that CCDS systems that leverage decision
support best practices (e.g., user-centered design) typically achieve higher adoption
rates and change clinical behavior more than off-the-shelf solutions.[5]
Despite use of fall risk screening tools, guideline adherence for fall prevention
continues to be suboptimal. The 2017 Medicare Health Outcomes Survey found only 51.5%
of those screened at risk for falls received any intervention.[6][7]
Providers have reported a lack of skills to address fall risk, suggesting that more
robust CCDS going beyond screening would be useful.[8] We are aware of only one study published since 2012 that addresses fall prevention
in primary care using CCDS. It consisted of a screening reminder, but no recommended
actions.[9] Common barriers to use of prevention guidelines include time pressure, competing
clinical priorities, lack of agreement with or awareness of guidelines, and lack of
training.[10][11][12] A primary care provider responsible for 2,500 patients would need 1,773 hours per
year to provide all U.S. Preventive Service Task Force grade A and B recommendations,
including fall prevention.[13] High workloads including management of chronic conditions can leave little time
or cognitive bandwidth to address preventive services.[14]
CCDS tools have been shown to increase adherence to preventive services.[15] Many CCDS tools are developed for use in one location, limiting generalizability.[16] If they were interoperable, the potential for scalability would be greater.
CCDS using data exchange standards (e.g., Fast Healthcare Interoperability Resources
[FHIR]) can be integrated into any electronic health record (EHR) and can consolidate
fragmented clinical information into a concise, comprehensive picture.[13]
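To make the interoperability claim concrete: below is a minimal sketch, assuming a generic FHIR-capable EHR, of how a CCDS might retrieve a patient's active medication orders. The endpoint, token, and patient ID are hypothetical placeholders, not details of any system described in this article.

```python
# Minimal sketch of standards-based retrieval: querying a FHIR server for a
# patient's active medication orders. All identifiers below are hypothetical.
import requests

FHIR_BASE = "https://ehr.example.org/fhir"  # placeholder FHIR endpoint
TOKEN = "access-token-from-smart-launch"    # e.g., obtained via SMART on FHIR OAuth2
PATIENT_ID = "example-patient-id"           # placeholder patient identifier

resp = requests.get(
    f"{FHIR_BASE}/MedicationRequest",
    params={"patient": PATIENT_ID, "status": "active"},
    headers={"Authorization": f"Bearer {TOKEN}", "Accept": "application/fhir+json"},
    timeout=10,
)
resp.raise_for_status()
bundle = resp.json()  # a FHIR Bundle of MedicationRequest resources

# Collect display names of active medications from the Bundle entries.
meds = [
    entry["resource"]["medicationCodeableConcept"]["coding"][0].get("display", "")
    for entry in bundle.get("entry", [])
    if "medicationCodeableConcept" in entry.get("resource", {})
]
print(meds)
```

Because the query targets a standard resource (MedicationRequest) rather than a vendor-specific API, the same call works against any EHR that supports this portion of FHIR.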
Before new user-facing CCDS can be implemented, it should be tested to ensure usability.
The concept of usability is multi-dimensional and includes attributes such as efficiency,
learnability, errors, and satisfaction.[17] Measures of usability evaluate a system's ability to allow users to complete intended
tasks safely, effectively, efficiently, and enjoyably.[18] In addition to usability, acceptability, that is, a user's willingness to engage
with and accept recommendations from the system, may be even more important.[17] If users are unwilling or unable to engage with a system, it will not matter how
easy it is to use. Usability testing can be done before or after system implementation
and can use quantitative or qualitative approaches.[19] Preimplementation testing is typically done in a laboratory and often has quantitative
components (e.g., time-on-task, error rates). However, usability evaluation through
observation in real-world settings can provide richer data than laboratory-only studies.
Simulation is one approach that combines contextual information from observation
with the quantitative data more typical of laboratory-based testing.[20] Simulation recreates the context within which the CCDS is intended to be used and primes participants
to behave more like they would in practice compared with think-aloud protocols. Simulation
also allows for more control over testing scenarios and limits potential confounding
factors compared with postimplementation observation studies.
Objective
To support wider engagement with the fall prevention process, our team developed Advancing
Fall ASsessment and Prevention PatIent-Centered Outcomes REsearch (ASPIRE), an interoperable
CCDS. ASPIRE provides tailored fall prevention recommendations in the context of primary
care visits for those screened at risk. Summative testing was conducted using simulation
to measure ease of access, overall usability, learnability, and acceptability of ASPIRE
prior to pilot testing (see [Table 1] for operational definitions). While the focus of this article is a summative evaluation,
a brief description of system development and formative testing is included for context.
Table 1
Usability metrics and operational definitions

Metric | Operational definition | Measure(s)
Usability | Ease of use, measured via the SUS | SUS questionnaire score and percentile
Ease of access | Ease of finding and launching the tool within the EHR | Single Ease Question for the landing page
Learnability | The ability of users to quickly become familiar with and able to use the tool/system; total time-on-task is the sum of the time coded for the four tasks (launch and landing, risk factors, recommendations, and document and print) | Difference in total time-on-task between scenarios 1 and 2 per user; difference in number of hints between scenarios 1 and 2 per user; difference in number of errors between scenarios 1 and 2 per user
Acceptability | Willingness to accept system recommendations, measured by the number of default recommendations changed per risk factor and the "Would you recommend" question in the posttest | Difference between the number of default recommendations kept and the number of recommendations seen

Abbreviation: SUS, System Usability Scale.
Methods
System Development
ASPIRE is middleware that enables primary care providers across disparate EHR systems
to launch the tool from within the EHR and develop actionable fall prevention plans
with older adults who screened positive for fall risk. By providing tailored recommendations
based on EHR data, ASPIRE helps providers engage with patients to determine a mutually
agreed-upon plan. ASPIRE used FHIR standards wherever they were supported and fell
back on EHR-specific resources where they were not. Due to differences
in which services are supported within each EHR and for security reasons, each site
ran a separate instance of the ASPIRE logic. ASPIRE was developed based on user-centered
design principles and information on user requirements is reported elsewhere.[21] ASPIRE's design is based on fall prevention evidence, user requirements from primary
care staff and patients, and input from usability experts to ensure compliance with
heuristics.[17]
[22] These sources combined with prior research experience resulted in a design that
focused on three risk factors: (1) mobility/exercise, (2) fall risk increasing drugs
(FRIDs), and (3) bone health. ASPIRE pulls patient information from the EHR related
to these risk factors and displays it for the provider to validate with the patient.
Based on selections in the first step, recommendations are made in the second step,
resulting in a fall prevention plan that can be sent back to the EHR in the third
step. Recommendations depend on patient data and may include referral to physical
therapy, exercise handouts, medication-deprescribing handouts, and information on
osteoporosis including bisphosphonates. Because fall prevention exercises have the
strongest supporting evidence, they were provided to all patients and varied in difficulty.[23] All exercise handouts were based on Otago exercises and developed by a physical
therapist specializing in fall prevention.[24]
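The article describes the rule pattern, in which selections on the three risk factors drive tailored recommendations and medication advice is keyed to drug class, but does not publish ASPIRE's decision logic. The sketch below is a simplified, hypothetical rendering of that pattern; the function, drug list, and rule content are illustrative assumptions rather than the production rules.

```python
# Hypothetical sketch of the three-risk-factor recommendation pattern
# described in the text; rule content is simplified, not ASPIRE's logic.

# Recommendations are keyed to drug class, so selecting diazepam surfaces
# benzodiazepine-deprescribing materials.
FRID_CLASSES = {
    "diazepam": "benzodiazepine",
    "lorazepam": "benzodiazepine",
    "gabapentin": "gabapentinoid",
    "furosemide": "loop diuretic",
}

def recommend(selections: dict) -> list[str]:
    recs = []
    # Exercise has the strongest evidence, so every plan includes a handout;
    # the level is a default the provider can override.
    recs.append(f"Otago exercise handout, level {selections.get('exercise_level', 1)}")
    if selections.get("homebound"):
        recs.append("Referral to home physical therapy")
    elif selections.get("mobility_limited"):
        recs.append("Referral to ambulatory physical therapy")
    for drug in selections.get("frids", []):
        cls = FRID_CLASSES.get(drug)
        if cls and cls != "loop diuretic":  # testing found no step 2 content for loop diuretics
            recs.append(f"Deprescribing handout and talking points: {cls}")
    if selections.get("bone_health") in ("osteoporosis", "osteopenia"):
        recs.append("Bisphosphonate information and talking points")
    return recs

print(recommend({"exercise_level": 2, "mobility_limited": True,
                 "frids": ["diazepam"], "bone_health": "osteopenia"}))
```

Keying medication advice to class rather than to individual drugs mirrors the behavior reported later in the Procedure section.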
Formative testing, or iterative design feedback, was conducted using a think-aloud
approach grounded in principles of user-centered design.[25] Formative testing is a small-sample qualitative approach to find and fix usability
problems.[26] The first three sessions utilized a static prototype. The following 14 sessions
were conducted with a clickable prototype in Figma, an interface design and testing
program. Following completion of formative testing, the prototype was revised and
integrated into the EHR at two sites, one urban site using Epic and one rural site
using Athena Practice.
Approach
After integration into respective EHRs, but before summative testing, the research
team tested the system using 12 different test patients developed specifically for
this project. Summative testing evaluates system usability and typically reports metrics
including time on task, errors, and user satisfaction.[26] Time on task and errors are part of the learnability concept, while user satisfaction
is linked to overall usability.
Setting and Recruitment
Participants were recruited using purposive sampling from two sites with differing
EHR systems to enable us to account for system-specific issues and ensure interoperability.
The first was a large urban health care system serving the Boston area. The second
was a federally designated rural health clinic associated with a nearby academic medical
center in north-central Florida. These two settings utilize different EHR systems
which provided the necessary environment to develop and test an interoperable solution.
Staff were eligible to participate if they were primary care providers whose patient
population included older adults. Sample estimation for usability studies balances
cost and time against finding as many errors as possible.[26] This study planned for 20 summative testing participants, which would uncover an
estimated 95% of errors.[27] However, due to challenges in recruiting, the final sample size was 14 (10 urban,
4 rural), which should reveal approximately 90% of errors.[27]
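The 95% and 90% error-discovery estimates above come from empirical work.[27] As a hedged illustration of where such figures can come from, the snippet below evaluates the commonly used cumulative problem-discovery model, 1 − (1 − p)^n; the per-user detection probability is an assumed value chosen to reproduce figures in this range, not a parameter reported by the study.

```python
# Cumulative problem-discovery model often used in usability sample-size
# planning: share of problems found by n users = 1 - (1 - p)^n, where p is
# the probability that one user encounters a given problem.
# p = 0.15 is an illustrative assumption, not a value from this study.
p = 0.15
for n in (14, 20):
    print(f"{n} users -> {1 - (1 - p) ** n:.0%} of problems found")  # ~90%, ~96%
```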
Procedure
Summative testing was conducted via secure video conferencing with audio and video
recordings, similar to a virtual visit, and consisted of a participant, facilitator,
and patient-actor. The facilitator and patient-actor were members of the research
team. At the beginning of the session, the facilitator provided a brief introduction
of ASPIRE and gave the participant remote control of the screen. Participants were
advised that questions about the tool would be answered after the session and to complete
tasks to the best of their abilities by engaging with the patient-actor. The use of
the patient-actor was intended to simulate near real-world use. Each session included
two scenarios and participants were randomized to determine which they would see first.
Each scenario included information about age, gender, chronic diseases, general activity
levels, assistive devices, medications, and history of osteoporosis or osteopenia.
Scenarios covered a variety of clinical situations including differences in ability
to access physical therapy, differences in mobility, osteoporosis, osteopenia, and
a variety of FRIDs. During both scenarios the patient-actor used predetermined personas
that tried to anticipate participant questions. If an unanticipated question was asked,
the patient-actor answered “I don't know” for consistency. Each participant used ASPIRE
across four steps: (1) launch and landing, (2) risk factors, (3) recommendations,
and (4) document and print. The facilitator gave hints related to system use only if participants
were unable to complete the step or if an error would have prevented the completion
of a subsequent step. During launch and landing, the user was expected to navigate to
a button integrated into their EHR and review the landing page (see [Fig. 1]). During the second step, the provider was presented with risk factors identified
by ASPIRE from the EHR ([Fig. 2]), which they were expected to validate with the patient. ASPIRE preselected mobility
and bone health risk factors based on EHR data and pulled any actively prescribed
FRIDs. Based on feedback from providers during formative testing, FRIDs were not preselected
as most providers reported only wanting to change one medication at a time. During
the third step, the provider was presented with recommendations, including talking
points, based on previous selections (see [Fig. 3]). Recommendations were grouped by risk factor with exercise first, FRIDs second,
and bone health third. During this step, the provider was presented a recommended
exercise level but could choose a different level based on clinical judgment. They
could also preview handouts and de-select any items they did not want to use. When
the participant hovered over the different exercise levels, a description of intended
recipients appeared (see [Fig. 4]). In the medication section, recommendations were based on drug class.
For example, if the participant selected diazepam, recommendations for de-prescribing
benzodiazepines were provided. In the document and print task ([Fig. 5]), the provider had a summary of resources to print, recommended orders, and a prepopulated
note that summarized the fall prevention plan. All items in this task were the result
of selections made in previous steps. The prepopulated note could be sent to the EHR,
reducing documentation time.
Fig. 1 ASPIRE landing page as seen in task one.
Fig. 2 Example of content seen in task two.
Fig. 3 Example of content seen in task three.
Fig. 4 Descriptions of each exercise level that appear as hover-overs.
Fig. 5 Example of content seen in task four.
Metrics
This study used the Single Ease Question (SEQ) and System Usability Scale (SUS). The
SEQ is a post-task measure of perceived difficulty and was asked at the completion
of each task during the first scenario using a 7-point scale.[28] It was used to provide insight into potential differences in usability between tasks
and inform interpretation of global usability results. Global usability was measured
using the SUS, a validated posttest measure of subjective overall usability and satisfaction.[26] After completion of the second scenario, the SUS was administered using the videoconferencing
platform's polling feature, which allowed participants to answer items without
being asked verbally in an attempt to minimize social desirability response bias.
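For reference, the SUS yields a 0-to-100 score from ten 1-to-5 Likert items using the standard formula: odd-numbered items contribute (response − 1), even-numbered items contribute (5 − response), and the sum is multiplied by 2.5. A minimal sketch with made-up responses:

```python
# Standard SUS scoring; the responses below are invented for illustration.
def sus_score(responses: list[int]) -> float:
    assert len(responses) == 10 and all(1 <= r <= 5 for r in responses)
    total = sum(r - 1 if i % 2 == 0 else 5 - r  # index 0 is item 1 (odd-numbered)
                for i, r in enumerate(responses))
    return total * 2.5

print(sus_score([4, 2, 5, 1, 4, 2, 4, 2, 5, 2]))  # 82.5
```

Converting a score to a percentile relies on published normative tables[26] and is not reproduced here.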
Learnability refers to the ease with which a novice user can reach a reasonable level
of proficiency in a short period of time.[17] To assess learnability, time on task, hints, and errors were compared for each user
between the first and second scenarios. Users were randomized to determine which scenario
they saw first to ensure that learnability measures were not influenced by scenario
order. Some studies have suggested that learnability can be calculated as a subscale
of the SUS.[29] However, subsequent studies have shown that the factorization of the SUS depends
on how much experience participants have with the system being tested.[30] For new users the SUS has a single-factor structure, whereas studies of more experienced
users result in a two-factor structure. Based on this information, it would have been
inappropriate to use SUS subscales to evaluate a tool after two uses.
Acceptability of health care interventions is a prerequisite to effectiveness and
requires thoughtful design allowing the best possible outcomes with available resources.[31] This study measured acceptability by comparing the number of recommendations presented
to each participant with the number of recommendations included in the fall
prevention plan. If there was a conflict between the content of the written plan and the verbalized
intent of the participant, the verbalized intent was counted. For example, if a participant
clearly verbalized to the patient-actor that they were not going to start bisphosphonates,
but did not edit the note or de-select this recommendation, the verbalized intent
to not start bisphosphonates was used in the acceptability calculation.
Upon completion of the SUS questionnaire, participants were asked open-ended questions
by the facilitator. These questions included what participants liked and did not like
about the tool, if they preferred it to their current fall prevention practices, if
they would recommend it to others, and if the tool should do anything else. Participants
completed a demographic form and received a $50 gift card.
Analysis
Recorded sessions were analyzed using NVivo 12 to describe and quantify usability
data. Iterative content analysis of field notes and recordings was used to analyze
responses to the interview questions asked at the end of each session. Analysis was done by the
lead author and shared with a usability expert and the rest of the team. An a priori code book was
developed by the lead author in consultation with a usability expert and reviewed
by the team. The code book, available as Supplementary Material, was developed to
allow analysis of the summative testing measures in [Table 1]. Descriptive statistics of time on task, errors, hints, and acceptability were calculated
for the total sample, each site, and each scenario order. A paired t-test was used to assess within-subject differences in time on task between scenarios after
verifying normality. Because the results were not normally distributed, the Wilcoxon rank-sum
test was used to compare recommendations seen by participants with recommendations
included in the fall prevention plan. Statistics were calculated using RStudio.
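The sketch below illustrates these comparisons in Python with SciPy rather than R, using placeholder values in place of study data. Because the seen-versus-accepted comparison is within participant, the Wilcoxon test is shown in its paired signed-rank form, which is one plausible reading of the reported analysis.

```python
# Illustrative versions of the reported analyses; arrays are placeholders.
from scipy import stats

# Within-subject total time on task (minutes) for scenarios 1 and 2.
time_scenario1 = [13.5, 12.0, 14.2, 9.8, 15.1]
time_scenario2 = [8.9, 9.5, 10.1, 7.2, 11.0]
res_t = stats.ttest_rel(time_scenario1, time_scenario2)  # paired t-test

# Recommendations seen vs. accepted per observation (paired counts).
recs_seen = [5, 4, 6, 5, 3]
recs_accepted = [4, 3, 4, 4, 1]
res_w = stats.wilcoxon(recs_seen, recs_accepted)  # paired signed-rank form

print(f"paired t-test p = {res_t.pvalue:.4f}; Wilcoxon p = {res_w.pvalue:.4f}")
```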
Results
A total of 14 summative usability sessions were conducted, 10 from the urban site
and four from the rural site. Participants included physicians (n = 7), nurse practitioners (n = 5), and physician assistants (n = 2). Most participants reported caring for some older adults (n = 9), while the remainder reported caring mostly for older adults (n = 5). On average, they had 5.8 years of experience working with their respective
EHR and nearly all reported intermediate skill level with technology in general (n = 12).
Usability and Ease of Access
Mean SUS score was 77.3 (median = 80, range = 30–92.5). To interpret SUS results,
scores were converted to percentiles using published tables.[26] The mean percentile score was 80.9 (median = 90, range = 2–100), indicating above-average
usability. This is further supported by 13 of 14 participants stating that they would recommend
ASPIRE to a colleague and that they preferred it to their current practice. Participants
rated tasks as relatively easy with mean SEQ ranging between 5.1 and 5.5 across the
four steps. Providers found the tool accessible with an average SEQ for the launch
and landing task of 5.5 (median = 6, range = 3–7).
Learnability
Mean total time on task was 12.9 minutes for the first scenario (median = 13.5, range = 5.4–16.8)
and 9.4 minutes for the second (median = 8.9, range = 4.6–17.3). This represents a
statistically significant reduction in total time on task (p = 0.0001), indicating that users were able to quickly become familiar with the tool.
Reduction in total time on task was also evaluated by site and scenario order with
all groups showing statistically significant reductions in total time (p < 0.05).
On average the number of hints required during the first scenario was 2.6 (median = 2,
range = 1–5) and decreased to an average of 0.14 for the second scenario (median = 0,
range = 0–2). Reductions in hints were seen uniformly across sites and scenario order.
However, urban participants required more hints during the first scenario (mean = 1.5)
compared with rural participants (mean = 1). Based on review of field notes and recordings,
this difference was due to button location in the EHR and not the ASPIRE system itself.
Urban participants had difficulty finding the fall risk icon in Epic, whereas rural
participants received hints to click “let's begin” to complete the task.
Error rates remained consistent between scenarios, with 0.1 fewer errors made during
the second scenario (mean = 2.79) compared with the first (mean = 2.86). Error analysis
showed users made errors of commission and omission. Errors of commission occurred
when users did something unintended (e.g., accidentally clicking on something), while errors of
omission occurred when a user failed to take an action (e.g., not editing the prepopulated
note to reflect verbalized plan). See [Table 2] for a description of errors with associated steps, frequencies, severity scores,
and mitigating factors.
Table 2
Description of errors identified in summative testing by task and frequency

Step(s) | Error type | User error description | Frequency (no. of users) | Error severity | Severity rationale (R) and mitigation (M)
1. Launch and landing | Commission | Clicking incorrect place to launch tool; difficulty finding fall icon/launch button | 6 | 2 | R: Users rated this step as easy on the SEQ and stated that, having done it once, they would be able to do it again; was an issue in the Epic integration, but not Athena. M: Cover in initial training; consider alternative locations for integration if utilization of the tool is lower than expected at the Epic location.
2. Risk factors | Commission | Selecting "Osteoporotic Fracture" when none present | 3 | 2 | R: Selecting this option did not change the recommendations generated by the system. M: Consider removing in future iterations.
2. Risk factors | Commission | Verbalized click on buttons to "open additional information" | 1 | 1 | R: Does not impact system recommendations. M: Initial training and system use will address this low-frequency error.
2. Risk factors | Omission | Not selecting risk factors; verbalizes a plan to address the risk factor/medication but fails to select it in step 1 | 2 | 3 | R: Not selecting items prevents the system from providing tailored recommendations. M: Emphasize instructions for this step during training and consider whether on-screen instructions could be improved.
2. Risk factors; 3. Recommendations | Omission | Not engaging with patient about homebound status in step 1, but engaging in step 2 or 3 when formulating a plan/entering an order | 2 | 2 | R: If homebound had been selected in step 1, the system would have recommended home PT rather than ambulatory PT; users consistently addressed this while placing the order. M: Homebound button may not be required based on provider behaviors.
2. Risk factors; 3. Recommendations; 4. Document and print | Commission | Navigating into the EHR to review the fall risk screener | 1 | 2 | R: Did not change system recommendations but added time, as the screener was visible from within ASPIRE. M: Review during initial training.
2. Risk factors; 3. Recommendations; 4. Document and print | Commission | Clicked back into the EHR by accident | 1 | 1 | R: The system is designed as middleware and therefore opens in a pop-up window; this added time to system use but did not impact clinical recommendations.
2. Risk factors; 3. Recommendations; 4. Document and print | Commission | Accidental selection/deselection | 8 | 3 | R: Inadvertent selections/deselections change system recommendations; most users realized the error and were able to correct it. M: Suggest limiting the dark-blue color that indicates preselection to buttons that can be selected.
2. Risk factors; 3. Recommendations; 4. Document and print | Omission | Selecting a loop diuretic in step 1, but not discussing it in step 2 or 3 nor entering an order to discontinue it | 4 | 3 | R: Because providers would likely discontinue a loop diuretic without consulting specialty care or using patient handouts, no action was included in step 2 of the ASPIRE system. M: Consider adding a step 2 recommendation for loop diuretics in future versions despite the lack of associated patient handouts.
3. Recommendations; 4. Document and print | Commission | Navigating into the EHR to review the medication list to check for a bisphosphonate | 2 | 2 | R: The system pulls this information from the EHR when generating recommendations; this does not change recommendations but adds to the time required for system use. M: Include this information in initial training.
3. Recommendations; 4. Document and print | Commission | Clicking the preview icon and verbalizing "sending to printer" without opening/printing the PDF | 1 | 3 | R: Results in patients not receiving intended information; system limitations for printing required generating a PDF that then needs to be printed. M: Consider an alternative label to "print" to convey this EHR system limitation.
3. Recommendations; 4. Document and print | Commission | Printing the same handout multiple times | 1 | 2 | R: Does not impact clinical recommendations, documentation, or patient education. M: May resolve with additional experience with ASPIRE.
3. Recommendations; 4. Document and print | Commission | Treating osteopenia patients as having osteoporosis; engaging with a patient who has osteopenia using statements like "We know you have osteoporosis" (concerning because the section label may be confusing users) | 3 | 3 | R: The appropriate dose of bisphosphonate differs for osteoporosis vs. osteopenia. M: Consider revising the section label to read "Bone health" in a future version to decrease the potential for confusion.
3. Recommendations; 4. Document and print | Omission | Assuming orders were auto-entered; statements such as "I sent the order" or "the order was sent" without entering the order or verbalizing intent to enter it manually in the EHR | 2 | 3 | R: Explicitly add instructions regarding the need to enter orders manually and emphasize this in initial training. M: If EHR capabilities change in the future, consider adjusting the platform to send orders to the EHR; current EHR limitations do not allow this capability.
4. Document and print | Commission | Sending a note with incorrect information to the EHR (related to not editing) | 6 | 4 | R: Providers will need to edit the note in the EHR; this was not part of testing and could lead to erroneous information in the medical record. M: Emphasize in training and evaluate risk in pilot if possible.
4. Document and print | Commission | Clicking on the note to try to edit it (before scrolling down to the edit button) | 1 | 2 | R: Most users quickly realized they needed to scroll to the edit button. M: Move the edit button to the top of the note section so users do not need to scroll.
4. Document and print | Omission | Not editing the note when the verbalized plan differed from the content of the note | 7 | 3 | R: This error could result in the subsequent error of sending a note with incorrect information. M: Emphasize in training and evaluate risk in pilot if possible.
4. Document and print | Omission | Not printing handouts | 6 | 3 | R: Results in patients not having additional information to reference following the visit. M: Likely to resolve with system use and training, as providers verbalized they would provide handouts; this increases the likelihood that patients ask about them before leaving the clinic.
4. Document and print | Omission | Not saving the note to the EHR/visit summary; verbalized intent to save the note or told the patient the information would be in the patient instructions/visit summary, but did not save the note | 4 | 3 | R: Could result in missing information in the visit summary or require providers to manually enter information into the summary. M: Initial training and system use should demonstrate the value of this capability.
4. Document and print | Omission | Editing text, but not saving the edits | 2 | 3 | R: Could result in misinformation being saved. M: Change the system to autosave to reduce risk.
4. Document and print | Omission | Verbalized not wanting to print certain handouts, but did not uncheck the boxes | 1 | 2 | R: The provider can remove items from the printed resources. M: Cover in initial training; may resolve with system use.

Abbreviations: EHR, electronic health record; PT, physical therapy.
Note: Error severity: 1 = cosmetic, 2 = minor, 3 = major, 4 = catastrophic.
Acceptability was calculated for each scenario for a total of 28 observations across
14 participants. Recommendations provided varied by scenario and participant based
on the selections made during the risk factor task. For example, if only one FRID
was selected then only recommendations for that selection were shown. Total recommendations
seen by participants varied from three to six (mean = 4.9, median = 5). Total acceptability
was based on the number of recommendations provided compared with recommendations
accepted. Recommendations accepted ranged from one to six (mean = 3.6, median = 3.5).
There was a statistically significant difference between the number of recommendations
provided and the number accepted (p < 0.001). The most accepted recommendation was exercise. In 22 of the 28 observations,
the recommended level of exercise was included in the final fall prevention plan.
In five of the remaining six observations, a higher level of exercise was selected. Only one
participant in one scenario felt that exercise was not applicable. Acceptance of FRID-related
recommendations was mixed. Benzodiazepine handouts and the tapering calendar were accepted
in all but one instance (n = 11), and gabapentin was addressed each time it was selected (n = 14). However, loop diuretics were addressed in only 7 of the 20 times they were selected.
The least accepted recommendation was the prescription of bisphosphonates. This recommendation
was provided in 26 observations, but only accepted in 14.
Discussion
By designing our study to include simulation and open-ended questions, we were able
to evaluate the ASPIRE system using quantitative metrics while also gaining important
contextual insight into those data through qualitative means. If we had relied only
on quantitative metrics like SUS and SEQ, we would not have had insight into why scores
were chosen by participants. Interview questions also provided insight into what value
users saw in the system; see [Table 3] for themes and representative responses. Overall, participants found ASPIRE easy
to use and preferred it over current practice. This is further supported by comparing
our SUS scores to a large SUS database. Compared with the 446 studies in the database,
ASPIRE is in the 80th percentile and receives a grade of “B.”[26] To our knowledge, ASPIRE is the only tailored fall prevention tool designed for
primary care. One other CCDS study focused on fall prevention in primary care was
found, but it did not include SUS scores or quantitative usability metrics.[9] That study redesigned a clinical reminder to conduct fall risk screening within
a specific EHR. That CCDS did not include tailored recommendations nor was it available
for integration into other EHRs. This makes ASPIRE a novel approach to addressing
fall prevention in primary care because it provides actionable recommendations that
could overcome previously reported skill gaps. The ASPIRE system contains the four
features identified as critical to impacting practice: being computer based, providing
recommendations, integrating into workflow, and providing support during
the decision process.[32] Furthermore, on completion of the ongoing pilot study, ASPIRE will be available
for integration into any EHR from the CDS Connect repository.[33]
Table 3
Qualitative themes and representative responses

Question | Themes | Representative responses
What did you like about this tool? | Overcomes clinical inertia (n = 5) | "I tend to be less aggressive on deprescribing medications in my current practice, but the system forced me to ask if these meds are really still needed." (U2); "Tool prompts deliberate process of considering osteoporosis treatment." (U9)
| Guides conversation (n = 5) | "Talk points were good so I didn't have to come up with the words on my own." (U2); "Guides you through the conversation." (U7); "Talking points are good. Some patients are nervous about changing these meds and the talking points are helpful." (U8)
| Tailorable (n = 5) | "Like the ability to edit text to individualize information." (R10); "Liked ability to select/deselect." (R14); "Liked the multiple levels of exercise." (U1)
| Handouts (n = 4) | "I really like that the exercises are right there to pull up and go over with the patient. 'Let's see if we can do this together right now before I send you home doing these.'" (U9); "I like the exercise handouts those are good." (R12)
| Documentation assistance (n = 3) | "I think that's great for progress note. It would make my life so much easier because then you don't have to type this up and it's all in there." (U9); "Love the note 'a real time saver.'" (U11)
| Action oriented (n = 3) | "Love the concrete suggestions." (U2); "Very action oriented." (U7)
| Good scope with 3 risk factors (n = 3) | "Liked that is focused just on 3 main areas, not too cumbersome." (R12)
What did you dislike about this tool? | Note location in EHR (n = 4) | "Location where the text inserts into the note." (R10); "Unclear where the note would insert within the encounter." (U1)
| New system (n = 2) | "Just takes time to get used to what is coming at you"; general dislike about having to adjust or learn flow, but "This is better than relying on provider's memory." (U6)
| Disagree with recommendations (n = 2) | "Medication section was least helpful; liked the flagging, but not recommendations." (U11)
What else should the tool do? | Generate prefilled orders (n = 8) | "Would be nice if there was a way to pull in a PT order." (U11); "Link to bisphosphonate standard order set; prefilled orders." (U4)
| DEXA results/FRAX score (n = 6) | "Access to date for DEXA scan and FRAX score." (R12); "More information to support bisphosphonate orders (e.g., laboratories to check, common side effects)." (U4)
Would you recommend this tool to your colleagues? Why? | Yes, actionable (n = 5) | "It's easy to overlook especially if not sure what next steps are. This helps to break it down." (U4); "Yes, especially for the medication prompts to overcome visits being busy. Useful to be challenged on the meds." (U2)
| Maybe, concern for time (n = 4) | "Like any tool it takes time. Would recommend for those not experts in fall prevention and not already doing anything about fall prevention." (U6)
Would you prefer this system compared with your existing system? | Yes, actionable prompts (n = 5) | "Yes, because it gives concrete suggestions." (U2); "Yes, much more comprehensive. In the current system you have to remember all these things. At least this prompts you." (R14)
| Yes, handouts (n = 4) | "It has handouts right there and ready to go." (U1); "I think they are helpful. I don't do handouts as much as I should be. Some pts like something to review." (U8)
| Yes, documentation assistance (n = 2) | "It lets me pull the info into the note, so I don't have to type as much when I'm talking to the patient." (U4); "I do because it also helps you document and shows how you went through things and how it pulls into the note would save time for me documenting." (U9)

Abbreviations: EHR, electronic health record; PT, physical therapy.
While providers rated the tool easy to access based on SEQ ratings, most urban participants
required at least one hint to complete the launch and landing task during the first
scenario. Several providers from the urban site verbalized something like “now that
I know where the button is it will be easy to do again.” It is not uncommon for participants
to rate a task favorably on the SEQ if they believe it will be easily repeatable.[26] This was further supported by participants quickly launching the tool during the
second scenario without prompting.
Our results also showed that ASPIRE was easy to learn, with significantly decreased
time required during the second scenario. However, on average, participants spent
9.4 minutes using ASPIRE during the second scenario. This could represent a substantial
portion of a typical primary care visit. However, preliminary data from the ongoing
pilot suggest that, on average, providers are spending 4 minutes with the tool. This
suggests that time using the system is further reduced when the provider has a relationship
with the patient and has received training. ASPIRE could be implemented in the context
of an annual wellness visit, which allows more time than a regular follow-up
appointment. Even within shorter appointments, the tool could be used when falls
are of significant concern. Some participants also commented that the tool would
save them time on other visit-related tasks such as documentation, partially offsetting
the time spent using it.
While error rates remained consistent between scenarios, errors could be mitigated by
user interface (UI) adjustments and by providing training prior to implementation. In
this study, no initial system training or corrective instruction between scenarios
was provided, suggesting that training could be beneficial. Many participants commented
on errors and implied increased familiarity with the tool would prevent future errors
and decrease time required to use the system. Training should cover where the tool
is accessed, a walk-through of the steps, a review of available handouts, and
references to the evidence used to develop the logic. An explanation of how the system
preselects items is also recommended. To further reduce errors, adjustments to the
UI were made based on results, EHR limitations, and budget. Changes included adjusting
the color of buttons not related to prepopulated data. In the prototype tested, navigation
buttons and the buttons used to send information back to the EHR were the same dark blue
that indicated a selection based on patient data from the EHR. Some participants took
the dark blue on those buttons to mean that information, including recommended orders,
was automatically sent back to the EHR. However, due to technical limitations this
was not feasible. To ensure
orders are entered, an instruction was added alerting users that they must be manually
entered. Because there was confusion over location and visibility of the edit button
for the prepopulated note, the UI was updated so that the note defaulted to an editable
status. Lastly, a resource library of all handouts and supporting evidence was created.
This will enable providers to access materials that the CCDS did not automatically
recommend and allow providers to print more than one level of exercise handout, which
was a recommendation from testing. Several providers requested the ability to provide
multiple levels of exercise so patients could progress to higher levels without returning
to clinic.
Acceptability of recommendations tested varied widely. Exercise was the most accepted
recommendation; its content was based on guidelines specific to primary care and the
research team's experience from prior fall prevention studies. Loop diuretics
were often selected in the risk factor step, but not addressed in the final plan.
This divergence may be due to differences in logic for diuretics compared with other
medication classes. During development, the team anticipated that providers would neither
consult a specialist nor require a handout to address diuretics; therefore, information
regarding diuretics is not displayed in the recommendation step. One participant commented
that it was interesting that the diuretic selected in step one was not included in
step two. Based on results and participants' comments, future versions of the CCDS
should consider revising step two to include information about diuretics. When developing
bisphosphonate recommendations, guidelines from the American College of Endocrinology
were used, which recommended prescription of bisphosphonates for osteopenia and osteoporosis.[34] Our participants agreed with bisphosphonate use to treat osteoporosis, but most felt
it was inappropriate for osteopenia. This may represent a difference in clinical practice
between primary and specialty care providers.
Limitations
The use of simulation attempted to mimic real-world use, but was not able to replicate
the patient–provider relationship vital to primary care practice.[35] Several participants verbalized that the system would be easier to use with patients
they know. Participants remotely controlled ASPIRE during testing, which may have contributed
to some lag and increased time on task. Our study also fell short of initial recruiting
targets. The limited number of eligible participants from the rural site coupled with
anecdotal reports of coronavirus disease 2019 (COVID-19) burnout at the
urban site may have contributed to this challenge. Lastly, due to limited time and
budget, this study only included two clinical scenarios which covered only some of
the recommendations ASPIRE can produce. Further pilot testing should be done to evaluate
acceptability of all possible recommendations and the ability of ASPIRE to integrate
into clinical workflows. Future studies should also measure the ability of ASPIRE to influence
clinical practice and patient outcomes.
Conclusion
Usability data suggest that ASPIRE represents an improvement over current practice in
both rural and urban clinics with different available resources. Our results highlight
the importance of using guidelines already acceptable to the target end-user when
developing CCDS and support previous findings suggesting that workflow integration
is important to successful CCDS. Due to its interoperable design, ASPIRE has the potential
for broad impact across organizations; however, limited support for some FHIR services
could add to implementation burden in some EHRs. Pilot testing is needed to validate
that our favorable usability results translate into clinical practice workflows
and to determine how ASPIRE's use impacts utilization of fall prevention guidelines.
Clinical Relevance Statement
This study demonstrates that it is possible to develop an interoperable computerized
clinical decision support tool targeting the fall prevention process with better-than-average
usability. Due to its interoperable design and focus on three risk factors, it has
the potential to integrate into any electronic health record and be used in time-constrained
primary care environments.
Multiple-Choice Questions
- Factors associated with successful clinical decision support systems include which of the following?
Correct Answer: The correct answer is option e. Systems should be integrated into workflow; otherwise,
users are far less likely to access them. Computer-based systems can make provision
of recommendations more timely, which is important when making decisions at the point
of care. Systems that provide actionable recommendations are more likely to change
behavior when compared with systems that only flag risks without providing ways to
address that risk.
- When should summative usability testing be done?
- During system development
- Before full implementation
- After full implementation
Correct Answer: The correct answer is option b. Summative testing provides input on what to fix, and
its metrics can be used as a baseline for posttest design changes. It is important to
fix any issues that will have a significant negative impact on usability and users' willingness
to adopt a new system before full implementation. If a system has too many flaws at
implementation, users will lose trust and may not be willing to adopt it, even if
improvements are made later.