Background and Significance
Early warning scores (EWSs) are developed to predict future patient deterioration and support clinicians in saving patient lives.[1] [2] [3] Inpatient care is fast-paced and high pressure, and decisions are a complex process of cognition and actions rife with human fragility (e.g., errors, decision biases).[4] [5] [6] [7] Advanced computational or machine learning model approaches to identifying patients at risk of clinical deterioration are promising, but complex issues hamper their uptake and widespread adoption.[2] [8] [9] Intelligence augmentation (IA) refers to supplementing human cognition with technology. Distinguished from artificial intelligence (AI), defined broadly as tasks a machine performs instead of a human, intelligence augmentation is referred to here by the acronym IA.[1]
The incorporation of IA into clinical care is a revolutionary development. However, the implementation of clinical decision support (CDS) tools even without IA is challenging.[10] EWSs may be viewed as unnecessary, time-consuming, or excessively challenging to use in team-based care environments.[11] [12] [13] [14] IA-specific barriers, including limited clinician interest, exacerbate CDS implementation hurdles.[10] [15] Few randomized clinical trials have assessed the impacts of IA-enabled EWSs in clinical care, and these studies alone will not provide sufficient information about implementation.[16] [17] Sociotechnical approaches are crucial to understanding the complex personal, technical, work system, and societal implications of IA in clinical care.[7] [15] [18] [19] [20]
Studies of IA-based systems have focused on patient end users,[21] but a need remains for clinician-focused studies that incorporate data visualizations and are designed for cognitive processes. Cognitive theories such as Wickens' model of human information processing point to a preattentive store of sensory information, including visual and auditory information about a patient. This sensory information decays from memory very quickly. Subsequent higher-level cognitive processes such as categorization support decision-making in a feedback loop, by which perception is further influenced by the decision pathway. This suggests the need for designs that attract clinician attention and then supplement it with more complex information.[22] [23] Dual-process theory categorizes thinking processes as intuitive, fast, and automatic (system 1) or deliberate, slow, and controlled (system 2). This theory also highlights the potential to coordinate information presentation with cognitive processes by drawing attention to key information quickly and then providing detailed displays with far more patient information to explore.[24] [25] [26] Prototype designs in this study were developed based on cognitive theory and user-centered design principles identified in prior work by our team.[23] [27] [28] This qualitative study aimed to generate guidance for the theory-based design of user interfaces (UIs) for IA-based risk-scoring approaches.
Methods
The study consisted of two phases: (1) the design of prototype IA visualization displays
and (2) clinician interviews stimulated by the presentation of iterations of the displays.
Design of Prototype Intelligence Augmentation Visualization Displays
Steps involved in generating the prototype displays included:

1. Performing a review of the literature on advanced computational model approaches to risk score information displays to identify, for example, UI designs and important design questions.[2] [23]
2. Identifying key design elements for interview guide development, such as the type of risk score (e.g., general deterioration, sepsis), individual-patient versus multipatient displays, trend visualization, and approaches to IA explanation.
3. Generating prototype data visualization displays that combined both general and condition-specific EWSs and clinical data.
Information in the displays was drawn from de-identified clinical data and the resulting electronic Cardiac Arrest Risk Triage (eCART) deterioration risk scores.[29] Initial display variations presented to participants included multipatient views with EWSs for multiple systems (e.g., sepsis, respiratory), physiology, and laboratory test results ([Fig. 1]), and individual patient detail views that included EWSs trending over time and views of physiology by hour ([Figs. 2] and [3]).[30]
Fig. 1 Multipatient view of early warning scores (the names in the displays are not actual patient names; these are simulations based on real patient data, but no actual patient information is included).
Fig. 2 Individual patient view (the names in the displays are not actual patient names; these are simulations based on real patient data, but no actual patient information is included).
Fig. 3 Individual patient view with trends.
Clinician Interviews
Recruitment
A "snowball" recruitment strategy was used: clinical leaders in the investigators' health care systems were approached for recommendations of physicians or nurses with 5 or more years of clinical experience in critical care environments, and those recommended as potential interviewees were emailed an invitation to participate. The goal was to design general visualization tools rather than tailoring them to a specific user population. A semistructured interview guide was developed by the author team based on cognitive theories and the study objectives, which were to assess (1) clinicians' experience with AI-based EWSs in clinical care, (2) visualization preferences for attention-drawing and detailed information, and (3) clinicians' explanations of those preferences (see [Supplementary Appendix A], available in the online version). Qualitative interviews of approximately 1 hour in length were conducted over a virtual meeting platform and stimulated by the prototype visualization displays. Interviews were conducted by female investigators with sociotechnical and qualitative data collection expertise. Demographic data were collected during the interview.
Refinement of Visualization Displays
After seven interviews, participant feedback was used to update the prototype visualization displays. Based on the theoretically informed design and the specific research questions, seven interviews provided adequate information power for iteration of the designs.[31] [32] In the second phase of interviews, displays were iterated to include a single EWS depicting general deterioration and to align displays with preferences from the first set of interviews ([Fig. 4]). Single-patient displays included comet graphs, which depicted variability and the age of numerical data using a combination of color and shape (e.g., more data observations yielding a longer comet tail; [Fig. 5]), and varying time frames for trends ([Fig. 6]).
Fig. 4 Multipatient display with general early warning score (the names in the displays are not actual patient names; these are simulations based on real patient data, but no actual patient information is included).
Fig. 5 Single-patient display with time trend information.
Fig. 6 Single-patient display with varying time trend information (the name in the display is not an actual patient name; it is a simulation based on real patient data, but no actual patient information is included). EWS, early warning score.
Data Analysis
Information power, rather than saturation, was used to determine an adequate sample size because the information power model is suited to theory-based design.[33] It is also important to note that the sample size matched the recommendation for informatics studies using qualitative methods.[34] Each interview was audio-recorded, transcribed, and coded using NVivo software.[35] In the thematic analysis, three interview transcripts were coded collaboratively by the entire coding team (J.M.B., A.D., U.S., M.N., A.J., and K.M-K.) to generate an initial codebook based closely on the words of the participants. The 12 subsequent transcripts were coded individually and in duplicate, with additional codes added to the codebook, and any coding discrepancies were resolved through discussion. Codes were grouped into representative themes through discussion and synthesis among the coding team.[32] [36] [37]
Results
Theme 1: Clinicians Perceived Intelligence Augmentation as Valuable with Some Caveats Related to Function and Context
Clinicians were generally interested in the incorporation of IA into practice (although
they often did not distinguish between IA and AI) and highlighted the potential to
improve efficiency. They addressed important caveats related to the function, context,
and data that inform an IA visualization. Their reflections about how they might use
(or not use) IA tools in the future signal potential user needs.
Clinicians recognized that computer-aided tasks support efficiency, "I know that the computer could do this in a second," and that IA-based tools may be valuable for prediction, "AI or informatics is going to be able to pick up in trends that you're just not." Respondents noted that advanced computational model systems are built on data and algorithms, some of which might be problematic, particularly given the challenges of explaining IA, unreliability, or the potential for biased output, "AI models that are black boxes that end up leading to serious problems with inequity…" or other issues with data quality. Many clinicians understood that model performance problems can arise from the underlying data (e.g., if data are not measured frequently enough), suggesting user needs related to information about data quality, particularly when advanced computational model outputs are inconsistent with other clinical findings.
Some clinicians thought that EWSs were not needed given their own personal clinical vigilance, "Am I supposed to go see every patient or just be more vigilant? But I'm already vigilant." There were caveats related to the novelty of AI in clinical care. AI may be "just another technology," and not likely to influence medicine in the extreme ways suggested by the promises of AI. Some clinicians believed that close patient observation and clinical judgment were unlikely to be surpassed by IA, "people get so fixated and excited about the technology that they forget to take care of the patient." In contrast, others mentioned the potential for EWSs to draw attention to important parameters, "So what if you had a scale that predicted heart failure that was different from sepsis? Would we have caught it?" highlighting examples of complex information related to sepsis diagnosis that could potentially be recognized sooner using IA. Clinicians indicated that once AI/IA is established in practice, concerns may recede because people "resist change in medicine. But then once it's out…no one questions." Participants thought that IA would be most likely to be successfully implemented if clinicians recognized its complementary value.

Taken together, this theme reflects clinicians' beliefs about the capacity of AI to improve efficiency and patient care, alongside concerns about data quality and the appropriate use of AI/IA in clinical environments. Clinicians referenced the special role of a human and information that a human has that a computer either cannot have or likely does not have.
Theme 2: Individual Differences among Users Influence Preferences for Customizability
Clinicians described their preferences for customizing data visualizations. Differences in preferences related to the user's role and their cognitive approach (e.g., incorporating a cognitive pattern-matching strategy, such as preferring a display similar to those they already use).
Clinicians noted that an EWS would be interpreted differently by individuals in separate roles, "based on the background or the lens of the different users." User roles influenced preferences related to the time horizon of data trends, "I think if I was on an ICU, I would want probably a slightly longer, like several days, maybe three days, 72 hours…" compared to shorter time windows for views in an emergency department. Clinicians pointed to novice versus expert characteristics, "We've got such brand-new nurses. They're still trying to figure out how to just get through the day." Another clinician referred to differentiating high-priority information, "Some things I want to be alerted about differently than others and some that would be interruptive, some that would be non-interruptive."
What is cognitively intuitive is shaped by both past experience and attentional control. One clinician described the potential for cognitive overload if the IA system is not harmoniously embedded in the surrounding electronic health record, "be really careful about that because as a clinician, when I come to the clinical data systems, there's already an intuitive color-coding and bolding scheme for honing my attention in on things. And if this system is going to use a different scheme…that's going to be a problem."
Another clinician described individual differences as driven more by preferences than by logical (or system 2, deliberate, dual-process) cognitive processing, "I suspect most of their opinions are not based in data or good reason, and it's just because that's how they feel about it." This clinician also addressed the need for approaches to IA that do not focus only on preferences but also on systematic visualization studies to assess impacts on patient care outcomes, "I would try and deconstruct that…with data…this is what it's best based on all the studies we've done."
Clinician responses in this theme characterized the importance of individual differences as a factor shaping preferences, such as for time trends, and more generally as a "lens" to filter information. Clinicians also identified features that draw or could draw their attention, consistent with both Wickens' preattentive sensory input (e.g., intuitive color coding) and with dual-process theory: for example, pattern matching to a familiar clinical data system (system 1) and interest in systematic, logical processing (system 2).
Theme 3: Early Warning Score Is Particularly Useful for Patient Prioritization
Clinicians noted the value of the EWS to identify patients who need care soon—a triage
process. Information needs included identifying dynamic changes and the need for clinician
attention to be drawn to key determinant factors.
Clinicians viewed the EWS as useful for detecting dynamic change, "So this (score) would be something that would be a part of an acute care person's every moment on their floor, real-time, right?" Clinicians pointed to different cognition during the process of triage (a faster, pattern-matching system 1 approach) as compared to a slower, system 2 process of exploring a detailed view of patient information, "I do that in a different workflow…I'm looking at my list and I'm saying, 'Who's sick and not sick?'" suggesting that the multipatient display was a particularly powerful use case for the EWS.
Clinicians noted that high-priority triage information supports quick identification and related actions. An example is that of subtle neurological changes: a higher EWS could permit quicker recognition of a potentially correctable condition, "we get called to neuro changes quite a bit…the nurses are like, 'They're just more sleepy today', or, 'They're saying things that are off…'" suggesting that neurological changes may be a high priority for attention.
Clinicians expressed a need to identify the directional change in EWS trends over time, "Even if (the score) was high, if it's going down, I'm probably not going to pull my attention to that because I'm going to look at the ones that are getting worse." One emergency department clinician commented on the utility of trending EWSs for communication, "to let the hospitalist who I'm admitting to know what level of care they will need."

Clinicians recognized that the EWS could support triage, identified specific categories of information that are a high priority for triage, and noted the importance of trend-related information.
Theme 4: Need for Patient-Specific Contextual Information
Clinicians expressed a need for the visual integration of clear, patient-focused information related to the EWS and noted the importance of general framing for the score.
One clinician stated, "If [the score] was framed as if you want to know why this score is three, click here, then I would be in the mindset of, okay, now it's telling me how it got to the score of three rather than it's trying to help me take care of the patient. Those are two very different things." Clinicians also expressed the need to view data within a patient-specific context, "if I know that the patient who has a heart rate of eighty-four has a hemoglobin of six, and yesterday their hemoglobin was nine, I'm going to be really worried because that's an inappropriately low heart rate for their acute blood loss anemia…."

One clinician highlighted an example of when critical safety information was challenging to find and the importance of including crucial patient context to support understanding the score, "Another thing that we did…for our emergency department was actually add how much supplemental oxygen they're on…." Clinicians also pointed to the need for actions related to communicating with others, "And so that'll get our attention when you say, 'This is an 80% chance of mortality.' We got to talk to the family…."
Taken together, these responses demonstrated that AI/IA should support clinicians in caring for their specific patients and should provide additional context for the interpretation of the EWS in clinical decisions.
Theme 5: Perspectives Related to Understanding the Composition of the Early Warning
Score
Understanding how the EWS was calculated—the quantitative values of the variables
driving the score—was a low priority for some clinicians. When information (quantitative
or semiquantitative) could support their understanding of how to use the score to
care for the patient, clinicians were more engaged.
Some clinicians valued semiquantitative information, "I think just defining…the moderate and high range [of contribution] is pretty sufficient." When one clinician was asked if they would like to see the clinical data values contributing to increased or decreased risk, they responded, "Yes, that's helpful. Because what is deteriorating the patient is helpful at that time."

Clinicians asked questions related to understanding how clinical documentation is used in the calculation of the EWS, "…the AVPU (alert, verbally responsive, pain responsive, unresponsive) score—is it just the last documented? And if the predictive model…discounted somebody's documentation, but it's still showing that the…neuro status is altered…." In response, the interviewer explained how clinical data can be used to inform the AI predictive model based on population data (e.g., patients with these risk factors are likely to show this specific outcome). Some clinicians indicated that understanding the components would be useful as part of the big picture, not necessarily during triage of the patient but during flexible times to explore a more detailed explanation, "And then in my free time, I want to understand the (score in the) tool better."
Considering how they would use information about EWS composition, a clinician noted the tradeoff between components that reflect fixed versus actionable information and how that would impact care, "if age is 95% of the value, I'm not going to change her age." Clinicians also questioned whether the score is responsive to up-to-the-minute clinical changes in the patient, "I gave them a bolus of IV fluid, I'm going to want to go and see what does their blood pressure look like 15 minutes later, 30 minutes later. How do they look?…And it's not clear to me that this score is going to know those answers."

Given clinical demands, minimal curiosity, and the need for actionable information, there is a clear need to frame the purpose of the EWS. There were indications that some clinicians did not understand prediction models well (e.g., "How does it know what I did?"), which could diminish the potential for successful implementation.
Theme 6: Design Preferences Focus on Clarity for Interpretation of Information
Clinicians considered the semiquantitative presentation of EWS contributing variables
and color in the display to be valuable. Clinicians pointed to the importance of simplicity
but also the ability to customize based on specific preferences.
Clinicians addressed ways in which the mathematical representation of the EWS can influence interpretation, such as how odds ratios might be challenging because of the difficulty of understanding the reference group or what a percentile might refer to, "it's just easier to quantify by having the base integers…rather than a percentile." Color was considered useful for indicating whether the EWS was low, medium, high, or critical. Many clinicians saw value in customizable views centered on personal preferences, "I like simplicity in the context of how I would like it," a reminder of the importance of personalizing information and supporting clinicians' need for autonomy.
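The low/medium/high/critical color coding that clinicians found useful can be sketched as a simple banding function. The numeric cut points below are hypothetical placeholders for illustration only; any deployed display would calibrate its bands to the specific score in use (e.g., eCART).

```python
def ews_color_band(score, cuts=(4, 7, 10)):
    """Map a numeric EWS to one of four display colors.

    The cut points are illustrative assumptions, not calibrated
    thresholds from the study or any validated scoring system.
    """
    low, high, critical = cuts
    if score < low:
        return "green"   # low risk
    if score < high:
        return "yellow"  # medium risk
    if score < critical:
        return "orange"  # high risk
    return "red"         # critical
```

Encoding the band in a preattentive channel such as color, rather than requiring clinicians to read and compare numbers, is consistent with the attention-drawing role participants described for the multipatient view.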
Color was perceived as a valuable clue to the age of the displayed data, "This lab is more than 24 hours old…. Have it be greyed out…." Comet graphs, which were designed to show the time range of specific values (see [Fig. 2], bottom middle panel, for comet graph images), were not popular among our participants, "coming from somebody who's not used to looking at the graphs on the far right, I don't tend towards those…."
Generally, clinicians preferred design features that helped them understand the patient data, and preferred customization features that allowed them to understand the score as a characteristic of the patient's condition (e.g., trends over time). As in other themes, information that supports understanding priorities quickly (e.g., using color to attract or deflect attention) is consistent with the need to support a transition from preattentive sensory storage to deeper processing, in line with Wickens' theory.[22] [23]
Discussion
This study elicited clinician perceptions and preferences for prototypes incorporating
data visualizations and an IA-based EWS for patient deterioration. Clinicians generally
had positive impressions of the displays alongside concerns about data quality and
score reliability. Customizability preferences related to individual differences among
clinicians. The EWS was seen as particularly useful for prioritization. Having sufficient
patient-focused information to contextualize the EWS and related data was identified
as important. Some clinicians were curious about how the score was calculated but
were not always motivated to investigate this deeply themselves given job demands.
To that end, preferred design features supported quick synthesis of the included information.
The results of this study point to (1) the importance of framing the purpose of IA-based EWS systems for clinical use, (2) the need to match tool function to the individual differences of the clinical user, (3) the importance of trend data showing change over time, and (4) the need for transparency to support user assessment of tool validity. These findings are consistent with work by others on the need to match the functions of a tool to the clinical user[38] and the need for information contextualized by trend and treatment information.[39] Other studies address individual differences, including how novices or experts would use the tool, and also reinforce the gaps we found, which can be reduced through display design.[5] [40] [41] The need for transparency and for users to be able to assess the validity of recommendations and/or the score is also well described in other work.[42] [43]
There were participants who explicitly referenced the potential of the IA system design to support efficiency. No clinicians referenced the potential for less burnout, another hope for the impact of IA implementation.[15] Notable design features for consideration included the age of patient data, highlighting parameter severity using color, and ensuring that data trajectories were easy to understand, which suggests that efficiency was valued in implicit ways. Clinicians referenced special human-like characteristics that potentially relate to concerns about replacement by IA (e.g., "take care of the patient, that mantra is never going to get lost").
Across themes, clinicians addressed the importance of supporting their attention to crucial issues (e.g., a tool that supports them in not missing key information, attention regulation). There were hints that IA is seen as potentially threatening, unnecessary for tasks clinicians already performed, and overhyped. This study is unique in examining a theory-based visualization design and demonstrating results consistent with the applied theoretical foundations. Customizability based on the clinician's role and cognitive processing was identified as an important design component.[26] [44] This finding is consistent with dual-process theory and with recognition-primed decision-making models of naturalistic decision-making among experts.[25] [28] [45] Clinicians in this study explicitly addressed the importance of intuitive, fast processing (system 1), consistent with dual-process theory, and the "lens" of the clinician viewing the information, consistent with Wickens' theory of human information processing.[22] [25] Of note, participants' focus on high-level pattern-matching information and on "how things are" or "how I usually see things" suggests some of the challenges of presenting something "new" to clinicians. Potentially, the presentation of information that fits existing workflows or cognitive patterns will neglect some of the potential to disrupt those workflows and patterns.
In clinicians' descriptions of what they would like IA to do (e.g., recommend treatment, demonstrate knowledge of what the clinician is doing for treatment), clinicians are speaking to hopes about what AI/IA could do and reflecting needs for flexible adjustment to patient-centered factors, including treatment input. Taken together, this suggests that although disruptive change may be challenging given existing workflows and patterns, clinicians are likely to welcome disruption when it demonstrably improves care for patients.
Study strengths include discussion stimulated by complex prototype displays and the incorporation of participant feedback into the iterative design.[2] [46] [47] The sample size was appropriate for thematic analysis, a method designed to promote the exploration of the complexity of clinician views.[34] [36] However, our sample size was not sufficient to systematically compare providers in different roles or across practice settings.[48] In addition, some study participants worked in multiple critical care settings, which also precluded systematic comparisons by practice setting. Finally, the study prototypes were not suitable for extensive exploration by clinicians of their specific patient data or the details of the predictive model.