Keywords
Vital signs - data quality - emergency medicine - electronic health records - clinical
decision support systems.
1. Background and significance
Early detection of patients at risk of sepsis or other physiological instabilities
is known to improve clinical outcome [[1]]. In order to identify patients at risk, a number of scoring systems can be used,
for example, RETTS [[2]], NEWS [[3]] or more specific diagnostic scoring systems for sepsis detection [[1]]. All scoring systems use vital signs for their calculation of warning scores [[4]]. The triage process and the clinical workflow in emergency wards usually include
the measurement and documentation of vital signs (►[Figure 1]) [[5], [6]].
Figure 1. Vital sign documentation process
To assist clinicians in detecting patients at risk of physiological instability, clinical
decision support and automation of warnings have been advocated [[7], [8]]. Such systems have to rely on the data in electronic health record systems
(EHRs) to provide accurate warnings. Hence, data quality is essential, and can be
defined as the capability of the data to be used effectively and rapidly to inform
and evaluate decisions [[9]]. This capability is labelled fitness for use [[10]], and to assess data quality in EHRs Weiskopf and Weng propose a framework consisting
of three dimensions: Completeness, Correctness and Currency [[11]].
Completeness may be defined as availability and accessibility of expected entries
in the EHR. Vital signs are routinely measured [[12]], and EHRs are available at all Swedish emergency departments [[13]]. From those facts a high level of completed vital sign registrations may be expected.
However, previous research at nine Swedish emergency departments showed that less
than half of them document triage vital signs directly in the EHR [[6]]. Correctness refers to how true the data are. If correctness cannot be directly
measured, surrogate measures such as plausibility and concordance may be used to estimate
correctness [[11]]. Plausibility is an estimate of whether it seems reasonable to assume that the data
are true, while concordance is the agreement between corresponding data sets. The
currency of the data deals with the temporal aspects of data quality. Many studies
of EHR data quality have a perspective on retrospective reuse of data for research
or storage in quality registries. From that perspective, the currency of the data
may be of less importance. For real-time clinical decision support (CDS), not least in acute medicine, the currency
of the data is, however, of utmost importance. Although there are some studies on vital
sign data quality [[6], [14], [15]], little is known about fitness for use for automatic calculations of warning scores
and triage in the emergency care context.
2. Objectives
This study aims to describe the effects of different types of documentation practices
on vital sign data quality, and to evaluate vital sign fitness for use in emergency
care clinical decision support systems that provide calculations of warning and triage
scores. The study presents a method for analysing vital sign data quality in EHRs
and provides reference data on triage vital sign data in emergency care from five
different Swedish hospitals.
3. Methods
3.1 Data collection
Data were extracted from the electronic health records of emergency departments at
five different sites (►[Table I]). These sites were purposely selected to represent different documentation practices
found in a previous qualitative study of factors affecting vital sign quality [[6]]. Three groups were formed according to the different types of documentation practices
at the sites: paper-based documentation, mixed documentation and electronic documentation
(►[Table I]). At two sites documentation was done on a structured paper-based template, and
no entries of vital signs were routinely made in the EHR. Although EHRs were available
at these two sites, they were not always used for documenting the vital signs measured
immediately during triage. One site used a mixed approach where documentation was
first done on a paper-based template and later transferred into the EHR. Finally,
two sites had a fully electronic documentation practice where vital signs were entered
directly into the EHR. No site made use of an automated registration of vital signs
using medico-technical equipment.
Table 1
Demographic data, completeness and non-valid registrations
| Site 1 | Site 2 | Site 3 | Site 4 | Site 5
n Visits | 59900 | 59679 | 62764 | 78991 | 73693
Age Mean | 54 | 54 | 45 | 51 | 50
Male/Female | 47/53 | 50/50 | 50/50 | 44/56 | 50/50
EHR system | EHR 1 | EHR 1 | EHR 1 | EHR 2 | EHR 2
Installation of EHR | Inst A | Inst B | Inst C | Inst D | Inst D
Documentation practice | Mixed | Paper-based | Paper-based | Electronic | Electronic
Clinical workflow, % of visits
Internal medicine | 48% | 46% | 45% | 40% | –
Surgery | 28% | 31% | 30% | 23% | –
One flow | – | – | – | – | 52%
Orthopaedics | 24% | 20% | 18% | 15% | 12%
Other | – | 3% | 7% | 22% | 36%
Extracted data
Completeness | 95% | 2% | 1% | 71% | 62%
Non-valid data | 0.3% | 0.1% | 0.1% | 0.1% | 0.1%
Two different EHR systems, with four separate installations, were used at the five
sites. The two paper-based sites and the mixed documentation practice site used EHR 1,
and these three sites each had a separate installation of the system and the patient
database. The two sites with a completely electronic documentation practice used the
same installation of EHR 2 and shared the same patient database. Although
the installation was shared between the fully electronic sites, assigned clinical
workflows and work practices varied between the sites. All emergency department visits
at the five sites during 2013 were included in this study, except visits by patients
younger than 18 years, which were excluded. For each visit a specified dataset was extracted that included
patient age, gender, assigned clinical workflow (e.g. Surgery, Orthopaedics, Internal
Medicine), registered arrival time, oxygen saturation, systolic blood pressure, heart rate,
temperature, respiratory rate and time of entry for the vital sign measurements.
3.2 Data analysis
Statistical analysis was done using SPSS Statistics (IBM, 2014) and Microsoft Excel
(Microsoft Excel, 2016). As proposed by previous work on data quality in EHRs the
data were evaluated regarding completeness, currency and correctness [[11]].
3.3 Completeness
Completeness was calculated for all vital signs, and the mean value of completeness
for the sites is reported in the study. The completeness per vital sign was calculated
by dividing the number of visits with a registration of the specified vital sign by
the total number of visits.
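The completeness calculation described above can be sketched in a few lines of Python. This is an illustration only, not the SPSS/Excel procedure used in the study: the field names, the use of `None` for a missing registration, and the unweighted mean across vital signs are all assumptions, and the example visits are invented.

```python
# Hypothetical field names; a missing registration is represented as None.
VITAL_SIGNS = ("saturation", "systolic_bp", "heart_rate", "temperature", "respiratory_rate")


def completeness(visits, vital_sign):
    """Share of visits that have a registration of the given vital sign."""
    if not visits:
        raise ValueError("at least one visit is required")
    registered = sum(1 for visit in visits if visit.get(vital_sign) is not None)
    return registered / len(visits)


def mean_completeness(visits, vital_signs=VITAL_SIGNS):
    """Unweighted mean of the per-vital-sign completeness (an assumed aggregation)."""
    return sum(completeness(visits, sign) for sign in vital_signs) / len(vital_signs)


# Invented example with four visits; these are not study data.
example_visits = [
    {"saturation": 98, "systolic_bp": 120, "heart_rate": 80, "temperature": 37.0, "respiratory_rate": 16},
    {"saturation": 97, "systolic_bp": 135, "heart_rate": None, "temperature": None, "respiratory_rate": 18},
    {"saturation": None, "systolic_bp": 110, "heart_rate": None, "temperature": None, "respiratory_rate": None},
    {"saturation": None, "systolic_bp": None, "heart_rate": None, "temperature": None, "respiratory_rate": None},
]

print(completeness(example_visits, "systolic_bp"))
print(mean_completeness(example_visits))
```

A per-site figure would be obtained by calling `mean_completeness` on the visits of one site at a time.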
3.4 Correctness
To directly measure correctness, the state of the patient at the time of measurement
would be the gold standard for comparison, but because the study was performed retrospectively
this was not known. Correctness was therefore evaluated by the surrogate measures
plausibility and concordance. Plausibility [[11]] can be defined as an estimate of how reasonable a data entry is with respect to the
biological process it is supposed to represent, and in this study all non-valid data
are treated as affecting the plausibility negatively. Data that were deemed not valid
could be either not applicable or outside the biologically or clinically relevant range. Not applicable
data could be reported in the wrong format, for example, strings of text where a number
would be expected. Out-of-range data were defined by the researchers and discussed
with senior consultants in emergency medicine at the sites (►[Table II]). For the definitions of out-of-range data the aim was to identify outliers from
a clinical and biological perspective. The sum of outliers and non-valid data was
identified and reported for the studied parameters. The mean value of the non-valid
vital sign data per site is reported in the study. Plausibility was also assessed
by studying the distributions of the data expecting normal biological distributions
of the vital signs. Concordance [[11]] relates to whether data agreement exists between sources that aim to describe the
same phenomenon, and in this study we compared agreement between the data sets in
the sites. We used descriptive statistical data, boxplots and distribution plots to
describe and analyse the concordance between the data sets of vital sign measurements.
By evaluating both the plausibility and the concordance of the data we were able to
assess the correctness of the vital sign data.
Table 2
Valid vital sign range
Vital Sign | Validity range
Saturation SpO2 | 70–100%
Systolic Blood Pressure | 60–240 mmHg
Heart Rate | 20–250 bpm
Temperature | 33–42 °C
Respiratory Rate | 4–45 breaths per minute
Time to registered vital sign | 0–240 min
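The validity ranges in Table 2 lend themselves to a simple plausibility check. The sketch below is a hedged illustration rather than the authors' implementation: the dictionary keys, the category labels, the inclusive bounds and the acceptance of decimal commas are assumptions made for the example.

```python
# Ranges taken from Table 2; keys are hypothetical field names.
VALID_RANGES = {
    "saturation": (70, 100),        # %
    "systolic_bp": (60, 240),       # mmHg
    "heart_rate": (20, 250),        # bpm
    "temperature": (33, 42),        # degrees Celsius
    "respiratory_rate": (4, 45),    # breaths per minute
}


def classify(vital_sign, raw):
    """Return 'valid', 'out_of_range' or 'not_applicable' for one raw entry."""
    try:
        # Accepting a decimal comma (e.g. "36,8") is an assumption for this sketch.
        value = float(str(raw).replace(",", "."))
    except ValueError:
        return "not_applicable"  # e.g. free text where a number is expected
    low, high = VALID_RANGES[vital_sign]
    # Bounds are treated as inclusive, which Table 2 does not state explicitly.
    return "valid" if low <= value <= high else "out_of_range"


def non_valid_share(vital_sign, entries):
    """Share of entries that are either not applicable or out of range."""
    if not entries:
        raise ValueError("at least one entry is required")
    non_valid = sum(1 for raw in entries if classify(vital_sign, raw) != "valid")
    return non_valid / len(entries)


print(classify("heart_rate", "80"))
print(classify("heart_rate", 0))
print(classify("saturation", "see note"))
```

Keeping the two non-valid categories separate makes it possible to report format problems and clinical outliers individually before summing them.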
3.5 Currency
Triage including vital sign measurements is recommended to be performed within 15
minutes of arrival [[12]], and therefore this study was designed to relate the time of the vital sign registrations
to the documented time of arrival in the EHR. The time of arrival is automatically
set when the patient registers at the arrival desk in the emergency department. The
time of arrival is considered robust and is used for quality measures and reporting
in research, for example on ED waiting times [[16]] and door-to-needle time in thrombolysis of stroke [[17]]. The time to registration of measurement was calculated as the difference
in minutes between the arrival time and the time of the closest registered vital signs.
Descriptive statistics were used to evaluate the time to documentation. Negative time
differences and differences of more than 240 minutes were considered non-valid.
4. Results
4.1 Background information
The demographic data (►[Table I]) show that the majority of patients (63–77%) were assigned to the clinical workflow
regarding internal medicine or surgically related complaints. All sites differentiate
between surgical and internal medicine complaints except one site that has a common
“single clinical workflow” for these patients.
4.2 Completeness
The level of completeness varied greatly between the sites, from 1% to 95% (►[Table I]). The group that used a paper-based documentation practice had the lowest completeness
(1–2%). Routines in the electronic documentation group did not include triage with
vital sign measurements for all patients; patients assigned to the clinical workflow
“other” were not triaged. These patients often had ear, nose or throat related problems
or complaints that were perceived to have lower acuity. Subgroup analysis in the electronic
documentation group, with adjustment for the patients not expected to be triaged with
vital signs, increased the completeness to 85%. This showed that completeness was acceptable
for the sites using mixed or electronic documentation and that simple descriptive
statistics can be used to assess completeness. There was a lack of standardized documentation
routines for heart rate in the electronic documentation group. This made it complicated
to retrieve data and resulted in low completeness for heart rate data in the group
(►[Table III]).
Table 3
Descriptive statistics of vital sign values
Vital Sign | Documentation | n | Mean | Median | SD | Q1 | Q3 | Min | Max
Heart Rate | Paper based | 2316 | 82.2 | 80 | 18.1 | 70 | 94 | 36 | 250
| Mixed | 56559 | 83.3 | 81 | 18.3 | 70 | 91 | 0 | 175
| Electronic | 4133 | 84.9 | 82 | 19.7 | 71 | 96 | 0 | 200
Systolic Blood Pressure | Paper based | 3439 | 139.1 | 137 | 24.1 | 121 | 153 | 13 | 260
| Mixed | 58673 | 142.7 | 140 | 25.3 | 125 | 158 | 0 | 273
| Electronic | 107483 | 140.4 | 138 | 24.4 | 123 | 154 | 0 | 286
Respiratory Rate | Paper based | 2577 | 17.7 | 16 | 4.6 | 15 | 20 | 0 | 110
| Mixed | 56490 | 15.6 | 14 | 5.1 | 13 | 16 | 0 | 100
| Electronic | 95216 | 17.3 | 16 | 4.2 | 15 | 20 | 3 | 65
Oxygen Saturation | Paper based | 1603 | 97.3 | 98 | 2.8 | 96 | 99 | 70 | 100
| Mixed | 57785 | 96.5 | 97 | 3.8 | 96 | 98 | 0 | 100
| Electronic | 105320 | 97.0 | 97 | 2.7 | 96 | 99 | 8 | 100
Core Body Temperature | Paper based | 2741 | 37.0 | 37 | 0.7 | 36.6 | 37.4 | 31.4 | 40.8
| Mixed | 56112 | 36.8 | 36.8 | 0.9 | 36.4 | 37.1 | 0 | 41
| Electronic | 101253 | 37.0 | 36.9 | 0.7 | 36.6 | 37.3 | 27.6 | 41.9
Time to Registration | Paper based | 3079 | 116 | 116 | 62 | 62 | 171 | 1 | 240
| Mixed | 58673 | 38.3 | 19 | 326 | 11 | 37 | 1 | 240
| Electronic | 113199 | 24.2 | 18 | 90 | 11 | 29 | 1 | 240
4.3 Correctness of vital sign measurements
The descriptive statistical analysis of the registered vital sign values showed similarities
between the groups (►[Table III]) (►[Figure 2]). Although there was overall concordance between the vital sign values there were
deviations in the distributions of oxygen saturation in the paper-based documentation
practice, where visual inspection of the distribution curve showed a lack of concordance
and indicated lower data quality in the segment above 94%. There was also
some discord in the distributions of respiratory rate. The registrations of systolic
blood pressure, respiratory rate and heart rate were affected by round-off practices.
Systolic blood pressure values clustered at end digits of zero and, to a lesser
degree, at end digits of five. Subgroup analysis of blood pressures between 105 and 115
showed that the registration of 110 was three times as frequent as any other registration
in the interval, and that blood pressures wrongly rounded upwards to 110 constituted
about 1.4% of all the measured blood pressures in the study. Similar patterns were
evident for heart rate and respiratory rate, while the temperature registrations seemed
to be unaffected by round-off practices. Subgroup analysis of correctness depending
on time of entry and assigned clinical workflow did not alter any of these patterns
and did not seem to affect correctness. The non-valid data was low (0.1%) in all groups.
Thus the vital sign values gave an overall plausible impression. The round-off practices
are accepted in the current work practice, and the discord in the respiratory rate
and oxygen saturation was within the normal reference intervals and thus unlikely
to affect the results of triage or warning score calculations. Overall, the results
showed that the correctness was acceptable for use for automatic calculation of triage
and warning scores.
Figure 2. Vital sign data boxplots and distribution
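The end-digit preference reported in section 4.3 can be screened for by tabulating the last digit of each registered value. The following is a minimal sketch with invented values and hypothetical function names, not the analysis performed in the study; without digit preference, each end digit would be expected to account for roughly 10% of the registrations.

```python
from collections import Counter


def end_digit_shares(values):
    """Share of registrations per terminal digit (0-9) of the integer part."""
    if not values:
        raise ValueError("at least one value is required")
    counts = Counter(int(value) % 10 for value in values)
    return {digit: counts.get(digit, 0) / len(values) for digit in range(10)}


def frequency_in_interval(values, low, high):
    """Count of each registered value within the inclusive interval [low, high]."""
    return Counter(int(value) for value in values if low <= value <= high)


# Invented systolic blood pressures; these are not study data.
example_sbp = [110, 110, 110, 108, 112, 125, 130, 137]
shares = end_digit_shares(example_sbp)
print(shares[0], shares[5])
print(frequency_in_interval(example_sbp, 105, 115))
```

Comparing the count at 110 with its neighbours in the 105 to 115 interval mirrors the subgroup analysis described in the text.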
4.4 Currency
The results indicate that in the mixed and electronic documentation practices about
50% of the measurements were registered in the EHR within 20 minutes, and 75% within
37 and 29 minutes of arrival, respectively. This contrasted with the paper-based flow, where
the median time to documentation was 116 minutes. The difference was even more obvious
when studying the boxplots and distribution curves of the time to documentation (►[Figure 3]), where a lack of normally distributed data showed that there was no robust standardized
process for documenting the vital signs in the paper-based documentation practice.
This was not unexpected as there were no routines for documenting the vital signs
in the EHR in this group. In all groups, the time to documentation varied over the
time of day. ►[Figure 4] describes the percentage of patient registrations and the mean time from arrival
to vital sign documentation throughout the day. In the mixed and paper-based documentation
groups, there was clear covariation with the number of patient registrations over
the time of the day (►[Figure 4]). This indicated that vital sign documentation was delayed when the demand for measurements
was high and that the time to documentation was more affected in the mixed documentation
group than in the electronic documentation group. In the paper-based and mixed documentation
groups, there were high levels of invalid data (33% and 47%, respectively). Further analysis showed
that these vital signs were usually documented through dictation by doctors and entered
into the EHR much later by secretaries. When the secretaries registered the vital
signs, the time was set to the time of patient arrival resulting in a zero-time difference.
This affected the plausibility of the paper-based and mixed documentation time differences.
Only the electronic documentation practice resulted in high currency of the documented
vital signs. However, even when using electronic documentation practices, less than
50% of the registrations were available within 15 minutes, which is the recommended
time for triage in emergency departments. Thus the relationship between the time of
measurement and the time of arrival turned out to be a relevant indicator of vital
sign currency.
Figure 3. Currency distribution and boxplot
Figure 4. Percent of registered arriving patients and time to vital sign documentation related to the time of day
5. Discussion
5.1 Completeness
To be used for automatic calculation of warning scores the vital signs have to be
registered in the EHR. Because most Swedish sites use a triage system to prioritize
the arriving patients, a high level of vital sign measurements is expected. Although
all studied groups had EHRs implemented at the emergency departments, the results
show that they are not uniformly used for registrations of vital signs. In this study
completeness was high in sites and assigned clinical workflows where department routines
stated that documentation should be done in the EHR. This supports findings that completeness
may be improved by standardizing the documentation [[6], [15]]. The variability within and between sites may be attributed to EHR implementation
barriers. A switch from a paper-based documentation practice needs leadership to overcome
resistance to change [[18]]. In sites using electronic documentation, completeness may be improved by quality
improvement programs [[14]]. Findings from this study show that completeness can be evaluated by descriptive
studies. Before a triage CDSS can be implemented the completeness of the vital sign
recordings has to be ascertained because there seems to be variability between sites
and even within sites depending on the assigned clinical workflow.
5.2 Correctness
When comparing the datasets with descriptive statistics there are similarities between
distributions. This was interpreted as a high concordance between the data sets, and
the finding is in line with other studies of vital sign correctness in EHRs [[19]]. The discord found in respiratory rate may signal a clinical challenge in respiratory
rate measurement [[20]]. Measuring respiratory rate is known to be complicated because it is time-consuming,
and a manual observation has to be done while the patient is unaware of what measurement
is occurring. We suspect that for patients whose respiration is perceived as normal,
the respiratory rate is set to 14 at some sites and 16 at others instead of staff actually taking
an exact measurement. The discord in oxygen saturation above 94% in the paper-based
group may indicate a tendency to deem values above 94% as normal saturation and to round
them upwards, either to 98% or to 100%.
Despite the overall impression of stable quality in the correctness, there were findings
showing that round-off practice is present in all sites. Rounding off with a preference
to zero end digits is described in multiple studies of blood pressure measurements
[[21]]. The clinical impact of the round-off practice is not clear, but other studies have
warned that zero end-digit preferences may affect the likelihood of eligibility for
drug treatment in hypertension [[21]]. This finding may be important when applying decision rules based on registered
vital signs. In the NEWS scale a trigger for low systolic blood pressure is set to
below 111 mmHg [[3]]. Subgroup analysis shows an end-digit preference that would shift a number of
patients from the normal cohort to the low blood pressure cohort. In the study at
least 1.5% of the patients would be shifted to a higher national early warning score
considering blood pressure. However, this practice seems to be accepted, and at present
decisions are likely to be based on rounded data. From this perspective, the data can be
considered fit for use in the current state. This study shows that the correctness
of the vital signs is fit for use in calculations of warning scores, triage, and research.
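The effect of end-digit rounding on a warning score can be illustrated with the systolic blood pressure component of NEWS. The sketch below is an assumption-laden illustration: the score bands are written as they appear in the commonly published NEWS chart and should be verified against [[3]] before any use, and the function name and example values are invented.

```python
def news_systolic_score(sbp):
    """Systolic blood pressure sub-score, assuming the commonly published NEWS bands.

    Assumed bands (verify against the NEWS reference): <=90 -> 3, 91-100 -> 2,
    101-110 -> 1, 111-219 -> 0, >=220 -> 3.
    """
    if sbp <= 90:
        return 3
    if sbp <= 100:
        return 2
    if sbp <= 110:
        return 1
    if sbp <= 219:
        return 0
    return 3


# Invented example: a measured value of 112 mmHg documented as 110 mmHg.
measured, documented = 112, 110
print(news_systolic_score(measured), news_systolic_score(documented))
```

Because the trigger lies directly above a zero end digit, a registration rounded from just above 110 down to 110 moves the patient from the normal band into the low blood pressure band.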
5.3 Currency
Time is important when calculating triage and early warning scores. In this study
we used the time of arrival and the time of documentation to assess currency of the
vital sign registrations. Because the use case in this study was calculation of triage
and warning scores, the time it takes from arrival until the automatic calculations
can be made is critical. The shorter the time, the higher the currency. Findings in
this study indicate that currency is higher in the sites with electronic documentation
practice. In these sites, data show that 75% of the measurements are registered within
30 minutes of arrival, which seems clinically relevant. We and other groups have earlier
shown that variability in manual vital sign documentation will impact currency and
completeness of the vital signs [[6], [15]]. In this study, using a paper-based template to record vital signs led to a delay
of entry in the EHR and invalid time stamps. According to our results a manual system
seems to fare progressively worse with increased workload compared to a more electronic
documentation practice. This is in line with other studies that have shown that the
currency of the registered vital signs may be improved by facilitating point of care
documentation and/or automatized registrations by medical devices [[22]–[24]]. None of the studied sites had EHR integrations with medical devices for vital
sign measurements, and it is interesting to note that the staff may perceive that
such integrations may reduce patient contact and even influence correctness negatively
[[6], [25]]. Even if such risk is addressed by human validation of the entries, there are still
challenges with all aspects of interoperability (technical, operational and semantic)
that need to be overcome before auto-entry will be standard in the emergency care
process. We showed that currency of the vital sign recordings might be studied by
plotting the distributions of the time to measurement, and further our results showed
that the currency of the vital sign cannot be considered generally fit for use in
CDSS or clinical research focusing on the time of measurement. This finding is in
line with other studies that have shown systematic errors in the timestamps in EHR
vital sign data [[22], [26]].
5.4 Summary of discussion
In ►[Table IV] we have summarized the effects of documentation practice on data quality. To be
fit for use in clinical decision support used for calculation of emergency care warning
scores, vital signs need to be correct, complete and available at the right time.
This study shows that for paper-based and mixed documentation practices the currency
of the vital signs will not suffice to give timely warnings during all hours of the
day. Earlier studies indicate that less than half of the Swedish emergency departments
have a fully electronic documentation practice for vital signs [[6]]. We conclude that vital signs in emergency care EHRs cannot generally be considered
fit for use in emergency care clinical decision support systems.
Table 4
Effects of documentation practice on data quality
| Paper-based | Mixed | Electronic
Completeness | Low | High | High
Correctness | High | High | High
Currency | Low | Low | Medium
5.5 Limitations
The main limitation of the study is the retrospective perspective. As the time of
measurement was not known we have used the time from arrival to the time of vital
sign documentation to study the currency of the vital signs. Since triage ideally
is expected to be performed within 15 minutes of arrival, the time to documentation
is a relevant measure because this is the earliest point in time that vital signs
are available for a CDSS using EHR data.
6. Conclusion
Before emergency department vital signs can be reused for automatic calculation of
triage and warning scores, data quality has to be ascertained and improved. Additional
effects of improving and automating the documentation practice could be that staff
workload decreases and more focus may be directed towards giving rather than documenting
care. We showed that currency, completeness, plausibility and concordance of vital
signs can be evaluated by descriptive statistics and comparison of multiple data sets.
We provided reference vital sign data from five emergency departments for such use.
We showed that electronic documentation of vital signs can result in acceptable data
quality for reuse in the calculation of triage and warning scores. Using mixed documentation
or paper-based documentation, however, will result in inadequate currency and completeness.
7. Implications for clinical work and research
This study shows that Swedish emergency care vital sign data cannot be considered
generally fit for reuse in real time CDSS. This is of importance to clinicians aiming
to develop alerting systems based on EHR vital signs in emergency care. Further studies
on evaluating and improving vital sign data quality, especially the currency aspects,
are encouraged.
Multiple Choice Questions
- What influences vital sign data quality in emergency care electronic health records?
  - a. Time of the day when the emergency care visit takes place
  - b. Documentation practice
  - c. The number of physicians on duty
  - d. The brand of the EHR system
Answer b. is correct. Currency and Completeness will be affected by the documentation
practice. Time of the day seems to affect the time to documentation but the effect
is moderate in this study. We could not observe any difference between the different
EHR systems used in this study on vital sign data quality.
- Which data quality aspect is directly related to electronic documentation or data capture of vital signs?
  - a. Currency
  - b. Correctness
  - c. Plausibility
  - d. Concordance
Answer a. is correct. At least in our study currency of vital sign data was only achieved
in completely electronic documentation practice.