Methods
Data Source
Data were collected from the health inspection database of the First Affiliated Hospital,
Medical School of Zhejiang University. The database contained data from 295,241 physical
examinations performed between January 2011 and September 2017. The data include the
sex, age, history, medication history, lifestyle records, height, weight, body mass
index, systolic blood pressure (SBP), and diastolic blood pressure (DBP) of patients,
as well as the following blood test data: fasting blood glucose (FBG), TG levels,
and high-density lipoprotein cholesterol (HDL-C). The scope of the health check targets
includes public institutions, government agencies, private companies, etc., without
restrictions on sex or age. The baseline data were generated from the first health
check of this population, which excluded patients with baseline coronary heart disease,
type I diabetes, and familial hyperlipidemia.
A total of 96,506 people had at least 2 consecutive years of complete health check data. Among them, 15,984 (16.6%) had MetS in the first year, and 17,060 had MetS in the following year (17.7% of these 96,506 people; 18.7% of the final cohort of 91,420). The 5,086 individuals who had MetS in the first year but converted to non-MetS the following year were not included in the study. Therefore, 91,420 individuals were included, among whom 10,898 had MetS at baseline and 6,162 converted from non-MetS to MetS.
[Table 1] lists the demographic and clinical characteristics of the study population at baseline and the following year. Among them, 47,098 were males, accounting for 51.5% of the total sample. The average age of the individuals was 43.7 years. Compared with the first year, all features deteriorated slightly on average the following year (body mass index [BMI], SBP, DBP, FBG, and TG were higher; HDL-C was lower).
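The cohort counts above follow from simple bookkeeping, which the short check below reproduces. It uses only the numbers reported in the text; the variable names are ours and purely illustrative.

```python
# Counts reported in the text for people with two consecutive years of data.
total_with_two_years = 96_506
mets_year1 = 15_984
reverted_to_non_mets = 5_086   # MetS in year 1, non-MetS in year 2 (excluded)
converted_to_mets = 6_162      # non-MetS in year 1, MetS in year 2

cohort = total_with_two_years - reverted_to_non_mets
mets_baseline = mets_year1 - reverted_to_non_mets
mets_following = mets_baseline + converted_to_mets

print(cohort)                                   # 91420
print(mets_baseline)                            # 10898
print(mets_following)                           # 17060
print(round(100 * mets_following / cohort, 1))  # 18.7
```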
Table 1
Characteristics of the datasets (n = 91,420)

| Variable | Previous year | Subsequent year |
|---|---|---|
| Mean age (SD) | 43.7 (14.0) | |
| Percentage of male participants (total number) | 51.5 (47,098) | |
| Percentage of non-MetS patients (total number) | 88.1% (80,522) | 81.3% (74,360) |
| Percentage of MetS patients (total number) | 11.9% (10,898) | 18.7% (17,060) |
| Mean BMI (SD) | 22.8 (3.1) | 22.9 (3.2) |
| Mean SBP (SD) | 121.1 (16.6) | 122.3 (16.8) |
| Mean DBP (SD) | 74.0 (10.4) | 74.4 (10.7) |
| Mean FBG (SD) | 4.92 (0.90) | 4.97 (0.96) |
| Mean TG (SD) | 1.297 (0.879) | 1.358 (0.923) |
| Mean HDL-C (SD) | 1.335 (0.344) | 1.323 (0.347) |
Abbreviations: BMI, body mass index; DBP, diastolic blood pressure; FBG, fasting blood
glucose; HDL-C, high-density lipoprotein cholesterol; MetS, metabolic syndrome; SBP,
systolic blood pressure; SD, standard deviation; TG, triglyceride.
We defined positive samples as people who were diagnosed with MetS in the following year; all others were negative samples. [Fig. 1] shows the density distributions of the indicators in the positive and negative samples. The distributions of the inspection indicators differed markedly between the positive and negative samples, which suggests that the health check indicators in the baseline year are related to a future diagnosis of MetS.
Fig. 1 (A-F) Density map for BMI, DBP, SBP, FBG, TG and HDL-C for positive and negative MetS samples.
BMI, body mass index; DBP, diastolic blood pressure; FBG, fasting blood glucose; HDL-C,
high-density lipoprotein cholesterol; MetS, metabolic syndrome; SBP, systolic blood
pressure; TG, triglyceride.
In this study, the diagnosis of MetS was based on the MetS definition criteria jointly issued in 2009 by the International Diabetes Federation and the American Heart Association/National Heart, Lung and Blood Institute.[20] The criteria assess obesity (BMI or waist circumference) and cardiovascular risk (SBP, DBP, FBG, TG, and HDL-C). We used BMI to assess obesity. A recent study published in Metabolism reported that BMI is equivalent to waist circumference for evaluating MetS risk, which suggests that BMI, being routinely recorded at health checks, may have greater clinical potential than waist circumference.[21]
Based on the analysis of the population data, several methods were utilized for data
processing. [Fig. 2] displays a map of the research workflow.
Fig. 2 Map of the research process. AUC, area under the curve; ACC, accuracy; MICE, multivariate
imputation by chained equations; SMOTE, synthetic minority oversampling technique.
Data Preprocessing
Standardization
To improve the comparability of the features and to speed up convergence, we applied Z-score standardization to each continuous variable.[22] This method rescales the data using the mean and standard deviation (SD) of the raw data. Z-scores have a mean of zero and an SD of one; they are most informative when the empirical distribution is close to a normal distribution. In such cases, Z-scores may be used to compare the relative locations of values from distributions with different means or SDs.
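As an illustration of the standardization step, the sketch below computes z = (x - mean) / SD for one variable. The readings are hypothetical, and using the population SD (rather than the sample SD) is our assumption; the text does not say which was used.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return (x - mean) / SD for each value; SD is the population SD here."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize a constant variable")
    return [(x - mu) / sd for x in values]

# Hypothetical systolic blood pressure readings, for illustration only.
sbp = [110.0, 121.0, 135.0, 118.0, 126.0]
sbp_z = z_scores(sbp)
```

After the transformation the values have mean zero and SD one, so variables measured on different scales (for example BMI and SBP) become directly comparable.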
Sample Balance
A serious class imbalance was observed in the dataset: the number of people who had MetS the following year (18.7% of the total) was much smaller than the number who did not. To reduce the resulting bias and improve classification performance, we employed the synthetic minority oversampling technique (SMOTE) to address the class imbalance.[23]
SMOTE is an improved oversampling technique based on the random oversampling algorithm. The main idea is to use the similarity among the existing minority-class samples in the feature space to create artificial data. The basic principle is to use [Eq. (1)] to linearly interpolate between closely spaced samples of the minority class to generate a new minority sample. For each sample from the minority class ($x$), the five minority-class samples with the smallest Euclidean distance from it (its nearest neighbors) were identified, and one of them was randomly chosen ($x_{\mathrm{NN}}$). Because the data constructed by the algorithm are new samples that do not exist in the original dataset, the risk of overfitting to the minority-class data is reduced.[24]
$$x_{\mathrm{SMOTE}} = x + u\,(x_{\mathrm{NN}} - x), \qquad (1)$$
where $u$ was randomly chosen from $U(0,1)$. The same $u$ was used for all variables but differed for each SMOTE sample; this guarantees that the SMOTE sample lies on the line joining the two original samples used to generate it.
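The interpolation step can be sketched in a few lines of Python. This is a minimal illustration of the idea described above, not the DMwR implementation; the function name, the toy data, and the seed are assumptions made for the example.

```python
import math
import random

def smote_sample(x, minority, k=5, rng=random):
    """Create one synthetic sample from x and one of its k nearest minority neighbors."""
    # Rank the other minority samples by Euclidean distance to x.
    others = [m for m in minority if m is not x]
    neighbors = sorted(others, key=lambda m: math.dist(x, m))[:k]
    x_nn = rng.choice(neighbors)
    # A single u is shared by all variables, so the new point lies on the
    # segment between x and x_nn.
    u = rng.random()
    return [xi + u * (ni - xi) for xi, ni in zip(x, x_nn)]

# Hypothetical standardized (BMI, SBP) pairs for minority-class individuals.
minority = [[1.2, 0.8], [1.0, 1.1], [1.5, 0.9], [0.9, 0.7], [1.3, 1.4], [1.1, 1.0]]
rng = random.Random(0)
synthetic = smote_sample(minority[0], minority, k=5, rng=rng)
```

Because every coordinate uses the same u, the synthetic point is a convex combination of the original sample and its chosen neighbor.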
A randomly sampled one-fifth of the data was held out as the test dataset. Of the remaining data, four-fifths were used as the training dataset and the rest for validation. The training dataset consisted of 58,509 people, the validation dataset of 14,627 people, and the test dataset of 18,284 people. To improve the performance of the classifier, we used the SMOTE implementation from the DMwR package[25] of R software (version 3.4.3) to oversample the unbalanced training dataset. After SMOTE oversampling, 69,335 training samples were obtained, of which 21,838 were positive samples (31.5% of the total).
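The reported partition sizes can be reproduced from the cohort size alone. The sketch below assumes sizes are rounded to the nearest integer, which is our assumption; the text does not state the rounding rule.

```python
def split_sizes(n, test_frac=0.2, train_frac=0.8):
    """Hold out a test set, then split the remainder into training and validation."""
    n_test = round(n * test_frac)
    remaining = n - n_test
    n_train = round(remaining * train_frac)
    n_val = remaining - n_train
    return n_train, n_val, n_test

n_train, n_val, n_test = split_sizes(91_420)
print(n_train, n_val, n_test)  # 58509 14627 18284

# Share of positive samples in the training set after SMOTE, from the text.
positive_share = round(100 * 21_838 / 69_335, 1)
print(positive_share)  # 31.5
```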
Features
To understand the impact of existing health check indicators on the development of
MetS, we generated a regularized gradient-boosted decision tree model using eXtreme
Gradient Boosting (XGBoost) to estimate the importance of the model features, which
indicates how useful or valuable each feature was in the construction of the boosted
decision trees within the model. The more an attribute is used to make key decisions
with decision trees, the higher its relative importance. Importance is calculated
for a single decision tree by the amount that each attribute split point improves
the performance measure, weighted by the number of observations the node is responsible
for. The performance measure may be the purity (Gini index) used to select the split
points or another more specific error function. The feature importance scores are
then averaged across all of the decision trees within the model.
By estimating the feature importance, we obtain three indicators: Gain, Cover, and Frequency. Features are ranked by Gain. Gain is the improvement in accuracy brought by a feature to the branches it is on. Cover measures the relative quantity of observations related to a feature. Frequency is a simpler proxy for Gain: it counts only the number of times a feature is used in all generated trees.
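To make the three indicators concrete, the sketch below aggregates a list of split records into relative Gain, Cover, and Frequency. It is an illustration of the definitions above, not XGBoost's internal code; the record layout and all numbers are hypothetical.

```python
from collections import defaultdict

def importance_table(splits):
    """Aggregate per-split records into relative Gain, Cover, and Frequency."""
    gain = defaultdict(float)
    cover = defaultdict(float)
    freq = defaultdict(int)
    for s in splits:
        gain[s["feature"]] += s["gain"]
        cover[s["feature"]] += s["cover"]
        freq[s["feature"]] += 1
    total_gain = sum(gain.values())
    total_cover = sum(cover.values())
    total_freq = sum(freq.values())
    table = {
        f: {
            "Gain": gain[f] / total_gain,
            "Cover": cover[f] / total_cover,
            "Frequency": freq[f] / total_freq,
        }
        for f in gain
    }
    # Rank features by Gain, as in the text.
    return dict(sorted(table.items(), key=lambda kv: kv[1]["Gain"], reverse=True))

# Hypothetical split records pooled over all trees of a boosted model.
splits = [
    {"feature": "BMI_L", "gain": 40.0, "cover": 900.0},
    {"feature": "TG", "gain": 15.0, "cover": 500.0},
    {"feature": "TG", "gain": 5.0, "cover": 300.0},
    {"feature": "SBP_L", "gain": 10.0, "cover": 300.0},
]
table = importance_table(splits)
```

In this toy example TG is used most often (highest Frequency) but BMI_L still ranks first, because ranking is by Gain.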
Features included sex, age, BMI, SBP, DBP, TG, FBG, and HDL-C of the previous year and BMI, SBP, and DBP of the subsequent year; the latter are referred to as BMI_L, SBP_L, and DBP_L, respectively. Positive samples were patients who were diagnosed with MetS in the subsequent year. The inputs were the health check indicators for 2 consecutive years of the study population, and the output was the diagnosis of MetS in the subsequent year. We calculated the importance of the features to assess the extent to which these features (2-year home-based data and blood test data) affected the classification of MetS. [Fig. 3] and [Table 2] present the ranking of each feature in the model.
Fig. 3 Order of importance of the model features. BMI, body mass index; DBP, diastolic blood
pressure; FBG, fasting blood glucose; HDL, high-density lipoprotein; MetS, metabolic
syndrome; SBP, systolic blood pressure; TG, triglyceride.
Table 2
Model feature importance

| Feature | Gain | Cover | Frequency |
|---|---|---|---|
| BMI_L | 0.408 | 0.190 | 0.106 |
| TG | 0.199 | 0.149 | 0.124 |
| SBP_L | 0.131 | 0.113 | 0.080 |
| FBG | 0.081 | 0.140 | 0.129 |
| HDL-C | 0.080 | 0.086 | 0.139 |
| SBP | 0.026 | 0.062 | 0.088 |
| DBP_L | 0.022 | 0.075 | 0.077 |
| BMI | 0.021 | 0.089 | 0.113 |
| Sex | 0.018 | 0.031 | 0.029 |
| Age | 0.007 | 0.034 | 0.060 |
| DBP | 0.007 | 0.032 | 0.055 |
Abbreviations: BMI_L, body mass index of the subsequent year; DBP, diastolic blood
pressure; DBP_L, diastolic blood pressure of the subsequent year; FBG, fasting blood
glucose; HDL-C, high-density lipoprotein cholesterol; SBP, systolic blood pressure;
SBP_L, systolic blood pressure of the subsequent year; TG, triglyceride.
As shown, BMI_L made the largest contribution to the outcome of MetS, and SBP_L made the third largest, which indicates that the features of the subsequent year are important for recognizing MetS in that year. The blood test indexes of the previous year (TG, FBG, and HDL-C) comprise the next largest contributions, which means that historical blood test data are relatively important for the outcome of MetS.
The mean and SD of BMI and BMI_L shown in [Table 1] are similar; however, the importance of these two features differs markedly, as shown in [Fig. 3]. We calculated the difference between BMI_L and BMI for every individual and performed a one-sample t-test of these differences against zero. The resulting p-value is 0.022; therefore, the mean difference in BMI between the 2 years is statistically different from zero at the 5% level of significance. This result shows that, for the same individual, the BMI in year 1 does not fully reflect the BMI_L in year 2 ([Table 3]).
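The test on the BMI differences can be sketched as follows. The data are hypothetical, and the p-value uses a normal approximation to the t distribution, which is our simplification; it is adequate for a sample as large as the study cohort but only illustrative for the five values shown here.

```python
import math
from statistics import mean, stdev

def paired_t_test(before, after):
    """One-sample t-test of the paired differences (after - before) against zero."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / math.sqrt(n))
    # Two-sided p-value from the normal approximation to the t distribution;
    # this is reasonable only for large n.
    p = math.erfc(abs(t) / math.sqrt(2))
    return t, p

# Hypothetical BMI values for the same five people in two consecutive years.
bmi = [22.1, 24.3, 19.8, 27.5, 23.0]
bmi_l = [22.4, 24.2, 20.1, 27.9, 23.3]
t_stat, p_value = paired_t_test(bmi, bmi_l)
```

Testing the per-person differences, rather than comparing the two yearly means, is what lets a small average shift be detected even when the two distributions look alike.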
Table 3
Variables of each model

| Model | Variables |
|---|---|
| HOME | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L |
| RBTIBE | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L, TG, HDL-C, FBG |
| IB (step 1) | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L |
| IB (step 2) | sex, age, BMI, SBP, DBP, TG (inferred), HDL-C (inferred), FBG (inferred), BMI_L, SBP_L, DBP_L |
Abbreviations: BMI_L, body mass index of the subsequent year; DBP, diastolic blood
pressure; DBP_L, diastolic blood pressure of the subsequent year; FBG, fasting blood
glucose; HDL-C, high-density lipoprotein cholesterol; SBP, systolic blood pressure;
SBP_L, systolic blood pressure of the subsequent year; TG, triglyceride.
Clinical Feature Augmented Model
It is possible to evaluate the risk of MetS in the following year by using only home-based data; however, blood test data contribute to MetS diagnosis ([Fig. 3]). The substantial importance of the three blood test features motivated us to impute them. Therefore, in our proposed method ([Fig. 4C]), inferred blood test data serve as supplementary features.
Fig. 4 (A-C) Block diagram for the three models.
The goal is to evaluate MetS risk in the following year in the absence of a health check-up (i.e., with no blood test data). We aim to use a large amount of health check-up data to obtain a model that can predict blood test data by learning the relationship between home-based data and blood test data, thereby providing additional effective features for the final classification model. The augmented model consists of two steps. In the first step, blood test features are inferred by multivariate imputation by chained equations (MICE).[26] In the second step, the results obtained from the first step are combined with the home-based data for the modeling of MetS by using the regularized gradient-boosted decision tree algorithm.[27]
MICE
MICE is a practical approach to creating imputed datasets based on a set of imputation models, with one model for each variable with missing values. MICE is an increasingly popular method of performing multiple imputation. Here, we outline the MICE algorithm for a set of variables, $x_1, \ldots, x_k$, some or all of which have missing values. Initially, all missing values are filled in at random. The first variable with missing values (say $x_1$) is regressed on all other variables $x_2, \ldots, x_k$; the estimation is restricted to individuals with observed $x_1$. Missing values in $x_1$ are replaced by simulated draws from the posterior predictive distribution of $x_1$, an important step known as proper imputation. Next, $x_2$ is regressed on all other variables $x_1, x_3, \ldots, x_k$, restricted to individuals with observed $x_2$ and using the imputed values of $x_1$. Again, missing values of $x_2$ are replaced by draws from the posterior predictive distribution of $x_2$. The process is repeated for each remaining variable in turn; one such round is called a cycle. Several cycles are run to stabilize the results and produce a single imputed dataset, and the whole procedure is repeated independently m times to give m imputed datasets. MICE can handle different variable types (continuous, binary, unordered categorical, and ordered categorical) because each variable is imputed using its own imputation model.[28] Compared with k-nearest neighbors imputation and recursive partitioning and regression tree imputation, the MICE method has better flexibility and higher precision. We used the mice package in R to perform the imputation.
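The cycle structure of chained equations can be illustrated with a deliberately simplified sketch. It handles only two continuous variables, uses a one-predictor least-squares fit, and adds residual noise to each prediction instead of drawing from a full posterior predictive distribution, so it is not a proper imputation and not the algorithm of the R package. All names and data are hypothetical.

```python
import random
from statistics import mean

def fit_line(xs, ys):
    """Least-squares fit y = a + b*x; returns (a, b, residual SD)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    sd = (sum(r * r for r in resid) / max(len(xs) - 2, 1)) ** 0.5
    return a, b, sd

def chained_imputation(rows, cycles=10, rng=random):
    """Impute None entries in two-column rows by alternating regressions."""
    data = [list(r) for r in rows]
    missing = [[v is None for v in r] for r in rows]
    # Initial fill: a random observed value of the same variable.
    for j in range(2):
        observed = [r[j] for r in rows if r[j] is not None]
        for r in data:
            if r[j] is None:
                r[j] = rng.choice(observed)
    for _ in range(cycles):
        for j in range(2):
            other = 1 - j
            # Fit only on rows where variable j was actually observed.
            fit_rows = [r for r, m in zip(data, missing) if not m[j]]
            a, b, sd = fit_line([r[other] for r in fit_rows],
                                [r[j] for r in fit_rows])
            for r, m in zip(data, missing):
                if m[j]:
                    # Prediction plus residual noise (parameter uncertainty ignored).
                    r[j] = a + b * r[other] + rng.gauss(0.0, sd)
    return data

# Hypothetical (BMI, TG) pairs with some missing entries.
rows = [(22.0, 1.1), (25.0, 1.6), (28.0, None), (None, 1.3),
        (30.0, 2.2), (20.0, 0.9), (24.0, None)]
completed = chained_imputation(rows, cycles=5, rng=random.Random(1))
```

Running the function m times with different seeds would give m imputed datasets, mirroring the repetition described above.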
Regularized Gradient-Boosted Decision Tree
The regularized gradient-boosted decision tree algorithm is an algorithm implemented
by XGBoost.[29 ] Compared with the traditional gradient boosting decision tree algorithm, the regularized
gradient-boosted decision tree method adds a regularization term helping to smooth
the final learned weights to reduce the risk of overfitting. The regularized objective
tends to choose a model that employs simple and predictive functions. The objective
function consists of a loss function and complexity, which limits the number of leaves
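and prevents overfitting to some extent; the function is defined as
$$\mathrm{Obj} = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k} \Omega(f_k), \qquad (2)$$
where $\Omega(f) = \gamma T + \frac{1}{2}\lambda \lVert \omega \rVert^{2}$.
Here, $i$ is the sample id, $k$ is the tree id (number of rounds), $l(y_i, \hat{y}_i)$ represents the prediction error of the $i$th sample, $\sum_{k} \Omega(f_k)$ penalizes the complexity of the trees, $T$ is the number of leaf nodes, and $\omega$ is the value of the leaf nodes. When the regularization parameters are set to zero, the objective falls back to traditional gradient tree boosting.
The tree ensemble model in [Eq. (2)] includes functions as parameters and is trained in an additive manner. For each iteration, the training objective function of a tree can be written as
$$\mathrm{Obj}^{(t)} = \sum_{i} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t),$$
where $\hat{y}_i^{(t-1)}$ is the prediction of the $i$th instance at iteration $t-1$, and $f_t(x)$ is the tree added to fit the residual. The objective function is approximated by Taylor's second-order expansion as follows:
$$\mathrm{Obj}^{(t)} \simeq \sum_{i} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^{2}(x_i) \Big] + \Omega(f_t),$$
where $g_i = \partial_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big)$ and $h_i = \partial^{2}_{\hat{y}^{(t-1)}} l\big(y_i, \hat{y}^{(t-1)}\big)$ are the first- and second-order gradient statistics on the loss function.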
In addition to the regularized objective, shrinkage and column subsampling are used
to further prevent overfitting.
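The second-order approximation can be checked numerically. The sketch below uses the logistic loss, for which the gradient and Hessian with respect to the raw score are p - y and p(1 - p), and compares the exact regularized objective with its Taylor approximation for a small candidate tree. The labels, scores, leaf weights, and the value lambda = 1 are hypothetical choices for the example; gamma = 0.4 is the value listed for IB in Table 4.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def logloss(y, margin):
    """Logistic loss for a label y in {0, 1} and a raw score (margin)."""
    p = sigmoid(margin)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def grad_hess(y, margin):
    """First- and second-order derivatives of the logistic loss w.r.t. the margin."""
    p = sigmoid(margin)
    return p - y, p * (1 - p)

def omega(leaf_weights, gamma, lam):
    """Complexity penalty: gamma * T + 0.5 * lambda * sum of squared leaf weights."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)

# Hypothetical labels and current margins (predictions after t - 1 rounds).
y = [1, 0, 1, 0]
prev = [0.2, -0.4, 1.0, 0.3]
# Hypothetical new tree with two leaves; each sample receives its leaf's weight.
leaf_weights = [0.1, -0.1]
leaf_of = [0, 1, 0, 1]
step = [leaf_weights[j] for j in leaf_of]

gamma, lam = 0.4, 1.0
penalty = omega(leaf_weights, gamma, lam)
exact = sum(logloss(yi, m + f) for yi, m, f in zip(y, prev, step)) + penalty
approx = penalty
for yi, m, f in zip(y, prev, step):
    g, h = grad_hess(yi, m)
    approx += logloss(yi, m) + g * f + 0.5 * h * f * f
```

For small leaf weights the two values agree closely, which is why the boosting step can be optimized using only the gradient statistics of the current predictions.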
Model Generation
As shown in [Fig. 4C], during the training phase of the augmented model with inferred blood features (abbreviated as IB), the original training data were partitioned into 10 equal parts. Each time, nine complete parts were used to impute the blood test data for the one remaining part, whose blood test values were treated as missing. After 10 imputations, every part had inferred blood test values. MICE was used to impute blood test data from home-based data, and the inferred results were provided to the regularized gradient-boosted decision tree algorithm as additional features. In the testing phase, only home-based features were used in the test dataset, and blood test data were inferred by the same method. The prediction of the blood test data using MICE in the test dataset utilized a priori knowledge of the large training dataset.
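The 10-part scheme is a form of out-of-fold inference: each record's blood test values are inferred by a model that never saw that record's true values. A minimal sketch follows; the `fit` and `predict` callables stand in for the MICE step, and the mean-based imputer and the toy records are hypothetical placeholders.

```python
import random

def ten_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal parts."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def out_of_fold_impute(rows, fit, predict, k=10):
    """Infer a target for every row from a model fitted on the other k - 1 parts."""
    folds = ten_fold_indices(len(rows), k)
    inferred = [None] * len(rows)
    for held_out in folds:
        held = set(held_out)
        train = [rows[i] for i in range(len(rows)) if i not in held]
        model = fit(train)
        for i in held_out:
            # Only home-based features of the held-out row are passed on.
            inferred[i] = predict(model, rows[i]["home"])
    return inferred

# Stand-in imputer (an assumption): predicts the mean TG of the fitting rows.
def fit(train):
    return sum(r["tg"] for r in train) / len(train)

def predict(model, home):
    return model

rows = [{"home": {"bmi": 20 + i % 7}, "tg": 1.0 + 0.01 * i} for i in range(50)]
tg_inferred = out_of_fold_impute(rows, fit, predict)
```

The same fitted procedure can then be applied to test records that carry only home-based features.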
We compared the performance of IB with those of two other models. One model (abbreviated as HOME) was given only the home-based features ([Fig. 4A]), and the other (abbreviated as RBTIBE) was trained with extra blood test data ([Fig. 4B]).
In addition to the HOME model, another baseline model could be the one that includes
only features from year 1. However, the main goal of this work is to provide a continuous
self-assisted diagnosis of MetS, so we did not use previous data to predict the risk
of developing MetS in the future. Therefore, we chose HOME as the baseline model,
which continuously uses the latest physiological data (BMI, SBP, and DBP) as the input.
Furthermore, the IB model proposed here also requires the latest physiological data
to infer the blood features for modeling; therefore, for a consistent comparison,
the HOME model was selected as the baseline model.
HOME contained home-based variables (sex, age, BMI, SBP, DBP, BMI_L, SBP_L, and DBP_L)
and used only home-based variables to directly train the regularized gradient-boosted
decision tree model to achieve a MetS auxiliary diagnosis, while RBTIBE was trained
using additional true blood test data (TG, HDL-C, FBG); accordingly, the three features
missing from the test dataset were interpolated using MICE. IB consisted of two steps:
step 1 used home-based variables (sex, age, BMI, SBP, DBP, BMI_L, SBP_L, and DBP_L)
to predict blood test data (TG, HDL-C, FBG), and step 2 merged the home-based variables
and the inferred values from the first step for MetS modeling. Compared with HOME,
the difference was that the training data had additional blood test data; compared
with RBTIBE, the difference was that the inferred blood test features were used for
regularized gradient-boosted decision tree training rather than real blood test features.
Tenfold cross-validation was applied in the boosting part of the three models for
parameter adjustment and selection, which ensured the reliability of area under the
curve (AUC) and limited overfitting to some extent.
Our purpose is to achieve a better model for MetS self-care. Blood test data are unavailable
at home; therefore, the three models were generated to compare their performance using
a test dataset that contains only home-based data.
In the regularized gradient-boosted decision tree model, parameter optimization was performed using a grid search. The parameters comprised general parameters, booster parameters, and task parameters. We first chose a relatively high learning rate (0.3) and determined the optimal number of trees for it. With the learning rate and number of trees fixed, we then tuned the tree-specific parameters (max_depth, min_child_weight, gamma, subsample, colsample_bytree). Next, the regularization parameters (lambda, alpha) were tuned to help reduce model complexity and enhance performance. Finally, we lowered the learning rate and settled on the optimal parameters. [Table 4] lists the final classifier parameter values.
Table 4
Parameters of the regularized gradient boosted decision tree model

| Parameter | HOME | RBTIBE | IB |
|---|---|---|---|
| nrounds | 100 | 100 | 100 |
| booster | gbtree | gbtree | gbtree |
| objective | reg:logistic | reg:logistic | reg:logistic |
| eta | 0.1 | 0.1 | 0.1 |
| gamma | 0.6 | 0.5 | 0.4 |
| max_depth | 6 | 6 | 6 |
| max_delta_step | 0 | 0 | 0 |
| min_child_weight | 1 | 1 | 1 |
| subsample | 0.9 | 0.9 | 0.8 |
| colsample_bytree | 0.5 | 0.7 | 0.8 |
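For readers who want to reuse the settings in Table 4, they can be written as plain parameter dictionaries. This is only a transcription of the table; the helper `params_for` and the constant names are ours, and how the dictionary is passed to a boosting library is left open.

```python
# Values common to all three models in Table 4.
COMMON = {
    "booster": "gbtree",
    "objective": "reg:logistic",
    "eta": 0.1,
    "max_depth": 6,
    "max_delta_step": 0,
    "min_child_weight": 1,
}

# Values that differ between models.
PER_MODEL = {
    "HOME": {"gamma": 0.6, "subsample": 0.9, "colsample_bytree": 0.5},
    "RBTIBE": {"gamma": 0.5, "subsample": 0.9, "colsample_bytree": 0.7},
    "IB": {"gamma": 0.4, "subsample": 0.8, "colsample_bytree": 0.8},
}

NROUNDS = 100  # number of boosting rounds (nrounds in Table 4)

def params_for(model):
    """Return the full parameter dictionary for one of the three models."""
    return {**COMMON, **PER_MODEL[model]}

ib_params = params_for("IB")
```

Only gamma, subsample, and colsample_bytree differ across the three models; everything else is shared.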
Evaluation Metrics
The AUC, sensitivity (true positive rate, TPR), specificity (true negative rate, TNR), precision (positive predictive value, PPV), negative predictive value (NPV), accuracy (ACC), F1 score, and the area under the precision-recall curve (AUPRC) were used to evaluate the predictive performance of the three models. The numbers of false positives, false negatives, true positives, and true negatives in a confusion matrix are written as FP, FN, TP, and TN, respectively. The evaluation metrics are calculated as follows:
$$\mathrm{TPR} = \frac{TP}{TP + FN}, \quad \mathrm{TNR} = \frac{TN}{TN + FP}, \quad \mathrm{PPV} = \frac{TP}{TP + FP}, \quad \mathrm{NPV} = \frac{TN}{TN + FN},$$
$$\mathrm{ACC} = \frac{TP + TN}{TP + TN + FP + FN}, \quad F1 = \frac{2 \cdot \mathrm{PPV} \cdot \mathrm{TPR}}{\mathrm{PPV} + \mathrm{TPR}}.$$
The receiver operating characteristic (ROC) curve[30] is plotted with the TPR as the ordinate and the false positive rate as the abscissa, and is often used to evaluate the merits of a binary classifier. The precision-recall (PR) graph[31] takes precision as the ordinate and recall as the abscissa, which visually shows the recall and precision of the learner on the sample.
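The threshold-based metrics and the AUC can be computed directly from their definitions, as in the sketch below. The confusion-matrix counts and the scores are hypothetical, and the AUC is computed by its pairwise-ranking interpretation (the probability that a random positive scores above a random negative), which is one of several equivalent ways to obtain it.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute the threshold-based metrics from confusion-matrix counts."""
    tpr = tp / (tp + fn)                 # sensitivity / recall
    tnr = tn / (tn + fp)                 # specificity
    ppv = tp / (tp + fp)                 # precision
    npv = tn / (tn + fn)
    acc = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * tpr / (ppv + tpr)
    return {"TPR": tpr, "TNR": tnr, "PPV": ppv, "NPV": npv, "ACC": acc, "F1": f1}

def auc(pos_scores, neg_scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count half)."""
    wins = sum((p > n) + 0.5 * (p == n) for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))

# Hypothetical confusion matrix and scores, for illustration only.
m = classification_metrics(tp=80, fp=20, tn=180, fn=20)
example_auc = auc([0.9, 0.8, 0.4], [0.7, 0.3, 0.2])
```

Unlike the AUC, the other six metrics depend on the classification threshold, so they should be compared across models at a common threshold.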
Results
Model Comparison
In the first step of the augmented model, our goal was to obtain the lowest root mean square error (RMSE) for each of the inferred blood test features. [Table 5] lists the RMSE and mean absolute percentage error (MAPE) of MICE.
Table 5
Interpolation effect of the MICE method

| Interpolation accuracy | TG | HDL-C | FBG |
|---|---|---|---|
| RMSE | 0.0693 | 0.0734 | 0.0763 |
| MAPE | 0.0173 | 0.0267 | 0.0276 |
Abbreviations: FBG, fasting blood glucose; HDL-C, high-density lipoprotein cholesterol;
MAPE, mean absolute percentage error; MICE, multivariate imputation by chained equations;
RMSE, root mean square error; TG, triglyceride.
The AUC, sensitivity (TPR), specificity (TNR), precision (PPV), NPV, ACC, F1 score, and AUPRC of the three models are shown in [Table 6]. All threshold-dependent metrics were computed at the same threshold of 0.425. We assessed the performance of each model on the test dataset using ROC curves ([Fig. 5A]) and PR graphs ([Fig. 5B]).
Table 6
Performance of the three models

| Model | AUC | 95% CI | Sensitivity | Specificity | Precision | NPV | F1 | ACC | AUPRC | p-Value of AUC |
|---|---|---|---|---|---|---|---|---|---|---|
| HOME | 0.905 | 0.902–0.907 | 0.702 | 0.897 | 0.609 | 0.929 | 0.652 | 0.860 | 0.703 | <0.001 |
| RBTIBE | 0.950 | 0.949–0.951 | 0.809 | 0.922 | 0.705 | 0.955 | 0.753 | 0.901 | 0.842 | NA |
| IB | 0.971 | 0.970–0.971 | 0.856 | 0.935 | 0.751 | 0.966 | 0.800 | 0.920 | 0.917 | <0.001 |
Abbreviations: AUC, area under the curve; ACC, accuracy; AUPRC, area under the precision-recall
curve; CI, confidence interval; IB, inferred blood features; NPV, negative predictive
value.
Fig. 5 (A) ROC curves of the three classifiers. (B) Precision-recall graphs of the three classifiers. ROC, receiver operating characteristic.
The AUC value of the test dataset reflects the overall discriminative power of the classifier.[32] As shown in [Table 6], the AUCs of HOME and RBTIBE are 0.905 (95% CI: 0.902–0.907) and 0.950 (95% CI: 0.949–0.951), respectively. The overall performance of RBTIBE is better than that of HOME (p < 0.001), which indicates that RBTIBE has higher reliability and accuracy. Furthermore, IB outperforms RBTIBE on every indicator (AUC: 0.971 vs. 0.950, p < 0.001; ACC: 0.920 vs. 0.901; F1: 0.800 vs. 0.753; AUPRC: 0.917 vs. 0.842), confirming the advantage of blood test data imputation in the training process. Recall increased to 0.856 (RBTIBE: 0.809), meaning that IB detects more of the people at high risk of MetS and misses fewer of them. Precision on the test dataset was also higher (0.751 vs. 0.705), and specificity improved from 0.922 to 0.935, which reflects IB's better recognition of people at low risk of MetS.
The performance improvement of IB compared with HOME was due to the input of additional inferred blood test features, which are important factors in the diagnosis of MetS, in both the training and testing processes. Interestingly, the only difference between RBTIBE and IB is that during the training process, IB used the inferred blood test features for the regularized gradient-boosted decision tree model instead of the actual blood test features used in RBTIBE. A plausible explanation is that using blood test data inferred by the same MICE method in both training and testing gives IB better data consistency and lower estimation bias than RBTIBE, which uses actual blood test data in training but inferred data in testing; this may have optimized the trained model and improved the performance of IB.
Multiscene Model Analysis
If a person has undergone a physical examination in the previous year, some of the resulting blood test information could be used to improve the performance of the augmented model. We developed seven additional augmented models (IB1–IB7) for different scenarios to evaluate the applicability of our augmented method. [Table 7] lists the scenarios and the corresponding models. The performances are shown in [Table 8] and [Fig. 6].
Table 7
Different scenarios of the augmented models

| Model | Measured variables | Inferred variables in step 1 |
|---|---|---|
| IB | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | TG, HDL-C, FBG |
| IB1 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | TG, HDL-C |
| IB2 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | HDL-C, FBG |
| IB3 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | TG, FBG |
| IB4 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | HDL-C |
| IB5 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | TG |
| IB6 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | FBG |
| IB7 | sex, age, BMI, SBP, DBP, BMI_L, SBP_L, DBP_L | none |
Abbreviations: BMI, body mass index; BMI_L, body mass index of the subsequent year;
DBP, diastolic blood pressure; DBP_L, diastolic blood pressure of the subsequent year;
FBG, fasting blood glucose; HDL-C, high-density lipoprotein cholesterol; IB, inferred
blood features; SBP, systolic blood pressure; SBP_L, systolic blood pressure of the
subsequent year; TG, triglyceride.
Table 8
Performance of IB in different scenarios

| Model | AUC | 95% CI | Sensitivity | Specificity | Precision | NPV | F1 score | AUPRC | ACC |
|---|---|---|---|---|---|---|---|---|---|
| IB | 0.971 | 0.970–0.971 | 0.856 | 0.935 | 0.751 | 0.966 | 0.800 | 0.917 | 0.920 |
| IB1 | 0.979 | 0.978–0.979 | 0.881 | 0.943 | 0.779 | 0.972 | 0.827 | 0.934 | 0.931 |
| IB2 | 0.984 | 0.983–0.984 | 0.899 | 0.949 | 0.801 | 0.976 | 0.847 | 0.947 | 0.940 |
| IB3 | 0.980 | 0.979–0.980 | 0.882 | 0.944 | 0.782 | 0.972 | 0.828 | 0.938 | 0.932 |
| IB4 | 0.987 | 0.987–0.987 | 0.910 | 0.953 | 0.815 | 0.979 | 0.860 | 0.958 | 0.945 |
| IB5 | 0.982 | 0.981–0.982 | 0.887 | 0.946 | 0.791 | 0.973 | 0.836 | 0.945 | 0.935 |
| IB6 | 0.986 | 0.986–0.986 | 0.906 | 0.952 | 0.814 | 0.978 | 0.857 | 0.958 | 0.944 |
| IB7 | 0.993 | 0.993–0.993 | 0.941 | 0.961 | 0.848 | 0.986 | 0.892 | 0.976 | 0.958 |
Abbreviations: AUC, area under the curve; ACC, accuracy; AUPRC, area under the precision-recall
curve; IB, inferred blood features; CI, confidence interval; NPV, negative predictive
value.
Fig. 6 (A) ROC curves of the classifiers. (B) Precision-recall graphs of the classifiers. ROC, receiver operating characteristic.
As shown in [Table 8], the seven augmented models all demonstrate good predictive performance in their scenarios, and performance improves further as more blood test data become available (AUC ranged from 0.979 to 0.993). That is, the augmented method is also suitable when previous blood test data are provided, and it maintains excellent performance for home-based MetS auxiliary diagnosis. If a person can provide extra blood test results from physical examinations for self-care, the model will show even better predictive performance. However, the performance of the best model (IB7) did not differ significantly from that of IB (p < 0.014).
Discussion
The main purpose of our proposed model is to provide a ubiquitous approach to MetS self-diagnosis for self-care, given the low awareness of physical examinations among individuals in China. Therefore, in the application scenarios, the model must work with the smallest set of input data that can be acquired at home. However, the recall of HOME (0.702) with home-based inputs alone was not sufficient for MetS self-diagnosis and management. Thus, we took the blood test data into consideration to enrich the features (RBTIBE) and utilized the MICE method to impute the blood test data instead of using the raw data (IB); our study thereby provides new ideas for innovative research in health management.
Among the three models, the performance of RBTIBE was much better than that of HOME,
which implies the importance of blood test data for the auxiliary diagnosis of MetS.
Furthermore, we developed an augmented model (IB) that uses a large amount of physical examination data to predict the blood test items instead of using real blood test data. Concretely, the MICE method was used to learn the relationship between blood test data and home-based data within the physical examination data, and the output was used in the second step to develop a model with better predictive performance. As shown in [Table 6], the AUC, ACC, F1 score, and AUPRC of IB were better than those of RBTIBE, which confirmed the advantage of the blood test data imputation in the training process.
The superior results in IB showed that our model, which is constructed from existing
health check-up data, may have the ability to provide MetS self-diagnosis and promote
health management, verifying the availability of the augmented method and the feasibility
of MetS self-diagnosis. In addition, the recall of IB was 0.856, which embodies the
model's good ability to recognize MetS patients in the second year. The ability of
the augmented model to identify the at-risk MetS population is acceptable, especially
for the minority who developed MetS in the second year.
Since prevention and treatment of MetS have become a global issue,[20] several algorithmic approaches have already been applied to various aspects of MetS care, including the findings of associated risk factors,[33][34][35][36][37][38] prediction of complications,[39][40] and large-scale factors such as managing health care systems.[41][42] In particular, several studies have focused on the early prediction or diagnosis of MetS and demonstrated its clinical significance,[16][17][43] and several efforts have been made to improve the performance of models.[18][44] Several effective machine learning methods proposed by Akihiro Shimoda and Daisuke Ichikawa could immediately obtain an accurate diagnosis of MetS and determine the candidates for health guidance by using an individual's historical medical examination data.[16][17][18] A primary motivation for our study, however, is that despite these efforts, a home-based
auxiliary diagnosis method for MetS would be more versatile and more valuable in China
because of the low rate of participation in physical examinations. Moreover, for the
test dataset with only home-based data (missing important blood test data), our goal
was to use the augmented method with inferred blood features to obtain an effective
model with good performance.
The implementation of our method could ensure that the self-diagnosis of MetS is not limited by time or place and could support effective self-care. Compared with the MetS models in a recent study,[16] the convenience of a MetS auxiliary diagnosis at home can increase the frequency and performance of MetS self-examination, which could ameliorate China's national health check-up conditions. A variety of studies have attempted to achieve more effective self-management to improve health.[45][46][47] The ubiquitous auxiliary diagnostic approach could substantially improve the national health level for the following reasons: (1) precise prediction plays an important role in enhancing people's health awareness, which helps people gain a clear understanding of their health condition and engage in better self-care behaviors, such as seeking targeted treatment and avoiding blind medication, and greater disease awareness helps reduce the risk of disease; (2) our method enhances awareness of MetS and encourages high-risk patients to go to the hospital for further examinations; and (3) for people who already have regular physical examinations, our model supports healthy self-management in daily life.