Predictive Ability of a Clinical-Genetic Risk Score for Venous Thromboembolism in Northern and Southern European Populations

Venous thromboembolism (VTE) is a complex, multifactorial problem, the development of which depends on a combination of genetic and acquired risk factors. In a Spanish population, the Thrombo inCode score (or TiC score), which combines clinical and genetic risk components, was recently shown to be better at determining the risk of VTE than the commonly used model involving the analysis of two genetic variants associated with thrombophilia: the Factor V Leiden (F5 rs6025) and the G20210A prothrombin (F2 rs1799963). The aim of the present case–control study was to validate the VTE risk predictive capacity of the TiC score in a Northern European population (from Sweden). The study included 173 subjects with VTE and 196 controls. All were analyzed for the genetic risk variants included in the TiC gene panel. Standard measures, namely receiver operating characteristic (ROC) area under the curve (AUC), sensitivity, specificity, and odds ratio (OR), were calculated. The TiC score returned an AUC value of 0.673, a sensitivity of 72.25%, a specificity of 60.62%, and an OR of 4.11. These AUC, sensitivity, and OR values are all greater than those associated with the currently used combination of genetic variants. A TiC version adjusted for the allelic frequencies of the Swedish population significantly improved its AUC value (0.783). In summary, the TiC score returned more reliable risk estimates for the studied Northern European population than did the analysis of the Factor V Leiden and the G20210A genetic variations in combination. Thus, the TiC score can be reliably used with European populations, despite differences in allelic frequencies.


Introduction
Venous thromboembolism (VTE), primarily deep vein thrombosis (DVT) and pulmonary embolism (PE), is a common disorder that affects some 0.2% of the population annually. Mortality rates reach between 5% (DVT) and 33% (PE) within the first months of diagnosis. 1,2 VTE is thought to be the leading cause of preventable hospital mortality. 3 Unfortunately, survivors of VTE are at risk of long-term complications, such as recurrence, postthrombotic syndrome, and pulmonary hypertension. 1,4,5 VTE recurs in 20 to 30% of patients within 5 years. 6,7 It is, therefore, a considerable public health problem with a large economic burden. 8–10 VTE is a complex, multifactorial problem, the development of which depends on a combination of genetic and acquired risk factors (with the former responsible for some 60% of the total risk). 11 Until recently, the risk of VTE was determined by testing for the Factor V Leiden (F5 rs6025) and the G20210A prothrombin (PT) (F2 rs1799963) genetic variants only (hereinafter the F5L + F2 combination). This has been challenged, however, by the Thrombo inCode (TiC) score, a clinical/genetic algorithm for assessing the risk of VTE developed by Soria et al. 12 As well as taking into account several clinical variables, the TiC score includes low-frequency genetic variants with high odds ratios (ORs) for thrombosis, as well as common risk alleles with low ORs. Compared with the F5L + F2 combination, this risk score returned a significantly higher area under the receiver operating characteristic (ROC) curve (AUC) for a population from Sant Pau in Spain (0.677 vs. 0.575; p < 0.001). It also showed good reclassification capacity and had a high integrated discrimination index. 12
The TiC score takes into account the single-nucleotide polymorphisms (SNPs) F2 rs1799963, F5 rs6025 (Factor V Leiden), F5 rs118203905 (Factor V Cambridge), F5 rs118203906 (Factor V Hong-Kong), F12 rs1801020 (in the gene for Factor XII), SERPINC1 rs121909548 (Antithrombin Cambridge II), and SERPINA10 rs2232698 (in the gene that codes for protein Z-dependent protease inhibitor), along with SNPs in the ABO gene that predispose one to blood type A1 (ABO rs8176719, ABO rs7853989, ABO rs8176743, and ABO rs8176750), and a protective variant F13 rs5985 (in the gene that codes for Factor XIII). The addition of genetic variants from genome-wide association studies, such as F11 rs2036914 (in the gene for Factor XI) and fibrinogen gamma gene (FGG) rs2066865 (in the gene for the fibrinogen γ chain), did not improve the results obtained. 12 One of the potential concerns with genetic risk scores (GRS), however, is that the magnitude and direction of allelic effects can differ between populations. An example is the north European aggregation of F5 rs6025. 13 The main aim of the present work was to validate the VTE risk predictive capacity of the TiC score in subjects from a Northern European country (Sweden), among whom the frequency of at least F5 rs6025 is higher than in southern Europe.

Study Population
This case-control study was performed at the Department of Molecular Medicine and Surgery at the Karolinska Institute (Stockholm, Sweden), in a Swedish population whose members had experienced a VTE, and who had consequently been tested for thrombophilia. All consecutive adult patients with samples examined by the Karolinska University Hospital Coagulation Laboratory between 2014 and 2016 for inherited thrombophilia were invited to participate. After written informed consent was obtained, only those who fulfilled the clinical criteria for thrombophilia testing were included, that is, having suffered a first provoked or unprovoked VTE (DVT or PE) before the age of 50. As family history is usually used to identify subjects at high risk of VTE, the controls were selected to have a family history of VTE similar to that of the cases. The final study population was 173 unrelated patients; 196 apparently healthy persons were recruited as controls.
To avoid genetic stratification, the members of both the case and control groups were all recruited from central Sweden. None of the patients or controls had been prescribed VTE prophylactic treatment.
A medical history was obtained for each subject, including their acquired risk factors for VTE. A diagnosis of DVT in the lower limbs was established objectively by ultrasonography or ascending venography. PE was diagnosed by computed tomography, pulmonary angiography, or ventilation-perfusion lung scintigraphy. A subject's family history was considered positive if at least one first-degree family member had suffered a VTE.

DNA Extraction and Genetic Analysis
DNA was extracted from leukocytes in ethylenediaminetetraacetic acid-treated whole blood by digestion and selective precipitation with ethanol in an automated QiaCube system using the QIAamp DNA-blood mini Kit (Qiagen, Düsseldorf, Germany) following the manufacturer's instructions. Extracts were stored at -20°C until use. ►Table 1 shows a comprehensive summary of the genes examined. The prothrombotic genetic variables associated with the TiC score were genotyped using the Thrombo inCode kit (GEN inCode, Barcelona, Spain).
The F11 rs2036914 and FGG rs2066865 variants were genotyped using the TaqMan genotyping assay from Life Technologies (Foster City, California, United States) on the Fluidigm genotyping platform (South San Francisco, California, United States). All genetic analyses were performed at Gendiag.exe (Barcelona, Spain).

Determining the Individual Risk of Venous Thrombosis
The individual risk for first VTE was determined using the TiC score risk algorithm. This uses the results of the genetic analysis associated with the score alongside clinical variables recognized as risk factors of VTE: age, gender, body mass index (BMI), smoking habit, presence of type II diabetes, and a family history of thrombosis (►Table 2). For women, it also includes pregnancy and treatment with prothrombotic hormonal contraceptives. 2,7,14 An overall risk value is then determined. 12 The capacity of other GRS and clinical/GRS algorithms (►Table 2) to determine the risk of VTE was also examined. In ►Table 2, TiC*Clinical ONLY denotes a score formed only by the clinical variables included in TiC, with the same weights; none of the genetic variants is included. When these other scores involved either the clinical or genetic risk components used in the TiC score, the variables therein were given the same weight as in that score. In a further analysis, logistic regression was used to modify the weight assigned to the different genetic variants in the original Spanish population, to reflect the allelic frequencies of the cases and controls in the Swedish population.
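The re-weighting step described above can be illustrated with a small logistic regression fit. This is only a sketch under stated assumptions, not the TiC algorithm or the authors' procedure: the two variants (`variant_a`, `variant_b`), the allele counts, and the function `fit_logistic` are hypothetical and invented for illustration. Each subject is described by the number of risk alleles carried (0, 1, or 2), and the fitted coefficients play the role of population-specific weights.

```python
import math

def fit_logistic(rows, labels, learning_rate=0.1, iterations=5000):
    """Fit a logistic regression by plain gradient ascent on the log-likelihood.

    rows   : list of feature lists (here, risk-allele counts per variant)
    labels : list of 0/1 outcomes (1 = case, 0 = control)
    Returns (intercept, weights).
    """
    n = len(rows)
    intercept = 0.0
    weights = [0.0] * len(rows[0])
    for _ in range(iterations):
        grad_b = 0.0
        grad_w = [0.0] * len(weights)
        for x, y in zip(rows, labels):
            z = intercept + sum(w * xi for w, xi in zip(weights, x))
            p = 1.0 / (1.0 + math.exp(-z))   # predicted probability of being a case
            err = y - p
            grad_b += err
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
        # Step along the mean gradient
        intercept += learning_rate * grad_b / n
        weights = [w + learning_rate * g / n for w, g in zip(weights, grad_w)]
    return intercept, weights

# Hypothetical allele counts (made-up data, not from the study).
variant_a_cases    = [1, 1, 1, 0, 1, 2, 0, 1, 1, 0]
variant_a_controls = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
variant_b_all      = [0, 1, 0, 1, 0, 1, 0, 1, 0, 1]  # same pattern in both groups

rows = [[a, b] for a, b in zip(variant_a_cases, variant_b_all)]
rows += [[a, b] for a, b in zip(variant_a_controls, variant_b_all)]
labels = [1] * len(variant_a_cases) + [0] * len(variant_a_controls)

intercept, weights = fit_logistic(rows, labels)
print("weight variant_a:", round(weights[0], 2))
print("weight variant_b:", round(weights[1], 2))
```

In this toy data set, the variant that is enriched among cases receives the larger weight, which is the sense in which refitting in a new population adapts the weights to that population's allele distribution.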
Finally, the results obtained for the Swedish population were compared with those of the Spanish population. 12 The study protocol was approved by the Regional Research Ethics Committee, Stockholm, Sweden (EPN2014/987-31/1).

Statistical Analysis
The predictive capacity of the risk scores was evaluated using the area under the ROC curve (AUC; larger values indicate better discrimination). 15 The DeLong test was used to compare the AUC values of the different GRS and risk algorithms. Optimal cutoffs for each were calculated from the ROC data using the Youden Index.
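As an illustration of the two quantities just described, the sketch below computes the AUC as the proportion of case-control pairs in which the case has the higher score (ties counting one half), and selects the cutoff that maximizes the Youden index J = sensitivity + specificity - 1. The scores are made up for illustration, and the function names are hypothetical; this is not the software used in the study.

```python
def auc_pairwise(case_scores, control_scores):
    """AUC as the probability that a random case outscores a random control.

    Ties contribute one half, which is equivalent to the Mann-Whitney U
    statistic divided by the number of case-control pairs.
    """
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

def youden_cutoff(case_scores, control_scores):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.

    A subject is classed as high risk when score >= cutoff.
    If several cutoffs tie, the lowest one is kept.
    """
    best = None
    for cutoff in sorted(set(case_scores) | set(control_scores)):
        sens = sum(s >= cutoff for s in case_scores) / len(case_scores)
        spec = sum(s < cutoff for s in control_scores) / len(control_scores)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cutoff, sens, spec)
    return best[1], best[2], best[3]

# Made-up risk scores, for illustration only.
cases = [0.9, 0.8, 0.7, 0.6, 0.55]
controls = [0.7, 0.5, 0.3, 0.2, 0.1]

auc = auc_pairwise(cases, controls)
cutoff, sens, spec = youden_cutoff(cases, controls)
print(auc, cutoff, sens, spec)
```

The pairwise loop is quadratic in the number of subjects, which is acceptable for a few hundred cases and controls; a rank-based calculation would be preferable for larger data sets.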
Standard measures (sensitivity, specificity, OR, and positive and negative likelihood ratios [LR+, LR-]) 16 were calculated and compared using MedCalc v.18.6 software (http://www.medcalc.org; 2018), which implements several methods for each measure. Briefly, sensitivity and specificity were compared using the McNemar test, and likelihood ratios were compared using the chi-squared test. Age and BMI were compared using the Mann-Whitney U test, and proportions using the chi-squared test.
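The likelihood ratios follow directly from sensitivity and specificity: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A minimal sketch, using as inputs the TiC sensitivity and specificity quoted in the abstract (72.25% and 60.62%); the function name is illustrative, and this is not the MedCalc implementation.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) from sensitivity and specificity given as fractions."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg

# Sensitivity and specificity of the TiC score as quoted in the abstract.
tic_lr_pos, tic_lr_neg = likelihood_ratios(0.7225, 0.6062)
print(round(tic_lr_pos, 2), round(tic_lr_neg, 2))
```

An LR+ above 1 means a positive result raises the odds of disease, and an LR- below 1 means a negative result lowers them; the further each is from 1, the more informative the test.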

Results
►Table 3 shows the distribution of the clinical risk factors. As expected, the known risk factors of smoking, BMI, gender, and age differed significantly between cases and controls. The fact that there were more women in the control group probably contributed to the nonsignificant contribution of other risk factors such as pregnancy and procoagulant hormonal contraceptives. Among the 173 patients with VTE, the event was provoked in 56 (32.37%). ►Table 4 shows the prevalence of the VTE risk alleles among the case subjects and controls. The risk alleles in the A1 ABO blood group, SERPINA10 rs2232698 and FGG rs2066865, were significantly more frequent in case subjects than in controls.
►Table 5 shows the prognostic characteristics of all the GRS and risk algorithms. The TiC score (TiC) returned a higher AUC value than the F5L + F2 combination (0.673 vs. 0.537; p < 0.0001). Standard accuracy measures were calculated at the cutoff yielded by the Youden index. The TiC score had a higher sensitivity than F5L + F2 (74.57 vs. 28.90; p < 0.0001) and a better LR+ (1.83 vs. 1.33; p < 0.0001) and LR- (0.46 vs. 0.91; p < 0.0001); however, it returned a lower specificity score (60.62 vs. 78.24; p < 0.001). These results for the Swedish population are very similar to those obtained for the Spanish population. 12
►Table 5 compares the results for the TiC score with the additional GRS and risk algorithms outlined in ►Table 2. The use of the TiC clinical variables alone (TiC*Clinical ONLY)
Table 1 Genetic variants analyzed across the three genetic risk scores examined
The addition of F11 rs2036914 and FGG rs2066865 to the TiC score GRS did not improve its AUC (TiC*GRS-ONLY + F11 + FGG vs. TiC*GRS-ONLY, 0.608 vs. 0.588; p = 0.0575). Nor did it improve the AUC when these two variants were included in the TiC score algorithm (TiC + F11 + FGG vs. TiC, 0.679 vs. 0.673; p = 0.4112). These observations were also similar to those obtained for the Spanish population. 12 As the control population was selected to have a family history similar to that of the cases, TiC proved more useful than family history for identifying patients at high risk of VTE.
►Table 6 shows that in both populations the allelic frequency of the risk alleles for blood type A1 was higher among the case subjects than the controls. The Swedish case subjects also showed a higher frequency for the risk alleles SERPINA10 rs2232698 and FGG rs2066865 compared with the controls. 12 In contrast, the Spanish cases had higher allelic frequencies for F12 rs1801020, F2 rs1799963, and the risk alleles in the gene for Factor V.
►Table 7 shows the general differences between the Swedish and Spanish populations. The risk alleles in the gene for Factor V, F11 rs2036914, and F12 rs1801020 were more common in the entire Swedish population (i.e., cases plus controls) than in the entire Spanish population. In addition, the Swedish case subjects showed a lower allelic frequency of F13 rs5985 and a higher frequency of the risk allele FGG rs2066865 than did the Spanish case subjects. 12 For the purpose of comparison, a modified TiC GRS (TiC*GRS-ONLY-MOD) and TiC algorithm (TiC*MOD) were studied, adjusting for the allelic frequencies of the Swedish population. The TiC*GRS-ONLY-MOD algorithm returned an AUC value not significantly different from that provided by the original TiC*GRS-ONLY model (0.636 vs. 0.588; p = 0.0678). The TiC*MOD algorithm returned an AUC value significantly higher than that provided by the original TiC (TiC*MOD vs. TiC, 0.783 vs. 0.673; p = 0.0004) in the Swedish population.
Also for the purpose of comparison, and considering that, of the genetic variants included in TiC, the ABO variant was the one most significantly overrepresented in VTE cases, we studied the performance of a new score formed by the ABO, Factor V Leiden, and PT genetic variants. This score showed an AUC of 0.609, with a sensitivity of 69.39 and a specificity of 50.35. Its AUC was significantly lower than that of TiC (p = 0.0165).

Discussion
The main problem in the prevention of VTE is the identification of those who are at serious risk. Evaluating risk factors for VTE is crucial when weighing up the risk of bleeding against that of a first VTE, recurring VTE, or obstetrical complications. It is well accepted that VTE is a multifactorial disease precipitated by a combination of clinical and genetic risk factors. However, accurately predicting a person's risk of developing a VTE is difficult.
Our group and others 12,17,18 have shown that clinical/GRS models for estimating the risk of VTE have better predictive capacity than the classical F5L + F2 combination. The TiC score has also shown clinical value in predicting VTE in patients with cancer, 19 and can be used to identify women with recurrent pregnancy loss in whom thrombophilia may be a contributing factor. 20 The use of a GRS might not always be generalizable across populations given differences in allelic frequencies. However, the present results show that the TiC score, developed in a population from Southern Europe, reliably predicted VTE in a population from Northern Europe, despite significant differences in the allelic frequencies of the contemplated genetic variants. For both populations, the TiC score returned similar results, and always significantly better than the F5L + F2 combination. As reported earlier for the Spanish population, 12 no improvement was seen in the capacity of the TiC score (determined as either TiC*GRS-ONLY or TiC) when F11 rs2036914 and FGG rs2066865 were added to the algorithm. The TiC*GRS-ONLY-MOD score yielded an AUC ROC value no better than that provided by the TiC*GRS-ONLY model (►Table 5). However, the TiC*MOD algorithm, which took into account the specific allelic frequencies of the Swedish population, increased the AUC to 0.783. This suggests that the TiC algorithm can be tailored to various populations to optimize its prediction of risk. Similar behaviors have been reported for other risk algorithms. For instance, the Framingham risk score, which predicts coronary events, can be used worldwide, but adaptations to local populations significantly improve its predictive ability. 21,22 Just as the magnitude and direction of allelic effects can differ between populations, there is concern about the predictive capability of GRS in different ethnicities.
We have studied the capability of a GRS developed in Europeans to predict cardiovascular events in persons of African, Latino, and East-Asian ancestry. 23 We found that the GRS developed in Europeans provided similar results when used in other ethnicities. Wassel et al have studied GRS related to VTE in a multiethnic cohort. 24 They also concluded that the GRS had a similar capability among the different ethnicities. None of those studies examined an adapted GRS.
The clinical utility of TiC lies in the decision-making process, when the physician has to decide whether or not to initiate thromboprophylaxis in a patient. In this setting, TiC has proved to be better than clinical variables and the classic F5L + F2 test in identifying the subjects at high risk of developing a VTE, and therefore in need of thromboprophylaxis. TiC can be regarded as a predictive risk score, because it identifies subjects at high risk of developing VTE.
The present work suffers from the limitation that the number of subjects studied was relatively low. However, in its favor, the TiC score involves an algorithm that combines clinical and genetic variants that individually have been repeatedly associated with VTE in different populations.
In conclusion, the present results show the TiC score to predict the risk for VTE well, and to be better in this regard than the F5L + F2 combination. Further, they show that the TiC score can be used to reliably predict the risk of VTE despite differences in the allelic frequencies between populations. It can do this even better when modified to take into account the specific allelic frequencies of the population under study.

Conflicts of Interest
Patent EP2799556 covering TiC is now the property of Gen inCode. Eduardo Salas appears as an inventor in that patent (giving rights to the previous owner of the patent via an employment relationship). Jose Manuel Soria is an advisor to Gen inCode and reports non-financial support from Gen inCode to perform this study.
What Is Known about This Topic?
• The two genetic variants (Factor V Leiden and 20210 G/A in the PT gene) commonly used for VTE risk estimation perform very poorly.
• A recently developed risk score, TiC, has been shown to have a higher risk estimation capability in southern European patients.
What Does This Paper Add?
• The risk estimation capability of TiC has been verified in Swedish patients.
• The higher risk estimation capability of TiC was preserved despite the difference in the frequency of the genetic variants between northern and southern European populations.
• As happens with other risk estimation equations in the cardiovascular field, the estimation capability of TiC can be improved if it is adapted to the country-specific frequencies of the genetic variants.