Predictive Modeling Identifies Total Bleeds at 12-Weeks Postswitch to N8-GP Prophylaxis as a Predictor of Treatment Response

Background  Predicting annualized bleeding rate (ABR) during factor VIII (FVIII) prophylaxis for severe hemophilia A (SHA) is important for long-term outcomes. This study used supervised machine learning-based predictive modeling to identify predictors of long-term ABR during prophylaxis with an extended half-life FVIII. Methods  Data were from 166 SHA patients who received N8-GP prophylaxis (50 IU/kg every 4 days) in the pathfinder 2 study. Predictive models were developed to identify variables associated with an ABR of ≤1 versus >1 during the trial's main phase (median follow-up of 469 days). Model performance was assessed using area under the receiver operator characteristic curve (AUROC). Pre-N8-GP prophylaxis models learned from data collected at baseline; post-N8-GP prophylaxis models learned from data collected up to 12-weeks postswitch to N8-GP, and predicted ABR at the end of the outcome period (final year of treatment in the main phase). Results  The predictive model using baseline variables had moderate performance (AUROC = 0.64) for predicting observed ABR. The most performant model used data collected at 12-weeks postswitch (AUROC = 0.79) with cumulative bleed count up to 12 weeks as the most informative variable, followed by baseline von Willebrand factor and mean FVIII at 30 minutes postdose. Univariate cumulative bleed count at 12 weeks performed equally well to the 12-weeks postswitch model (AUROC = 0.75). Pharmacokinetic measures were indicative, but not essential, to predict ABR. Conclusion  Cumulative bleed count up to 12-weeks postswitch was as informative as the 12-week post-switch predictive model for predicting long-term ABR, supporting alterations in prophylaxis based on treatment response.


Introduction
Prophylaxis is considered the standard of care for the prevention and management of bleeding in patients with hemophilia A. 1,2 Extended half-life (EHL) recombinant factor VIII (rFVIII) molecules were developed to offer reduced dosing frequency and higher factor activity levels than standard half-life FVIII molecules. 1 Switching patients to prophylaxis with an EHL rFVIII molecule can provide meaningful improvements in health-related quality of life. 3 However, there is significant interpatient variability in response to rFVIII treatments, which may relate to the patient's age, body mass, bleeding phenotype, genotypic variation, ABO blood group, physical activity, and joint status, among other variables. 1,[4][5][6] Identifying patient characteristics that predict long-term outcomes could help to inform clinical decisions for prophylaxis optimization. Annualized bleeding rate (ABR) is an established outcome measure of prophylaxis efficacy as it correlates with long-term joint destruction. 7,8 Previous attempts to predict and improve ABR have been focused on the use of FVIII pharmacokinetic (PK) parameters. Collins et al established that time per week spent at FVIII activity below 1 IU/dL was associated with increased total bleeds. 9 Valentino et al showed that peak FVIII levels, area under the curve, and time spent per week with FVIII levels >20 IU/dL were linked to bleeding risk. 4 More recently, Tiede et al demonstrated that bleeding risk can change over time and is influenced by factors independent to PK parameters, suggesting that other demographic and clinical characteristics are required to predict long-term treatment response. 10 Notably, the aforementioned approaches to predict ABR relate to evidence using standard half-life FVIII molecules for prophylaxis. Since then, guidelines have suggested to target higher trough levels, 2 which is achievable with EHL rFVIII molecules. 
It is reasonable to reassess predictors of long-term clinical response, as these may have changed with improvements in prophylaxis.
Predictive modeling involves the use of analytic techniques to predict clinical outcomes. Machine learning is a powerful computational approach used to recognize patterns in complex, multivariate datasets that include clinical variables and outcomes, enabling the development of predictive models. A significant advantage of machine learning techniques is their ability to handle highly variable datasets, collinearity, and missing data. Importantly, they can be used to identify how much a variable contributes to predicting a subsequent outcome. Rapid advances in machine learning techniques have enabled the application of predictive modeling to data from randomized controlled trials, showing significant promise in using patient and clinical characteristics to identify variables that are strongly associated with a chosen treatment outcome across a range of indications. [11][12][13]
N8-GP is an EHL human rFVIII product that has the potential to provide a simplified prophylaxis treatment for patients with hemophilia A due to its fixed, body-weight-based dosing regimen (50 IU/kg every 4 days [Q4D]). 14 The mean trough level observed during prophylaxis with N8-GP (turoctocog alfa pegol; Esperoct, Novo Nordisk A/S, Bagsvaerd, Denmark) at 50 IU/kg every 4 days was 3 IU/dL. 15 The long-term safety and efficacy of N8-GP were demonstrated in the pivotal pathfinder 2 trial (NCT01480180), which to date is the only clinical trial of an EHL rFVIII molecule where most patients received a fixed dosing regimen for prophylaxis. In most other studies of EHL rFVIII molecules, patients received prophylaxis that was individualized by investigators, or patients were stratified to receive different regimens based on bleeding rate during a run-in phase. 16,17 For these studies, the application of machine learning techniques to identify predictive patterns is difficult, because the intensity of therapy was, at least to some extent, influenced by clinical outcomes. 
Due to its unique design, pathfinder 2 provides for the first time an opportunity to identify predictive patterns in the setting of prophylaxis with an EHL rFVIII molecule.
The purpose of this posthoc analysis was to develop a predictive model to identify pre- and post-N8-GP prophylaxis variables that can act as predictors of clinical response to fixed-dose prophylaxis by applying a machine learning framework to data from the pathfinder clinical trial program.

Research Objectives
The objectives of this analysis were as follows: (1) to identify pre-N8-GP prophylaxis variables (including baseline characteristics) associated with an ABR of ≤1 or >1 bleed/year at the end of the pathfinder 2 main phase; (2) to identify post-N8-GP prophylaxis variables (including treatment-related variables and patient characteristics) associated with an ABR of ≤1 or >1 bleed/year at the end of the pathfinder 2 main phase. For the purposes of this analysis, an ABR threshold of ≤1 (rather than 0) was chosen as the outcome of interest, as the ability to identify and distinguish early bleeds varies between patients, and it is not uncommon for some patients to treat pain as an early bleed.

Data Source
Data from the pathfinder trials (pathfinder 1, NCT01205724; pathfinder 2, NCT01480180; pathfinder 3, NCT01489111; pathfinder 5, NCT01731600; pathfinder 7, NCT02920398) were investigated for the application of supervised machine learning for predictive modeling. Of these, the pivotal pathfinder 2 trial [18][19][20] was selected for analysis due to the comprehensive array of variables from a sufficient sample size of patients. Pathfinder 2 was a phase III, open-label trial investigating long-term safety, PK, and efficacy of N8-GP used for prophylaxis (50 IU/kg Q4D) or on-demand treatment in 186 previously treated patients (aged ≥12 years) with severe hemophilia A. The full details of pathfinder 2 have been reported previously. [18][19][20]

Model Overview
Predictive modeling typically includes a predictive model learning phase, where data are collected for input into the model, and a predictive model outcome period, where the outcome of interest (ABR) is measured. The chosen clinical outcome of interest for prediction was ABR of ≤1 versus >1 at the end of the outcome period. Multiple predictive models using data from different time points in the study period were developed to identify variables pre- and post-N8-GP prophylaxis that were associated with patients achieving an ABR of ≤1 versus >1.
Each predictive model learning phase used data collected from one of five separate time points in pathfinder 2. Pre-N8-GP prophylaxis models learned from data collected at screening visit ("baseline model") or Visit 2 (prophylaxis initiation; "baseline and PK model"). Post-N8-GP prophylaxis models learned from data collected between Visit 2 and 3 (i.e., up to 4 weeks postprophylaxis switch; "post-N8-GP 4-week prophylaxis model"), Visit 4 (i.e., up to 8 weeks postprophylaxis switch; "post-N8-GP 8-week prophylaxis model"), or Visit 5 (i.e., up to 12 weeks postprophylaxis switch; "post-N8-GP 12-week prophylaxis model"), as well as the data collected at baseline (►Fig. 1). The 12-week post-N8-GP prophylaxis time window was chosen as the maximum duration of the predictive model learning phase because patients tend to be followed more closely by clinicians during this period, when initial treatment response becomes apparent, and this is often the earliest time that a change in treatment is considered. Patients receiving prophylaxis who switched to twice-weekly dosing during the trial due to insufficient treatment response were included in the "ABR >1" group, irrespective of their ABR outcome. For patients who did not complete the main phase, ABR was calculated at the time of withdrawal.
Each predictive model included a cohort of patients from pathfinder 2 who received consistent N8-GP prophylaxis (50 IU/kg Q4D). Data were excluded if the patient received on-demand treatment or did not receive at least 90 days of prophylaxis exposure beyond the model's learning time point. As such, the number of patients in each model varied by time point, ranging from 166 patients in the pre-N8-GP prophylaxis models to 161 in the post-N8-GP 12-week prophylaxis model.
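The outcome definition and cohort rules described above can be illustrated with a minimal sketch. It assumes the conventional ABR definition (bleeds divided by exposure time, scaled to one year); the annualization constant, the `PatientOutcome` record, and all function names are hypothetical and are not taken from the pathfinder 2 analysis code.

```python
from dataclasses import dataclass

DAYS_PER_YEAR = 365.25    # assumed annualization constant
MIN_EXPOSURE_DAYS = 90    # minimum exposure beyond the learning time point


@dataclass
class PatientOutcome:
    """Hypothetical per-patient summary of the model outcome period."""
    treated_bleeds: int                  # bleeds counted in the outcome period
    exposure_days: float                 # prophylaxis exposure in the outcome period
    switched_to_twice_weekly: bool = False


def is_eligible(p: PatientOutcome) -> bool:
    """Keep only patients with at least 90 days of outcome-period exposure."""
    return p.exposure_days >= MIN_EXPOSURE_DAYS


def annualized_bleeding_rate(bleeds: int, exposure_days: float) -> float:
    """Bleeds per year of exposure (conventional definition, assumed here)."""
    if exposure_days <= 0:
        raise ValueError("exposure_days must be positive")
    return bleeds / exposure_days * DAYS_PER_YEAR


def outcome_label(p: PatientOutcome, threshold: float = 1.0) -> int:
    """Return 1 for the 'ABR <=1' class and 0 for the 'ABR >1' class."""
    if p.switched_to_twice_weekly:
        # Switchers are placed in the 'ABR >1' group irrespective of ABR.
        return 0
    abr = annualized_bleeding_rate(p.treated_bleeds, p.exposure_days)
    return 1 if abr <= threshold else 0
```

For example, one treated bleed over 469 days of exposure gives an ABR below 1 and is labeled as the "ABR ≤1" class, whereas two bleeds over the same period exceed the threshold.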

Variable Selection
Standardized predictor variables for predictive modeling were selected from the range of patient and treatment characteristics collected at screening visit and after the switch to N8-GP prophylaxis. Variables that were expected to have clinical relevance were identified and reviewed by two clinicians before being selected for inclusion in the predictive analyses (►Supplementary Table S1, available in the online version). Variable selection was based on clinical experience and availability of patient data; no additional criteria for variable selection were applied. Patient-reported outcomes such as EQ-5D were selected initially; however, preliminary model analyses did not support their retention for predictive modeling and hence these variables were excluded.
Fig. 1 Predictive modeling methodology. (a) Multiple predictive models using data from different time points in the main phase were developed to identify which variables were associated with patients achieving an ABR of ≤1 versus >1 during the final stage of the outcome period. (b) ABR values were calculated for each model's outcome period, which varied between each predictive model. ABR, annualized bleeding rate.

Predictive Modeling Methodology
Two machine learning techniques were used for predictive modeling: penalized logistic regression and random forests (see the Supplementary Material, available in the online version). Additionally, interpretation methods for the tree-based models were used to elucidate the underlying drivers of treatment response. For explanation of predictions at the individual patient level, SHapley Additive exPlanations (SHAP) was used to quantify the contribution that each variable brought to the prediction made by the model. SHAP estimated how important each variable was by evaluating how well the model performed with and without that variable. 21 In this analysis, SHAP values greater than zero implied an increased association with an ABR ≤1; values less than zero implied an increased association with an ABR >1. Global ranking of how each variable contributed to the predicted clinical outcome at the group level was derived from mean absolute SHAP values.
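To make the "with and without that variable" idea concrete, the following sketch computes exact Shapley values for a toy scoring function and then ranks variables by mean absolute value. It is an illustration only: the analysis itself used SHAP on the fitted models, whereas `toy_score`, its coefficients, the baseline patient, and the variable names here are invented for the example.

```python
from itertools import combinations
from math import factorial
from typing import Callable, List, Sequence, Tuple


def shapley_values(predict: Callable[[Sequence[float]], float],
                   x: Sequence[float],
                   baseline: Sequence[float]) -> List[float]:
    """Exact Shapley values of `predict` at `x` relative to `baseline`.

    A variable that is 'absent' from a coalition takes its baseline value.
    """
    n = len(x)

    def value(coalition: frozenset) -> float:
        mixed = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of variable i to this coalition.
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def global_ranking(names: Sequence[str],
                   per_patient_phi: Sequence[Sequence[float]]
                   ) -> List[Tuple[str, float]]:
    """Rank variables by mean absolute Shapley value across patients."""
    n_patients = len(per_patient_phi)
    mean_abs = {
        name: sum(abs(phi[k]) for phi in per_patient_phi) / n_patients
        for k, name in enumerate(names)
    }
    return sorted(mean_abs.items(), key=lambda kv: kv[1], reverse=True)


def toy_score(v: Sequence[float]) -> float:
    """Hypothetical score toward 'ABR <=1'; coefficients are made up."""
    bleeds_12wk, vwf, fviii_30min = v
    return -0.3 * bleeds_12wk + 0.2 * min(vwf, 0.8) + 0.001 * fviii_30min


NAMES = ["bleeds_12wk", "vwf", "fviii_30min"]
BASELINE = [0, 0.8, 125]                                 # hypothetical reference
PATIENTS = [[2, 0.6, 130], [0, 1.0, 120], [1, 0.5, 140]]  # hypothetical data
ALL_PHI = [shapley_values(toy_score, p, BASELINE) for p in PATIENTS]
RANKING = global_ranking(NAMES, ALL_PHI)
```

Under the sign convention used in this analysis, a negative value for `bleeds_12wk` pushes the prediction toward ABR >1. The per-patient values always sum to the difference between the patient's score and the baseline score, which is the additivity property that makes the individual explanations in ►Fig. 2 interpretable.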

Model Validation
To validate the predictive potential of this approach, a repeated nested cross-validation strategy was implemented to alleviate "overfitting" and ensure the models could generalize well to new data. 22 In this, each patient could contribute to the learning (training set) phase of model development or act as a test patient (testing dataset or validation dataset), i.e., each patient was assigned a prediction from a corresponding predictive model. To facilitate interpretation, the association between variables and subsequent clinical outcome was described at the model level and individual patient level. A further description of model validation is provided in the Supplementary Material (available in the online version).
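The structure of repeated nested cross-validation can be sketched as follows. This is a simplified illustration, not the validation code used in the analysis: a one-variable k-nearest-neighbour classifier stands in for penalized logistic regression and random forests to keep the example short, and the fold counts, hyperparameter grid, and function names are assumptions.

```python
import random
from typing import Dict, List, Sequence


def k_folds(indices: Sequence[int], k: int, rng: random.Random) -> List[List[int]]:
    """Shuffle indices and deal them into k roughly equal folds."""
    shuffled = list(indices)
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def knn_predict(train_x, train_y, x, k: int) -> int:
    """Majority vote among the k nearest training patients (1-D feature)."""
    order = sorted(range(len(train_x)), key=lambda j: abs(train_x[j] - x))
    nearest = order[:k]
    votes = sum(train_y[j] for j in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def inner_cv_accuracy(xs, ys, train_idx, k_neighbors, inner_folds) -> float:
    """Accuracy of one hyperparameter value, using training patients only."""
    correct = 0
    for val in inner_folds:
        val_set = set(val)
        fit_idx = [i for i in train_idx if i not in val_set]
        fit_x = [xs[i] for i in fit_idx]
        fit_y = [ys[i] for i in fit_idx]
        correct += sum(knn_predict(fit_x, fit_y, xs[i], k_neighbors) == ys[i]
                       for i in val)
    return correct / len(train_idx)


def nested_cv_predictions(xs, ys, k_grid=(1, 3, 5),
                          outer_k=5, inner_k=3, seed=0) -> Dict[int, int]:
    """Return one out-of-fold prediction per patient."""
    rng = random.Random(seed)
    all_idx = list(range(len(xs)))
    preds: Dict[int, int] = {}
    for test_fold in k_folds(all_idx, outer_k, rng):
        test_set = set(test_fold)
        train_idx = [i for i in all_idx if i not in test_set]
        # Inner loop: tune the hyperparameter without touching the test fold.
        inner = k_folds(train_idx, inner_k, rng)
        best_k = max(k_grid,
                     key=lambda k: inner_cv_accuracy(xs, ys, train_idx, k, inner))
        train_x = [xs[i] for i in train_idx]
        train_y = [ys[i] for i in train_idx]
        # Outer loop: refit on all training patients, predict the held-out fold.
        for i in test_fold:
            preds[i] = knn_predict(train_x, train_y, xs[i], best_k)
    return preds


def repeated_nested_cv(xs, ys, repeats=3) -> List[Dict[int, int]]:
    """Repeat the whole procedure with different random fold assignments."""
    return [nested_cv_predictions(xs, ys, seed=r) for r in range(repeats)]
```

The key design point is that hyperparameters are chosen inside each outer training set, so every patient's prediction comes from a model that never saw that patient during either tuning or fitting.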

Performance Metrics
The performance of each predictive model was assessed using the area under the receiver operator characteristic curve (AUROC) to indicate how well variables collected at the model's time point could be used to predict the patients' observed ABR outcome at the end of the model outcome period. An AUROC of 1 indicated a perfect model and an AUROC of 0.5 indicated a model that performs equivalently to random chance. The predictive models were compared against a univariate benchmark to reference their performance. The univariate benchmark was historical ABR for pre-N8-GP prophylaxis models and total bleed count up to the model's time point for post-N8-GP prophylaxis models.
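As an illustration of how a univariate benchmark can be scored, the sketch below computes AUROC from its pairwise definition: the probability that a randomly chosen patient with ABR ≤1 receives a higher score than a randomly chosen patient with ABR >1, counting ties as one half. The bleed counts and labels are invented for the example, and negating the bleed count to form the score is an assumption about how such a benchmark might be oriented.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as P(score_pos > score_neg) + 0.5 * P(tie) over all pairs."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one patient in each class")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: cumulative bleed count up to the model's time point,
# and the observed outcome class (1 = ABR <=1, 0 = ABR >1).
bleed_counts = [0, 0, 1, 0, 2, 3, 1, 0]
outcome = [1, 1, 1, 0, 0, 0, 0, 1]

# Fewer bleeds should point toward ABR <=1, so the score is the negated count.
benchmark_scores = [-b for b in bleed_counts]
benchmark_auroc = auroc(benchmark_scores, outcome)
```

A score that separates the two classes perfectly yields 1.0, and a constant score yields 0.5, matching the interpretation given above.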

Patient Characteristics
Of 175 patients initiated with N8-GP prophylaxis (50 IU/kg Q4D) in the pathfinder 2 main phase, a total of 166 received a consistent regimen and had sufficient follow-up for inclusion in the baseline model. Patient demographics and baseline characteristics for this cohort and the cohort of patients included in the post-N8-GP 12-week prophylaxis model are reported in ►Table 1.

ABR Outcome
Patients were classified by ABR at the end of the model outcome period (which ended at approximately 15 months after the model's time point). In the baseline model's cohort, 87 patients had an ABR ≤1, and 79 patients had an ABR >1. Details regarding the duration of the predictive model learning phase and outcome period are reported in ►Table 2. The majority (n = 122) of patients were exposed to prophylaxis for >1 year in the predictive model outcome period.

Performance of Predictive Models
The performance of the predictive models and single-variable benchmarks is reported in ►Table 2. Univariate benchmark models were included to compare minimum performance against performance of the pre- and post-N8-GP prophylaxis models developed by supervised learning methods.
For the baseline model, AUROC values of 0.636 and 0.608 were reported for the penalized logistic regression and random forests, respectively, indicating "moderate" performance of this model for predicting the observed ABR outcome at the end of the outcome period. Model performance was improved in the baseline and PK model with the addition of PK measures, as indicated by higher AUROC values (0.672 and 0.622, respectively). Both pre-N8-GP prophylaxis models demonstrated improved performance compared with the single-variable benchmark (AUROC = 0.5821).
The post-N8-GP 12-week prophylaxis model was the most performant, with AUROC values of 0.724 and 0.785 for the penalized logistic regression and random forests, respectively, indicating "good" performance of this model for predicting the observed ABR outcomes. This model demonstrated marginally better performance than the single-variable benchmark of total bleed count until the model's time point (AUROC = 0.748). Model performance increased incrementally over time from the post-N8-GP 4-week prophylaxis model to the post-N8-GP 12-week prophylaxis model.

Model Interpretability and SHAP Values
SHAP values were used to quantify the contribution of each variable to the prediction made by the model. ►Fig. 2 depicts individual predictions from three patient cases in the baseline model. The overall prediction for a patient (model output value) and the confidence in that prediction are related to the sum of the underlying contributions made by each of their variables. 23

Predictive Variables in the pre-N8-GP Prophylaxis Models
The local interpretability for each predictor was grouped to interpret the overall effect of each variable on the model, which facilitates insight into associations, but not causality. When ranked by importance, SHAP analysis for global interpretability identified that the von Willebrand factor (vWF) level and historical ABR were the most impactful clinically relevant baseline variables for predicting observed ABR (►Fig. 3A). Mean absolute SHAP values in ►Fig. 4A illustrate the global importance of the variable and ►Fig. 3A shows the local explanation summary, demonstrating the direction of relationship between the variable and outcome. Baseline vWF had an asymmetric distribution of SHAP values, with extreme negative variable values influencing model predictions more than extreme positive values. SHAP analysis of variables at a patient level identified that patients with a vWF level <0.8 IU/mL or historical ABR >5 demonstrated negative SHAP values, indicating an increased association with an ABR >1. Patients with a higher vWF level (>0.8 IU/mL) or lower historical ABR (<5) demonstrated positive SHAP values, indicating an increased association with an ABR ≤1 (►Fig. 3B). An increase in the vWF level beyond 0.8 IU/mL was not associated with increasing odds of the patient having an ABR ≤1, as indicated by the plateauing SHAP values; however, decreasing levels of vWF below the 0.8 IU/mL threshold correlated with decreasing SHAP values, indicating a stronger association with patients reporting an ABR >1. 
The impact of height on predictions was most pronounced for the tallest patients (>1.8 m), who demonstrated positive SHAP values, indicating an increased association with an ABR ≤1.

Predictive Variables in the Post-N8-GP Prophylaxis Models
Computation of SHAP values for global interpretability of the predictive model demonstrated that cumulative count of treated bleeds (i.e., "total bleed count") up to 12 weeks postprophylaxis switch, baseline vWF level, and mean FVIII at 30 minutes (Visit 5) were the most important clinically relevant variables for predicting observed ABR (►Fig. 5A). Total bleed count up to 12 weeks postprophylaxis switch was the most informative variable for predicting observed ABR, and markedly more informative than baseline vWF level and mean FVIII at 30 minutes (Visit 5), as demonstrated by higher mean absolute SHAP values (0.089, 0.033, and 0.030, respectively; ►Fig. 4B). The performance of cumulative bleed count up to 12 weeks was "good" (AUROC = 0.748) when assessed in isolation as a single-variable benchmark.
SHAP analysis for individual patient-level variable importance ranking identified that a higher mean trough level (>3%) and a higher FVIII at 30 minutes postdose (>125%) were associated with a patient achieving an ABR ≤1 (►Fig. 5B). FVIII at 30 minutes postdose demonstrated a wider distribution of SHAP values compared with the trough level, indicating that FVIII at 30 minutes was more informative for predicting ABR in the model. At the patient level, SHAP analysis demonstrated that patients with a bleed count of zero universally had positive SHAP values (associated with a patient achieving an ABR ≤1), whereas patients with a bleed count greater than zero were universally shown to have negative SHAP values (i.e., encouraging a prediction of ABR >1).

Discussion
This posthoc analysis applied supervised machine learning techniques to data from pathfinder 2 to develop predictive models to identify which pre- and post-N8-GP prophylaxis variables act as predictors of clinical response to fixed-dose N8-GP prophylaxis. The most performant predictive model was the post-N8-GP 12-week prophylaxis model, which demonstrated that total bleed count during the initial 12-week period postswitch to N8-GP prophylaxis was the most informative variable in the model for predicting ABR at the end of the outcome period (which ended a median of approximately 13 months after the model's time point, i.e., long-term response to N8-GP prophylaxis), followed by baseline vWF level and mean FVIII at 30 minutes postdose. When assessed as a univariate benchmark, total bleed count up to 12 weeks was as performant as the post-N8-GP 12-week prophylaxis model, demonstrating the importance of clinical observation for predicting long-term outcomes. The performance of the baseline model was "moderate" (AUROC = 0.636), and only marginally improved compared with the single-variable benchmark (historical ABR; AUROC = 0.5821). The model indicated that the vWF level was the most impactful baseline variable for predicting observed ABR. Patients who responded well to prophylaxis were those with higher baseline vWF levels (>0.8 IU/mL), which is consistent with previous studies demonstrating that higher vWF levels correlate with longer FVIII half-life and higher FVIII activity between doses that may subsequently lead to a reduced ABR. 16,24 The model identified a linear relationship between low vWF and higher bleeds, which plateaued beyond 0.8 IU/mL. The models identified that patients with greater height (>1.8 m) reported positive SHAP values that were associated with an ABR ≤1. 
Although taller patients may report lower body mass index (BMI) values, which are associated with reduced clearance, 25 BMI and body weight were included as variables in the analysis and not found to be informative predictors of ABR. Taller patients in this cohort may instead represent a group of patients who were possibly more athletic, or had different levels of activity; the association between a taller height and lower ABR is currently unclear. Overall, the "moderate" performance (AUROC = 0.636) of the baseline model indicated that baseline variables were limited in their ability to accurately predict ABR; as such, the association between a taller height and lower ABR may be a spurious finding. The limited performance of the baseline model confirms the dominant influence of the prophylaxis regimen on ABR outcomes.
The post-N8-GP 12-week prophylaxis model was the most performant and assessed by AUROC as "good" (AUROC = 0.785). This performance indicates that the 12-week observation window (learning phase) postswitch to N8-GP prophylaxis is a sufficient period of time to monitor patients and determine whether a change in treatment or regimen, or investigation for other causative factors, may be required. The post-N8-GP 12-week prophylaxis model included total bleed count; as a single-variable benchmark, total bleed count up to 12 weeks was more performant than all variables included in the baseline model (AUROC of 0.75 vs. 0.64, respectively). The post-N8-GP 12-week prophylaxis model and total bleed count univariate benchmark model demonstrate the importance of bleeding phenotype-adjusted tailoring of prophylaxis over baseline characteristics for predicting long-term ABR.
Previous studies investigating predictors of ABR identified endogenous factor levels, adherence, BMI, primary dosing regimen, inhibitor development, presence of arthropathy, and intensity of physical activity as variables that can influence bleeding rates. 1,4,[26][27][28][29][30] In the present analysis, the post-N8-GP 12-week prophylaxis model demonstrated that total bleed count up to 12 weeks postprophylaxis switch, baseline vWF level, and mean FVIII at 30 minutes postdose were the most important variables for predicting observed ABR with N8-GP prophylaxis. The mean absolute SHAP values demonstrated that the total bleed count was substantially more informative than all other baseline characteristics (vWF level) and PK measures (mean FVIII at 30 minutes and mean trough level), demonstrating the importance of clinical observation for predicting long-term outcomes. The "good" performance (AUROC = 0.748) of cumulative bleed count as a univariate benchmark model is in line with clinical observations, whereby patients who bleed less in the initial prophylaxis period of a trial are more likely to have fewer bleeds later in the trial. These results suggest that bleeding events following prophylaxis initiation are indicative of long-term outcomes and should prompt a review of the prophylactic regimen or investigation for local pathology that may be contributing to an excess of bleeds. In routine clinical practice, while prophylaxis tends to be adjusted based on treatment response, there is no agreed period for review or an agreed cut-off for number of bleeds before deciding on a treatment change. For the first time, this posthoc analysis confirms that a 12-week treatment period is adequate to initiate treatment review in the event of bleeds, for intensification of prophylaxis or investigation into other contributing factors. 
The post-N8-GP 12-week prophylaxis model recognized that a higher trough level (>3%) and FVIII at 30 minutes (representing peak FVIII activity of N8-GP 15; >125%) were both associated with an ABR ≤1. Similarly, the literature reports that maintaining a higher FVIII trough level and achieving a higher FVIII peak provides increased protection from joint and nonjoint bleeding, and subsequent improvements in ABR. 4,10,30 FVIII at 30 minutes and trough level may act as surrogate markers for N8-GP PK area under the curve and decay of the curve. SHAP analysis indicated that the FVIII level at 30 minutes postdose was more informative than the trough level for predicting observed ABR. This result suggests that achieving the observed peak threshold of >125% is associated more strongly with a higher PK area under the curve and an ABR ≤1 than achieving a trough level >3%. However, this result may also reflect the quality of data collected, whereby values for FVIII level at 30 minutes are less anomalous than the trough level, and may therefore act as a more informative measure. Previous investigations into predicting ABR outcomes using standard half-life FVIII focused on the association of FVIII PK measurements with ABR. 4,9 In this analysis with an EHL FVIII molecule, the inclusion of PK measurements improved model performance, as demonstrated by higher AUROC values for the baseline and PK model. However, the contribution of mean FVIII at 30 minutes and mean trough level to the predictive power of the post-N8-GP 12-week prophylaxis model was substantially lower than the total bleed count, as demonstrated by lower mean SHAP values (0.030 and 0.016 vs. 0.089, respectively). 
Although PK data may be used to inform decisions and individualize treatment in clinical practice (e.g., dose adjustments), 31 the results reported in the present analysis suggest that, in the context of fixed-dose prophylaxis with a high trough level, clinical observation of total bleed count alone during the initial 12-week prophylaxis period (univariate assessment) is as good as, and potentially more informative than, PK data for predicting long-term prophylaxis response. The results confirm the value of a simplified, fixed-dose prophylaxis regimen for patients, with possible adaptation of treatment according to bleed count in the initial weeks postinitiation, rather than based on the residual clotting factor activity levels. 32 Limitations of this analysis included the modest sample size of patients (n = 166), which was subject to selection bias, whereby patients included were those who remained in the study for longer and could have been more likely to be compliant or be motivated to remain in the study due to observed improvement. The analysis was based on main phase data from pathfinder 2 only; the performance of the predictive models may benefit from a longer observation time with refined outcomes of interest. Data were not available for joint status, which is considered a clinically relevant variable for phenotypic assessment prior to treatment initiation in hemophilia A patients. 33 Data included in the analyses were for patients receiving N8-GP prophylaxis Q4D, and hence there was limited scope to compare findings against other regimens, doses, or treatments. Due to the modest sample size of patients, validation was performed using nested cross-validation (i.e., "internal validation"). Although nested cross-validation is an acceptable method for validating a predictive model, the use of a separate dataset for external validation is more robust for confirming model performance. 
The dataset included anomalies in trough level data, and a proportion of trough level values was below the detection limit threshold (0.045 IU/mL). FVIII at 30 minutes and FVIII trough level were the only PK parameters from pathfinder 2 for which there were sufficient data for inclusion in the predictive models. Machine learning for predictive modeling was applied to an EHL PEGylated FVIII molecule; although the association between clinically relevant variables and predicted ABR outcomes is unlikely to be specific to N8-GP, results cannot be readily extrapolated to standard half-life FVIII molecules, or those with different half-life extension technology.
To refine the performance and value of the predictive models in future analyses, results should be replicated in data from other hemophilia clinical trials, as well as the real-world setting where patient behavior may be more variable, and where a broader set of variables may be available. As the performance of each predictive model improved with the inclusion of more variables, the availability of additional clinically relevant variables such as joint status, physical activity, and intensity of previous prophylaxis regimen may help to further increase model performance. Additionally, other PK parameters, such as area under the curve, should be included in future machine learning analyses where available to further investigate the role of PK. The ability to predict ABR as a continuous value or range of values (rather than as a binary outcome of ABR ≤1 vs. ABR >1) may affect the model, as the occurrence of a single bleed event would have a large impact on the defined outcome.

Conclusion
Machine learning for predictive modeling is a novel approach to analyzing data from hemophilia clinical trials. In this posthoc analysis, applying supervised machine learning techniques for predictive modeling to data from the pathfinder 2 trial demonstrated that cumulative bleed count in the 12 weeks postprophylaxis switch was the most informative variable for predicting observed ABR at the end of the outcome period (approximately 13 months later), and probably more informative than PK assessments and baseline characteristics. This outcome confirms observations in clinical practice in which patients who bleed less in the initial period posttreatment switch are more likely to have fewer long-term bleeding events. Additionally, it places greater emphasis on phenotype-adjusted tailoring of prophylaxis according to bleed response, as well as investigations for other mechanisms of bleeding. The results confirm the value of simplified fixed-dose prophylaxis with adaptations based on both clinical outcomes and PK parameters. Furthermore, this analysis reports the first use of machine learning techniques applied to data from a hemophilia clinical trial, with the aim to identify results that can be applied in the clinic. The analysis should be highlighted as an example for the use of artificial intelligence techniques applied to data from future clinical trials in hemophilia for any therapeutic modality.
What is known about this topic?
• The most recent iteration of the World Federation of Hemophilia guidelines 2 recommends a higher trough level for prophylaxis, which can be achieved using extended half-life (EHL) factor VIII (FVIII) molecules, and the use of personalized prophylaxis plans.
• Previous studies have attempted to predict bleeding risk and annualized bleeding rate (ABR) in people with hemophilia A utilizing FVIII pharmacokinetic (PK) parameters, and reported an association between bleeding risk and peak FVIII level, area under the curve, and time per week with FVIII activity below 1 IU/dL.
• However, bleeding risk can change over time and may be influenced by factors independent of PK parameters, suggesting that other demographic and clinical characteristics are required to predict long-term treatment response to prophylaxis.