Summary
Background: When standard therapies fail, clinical trials provide experimental treatment opportunities
for patients with drug-resistant illnesses or terminal diseases. Clinical trials can
also provide free treatment and education for individuals who otherwise may not have
access to such care. To find relevant clinical trials, patients often search online;
however, they often encounter a significant barrier due to the large number of trials
and ineffective indexing methods for reducing the trial search space.
Objectives: This study explores the feasibility of feature-based indexing, clustering, and search
of clinical trials and informs designs to automate these processes.
Methods: We decomposed each of 80 randomly selected stage III breast cancer clinical trials into a
vector of eligibility features, which were organized into a hierarchy. We clustered
trials based on their eligibility feature similarities. In a simulated search process,
manually selected features were used to generate specific eligibility questions to
filter trials iteratively.
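The feature-vector representation and the iterative filtering step can be illustrated with a minimal sketch. The trial names, feature labels, the use of Jaccard similarity, and the rule that every listed feature is a required inclusion criterion are all assumptions made for illustration; they are not the study's actual data, similarity measure, or filtering logic.

```python
from typing import Dict, Set

Trials = Dict[str, Set[str]]

# Hypothetical toy data: each trial is represented by the set of
# eligibility features it requires (a sparse binary feature vector).
trials: Trials = {
    "trial_A": {"female", "age>=18", "her2_positive"},
    "trial_B": {"female", "age>=18", "prior_chemotherapy"},
    "trial_C": {"female", "her2_positive", "prior_chemotherapy"},
}


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Similarity of two trials: shared features divided by all features."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def filter_step(candidates: Trials, feature: str, patient_has: bool) -> Trials:
    """Apply one eligibility question to the remaining candidate trials.

    Assumption: a listed feature is a required inclusion criterion, so a
    'no' answer removes every trial that lists it; a 'yes' removes nothing.
    """
    if patient_has:
        return dict(candidates)
    return {name: feats for name, feats in candidates.items()
            if feature not in feats}


# Pairwise similarity that a clustering method could take as input.
print(jaccard(trials["trial_A"], trials["trial_B"]))  # 0.5

# Two rounds of question-driven filtering.
remaining = filter_step(trials, "her2_positive", patient_has=False)
remaining = filter_step(remaining, "prior_chemotherapy", patient_has=True)
print(sorted(remaining))  # ['trial_B']
```

In this sketch, a question about a feature shared by many trials removes the most candidates on a negative answer, which is one reason frequent features are natural choices for early questions.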
Results: We extracted 1,437 distinct eligibility features and achieved an inter-rater agreement
of 0.73 for feature extraction for 37 frequent features occurring in more than 20
trials. Using all 1,437 features, we stratified the 80 trials into six clusters
containing trials recruiting similar patients by patient-characteristic features,
five clusters by disease-characteristic features, and two clusters by mixed features.
Most of the features were mapped to one or more Unified Medical Language System (UMLS)
concepts, demonstrating the utility of named entity recognition prior to mapping with
the UMLS for automatic feature extraction.
Conclusions: It is feasible to develop feature-based indexing and clustering methods for clinical
trials to identify trials with similar target populations and to improve trial search
efficiency.
Keywords
Medical informatics - search engine - clinical trials - knowledge representation -
eligibility determination