CC BY-NC-ND 4.0 · Laryngo-Rhino-Otol 2018; 97(S 01): S114-S141
DOI: 10.1055/s-0043-121964
Review
Owner and Copyright © Georg Thieme Verlag KG 2018

Radiomics: Big Data Instead of Biopsies in the Future?

Kathrin Scheckenbach
1  Klinik für Hals-Nasen-Ohrenheilkunde, Universitätsklinikum Düsseldorf

Correspondence address

Priv.-Doz. Dr. med. Kathrin Scheckenbach
Klinik für Hals-Nasen-Ohrenheilkunde
Universitätsklinikum Düsseldorf
Moorenstr. 5
D-40225 Düsseldorf
Phone: 0211-8117570   
Fax: 0211-8118880   

Publication History

Publication Date:
22 March 2018 (online)

 

Abstract

Precision medicine is increasingly being pushed forward, not least with respect to upcoming targeted therapies. Individual characterization of diseases on the basis of biomarkers is a prerequisite for this development. So far, biomarkers have been characterized clinically, histologically, or on a molecular level. The implementation of broad screening methods (“omics”) and the analysis of big data, in addition to single markers, make it possible to define biomarker signatures. Next to “genomics”, “proteomics”, and “metabolomics”, “radiomics” have gained increasing interest in recent years. Based on radiologic imaging, multiple radiomic markers are extracted with the help of specific algorithms. These are correlated with clinical, (immuno-)histopathological, or genomic data. The underlying structural differences are based on the imaging metadata and are often not visible and therefore not detectable without specific software. Radiomics are depicted numerically or by graphs. The fact that radiomic information can be extracted from routinely performed imaging adds a specific appeal to this method. Radiomics could potentially replace biopsies and additional investigations. Alternatively, radiomics could complement other biomarkers and thus lead to a more precise, multimodal prediction. Until now, radiomics have primarily been used to investigate solid tumors. Some promising studies in head and neck cancer have already been published.



1. Introduction

Precision medicine aims to define diseases as accurately as possible in order to find personalized, individual therapies. This approach should improve the chances of cure and/or reduce the spectrum of side effects. Diseases that used to be defined clinically are now diagnosed more precisely by histological, molecular, or a broad spectrum of clinical biomarkers. In this way, biomarkers, and increasingly also biomarker signatures consisting of a typical pattern of different characteristics, have a growing influence on clinical routine. Their prognostic and diagnostic relevance, however, may vary significantly with regard to specificity and sensitivity. This relevance is already influenced by the definition of the patient cohort, the analytic method and its variance, and the definition of the respective correlation. The degree of standardization has a further effect on all levels: the method of data collection, the type of data acquisition, data processing, and evaluation. Broad screening methods producing enormous amounts of data are increasingly applied. They allow many parameters to be measured simultaneously and compared with regard to their applicability as biomarkers, and thus open the option of defining various biomarker combinations for different patient groups.

This produces vast amounts of data that require large storage capacities and suitable software for valid analysis. A secure exchange of those data on an interdisciplinary and trans-regional level is also beneficial. The parameters that are decisive for the respective question can only be extracted by IT-based data analysis, statistics, and modeling. Broad data collection makes it possible to analyze the data pool with regard to different issues and in different directions. However, large data pools with sufficiently large patient cohorts are necessary for defining significant biomarkers. Data analysis may be a considerable challenge that is more complex than the analysis of the material itself and that is associated with multiple potential sources of error.

In recent years, such screening examinations have increasingly been included in biomarker analysis under the umbrella term “omics”. By definition, the primary analysis is relatively unselective. The suffix “omics” characterizes different sources of large data volumes; it is preceded by the name of the material from which the data are collected. This leads, for example, to “genomics” in the context of gene expression analysis, or to “transcriptomics” for investigations at the RNA level; protein analyses become “proteomics”, and analyses of the metabolome “metabolomics”. By analogy, the comparatively new “radiomics” are finding their way into clinical research and increasingly even into clinical routine.

Defined characteristics of a disease contain information about prognosis and diagnosis with regard to the status, the outcome, and the therapeutic response. Those variables are analyzed in the context with clinical data and the course of the disease while the clinical data still serve as diagnostic and prognostic parameters.

Valid biomarkers should be easily accessible, measurable, and reproducible. So their stability with regard to the measurements should be verified. The definition should be performed in a representative, possibly large and standardized cohort where well-defined parameters are sufficiently assessed. After definition of the biomarker signature, it should be reevaluated in a second cohort – preferably independently in a second institution. Further prospective validation is reasonable in order to confirm the reliability.

Biomarkers are an essential prerequisite for personalized therapy. For some diseases (e. g. breast cancer, prostate cancer) it is possible to define single biomarkers with high significance but because of the complexity of pathogenesis, also in this context increasingly biomarker matrices are applied.

These may originate from one or several data sources. Clinical, genomic, histopathological, and other markers may be combined. These combinations are analyzed ideally by means of software that allows quantification, configuration, and visualization. Thus, large data volumes provide various options for the definition of appropriate markers or marker patterns in different stages of the disease, on different steps of the analysis, and of most diverse materials and data sources. Unfortunately, however, also the same amount of possible misinterpretations, analysis bias, methodical and statistical sources of errors must be considered that are difficult to identify and discover due to their high complexity.

Up to now, most analyses have required material from solid tumors, e. g. tissue biopsies. These specimens undergo genomic, proteomic, or metabolomic analysis with the help of a broad spectrum of analytic methods (e. g. next generation sequencing, NGS), and thousands of data points are collected. Radiomics, as a relative newcomer among biomarkers, have the great advantage that no invasive sampling is required.

Using mathematical algorithms, a quantitative high-throughput extraction of radiological features based on meta-datasets (DICOM format) is performed. Often these image features cannot be perceived by the human eye, so they can only be assessed in an IT-supported way [1]. Historically, radiomics originate from the computer-assisted diagnosis and detection (CAD) systems of the 1980s and 1990s [2] [3]. The difference, however, lies in the extracted data volume and the way it is combined with clinical, histological, or genomic data. While CAD systems only provide answers to single questions regarding diagnosis and detection of a disease, radiomics allow large data volumes to be generated from imaging such as computed tomography (CT), magnetic resonance imaging (MRI), or positron emission tomography (PET). Nonetheless, there are older investigations that already meet the requirements of radiomics to a certain extent but were not labeled as such because the term was not yet in use. A PubMed search returns the first results for the keyword “radiomics” in 2012. The research group of Lambin et al. published an article entitled “Radiomics: extracting more information from medical images using advanced feature analysis” [4]. This title contains the exact definition of the new term.

1.1 Principles of radiomics

Fortunately, the imaging techniques typically used are performed routinely. They are generally accessible and ideally already used in the context of diagnosis. However, the raw imaging data have to be fully available for processing, and imaging has to be performed digitally according to current standards, with adequate accuracy and ideally without artefacts ([Fig. 1]). Innovations and the increasing standardization of medical imaging have made these new methods possible. Only in this way are sufficiently large data pools available in single institutions to establish radiomic signatures. In addition, modern hardware, the use of comparable radiocontrast media, and the standardization of imaging protocols are important factors that enable quantitative analysis and the application of specific software to this end.

Fig. 1 Schematic, simplified workflow for creation of a radiomic signature.

Standardized imaging protocols and the routine application of modern software are essential for reproducible radiomic biomarker signatures that can be compared on a multicenter level. Only in this way does broad quantitative analysis become possible [4]. Hence, imaging performed in clinical routine may be used as a gigantic source for data analysis. To appreciate the volume of potentially available data, one must understand that each of those datasets, whether 2- or 3-dimensional, contains millions of voxels per patient and hundreds of features that are available for radiomics analyses [5]. The data already exist, are ready for analysis, and are newly generated automatically every day. Additional material collection such as biopsies and potentially expensive tissue analyses are not needed.

Radiomics data can be extended by combining different imaging procedures, for example CT/PET or “dual source/dual energy” CT scans, as well as by applying radiological markers. In this way, the tissue itself, disease-specific markers, and increasingly also biological processes are visualized better. This allows different and additional characteristics to be generated from the obtained images.

Another variation of radiomics is radiogenomics. Radiomics are based on the hypothesis that cellular and phenotypic tissue properties correspond to specific radiomic features and are displayed in imaging, because radiological images are nothing other than depictions of tissue [4]. The more differentiated and the finer the examination methods are, the more accurate and specific the depiction can be and the more findings are provided. Tissue specification becomes more and more accurate with regard to macroscopic, microscopic-histological, immunohistological, electron microscopic, and molecular aspects ([Fig. 2]). The principle of radiogenomics is a logical consequence of this idea and is based on the hypothesis that even proteogenomic cell and tissue characteristics are, directly or indirectly, visualized by imaging procedures. The assumption in this context is not that single mutations can be visualized, e. g. in CT scans, but that tissue characteristics are induced by certain proteogenomic constellations. For example, upregulation of cell cycle genes might give rise to a heterogeneous tissue structure. The idea is certainly fascinating: proteogenomic and cellular characteristics, and thus also local and individual differences, would have equivalents in imaging, and the raw data could provide currently unknown additional information that can be assessed with specific algorithms, correlated adequately, and set into context. Nonetheless, it must be stated that even a microscopic image can provide only limited findings without additional information such as immunohistochemistry. Since imaging, too, is merely a visualization of tissue, the information it carries is likewise limited.

Fig. 2 a Description of a cervical lymph node metastasis after importing the imaging into the segmentation software (courtesy of Prof. S. Wesarg, Fraunhofer Institute Darmstadt, Germany).
Fig. 2 b Description of a cervical lymph node metastasis after semi-automatic segmentation (in red).

Some studies have already shown that radiomic analyses can extract equivalents of cellular, genetic, or phenotypic characteristics from classic imaging procedures and correlate them [1]. An additional challenge is the fact that radiogenomics combine 2 “omics”, i. e. radiomics and genomics. This leads to a very large data pool whose adequate evaluation and application require professional, IT-based methods. Considered separately, genomics data and radiomics data each have various advantages, and their combination may generate useful additional information. Genomics make it possible to identify a very detailed genetic pattern and thus an extract of molecular processes on the cellular level, broken down to DNA or RNA. However, this is limited to specific, rather small areas or cell types and tissue or tumor parts.

From these analyses, specific biomarkers relevant to the particular question have to be defined. This objective alone is a great challenge. It is certainly not realistic to aim for a one-to-one correlation of genomic examination with radiomics. But if the genomic data relevant to a problem can be adequately correlated and genetic/molecular subtyping can be performed on a radiological level, this alone would be an enormous benefit. With radiogenomics, a tumor could for example be characterized holistically, because the radiological image captures not only single areas but the entire affected tissue. A combination of both methods would then provide significantly more information, also on a molecular biological level. If radiogenomics captured important molecular phenomena as reliably as laboratory diagnostic methods, it might even be possible to dispense with molecular tissue diagnostics. This could ultimately avoid invasive biopsies, lower costs, reduce staff- and material-related effort, and improve patient satisfaction.

Currently radiomics are mainly applied in oncology for alternative characterization of solid carcinomas. However, they have the potential to serve as biomarkers for benign diseases with one or several classifiable correlates in imaging – e. g. in the context of Menière’s disease [6] or functional disorders of the parotid gland after radiation [7].

The following paragraphs focus on examples of radiomics applications in oncology and describe the state of the art in head and neck oncology. For a better understanding of the methods and their problems, the radiomics workflow is also described, and its potential sources of variation and error are discussed.



2. Radiomics and Tumors

Radiomics have been very well received, especially in oncology. In this context, the outcome, histology, subtyping, or even therapy response are correlated with imaging features. Older investigations dating back to the 1970s follow the same principle, although it was not called “radiomics” at that time. The distinction between the terms is certainly vague. However, because data storage and processing were not available to the current extent, these articles are significantly more limited in their spectrum of analysis. The imaging techniques used included CT, MRI, and PET/CT, but also conventional X-ray, mammography, and ultrasound.

The number of publications is manageable, although it has greatly increased in recent years with the definition of the term radiomics and an associated workflow, as well as with forward-looking expectations. The size of the patient cohorts and the study designs vary widely. Generally speaking, older studies deal with smaller cohorts than more recent ones, not least because of the capacities and possibilities of data processing and storage. Although most trials are based on retrospective datasets, there is a tendency to validate radiological signatures in second or even prospective datasets, which is highly desirable.

It is remarkable that certain tumor entities are significantly more present in this field than others. There are comparatively many publications on lung and breast cancer, for example, whereas other frequently occurring entities such as cervical cancer or lymphomas are rather underrepresented. Therefore, some articles on the well-investigated lung and breast cancers are discussed here, while less prominent entities are not described in detail.

2.1 Lung cancer

As early as the 1970s, before the era of “omics”, Sutton and Hall correlated structural analyses of chest radiographs with different pathologies. Their objective was to evaluate the feasibility of automated diagnosis from chest radiography. However, in comparison to current possibilities, the dataset was quite restricted because IT-based analysis as we use it today was not possible. Thus, the article may be considered a precursor of modern radiomics rather than an example of it [8].

In 2008, Al-Kadi and Watson differentiated aggressive lung carcinomas from non-aggressive ones based on CT scans by means of structure-based characteristics in 15 patients [9].

Since 2010, the research team of Ganeshan et al. has been working continuously on radiomics of lung carcinomas. In 2010, they published a pilot study that encompassed 18 patients with non-small cell lung cancer. Statistically processed imaging properties (mean grey level, entropy, uniformity) could be correlated with the tumor stage and its glucose metabolism [10]. In a follow-up trial, tumor uniformity could be correlated with the survival of 54 patients. Further investigations revealed that the structural properties could be matched with histopathological characteristics in addition to clinical ones [11]. By applying additional statistical evaluation methods, further histopathological properties could be correlated with the structural analysis of the CT scans [12].
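
Grey-level statistics such as entropy and uniformity are among the simplest radiomic features. The following minimal sketch shows how they can be computed from the grey levels inside a segmented region. It assumes the common histogram-based definitions (entropy in bits, uniformity as the sum of squared grey-level probabilities); the preprocessing used in the cited studies is not described here and is not reproduced. The function name and the toy regions are hypothetical.

```python
import math
from collections import Counter


def first_order_features(grey_levels):
    """Mean grey level, entropy (bits) and uniformity of a region.

    grey_levels: flat list of integer grey levels inside the region of
    interest (hypothetical input; a real pipeline would read them from a
    segmented DICOM volume).
    """
    n = len(grey_levels)
    if n == 0:
        raise ValueError("region of interest is empty")
    # Relative frequency of every grey level that occurs in the region.
    probs = [count / n for count in Counter(grey_levels).values()]
    mean = sum(grey_levels) / n
    # Shannon entropy: high when many grey levels occur equally often.
    entropy = sum(p * math.log2(1 / p) for p in probs)
    # Uniformity: 1.0 for a single grey level, small for heterogeneous regions.
    uniformity = sum(p * p for p in probs)
    return {"mean": mean, "entropy": entropy, "uniformity": uniformity}


# Toy regions (assumed values, not patient data).
homogeneous = [100] * 8
heterogeneous = [10, 40, 70, 100, 130, 160, 190, 220]
print(first_order_features(homogeneous))
print(first_order_features(heterogeneous))
```

A homogeneous region yields low entropy and high uniformity, a heterogeneous region the opposite, which is the kind of contrast such features are meant to quantify.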

Further publications on non-small cell lung cancer were provided by Aerts et al. [13]. They investigated 440 radiomic features in non-small cell lung cancer and head and neck cancer; the latter aspect is discussed later in this article in the section on head and neck cancer. The research team established a signature in lung cancer that predicts survival, histology, and tumor stage. Recently, the survival prediction was confirmed in another patient cohort, and its transferability to the modalities of planning CT and CBCT (cone beam computed tomography) was verified. This implies that radiomic signatures are potentially applicable across different modalities, which may be of high clinical value [14].

Based on a cohort of 182 patients with adenocarcinomas of the lung, the same group showed that a radiomic signature with 33 markers may predict metastatic spread, and that another signature encompassing 12 features may predict survival [15]. Other characteristics describing the complexity of tumor type and heterogeneity could be correlated with overall survival in one patient cohort and confirmed in an additional one [16].

Also in pulmonary adenocarcinomas (n=431), Yuan et al. compared 20 selected radiomic biomarkers in CT scans with volumetric analysis in order to differentiate distinct phenotypes (carcinoma in situ versus minimally invasive carcinoma versus invasive carcinoma). In this context, the radiomic signature (accuracy: 80.5%) was superior to volumetric analysis (accuracy: 69.5%) [17]. The comparison of these methods was thus decided in favor of radiomics.

Zhang et al. optimized the radiomic signature for the prediction of recurrences, death, and recurrence-free survival in non-small cell lung cancer by varying different methods for parameter selection and classification [18]. In this way, they could show that the applied statistical methods have a significant impact on the definition and relevance of radiological biomarkers. In addition, the modalities within one imaging variant are crucial. Different data quality and relevance regarding the prognosis of recurrences of lung cancer were revealed by Huynh et al. comparing different CT modalities (static versus respiration-adapted) [19].

Even radiogenomics have already been investigated in lung cancer. Aerts et al. could find a high correlation of genomic data obtained from gene-set enrichment analysis (GSEA) with radiomic parameters in patients with non-small cell lung cancer. Two characteristics of radiomic heterogeneity could be correlated with cell cycle genes that lead to the development of heterogeneous tumors and increased proliferation [13]. This substantiates the hypothesis that proteogenomic phenomena can be displayed directly and indirectly in imaging data.

Only recently, the same group elucidated in pulmonary adenocarcinomas that a CT-based radiomic signature (heterogeneity-based) in 353 patients could predict the EGFR (epidermal growth factor receptor) status. This signature was validated in a second cohort of 352 patients. A combination with a clinical data model further improved the accuracy. A signature intended to differentiate KRAS-positive from KRAS-negative tumors in the same cohorts, was also significant but with a clearly poorer accuracy than the EGFR-associated signature [20]. In diffusion-weighted MRI, Yuan et al. could confirm the EGFR mutation status of pulmonary adenocarcinomas [21].

Not only outcome parameters and biological tissue typing may be displayed by radiomics. Tools to support decision-making in therapy planning would be of great clinical value. In cases of non-small cell lung cancer, a correlation of the response to radiotherapy or radiochemotherapy with the overall survival could be found in PET and PET-CT by means of radiomic biomarkers [22] [23]. An additional benefit and even sometimes better performance compared to imaging properties of the primary tumor (n=85) resulted from an analysis of imaging parameters of lymph node metastases (n=178) of stage II-III non-small cell lung cancer for the prediction of the response to neoadjuvant radiochemotherapy [24].

The versatility of imaging-based biomarkers is very well reflected in radiomics-based studies on lung cancer. Overall, CT-based and increasingly also PET-CT trials dominate the investigations of lung cancer. The results are very promising and encompass the differentiation of malignant and benign lesions, the elucidation of genetic and histological foundations as well as clinically oriented prognoses of outcome and therapy response.



2.2 Breast cancer

Breast cancer was evaluated early with regard to the significance of imaging because mammography has been used for screening for several decades. Rapid and precise differentiation of benign and malignant lesions based on structural characteristics, as well as the possibility of implementing automated screening, have been evaluated for a long time. Structural analyses of mammograms reach back to the 1980s and were continued in the 1990s and 2000s [25] [26] [27] [28] [29] [30] [31]. Since the 1990s, structural analysis has also been successfully implemented in ultrasound diagnostics for differentiation of breast cancer [32] [33] [34]. As early as 1993, Garra et al. achieved a sensitivity of 100% and a specificity of 80% regarding the detection of malignant lesions in their examinations of 80 patients [32]. A recent investigation analyzed 364 structural parameters by means of sonoelastography in 42 patients with breast cancer and 75 patients with benign lesions. Seven sonoelastic characteristics were selected that predicted malignancy with a sensitivity of 85.7% and a specificity of 89.3% [35]. A smooth transition from early structural analyses to modern radiomics can be observed.
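
Sensitivity and specificity, as quoted throughout this section, follow directly from the counts of correctly and incorrectly classified lesions. The short sketch below illustrates the arithmetic for the sonoelastography study with 42 malignant and 75 benign cases. The individual counts (36 true positives, 67 true negatives) are an assumption inferred from the reported percentages and group sizes; they are not stated in the text.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of malignant lesions detected
    specificity = tn / (tn + fp)  # share of benign lesions correctly cleared
    return sensitivity, specificity


# Assumed counts: 36 of 42 malignant and 67 of 75 benign lesions classified
# correctly (inferred from the reported 85.7% and 89.3%).
sens, spec = sensitivity_specificity(tp=36, fn=6, tn=67, fp=8)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 85.7%, specificity = 89.3%"
```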

In contrast to the detection of lung cancer, where computed tomography plays a major role, magnetic resonance imaging is the adequate procedure for soft tissue visualization in breast cancer. Here too, the first structural analyses date back to the 1990s. As early as 1997, Sinha et al. could differentiate benign from malignant breast lesions based on 8 structural characteristics combined with the patients' age with a sensitivity of 93% and a specificity of 95% [36]. In subsequent studies performed with the purpose of differentiating malignant breast tumors, structural analyses also achieved good results. However, those were retrospective data analyses with a limited number of patients [37] [38] [39]. Cai et al. conducted a study with a relatively large cohort of 234 patients in which they could differentiate breast cancer from benign lesions with a sensitivity of 85% and a specificity of 89%. By means of 3 classic machine-learning algorithms, 28 structural parameters were examined and, in order to avoid redundancy and to achieve improved significance, reduced to 5 features. The established 5 structural parameters in diffusion-weighted MRI (apparent diffusion coefficient, sum average, entropy, elongation, sum variance) were validated in a second cohort of 93 patients with a sensitivity of 69% and a specificity of 91% [40]. Recently, Bickelhaupt et al. showed that a specific radiomic signature usefully complemented the apparent diffusion coefficient alone for differentiation of malignant lesions in MRI [41]. Holli et al. succeeded in histological subtyping between lobular and ductal breast cancer, however only within a pilot study of 20 patients (n=10 with ductal breast cancer, n=10 with lobular breast cancer) [42].
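
Reducing a feature set to avoid redundancy, as in the step from 28 to 5 parameters above, can be done in many ways. The sketch below shows one generic option, a greedy correlation filter that drops any feature strongly correlated with one already kept. This is an illustrative stand-in, not the procedure used by Cai et al., which is not described here. The feature values and the threshold of 0.9 are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a constant feature has no defined correlation
    return cov / (sx * sy)


def drop_redundant(features, threshold=0.9):
    """Keep features (in insertion order) that are not strongly correlated
    with any feature kept before them.

    features: dict mapping feature name -> list of values, one per patient.
    """
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values for five patients (not study data).
toy = {
    "entropy": [1.0, 2.0, 3.0, 4.0, 5.0],
    "sum_average": [2.1, 4.0, 6.2, 7.9, 10.0],  # nearly proportional to entropy
    "elongation": [0.9, 0.2, 0.7, 0.1, 0.5],
}
print(drop_redundant(toy))  # prints ['entropy', 'elongation']
```

The result depends on the order in which features are visited, so in practice the features would usually be ranked by relevance first.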

In terms of radiogenomics, a correlation of MRI-based characteristics regarding the subtyping of 91 biopsies of invasive breast carcinomas with genomic data (TCGA/TCIA: The Cancer Genome Atlas/The Cancer Imaging Archive) in a multicenter analysis of the National Cancer Institute could be found. By means of radiomics, Wang et al. identified breast cancers in DCE-MRI that did not have the typical genomic markers (“triple negative”) [43]. A combined approach of 38 radiomic parameters and 144 genetic properties was chosen by Guo et al. that were tested in combination and against each other. Radiomic features were more suitable for predicting the tumor stage whereas genomic features better described the receptor status. The data of the 91 included patients originated from the TCIA and TCGA databases. However, the research team admitted a reduced significance of their trial because of the limited number of patients [44]. Recently, also Li et al. tested the predictive significance of MRI-based radiomics against genetic tests applied in clinical routine for breast cancer (MammaPrint, Oncotype DX, PAM50 Gene Assay) based on data of 84 patients and came to the conclusion that the radiomics-based testing might play a role in the prognosis of recurrences [45].

The response of breast cancer to chemotherapy was evaluated by several research groups. Ahmed et al. and Parikh et al. could find significant differences between chemotherapy responders and failures based on 8 and 2 (entropy and uniformity) MRI-based structural parameters, respectively [46] [47]. A recent trial of Braman et al. did not only rely on tumor-based radiomics for prediction of therapy response, but also examined the tumor-surrounding tissue. Insufficient response to neoadjuvant chemotherapy was associated with a higher peritumoral heterogeneity. The combined examination could significantly predict the treatment response, independently from the receptor status [48]. In this way, the field that is considered in radiomic examinations was extended by this study. Not only the tumor itself provides relevant data.

In summary, it can be stated that the variance of examination methods for breast cancer of which radiological structural parameters can be obtained is higher than for lung cancer. Since it is a primarily soft tissue-associated tumor, the classic examination methods of mammography, ultrasound including sonoelastography, and MRI are first-line techniques. Also in this context, studies have already been conducted that differentiate not only benign and malignant lesions as the endpoint, but that rather emphasize histological and genetic foundations and predict the clinical outcome as well as the therapy response. Data sources that were used for several analyses were not only own imaging data but larger accessible databases such as TCIA and TCGA. In order to obtain statistically more reliable results in the future and to validate radiomic signatures also trans-regionally, they are certainly – even for other tumor entities – a suitable data source that should be taken into consideration.

Fewer radiomics trials exist for other solid carcinomas such as cervical carcinomas [49], liver carcinomas [50], colon carcinomas [51], and prostate carcinomas [52] [53] [54] [55]. Concerning glioblastomas and gliomas, radiomics-based correlations could be found with molecular information such as the EGFR status and the isocitrate dehydrogenase 1 (IDH1) status [56] [57] [58]. Radiological structural characteristics could also be determined for renal cell carcinomas that correlated with the mutation status of the BAP1 (BRCA1-associated protein 1) gene, the VHL (von Hippel-Lindau) gene, or the KDM5C gene as well as with the EGFR status [59] [60] [61]. In a proof-of-concept pilot study that analyzed the structural characteristics of FLT-PET/MRI of patients with metastatic renal cell carcinoma, the therapeutic response to the receptor tyrosine kinase inhibitor sunitinib could be predicted [62].



2.3 Radiomics in head and neck cancer

Regarding the head and neck area, radiomics-based investigations already exist for esophageal cancer, nasopharyngeal cancer, and “classic” squamous cell carcinomas of the oro- and hypopharynx, larynx, and the oral cavity.

In a cohort of 41 patients suffering from esophageal cancer, Tixier et al. evaluated the therapy response to combined radiochemotherapy (5-fluorouracil with carbo- or cisplatin). They analyzed 38 radiomic parameters of pretherapeutically performed whole-body (18)F-FDG PET examinations. In this way, complete and partial therapy responders as well as non-responders could be identified more reliably than with standardized uptake values (SUV) alone [63].

The research team of Zhang et al. recently published 2 articles on radiomics of nasopharyngeal carcinomas. They were based on MRI for which 870 radiomic features were evaluated per patient. The first study encompassed 110 patients and 6 methods for parameter selection and 9 classification methods were analyzed. An optimal machine-learning method was identified in order to perform biomarker screening of nasopharyngeal carcinomas [64]. In the second study, 118 patients with primary diagnosis of advanced nasopharyngeal carcinomas (stage II-IVb) without distant metastasis were integrated; 88 of them were examined in a training cohort and 30 in an independent validation cohort. A radiomic signature could be established by means of a combination of CET1-weighted and T2-weighted images together with the TNM stage regarding the progression-free survival. This was superior to a signature of CET1-weighted or T2-weighted images alone and also TNM classification alone [65]. The significance of radiomics was improved by the combination with known clinical parameters – or vice versa – in the sense of multimodal modeling.

The team of Lambin et al. may be considered pioneers in the field of radiomics in general and of head and neck cancer in particular. Their primary research in head and neck cancer is based on routinely performed CT scans. In squamous cell carcinomas of the head and neck, the effects seem to be similar to those in non-small cell lung cancer. In 2014, 440 automatically extracted radiomic features were examined in CT scans of 1,019 patients who had either lung or head and neck cancer. The features described phenotypic properties of the tumor: image intensity, shape, texture, and wavelet decompositions at several scales. The stability of those features was first tested in 2 small cohorts (n=31 and n=21). A radiomic signature that correlated with the clinical outcome of the patients (Kaplan-Meier analysis) could then be established in a larger cohort of 422 lung cancer patients (Lung 1/Maastro).

It contains 4 features: statistics energy, shape compactness, grey level nonuniformity, and grey level nonuniformity HLH. For validation, 4 additional cohorts were included (Lung 2/Radboud n=225, H&N1/Maastro n=136, H&N2/VU Amsterdam n=95, Lung 3/MUMC n=89); for 3 of them, independence was already ensured because they originated from different study centers. The established signature could be validated in 3 cohorts (Lung 2, H&N1, H&N2). Remarkably, its prognostic value was superior to that of TNM staging alone in Lung 2 and H&N2 and comparable to that of the TNM classification in H&N1. Combining TNM staging with the radiomic signature further improved outcome prediction in all groups, independent of the patients’ treatment (radiation or radiochemotherapy). Particularly since the publication of the revised TNM classification, which had not yet been applied in this study, it is also of interest whether HPV (human papillomavirus)-positive patients can be differentiated from HPV-negative patients by means of a radiomic signature, especially because their outcome after radio(chemo)therapy differs [66] [67]. However, this was not the case, although the radiomic signature predicted the outcome well, in particular in HPV-negative patients. In addition to the correlation with clinical data, the radiomic signature of the Lung 3 cohort was correlated with gene expression data of the same cohort in a gene-set enrichment analysis (GSEA). In this way, associations between the expression of different gene sets and the radiological structural parameters could be defined; in particular, variations in the expression of cell cycle genes were reflected. Hence, imaging can to a certain degree also reveal the molecular biology underlying the tumor [13], which supports the value of radiogenomics.
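How well a signature orders patients by risk is often summarized with a concordance index. The following Python sketch illustrates the general idea for right-censored survival data. It assumes a hypothetical per-patient risk score derived from the signature; the function name and the toy numbers are invented for illustration and are not taken from the cited study.

```python
def concordance_index(times, events, scores):
    """Harrell-type concordance index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event (e.g. death) was observed, 0 if censored
    scores : hypothetical risk score per patient (higher = higher risk)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored patient cannot be the earlier event of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0   # shorter survival had the higher risk score
                elif scores[i] == scores[j]:
                    concordant += 0.5   # ties count as half
    if comparable == 0:
        raise ValueError("no comparable patient pairs")
    return concordant / comparable


# Toy data (invented for illustration only).
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.4, 0.1]
c_index = concordance_index(times, events, risk)
```

A value of 0.5 corresponds to a random ordering and 1.0 to a perfect ordering, so two models (e. g. TNM stage alone versus TNM stage plus signature) can be compared on the same cohort.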

The same 440 CT-based radiomic features were applied in another study of the group, this time correlated with further clinical properties. Like the previous study, it was subdivided into a training and a validation phase. Two cohorts with lung or head and neck cancer served as training cohorts (Lung 1 n=422, HN1 n=135), and the signatures were validated in 2 additional independent cohorts (Lung 2 n=225, HN2 n=95). Comparing head and neck with lung cancer, 143 features were relevant for both tumor entities. In addition, 190 features characterized the outcome only for lung cancer, and a further 22 radiomic features were relevant only for head and neck cancer. Different feature clusters could be correlated with survival (lung and head and neck cancer), histology (lung cancer), and tumor stage (lung and head and neck cancer). The HPV status, however, could not be revealed by a signature [68].

To further develop the methods of radiomics, the team established a machine-learning-based approach that predicted the overall survival of head and neck cancer patients from a radiomic signature with high stability. The objective was to make radiomic signatures more practicable for clinical routine, which is necessary to introduce radiomics as a non-invasive, cost-effective method in the medium term. The already known 440 radiomic features were tested with 13 feature selection methods and 11 machine-learning classification methods in a first cohort of 101 head and neck cancer patients and validated in an independent cohort of 95 head and neck cancer patients. The endpoint was overall survival. In this way, a reliable machine-learning method could be identified [69].
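The principle of comparing every feature selection method with every classifier can be sketched in a few lines of Python. The example below is a strongly simplified illustration with only 2 selectors and 2 classifiers; they are stand-ins chosen for brevity and are not the 13 and 11 methods of the cited study. All function names and the toy cohorts are assumptions made for this sketch.

```python
from itertools import product
from statistics import mean, pvariance


def select_by_variance(X, y, k):
    """Unsupervised ranking: keep the k features with the largest variance."""
    columns = list(zip(*X))
    ranked = sorted(range(len(columns)), key=lambda j: -pvariance(columns[j]))
    return ranked[:k]


def select_by_class_gap(X, y, k):
    """Supervised ranking: keep the k features whose class means differ most."""
    def gap(j):
        pos = [row[j] for row, label in zip(X, y) if label == 1]
        neg = [row[j] for row, label in zip(X, y) if label == 0]
        return abs(mean(pos) - mean(neg))
    return sorted(range(len(X[0])), key=lambda j: -gap(j))[:k]


def fit_nearest_centroid(X, y):
    """Assign a patient to the class with the closest mean feature vector."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [row for row, lab in zip(X, y) if lab == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def predict(row):
        return min(centroids, key=lambda lab: sum(
            (a - b) ** 2 for a, b in zip(row, centroids[lab])))
    return predict


def fit_majority(X, y):
    """Baseline: always predict the most frequent training label."""
    majority = max(sorted(set(y)), key=y.count)
    return lambda row: majority


SELECTORS = {"variance": select_by_variance, "class_gap": select_by_class_gap}
CLASSIFIERS = {"nearest_centroid": fit_nearest_centroid, "majority": fit_majority}


def evaluate_grid(X_train, y_train, X_valid, y_valid, k=1):
    """Validation accuracy for every selector/classifier combination."""
    results = {}
    for (s_name, select), (c_name, fit) in product(SELECTORS.items(),
                                                   CLASSIFIERS.items()):
        cols = select(X_train, y_train, k)  # features chosen on training data only
        train = [[row[j] for j in cols] for row in X_train]
        valid = [[row[j] for j in cols] for row in X_valid]
        predict = fit(train, y_train)
        hits = sum(predict(row) == label for row, label in zip(valid, y_valid))
        results[(s_name, c_name)] = hits / len(y_valid)
    return results


# Toy cohorts: feature 0 is noisy, feature 1 separates the two outcome classes.
X_train = [[10.0, 1.0], [-10.0, 1.1], [12.0, 3.0], [-11.0, 3.2]]
y_train = [0, 0, 1, 1]
X_valid = [[9.0, 1.2], [-9.0, 2.9]]
y_valid = [0, 1]
grid = evaluate_grid(X_train, y_train, X_valid, y_valid)
```

The important design point is that both the feature selection and the classifier are fitted on the training cohort only, and the validation cohort is used solely for scoring.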

In summary, those 3 publications were able to correlate data of different origins (clinical, histological, and genetic information) with radiomic parameters and in this way to characterize the tumor based on imaging. The radiomic signature alone was sometimes superior to the single data sources that had previously been used for characterization. Even when this was not the case, adding it in the sense of multimodal modeling improved the informative value for the assessment of the carcinoma. The trials were performed with relatively large patient cohorts from institutions at partly different locations, so that they were methodically well designed and the reliability of their findings could already be verified internally.

Recently, an article by the “Head and Neck Quantitative Imaging Work Group of the M.D. Anderson Cancer Center/MICCAI” was published. In a newly initiated trial, 288 patients with oropharyngeal cancer and known HPV status were included. They had undergone primary radiotherapy with curative intent (IMRT) and a standardized pretherapeutic CT scan. The HPV status and the occurrence of local recurrences were defined as primary endpoints of the radiomics-based analysis of the CT scans. In an approach designed as a scored competition, different researchers could test how well their algorithms predicted the HPV status and local recurrence. The winners were presented during the annual meeting of MICCAI 2016 [70].

First studies on the practical application of radiomics in clinical therapy are also available for head and neck cancer. Together with the team of Philippe Lambin, Ou et al. investigated 544 CT-based imaging features in 120 patients with advanced head and neck cancer. The patients received radiochemotherapy or “bioradiotherapy”. Based on pretherapeutic planning CT scans, the overall survival (HR=0.3; p=0.02) and the progression-free survival (HR=0.3; p=0.01) could be predicted by means of a radiomic signature encompassing 24 features. A combination with the p16 status as surrogate marker for HPV further improved the predictive value of the signature; overall, this combination was more informative than the p16 status or the radiomic signature alone [71]. In another trial, FDG-PET images of 174 patients with advanced (stage III-IV) oropharyngeal carcinoma who received definitive radiochemotherapy were examined. Imaging was performed before and after therapy. Mortality, local treatment failure, and distant metastasis were defined as endpoints. The investigation included 24 representative radiomic features that reflected the tumor intensity, shape, and texture. Predictive models for mortality, local treatment failure, and the occurrence of distant metastasis could be established and were cross-validated internally. In external validation, however, the model for local treatment failure did not reach significance. The models for mortality and distant metastasis could likewise not be confirmed statistically although, according to the authors, they had an acceptable predictive performance [72].

Overall, many promising approaches exist for establishing radiomics in head and neck cancer and integrating them clinically. Further independent investigations by other research teams would be desirable to enrich the field. CT scanning as the standard imaging procedure for head and neck cancer seems to be a reasonable basis, although MRI-based approaches still have to be assessed further. At the same time, the available studies of head and neck and other carcinomas show that, besides data quality and quantity, the success of a radiomics study is fundamentally determined by a structured approach. So, how should those data be approached?



3. Practical Implementation of Radiomics

A typical radiomics workflow has meanwhile been defined and is used in nearly all studies.

First, a suitable standardized imaging procedure is identified. Then, the region of interest (ROI), e. g. a tumor, is determined and segmented. From the segmented areas, radiomic features are defined and extracted by means of specific algorithms. Together with data from other sources, they are entered into a database, formatted accordingly, and are then ready for processing. Using suitable statistical methods, biomarkers and radiomic signatures can be derived from those databases. This principle seems logical and relatively easy to handle.

However, even in a standardized workflow, each of those steps bears the risk of errors, difficulties, and limitations that may impair, falsify, and complicate the analysis. The quality of the analysis, its informative value, and its comparability may suffer even when suitable imaging material is available. Even minor changes of the standard or the methods may reduce reproducibility, so that the established radiomic signature would be neither stable nor applicable. Therefore, large cohorts with many datasets that are as comparable as possible are always preferred as the basis for establishing a radiomic signature. Validation in an independent cohort according to standardized protocols, ideally by an independent group of researchers, helps to minimize internal sources of error that are sometimes difficult to identify. Typical sources of error are described below, following the single steps of the workflow.

3.1 Imaging

Imaging is the essential basis for performing radiomics in practice. It has to be available as a digital file. All imaging included in a study has to be performed according to the same standard, in the same modality, and if possible at a comparable stage of the disease. A suitable, high-quality imaging modality, the appropriate examination protocol, and the most reasonable ROI have to be identified. In the analysis of solid carcinomas, the ROI is mainly the tumor, but it may also include the surrounding tissue or possible metastases. Specific anatomical structures, disease foci, etc. may be defined as well. Optimally, the standard imaging with the standard protocol for the most common clinical questions is chosen.

Different imaging modalities lead to different radiomic features with potentially different informative value and specificity. Depending on the question and the ROI, a certain modality may be expected to provide more information. If this is not the case, the investigator probably benefits most from the standard imaging of the analyzed disease, for which most images are available without additional effort. This is important because the larger the cohort, the higher the potential statistical power and the fewer errors occur due to outliers. Once the modality has been selected, the way imaging is performed and the imaging parameters can still be adjusted. These are very basic factors that nonetheless have to be defined exactly.

The use of different scanners can also have an impact. However, separate institutions often already operate different scanners, and replacing them is very expensive. Thus, the scanner model should be taken into account, particularly in multicenter trials. For image acquisition, different slice thicknesses, programs, configurations, or details can be selected, and each of these settings influences the structure of the imaging data. When contrast media are applied, the radiomic features potentially vary with the type and quantity of the contrast agent, the time of application, and the individual, physiology-related distribution pattern in the patient. Not least, images may be acquired at different times, disease stages, and metabolic conditions, which can alter the definition of the ROI and its segmentation. All these parameters influence the features to be measured. Some of them are difficult to control and thus always remain a potential source of error. It is therefore even more important to strictly standardize all those characteristics that allow standardization, in particular the technical ones. The type of image acquisition, the imaging mode, the matrices, the slices, resolutions, reconstruction as well as the type and adapted quantity of the contrast agent can largely be standardized, which allows comparability between separate study centers. Fortunately, standards for specific questions are increasingly accepted so that this prerequisite for radiomic analyses is structurally improving. For variables that cannot be stabilized, accurate clinical information about the type and stage of disease, metabolic diseases, and other clinical features helps to identify influences and at least to take them into account during data processing [73].
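Whether the scans of a multicenter cohort follow the agreed technical standard can be checked automatically before any feature is extracted. The Python sketch below compares acquisition parameters against a reference protocol. The field names, the reference values, and the scan records are hypothetical and only serve as an illustration; in practice such parameters would be read from the image headers.

```python
# Hypothetical reference protocol; all values are invented for illustration.
REFERENCE_PROTOCOL = {
    "modality": "CT",
    "slice_thickness_mm": 3.0,
    "reconstruction_kernel": "standard",
    "contrast_agent": True,
}


def protocol_deviations(scan, reference=REFERENCE_PROTOCOL, tolerance=0.01):
    """Return the names of acquisition parameters that differ from the reference."""
    deviations = []
    for key, expected in reference.items():
        actual = scan.get(key)
        if isinstance(expected, float):
            # numeric parameters are compared with a small tolerance
            if actual is None or abs(actual - expected) > tolerance:
                deviations.append(key)
        elif actual != expected:
            deviations.append(key)
    return deviations


scans = [
    {"scan_id": "A-001", "modality": "CT", "slice_thickness_mm": 3.0,
     "reconstruction_kernel": "standard", "contrast_agent": True},
    {"scan_id": "B-002", "modality": "CT", "slice_thickness_mm": 5.0,
     "reconstruction_kernel": "standard", "contrast_agent": True},
    {"scan_id": "C-003", "modality": "CT", "slice_thickness_mm": 3.0,
     "reconstruction_kernel": "sharp", "contrast_agent": False},
]

# Keep only scans with at least one deviation from the reference protocol.
flagged = {}
for scan in scans:
    deviations = protocol_deviations(scan)
    if deviations:
        flagged[scan["scan_id"]] = deviations
```

Flagged scans can then be excluded, harmonized, or at least marked so that the deviation is taken into account during data processing.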



3.2 Segmentation

Segmentation defines the borders of the area that is analyzed by radiomics, e. g. a tumor. It is therefore an essential step and a basic precondition for performing radiomics. The region of interest (ROI) and the volume of interest (VOI) are identified. Slice by slice, the ROI is marked in the images so that the ROI itself and its relation to the surrounding structures are finally depicted in 3 dimensions. Only structures that are included are considered later in the analysis. Several different ROIs can generally be defined and segmented and later assessed together or separately. Radiological features are relevantly influenced by segmentation. Methodically, segmentation can be performed manually, semi-automatically, or automatically. Up to now, not all 3 methods are available for each application. The more exactly the borders are defined and the better they can be delineated from the surroundings, the easier it is to establish a semi-automatic or automatic segmentation.

ROIs that are more difficult to define are often segmented manually. Since the examiner has to delineate the ROI and define its borders slice by slice, this process is very time consuming and incompatible with clinical routine. Manual segmentation should always be performed by a specialist because its quality is highly dependent on the examiner’s experience. Nonetheless, there is a high interobserver variability that influences the radiomic signature. Even the same examiner may segment the same ROI differently at different times (intraobserver variability).

Automatic segmentation is performed by mathematical algorithms. Due to this gain in objectivity, the intra- and interobserver variability becomes negligible. The segmentations are more reproducible and can be performed more rapidly. Thus, automatic segmentation is very suitable for large datasets, many datasets, and multicenter approaches with many different examiners. However, it is not possible for every ROI. Blurred object borders, a lack of clear contrast in anatomically narrow localizations, or artefacts are problematic for automatic analysis and limit its applicability. They may lead to falsely defined ROIs, even if these are reproducible. For example, the tumor might not be included completely, or the software might include other areas. It is open to discussion whether a systematic misinterpretation by an automated segmentation or the intra- and interobserver variability of manual segmentation is the greater evil for the application of radiomics.

The compromise, and often the precursor of automatic segmentation, is semi-automatic segmentation. Here, the examiner identifies the ROI and outlines it, for example in one slice of the imaging. The software then performs an automatic segmentation of the defined object, and the examiner post-edits it. Semi-automatic segmentation combines sources of error and advantages of the manual and automatic approaches. It is more rapid than manual segmentation but still suffers from the intra- and interobserver variability of manual segmentation [73].
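Inter- and intraobserver variability of segmentations can be quantified by an overlap measure. A common choice is the Dice coefficient, which is sketched below in Python for two segmentations of the same ROI. The representation of a mask as a set of voxel coordinates and the toy masks are assumptions made for this illustration.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap of two segmentations given as sets of (z, y, x) voxel coordinates."""
    if not mask_a and not mask_b:
        return 1.0  # two empty masks agree trivially
    overlap = len(mask_a & mask_b)
    return 2.0 * overlap / (len(mask_a) + len(mask_b))


# Two observers outline the same lesion; the second contour is shifted by one column.
observer_1 = {(0, y, x) for y in range(2) for x in range(0, 2)}  # 4 voxels
observer_2 = {(0, y, x) for y in range(2) for x in range(1, 3)}  # 4 voxels
dice = dice_coefficient(observer_1, observer_2)
```

A value of 1.0 means identical segmentations and 0.0 means no overlap at all; in the toy example half of the voxels are shared.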



3.3 Establishment of radiomic features

The ROI defined by segmentation is analyzed automatically by specific algorithms that compute numeric values from the voxels and pixels. Hundreds of features may be produced and varied. They describe, for example, localizations, intensities, shapes, structures and structural differences, greyscales, and color intensities as well as correlations and ratios of these features. Before they are applied in a study, the selected features should be verified with regard to their stability within the examination and across the different individuals of the study.
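How numeric values are compiled from the voxels of a segmented ROI can be illustrated with a few simple intensity-based (first-order) features. The Python sketch below works on a small 2-dimensional toy image with a binary mask. The function name, the bin width, and the data are assumptions for illustration; dedicated radiomics software computes many more features on 3-dimensional volumes.

```python
import math
from collections import Counter


def first_order_features(image, mask, bin_width=1.0):
    """Intensity features of the voxels inside the ROI.

    image, mask : nested lists of equal shape; mask is 1 inside the ROI, 0 outside
    bin_width   : width of the intensity bins used for the histogram entropy
    """
    voxels = [value
              for image_row, mask_row in zip(image, mask)
              for value, inside in zip(image_row, mask_row) if inside]
    if not voxels:
        raise ValueError("empty ROI")
    n = len(voxels)
    mean = sum(voxels) / n
    energy = sum(v * v for v in voxels)  # sum of squared intensities
    # Histogram of discretised intensities, then Shannon entropy in bits.
    counts = Counter(math.floor(v / bin_width) for v in voxels)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return {"n_voxels": n, "mean": mean, "energy": energy, "entropy": entropy}


image = [[0, 1, 2],
         [0, 3, 9],
         [0, 0, 0]]
mask = [[0, 1, 1],
        [0, 1, 0],
        [0, 0, 0]]
features = first_order_features(image, mask)
```

Only the 3 voxels inside the mask contribute; the bright voxel with intensity 9 lies outside the ROI and is ignored, which shows how directly the segmentation influences the resulting values.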

The ROI properties may be depicted and further processed in different ways. For example, ROI intensities may be visualized in a histogram based on the fractional volume at voxel level. Data on the form of the ROI (volume, shape, surface markers, density, etc.) provide additional descriptive statistics. The analysis of additional, secondary properties, clusters, and correlations, even across different image settings, may yield enormous amounts of data. One challenge is the exclusion of redundancies, and such large data quantities are of course difficult to process. With the help of statistical methods and machine learning, the parameters have to be reduced to the informative and validated features that are relevant for the objective of the analysis and trial. Only in this way do the features gain their specific informative value.
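One simple way to exclude redundancies is to drop features that are almost perfectly correlated with a feature that has already been kept. The following Python sketch shows such a greedy correlation filter. The threshold of 0.9, the feature names, and the values are assumptions chosen for illustration; other reduction methods are equally possible.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return 0.0  # constant feature: correlation undefined, treated as 0 here
    return cov / (sd_x * sd_y)


def drop_redundant(features, threshold=0.9):
    """Keep a feature only if |r| with every already kept feature is below threshold."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# One value per patient for each hypothetical feature.
features = {
    "volume": [1.0, 2.0, 3.0, 4.0],
    "volume_doubled": [2.0, 4.0, 6.0, 8.0],  # carries no new information
    "texture": [2.0, 1.0, 1.0, 2.0],
}
kept = drop_redundant(features)
```

Because the filter is greedy, the result depends on the order of the features; in practice the most stable or best understood features would be listed first.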



3.4 Establishment of databases

One particularity of radiomics is that the radiological characteristics are analyzed in their clinical, genetic, and/or histopathological context. For this purpose, corresponding databases have to be compiled. Large data storage capacities that are readily accessible for analysis have to be available. Each feature should be recorded as a value from a set of possible, clearly delineated categories. For each clinical, genetic, or histological property, the type and source of the data should be the same for all patients. Linking different databases, e. g. clinical, genetic, and radiological ones, may also be useful. However, data protection regulations have to be strictly observed. Of course, all these data have to be digitally available for statistical evaluation.
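Linking clinical, radiological, and genetic databases essentially means joining records via a common patient key. The Python sketch below shows this for three small in-memory tables. The pseudonymous patient IDs, the field names, and all values are hypothetical; a real project would use a database system and a pseudonymization procedure that complies with the applicable data protection regulations.

```python
def link_sources(*sources):
    """Join several {patient_id: record} mappings on the patient ID.

    Only patients present in every source are kept. If two sources use the
    same field name, the later source overwrites the earlier one.
    """
    shared_ids = set(sources[0])
    for source in sources[1:]:
        shared_ids &= set(source)
    linked = {}
    for patient_id in sorted(shared_ids):
        row = {}
        for source in sources:
            row.update(source[patient_id])
        linked[patient_id] = row
    return linked


clinical = {
    "P01": {"t_stage": "T2", "hpv": "positive"},
    "P02": {"t_stage": "T3", "hpv": "negative"},
    "P03": {"t_stage": "T1", "hpv": "negative"},
}
radiomic = {
    "P01": {"energy": 14.0},
    "P02": {"energy": 9.5},
}
genetic = {
    "P02": {"egfr": "wild type"},
    "P01": {"egfr": "mutated"},
}
linked = link_sources(clinical, radiomic, genetic)
```

Patient P03 is dropped because no radiomic or genetic record exists, which makes missing data visible before the statistical evaluation starts.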



3.5 Analysis of databases

Once the databases have been compiled, the actual evaluation starts. The objective is to establish a radiomic signature that correlates with a specific requirement or question. An alternative objective may be multimodal modeling, in which the radiomic signature is evaluated together with other data in order to gain precision and/or information. A radiomic signature is extracted from all measured radiological values and may contain one or several features. Those features may be trivial in the sense of phenomena that are already known macroscopically, or they may be derived more abstractly from voxels and pixels. Some radiomic signatures contain a 3-digit number of single components. Considering that hundreds of radiological feature values are already available per patient, that further features may be added, and that large data volumes from several sources may be processed simultaneously (e. g. combined with genome analysis), it is clear that experienced statisticians and suitable software are essential. The software should be able to assess large data quantities in a reasonable timeframe and to generate solid, reproducible, broadly applicable biomarker signatures. Thorough work during the previous steps significantly improves the data quality, because at the analysis stage sources of error can hardly be compensated by statistics. A high number of well-defined datasets, if possible obtained from several centers by different examiners, limits errors due to outliers, interobserver variability, local particularities, and measurement uncertainty [73]. Achieving this is logistically difficult and cost-intensive. Ideally, a primary radiomic signature is created based on a retrospective cohort, validated in another independent cohort, and tested prospectively in the clinical setting [13].
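The separation of training and validation can be illustrated with a very small Python sketch: a cut-off for a signature score is derived in the training cohort, frozen, and then applied unchanged to an independent cohort. The median split, the score values, and the event data are assumptions for illustration and do not reproduce any of the cited analyses.

```python
from statistics import median


def fit_cutoff(training_scores):
    """Derive the risk cut-off from the training cohort only (median split)."""
    return median(training_scores)


def event_rate_by_group(scores, events, cutoff):
    """Event rate in the low- and high-risk group for a fixed cut-off."""
    groups = {"low": [], "high": []}
    for score, event in zip(scores, events):
        groups["high" if score > cutoff else "low"].append(event)
    return {name: (sum(ev) / len(ev) if ev else None)
            for name, ev in groups.items()}


# Training cohort: the cut-off is fixed here and not changed afterwards.
train_scores = [0.2, 0.4, 0.6, 0.8]
cutoff = fit_cutoff(train_scores)

# Independent validation cohort (toy data): 1 = event observed, 0 = no event.
valid_scores = [0.1, 0.3, 0.7, 0.9, 0.55]
valid_events = [0, 0, 1, 1, 0]
rates = event_rate_by_group(valid_scores, valid_events, cutoff)
```

If the cut-off were re-estimated in the validation cohort, the result would be overly optimistic, which is why the validation data must not influence any parameter of the signature.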



4. Radiomics: Study Objectives

In oncology, radiomics are currently used in particular to characterize solid tumors on different levels (histological, genetic, clinical), to predict the outcome, and to predict the therapy response to primarily conservative therapeutic measures. A transfer to other meaningful correlations, however, seems possible.

Current tumor characterizations encompass clinical data, macroscopically assessable imaging data, occasionally functional data (e. g. mobility of the vocal folds in laryngeal cancer), and genetic, proteomic, and (immuno)histological information from tumor areas that are obtained pretherapeutically by biopsy. The whole tumor can only be examined after surgical resection. Biopsies provide a tissue sample, assumed to be representative, that may be characterized histopathologically and/or by molecular biology. Unfortunately, however, many carcinomas are not homogeneous. Various cell populations and clones with different histological and in particular molecular properties are found in different areas, and they cannot necessarily be differentiated before biopsy. The biopsy site is determined by clinical factors such as anatomical position and accessibility, by the biopsy method, and by the skills and experience of the examiner, who should succeed in obtaining more or less representative tissue. Studies in which biopsies were taken from different sites of the tumor have already revealed this heterogeneity.

Radiomics have the advantage that they assess the whole tumor by means of morphological imaging with regard to its size, shape, surface, and internal structure as well as its anatomical context. If radiomic signatures can be identified for histological, genetic, and proteomic properties of the tissue, these properties may potentially be assessed for the entire tumor. Furthermore, radiomics may be extended to the surrounding tissue, metastases, or metabolic conditions. Alternatively, radiomics might be used to better distinguish subareas of the tumor that cannot be seen with the naked eye and thus improve the quality of biopsies by indicating where they should ideally be taken in order to characterize the tumor histologically and molecularly. However, it would be desirable to avoid invasive biopsies completely. It would therefore be beneficial to improve the specificity of radiomics to such an extent that the characterization is equivalent to that of a biopsy or even better. Whether and for which applications this is possible will have to be shown in further studies. For now, radiomics should be used in the context of clinical data and data from additional sources.

Some studies have already shown that combining radiomic features with biomarkers and data from other sources may achieve a more accurate subtyping of diseases and better outcome predictions. Until now, information from imaging has only indirectly been included in tumor typing; it contributes, for example, to the TNM classification. With the sometimes very coarse radiological characteristics that are commonly considered, such as tumor size, invasion, or extracapsular spread, informative metadata in the background remain disregarded. Many of them get “lost” for the human examiner during processing of the images, so that they may currently be considered a “dead” data source. Together with data from other sources, radiomics may refine the typing by other biomarkers and reduce grey areas. In this way, radiomics could become an integral part of multimodal prediction or typing models, which ultimately form the basis of a broad spectrum of applications in personalized medicine. Because of the enormous data volume and with a view to a standardized, objective description, the development and application of suitable software is essential.

Meaningful outcome correlations could contribute to the assessment of tumor aggressiveness. As part of a multimodal overall assessment as well as, for specific applications, as a potential biopsy substitute, radiomics might improve the prediction and monitoring of the outcome as well as the choice of treatment options.

Regardless of their predictive character, radiomics-based examinations also contribute to the increasing automation of cancer screening in general and to its standardization. They foster cost- and time-effective diagnostics. The role of the diagnosing examiner could be either supported or pushed into the background. It must not be forgotten that radiomic algorithms assess information that the human eye cannot perceive when viewing the images. In this way, radiomics even have the potential to achieve better diagnoses than humans.

The major advantage is that the required data are already present in large quantities in our standard imaging procedures. They are just waiting to be explored. Once relevant radiomic features are defined and validated, additional examinations will no longer be necessary.



5. Factors for Clinical Integration of Radiomics: A Future Vision

So what could a future vision look like? Until now, radiomics in the sense of the current definition have not been introduced into clinical routine, mainly because the field is so new. But they are increasingly recognized as an alternative source of biomarkers and promoted on the scientific level. With further intensification of big data and IT-based modeling approaches, imaging might complement or replace information obtained by biopsies (histological, genetic/molecular). Uni- or multimodal modeling could predict the outcome and therapy response so precisely that it not only supports medical decisions but in extreme cases even replaces them. Such approaches are only possible with very strict, multicenter standardization of diagnostics and therapy. Treatment individualization would be advanced on the basis of objective data. However, this also bears the risk of losing individualization because psychological, social, or, generally speaking, “human” characteristics would not be assessed. The physician, and thus also the patient, would become more and more subordinate to standardization and the power of the data, with all its advantages and disadvantages. A further development of radiomics and modeling would support telemedicine and self-diagnosis and thus the centralization of medical services in specialized centers.

But what will have to happen to make radiomics practicable for clinicians? What are the wishes of the applying physician?

The user must be able to rely on a constantly high specificity and sensitivity of the analysis because therapeutic and diagnostic decisions depend on it. Ideally, radiomics should have a broad application spectrum. With regard to cost-benefit considerations and patient comfort, the use of standard imaging without additional effort would be desirable. It should be possible to integrate the respective software into the local IT systems and thus allow smooth processing when clinical data are linked with genetic and (processed) imaging data, if necessary. The segmentation should be automated and integrated into the processing software. The software should be user-friendly, i. e. intuitive and clear. The results of radiomic analysis are better accepted when they can be adequately visualized. In practice, radiomics may not only reduce or avoid biopsies but additionally allow for a more holistic assessment of the tumor. Links to complementary software, multimodal outcome prognoses, or therapy concepts would be further options.



6. Conclusion

Radiomics enlarge the field of biomarkers in an innovative way: the basic imaging data gain in importance and are included in the wide spectrum of “omics” and biomarkers. They could substantially contribute to personalized medicine. A major advantage is that the data generally already exist and “only” have to be evaluated. Another advantage is that they may be obtained without biopsies and their potentially complex and expensive assessment (e. g. genomics). Moreover, the whole tumor is assessed and not only the excerpt provided by a biopsy. Radiomic signatures could possibly serve as biomarkers on their own and replace other clinical, histopathological, and genetic markers. In this way, patient comfort might be improved and costs might be saved. An additional benefit may be generated by multimodal modeling, in which correlation with data from other sources extends and improves their informative value. For both purposes, however, large data volumes have to be processed, which requires a high level of expertise and bears a considerable risk of errors on all levels of establishment and validation. Clinical integration requires not only a high degree of standardization but also suitable segmentation and analysis software that makes the definition of radiomic signatures feasible in clinical routine.

6.1 Big data instead of biopsy?

In the future, radiomics might replace biopsies for specific questions. In the very near future, however, it seems more probable that radiomics will complement the findings of biopsies and that data models enriched by radiomics will improve precision medicine.



Conflict of Interest

The author states that there are no conflicts of interest.




  
Fig. 1 Schematic, simplified workflow for creation of a radiomic signature.
Fig. 2 a Depiction of a cervical lymph node metastasis after importing the imaging into the segmentation software (courtesy of Prof. S. Wesarg, Fraunhofer Institute Darmstadt, Germany).
Fig. 2 b Depiction of a cervical lymph node metastasis after semi-automatic segmentation (in red).