Key words
Artificial Intelligence - Explainable AI - Machine Learning - Black Box - Deep Learning - Medical Image Processing
Introduction
Algorithms in artificial intelligence (AI) make it possible to effectively process
large quantities of data and address various questions. In the initial training phase,
already known or previously hidden relationships in sample data are identified and
represented in a model. With AI models trained in this way, identified correlations
can be applied directly to new data so that it can be processed quickly and easily.
Particularly in radiology, with its high degree of digitalization [1] and openness to technical progress, this approach has proven to be a very powerful tool for effectively processing the continuously increasing amount of image data [2], despite the skilled labor shortage [3].
The spectrum of applications ranges from efficient image acquisition and optimized workflows to automatic diagnostic support. For example, AI algorithms make it possible to reduce measurement time and radiation exposure while maintaining the same image quality thanks to improved image reconstruction [4][5][6]. A further application in the daily routine is the preselection of image data to reduce the unnecessary interpretation of unremarkable images; particularly in screening programs such as mammography, the workload can be significantly reduced [7][8][9]. In addition, AI methods allow faster and better diagnosis, e.g., through the automatic annotation of organs and pathologies [10][11][12] and through new quantitative, image-based markers such as those currently being intensively researched in radiomics [13][14][15].
Advances in AI are driven by improved methods [16][17], larger amounts of data [18], and increased computing capacity [19], allowing the generation of increasingly complex models. However, one challenge when using such complex AI methods is that it is often difficult to understand the reasoning behind their decision processes [20][21]. Particularly in the clinical routine, it must be possible to understand the reasoning behind decisions, including those made with the help of AI algorithms [22]. Reasons for this include acceptance by patients and the ability to evaluate the model's decisions.
When training an AI method, knowledge is implicitly acquired from the training data and applied to new tasks. However, this process entails some uncertainties. Was all relevant information available during training, or was something missing? Can the identified correlations be generalized? Do the identified correlations reflect a causal relationship, or are they coincidental? To ensure the reliability of an AI system, it must be shown that the system has learned the underlying properties and that its decisions are not based on irrelevant correlations between input and output values in the training dataset.
Weaknesses can be reduced but not ruled out by carefully selecting the model architecture and the training algorithm of an AI method. Additional information helps to minimize the effect of confounding factors, and validation of algorithms on external datasets allows evaluation of generalizability, which is being explicitly examined and promoted in data-driven areas like radiomics research [23][24]. However, as practical examples show, errors are possible even with careful work. For example, researchers at the Mount Sinai Hospital developed an AI method for evaluating pneumonia risk based on radiographs. However, the method achieved significantly lower accuracy outside of that particular hospital. As it turned out, the approach used information about the imaging devices and detected high-risk patients based on the devices used in the intensive care unit [25]. This example clearly shows how important it is to be able to understand an AI system so that such spurious correlations can be discovered not just by accident but systematically.
There are major differences between the individual AI methods not only with respect
to performance but also regarding the ability to understand generated models (see
[Table 1]). If the models cannot be interpreted, the image of a closed black box is often used (see [Fig. 1]). This refers to models whose inner workings cannot be interpreted and for which only the input and output values are accessible. To understand how a black box works, explanation models for the actual model are consequently needed. In contrast, interpretable models are referred to as white boxes. An intermediate stage between the two extremes is the gray box. This refers to models that allow certain insight into internal data processing. It must be taken into consideration that in practice a method cannot always be clearly classified as a white, gray, or black box method.
Table 1 Comparison of the various levels of performance and explainability of white, black, and gray box methods.

          | Performance                   | Explainability
White box | Only limited model complexity | Direct interpretation of models
Black box | Complex models possible       | Subsequent indirect interpretation of individual aspects using explanation models
Gray box  | Complex models possible       | Interpretation of defined aspects using the model; further explanations via black box methods are possible
Fig. 1 Schematic representation of (a) white box, (b) black box, and (c) gray box methods. Data processing in white box methods is transparent, while only
interpretation models, which can be a source of error, can be generated for black
box methods. Methods that combine complex information processing with interpretable
modules can be referred to as gray box methods.
White box AI
Ideally, the entire data processing chain can be understood; such methods are referred to as white box methods. These include in particular methods from classic machine learning and statistical learning that provide transparent information processing from input values, e.g., patient data, lab values, or image data, to output values, e.g., a diagnosis. One example is linear regression, which calculates a linear combination of various numeric features. These methods are used, for example, to determine a radiomics signature and to weight the individual features of structure, shape, and texture. The influence of every feature is determined by an individual weight and can be easily read out and interpreted [26]. The situation is similar for other methods like Naive Bayes classification [27], which predicts the class based on relative probabilities of occurrence. By using probability distributions, the Naive Bayes classifier allows simple interpretation of the influence of an input value on the model output.
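To illustrate this direct interpretability, the following minimal Python sketch (using purely synthetic data and hypothetical radiomics-style feature names, not data from the cited studies) shows how the learned weights of a linear regression model can be read out:

```python
# Minimal sketch of white-box interpretability: the weights of a linear model can
# be read out and interpreted directly. Data and feature names are hypothetical.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
feature_names = ["shape_sphericity", "firstorder_mean", "texture_contrast"]
X = rng.normal(size=(100, 3))                    # 100 cases, 3 radiomics-style features
y = 0.8 * X[:, 0] - 0.3 * X[:, 2] + rng.normal(scale=0.1, size=100)

model = LinearRegression().fit(X, y)

# Each weight quantifies the influence of one feature on the model output.
for name, weight in zip(feature_names, model.coef_):
    print(f"{name}: weight = {weight:+.2f}")
```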
However, transparency is not synonymous with interpretability. Therefore, interpretability can also be limited in white box methods. This is clearly seen in the case of decision trees and Random Forests, which are also often used in radiomics [28][29][30][31]. Decision trees model a structured series of conditions in a tree structure. If the decision tree is complex or if Random Forests with multiple trees are used, decisions are transparent and can theoretically be traced, but in practice this is no longer feasible because of the complexity [32].
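The following minimal sketch (synthetic data, for illustration only) contrasts a shallow decision tree, whose rules can be printed and read directly, with a Random Forest whose sheer number of nodes makes such a reading impractical:

```python
# Minimal sketch: a shallow decision tree yields human-readable rules, while a
# Random Forest with many trees is transparent in principle but too complex to
# interpret in practice. Data is synthetic and purely illustrative.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier, export_text

X, y = make_classification(n_samples=200, n_features=4, random_state=0)

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(export_text(tree))                      # a handful of readable if/else rules

forest = RandomForestClassifier(n_estimators=500, random_state=0).fit(X, y)
n_nodes = sum(est.tree_.node_count for est in forest.estimators_)
print(f"Random Forest: {len(forest.estimators_)} trees with {n_nodes} nodes in total")
```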
Black box AI
If the decisions of a method can no longer be understood, for example due to their complexity, the models are referred to as black box models. Deep learning (DL)-based methods, which often exceed the performance of classic methods, are typical here. They are the foundation of leading methods for a broad spectrum of complex tasks including medical image analysis and are increasingly used in radiology. Deep learning is inspired by the structure and function of the brain and uses a dense network of millions of artificial neurons connected in series across multiple layers. The interconnection of the neurons allows flexible adaptation to the particular task at hand: the input images are processed within the neural network into visual features, from which segmentations or classifications are derived. The artificial neurons, in which the model knowledge is stored, are defined by learnable parameters.
Due to the high number of parameters, deep learning models are de facto no longer able to be understood [33], and new methods are needed to make the decision process comprehensible.
To make the black box of deep learning more transparent, methods are therefore being developed that attempt to explain the opaque functionality and interconnections of neural networks in a targeted manner. Many of these methods can be applied to current DL methods from general image processing. However, the value and the contribution of these methods to interpretability vary. If their limitations are not taken into consideration, there is a risk of apparent explainability and of drawing incorrect conclusions.
Most image-based DL architectures are based on convolutional neural networks (CNNs), which extract image features using filters. The visualization of these filters (see [Fig. 2]) can provide information about the properties extracted from the image data. For example, filters in the early layers of the network extract line or circle patterns. However, filters from deeper layers are difficult to interpret. The visualization of filters has primarily contributed to a better understanding and verification of how CNNs work (a minimal code sketch is shown after [Fig. 2]). Due to the high degree of abstraction of filter visualization, this technique is not helpful for explaining the model output in an individual application case.
Fig. 2 Visualization of the feature filters of a CNN model that can differentiate between
100 different animals. The filters for the first layer (a) have an understandable function (green box: line filter, blue box: circle filter,
red box: noise filter, yellow box: color filter), while the filters in the second,
deeper layer (b) can no longer be assigned an understandable function.
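As a minimal sketch of such a filter visualization (using a publicly available torchvision ResNet-18 as a stand-in, since the model shown in [Fig. 2] is not part of this example), the first-layer convolution filters can be displayed as small images:

```python
# Minimal sketch: visualizing the first-layer convolution filters of a pretrained CNN.
# A torchvision ResNet-18 serves as a stand-in model.
import matplotlib.pyplot as plt
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT)
filters = model.conv1.weight.detach()            # shape: (64, 3, 7, 7)

fig, axes = plt.subplots(8, 8, figsize=(8, 8))
for ax, f in zip(axes.flat, filters):
    f = (f - f.min()) / (f.max() - f.min())      # rescale each filter to [0, 1]
    ax.imshow(f.permute(1, 2, 0).numpy())        # channels last for display
    ax.axis("off")
plt.show()
```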
Another approach is to use optimization to generate an input image that maximally activates certain neurons [34]. If a neuron is highly activated, this means that an image feature learned by this neuron is present in the input image. The method thus converges to images that depict the patterns to which the selected neurons respond. Either random noise can be optimized as an input image, or a search can be performed for images from the training dataset that maximize the activation. The former usually delivers only abstract images that can be helpful during model development. The latter provides images that are easier to interpret but limits the specificity, since it is not clear which element in the input images actually caused the high activation of the neurons. Nonetheless, this approach can be helpful in practice in some cases.
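A minimal sketch of this activation maximization by gradient ascent on the input image (assuming a pretrained torchvision ResNet-18 and an arbitrarily chosen target class, both stand-ins) could look like this:

```python
# Minimal sketch of activation maximization: the input image is optimized so that a
# chosen output neuron (here a class logit) is maximally activated.
import torch
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT).eval()
target_class = 1                                          # hypothetical class index
image = torch.randn(1, 3, 224, 224, requires_grad=True)   # start from random noise

optimizer = torch.optim.Adam([image], lr=0.05)
for _ in range(200):
    optimizer.zero_grad()
    activation = model(image)[0, target_class]
    (-activation).backward()                              # gradient ascent on the input
    optimizer.step()

# `image` now approximates the pattern that the chosen neuron responds to.
```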
Deconvolution [35][36] is an approximate inversion of the convolution operations of a CNN. The regions of the input image that contribute to the activation of individual feature filters are highlighted. Here, too, human interpretation is needed to determine exactly which image features in the highlighted region are relevant. For this reason, and because of the multitude of filters needed for complex image analyses, deconvolution is usually used only during model development to support analysis.
Regardless of the inner structure of a model, masking-based saliency methods examine the model as a true black box purely from the outside [37]. By targeted manipulation of the input data and observation of the change in output values, relationships between individual input parameters and results can be established. In the context of image analysis, the input is manipulated by masking or altering individual image pixels. In the best case, the model output changes exclusively in reaction to the masking of relevant areas; otherwise, incorrectly learned correlations can be inferred. Moreover, spatial relevance as well as intensity influences can be examined. However, a comprehensive examination with this method is time-consuming, and correctness cannot be assumed even in the case of a positive result.
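A minimal sketch of such a masking-based analysis (occlusion) is shown below; `model` and a preprocessed input tensor `image` of shape 1×3×H×W are assumed to exist and are not defined here:

```python
# Minimal sketch of masking-based saliency: a patch is slid over the input image and
# the drop in the score of the target class is recorded as relevance of that region.
import torch

def occlusion_map(model, image, target_class, patch=32, stride=32):
    _, _, h, w = image.shape
    with torch.no_grad():
        baseline = model(image)[0, target_class].item()
        heat = torch.zeros(h // stride, w // stride)
        for i in range(0, h - patch + 1, stride):
            for j in range(0, w - patch + 1, stride):
                masked = image.clone()
                masked[:, :, i:i + patch, j:j + patch] = 0.0       # mask one region
                score = model(masked)[0, target_class].item()
                heat[i // stride, j // stride] = baseline - score  # score drop = relevance
    return heat
```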
With gradient-based saliency methods, regions in the input image that contribute to the decision for a certain output can be highlighted [38][39]. Using this approach, it is possible to determine whether irrelevant image areas were considered for a decision (see [Fig. 3]). For example, in the detection of COVID-19 pathologies on chest radiographs [40], it turned out that the focus of the trained AI was in part outside the lungs or even outside the body, thus reflecting differences in patient positioning and X-ray projection. Although this practical example clearly shows that this method can identify insufficiently generalized deep learning models, care should be taken when introducing these algorithms. Even when the focus is on the correct image region, incorrect image features within this region can be learned, and the use of saliency analyses can result in an overestimation of the model (a minimal code sketch is shown after [Fig. 3]).
Fig. 3 Visualization of the saliency heatmaps of the CNN model for an input image (a). Red pixels in heatmaps (b) and (c) show image regions that have a large impact on the network output. Heatmap (b) shows the focus of the model for the correct “jellyfish” output, which lies predominantly on the body of the animal. However, this focus is almost identical for an incorrect network output, as in (c) for the class “hummingbird”.
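A minimal sketch of a plain gradient saliency map, one of the simplest variants of this family of methods, is given below; as above, `model` and a preprocessed input tensor `image` (1×3×H×W) are assumed:

```python
# Minimal sketch of gradient-based saliency: the gradient of the target class score
# with respect to the input pixels highlights regions that influence the decision.
import torch

def gradient_saliency(model, image, target_class):
    image = image.clone().requires_grad_(True)
    score = model(image)[0, target_class]
    score.backward()
    # Pixel-wise relevance: maximum absolute gradient over the color channels.
    return image.grad.abs().max(dim=1)[0].squeeze(0)
```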
The T-CAV method is a more abstract approach to explaining deep learning models [41]. The goal is to examine the influence of human-understandable concepts in the input images. A linear classification model is trained to recognize different concepts based on the network's internal activations. The trained model can then be examined with respect to these concepts. A biased model can thus be detected early, for example during model development. However, the usefulness of T-CAV depends heavily on the trained model and the chosen concepts.
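The core step of T-CAV can be sketched as follows; the activation files, their shapes, and the use of a logistic regression classifier are assumptions for illustration, and the activations are presumed to have been extracted from an intermediate network layer beforehand:

```python
# Minimal sketch of the central T-CAV step: a linear classifier separates activations
# of concept images from activations of random images; its normal vector is the
# concept activation vector (CAV).
import numpy as np
from sklearn.linear_model import LogisticRegression

concept_acts = np.load("concept_activations.npy")   # hypothetical file, shape (n, d)
random_acts = np.load("random_activations.npy")     # hypothetical file, shape (m, d)

X = np.concatenate([concept_acts, random_acts])
y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])

clf = LogisticRegression(max_iter=1000).fit(X, y)
cav = clf.coef_[0] / np.linalg.norm(clf.coef_[0])    # concept activation vector

# The T-CAV score is then the fraction of class examples whose directional derivative
# along `cav` is positive, i.e. how sensitive the prediction is to the concept.
```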
The presented methods show how varied the explanation approaches for deep learning networks are. They can make an important contribution to explaining black box models but always have systematic limitations. Explaining a complex model always requires a reduction, which is associated with a loss of information and therefore provides only partial explanations. In summary, the broad applicability to black box models is an advantage of these methods; their restrictions include the limited explanatory power and the associated uncertainties.
Gray box AI
Gray box methods combine the advantages of interpretable white box methods with the high performance of black box methods. In this new research field, explainability is taken into consideration during the development of AI methods in order to achieve explanation goals without any notable loss of performance.
One possibility for achieving explainability is the use of representative examples, referred to as prototypes. Modeled on the human approach to making predictions, decisions are based on the most similar examples, which allow direct analysis. Either entire images or individual image segments can be learned as prototypes. Such systems not only classify medical images but also simultaneously show the most similar images in the training database [42][43]. The plausibility of a model prediction can thus be assessed, thereby inspiring trust on the part of the responsible end user. At the same time, the identified prototypes can serve as training material.
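A minimal sketch of such a prototype-style explanation via nearest neighbors in the model's feature space is shown below; the embedding function and the precomputed training embeddings are assumed to exist and are not defined here:

```python
# Minimal sketch of a prototype-style explanation: in addition to the prediction,
# the most similar training cases in the model's feature space are returned.
import numpy as np

def nearest_prototypes(query_embedding, train_embeddings, k=3):
    # Cosine similarity between the query case and all training cases.
    q = query_embedding / np.linalg.norm(query_embedding)
    t = train_embeddings / np.linalg.norm(train_embeddings, axis=1, keepdims=True)
    similarity = t @ q
    return np.argsort(similarity)[::-1][:k]   # indices of the k most similar cases

# The corresponding training images can be shown to the radiologist as prototypes
# that support (or call into question) the model's decision.
```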
Invertible neural networks have an architecture whose computations can be inverted, so that the model input can be reconstructed from its output. This invertibility can be used to inspect individual layers of the network. By manipulating relevant features, counterfactual sample images can be generated that allow statements like "without feature A the result would be...". This technique is already used in computer-assisted surgery to determine the degree of uncertainty of perfusion estimation in endoscopy [44]. Although invertible neural networks restrict the possible network architectures, they are a good option for better understanding AI models.
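A minimal sketch of a common building block of such networks, an affine coupling layer in the style of RealNVP (dimensions and layer sizes chosen arbitrarily for illustration), demonstrates the exact invertibility:

```python
# Minimal sketch of an affine coupling layer, a building block of many invertible
# neural networks: the forward transformation can be inverted exactly.
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim):
        super().__init__()
        # Small subnetwork predicting log-scale and shift from the first half.
        self.net = nn.Sequential(nn.Linear(dim // 2, 64), nn.ReLU(),
                                 nn.Linear(64, dim))

    def forward(self, x):
        x1, x2 = x.chunk(2, dim=1)
        log_s, t = self.net(x1).chunk(2, dim=1)
        return torch.cat([x1, x2 * torch.exp(log_s) + t], dim=1)

    def inverse(self, y):
        y1, y2 = y.chunk(2, dim=1)
        log_s, t = self.net(y1).chunk(2, dim=1)
        return torch.cat([y1, (y2 - t) * torch.exp(-log_s)], dim=1)

layer = AffineCoupling(dim=4)
x = torch.randn(8, 4)
assert torch.allclose(layer.inverse(layer(x)), x, atol=1e-5)   # exact reconstruction
```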
The advantage of gray box methods is the combination of understandability and high performance, which are important properties particularly in sensitive areas like medicine. However, corresponding methods currently exist only for a few application cases. In addition, the explainability of these methods is limited to specific elements. As with all explanation methods, it makes a difference, for example, whether individual cases are considered or a general statement is to be made; different explanation approaches must be used accordingly. For this reason, additional research and development in the new field of gray box methods is needed for the customized use of these methods in numerous application areas. Only then can the advantages of these methods be fully utilized.
Summary
Artificial intelligence can make an important contribution to safer and more efficient radiology. However, broad acceptance of such systems in the medical profession as well as among patients requires that their decisions can be understood. Radiologists must be able to understand the models they use in order to continue to fulfill their duty of care, make informed diagnoses, provide patients with comprehensive information, and document their work in a well-founded manner. To ensure the legal traceability of the measures taken, the explainability of models is an important requirement for their usability. High-performance systems like deep learning-based algorithms in particular are often too complex to be understood directly. The need to create interpretable models has already been recognized and is currently being addressed with various approaches, particularly methods that can be applied retrospectively to fully trained models.
The advances of recent years have resulted in considerable further developments with various levels of transparency and make it possible to answer various questions without limiting the complexity of the models. However, analysis from the outside limits the explanatory value for a black box system, and the corresponding methods can only provide explanation models of the actual models. These are necessary reductions of the original models and are therefore themselves a source of error.
The use of complex but interpretable gray box AI is an interesting alternative here.
Since explainability is part of these methods, the intermediate step of creating an
explanation model is not needed. The learned features can be analyzed and checked
with expert knowledge and offer a decision basis on which the end user can check the
reliability of the model results. Since the explanation method is an integral part of the AI solution, its use must be considered early, and it must be determined which parts of the AI model should be understandable. Adapted algorithms that provide the correct type of explanation are necessary here. Close cooperation between medicine
and information technology is consequently of essential importance for identifying
relevant questions and finding customized solutions.
Funding
University of Ulm
Baustein (L.SBN.0214)