Keywords
Breast pathology - biomarkers - precision medicine - pathology reporting - artificial
intelligence
Introduction
Pathology has advanced rapidly over the last two decades. Its focus on descriptive
diagnosis has been expanded by the possibilities opened up by molecular pathology. This
development has contributed significantly to our understanding of the carcinogenesis
and progression of various tumour diseases, including breast cancer. It has also refined
the differentiation of tumour entities and enabled the identification of pathogenic or
likely pathogenic gene variants, which are now used in precision medicine to predict the
response to targeted drugs. As a result, the amount of information that can be obtained
from tissue specimens has multiplied. With innovative treatment strategies building on
mRNA sequencing and proteomics, it will increase even further.
Thanks to the willingness of German pathologists to innovate and to their quality
awareness, the new methods and the necessary quality assurance measures were quickly
established throughout Germany across the ambulatory and hospital care sectors. The
Quality Assurance Initiative Pathology (QuIP, Qualitätssicherungs-Initiative Pathologie
GmbH) that was initiated by the German Society of Pathology (DGP, Deutsche Gesellschaft
für Pathologie e.V.) and the Professional Association of German Pathologists (BDP,
Berufsverband Deutscher Pathologinnen und Pathologen e.V.) started in 2004 with immunohistochemical
Round Robin tests; today, QuIP offers a broad spectrum of Round Robin tests, including
some addressing molecular pathology questions. Compared to the rest of Europe, Germany
today enjoys a leading position in terms of the availability of quality-assured biomarker
diagnostics [1].
Other forward-looking development themes are digitalization and artificial intelligence,
both of which are viewed as building blocks of a modern Next Generation Pathology
(the motto of the Annual Meeting of the German Society of Pathology in 2024) and have
increasingly been implemented and advanced in recent years. In pathology, there are
numerous options for digitalization: laboratory workflow, digital scanning of slides
for assessment on the monitor and automated quantification of immunohistological markers,
reporting using speech recognition, creation of structured, standardized pathology
reports, electronic transfer of pathology reports to practices and hospitals via defined
interfaces, and cancer registry reporting. The areas of application of artificial
intelligence for optimizing processes and supporting diagnostic assessments are therefore
broad and varied, ranging from prioritizing cases for diagnosis to pre-screening of
biopsies and quantification of expression profiles to creating standardized pathology
reports and systematically interpreting extensive data sets [2] [3] [4].
The aim of this article is to highlight the special demands on modern, future-oriented
breast pathology reporting and to describe the possibilities and limitations of artificial
intelligence in this context.
The special features of breast pathology
Pathology is the critical link between diagnosis and treatment in the management of
women with suspicious breast lesions, whether detected by mammography screening or
otherwise. It has two tasks: to identify the morphological correlate of the suspicious
imaging findings or clinical signs and symptoms, and to set the course for appropriate
treatment.
The diversity of breast lesions is high in terms of clinical imaging presentation,
morphology, molecular characteristics, and biological behaviour and thus represents
a particular challenge. This applies both to routine diagnostics and the context of
AI-assisted reporting. From a morphological perspective, the breast is an exceptionally
heterogeneous organ that is subject to hormonal effects. Accordingly, the already broad
histological spectrum of benign and malignant neoplasms is expanded by physiological
functional changes, such as mastopathy, which are common and occur in a wide variety
of forms, potentially in various combinations with neoplasms. Benign changes can be
precursors of malignant transformation; other lesions entail the risk of misinterpretation
because of their close morphological resemblance to invasive breast cancer (so-called
mimickers of cancer). In breast pathology, intraductal epithelial proliferations account
for a large number of conditions. Thus, there is a need for diagnostic stratification
and assessment of progression potential based on qualitative and quantitative criteria.
The invasive types of breast cancer also present with an unusually broad morphological
spectrum, determined by the intrinsic properties of the tumour cells, their architectural
patterns and the extent of stromal reaction.
The task is to identify the various entities and establish a prognostic stratification
based on qualitative and quantitative criteria. Quantifiable criteria for histological
grading of breast cancer were defined and introduced at a relatively early point [5] [6].
The WHO classification provides the basis for criteria-based diagnostics that takes
molecular pathology aspects into account in addition to pathomorphological properties
and is updated on a regular basis [7]. Diagnostic criteria are particularly useful where it is important to identify mimickers
of cancer and not misinterpret them as carcinoma. This risk is especially high when
various non-invasive changes coincide in such a way that they resemble an invasive
cancer lesion. For example, the colonisation of a sclerosing adenosis lesion by a
ductal carcinoma in situ (DCIS) or a lobular carcinoma in situ (LCIS) can be misinterpreted
as an invasive carcinoma when the myoepithelial lining of the acini is not recognized.
Immunohistochemistry (IHC) and molecular pathology have expanded the array of diagnostic
methods and were utilised early on, especially in breast pathology and hematopathology.
Immunohistochemistry plays a key role in the differential diagnosis of intraductal
epithelial proliferations, the differentiation between non-invasive and invasive lesions,
the histological typing of breast cancer, and the differential diagnosis of spindle
cell lesions. While structural changes and architectural abnormalities allow the pathologist
to suspect the presence of cancer already in the overview magnification, in case of
doubt it is the immunohistochemistry-based detection of the absence of myoepithelial
cells and the destruction of the basement membrane that permits a reliable differentiation
between invasive and non-invasive lesions.
Focus on therapy
Apart from the diversity of breast lesions, the rapidly expanding array of therapeutic
options poses a particular challenge. Recognising and documenting all tissue characteristics
that are relevant to clinical management is one of the tasks pathologists are responsible
for. Knowledge of the parameters that entail specific therapeutic consequences or
recommendations is a prerequisite for this. Here, the evidence-based interdisciplinary
guideline (S3 Guideline Breast Cancer) edited by the Association of Scientific Medical
Societies in Germany (Arbeitsgemeinschaft der Wissenschaftlichen Medizinischen Fachgesellschaften
e. V., AWMF), the German Cancer Society (Deutsche Krebsgesellschaft e.V., DKG), and
German Cancer Aid (Deutsche Krebshilfe, DKH) and the recommendations of the Breast
Committee of the German Gynecological Oncology Group (AGO, Arbeitsgemeinschaft Gynäkologische
Onkologie) also offer guidance for pathologists [8] [9] [10].
Today, guideline-adherent testing for oestrogen receptor (ER) and progesterone receptor (PR)
expression, HER2 status and Ki67 proliferation index is performed almost reflexively
in patients with primary breast cancer. The results of these semi-quantitatively analysed
tests are essential for determining the (neo-)adjuvant systemic therapy and also have
an impact on the timing of surgery. In metastatic breast cancer, the determination
of ER, PR and HER2 is supplemented by the analysis of other parameters, which are
selected on the basis of the resulting receptor status, to identify potential targets
for new targeted therapy. In breast cancer, too, the number of addressable target structures
has been increasing rapidly in the last few years. Besides immunohistochemistry
(PD-L1) and in situ hybridization (HER2), DNA-based molecular analysis methods, such
as PCR and NGS (Next Generation Sequencing), are also used to detect changes in addressable
signalling pathways (e.g., PIK3CA, AKT1, PTEN). Tissue-based analyses have been supplemented
by the analysis of cell-free, circulating DNA (cfDNA) in blood plasma (liquid biopsy;
ESR1, PIK3CA).
In order to help pathologists to keep pace with the rapid growth in knowledge and
the increasing workload, QuIP GmbH offers an information portal on breast cancer [11]. This covers the current state of knowledge on biomarker diagnostics and the resulting
treatment options.
Due to the accelerated development in the field of biomarkers, a shift away from single-gene
analyses towards parallel high-throughput analyses is becoming increasingly necessary.
The NGS-based techniques will enjoy a further boost from the model project of section
64e of the German Healthcare Development Act (GVWG, Gesundheitsversorgungsweiterentwicklungsgesetz)
of July 11th, 2021 (BGBl. I, 2754). Apart from whole exome sequencing (WES) and whole genome sequencing
(WGS), it also includes the shift to multi-omics techniques, such as whole transcriptome
sequencing (WTS), proteomics and epigenetic analyses, among others.
Only an innovation-ready pathology can enable modern personalized medicine and open up
new treatment options for patients who have exhausted all available means of therapy.
Nevertheless, the pathology budget is not being adapted to the requirements of increasingly
sophisticated diagnostics, but is, in fact, being reduced. Consequently, there is
a risk that the ability of pathology to function and to innovate will be restricted
in the long term, which would ultimately compromise patient care.
Reporting
Structuring and standardising pathology reports
Essentially, pathology reports should be written comprehensibly, completely and quickly.
This implies that a generally recognised terminology is used and that all therapy-relevant
criteria are included.
Meeting these requirements prevents misunderstandings, which may lead to incorrect
treatment recommendations, reduces queries and allows timely initiation of necessary
treatments.
The current WHO classification provides the generally accepted medical terminology
for tumours of the breast [7]. It is advisable to name the source if deviating terms are used, for example, for
rare entities or variants not mentioned in the WHO classification.
Protocols for structured or synoptic pathology reporting are used to ensure uniform
nomenclature and standardised documentation. In contrast to “narrative” pathology
reports, the respective organ- and/or tumour type-specific criteria (e.g., histological
tumour grade) are presented in the form of a list [12]. Likewise, the level of the criteria (e.g., grade 1, grade 2, grade 3) and any specifications
(e.g., tubule formation, nuclear pleomorphism, mitotic count; 1–3 points each) are
stated using clear terminology. In the German mammography screening programme, the
results of the pathomorphological evaluation of core needle biopsies as well as of
surgical specimens have been documented electronically in predefined protocols right
from the start of the programme. Forms for standardised documentation of findings
have already been provided in the appendix to the S3 Guideline Breast Cancer for a
long time [8]. In Germany, however, pathology reports in daily routine are usually still written
in narrative form.
In contrast to Germany, structured protocols have been in common use for some time,
particularly in the USA. The College of American Pathologists (CAP) makes almost 100
different protocols for various organs available online free of charge [13]. There are protocols for biopsies, surgical specimens and biomarkers. However, their
use without a license is subject to certain restrictions. Furthermore, to the authors’
knowledge, the electronic versions of these protocols are neither available in German
nor compatible with the German pathology information systems.
The International Collaboration on Cancer Reporting (ICCR) now aims to remove these
language and electronic restrictions. The ICCR is a non-profit organization that integrates
standard sources (current WHO classifications, UICC/AJCC TNM classification) and internationally
validated and evidence-based pathology datasets for cancer reporting [14]. These can be used globally. There is broad cooperation between national pathology
societies, interdisciplinary associations and major international cancer organisations.
Authorised translations into French, Spanish and Portuguese are already available
for some protocols. The DGP and BDP are involved on the German side. The ICCR allows
largely free use of the protocols for diagnostic reports, but not for commercial
research.
In the datasets, a distinction is made between required essential parameters (CORE
elements) and recommended additional information (NON-CORE elements) ([Table 1]). The categories for the various parameters are predefined in a form. A detailed
explanation and illustration of the parameters is provided in an appendix, so that
not only the documentation of the criteria, but also their collection and interpretation
is standardised.
Table 1 CORE elements and NON-CORE elements for the pathology reporting of resection specimens
with invasive carcinoma of the breast based on the ICCR recommendations [15].
| Parameter | CORE element | NON-CORE element | Values |
|---|---|---|---|
| Clinical information | √ | | Screening vs symptomatic presentation, clinical findings, prior treatment, imaging, family history, genetic predisposition |
| Operative procedure | √ | | Type of excision or mastectomy |
| Specimen laterality | √ | | Right or left |
| Size/weight/details of tissue specimen | | √ | Free text |
| Tumour site | √ | | Distance to the nipple, quadrant or clock face |
| Tumour focality | √ | | Unifocal or multifocal |
| | | √ | Number and size of each focus |
| Tumour dimensions | √ | | Maximum dimension of largest focus |
| | | √ | Other dimensions |
| Histological tumour type | √ | | According to the WHO classification |
| Histological tumour grade | √ | | Grade 1, 2 or 3 |
| | | √ | Specification: tubule, nuclear pleomorphism and mitosis scores |
| Carcinoma in situ | √ | | Histological type and nuclear grading (in DCIS), necroses |
| | | √ | Architectural pattern (DCIS), extensive intraductal component (EIC) |
| Tumour extension | √ | | Skin, nipple, skeletal muscle |
| Margin status | √ | | Involved by invasive carcinoma/DCIS or distance to closest margin |
| | | √ | Extent of involvement, distance to all margins |
| Lymphovascular invasion | √ | | Present or not detectable |
| | | √ | Site if detectable elsewhere |
| Coexistent pathology | | √ | Free text |
| Microcalcifications | | √ | Present, associated lesion |
| Oestrogen receptor | √ | | Negative/positive/low positive, % positive nuclei, average intensity |
| Progesterone receptor | √ | | Negative/positive, % positive nuclei, average intensity |
| HER2 | √ | | IHC score, ISH negative/positive, cells counted, HER2 and CEP17 signals/nuclei, HER2/CEP17 ratio |
| | | √ | IHC % 3+ cells, ISH aneusomy, heterogeneity |
| Ancillary studies | | √ | e.g. Ki67, representative block for ancillary studies |
| Pathological staging | √ | | TNM classification |
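The distinction between CORE and NON-CORE elements lends itself to an automated completeness check before a synoptic report is signed out. The following is only a minimal sketch under stated assumptions: the field names and the `missing_core_elements` helper are hypothetical, they are not part of the ICCR dataset or of any pathology information system, and only a subset of the CORE elements from Table 1 is listed.

```python
# Illustrative subset of the CORE elements from Table 1.
# The field names are hypothetical and chosen for this sketch only.
CORE_ELEMENTS = (
    "operative_procedure",
    "specimen_laterality",
    "tumour_site",
    "histological_tumour_type",
    "histological_tumour_grade",
    "margin_status",
    "oestrogen_receptor",
    "progesterone_receptor",
    "her2",
    "pathological_staging",
)


def missing_core_elements(report: dict) -> list[str]:
    """Return the CORE elements that are absent or empty in a report."""
    return [
        name for name in CORE_ELEMENTS
        if report.get(name) in (None, "", [])
    ]


# Example report with two CORE elements not yet filled in.
report = {
    "operative_procedure": "wide local excision",
    "specimen_laterality": "left",
    "tumour_site": "upper outer quadrant",
    "histological_tumour_type": "invasive carcinoma of no special type",
    "histological_tumour_grade": 2,
    "margin_status": "not involved",
    "oestrogen_receptor": "positive",
    "her2": "IHC 1+",
    # NON-CORE information may stay empty without blocking completion.
    "coexistent_pathology": "",
}

print(missing_core_elements(report))
```

In this sketch an empty NON-CORE field is simply ignored, which mirrors the required versus recommended distinction made in the datasets.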
For the documentation of breast cancer, there are already 4 datasets available in
English [14] [15]:
- Ductal Carcinoma in Situ, Variants of Lobular Carcinoma in Situ and Low Grade Lesions
- Invasive Carcinoma of the Breast
- Invasive Carcinoma of the Breast in the Setting of Neoadjuvant Therapy
- Surgically Removed Lymph Nodes for Breast Tumours
Work is currently being carried out on the German translations as well as on the integration
of the templates and data records into the German pathology information systems. The
use of such protocols would be a significant step towards international pathology
report standardisation and scientific utilisation of the documented data.
However, one must not underestimate the effort and costs for pathology facilities
that would be involved in implementing a wide range of authorised synoptic report
templates and having them continuously updated. Collaborating specialties and hospital
administrations would also benefit. Thus, it would be desirable that all beneficiaries
contribute to the costs.
The speed of reporting
Given its direct impact on the possible start of appropriate treatment, the speed
of reporting plays an important role in patient care. Finding the right balance between
speed and accuracy is key. The turnaround time (TAT) is the period of time from receipt
of the tissue specimen to completion of the pathology report. It affects several aspects
of patient care.
The aim is to keep the period of uncertainty for patients as short as possible. Of
course, this must not affect the precision of the diagnosis. It is important that
the appropriate therapy can be initiated as early as possible to optimize a patient’s
chances of survival [16]. This also applies to molecular pathological testing, especially in the metastatic
situation. Time requirements can therefore also be found in some of the international
recommendations on molecular diagnostics [17]. Delays in diagnosis can prolong the patient’s length of hospital stay and thereby
reduce hospital efficiency.
Germany plays a pioneering role by setting a TAT for pathology services in certified
breast centres and time requirements in the mammography screening programme.
In the German mammography screening programme, the period between the start of the
diagnostic assessment and the notification of the result should not exceed one week
[18].
The time windows for the pathology TAT in the certified breast centres are as follows [19]:
- Histology results of core needle biopsies: within 2 working days
- Routine histology results incl. immunohistochemistry: max. 5 working days
These requirements have also been incorporated into the Manual for Breast Cancer Services
of the European Commission Initiative on Breast Cancer (ECIBC), albeit in a somewhat
less stringent form: Pathology results incl. immunohistochemistry: max. 5 working
days for non-surgical biopsies and 10 working days for surgical specimens [20].
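The time windows quoted above can be checked against a specimen's receipt and sign-out dates. The sketch below is illustrative only: the category labels and function names are hypothetical, public holidays are ignored, and the convention that the day of receipt does not count as a working day is an assumption rather than something the cited requirements specify.

```python
from datetime import date, timedelta

# Limits in working days as quoted in the text; the keys are hypothetical labels.
TAT_LIMITS = {
    "certified_centre_core_biopsy": 2,
    "certified_centre_routine_with_ihc": 5,
    "ecibc_non_surgical_biopsy": 5,
    "ecibc_surgical_specimen": 10,
}


def working_days_between(received: date, reported: date) -> int:
    """Count Monday-to-Friday days after `received` up to and including `reported`.

    Assumption: the day of receipt is not counted and public holidays are ignored.
    """
    days = 0
    current = received
    while current < reported:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days += 1
    return days


def within_tat(received: date, reported: date, category: str) -> bool:
    """Check whether the report was completed within the limit for a category."""
    return working_days_between(received, reported) <= TAT_LIMITS[category]


# Core needle biopsy received on a Thursday and reported the following Monday.
received = date(2024, 3, 7)
reported = date(2024, 3, 11)
print(working_days_between(received, reported))
print(within_tat(received, reported, "certified_centre_core_biopsy"))
```

A production check would additionally need a regional holiday calendar and an agreed definition of when the clock starts and stops.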
Digitization of histopathology slides
Histological diagnostics requires specialised, time-consuming and standardized processing
of the tissue to be examined. This involves tissue fixation, grossing, dehydration
and degreasing, infiltration with paraffin, sectioning with special microtomes, and finally
staining and mounting a cover slip over the tissue section on the slide. While some
of the tissue processing steps are automated, grossing, which is a medical task, and
sectioning with a microtome, which requires a high level of fine motor skills, have not
yet been automated for routine application. Pathologists then examine the resulting
histological slides under the microscope and describe, report and classify their findings.
AI-supported diagnostics require that slides are digitized. The easiest way to do
this is by using digital microphotography directly at the microscope of the reporting
pathologist. For the digitization of whole slides, a number of scanners with different
features are commercially available. Some models can be loaded with up to a thousand
slides, which can, however, only be processed sequentially. The resulting whole slide
images (WSIs) have a size of 0.5 to 4 GB, depending on the amount of tissue and the
desired resolution; this is several orders of magnitude larger than
standard images. A surgical case with 25 slides takes up 50 to 75 GB of hard
disk space. A few surgical specimens already exceed the amount of data that a large,
well-utilised radiological device (CT/MRI) produces in a year.
In contrast to radiology, the digitization of slides is always a secondary step in
histology. Scanning of a slide takes between 1 and 3 minutes. While in radiology,
primary digital image capture offers a speed advantage when taking plain radiographs
and is the only option of processing in diagnostic cross-sectional imaging, in histology,
digitization is an additional, time-consuming and very cost-intensive factor. The
costs result from the acquisition of slide scanners, most of which have to be purchased
as redundant equipment for capacity and downtime reasons, the high costs of storage
space, the digital transformation of workplaces, and the additional space, staff and
energy costs for devices and servers. These costs are significant and are not matched
by reimbursement in any service area, so that it is almost impossible to provide
care in a way that covers costs, especially in standard outpatient care.
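The storage and scanning figures quoted above can be turned into a simple back-of-the-envelope estimate. The sketch below uses only the ranges given in the text (0.5 to 4 GB per WSI, 1 to 3 minutes per scan, strictly sequential scanning); the function names are hypothetical, and the assumed 2 to 3 GB per slide is merely the value range that reproduces the quoted 50 to 75 GB for a 25-slide case.

```python
def case_storage_gb(n_slides: int, gb_per_slide: float) -> float:
    """Storage needed for one case, assuming a uniform size per slide."""
    return n_slides * gb_per_slide


def scan_time_hours(n_slides: int, minutes_per_slide: float) -> float:
    """Total scan time when slides are processed one after another."""
    return n_slides * minutes_per_slide / 60


# A 25-slide surgical case at an assumed 2 to 3 GB per slide.
print(case_storage_gb(25, 2.0), case_storage_gb(25, 3.0))

# The same case across the full quoted size range of 0.5 to 4 GB per slide.
print(case_storage_gb(25, 0.5), case_storage_gb(25, 4.0))

# A fully loaded scanner with 1000 slides at 1 to 3 minutes per slide.
print(scan_time_hours(1000, 1.0), scan_time_hours(1000, 3.0))
```

Under these assumptions, a scanner loaded with a thousand slides is occupied for roughly 17 to 50 hours, which illustrates why facilities plan for redundant scanners.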
[Table 2] provides a comparison of digitized radiological and histological image data. Despite
the effort and high costs, many pathology facilities in Germany are now working towards
the digitization of image data and are preparing for it by adapting upstream
processes. These include the digitization of accompanying
documents for submitted specimens, the use of barcode printers for paraffin blocks
and slides, and the primary database recording of examinations that have been performed.
Aside from digital microscope cameras taking high-resolution, colour-balanced and
standardised images, many facilities have software suitable for the partial digitization
of slides. The pathologist films the whole slide or the region of interest in a meandering
motion under the microscope, and a single image is then assembled from suitable frames of
the video stream by recognising overlapping image content (so-called stitching).
In this way, the pathologist can perform a partial digitization of slides without spending
a significant amount of time; these images can then be used, in particular, for a cloud-based
low-threshold second opinion procedure. In addition, many facilities have a low-capacity
scanner available. Already today, these allow the digitization of selected slides
and the application of AI-based measurement and evaluation methods as well as the
internal development of custom models and algorithms.
Table 2 Comparison of the characteristics of radiological and histopathological image data.
| Characteristic | Radiological image data | Histopathological image data |
|---|---|---|
| Resolution | Coarse | Microscopically fine |
| Colour channels | Usually one channel | Three channels (or more) |
| Digitization of image information | Primary, state of the art | Secondary, by no means everywhere as yet routine |
| Image data size | Moderate (MB range) | Very large (GB range) |
| Image data storage | 2-dimensional | Pyramidal in various resolution levels |
| Image data format | Standardized, DICOM | Manufacturer-dependent proprietary |
| Digital workflow | Established | Available |
| Information density of the image data | Moderate | Very high |
| Information redundancy | Low | Very high |
| Interinstitutional variations | Lower | Higher |
Artificial intelligence (AI) in breast pathology
Breast pathology offers a host of potential areas of application for artificial
intelligence to support diagnostics. Although fully automated comprehensive histological
diagnostics are still a long way off owing to the complexity of breast pathology,
AI support for pathologists in individual tasks is conceivable. The
key aim of AI-supported reporting assistance is to improve quality through
objectifiable and more precise results in the evaluation of semi-quantitative individual
diagnostic parameters that are part of pre-operative core needle biopsy diagnosis,
immunohistochemical examinations of diagnostic and predictive factors and processing
of surgical specimens. The potential applications are numerous and cover almost all
aspects of routine diagnostics; they have been summarised in extensive reviews [21] [22] [23]. Individual application fields are discussed below, together with the challenges
that arise when transferring promising AI approaches into routine practice.
AI-supported evaluation of immunohistochemical staining (Ki67, ER/PR, HER2/neu, PD-L1)
Software for the automated quantitative evaluation of immunohistochemical examinations
for predictive factors was already introduced about 10 years ago. The programmes were
originally based on image processing routines, but these can only be described as
artificial intelligence in a very broad sense. Today, however, there are neural network-based
software solutions on the market that can apply artificial intelligence in the narrower
sense. The task of quantitatively evaluating nuclear immunohistochemical staining,
as it is carried out for Ki67, the oestrogen receptor and the progesterone receptor,
is nowadays an entry-level task for companies positioning themselves on the market.
The potential advantage of automated evaluation is a higher degree of objectivity
and accuracy. However, a common feature of the currently available software programmes
is that they can only evaluate a limited field of view, or that there is a considerable
time delay (several hours) until the result becomes available. The task of determining
the proportion of labelled relevant cells is much more complex than it may appear
at first glance. The algorithms used not only have to distinguish between labelled
and unlabelled cells, but also be able to differentiate between the nuclei of tumour
cells and those of connective tissue cells, normal epithelial cells and inflammatory
cells, so that only tumour cells are included in the evaluation. This is a particular
challenge in sections that are only counter-stained with haematoxylin. There are also
similar commercial models on the market for HER2 diagnostics, including some which
are capable of differentiating between the HER2 scores 0 and 1+, an important feature
when specific antibody-drug conjugates are to be administered. For PD-L1, such models
are desirable, too. However, this is currently still hampered by the relatively complex
design of the various relevant scores for breast cancer.
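The requirement that only tumour cell nuclei enter the evaluation can be illustrated with a small calculation. The sketch below assumes that an upstream detection model has already assigned each nucleus a cell type and a labelled/unlabelled status; the data layout, the cell type names and the `labelling_index` function are hypothetical and do not describe any specific commercial product.

```python
def labelling_index(nuclei, cell_type="tumour"):
    """Percentage of labelled nuclei among nuclei of the given cell type.

    `nuclei` is an iterable of (cell_type, is_labelled) pairs, as they might
    be produced by an upstream nucleus detection and classification step.
    """
    selected = [labelled for kind, labelled in nuclei if kind == cell_type]
    if not selected:
        raise ValueError(f"no nuclei of type {cell_type!r} found")
    return 100.0 * sum(selected) / len(selected)


def naive_index(nuclei):
    """Percentage of labelled nuclei without restricting to tumour cells."""
    flags = [labelled for _, labelled in nuclei]
    return 100.0 * sum(flags) / len(flags)


# Synthetic example: 200 tumour nuclei (40 labelled), 100 labelled
# inflammatory cells and 50 unlabelled stromal cells.
nuclei = (
    [("tumour", True)] * 40
    + [("tumour", False)] * 160
    + [("lymphocyte", True)] * 100
    + [("stroma", False)] * 50
)

print(labelling_index(nuclei))  # tumour cells only
print(naive_index(nuclei))      # all detected nuclei
```

In this synthetic example the tumour-restricted index is 20%, whereas counting every detected nucleus would give 40%, which shows why the cell type classification matters more than the counting itself.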
In routine diagnostics, these methods for automated IHC evaluation will only become
widely adopted if the considerable costs of such software solutions are offset by
a tangible gain in quality. Especially for hormone receptors, but also
for Ki67 and HER2, the results are distributed in such a way that an evaluation
focussed on the exact percentage value only achieves therapeutic relevance in a very
small proportion of cases. In the vast majority of cases, the pathologist’s result,
which is classified in increments of 5%, is perfectly adequate. AI-supported diagnostics
in borderline cases is associated with significant time expenditure for the reporting
pathologist. At present, neither the additional time involved for the pathologist
nor the investment and maintenance costs for the AI programmes are funded at any point
in the care provision chain.
Detection of specific target structures (nodal metastases, microcalcifications)
Different organ systems have different requirements for the detection of small structures
which are difficult to identify in overview magnification. In breast pathology, these
challenges include the detection of microcalcifications in vacuum-assisted biopsies
and surgical specimens. These are difficult to recognise because histological
sections are essentially two-dimensional; as a result, microcalcifications can be
distributed across several step sections at different locations. Detection can be improved
by staining an additional step section with haematoxylin alone, as calcifications
usually stand out as more intensely stained structures compared to nuclei. Segmentation
of calcifications in the histological sections, ideally with automated measurement,
would be desirable. Automated microcalcification detection in mammography slowed
radiologists down, especially in the earlier years, by showing them far too many
irrelevant microcalcifications. In histological evaluation, by contrast,
each microcalcification identified in the section is primarily important and must
be documented and correlated. However, oxalate crystals are not visible in the digitized
H&E section; polarisation microscopy has to be used to detect them.
Another diagnostic challenge in breast pathology is the detection of metastases in
lymph nodes. Several step sections have to be examined to detect metastases in sentinel
lymph nodes. Detection of metastases of lobular breast cancer is notoriously difficult
because a desmoplastic reaction is usually missing and a dissociated
growth pattern is typically found in the metastatic lesion, too. Another common finding
is marked sinus histiocytosis in the lymph node. In this case, the risk is less that
sinus histiocytosis cells are misinterpreted as metastases of the breast
cancer than that small metastases are overlooked in the presence of
sinus histiocytosis. International competitions are a strong driver of AI innovation in medicine:
annotated data sets are provided, and the time allowed for solving specific
tasks is usually limited to a few months. The detection of lymph node metastases of
breast cancer was the topic of the CAMELYON16 and CAMELYON17 challenges [24]. The particular difficulty of these challenges was that metastasis
detection had to be performed at the level of whole slide images (WSIs). These challenges have significantly
advanced the development of processing routines that enable the integrative processing
of entire WSIs and of several WSIs for one patient. In the CAMELYON17 challenge, the
task was to evaluate 5 WSIs of lymph nodes, grouped into virtual patients,
with an automated routine at slide level and at patient level, and to provide a nodal
stage as an integrating diagnosis. The best algorithm classified 86.6% of WSIs correctly;
however, the incorrectly classified slides included 10 micrometastases and
4 macrometastases that were not recognized.
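The step from slide-level results to a patient-level nodal stage can be sketched as a simple aggregation rule. The code below assumes the simplified pN staging rules that, to our understanding, were used in the CAMELYON17 challenge (pN0, pN0(i+), pN1mi, pN1, pN2 derived from five slide labels); these rules are not stated in the text above, the label strings and the `pn_stage` function are hypothetical, and the sketch is not a substitute for full TNM staging.

```python
from collections import Counter

# Slide-level labels assumed for this sketch.
VALID_LABELS = {"negative", "itc", "micro", "macro"}


def pn_stage(slide_labels):
    """Aggregate slide-level lymph node labels into a simplified pN stage.

    Assumed rules (simplified, challenge-style):
      pN0      no ITC, micro- or macrometastases
      pN0(i+)  isolated tumour cells (ITC) only
      pN1mi    micrometastases but no macrometastases
      pN1      metastases in 1-3 nodes, at least one macrometastasis
      pN2      metastases in 4-9 nodes, at least one macrometastasis
    """
    unknown = set(slide_labels) - VALID_LABELS
    if unknown:
        raise ValueError(f"unknown slide labels: {sorted(unknown)}")

    counts = Counter(slide_labels)
    n_metastatic = counts["micro"] + counts["macro"]

    if counts["macro"] == 0:
        if counts["micro"] > 0:
            return "pN1mi"
        if counts["itc"] > 0:
            return "pN0(i+)"
        return "pN0"
    return "pN1" if n_metastatic <= 3 else "pN2"


# One virtual patient with five lymph node slides.
print(pn_stage(["macro", "micro", "negative", "negative", "itc"]))
```

Because the patient-level stage depends on every slide, a single missed macrometastasis can change the result, which is why the slide-level errors reported for the best algorithm matter clinically.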
Grading assistance by AI
The grading of breast cancer has an extremely important prognostic and stratifying
function. This applies to both invasive and pre-invasive lesions. Since the 1990s,
this procedure has been part of histopathological routine diagnostics. Invasive breast
cancer is graded by adding the scores for three criteria reflecting architectural,
cytological and functional features [5]. This is different from other entities. In prostate cancer, where grading has a
similarly strong prognostic significance, grading is based solely on the tissue architecture
of the cancer. In breast cancer, the architectural feature assessed is the proportion
of the tumour area showing tubular (and cribriform) differentiation. As the cytological criterion,
nuclear pleomorphism is stratified, and as the functional feature, proliferation
activity is quantified as the number of mitoses per tumour area. In core needle biopsy
diagnostics of breast cancer, the mitosis criterion is increasingly supplemented by
the growth fraction determined by Ki67 immunohistochemistry, which allows a more accurate
and definite classification at the boundary between G2 and G3.
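The additive principle of the grading described above can be written down in a few lines. The sketch below assumes the commonly used Elston and Ellis cut-offs for the total score (3 to 5 points: grade 1; 6 to 7 points: grade 2; 8 to 9 points: grade 3); these thresholds are standard but are not spelled out in the text above, and the function name is hypothetical.

```python
def histological_grade(tubule_score: int, pleomorphism_score: int, mitosis_score: int) -> int:
    """Combine the three component scores (1-3 points each) into a grade.

    Assumed cut-offs for the total score:
      3-5 points -> grade 1, 6-7 points -> grade 2, 8-9 points -> grade 3.
    """
    scores = (tubule_score, pleomorphism_score, mitosis_score)
    for score in scores:
        if score not in (1, 2, 3):
            raise ValueError("each component score must be 1, 2 or 3")

    total = sum(scores)
    if total <= 5:
        return 1
    if total <= 7:
        return 2
    return 3


# A tumour with few tubules (3), moderate pleomorphism (2) and
# an intermediate mitotic count (2) has a total score of 7.
print(histological_grade(3, 2, 2))
```

The example also shows how sensitive the result is at the class boundaries: raising the mitosis score from 2 to 3 moves the same tumour from grade 2 to grade 3.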
The use of artificial intelligence to support grading promises mathematical accuracy,
reproducibility and objectivity. In addition, computer-based diagnostics enable in
principle a diagnostic evaluation across the entire tumour at the highest magnification;
thus, it is better suited to show intratumoural heterogeneity. Using AI-based semantic
segmentation, it is possible to detect the tumour epithelium virtually in real time
and, with object segmentation, several tens of thousands of tumour nuclei. With subsequent
image processing routines, numerous measurement parameters can be recorded, and
the results can be aggregated into a graph representation. The parameters that can be
measured automatically include maximum and minimum nuclear diameter, nuclear area
and circumference, eccentricity and contour features as well as the orientation of
the main axis. In addition, the position in the image can, for example, be determined
as coordinates of the geometric centre of gravity. [Fig. 1] and [Fig. 2] show, as an example, the evaluation of the nuclear morphology of 2 breast cancers,
using a self-developed processing routine, and illustrate the dilemma in the implementation
of these detailed measurement results in the diagnostic routine: The three-stage classification
of nuclear pleomorphism is based on the comparison of the tumour cell nuclear area
with the area of normal epithelial nuclei as a means of normalizing hormonally induced
functional differences. The class limits were defined at 1.5-fold and 2.0-fold difference,
so that a score of 3 points is awarded for a nuclear pleomorphism with at least twice
the area compared to normal epithelial nuclei, and a score of 2 points for tumour
cell nuclei with 1.5 to 2 times the area. More than 30 years ago, this was likely
set as a rough guideline for operationalizing the term “nuclear pleomorphism” and
was not based on systematic measurements. However, exact computer-aided measurements
of cancer cell nuclei show such a high variability of the nuclear areas, and also
of the other measured variables, that these initial definitions cannot be applied
in practice. As shown by
this special problem, various definitions in pathology must be further operationalized
before AI can be meaningfully integrated into the reporting process; doing so will
enable pathologists to keep pace with the possibilities of objective and simultaneous
measurement of numerous parameters.
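The dilemma can be made concrete with a minimal sketch that applies the 1.5-fold and 2.0-fold class limits to every measured nucleus instead of to a single overall impression. Using the median area of normal epithelial nuclei as the reference, assigning a ratio of exactly 1.5 to score 2, and all area values in the example are assumptions made for illustration only; they are not measurements from the cases shown in the figures.

```python
from statistics import median


def pleomorphism_score(area_ratio):
    """Score 1-3 from the ratio of tumour to normal nuclear area."""
    if area_ratio <= 0:
        raise ValueError("area ratio must be positive")
    if area_ratio >= 2.0:
        return 3
    if area_ratio >= 1.5:
        return 2
    return 1


def score_distribution(tumour_areas, normal_areas):
    """Fraction of tumour nuclei that fall into each pleomorphism class."""
    reference = median(normal_areas)  # assumed normalisation value
    counts = {1: 0, 2: 0, 3: 0}
    for area in tumour_areas:
        counts[pleomorphism_score(area / reference)] += 1
    return {score: count / len(tumour_areas) for score, count in counts.items()}


# Hypothetical nuclear areas (arbitrary units), for illustration only.
normal = [38.0, 40.0, 42.0]
tumour = [44.0, 56.0, 62.0, 70.0, 78.0, 84.0, 96.0, 120.0]
print(score_distribution(tumour, normal))  # {1: 0.25, 2: 0.375, 3: 0.375}
```

Even in this small example the nuclei of one tumour are spread across all three classes, so a case-level score requires an additional, explicitly defined aggregation rule (for example a percentile or a minimum fraction of nuclei), which the original class limits do not provide.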
Fig. 1 AI-supported analysis of the area, diameter and circumference of tumour and stromal
nuclei in breast cancer with nuclear grade 1. The prototype software demonstrated
here contains several modules for the automated measurement and data aggregation of
nuclei from cancer cells and stromal cells. Based on whole slide images (here from
the TCGA dataset [25]), the software analyses the regions of interest and automatically selects representative
sub-regions that are analysed in detail. These are marked with green squares in the
overview magnification (A). In B, the enlargement of a subregion is shown. The subregion is analysed with a neural
network for epithelial segmentation (D) and a network for instance segmentation of the nuclei (C [26] [27] [28]). The nuclei are measured using image processing routines; the compartment to which
they belong can be determined from the position of the centroid in comparison with
the epithelial segmentation. The data (on average approx. 25 000 nuclei/case) are
output graphically (E: nuclear area, F: nuclear diameter, G: nuclear circumference). The entire processing routine takes approx. 50 seconds to
complete.
Fig. 2 AI-supported analysis of the area, diameter and circumference of tumour and stromal
nuclei in breast cancer with nuclear grade 3. The prototype software demonstrated
here contains several modules for the automated measurement and data aggregation of
nuclei from cancer cells and stromal cells. Based on whole slide images (here from
the TCGA dataset [25]), the software analyses the regions of interest and automatically selects representative
sub-regions that are analysed in detail. These are marked with green squares in the
overview magnification (A). In B, the enlargement of a subregion is shown. The subregion is analysed with a neural
network for epithelial segmentation (D) and a network for instance segmentation of the nuclei (C [26] [27] [28]). The nuclei are measured using image processing routines; the compartment to which
they belong can be determined from the position of the centroid in comparison with
the epithelial segmentation. The data (on average approx. 25 000 nuclei/case) are
output graphically (E: nuclear area, F: nuclear diameter, G: nuclear circumference). The entire processing routine takes approx. 50 seconds to
complete.
Prediction of prognostic and predictive markers on the basis of H&E histology
In cancer diagnostics, an astonishing amount of activity in recent years has focused
on using artificial intelligence to predict prognostic and predictive factors on the
basis of the H&E section. In breast cancer, these factors are hormone receptors and
the HER2 status in particular. In Germany, the period between the initial reporting
of an H&E section and the immunohistochemical determination of oestrogen and progesterone
receptors, HER2/neu and Ki67 amounts to no more than a few working days. Technically,
organisationally and without loss of quality, it would in principle also be possible
to complete the conventional reporting in the morning and the determination, evaluation
and reporting of the above-mentioned factors in the afternoon of the same day. In
structured health care programmes, such as the mammography screening, the completion
of histological diagnostics within one week until the next multidisciplinary meeting
is common practice. The fact that in other countries such a tightly timed diagnostic
regime is rather unusual makes the desire for having predictive information already
available at the time of reporting the H&E section understandable. Here, the HER2
status is of particular interest, as in the event of an inconclusive immunohistochemical
result (HER2-Score 2+), it is necessary to carry out in situ hybridisation, which
is considerably more time-consuming than immunohistochemistry. The international HEROHE
Challenge (HER2 on HE) addressed this issue in 2021/22 [29]. A total of 25 valid models were submitted and approved for evaluation. Even in
the top group, however, the quality of the predictions can, on sober reflection, at best
be described as moderately good. This is hardly surprising: one could
not expect that the HER2 positivity of a tumour is associated with specific or even
defining morphological characteristics in the image or that the information on HER2
positivity is hidden somewhere in the pixel noise. The model submitted by the participant
with the highest ranking achieved a precision of 0.75 and a recall of 0.84, with
a receiver operating characteristic area under the curve (ROC AUC) of 0.84. In
other words, 1 in 4 predictions of HER2 positivity is incorrect, and about 1
in 6 HER2-positive cases is not recognised. Unfortunately, these figures show the
unsuitability of such an approach for patient-specific diagnostics and the indispensability
of direct determination of predictive factors using established methods. Due to the
lack of precision and insufficient recall, such an approach is also unsuitable for
potential applications in quality assurance or for the selection of patient populations
that potentially do not need to be tested.
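The translation of precision and recall into the "1 in n" statements used above is simple arithmetic and can be checked with a short sketch. The function names are hypothetical; the input values are the precision and recall reported above for the highest-ranking model.

```python
def error_rates(precision, recall):
    """Return (false discovery rate, miss rate) from precision and recall.

    false discovery rate: share of positive predictions that are wrong
    miss rate:            share of truly positive cases that are not found
    """
    if not (0.0 < precision <= 1.0 and 0.0 < recall <= 1.0):
        raise ValueError("precision and recall must lie in (0, 1]")
    return 1.0 - precision, 1.0 - recall


def one_in_n(rate):
    """Express a rate as a rounded '1 in n' figure."""
    if rate <= 0.0:
        raise ValueError("rate must be positive")
    return round(1.0 / rate)


false_discovery_rate, miss_rate = error_rates(precision=0.75, recall=0.84)
print(f"wrong positive predictions: 1 in {one_in_n(false_discovery_rate)}")  # 1 in 4
print(f"missed positive cases:      1 in {one_in_n(miss_rate)}")  # 1 in 6
```

Note that the first figure refers only to cases predicted as HER2-positive, whereas the second refers to cases that are truly HER2-positive; neither describes the error rate across all cases.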
Yet it is still possible that other molecular parameters have a closer correlation
with conventional histology. For example, a certain correlation with conventional
H&E morphology has recently been shown for BRCA1/2 mutation status in breast cancer
[30]. Even if the correlation in this case is far from perfect, an AI-generated prediction
of the mutation status could be beneficial in particular to those patients for whom
a corresponding mutation analysis is not considered often enough, e.g., patients with
hormone receptor-positive breast cancer.
Comprehensive AI-supported diagnostics
Histopathological evaluation is the key step in the diagnosis of cancer: Pre-operative
clinical and imaging diagnostics are regularly followed by pre-operative histopathological
diagnostics; the treatment modalities and the treatment sequence then primarily follow
the various individual histopathological aspects of the tumour. Accordingly, there
is a need for the diagnostics to offer the greatest possible legal certainty, in addition
to the medical, technical and quality assurance aspects. Modern pathology meets this
legal aspect in particular by relying on criteria-based diagnostics. Diagnosis is
a systematic and step-by-step process in which defining criteria are established for
entities and differential diagnostic boundaries are drawn between similar entities.
This is done on the basis of qualitative and quantitative criteria for H&E histology
which are further refined by immunohistochemical and molecular pathological criteria.
Accordingly, the training of pathologists is focused on recognising deviations from
the norm as a region of interest in a first step, describing and analysing these changes
using predefined criteria, and then deriving a well-founded, rule-based diagnosis
in a final step. The diagnosis is established with full understanding of the clinical
consequences. It is virtually impossible to replicate such a systematic approach with
the currently available AI methods. Using annotated data sets, neural networks are
trained towards a desired outcome. The criteria that a neural network finds during
training are a pure matrix of numbers and cannot be influenced by canonical knowledge
of diagnostic criteria. The numerous explainable AI approaches have also not yet been
able to adequately evaluate whether and how a trained neural network applies the desired
criteria. Complex diagnostic tasks can be performed using AI by breaking them down
into clearly defined individual tasks of low complexity. Neural networks can be trained
for each individual task, and their intermediate results can be checked, as already
described above for nuclear grading ([Fig. 1] and [Fig. 2]). The complexity of this approach is similar to that associated with the implementation
of fully autonomous driving which is based on the carefully orchestrated interaction
of numerous neural networks for information abstraction of the traffic environment
and the obligatory software implementation of the traffic rules as a so-called traffic
rule engine to control the individual agents of the software algorithm. Given its
complexity and limited market potential, it is questionable whether pathology can
be modelled with rule engines in such comprehensive expert systems.
Consultation and reference pathology
The purpose of consultation pathology is to obtain a second opinion from a specialised
pathologist if there is uncertainty about a diagnosis or if the question is particularly
complex. Breast cancer is the most common carcinoma in women. Consequently, evaluating
breast tissue specimens is part of the daily routine of many pathologists.
The German Mammography Screening Program (MSP) provided proof that an obligatory second-opinion
or reference evaluation of breast tissue specimens is unnecessary. In analogy to mammography
reporting, a second-opinion evaluation of all core needle biopsies by reference pathologists
was required during the initial phase of the German MSP, i.e., during the first 2
years after the start of the screening programme. At that time, this requirement for
screening pathologists was unique worldwide. Three of the five German reference
centres involved then pooled and analysed their data. Based on almost 10 000 duplicate evaluations,
a concordance rate of 94% to 98% was found. For carcinomas, the degree of concordance
between the first and second evaluating pathologist was above 99% [31]. Somewhat lower concordance rates were found for lesions with unclear biological
potential (B3/B4), which are significantly less common. These borderline categories
include a wide variety of lesions such as spindle cell lesions, phyllodes tumours,
papillary lesions, and atypical intraductal proliferations. With certain lesions,
such as atypical ductal hyperplasia (ADH), the diagnostic concordance is only moderate
– even between experts [32]. Thus, the increased interobserver variability noted in these cases should not be
interpreted as an indication of inadequate qualification of the pathologists, but
rather of suboptimal objectifiability and reproducibility of the available diagnostic
criteria, as also suggested by the results of the diagnostic Round Robin tests in
the UK [33] [34]. Based on the results presented above, the second opinion evaluation by pathologists
was limited to the first 50 cases in the German MSP. At the same time, the German
MSP, as a structured care programme, goes one step further with regard to quality
assurance. Participation in the MSP requires regular specialist continuing education
for pathologists, too, which is interlinked with interdisciplinary continuing education.
The fact that in B3 lesions, despite the methodological challenges, the concordance
rate between the first and the second evaluating pathologist in Germany, at 75%–90%,
is quite high by international standards may be attributable to this continuing education
effort.
Thus, obtaining a second opinion or reference evaluation is limited to specific questions
in breast pathology. The most common request is the classification of unusual and
rare changes. Other reasons for obtaining a second opinion include discrepancies between
clinical findings and primary pathology evaluation and the wish of a patient to hear
a second opinion before treatment is started. A separate field is reference pathology
for studies which is dedicated to establishing and standardising new histomorphological
parameters or molecular biomarkers, which can then be established and used in a decentralised
manner, too [35].
The requirements for a second opinion or reference pathology thus comprise specialist
expertise with experience in the evaluation of complex, challenging cases, access
to modern diagnostic methods, certification or accreditation with regular participation
in external quality assurance measures, and timely reporting.
Conclusions
- Given the comparatively broad spectrum of functional, reactive and neoplastic changes
  in the breast, the systematic analysis of the architectural and cytological characteristics
  of a lesion in order to validly classify it on the basis of reproducible criteria
  poses a particular challenge in breast pathology.
- Recognising and documenting all tumour characteristics that are relevant to clinical
  management is part of the scope of pathology.
- Pathology reports should be written comprehensibly, completely and quickly.
- The utilisation of structured protocols to document findings supports this and facilitates
  international comparability.
- There are hopes that the increasing digitalization of pathology and the use of artificial
  intelligence (AI) will speed up the preparation and reporting process in pathology
  and make diagnostics more objective. However, apart from the lack of transparency
  of AI-generated decisions, technical and financial limitations still need to be overcome
  before using AI can really contribute to faster and better reporting in pathology.