Keywords MRI - segmentation - kidney
Abbreviations
AKI: Acute kidney injury
ASL: Arterial spin labeling
BOLD: Blood oxygenation level-dependent
CKD: Chronic kidney disease
COST: European Cooperation in Science and Technology
D: Diffusion coefficient
DWI: Diffusion-weighted imaging
EPI: Echo-planar imaging
FLASH: Fast low angle shot
FOCI: Frequency offset corrected inversion
f: Perfusion fraction
ICC: Intraclass correlation coefficient
IVIM: Intravoxel incoherent motion
mGRE: Multi-echo spoiled gradient echo
MOLLI: Modified Look-Locker inversion recovery
MRI: Magnetic resonance imaging
PCASL: Pseudo-continuous arterial spin labeling
RBF: Renal blood flow
RC: Repeatability coefficient
ROI: Region of interest
TE: Echo time
TR: Repetition time
VFA: Variable flip angle
wCV: Within-subject coefficient of variation
wSD: Within-subject standard deviation
Introduction
Multiparametric quantitative magnetic resonance imaging (MRI) is gaining attention
in research but still needs to prove its role in clinical kidney diagnostics. As a
noninvasive modality without radiation exposure and no need for potentially nephrotoxic
contrast agents, MRI is especially attractive for renal imaging. A broad range of
MRI techniques has been reported to be useful for the assessment of structural and
functional information of the kidneys. Renal perfusion, which is a critical element
in the development of various kidney diseases, such as acute kidney injury (AKI) and
chronic kidney disease (CKD), can be quantified using arterial spin labeling (ASL)
techniques [1]. Diffusion-weighted imaging (DWI) can help to detect microstructural pathologies, caused for instance by fibrosis [2]. Tissue oxygenation also plays an important role in renal pathophysiology and can
be depicted by blood oxygenation level-dependent (BOLD) imaging [3]. Other MRI parameters such as T1 and T2 can provide deeper insight into structural
changes in kidney tissue [4]. An overview of the most common MRI techniques for functional imaging of the kidneys
is provided in Table 1.
Table 1 Overview of functional MRI techniques for renal imaging

| MRI technique | MRI measure | Biomarker | Application |
| --- | --- | --- | --- |
| Arterial spin labeling (ASL) | Renal blood flow (mL/100 mL/min) | Tissue perfusion | Renal artery stenosis; kidney transplant dysfunction; acute kidney injury; chronic kidney diseases; renal masses |
| Diffusion-weighted imaging (DWI) | Diffusion (mm²/s) | Tissue diffusion; changes in microstructure due to fibrosis, cellular infiltration, or edema | Kidney transplant dysfunction; acute kidney injury; chronic kidney diseases; renal masses; inflammatory diseases |
| Blood oxygenation level-dependent (BOLD) MRI | T2* map (ms) | Tissue oxygenation; changes in the microstructure of the capillary bed | Renal artery stenosis; kidney transplant dysfunction; acute kidney injury |
| T1 & T2 mapping | T1/T2 relaxation time (ms) | Tissue characterization, changes in molecular environment (water content, fibrosis, inflammation) | Kidney transplant dysfunction; chronic kidney diseases |
Although all these parameters seem to play an important role in different pathologies
of the kidneys, it should be noted that they can only be determined by MRI and there
is no noninvasive gold standard for comparison with the acquired results. For clinical
studies with different patient groups or longitudinal clinical trials, it is, therefore,
even more important to understand the measurement-related variance of the resulting
data. Derived values might also depend on the method used for image analysis.
In the past decades, several research groups have applied multiparametric MRI protocols
to examine kidneys in healthy volunteers and patients with different kidney diseases
[5–20]. Even though repeatability studies have been performed for different MRI parameters,
the diversity of MRI protocols, post-processing, and analysis strategies hinders the
comparability of these studies and thus impedes the final assessment of clinical applicability
[21, 22].
In view of these difficulties, joint recommendations concerning renal MRI protocols
have been developed by a pan-European network of researchers in renal MRI (PARENCHIMA)
funded by the European Cooperation in Science and Technology (COST) to harmonize and
standardize data collection approaches [23]. These recommendations comprise proposals for the composition of MRI protocols,
settings, and readout techniques, but they also point out missing evidence for the
best settings and applications [22, 24–27].
Considering the rising amount of medical imaging, fast image analysis techniques are
gaining importance [28 ]. Automatic segmentation approaches using deep learning are sought, but as they are
still in progress and lack broader availability, manual segmentation remains an essential
analysis technique also used for training data when developing automatic segmentation
tools. To our knowledge, however, there is no study evaluating different manual segmentation
techniques and automatic segmentation for the kidney concerning reproducibility.
Further studies on repeatability, reliability, and validity of multiparametric functional
MRI protocols and analysis strategies for kidney diagnostics pave the way for larger
clinical trials and finally transfer to the clinical setting.
The aim of this study was to evaluate the test-retest repeatability of a multiparametric renal functional MRI protocol,
guided by the PARENCHIMA recommendations, using two established manual image analysis techniques
(manual tissue segmentation and representative region of interest (ROI)-based analysis) as well as
deep learning-based automatic segmentation.
Methods
This study is the first subgroup analysis of a prospectively conducted study. The
study was approved by the local ethics committee and all volunteers gave written informed
consent regarding the examination and the scientific evaluation of their data.
Subjects
Ten healthy volunteers were examined twice with one week between visits. Both examinations
were performed at the same time of day and with identical scanning protocols.
Exclusion criteria were a history of renal or cardiovascular disease,
contraindications for MRI, and implants near the kidney region. Volunteers were asked
to avoid salt- and protein-rich meals and above-average amounts of coffee on the day
of examination, as well as liquids and larger meals in the 2 hours before the MRI scan, but
were instructed to drink a regular amount of liquids throughout the day.
Imaging protocol
Images were acquired using a 3T whole-body MRI system (MAGNETOM Prismafit, Siemens Healthcare, Erlangen, Germany) with an 18-channel matrix-array coil in combination
with 12 channels of a spine-array coil. The examination protocol included several
sequences for functional imaging:
A 3D single-shot pseudo-continuous ASL (PCASL) research sequence with optimized turbo
gradient spin echo (TGSE) readout for contrast-free perfusion imaging [29]. Four non-selective hyperbolic secant inversion pulses in combination with
selective pre-saturation and frequency offset corrected inversion (FOCI) pulses ensured
efficient suppression of the background signal. Ten label-control image pairs were acquired with a labeling duration
of 1500 ms, a post-labeling delay of 1500 ms, and a labeling flip angle of 25°. The
image pairs and an M0 scan were acquired under free breathing.
A diffusion-weighted single-shot echo-planar imaging (EPI) research sequence with
reduced field of view (zoomed) for intravoxel incoherent motion (IVIM) imaging; a
four-directional diffusion mode (4-scan trace) with a monopolar diffusion gradient scheme
and b-values of 0, 10, 30, 50, 70, 100, 150, 200, 400, and 800 s/mm² was applied. Two sets of DWI data were acquired with reversed phase-encoding directions
(head to feet and feet to head) to enable correction of the geometric distortions
resulting from EPI acquisition [30].
A multiple-echo spoiled gradient echo (mGRE) sequence for BOLD imaging. Ten echoes
with TE1 =2.46 ms and ΔTE=4.92 ms were acquired. Images were acquired with navigator triggering
under free breathing.
A 3D variable flip angle (VFA) approach using a volumetric interpolated breath-hold
examination (VIBE) sequence as well as a 2D inversion recovery technique using a modified
look-locker inversion recovery (MOLLI) sequence for T1 mapping.
A 2D T2prep turbo fast low angle shot (FLASH) sequence for T2 mapping. The measurement
consisted of 24 repetitions with a continuously increasing echo time of the preparation
module: TE1 = 8 ms and ΔTE = 8 ms. At the start, a measurement without preparation was performed.
Additionally, 2D T1-weighted GRE and T2-weighted half-Fourier acquisition single-shot
turbo spin echo (HASTE) anatomical images were acquired and 3D T1-weighted volumetric
interpolated breath-hold examination (VIBE) imaging was performed for volumetric analysis.
Details of the imaging protocol are summarized in Table 2. Anatomical images, T1 mapping, and BOLD images were acquired during breath-hold.
IVIM measurements as well as T2 mapping were performed using navigator gating. ASL
data were measured under free breathing conditions. The total scan time was approximately
45 minutes.
Table 2 Overview of the multiparametric MRI protocol. (Resp. comp.: respiratory compensation;
FB: free breathing; NAV: navigated breathing; BH: breath-hold; MBH: multiple breath-holds;
BW: readout bandwidth; TF: Turbo FLASH).

| Parameter | ASL | DWI | BOLD | T1 mapping | T2 mapping | Volumetry |
| --- | --- | --- | --- | --- | --- | --- |
| Sequence | PCASL | IVIM | mGRE | VFA | T2prep TFL | VIBE |
| TR (ms) | 6000–7400 | 1500 | 133 | 3.5 | 5000 | 3.95 |
| TE (ms) | 27.46 | 55.0 | 2.46–46.74 | 1.24 | 2.59 | 1.23/2.46 |
| Flip angle (deg) | 90/180 | 90/180 | 40 | 2, 10 | 8 | 8 |
| BW (Hz/Px) | 3064 | 2004 | 400 | 430 | 490 | 890 |
| Matrix | 96×48×24 | 192×96×24 | 256×205×12 | 256×205×32 | 256×205 | 256×205×32 |
| FOV (mm³) | 380×380×144 | 380×380×72 | 380×380×54 | 380×380×96 | 380×380×8 | 380×380×96 |
| Voxel size (mm³) | 4.0×4.0×6.0 | 1.5×1.5×3.0 | 1.5×1.5×3.0 | 1.5×1.5×3.0 | 1.5×1.5×8.0 | 1.5×1.5×3.0 |
| Resp. comp. | FB | NAV | MBH | BH | NAV | BH |
| Scan time (min) | 4:18 | 5–10 | 3:30 | 0:15 | 3–4 | 0:16 |
Post-processing of MRI data
For ASL data, motion correction was performed retrospectively using 3D elastic registration
software (provided by the manufacturer). Renal blood flow (RBF) maps were then calculated
based on the Buxton model [31] using the expression given in Eq. 1 by Robson et al. [32]. A constant T1 of 1200 ms was assumed for the entire kidney and a T1 of 1600 ms
was used for blood. Other parameters were: arrival time: 750 ms; labeling duration:
1500 ms; post-labeling delay: 1500 ms; inversion efficiency: 0.8; additional inversion
inefficiency from background suppression pulses: 0.75; tissue/blood partition coefficient:
0.9 ml/g.
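To make the role of the listed constants concrete, the following minimal sketch converts a label-control difference signal into a perfusion value. It is an illustration only: it uses the widely used simplified single-compartment PCASL expression, which is an assumption here and is not necessarily identical to Eq. 1 of Robson et al. (that expression additionally involves the tissue T1 and the arrival time). The function name `rbf_pcasl` and its argument names are hypothetical.

```python
import numpy as np

def rbf_pcasl(delta_m, m0, t1_blood=1.6, label_dur=1.5, pld=1.5,
              alpha=0.8 * 0.75, partition=0.9):
    """Simplified single-compartment PCASL perfusion estimate (mL/100 g/min).

    delta_m   : averaged control-minus-label difference signal
    m0        : equilibrium magnetization from the M0 scan
    t1_blood  : T1 of blood in seconds (1600 ms in the text)
    label_dur : labeling duration in seconds
    pld       : post-labeling delay in seconds
    alpha     : inversion efficiency (0.8) times the additional
                background-suppression factor (0.75)
    partition : tissue/blood partition coefficient in mL/g
    """
    delta_m = np.asarray(delta_m, dtype=float)
    m0 = np.asarray(m0, dtype=float)
    # The factor 6000 converts mL/g/s into mL/100 g/min.
    numerator = 6000.0 * partition * delta_m * np.exp(pld / t1_blood)
    denominator = (2.0 * alpha * t1_blood * m0
                   * (1.0 - np.exp(-label_dur / t1_blood)))
    return numerator / denominator
```

Because the expression is linear in the difference signal, any residual motion or noise in the label-control subtraction propagates directly into the RBF map.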
For diffusion data, image processing was performed separately for the left and right kidneys using the FMRIB Software Library (FSL, v. 6.0.7) [33]. In particular, susceptibility-induced geometric distortions were corrected
with FSL topup using the additional DWI data with reversed phase-encoding direction. Moreover, eddy-current
distortions, volume-to-volume movement, and slice-to-volume movement were corrected
with FSL eddy [34]. Diffusion coefficient (D) maps and perfusion fraction (f) maps were calculated
based on the IVIM model [35]. To obtain more stable IVIM parameters, we used a 2-step fitting procedure (also
known as segmented fitting). In the first step, D was estimated from the higher b-values (b = 200,
400, and 800 s/mm²). In the second step, the fitted signal was extrapolated to b = 0 to obtain the intercept Sintercept, and the perfusion fraction was calculated as
f = (S0 − Sintercept)/S0, where S0 is the measured signal at b = 0 [36].
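The two-step (segmented) IVIM fit can be sketched as follows. This is a minimal single-voxel illustration under stated assumptions: a log-linear least-squares fit is assumed for the first step (the fitting algorithm is not specified in the text), and the function name `ivim_segmented_fit` is hypothetical.

```python
import numpy as np

def ivim_segmented_fit(bvals, signal, b_threshold=200.0):
    """Segmented IVIM fit for one voxel; returns (D, f).

    Step 1: fit ln S = ln S_intercept - b * D using only b >= b_threshold,
            where the pseudo-diffusion component is assumed to have decayed.
    Step 2: compare the extrapolated intercept at b = 0 with the measured
            signal S0 at b = 0 to obtain the perfusion fraction f.
    """
    bvals = np.asarray(bvals, dtype=float)
    signal = np.asarray(signal, dtype=float)

    high = bvals >= b_threshold
    slope, intercept = np.polyfit(bvals[high], np.log(signal[high]), 1)
    d = -slope                     # diffusion coefficient in mm^2/s
    s_intercept = np.exp(intercept)

    s0 = signal[bvals == 0].mean() # measured signal at b = 0
    f = (s0 - s_intercept) / s0
    return d, f
```

With the ten b-values of the protocol, only three points (200, 400, and 800 s/mm²) enter the first step, which keeps D stable but makes f sensitive to noise in the single b = 0 measurement.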
For BOLD data, no motion correction was required. T2* maps were calculated using manufacturer
software on the scanner.
For T1 mapping using the VFA approach, B1+ inhomogeneities were corrected using a B1-mapping
scan provided by the manufacturer. In addition, respiratory movements were corrected using
FSL FLIRT image registration [33]. Finally, T1 maps were calculated by linear regression.
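The linear-regression step of VFA T1 mapping can be illustrated with the standard linearization of the spoiled gradient echo signal, S/sin(α) = E1 · S/tan(α) + M0 · (1 − E1) with E1 = exp(−TR/T1). The sketch below assumes this common formulation (the exact implementation used in the study is not described) and handles a single voxel; `vfa_t1` is a hypothetical name, and a B1 correction would enter by scaling the nominal flip angles before the fit.

```python
import numpy as np

def vfa_t1(signals, flip_deg, tr_ms):
    """Estimate T1 (ms) and M0 from variable-flip-angle SPGR signals.

    Uses the linearized signal equation
        S / sin(a) = E1 * S / tan(a) + M0 * (1 - E1),  E1 = exp(-TR / T1),
    so the slope of a straight-line fit yields E1 and hence T1.
    """
    alpha = np.deg2rad(np.asarray(flip_deg, dtype=float))
    s = np.asarray(signals, dtype=float)
    x = s / np.tan(alpha)
    y = s / np.sin(alpha)
    slope, intercept = np.polyfit(x, y, 1)
    t1 = -tr_ms / np.log(slope)
    m0 = intercept / (1.0 - slope)
    return t1, m0
```

With only two flip angles (2° and 10° at TR = 3.5 ms in this protocol) the line is determined exactly by the two points, so noise and residual B1 error map directly into T1.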
For T2 mapping, motion correction was performed before quantitative evaluation using
LAP image registration [37]. Data were fitted to a monoexponential model in a voxel-wise manner.
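A voxel-wise monoexponential fit, S(TE) = S0 · exp(−TE/T2), can be sketched as below. The sketch assumes a log-linear least-squares fit over the prepared images only and assumes preparation echo times of 8, 16, ..., 192 ms; neither the fitting algorithm nor the handling of the unprepared image is specified in the text, and `t2_map` is a hypothetical name.

```python
import numpy as np

def t2_map(te_ms, images):
    """Voxel-wise monoexponential T2 fit.

    te_ms  : 1D sequence of echo times in ms, length n_te
    images : array of shape (n_te, ...) with one image per echo time
    Returns an array of T2 values (ms) with the spatial shape of one image.
    """
    te = np.asarray(te_ms, dtype=float)
    imgs = np.asarray(images, dtype=float)
    n_te = imgs.shape[0]
    # Flatten the spatial dimensions so every column is one voxel.
    log_s = np.log(np.clip(imgs.reshape(n_te, -1), 1e-12, None))
    # np.polyfit fits all columns at once: ln S = ln S0 - TE / T2.
    slope, _intercept = np.polyfit(te, log_s, 1)
    with np.errstate(divide="ignore"):
        t2 = -1.0 / slope
    return t2.reshape(imgs.shape[1:])
```

A log-linear fit is fast but weights low-signal echoes more heavily than a nonlinear least-squares fit, which is one reason the choice of fitting method can affect the resulting values.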
Image analysis
Manual image segmentation
Manual image analysis was performed on a standalone PC by a radiologist with 6 years
of experience in functional MR imaging analysis of the kidneys. For evaluation of
inter-reader agreement, 2 additional readers with several years of experience in renal
research segmented anatomical T1 images.
Two different techniques were applied to manually define the renal components for
the subsequent extraction of functional information. On the one hand, representative
ROIs were located on the cortex and medulla on a central image slice. For each structure,
3 circular ROIs were placed on the superior pole, middle part, and inferior pole on
a single central image slice. On the other hand, the renal cortex and the total kidney
excluding the renal pelvis were segmented manually on all image slices. The medullary
components were calculated from the total kidney and cortex segmentation masks. Both
ROI image analysis and manual segmentation were performed on the NORA Medical Imaging
Platform (University Medical Center Freiburg [38]). ROI and segmentation masks were extracted in the next step for functional analysis.
Fig. 1 gives an example of the masks created on NORA by manual segmentation and placement
of ROIs.
Fig. 1 Example of masks created on anatomical T1-weighted images by (a) manual segmentation of the cortex (right kidney, green) and the total kidney (left
kidney, blue) and (b) placement of ROIs on the cortex (red circles) and medulla (yellow circles).
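Deriving the medulla from the total kidney and cortex masks, and then reading out a parameter map, amounts to simple mask arithmetic. The following sketch assumes binary masks on the same grid as the parameter map and a total kidney mask that already excludes the renal pelvis; the function names are hypothetical.

```python
import numpy as np

def medulla_mask(kidney_mask, cortex_mask):
    """Medulla = voxels inside the total kidney mask but outside the cortex."""
    kidney = np.asarray(kidney_mask, dtype=bool)
    cortex = np.asarray(cortex_mask, dtype=bool)
    return kidney & ~cortex

def mean_in_mask(param_map, mask):
    """Mean of a quantitative parameter map over all voxels of a mask."""
    values = np.asarray(param_map, dtype=float)[np.asarray(mask, dtype=bool)]
    return float(values.mean())
```

For the ROI-based analysis, the same `mean_in_mask` readout could be applied to each of the three circular ROIs per compartment before averaging.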
Automatic segmentation
We based the configuration of the segmentation model on the nnU-Net framework [39]. A 5-level 2D U-Net architecture with deep supervision was chosen.
The outputs of the three highest resolutions in the decoder were used to form the
final segmentation mask. The input patch size was 192×128. The first
encoding level had 32 convolutional kernels, and the number of kernels was doubled after each downsampling step
up to a maximum of 320 at the bottleneck. The decoder's kernel count mirrored
that of the encoder. Leaky ReLU with a negative slope of 0.01 and batch normalization were applied
after every convolution.
We ran a four-fold cross-validation. The training loss was the sum of the
Dice loss and the cross-entropy loss and operated on 2 class labels (left
and right kidney); it was calculated at the full-resolution output and at the auxiliary outputs
of lower resolution. To help the model learn transformation-invariant features, data augmentation
was applied on the fly during training, including
rotation, cropping, scaling, additive brightness and contrast changes,
and elastic transformations. Training was conducted with stochastic gradient descent
with an initial learning rate of 0.01, decaying with a polynomial schedule [40], and a Nesterov momentum of 0.99. It ran for a total of 1000 epochs with a batch
size of 8, where one epoch was defined as 500 iterations. The Dice score on the current
validation set was used to monitor the training progress.
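The polynomial learning-rate decay can be written in a few lines. In this sketch the exponent of 0.9 is an assumption (it is the value commonly used with nnU-Net and is not stated in the text), and `poly_lr` is a hypothetical helper, not a function of the framework.

```python
def poly_lr(epoch, max_epochs=1000, initial_lr=0.01, exponent=0.9):
    """Polynomially decayed learning rate for a given epoch index."""
    return initial_lr * (1.0 - epoch / max_epochs) ** exponent

# Training budget implied by the settings in the text.
ITERATIONS_PER_EPOCH = 500
TOTAL_ITERATIONS = 1000 * ITERATIONS_PER_EPOCH   # 500,000 gradient steps
TOTAL_PATCHES = TOTAL_ITERATIONS * 8             # with a batch size of 8
```

The schedule starts at 0.01, falls to roughly half of that value at epoch 500, and reaches zero at the final epoch.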
Statistical analysis
Mean values were obtained from the 3 ROIs placed on the cortex and medulla of each
kidney. To assess agreement of test-retest measurements, the repeatability coefficient
(RC) and the within-subject coefficient of variation (wCV) were calculated based
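on the within-subject standard deviation (wSD) as:
$$\mathrm{wSD} = \sqrt{\frac{\sum_{i=1}^{n} d_i^{2}}{2n}}, \qquad \mathrm{RC} = 1.96\,\sqrt{2}\;\mathrm{wSD}$$
and
$$\mathrm{wCV} = \sqrt{\frac{\sum_{i=1}^{n} \left(d_i / m_i\right)^{2}}{2n}} \times 100\%$$
where d_i and m_i are the difference and mean values of the scan-rescan measurements of subject i and n is the number of subjects. Additionally, the intra-class correlation coefficient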
(ICC) was calculated, and Bland-Altman plots were used to evaluate the agreement between
test-retest scans with limits of agreement calculated as the mean difference ±1.96
SD of difference. Scatterplots were generated to visualize agreement between test-retest
scans comparing different renal compartments and image analysis. A paired t-test was
used to analyze the difference between segmentation strategies, kidney sides, and
renal compartments with p<0.05 considered statistically significant. Moreover, we
also analyzed the repeatability of the ratio of each parameter of the right and left
kidney. The Dice score was used to determine the inter-reader agreement between the
3 readers for manual tissue segmentation of anatomical T1 images and the accuracy
of automatic segmentation.
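The repeatability measures can be computed from paired test-retest values as sketched below. The sketch assumes the common two-measurement definitions wSD = sqrt(Σd²/(2n)), RC = 1.96·√2·wSD, and wCV = sqrt(Σ(d/m)²/(2n))·100%, which may differ in detail from the exact expressions used in the study; the function name `repeatability` is hypothetical.

```python
import numpy as np

def repeatability(test, retest):
    """Test-retest agreement measures for paired measurements.

    d: per-subject difference, m: per-subject mean, n: number of subjects.
    Returns wSD, RC, wCV (%), mean difference, and Bland-Altman limits.
    """
    test = np.asarray(test, dtype=float)
    retest = np.asarray(retest, dtype=float)
    d = test - retest
    m = (test + retest) / 2.0
    n = d.size

    wsd = np.sqrt(np.sum(d ** 2) / (2 * n))
    rc = 1.96 * np.sqrt(2.0) * wsd
    wcv = 100.0 * np.sqrt(np.sum((d / m) ** 2) / (2 * n))

    mean_diff = d.mean()
    sd_diff = d.std(ddof=1)          # sample SD of the differences
    limits = (mean_diff - 1.96 * sd_diff, mean_diff + 1.96 * sd_diff)
    return {"wSD": wsd, "RC": rc, "wCV": wcv,
            "mean_diff": mean_diff, "limits_of_agreement": limits}
```

Under these definitions RC is about 2.77 times wSD, so a difference between two measurements in the same subject larger than RC is unlikely to be explained by measurement variability alone.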
Analyses were carried out using MATLAB (The MathWorks, Natick, MA) and SPSS (IBM Corp.,
IBM SPSS Statistics, Version 27.0, Armonk, NY) software.
Results
Ten healthy volunteers (age range between 19 and 41, 5 female participants) were successfully
examined. Table 3 provides an overview of all calculated repeatability measures. Fig. 2 depicts kidney images of a healthy volunteer examined with the multiparametric MRI
protocol.
Table 3 Overview of the test-retest repeatability evaluation of the multiparametric functional
MRI protocol. MOS: manual organ segmentation; AOS: automatic organ segmentation; ROI:
region of interest; SD: standard deviation; RC: repeatability coefficient; wCV: within
subject coefficient of variation; ICC: intraclass correlation coefficient; D: diffusion
coefficient; f: perfusion fraction.
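
| Parameter | Unit | Segmentation | Region | Baseline mean | Baseline SD | Follow-up mean | Follow-up SD | Bias (%) | RC | wCV (%) | ICC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| RBF | mL/100 mL/min | MOS | Total kidney | 190.12 | 33.50 | 179.38 | 36.25 | 5.25 | 91.92 | 17.96 | 0.14 |
| RBF | mL/100 mL/min | MOS | Cortex | 224.67 | 48.06 | 216.80 | 38.77 | –1.00 | 122.09 | 19.97 | 0.00 |
| RBF | mL/100 mL/min | AOS | Total kidney | 230.59 | 50.35 | 208.02 | 39.58 | 11.85 | 109.64 | 18.05 | 0.33 |
| RBF | mL/100 mL/min | ROI | Cortex | 167.41 | 40.18 | 140.45 | 36.81 | 23.60 | 86.46 | 20.28 | 0.49 |
| RBF | mL/100 mL/min | ROI | Medulla | 199.45 | 33.79 | 180.63 | 33.66 | 12.40 | 89.20 | 16.95 | 0.22 |
| D | 10⁻⁶ mm²/s | MOS | Total kidney | 1565.94 | 60.74 | 1558.67 | 62.22 | 0.38 | 123.53 | 2.85 | 0.49 |
| D | 10⁻⁶ mm²/s | AOS | Total kidney | 1563.94 | 56.96 | 1563.56 | 61.60 | –0.05 | 128.17 | 2.96 | 0.40 |
| D | 10⁻⁶ mm²/s | ROI | Cortex | 1575.21 | 108.32 | 1598.54 | 109.34 | –1.81 | 242.27 | 5.51 | 0.38 |
| D | 10⁻⁶ mm²/s | ROI | Medulla | 1490.91 | 81.33 | 1515.42 | 93.92 | –1.86 | 213.57 | 5.13 | 0.27 |
| f | % | MOS | Total kidney | 17.94 | 3.52 | 17.94 | 3.69 | –2.32 | 8.46 | 17.02 | 0.29 |
| f | % | AOS | Total kidney | 17.83 | 3.39 | 18.07 | 3.39 | –3.51 | 7.67 | 15.43 | 0.34 |
| f | % | ROI | Cortex | 19.41 | 4.89 | 18.58 | 5.95 | 1.68 | 12.16 | 23.12 | 0.37 |
| f | % | ROI | Medulla | 18.20 | 6.72 | 16.17 | 6.59 | –10.22 | 17.80 | 37.39 | 0.11 |
| T2* map | ms | MOS | Total kidney | 51.05 | 3.07 | 51.43 | 3.91 | –0.77 | 5.24 | 3.69 | 0.72 |
| T2* map | ms | MOS | Cortex | 56.41 | 3.79 | 56.05 | 4.26 | 0.52 | 6.16 | 3.96 | 0.71 |
| T2* map | ms | MOS | Medulla | 44.50 | 3.23 | 45.20 | 4.92 | –1.74 | 9.45 | 7.61 | 0.35 |
| T2* map | ms | AOS | Total kidney | 51.00 | 2.91 | 51.29 | 4.18 | –0.58 | 5.97 | 4.22 | 0.65 |
| T2* map | ms | ROI | Cortex | 57.22 | 7.11 | 56.27 | 9.85 | 0.58 | 20.96 | 13.33 | 0.24 |
| T2* map | ms | ROI | Medulla | 43.62 | 8.10 | 44.56 | 7.70 | –4.62 | 12.43 | 10.18 | 0.69 |
| T1 map | ms | MOS | Total kidney | 1672.63 | 101.53 | 1697.44 | 60.51 | –1.78 | 187.07 | 4.01 | 0.39 |
| T1 map | ms | MOS | Cortex | 1543.71 | 101.54 | 1569.04 | 64.45 | –1.92 | 162.11 | 3.76 | 0.56 |
| T1 map | ms | MOS | Medulla | 1787.67 | 124.32 | 1790.49 | 84.40 | –0.50 | 209.60 | 4.23 | 0.51 |
| T1 map | ms | ROI | Cortex | 1521.14 | 200.02 | 1522.07 | 118.23 | –1.59 | 395.99 | 9.40 | 0.25 |
| T1 map | ms | ROI | Medulla | 1966.83 | 225.52 | 1967.67 | 230.63 | –1.21 | 602.11 | 11.05 | 0.10 |
| T2 map | ms | MOS | Total kidney | 89.12 | 6.00 | 97.57 | 6.00 | –1.49 | 14.73 | 5.93 | 0.19 |
| T2 map | ms | ROI | Cortex | 87.38 | 7.80 | 90.23 | 7.00 | –3.85 | 17.87 | 7.26 | 0.31 |
| T2 map | ms | ROI | Medulla | 88.19 | 7.13 | 91.64 | 7.00 | –4.50 | 19.79 | 7.95 | 0.04 |
| Volume | mL | MOS | Total kidney | 102.07 | 20.42 | 100.47 | 20.06 | 0.98 | 18.51 | 6.56 | 0.87 |
| Volume | mL | MOS | Cortex | 41.42 | 12.28 | 40.55 | 7.79 | –3.20 | 21.12 | 18.78 | 0.01 |
| Volume | mL | AOS | Total kidney | 99.11 | 20.03 | 100.01 | 20.23 | –2.09 | 29.00 | 10.51 | 0.74 |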
Fig. 2 Quantitative images of a healthy volunteer examined with a multiparametric functional
MRI protocol: (a) renal blood flow (RBF) derived from ASL (ml/min/100g), (b) T2* map (ms), (c) D map (10⁻⁶ mm²/s), (d) perfusion fraction (f, %), (e) T1 VFA (ms), (f) T2 map (ms).
Segmentation
Manual segmentation of the total kidney volume was performed for all MR parameters.
Manual segmentation of the cortex could be performed for ASL, BOLD, and T1 maps. Medulla
masks could be obtained by subtraction for T2* and T1 maps. Automatic segmentation
of the kidney volume was performed for ASL, DWI, BOLD, and VIBE. ROI analysis for
the cortex and medulla was performed for all parameters.
Test-retest repeatability
Test-retest repeatability of functional MRI measurements varied depending on MR parameters,
kidney compartment, kidney side, and image analysis strategy.
Comparing different functional MR parameters, the best repeatability was achieved
with the diffusion coefficient D derived from DWI (wCV 2.85–5.13%), followed by BOLD (wCV 3.69–10.18%), T1 map (wCV 4.01–11.05%),
and T2 map (wCV 5.93–7.95%), whereas perfusion measurement with ASL and the perfusion
fraction derived from IVIM resulted in considerably lower repeatability (RBF: wCV
17.96–20.28%, f: wCV 17.02–37.39%). The repeatability of total kidney volume measurements with manual
and automated segmentation was moderate, with wCV between 6.51% and 10.51%, and relatively
low for cortex volume by manual segmentation (wCV 18.78%).
In the comparison of kidney compartments, there was no significant difference in repeatability
of MR values of different parameters between the medulla and cortex except for perfusion
measurements with ASL and the perfusion fraction derived from IVIM (p<0.05).
Comparing different image analysis strategies, ROI analysis of the cortex and medulla
showed significantly less repeatability (p<0.05) compared to manual segmentation of
the cortex and medulla in T1 and T2* maps. ROI analysis in RBF maps achieved similar
repeatability results to manual segmentation. There were no significant differences
in quantitative values between automatic segmentation and manual segmentation of the
total kidney across all parameters. Repeatability was slightly better for manual segmentation
in almost all parameters and in volumetry except for the perfusion fraction (f) derived
from IVIM.
There was no significant difference concerning quantitative measurements and repeatability
between the right and left kidney across all parameters. There was also no significant
difference in repeatability between the cortex and medulla except for perfusion measurements
with ASL and IVIM (p<0.05).
Inter-reader agreement
The inter-reader agreement across all 3 readers for manual segmentation of anatomical
T1 images was acceptable for segmentation of the total kidney with Dice scores between
0.79 and 0.86, but considerably lower for segmentation of the cortex with average
Dice scores between 0.66 and 0.76.
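For reference, the Dice score between two binary masks A and B is 2|A ∩ B| / (|A| + |B|). A minimal sketch follows; the function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for this illustration.

```python
import numpy as np

def dice_score(mask_a, mask_b):
    """Dice overlap between two binary masks of identical shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / total
```

Because the denominator is the summed mask size, a boundary shift of one voxel lowers the Dice score much more for a thin structure such as the cortex than for the whole kidney.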
Accuracy of automatic segmentation
Automatic segmentation was applied for VIBE, ASL, DWI, and BOLD. Manual segmentation
served as the ground truth reference. Automatic segmentation using the nnU-Net framework
showed overall acceptable accuracy with Dice scores between 0.86 and 0.92, as displayed
in Fig. 3. The highest segmentation accuracy was achieved with anatomical VIBE and diffusion
maps (0.91), while segmentation of RBF images showed the lowest segmentation accuracy
(0.86). Fig. 4 shows examples of automatic segmentation masks (lower row, red) compared to manually
segmented masks (upper row, green).
Fig. 3 Dice scores across different MR parameters: ASL, DWI, BOLD, and VIBE.
Fig. 4 Comparison of manual (upper row) and automatic organ segmentation (lower row) of the
total kidney across different MR parameters: VIBE, ASL, DWI, and BOLD.
Discussion
In this study, we evaluated a multiparametric functional non-contrast protocol for
renal MRI concerning the test-retest repeatability of parameters when analyzed with manual
segmentation, ROI analysis, and automatic segmentation.
In contrast to preceding studies applying multiparametric renal MRI protocols [17, 41–44], our study comprises several new aspects concerning evaluation of functional MRI
of the kidneys. We applied a broad selection of functional MR parameters inspired
by the PARENCHIMA recommendations, with several new features. We implemented a 3D
single-shot PCASL research sequence for improved SNR and reduced motion artifacts
in perfusion imaging. Our protocol also included a diffusion-weighted single-shot
EPI prototype sequence with reduced field of view (zoomed) for IVIM imaging, where
b-values from 0 to 800 s/mm² were applied and 2 sets of DWI data were acquired with reversed phase-encoding directions
to enable correction of the geometric distortions resulting from EPI acquisition.
Different models exist for DWI, the most common being the monoexponential model with
the measurement of the apparent diffusion coefficient (ADC), the biexponential model
(IVIM), and diffusion tensor imaging (DTI) [26, 45]. Several studies have shown improved representation of the diffusion-weighted signal
in kidneys with IVIM compared to ADC [46–48]. DTI also provides additional information by measuring the directional dependence
(anisotropy) of apparent diffusion in the tissue [49]. In our study, we did not use DTI, since no navigator-triggered acquisition was
provided for this sequence by the manufacturer. Alternative acquisition methods would
have been measurement under free breathing without triggering, which is accompanied
by severe motion artifacts, or the use of a respiratory belt, which did not provide reliable
sequence triggering in our experience. Moreover, the additional acquisition time would
have extended the examination time of our protocol beyond what is practical for possible application in clinical
settings.
Evaluation of our multiparametric functional MRI protocol concerning test-retest repeatability
showed that our study results were in the range of preceding studies [17, 41–44]. As reported in previous studies, repeatability was better for structural measurements
such as T1 and T2 mapping and DWI compared to functional measurements including ASL
and BOLD [41]. The repeatability of RBF and f between test and retest measurements was
the lowest of all evaluated parameters. Both RBF
and f are known to be very sensitive parameters influenced by various physiological changes such
as hydration [50]. It is, therefore, unclear whether the low repeatability is a limitation of the technique
or whether there is a true difference in the perfusion of the kidney between the first
and the second measurement. As recommended [24], we instructed the volunteers to pay attention to sufficient hydration during the
day and to avoid fluids and larger meals 2 h before the examination. Examinations were
conducted at the same time of day for test and retest measurements to minimize physiological
changes due to the circadian rhythm [25]. However, there is no evidence as to whether these arrangements help to reduce artifacts
and improve repeatability. BOLD imaging is known to be sensitive to magnetic susceptibility
artifacts, which we also identified, especially in the region of the left colic flexure,
probably due to intestinal gas. This might have affected the repeatability of BOLD
measurements in the left kidney and resulted in higher overall wCVs compared to results
of the right kidney. Awareness of the range of variation of values due to physiological
changes is crucial for clinical studies to differentiate between physiological and
pathological values. For instance, the median bias between the test and retest measurement
of RBF in our study ranged between 1% and 23% depending on the image analysis technique.
In a study by Buchanan et al. applying a multiparametric MRI protocol for assessing kidney function in
patients with acute kidney injury, differences of approximately
40% to 60% between time points of renal recovery were measured [51]. Repeatability of MRI parameters might differ depending on the technical details
of the applied protocol, which is why test-retest studies for individual study protocols
are crucial.
Besides the evaluation of the repeatability of functional renal MRI parameters, our
study included a comparison between different image analysis strategies for assessing
quantitative MRI results.
Different image analysis strategies posed different challenges. Manual segmentation
of renal compartments was not possible for all MR parameters since several parameter
maps such as diffusion and T2 maps provided no corticomedullary image contrast. If
the corticomedullary contrast is sufficient, the renal cortex can be segmented and
medulla masks can be obtained by subtracting the cortex mask from the total kidney mask, except for
ASL, where the medullary SNR was too low. Inter-reader agreement between 3 readers
performing manual segmentation on anatomical T1 images was acceptable for the segmentation
of the total kidney with Dice scores between 0.79 and 0.86, but poor for segmentation
of the cortex with Dice scores between 0.66 and 0.76. Evaluation of manual segmentation
masks indicated that readers delineate organ and compartment edges differently, in
that they are either more restrictive or extensive in edge definition. The inter-reader
difference in edge definition, therefore, mainly affected the inter-reader agreement
but not the quantitative analysis of MR parameters.
Automatic segmentation of total kidney volume was performed for ASL, DWI, BOLD, and
VIBE. Despite the relatively small amount of training data, the nnU-Net framework yielded
acceptable segmentation accuracy with Dice scores between 0.86 and 0.91. Segmentation
performance for ASL was poorer than for the other investigated image contrasts.
This was likely due to the inherently greater variability in signal intensity and the lower SNR in RBF
maps compared to the other image contrasts. As a limitation, automatic
segmentation of the cortex and medulla was not included in the training.
By applying different image analysis strategies in this study, we evaluated the effect
of image analysis methods on the repeatability of quantitative MR results. For most
of the MRI parameters, our study outcomes showed comparable results concerning repeatability
with manual and automatic segmentation of total kidney volume. ROI analysis in the
cortex and medulla, however, showed significantly lower repeatability in nearly all
MR parameters compared to manual segmentation of the cortex and medulla. One exception
was perfusion measurements with ASL, where repeatability results were relatively low
for all image analysis strategies. ROI analysis has been a common image analysis method
in the past decades, when automatic segmentation was not available yet and fast image
analysis for quantitative results was needed. This image analysis strategy seems to
be a relatively impartial method for easily assessing quantitative image information.
However, it also includes the risk of sampling error. A higher number of ROIs could
reduce the risk of sampling errors but would also diminish the advantage of time efficiency
compared to laborious manual organ segmentation. Manual segmentation proves to be
the most reliable image analysis technique which enables segmentation of all visible
macroscopic structures. It is, however, by far the most laborious and also reader-dependent
image analysis technique, as was indicated by the low inter-reader agreement and low
repeatability of manual cortex segmentation. Therefore, a limitation of this study
was also that manual segmentation and ROI analysis of all parameters was performed
by only one reader. Automatic segmentation based on the nnU-Net framework presented
acceptable segmentation accuracy in our study despite the small data set. Still, it
is unclear how accurate the results would be for corticomedullary differentiation.
Evaluation of different automatic segmentation strategies with various approaches
and larger data sets is needed to further promote automatic segmentation for a broader
application and thereby also support renal MRI research with faster and more efficient
image analysis. Further standardization is needed for both renal MRI protocols and
image analysis strategies to enable multicenter studies and examination of different
renal pathologies and finally to pave the way to clinical application of multiparametric
functional MRI of the kidneys.
Conclusion
Reasonable test-retest repeatability could be achieved with our multiparametric functional
MRI protocol including ASL, IVIM, BOLD, T1 and T2 mapping, and volumetry. For further
evaluation, typical deviations and uncertainties of measured values have to be compared
to disease-related effects. Evaluation of different image analysis strategies concerning
repeatability showed overall superior repeatability of manual segmentation to ROI
analysis of the cortex and medulla, while automatic segmentation of the total kidney
displayed similar repeatability to manual segmentation. Awareness of the repeatability
limits of the applied MR parameters and image analysis techniques is crucial for the
differentiation between physiological and technical variance and pathological results
when it comes to diagnostic imaging in patients with kidney disease. These findings
encourage the development and improvement of image analysis techniques and support
broader application of multiparametric functional MRI for kidney diagnostics and future
clinical studies.