Aims Early detection of Barrett’s neoplasia is challenging. Deep learning (DL) has been proposed
to play a role, and a limited number of recent reports show encouraging results. However,
comparative data on the best methods to develop and implement this technology are lacking.
We aimed to compare two DL models for the detection of Barrett’s neoplasia: a
classical patch-based model and a pixel-based model.
Methods We collected 76 anonymised, histologically confirmed high-definition white light
endoscopy (HDWLE) images from our database, including adenocarcinoma, high-grade dysplasia
(HGD) and low-grade dysplasia (LGD). For the patch-based model, the LeNet-5 architecture
was used. Each image was divided into patches of 48x48 pixels, and each patch was assigned a
confidence score and a label (neoplastic or non-neoplastic). For the pixel-based model, we used
the SegNet architecture. Each pixel in the image was given a label and a confidence score.
Validation was performed using 4-fold leave-one-out cross-validation. The graphics processing
unit used was a GeForce RTX 2080 Ti. Processing speed, global accuracy (how often the model
prediction is correct), F-score (harmonic mean of sensitivity and precision), and IoU
(intersection over union, the overlap between model prediction and expert marking) were
calculated and compared using a paired-samples t-test.
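The three performance metrics defined above can be illustrated on a toy example. The following is a minimal sketch, not the authors' implementation: the function names, the 3x4 toy masks, and the use of nested lists are assumptions made for illustration, and the F-score here is computed per pixel, whereas the abstract reports it for correctly predicted images at an IoU cut-off of 0.5.

```python
def binarise(scores, threshold=0.8):
    """Convert per-pixel confidence scores into 0/1 labels (1 = neoplastic)."""
    return [[1 if s >= threshold else 0 for s in row] for row in scores]


def pixel_metrics(pred, truth):
    """Compare a predicted binary mask with an expert-marked binary mask."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1          # predicted neoplastic, marked neoplastic
            elif p and not t:
                fp += 1          # predicted neoplastic, marked normal
            elif t:
                fn += 1          # predicted normal, marked neoplastic
            else:
                tn += 1          # predicted normal, marked normal

    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + sensitivity
    return {
        # Global accuracy: fraction of pixels labelled correctly.
        "accuracy": (tp + tn) / total,
        # F-score: harmonic mean of sensitivity and precision.
        "f_score": 2 * precision * sensitivity / denom if denom else 0.0,
        # IoU: overlap between prediction and expert marking.
        "iou": tp / (tp + fp + fn) if (tp + fp + fn) else 0.0,
    }


# Hypothetical toy data: model confidence scores and an expert mask.
toy_scores = [
    [0.90, 0.85, 0.10, 0.20],
    [0.95, 0.70, 0.30, 0.10],
    [0.20, 0.10, 0.05, 0.90],
]
toy_truth = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
]

toy_pred = binarise(toy_scores, threshold=0.8)
toy_result = pixel_metrics(toy_pred, toy_truth)
print(toy_result)
```

The example also shows why global accuracy alone can flatter a model: correctly labelled background pixels raise accuracy even when the overlap with the expert marking (IoU) is modest.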
Results Average processing time with the pixel-based model was 33 ms/image, compared with
102.6 ms/image for the patch-based model. At a score threshold of 0.8, the pixel-based and
patch-based models showed mean global accuracy of 88% and 84% (P = 0.00002), IoU of 0.40 and
0.21 (P < 0.0001), and F-score (for correctly predicted images, at IoU 0.5) of 0.81 and 0.69
(P < 0.0001), respectively.
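To relate the reported per-image times to real-time use, they can be converted into an approximate frame rate. This is a minimal sketch under two assumptions that the abstract does not state: that images are processed one at a time, and that the reported times cover the full per-image cost. The function name is illustrative.

```python
def images_per_second(ms_per_image):
    """Convert a per-image processing time in milliseconds to a frame rate."""
    return 1000.0 / ms_per_image


pixel_fps = images_per_second(33.0)    # pixel-based model, 33 ms/image
patch_fps = images_per_second(102.6)   # patch-based model, 102.6 ms/image
speed_ratio = 102.6 / 33.0             # how many times faster the pixel-based model is

print(round(pixel_fps, 1), round(patch_fps, 1), round(speed_ratio, 1))
```

Under these assumptions the pixel-based model handles roughly 30 images per second and the patch-based model roughly 10, a difference of about threefold.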
Conclusions The pixel-based model was significantly faster and performed better than the
patch-based model. Given that average human visual response latency is estimated at 70-100 ms,
these data suggest that our pixel-based model could potentially detect neoplasia faster than
the human eye, making it the better suited of the two for real-time detection. To our
knowledge, this is the first report comparing these two approaches in Barrett’s neoplasia,
and it suggests that all future work should be done with a pixel-based model.