Endoscopy 2020; 52(S 01): S51
DOI: 10.1055/s-0040-1704158
ESGE Days 2020 oral presentations
Thursday, April 23, 2020 14:30 – 16:00 Polyp forensics: Colon advanced Wicklow Meeting Room 3 Imaging 2
© Georg Thieme Verlag KG Stuttgart · New York

AUTOMATED POLYP DETECTION ON CAPSULE ENDOSCOPY USING AN INTEGRATED 2D AND 3D DEEP NEURAL NETWORK

A Bobade 1, S Yi 1, FC Ramirez 2, JA Leighton 2, SF Pasha 2

1 Xyken LLC, McLean, USA
2 Mayo Clinic, Division of Gastroenterology and Hepatology, Scottsdale, USA

Publication Date: 23 April 2020 (online)

Aims We report an integrated convolutional neural network (CNN) framework that utilizes color and depth (3D) information for automated polyp detection on capsule endoscopy (CE). Our goal is to demonstrate the advantages gained by adding 3D information.

Tab. 1 Performance with ImageNet and COCO weights

Weights     Precision (%)   Sensitivity (%)
ImageNet    64.71           82.09
COCO        88.06           88.06

Methods A region-based CNN (R-CNN), which focuses detection on candidate regions of interest, was designed to take both color and depth information into consideration.
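
The abstract does not describe how the color and depth signals are combined inside the network. A minimal sketch of one common approach, early fusion through a four-channel RGB-D input to an ImageNet-pretrained backbone, is given below; the function name, the ResNet-50 choice, and the weight-initialization scheme are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torchvision

def make_rgbd_backbone():
    """Hypothetical early-fusion backbone: widen the stem convolution of an
    ImageNet-pretrained ResNet-50 from 3 (RGB) to 4 (RGB + depth) channels."""
    backbone = torchvision.models.resnet50(weights="IMAGENET1K_V1")
    stem = backbone.conv1  # original 3-channel 7x7 stem convolution
    fused = nn.Conv2d(4, stem.out_channels,
                      kernel_size=stem.kernel_size,
                      stride=stem.stride,
                      padding=stem.padding,
                      bias=False)
    with torch.no_grad():
        fused.weight[:, :3] = stem.weight                         # reuse pretrained RGB filters
        fused.weight[:, 3:] = stem.weight.mean(1, keepdim=True)   # depth channel: mean of RGB filters
    backbone.conv1 = fused
    return backbone

# One synthetic RGB-D frame; in a full R-CNN this backbone would feed
# region-proposal and detection heads rather than the classification head kept here.
frame = torch.randn(1, 4, 224, 224)
out = make_rgbd_backbone()(frame)
```

Initializing the extra depth channel from the mean of the pretrained RGB filters is one way to keep ImageNet (or COCO) features transferable when the input is widened.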

Results The R-CNN backbone was initialized with ImageNet and COCO weights for performance studies. To produce depth information, a 3D depth CNN was designed and trained on the NYU-Depth V2 dataset (1,449 images). To train the R-CNN, 530 CE frames containing polyps, extracted from 120 short de-identified videos, were used. Polyps were annotated with the VGG (Visual Geometry Group) Image Annotator. During network training, 60% of images were augmented (varied in scale and rotation) to ensure that the network saw a different image set at every step. Performance of the trained networks was measured on 55 CE video frames. Table 1 shows detection performance with ImageNet and COCO weights. Average precision and sensitivity were 76.38% and 85.07%, respectively. Some debris, bubbles, and light reflections were inaccurately characterized as polyps.
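
As a concrete illustration of the augmentation step described above, the sketch below applies random scale and rotation changes to roughly 60% of training frames. The 60% rate comes from the abstract; the rotation and scale ranges are assumed values it does not state.

```python
import random
from torchvision import transforms

# Scale-and-rotation augmentation; the ranges below are assumptions.
augment = transforms.Compose([
    transforms.RandomRotation(degrees=30),                # rotate within +/-30 degrees (assumed)
    transforms.RandomResizedCrop(224, scale=(0.7, 1.0)),  # rescale to 70-100% of area (assumed)
])

def maybe_augment(image):
    """Augment a training frame with probability 0.6, else return it unchanged."""
    return augment(image) if random.random() < 0.6 else image
```

In a detection setting the polyp bounding-box annotations would have to be transformed alongside each frame; that bookkeeping is omitted from this sketch.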

Conclusions The R-CNN framework demonstrated marginally improved results over a color-based CNN. Image quality and varying polyp views affected its performance. Since the 3D depth CNN was trained on non-GI tract video frames, the extracted depth information did not meet the accuracy level required for endoscopic application. Future work includes the use of an accurate 3D training dataset obtained from close-range images to improve the 3D depth CNN and, in turn, the performance of the R-CNN.