Endoscopy 2020; 52(S 01): S235
DOI: 10.1055/s-0040-1704735
ESGE Days 2020 ePoster Podium presentations
Saturday, April 25, 2020 15:00 – 15:30 ePoster Podium 4: Artificial Intelligence for colonoscopy and small bowel endoscopy
© Georg Thieme Verlag KG Stuttgart · New York

THE USE OF INTEGRATED 2D AND 3D BASED DEEP NEURAL NETWORK AS AN ADJUNCT DETECTION TOOL IN COLONOSCOPY

A Bobade
1   Xyken LLC, McLean, USA
,
S Yi
1   Xyken LLC, McLean, USA
,
JA Leighton
2   Mayo Clinic, Division of Gastroenterology and Hepatology, Scottsdale, USA
,
S Pasha
2   Mayo Clinic, Division of Gastroenterology and Hepatology, Scottsdale, USA
,
FC Ramirez
2   Mayo Clinic, Division of Gastroenterology and Hepatology, Scottsdale, USA

Publication History

Publication Date:
23 April 2020 (online)


Aims Deep convolutional neural networks (CNNs) have been studied for identifying lesions in colonoscopy videos using colored 2D images, but the benefit of adding a 3D-based CNN is unknown. The aim of this study was to assess the utility of an integrated CNN framework that uses both color (2D) and depth (3D) information for automated lesion detection.

Tab. 1 Performance with ImageNet and COCO weights

Weights     Precision (%)   Sensitivity (%)
ImageNet    73.29           82.39
COCO        83.16           82.67

Methods We used a 3D depth CNN, trained on the NYU-Depth V2 dataset (1449 images), to produce depth information for each color image obtained from colonoscopy videos. A region-based CNN (R-CNN), focusing on regions of interest (lesions) and initialized with ImageNet and COCO weights for feature extraction, was designed to take both color and depth information for lesion detection.
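
The abstract does not include implementation details; the following is a minimal sketch of the color-plus-depth fusion idea, assuming a PyTorch setup. All module names and layer sizes are hypothetical illustrations, not the authors' implementation: a monocular depth CNN predicts a depth map from the RGB frame, and the three color channels plus the predicted depth channel form a four-channel input to the detector's feature extractor.

```python
import torch
import torch.nn as nn

class DepthNet(nn.Module):
    """Toy stand-in for a depth-estimation CNN (e.g. one trained on NYU-Depth V2)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 1, kernel_size=3, padding=1),  # 1-channel depth map
        )

    def forward(self, rgb):           # rgb: (N, 3, H, W)
        return self.net(rgb)          # depth: (N, 1, H, W)

class FusionBackbone(nn.Module):
    """Toy 4-channel feature extractor standing in for the R-CNN backbone."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(4, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
        )

    def forward(self, rgb, depth):
        x = torch.cat([rgb, depth], dim=1)  # stack color + depth: (N, 4, H, W)
        return self.features(x)

rgb = torch.randn(1, 3, 224, 224)     # dummy stand-in for a colonoscopy frame
depth = DepthNet()(rgb)               # predicted depth channel
feats = FusionBackbone()(rgb, depth)  # fused features for region proposals
print(feats.shape)                    # torch.Size([1, 64, 112, 112])
```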

Results We used 612 de-identified colonoscopy frames containing lesions, extracted from 31 image sequences captured from 23 patients, to train the R-CNN. The lesion regions in these frames were annotated using the VGG (Visual Geometry Group) Image Annotator. Sixty percent of the images were augmented (scaled and rotated) during each training iteration so that the CNN did not see an identical image set at every step.
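
A sketch of this per-iteration augmentation, assuming torchvision transforms (the rotation and scale ranges below are illustrative, not taken from the abstract):

```python
from PIL import Image
import torchvision.transforms as T

# Scale-and-rotate jitter applied to a random ~60% of images on each pass.
augment = T.RandomApply(
    [T.RandomAffine(degrees=30, scale=(0.8, 1.2))],  # illustrative ranges
    p=0.6,                                           # ~60% of images
)

frame = Image.new("RGB", (224, 224))  # dummy stand-in for a colonoscopy frame
augmented = augment(frame)
```

Note that in detection training the bounding-box annotations must be transformed together with the image; the transform above acts on the image only.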

The trained 2D/3D framework was evaluated using 552 colonoscopy video frames of different image resolutions. The detection results with ImageNet and COCO weights are shown in Tab. 1; average precision and sensitivity of 78.2% and 82.5%, respectively, were achieved.
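
The reported averages appear to be the unweighted means of the per-weight results in Tab. 1:

```python
precision = (73.29 + 83.16) / 2    # 78.225 -> 78.2 %
sensitivity = (82.39 + 82.67) / 2  # 82.53  -> 82.5 %
```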

Conclusions The proposed integrated 2D/3D CNN framework showed better results than a color-only R-CNN. The results are encouraging, and future work includes using an accurate 3D training dataset obtained from similar imaging ranges to improve the 3D depth CNN performance.

