DOI: 10.1055/s-0045-1809785
SurgiMind: Next-Generation Surgical Image Segmentation Leveraging Transformers for Lung Cancer Surgery
Background To develop and evaluate a transformer-based deep learning model for real-time segmentation of anatomical structures during video-assisted thoracoscopic (VATS) right upper lobe lobectomy in lung cancer patients.
Methods & Materials A retrospective cohort study was conducted using thoracoscopic video recordings from 81 patients who underwent anatomical VATS right upper lobe resection between 2009 and 2024. A total of 1539 frames were extracted and manually annotated for eight anatomical classes: right upper pulmonary vein, azygos vein, right upper lobe bronchus, phrenic nerve, middle lobe vein, A2 segmental artery, truncus anterior, and pulmonary main artery. Three deep learning architectures (U-Net, Fully Convolutional Transformer [FCT], and the novel Surgi-FCT) were trained and evaluated. Surgi-FCT was optimized by removing the Wide Focus layer and increasing the network depth to improve feature extraction and reduce computational overhead. Evaluation metrics included Dice coefficient, Intersection over Union (IoU), and precision, with separate analyses for class-present (CP) and class-absent (CA) scenarios.
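The abstract reports Dice, IoU, and precision with separate class-present (CP) and class-absent (CA) analyses, but does not specify the scoring convention. The following is a minimal sketch of one common approach, in which a class missing from both prediction and ground truth scores 1.0 (a CA case) and a class present in the ground truth is scored normally (a CP case). The function name and the smoothing term `eps` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def dice_iou_per_class(pred, gt, num_classes=8, eps=1e-7):
    """Per-class Dice and IoU for integer label maps pred/gt of shape (H, W).

    Assumption (not specified in the abstract): a class absent from both
    prediction and ground truth scores ~1.0 via eps smoothing (a CA case);
    a class present in the ground truth is a CP case.
    """
    results = {}
    for c in range(num_classes):
        p = pred == c
        g = gt == c
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        dice = (2 * inter + eps) / (p.sum() + g.sum() + eps)
        iou = (inter + eps) / (union + eps)
        results[c] = {
            "dice": float(dice),
            "iou": float(iou),
            "scenario": "CP" if g.any() else "CA",
        }
    return results
```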
Results The Surgi-FCT model with seven encoder-decoder layers (Surgi-FCT 7), trained on 640×640 images, achieved the best segmentation performance, with an average Dice coefficient of 0.69 (CP) and 0.88 (CA), for an overall Dice of 0.82. This outperformed U-Net (Dice: 0.56 CP, 0.79 CA) and FCT (Dice: 0.68 CP, 0.84 CA). Surgi-FCT 7 was particularly effective at segmenting frequently occurring classes such as the pulmonary main artery and phrenic nerve. Classes with fewer examples, such as the A2 artery and middle lobe vein, had lower Dice scores (0.40 and 0.62, respectively) but still performed better under multi-class training than with single-class models. Class co-occurrence, as observed in correlation matrices, improved segmentation accuracy (e.g., co-detection of the azygos vein and main artery). Higher image resolution and deeper model architectures also yielded performance gains, though at increased computational cost.
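The co-occurrence analysis referenced above can be illustrated with a short sketch: from a per-frame class-presence matrix derived from the annotations, a Pearson correlation matrix between class-presence indicators shows which structures tend to appear together. The helper name and the random presence data below are hypothetical, included only to make the example self-contained.

```python
import numpy as np

def class_cooccurrence(presence):
    """Pearson correlation between class-presence indicators.

    presence: (n_frames, n_classes) boolean array, True where a class
    is annotated in a frame. Returns an (n_classes, n_classes) matrix.
    """
    return np.corrcoef(presence.astype(float), rowvar=False)

# Hypothetical data: presence flags for 1539 frames and 8 classes.
rng = np.random.default_rng(0)
presence = rng.random((1539, 8)) < 0.3
corr = class_cooccurrence(presence)  # corr[i, j] near 1 => classes i and j tend to co-occur
```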
Conclusion The Surgi-FCT 7 model enables accurate segmentation of complex anatomical structures in thoracic surgery videos. Leveraging transformer attention and class co-occurrence, it outperforms conventional CNN-based architectures and provides a scalable foundation for AI-powered visual assistance tools in minimally invasive thoracic surgery.
Publication History
Article published online: 25 August 2025
© 2025. Thieme. All rights reserved.