DOI: 10.1055/s-0045-1803952
Automated Operative Workflow Recognition in Vestibular Schwannoma Resection: Development and Preclinical Evaluation of a Deep Learning Neural Network (IDEAL Stage 0)
Background and Objectives: Paradigm shifts in the operative management of vestibular schwannoma (VS) have substantially reduced the morbidity and mortality associated with surgical excision, yet VS resection remains a high-risk operation. Furthermore, the low-volume, high-complexity nature of VS resection, coupled with the increasing transfer of expertise to centers of excellence, has raised concerns regarding training opportunities. Artificial intelligence (AI) offers opportunities to address these concerns; of particular note to surgeons is the capacity of AI to interpret and process operative video. Machine learning (ML) in surgical video analysis holds promise for training, audit, decision support, and prognostication in surgery. The past decade has seen key advances in ML-based operative workflow analysis, whereby ML platforms predict the phase and step of an operation, though existing applications mostly feature shorter surgeries (<2 hours). This study aimed to develop and evaluate an ML model capable of automated operative workflow recognition for vestibular schwannoma resection. In doing so, it extends previous research in this field by applying workflow prediction platforms to lengthy (median duration >5 hours), data-heavy surgeries.
Methods: An operative video dataset of twenty-one microscopic retrosigmoid vestibular schwannoma resections was collected at a single institution over a three-year period and underwent phase and step annotation according to a workflow previously agreed by expert consensus (Approach, Excision, and Closure phases, and Debulking or Dissection steps within the Excision phase) ([Fig. 1]). Annotations were used to train an ML model consisting of a convolutional neural network (CNN) followed by a recurrent neural network (RNN) ([Fig. 2]). Five-fold cross-validation was used, and performance metrics (accuracy, precision, recall, F1 score) were assessed for the phase and step prediction tasks.
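The abstract does not specify the architecture beyond "a CNN followed by an RNN." As an illustration only, a minimal per-frame phase classifier in that pattern might look like the sketch below; all layer sizes, the GRU choice, and the class name are assumptions for demonstration, not the authors' model.

```python
import torch
import torch.nn as nn

class PhaseRecognizer(nn.Module):
    """Illustrative CNN-RNN phase classifier (hypothetical sizes)."""
    def __init__(self, n_phases=3, feat_dim=64, hidden=128):
        super().__init__()
        # CNN: extracts one feature vector per video frame
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        # RNN: models temporal context across the frame sequence
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_phases)

    def forward(self, frames):
        # frames: (batch, time, 3, H, W) -> per-frame phase logits
        b, t = frames.shape[:2]
        feats = self.cnn(frames.flatten(0, 1)).view(b, t, -1)
        out, _ = self.rnn(feats)
        return self.head(out)  # (batch, time, n_phases)
```

Per-frame logits over the three phases would then be compared against the expert annotations to compute accuracy, precision, recall, and F1.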

Results: Median operative video duration was 5 hours 18 minutes (IQR 3 hours 21 minutes–6 hours 1 minute). The Excision phase accounted for the majority of each case (median 4 hours 23 minutes), while the Approach (28 minutes) and Closure (17 minutes) phases were shorter. The ML model accurately predicted operative phases (accuracy 81%, weighted F1 0.83) and dichotomized steps (accuracy 86%, weighted F1 0.86), but was less accurate when predicting individual steps (accuracy 59%, weighted F1 0.58).
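For readers unfamiliar with the support-weighted F1 metric reported above, the following self-contained sketch shows how accuracy and weighted F1 are computed from per-frame phase labels. The toy label sequences are illustrative only, not study data.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    # fraction of frames whose predicted phase matches the annotation
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def weighted_f1(y_true, y_pred):
    # per-class F1, averaged with weights equal to each class's support
    support = Counter(y_true)
    total = 0.0
    for cls, n in support.items():
        tp = sum(t == p == cls for t, p in zip(y_true, y_pred))
        fp = sum(p == cls and t != cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        total += n * f1
    return total / len(y_true)

# toy per-frame phase labels: A = Approach, E = Excision, C = Closure
true = list("AAEEEC")
pred = list("AEEECC")
print(round(accuracy(true, pred), 3))     # 0.667
print(round(weighted_f1(true, pred), 3))  # 0.667
```

Weighting per-class F1 by support matters here because the Excision phase dominates each case, so an unweighted average would overstate the influence of the short Approach and Closure phases.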
Conclusion: This study demonstrates that our CNN-RNN model can accurately predict the surgical phases and intra-phase steps of retrosigmoid vestibular schwannoma resection, although individual step classification leaves room for improvement. This work is of particular significance in the context of two unique ML challenges: first, the analysis of extensive datasets, in contrast to previous clinical applications of computer vision, which have uniformly addressed shorter-duration procedures; and second, the navigation of surgeries lacking a linear progression of steps within a specific phase. Future applications of ML in low-volume, complex operations should prioritize collaborative video sharing to overcome early technical barriers to clinical translation.
Publication History
Article published online:
February 7, 2025
© 2025. Thieme. All rights reserved.
Georg Thieme Verlag KG
Oswald-Hesse-Straße 50, 70469 Stuttgart, Germany