Accuracy of Trained Physicians is Inferior to Deep Learning-Based Algorithm for Determining Angles in Ultrasound of the Newborn Hip

David Oelen; Pascal Kaiser; Thomas Baumann; Raoul Schmid; Christof Bühler; Bayalag Munkhuu; Stefan Essig

doi:10.1055/a-1177-0480

Ultraschall Med 2022; 43(05): e49-e55
DOI: 10.1055/a-1177-0480

Original Article

Genauigkeit von geschulten Ärzten unterliegt einem Deep-Learning-basierten Algorithmus beim Bestimmen von Winkeln im Ultraschall der Säuglingshüfte

David Oelen¹, Pascal Kaiser¹, Thomas Baumann², Raoul Schmid³, Christof Bühler¹, Bayalag Munkhuu⁴, Stefan Essig²

¹Biotechnologie & Physik, Supercomputing Systems, Zürich, Switzerland
²Research Department, Institute of Primary and Community Care Lucerne, Luzern, Switzerland
³Praxis, Baarer Kinderarztpraxis, Baar, Switzerland
⁴National Center for Maternal and Child Health, Ulaanbaatar, Mongolia

Abstract

Purpose Sonographic diagnosis of developmental dysplasia of the hip allows treatment with a flexion-abduction orthosis, which prevents hip luxation. Accurate determination of the alpha and beta angles according to Graf is crucial for correct diagnosis. It is unclear whether algorithms can predict these angles. We aimed to compare the accuracy of physicians and automation, reported as root mean squared error (RMSE).

Materials and Methods We used 303 306 ultrasound images of newborn hips collected between 2009 and 2016 in screening consultations. Trained physicians labeled every second image with alpha and beta angles during the consultations. A random subset of images was labeled under lab conditions, with time and precision, to serve as ground truth. Automation predicted the two angles using a convolutional neural network (CNN). The analysis focused on the alpha angle.
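
The RMSE comparison described above can be illustrated with a minimal Python sketch. The function name `rmse` and the angle values are assumptions made for illustration only. They are not taken from the study's data or code.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between paired angle measurements (degrees)."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - r) ** 2 for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up alpha angles in degrees, for illustration only (not study data).
ground_truth = [60.0, 55.0, 62.0, 48.0]    # hypothetical lab-condition labels
routine_labels = [63.0, 51.0, 62.0, 53.0]  # hypothetical routine labels

print(f"RMSE: {rmse(routine_labels, ground_truth):.2f} degrees")
```

Because the errors are squared before averaging, a few large deviations raise the RMSE more than many small ones do.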

Results Three methods were implemented, each with a different abstraction of the problem: (1) CNNs that directly learn the angles without any post-processing steps; (2) CNNs that return the relevant landmarks in the image to identify the angles; (3) CNNs that return the baseline, the bony roof line, and the cartilage roof line, which are needed to calculate the angles. The RMSE between physicians and ground truth was 7.1° for alpha. The best CNN architecture was (2), landmark detection. The RMSE between landmark detection and ground truth was 3.9° for alpha.
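
In Graf's method, alpha is the angle between the baseline and the bony roof line, and beta is the angle between the baseline and the cartilage roof line. The Python sketch below shows how such angles could be computed once the lines are known. It is not the authors' implementation: the abstract does not say how lines are derived from landmarks, so defining each line by two points is an assumption, and the function name and pixel coordinates are hypothetical. The lines are treated as undirected, so the result is always between 0° and 90°, which is a simplification.

```python
import math


def line_angle_deg(line_a, line_b):
    """Angle in degrees (0 to 90) between two undirected lines.

    Each line is given as ((x1, y1), (x2, y2)) in image coordinates.
    """
    (ax1, ay1), (ax2, ay2) = line_a
    (bx1, by1), (bx2, by2) = line_b
    ux, uy = ax2 - ax1, ay2 - ay1  # direction vector of line_a
    vx, vy = bx2 - bx1, by2 - by1  # direction vector of line_b
    if (ux == 0 and uy == 0) or (vx == 0 and vy == 0):
        raise ValueError("each line needs two distinct points")
    cross = ux * vy - uy * vx
    dot = ux * vx + uy * vy
    # Absolute values fold the result into the range 0 to 90 degrees.
    return math.degrees(math.atan2(abs(cross), abs(dot)))


# Hypothetical landmark pairs in pixels, for illustration only.
baseline = ((100.0, 50.0), (100.0, 150.0))
bony_roof_line = ((100.0, 150.0), (150.0, 200.0))
cartilage_roof_line = ((100.0, 150.0), (130.0, 100.0))

alpha = line_angle_deg(baseline, bony_roof_line)
beta = line_angle_deg(baseline, cartilage_roof_line)
print(f"alpha = {alpha:.1f} degrees, beta = {beta:.1f} degrees")
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` has for nearly parallel lines.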

Conclusion The accuracy of physicians in their daily routine is inferior to deep learning-based algorithms for determining angles in ultrasound of the newborn hip. Similar methods could be used to support physicians.

Summary

Aim Diagnosing hip dysplasia by sonography allows treatment with a flexion orthosis to prevent hip luxation. Accurate determination of the alpha and beta angles according to Graf is essential for a correct diagnosis. It is unclear whether an algorithm could predict the angles. This work compares the accuracy of users and automation by means of root mean squared errors (RMSE).

Materials and Methods We used 303 306 ultrasound images of newborn hips acquired between 2009 and 2016 in screening examinations. Trained physicians determined the alpha and beta angles in every second image during the consultation. A random subset of images was labeled under lab conditions, with time and precision, as ground truth. The automation predicted the two angles using a convolutional neural network (CNN). The analysis focused on the alpha angle.

Results Three methods were implemented, each with a different abstraction of the problem: (1) CNNs that learn the angles directly without post-processing; (2) CNNs that determine the points in the image that are relevant for determining the angles; (3) CNNs that place the baseline, the bony roof line, and the cartilage roof line in the image in order to determine the angles from them. The RMSE between physicians and ground truth was 7.1° for alpha. The best CNN architecture was (2), point detection. The RMSE between point detection and ground truth was 3.9° for alpha.

Conclusion The accuracy of physicians in their daily work is lower than that of a deep learning-based algorithm for determining angles in ultrasound of the infant hip. Similar methods could be used to support physicians.

Key words

accuracy - automation - deep learning - feedback - developmental dysplasia of the hip

Publication History

Received: 20 February 2020

Accepted: 10 May 2020

Article published online: 06 August 2020

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany
