Methods Inf Med 1978; 17(04): 238-246
DOI: 10.1055/s-0038-1636443
Original Article
Schattauer GmbH

The Measurement of Performance in Probabilistic Diagnosis

III. Methods Based on Continuous Functions of the Diagnostic Probabilities
DIE LEISTUNGSMESSUNG BEI DER WAHRSCHEINLICHKEITSDIAGNOSE. III. AUF KONTINUIERLICHEN FUNKTIONEN DER DIAGNOSTISCHEN WAHRSCHEINLICHKEITEN BASIERENDE METHODEN
J. Hilden, J. D. F. Habbema, B. Bjerregaard
From the Department of Public Health and Social Medicine, Erasmus University, Rotterdam, The Netherlands, and the Institute of Human Genetics, University of Copenhagen, Denmark

Publication History

Publication Date: 19 February 2018 (online)

Within the framework of diagnostic probability prediction, the problem of measuring discriminatory ability is operationally defined as the problem of measuring the agreement between probabilistic predictions and actual outcomes. We present a number of so-called scoring rules developed to this end. Most of these are continuous functions of the assigned probabilities. Discontinuous rules, including conventional non-error rates, are discussed by way of contrast. The concept of properness of a scoring rule is discussed, and the desirability of properness is argued. Separate sections deal with the problems connected with uncommon diseases and with methods utilizing subdivisions of the patient material. The distinction between the three concepts of discriminatory ability, sharpness and reliability is explained. The evaluation tools developed are applied to previously presented data from the Copenhagen Acute Abdominal Pain Study.

Im Rahmen der diagnostischen Wahrscheinlichkeitsvorhersage wird das Problem der Messung des Diskriminanzvermögens operational definiert als das der Messung der Übereinstimmung zwischen Wahrscheinlichkeitsvorhersage und tatsächlichem Ergebnis. Für diesen Zweck stellen die Autoren eine Reihe sogenannter »scoring rules« (Bewertungsregeln) vor, die meistens kontinuierliche Funktionen der zugeteilten diagnostischen Wahrscheinlichkeiten sind. Diskontinuierliche Regeln, einschließlich konventioneller Nicht-Fehler-Quoten, werden vergleichsweise behandelt. Das Konzept der Angepaßtheit einer Regel wird diskutiert, wobei erörtert wird, warum diese Eigenschaft erwünscht ist. Besondere Abschnitte behandeln Probleme im Zusammenhang mit ungewöhnlichen Krankheiten und Methoden der Nutzung von Unterteilungen des Krankengutes. Die Unterscheidung zwischen den drei Konzepten Diskriminanzvermögen, Schärfe und Reliabilität wird erklärt. Die Bewertungskriterien werden auf Daten aus der Kopenhagener Studie über akute Bauchschmerzen angewandt.
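
As an illustration of what properness of a scoring rule means, the following Python sketch may help. It is not taken from the article: the abstract does not say which rules are studied, so the quadratic (Brier-type) and logarithmic penalties are assumed here as standard examples of proper rules, and a linear penalty is assumed as an example of an improper one. All function names and the three-disease probabilities are hypothetical.

```python
import math

# Each rule returns a penalty (lower is better) for a vector of assigned
# diagnostic probabilities `probs` when disease number `outcome` is the true one.

def quadratic_penalty(probs, outcome):
    """Brier-type penalty: squared distance between probs and the 0/1 outcome vector."""
    return sum((p - (1.0 if i == outcome else 0.0)) ** 2
               for i, p in enumerate(probs))

def logarithmic_penalty(probs, outcome):
    """Negative log of the probability assigned to the true disease."""
    p = probs[outcome]
    return math.inf if p == 0.0 else -math.log(p)

def linear_penalty(probs, outcome):
    """One minus the probability assigned to the true disease (assumed improper example)."""
    return 1.0 - probs[outcome]

def expected_penalty(rule, reported, true_probs):
    """Average penalty when outcomes follow true_probs but `reported` is stated."""
    return sum(q * rule(reported, j) for j, q in enumerate(true_probs))

# Hypothetical three-disease case: the true probabilities versus an
# overconfident report that exaggerates the leading diagnosis.
true_probs = [0.7, 0.2, 0.1]
honest = [0.7, 0.2, 0.1]
overconfident = [0.9, 0.05, 0.05]

for rule in (quadratic_penalty, logarithmic_penalty, linear_penalty):
    e_honest = expected_penalty(rule, honest, true_probs)
    e_over = expected_penalty(rule, overconfident, true_probs)
    verdict = "rewards honesty" if e_honest < e_over else "rewards exaggeration"
    print(f"{rule.__name__}: honest={e_honest:.3f} "
          f"overconfident={e_over:.3f} ({verdict})")
```

Under a proper rule the expected penalty is smallest when the stated probabilities equal the true ones, so there is no incentive to overstate or hedge. Under the linear penalty in this sketch, the overconfident report has the lower expected penalty, which is the kind of behaviour that makes properness desirable.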

 