Int J Sports Med 2023; 44(05): 352-360
DOI: 10.1055/a-1993-2371
Training & Testing

Prediction of Marathon Performance using Artificial Intelligence

1   Centre d'Etudes des Transformations des Activités Physiques et Sportives Normandie Univ, UNIROUEN, CETAPS, 76000 Rouen, France
Damien Saboul
2   Research and Innovation, Be-ys-research, Argonay, France
Michel Clémençon
1   Centre d'Etudes des Transformations des Activités Physiques et Sportives Normandie Univ, UNIROUEN, CETAPS, 76000 Rouen, France
Jérémy Bernard Coquart
1   Centre d'Etudes des Transformations des Activités Physiques et Sportives Normandie Univ, UNIROUEN, CETAPS, 76000 Rouen, France
3   Unité de Recherche Pluridisciplinaire Sport, Santé, Société Eurasport, 413 avenue Eugène Avinée, 59 120 Loos, France


Although studies have used machine learning algorithms to predict performance in sports activities, none, to the best of our knowledge, has used and validated two artificial intelligence techniques, artificial neural network (ANN) and k-nearest neighbor (KNN), for marathon running and compared the accuracy or precision of the predicted performances. Official French rankings for the 10-km road and marathon events in 2019 were scrutinized over a dataset of 820 athletes (aged 21, having run 10 km and a marathon in the same year that was run slower, etc.). For KNN and ANN, the same inputs (10-km race time, body mass index, age and sex) were used to solve a linear regression problem to estimate the marathon race time. No difference was found between the actual and predicted marathon performances for either method (p>0.05). All predicted performances were significantly correlated with the actual ones, with very high correlation coefficients (r>0.90; p<0.001). KNN outperformed ANN, with a mean absolute error of 2.4 vs 5.6%. The study confirms the validity of both algorithms, with better accuracy for KNN in predicting marathon performance. Consequently, the predictions from these artificial intelligence methods may be used in training programs and competitions.
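As a rough illustration of the KNN approach summarized above, the following Python sketch predicts marathon time from the four stated inputs and reports a mean absolute percentage error and a Pearson correlation. It is a minimal sketch under stated assumptions, not the authors' implementation: the toy runners are invented for illustration, and the choice of k = 3, z-score scaling, Euclidean distance, leave-one-out evaluation and all function names are assumptions that do not appear in the abstract.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy runners, invented for illustration (NOT data from the study):
# (10-km time in min, BMI in kg/m^2, age in years, sex coded 0/1, marathon time in min)
RUNNERS = [
    (38.0, 21.5, 29, 0, 182.0),
    (42.5, 22.8, 35, 0, 205.0),
    (47.0, 23.9, 41, 0, 229.0),
    (51.5, 24.6, 46, 0, 252.0),
    (44.0, 20.4, 31, 1, 214.0),
    (49.5, 21.2, 38, 1, 243.0),
    (55.0, 22.5, 44, 1, 271.0),
    (60.0, 23.1, 50, 1, 298.0),
]


def standardize(rows):
    """Z-score each feature column; return scaled rows and per-column (mean, sd)."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]  # guard against sd == 0
    scaled = [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]
    return scaled, stats


def knn_predict(train_x, train_y, query, k=3):
    """Average the targets of the k training points nearest to `query` (Euclidean)."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    return mean(y for _, y in nearest[:k])


def mean_absolute_percentage_error(actual, predicted):
    """Mean of |actual - predicted| / actual, expressed in percent."""
    return 100.0 * mean(abs(a - p) / a for a, p in zip(actual, predicted))


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def leave_one_out(runners, k=3):
    """Predict each runner's marathon time from all the other runners."""
    actual, predicted = [], []
    for i, held_out in enumerate(runners):
        train = runners[:i] + runners[i + 1:]
        # Scaling is fitted on the training runners only, then applied to the query.
        train_x, stats = standardize([r[:4] for r in train])
        query = [(v - m) / s for v, (m, s) in zip(held_out[:4], stats)]
        predicted.append(knn_predict(train_x, [r[4] for r in train], query, k))
        actual.append(held_out[4])
    return actual, predicted


if __name__ == "__main__":
    actual, predicted = leave_one_out(RUNNERS, k=3)
    print("MAPE (%):", round(mean_absolute_percentage_error(actual, predicted), 2))
    print("Pearson r:", round(pearson_r(actual, predicted), 3))
```

Scaling the inputs before computing distances is a common design choice for KNN, because otherwise the feature with the largest numeric range (here, race time in minutes) would dominate the neighbor search. Whether the study scaled its inputs is not stated in the abstract.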

Publication History

Received: 07 July 2022

Accepted: 05 December 2022

Accepted Manuscript online: 06 December 2022

Article published online: 17 February 2023

© 2023. Thieme. All rights reserved.

Georg Thieme Verlag
Rüdigerstraße 14, 70469 Stuttgart, Germany
