Open Access
CC BY 4.0 · Yearb Med Inform 2024; 33(01): 216-222
DOI: 10.1055/s-0044-1800747
Section 9: Knowledge Representation and Management
Survey

Knowledge Representation and Management in the Age of Long Covid and Large Language Models: a 2022-2023 Survey

Authors

  • Jonathan P. Bona

    Department of Biomedical Informatics, University of Arkansas for Medical Sciences

Summary

Objectives: To select, present, and summarize cutting-edge work in the field of Knowledge Representation and Management (KRM) published in 2022 and 2023.

Methods: A comprehensive set of KRM-relevant articles published in 2022 and 2023 was retrieved by querying PubMed. Topic modeling with Latent Dirichlet Allocation was used to further refine this query and suggest areas of focus. Articles were then selected based on a review of their titles and abstracts.

Results: An initial set of 8,706 publications was retrieved from PubMed. From these, fifteen papers were ultimately selected, each matching one of two main themes: KRM for long COVID, and KRM approaches used in combination with generative large language models.

Conclusions: This survey shows the ongoing development and versatility of KRM approaches, both to improve our understanding of a global health crisis and to augment and evaluate cutting-edge technologies from other areas of artificial intelligence.



Publication History

Article published online: 08 April 2025

© 2024. The Author(s). This is an open access article published by Thieme under the terms of the Creative Commons Attribution License, permitting unrestricted use, distribution, and reproduction so long as the original work is properly cited. (https://creativecommons.org/licenses/by/4.0/)

Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany

 