Validating UMLS Semantic Type Assignments Using SNOMED CT Semantic Tags

Huanying Gu; Zhe He; Duo Wei; Gai Elhanan; Yan Chen

doi:10.3414/ME17-01-0120


Methods Inf Med 2018; 57(01/02): 43-53
DOI: 10.3414/ME17-01-0120

Original Articles

Schattauer GmbH


Huanying Gu
¹Department of Computer Science, New York Institute of Technology, New York, NY, USA

Zhe He
²School of Information, Florida State University, Tallahassee, FL, USA

Duo Wei
³Computer Science and Information Systems, Stockton University, Galloway, NJ, USA

Gai Elhanan
⁴Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA
⁵Desert Research Institute, Reno, NV, USA

Yan Chen
⁶Department of Computer Information Systems, Borough of Manhattan Community College, City University of New York, New York, NY, USA
Funding: Research reported in this publication was partially supported by the National Cancer Institute of the National Institutes of Health (NIH) under Award Number R01CA190779. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.

Further Information

Publication History

Received: 02 November 2017

Accepted: 20 December 2017

Published online: 05 April 2018


Summary

Background: The UMLS assigns semantic types to all its integrated concepts. The semantic types are widely used in various natural language processing tasks in the biomedical domain, such as named entity recognition, semantic disambiguation, and semantic annotation. Due to the size of the UMLS, erroneous semantic type assignments are hard to detect. It is imperative to devise automated techniques to identify errors and inconsistencies in semantic type assignments.

Objectives: To design a methodology that performs programmatic checks to detect semantic type assignment errors for UMLS concepts with one or more SNOMED CT terms, and to evaluate concepts in a selected set of SNOMED CT hierarchies in order to test our hypothesis that UMLS semantic type assignment errors may exist in concepts residing in semantically inconsistent groups.

Methods: Our methodology is a four-stage process: 1) partitioning the concepts in a SNOMED CT hierarchy into semantically uniform groups based on their assigned semantic tags; 2) partitioning the concepts in each group from 1) into disjoint sub-groups based on their semantic type assignments; 3) mapping each SNOMED CT semantic tag to one or more semantic types in the UMLS; 4) identifying semantically inconsistent groups, in which the semantic tags and semantic types disagree according to the mapping from 3), and providing the concepts in such groups to domain experts for review.
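The four stages can be illustrated with a minimal Python sketch. It is not the authors' implementation: the concept names, the tag-to-type mapping, and the consistency rule (a sub-group counts as consistent when every one of its semantic types is allowed by the mapping for its tag) are assumptions made only for illustration.

```python
from collections import defaultdict

# Hypothetical toy records: (concept name, SNOMED CT semantic tag, UMLS semantic types).
# The names and assignments are placeholders, not data from the study.
CONCEPTS = [
    ("force_1", "physical force", frozenset({"Natural Phenomenon or Process"})),
    ("force_2", "physical force", frozenset({"Natural Phenomenon or Process"})),
    ("force_3", "physical force", frozenset({"Finding"})),
    ("record_1", "record artifact", frozenset({"Intellectual Product"})),
    ("record_2", "record artifact", frozenset({"Manufactured Object"})),
]

# Stage 3 (assumed mapping): each semantic tag maps to one or more semantic types.
TAG_TO_TYPES = {
    "physical force": {"Natural Phenomenon or Process"},
    "record artifact": {"Intellectual Product"},
}


def partition(concepts):
    """Stages 1 and 2: group concepts by semantic tag, then by semantic type set."""
    groups = defaultdict(lambda: defaultdict(list))
    for name, tag, types in concepts:
        groups[tag][types].append(name)
    return groups


def inconsistent_groups(concepts, tag_to_types):
    """Stage 4: return (tag, types, names) for every sub-group whose semantic
    types are not all allowed by the mapping for its semantic tag."""
    flagged = []
    for tag, subgroups in partition(concepts).items():
        allowed = tag_to_types.get(tag, set())
        for types, names in subgroups.items():
            # Assumed rule: consistent only if the type set is a subset of the allowed types.
            if not types <= allowed:
                flagged.append((tag, types, names))
    return flagged


if __name__ == "__main__":
    # Flagged sub-groups are the ones that would be handed to domain experts.
    for tag, types, names in inconsistent_groups(CONCEPTS, TAG_TO_TYPES):
        print(tag, sorted(types), names)
```

Using a `frozenset` of semantic types as the sub-group key means that concepts with the same combination of types land in the same sub-group, so the sub-groups within a tag are disjoint by construction.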

Results: We applied our method to the UMLS 2013AA release. Concepts in the semantically inconsistent groups of the PHYSICAL FORCE and RECORD ARTIFACT hierarchies have error rates of 33% and 62.5%, respectively, which are substantially higher than the error rates of 0.6% and 1% in the semantically consistent groups of the two hierarchies.

Conclusion: Concepts in semantically inconsistent groups are more likely to contain semantic type assignment errors. Our methodology can make auditing more efficient by focusing auditing resources on concepts in semantically inconsistent groups.

Keywords

Controlled medical terminology - quality assurance - SNOMED CT - UMLS

