Yearb Med Inform 2017; 26(01): 214-227
DOI: 10.15265/IY-2017-029
Section 10: Natural Language Processing
Survey
Georg Thieme Verlag KG Stuttgart

Capturing the Patient’s Perspective: a Review of Advances in Natural Language Processing of Health-Related Text

G. Gonzalez-Hernandez (1), A. Sarker (1), K. O’Connor (1), G. Savova (2)
1   Department of Epidemiology, Biostatistics, and Informatics, The Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
2   Boston Children’s Hospital and Harvard Medical School, Boston, MA, USA

Publication History

Publication Date:
11 September 2017 (online)

Summary

Background: Natural Language Processing (NLP) methods are increasingly used to mine knowledge from unstructured health-related texts. Recent advances in noisy text processing techniques are enabling researchers and medical domain experts to go beyond the information encapsulated in published texts (e.g., clinical trials and systematic reviews) and structured questionnaires, and to obtain perspectives from other unstructured sources such as Electronic Health Records (EHRs) and social media posts.

Objectives: To review the recently published literature discussing the application of NLP techniques for mining health-related information from EHRs and social media posts.

Methods: The literature review covered research published over the last five years, identified through searches of PubMed, conference proceedings, and the ACM Digital Library, as well as through relevant publications referenced in the retrieved papers. We focused particularly on the techniques applied to EHRs and social media data.

Results: A set of 62 studies involving EHRs and 87 studies involving social media matched our criteria and were included in this paper. We present the purposes of these studies, outline the key NLP contributions, and discuss the general trends observed in the field, the current state of research, and important outstanding problems.

Conclusions: In recent years, there has been a continuing transition from lexical and rule-based systems to learning-based approaches, driven by the growth of annotated data sets and advances in data science. For EHRs, publicly available annotated data is still scarce, and this scarcity is an obstacle to research progress. In contrast, research on social media mining has grown rapidly, particularly because the large amount of unlabeled data available from this resource compensates for the uncertainty inherent in the data. Effective mechanisms for filtering out noise and for mapping social media expressions to standard medical concepts remain crucial open research problems. Shared tasks and other competitive challenges have been driving factors behind the implementation of open systems, and they are likely to play an essential role in the development of future systems.

 