Methods Inf Med 2015; 54(06): 515-521
DOI: 10.3414/ME15-01-0023
Original Articles
Schattauer GmbH

Use of a Latent Topic Model for Characteristic Extraction from Health Checkup Questionnaire Data[*]

Y. Hatakeyama (1), I. Miyano (2), H. Kataoka (1), N. Nakajima (1), T. Watabe (1), N. Yasuda (2), Y. Okuhara (1)

1   Center of Medical Information Science, Kochi University Medical School, Kochi, Japan
2   Department of Public Health, Kochi University Medical School, Kochi, Japan

Publication History

received: 08 February 2015

accepted: 29 May 2015

Publication Date: 23 January 2018 (online)

Summary

Objectives: When patients complete questionnaires during health checkups, many of their responses are subjective, making topic extraction difficult. Therefore, the purpose of this study was to develop a model capable of extracting appropriate topics from subjective data in questionnaires conducted during health checkups.

Methods: We employed a latent topic model to group the lifestyle habits of the study participants and represented their responses to items on health checkup questionnaires as a probability model. For the probability model, we used latent Dirichlet allocation to extract 30 topics from the questionnaires. According to the model parameters, a total of 4381 study participants were then divided into groups based on these topics. Results from laboratory tests, including blood glucose level, triglycerides, and estimated glomerular filtration rate, were compared between each group, and these results were then compared with those obtained by hierarchical clustering.
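The grouping step described in the Methods can be illustrated with a minimal sketch. It assumes that each participant's questionnaire is encoded as a list of integer response codes, that the model is fitted by collapsed Gibbs sampling, and that each participant is assigned to the topic with the highest estimated proportion. None of these details are stated in the summary, and the function names, hyperparameters, and toy data are hypothetical.

```python
import random

def fit_lda(docs, n_topics, n_codes, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Fit latent Dirichlet allocation by collapsed Gibbs sampling (assumed method).

    docs: one list of integer response codes per participant (hypothetical encoding).
    Returns theta, the estimated topic proportions for each participant.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]              # topic counts per participant
    topic_code = [[0] * n_codes for _ in range(n_topics)]   # response-code counts per topic
    topic_total = [0] * n_topics
    assign = []

    # Random initial topic for every recorded response.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            z = rng.randrange(n_topics)
            z_d.append(z)
            doc_topic[d][z] += 1
            topic_code[z][w] += 1
            topic_total[z] += 1
        assign.append(z_d)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current assignment, then resample it from the conditional.
                z = assign[d][i]
                doc_topic[d][z] -= 1
                topic_code[z][w] -= 1
                topic_total[z] -= 1
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_code[k][w] + beta)
                    / (topic_total[k] + n_codes * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                doc_topic[d][z] += 1
                topic_code[z][w] += 1
                topic_total[z] += 1

    # Smoothed topic proportions per participant; each row sums to 1.
    return [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in doc_topic[d]]
        for d, doc in enumerate(docs)
    ]

def assign_groups(theta):
    """Assign each participant to the topic with the largest proportion (assumed rule)."""
    return [max(range(len(row)), key=row.__getitem__) for row in theta]

# Toy data: six hypothetical participants answering with six possible response codes.
toy_docs = [[0, 1, 2, 0], [1, 2, 0], [0, 2, 1, 1], [3, 4, 5], [4, 5, 3, 3], [5, 3, 4]]
theta = fit_lda(toy_docs, n_topics=2, n_codes=6)
groups = assign_groups(theta)
```

In the study itself the model used 30 topics and 4381 participants; the toy call uses two topics only to keep the example small.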

Results: If a significant (p < 0.05) difference was observed in any of the laboratory measurements between groups, it was considered to indicate a questionnaire response pattern corresponding to the value of the test result. A comparison between the latent topic model and hierarchical clustering revealed that the latent topic model allocated a small group of participants who reported subjective signs of urinary disorder to a single group.
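The between-group comparison at the p < 0.05 threshold can be sketched as follows. The summary does not name the statistical test that was used, so this example substitutes a two-sided permutation test on the difference in group means purely for illustration. The function name and the laboratory values are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test for a difference in means (illustrative substitute).

    Returns the proportion of random relabelings whose absolute mean difference
    is at least as large as the observed one, with a +1 correction.
    """
    rng = random.Random(seed)
    mean = lambda xs: sum(xs) / len(xs)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random reassignment of values to the two groups
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Hypothetical blood glucose values (mg/dL) for two questionnaire-derived groups.
glucose_group_1 = [90, 92, 95, 91, 94, 93]
glucose_group_2 = [120, 125, 118, 130, 122, 127]
p_value = permutation_p_value(glucose_group_1, glucose_group_2)
is_significant = p_value < 0.05
```

With 30 groups and several laboratory measurements, many such comparisons would be made, so a multiple-comparison correction may be appropriate; the summary does not say whether one was applied.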

Conclusions: The latent topic model is useful for extracting characteristics from a small number of groups from questionnaires with a large number of items. These results show that, in addition to chief complaints and history of past illness, questionnaire data obtained during medical checkups can serve as useful judgment criteria for assessing the conditions of patients.

* Supplementary online material published on our website http://dx.doi.org/10.3414/ME15-01-0023


 