Methods Inf Med 2015; 54(06): 515-521
DOI: 10.3414/ME15-01-0023
Original Articles
Schattauer GmbH

Use of a Latent Topic Model for Characteristic Extraction from Health Checkup Questionnaire Data[*]

Y. Hatakeyama (1), I. Miyano (2), H. Kataoka (1), N. Nakajima (1), T. Watabe (1), N. Yasuda (2), Y. Okuhara (1)

1   Center of Medical Information Science, Kochi University Medical School, Kochi, Japan
2   Department of Public Health, Kochi University Medical School, Kochi, Japan

Publication History

Received: 8 February 2015

Accepted: 29 May 2015

Publication date: 23 January 2018 (online)

Summary

Objectives: When patients complete questionnaires during health checkups, many of their responses are subjective, making topic extraction difficult. Therefore, the purpose of this study was to develop a model capable of extracting appropriate topics from subjective data in questionnaires conducted during health checkups.

Methods: We employed a latent topic model to group the lifestyle habits of the study participants and represented their responses to items on health checkup questionnaires as a probability model. For the probability model, we used latent Dirichlet allocation to extract 30 topics from the questionnaires. According to the model parameters, a total of 4381 study participants were then divided into groups based on these topics. Results from laboratory tests, including blood glucose level, triglycerides, and estimated glomerular filtration rate, were compared between each group, and these results were then compared with those obtained by hierarchical clustering.
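
The grouping step described in the Methods can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state how responses were encoded, how the model was fitted, or how participants were assigned to groups. The sketch assumes each participant is a "document" whose "words" are the questionnaire items answered affirmatively, fits latent Dirichlet allocation by collapsed Gibbs sampling, and assigns each participant to the topic with the largest estimated proportion. The function name `fit_lda_gibbs`, the hyperparameters `alpha` and `beta`, and the toy data are all hypothetical, and two topics are used in place of the 30 in the study.

```python
import random


def fit_lda_gibbs(docs, n_topics, n_items, alpha=0.1, beta=0.01, n_iter=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling (illustrative, not the authors' code).

    docs: one list per participant, holding the ids of questionnaire items
          answered affirmatively (an assumed encoding).
    Returns theta, the estimated topic proportions for each participant.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]             # counts per participant and topic
    topic_item = [[0] * n_items for _ in range(n_topics)]  # counts per topic and item
    topic_total = [0] * n_topics                           # total tokens per topic
    assignments = []

    # Random initial topic for every (participant, item) token.
    for d, doc in enumerate(docs):
        z = []
        for item in doc:
            k = rng.randrange(n_topics)
            z.append(k)
            doc_topic[d][k] += 1
            topic_item[k][item] += 1
            topic_total[k] += 1
        assignments.append(z)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, item in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_item[k][item] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_item[t][item] + beta)
                    / (topic_total[t] + n_items * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_item[k][item] += 1
                topic_total[k] += 1

    # Smoothed topic proportions per participant; each row sums to 1.
    return [
        [(count + alpha) / (len(doc) + n_topics * alpha) for count in counts]
        for counts, doc in zip(doc_topic, docs)
    ]


# Synthetic toy data: 8 participants, 8 items, two disjoint response patterns.
toy_responses = [
    [0, 1, 2], [0, 1, 3], [1, 2, 3], [0, 2, 3],
    [4, 5, 6], [4, 5, 7], [5, 6, 7], [4, 6, 7],
]
theta = fit_lda_gibbs(toy_responses, n_topics=2, n_items=8)

# Assumed grouping rule: the topic with the largest proportion.
groups = [max(range(len(row)), key=row.__getitem__) for row in theta]
print(groups)
```

Assigning each participant to a single dominant topic is only one possible reading of "divided into groups based on these topics"; a threshold on the topic proportions would be an equally plausible rule.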

Results: If a significant (p < 0.05) difference was observed in any of the laboratory measurements between groups, it was considered to indicate a questionnaire response pattern corresponding to the value of the test result. Comparing the groupings produced by the latent topic model and by hierarchical clustering showed that the latent topic model allocated a small set of participants who reported subjective signs of urinary disorder to a single group.
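
The p < 0.05 criterion for a between-group difference in a laboratory measurement can be sketched as follows. The abstract does not name the statistical test the authors used, so a two-sided permutation test on the difference in group means is swapped in here purely for illustration. The function name `permutation_p_value` and the laboratory values are hypothetical; the values are synthetic and are not study data.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_perm=2000, seed=0):
    """Two-sided permutation test on the difference in means.

    Illustrative stand-in: the test used in the study is not stated.
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Synthetic laboratory values for two hypothetical questionnaire-derived groups.
lab_group_a = [92, 95, 98, 101, 94, 97]
lab_group_b = [118, 125, 121, 130, 127, 122]

p_value = permutation_p_value(lab_group_a, lab_group_b)
is_significant = p_value < 0.05
print(p_value, is_significant)
```

Because the criterion applies to any of several laboratory measurements, a real analysis would also need to decide how to handle multiple comparisons; the abstract does not say whether a correction was applied.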

Conclusions: The latent topic model is useful for extracting characteristics from a small number of groups from questionnaires with a large number of items. These results show that, in addition to chief complaints and history of past illness, questionnaire data obtained during medical checkups can serve as useful judgment criteria for assessing the conditions of patients.

* Supplementary online material published on our website http://dx.doi.org/10.3414/ME15-01-0023