A Method for Extracting Dissatisfaction Entities in the Pharmaceutical Sector Using Large Language Model
DOI:
https://doi.org/10.4108/eetsis.10511
Keywords:
Patient Centricity, Pharmaceutical Industry, Large Language Models, Prompt Engineering, Text Mining
Abstract
In today’s VUCA environment, pharmaceutical companies face mounting pressure to enhance both innovation and patient satisfaction amid shrinking domestic markets and intensifying global competition. Despite the growing emphasis on Patient Centricity (the integration of patients' voices into the drug development process), Japan still lags in the systematic incorporation of patient feedback. This study proposes a novel, patient-centered approach that leverages large language models (LLMs) and prompt engineering to extract dissatisfaction entities from patient-generated content, automatically generate a Dissatisfaction Dictionary, and visualize the interrelationships between these entities. Using the “Medical & Welfare_Pharmaceuticals” category from the Dissatisfaction Survey Dataset provided by the National Institute of Informatics, we analysed 300 patient comments with a Conversation Chain built on the GPT-4o API and LangChain. The extracted dissatisfaction entities were vectorized with TF-IDF and grouped into thematic clusters by cosine similarity. This analysis revealed interconnected concerns related to pricing, efficacy, usability, and medical service processes. For example, complaints about price discrepancies and the perceived ineffectiveness of generics frequently co-occurred with complaints about pharmacist communication. Our findings offer pharmaceutical firms a systematic, scalable framework for reflecting patient dissatisfaction in drug development, thereby enhancing patient engagement, satisfaction, and strategic alignment with Patient Centricity principles.
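The clustering step named in the abstract (TF-IDF vectors compared by cosine similarity) can be illustrated with a minimal, dependency-free sketch. Everything specific here is an assumption rather than the authors' implementation: the tokenized example entities are invented for illustration, the IDF uses a common smoothed form, and the grouping rule (link any two entities whose similarity reaches a hypothetical `threshold` of 0.3, then take connected components) is just one simple way to form clusters.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse {term: weight} TF-IDF vector per tokenized document."""
    n = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Relative term frequency times a smoothed IDF (assumed variant).
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def cluster(vectors, threshold=0.3):
    """Group indices whose vectors are linked by similarity >= threshold."""
    parent = list(range(len(vectors)))

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Hypothetical, already-tokenized dissatisfaction entities (not from the dataset).
entities = [
    ["generic", "price", "high"],
    ["price", "difference", "generic"],
    ["pharmacist", "explanation", "short"],
    ["pharmacist", "explanation", "unclear"],
    ["tablet", "large", "swallow"],
]

clusters = cluster(tfidf_vectors(entities), threshold=0.3)
for members in clusters:
    print([" ".join(entities[i]) for i in members])
```

With these toy inputs the pricing entities and the pharmacist entities each form a group, while the usability entity stays on its own. Real patient comments, especially Japanese text, would first need a proper tokenizer, and the threshold would have to be tuned against the data.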
License
Copyright (c) 2025 Kakeru Ota, Takumi Uchida, Maya Iwano, Tsuda Kazuhiko

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.
