VMHQA: A Vietnamese Multi-choice Dataset for Mental Health Domain Question Answering

Authors

DOI:

https://doi.org/10.4108/eetsis.7678

Keywords:

VMHQA, Mental Health Dataset, Vietnamese Multiple-Choice Question Answering (MCQA), BERT-based Models, NLP in Mental Health, Retrieval-Augmented Generation (RAG), Agentic Chunking, Large Language Models (LLMs)

Abstract

This paper introduces VMHQA, a Vietnamese Multiple-Choice Question Answering (MCQA) dataset designed to address critical gaps in mental health resources, particularly in low- and middle-income countries such as Vietnam. The dataset comprises 10,000 meticulously curated records across 1,166 mental health subjects, including 249 topics from the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5), together with 8,599 contextual paragraphs. Each record follows the United States Medical Licensing Examination (USMLE) format, with a targeted question, multiple-choice options, the correct answer, and a supporting paragraph drawn from reputable sources such as academic journals and local hospital websites, and further inspected by prestigious mental hospitals in Vietnam. VMHQA thus provides a reliable, structured foundation for pre-consultation tools that enable early psychological intervention for people concerned about their mental health. Beyond data collection, this study evaluates VMHQA with cutting-edge machine learning models: BERT-based architectures, large language models (LLMs) ranging from 7 to 9 billion parameters, and various generative pre-trained transformer (GPT) frameworks. In addition, we examine how Retrieval-Augmented Generation (RAG) combined with Agentic Chunking can improve the accuracy and interpretability of responses in this specialised domain, with explicit attention to whether RAG's retrieval mechanisms yield contextually accurate answers that are sensitive to psychological nuances. Our findings shed light on how effectively these models handle complex, domain-specific question-answering tasks in mental health, and highlight their potential to make mental health care more accessible and reliable for Vietnamese-speaking communities, offering hope for improved mental health outcomes.
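As a rough illustration of the record structure described in the abstract, the sketch below models one multiple-choice record and computes plain accuracy for a predictor. The class name, field names, and the sample records are assumptions made for illustration only; they are not taken from the released dataset or from the authors' evaluation code.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class MCQARecord:
    """One multiple-choice record. Field names are hypothetical, not the dataset's schema."""
    question: str
    options: Sequence[str]   # candidate answers shown to the model
    answer_index: int        # position of the correct option in `options`
    context: str             # supporting paragraph for the question
    subject: str             # mental health subject label

    def __post_init__(self) -> None:
        # Reject records whose answer does not point at an existing option.
        if not 0 <= self.answer_index < len(self.options):
            raise ValueError("answer_index must point at one of the options")


def accuracy(records: Sequence[MCQARecord],
             predict: Callable[[MCQARecord], int]) -> float:
    """Fraction of records for which `predict` returns the correct option index."""
    if not records:
        raise ValueError("records must not be empty")
    correct = sum(1 for r in records if predict(r) == r.answer_index)
    return correct / len(records)


# Invented placeholder records, used only to exercise the code.
sample = [
    MCQARecord("Placeholder question 1?", ("A", "B", "C", "D"), 0,
               "Placeholder paragraph 1.", "placeholder-subject"),
    MCQARecord("Placeholder question 2?", ("A", "B", "C", "D"), 1,
               "Placeholder paragraph 2.", "placeholder-subject"),
]


def always_first(record: MCQARecord) -> int:
    """Trivial baseline that always selects the first option."""
    return 0


baseline_accuracy = accuracy(sample, always_first)
```

Any real predictor (a fine-tuned BERT-style classifier, or an LLM with or without retrieved context) would take the place of `always_first`, provided it returns an option index for each record.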

Published

25-09-2025

How to Cite

Nguyen TAH, Nguyen Q-D, Nguyen HM, Nguyen AH, Nguyen L. VMHQA: A Vietnamese Multi-choice Dataset for Mental Health Domain Question Answering. EAI Endorsed Scal Inf Syst [Internet]. 2025 Sep. 25 [cited 2025 Sep. 25];12(4). Available from: https://publications.eai.eu/index.php/sis/article/view/7678