A novel knowledge enhancement method for large-scale natural language training model

Authors

Q. Han, G. So

DOI:

https://doi.org/10.4108/airo.8987

Keywords:

large-scale natural language training model, knowledge enhancement, long text representation, pre-trained model

Abstract

Knowledge enhancement-based large-scale natural language training models are advanced language models that combine deep learning with knowledge enhancement. By learning from massive unlabeled data and incorporating external knowledge such as knowledge graphs, they move beyond the limitations of traditional models in interpretability and reasoning ability, and introducing knowledge into data-driven artificial intelligence models is an important route to human-machine hybrid intelligence. However, because most pre-trained models are trained on large-scale unstructured corpus data, they suffer from deficits in certainty and explainability, which the introduction of external knowledge can remedy to some extent. To address these problems, we propose a new knowledge enhancement method and demonstrate its effectiveness through a long text representation model. The model processes structured, knowledge-rich long texts by extracting and integrating knowledge and semantic information at both the sentence and document levels, and then fuses these representations to generate an enhanced long text representation. Experiments on legal case matching tasks show that our model significantly outperforms existing methods, confirming its innovation and practical value.
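To make the described pipeline concrete, the following minimal PyTorch sketch illustrates one way the fusion could work: semantic sentence embeddings and knowledge-graph entity embeddings are combined through a learned gate at the sentence level, then aggregated into a single document-level representation. This is not the authors' implementation; the module names, dimensions, gating formulation, and GRU aggregator are all illustrative assumptions.

import torch
import torch.nn as nn


class KnowledgeEnhancedLongTextEncoder(nn.Module):
    """Hypothetical sketch of sentence-level knowledge fusion followed by
    document-level aggregation, loosely following the abstract's description."""

    def __init__(self, hidden_dim: int = 256, kg_dim: int = 128):
        super().__init__()
        # Project knowledge-graph entity embeddings into the semantic space.
        self.kg_proj = nn.Linear(kg_dim, hidden_dim)
        # Per-dimension gate deciding how much knowledge vs. semantics to keep.
        self.gate = nn.Linear(2 * hidden_dim, hidden_dim)
        # Document-level aggregation over the fused sentence sequence.
        self.doc_encoder = nn.GRU(hidden_dim, hidden_dim, batch_first=True)

    def forward(self, sent_emb: torch.Tensor, kg_emb: torch.Tensor) -> torch.Tensor:
        # sent_emb: (batch, num_sentences, hidden_dim) semantic embeddings,
        #           e.g. pooled outputs of a pre-trained sentence encoder.
        # kg_emb:   (batch, num_sentences, kg_dim) embeddings of entities
        #           linked to each sentence in a knowledge graph.
        know = torch.tanh(self.kg_proj(kg_emb))                    # align spaces
        g = torch.sigmoid(self.gate(torch.cat([sent_emb, know], dim=-1)))
        fused = g * sent_emb + (1.0 - g) * know                    # gated fusion
        _, doc_state = self.doc_encoder(fused)                     # document level
        return doc_state.squeeze(0)                                # (batch, hidden_dim)


if __name__ == "__main__":
    encoder = KnowledgeEnhancedLongTextEncoder()
    sents = torch.randn(2, 40, 256)  # two documents, 40 sentences each
    kg = torch.randn(2, 40, 128)     # one linked-entity embedding per sentence
    print(encoder(sents, kg).shape)  # torch.Size([2, 256])

In a legal case matching setting such as the one the abstract evaluates, two case documents would each be encoded this way and their fused representations compared, for example by cosine similarity.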

Published

15-07-2025

How to Cite

[1] Q. Han and G. So, “A novel knowledge enhancement method for large-scale natural language training model”, EAI Endorsed Trans AI Robotics, vol. 4, Jul. 2025.