Sentence classification based on the concept kernel attention mechanism

Hui Li; Guimin Huang; Yiqun Li; Xiaowei Zhang; Yabing Wang

doi:10.4108/eai.17-5-2022.173980

Authors

Hui Li, Guilin University of Electronic Technology
Guimin Huang, Guilin University of Electronic Technology
Yiqun Li, Guilin University of Electronic Technology
Xiaowei Zhang, Guilin University of Electronic Technology
Yabing Wang, Guilin University of Electronic Technology

Keywords:

Sentence classification, text conceptualization, concept knowledge base, attention mechanism, concept embeddings

Abstract

Sentence classification is important for data mining and information security. Recently, researchers have paid increasing attention to using conceptual knowledge to assist sentence classification. Most existing approaches enhance classification by retrieving word-related concepts from external knowledge bases and incorporating them into sentence representations. However, these approaches treat all concepts as equally important, which does not help distinguish sentence categories, and they may introduce noisy concepts that lower classification performance. To measure how important each concept is to the text, we propose the Concept Kernel Attention Network (CKAN). CKAN introduces concept information into a deep neural network and uses two attention mechanisms to assign weights to concepts: the text-to-concept attention mechanism (TCAM) and the entity-to-concept attention mechanism (ECAM). These mechanisms reduce the weight of noisy and contextually irrelevant concepts and assign more weight to concepts that are important for classification. In addition, we use the relevance between concepts and entities when encoding multi-word concepts, which reduces the impact of inaccurate multi-word concept representations on classification. We evaluated the model on five public text classification datasets. Comparison experiments against strong baselines and ablation experiments demonstrate the effectiveness of CKAN.
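
The idea of weighting concepts by two attention signals can be illustrated with a minimal sketch. It is not the CKAN implementation: the abstract does not specify the scoring functions, so dot-product scoring is assumed here, and the names `attention_weights`, `combine`, and the mixing parameter `gamma` are hypothetical choices made only for illustration. The toy vectors stand in for learned text, entity, and concept embeddings.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def attention_weights(query, concepts):
    """Attention of one query vector over concept vectors.

    Dot-product scoring is an assumption, not the paper's formula.
    """
    return softmax([dot(query, c) for c in concepts])


def combine(text_w, entity_w, gamma=0.5):
    """Hypothetical convex mix of text-to-concept and entity-to-concept weights."""
    mixed = [gamma * t + (1.0 - gamma) * e for t, e in zip(text_w, entity_w)]
    total = sum(mixed)
    return [m / total for m in mixed]


def weighted_concept_vector(weights, concepts):
    """Weighted sum of concept vectors, one value per embedding dimension."""
    dim = len(concepts[0])
    return [sum(w * c[i] for w, c in zip(weights, concepts)) for i in range(dim)]


# Toy embeddings (illustrative values only).
text_vec = [1.0, 0.0]
entity_vec = [0.8, 0.2]
concepts = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]

text_w = attention_weights(text_vec, concepts)      # text-to-concept weights
entity_w = attention_weights(entity_vec, concepts)  # entity-to-concept weights
final_w = combine(text_w, entity_w)
concept_repr = weighted_concept_vector(final_w, concepts)
```

In this toy setup the concept aligned with both the text and the entity receives the largest weight, while the concept orthogonal to the text receives the smallest, which mirrors the stated goal of down-weighting contextually irrelevant concepts.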
