Truculent Post Analysis for Hindi Text

Authors

  • Mitali Agarwal, University of Petroleum and Energy Studies
  • Poorvi Sahu, University of Petroleum and Energy Studies
  • Nisha Singh, University of Petroleum and Energy Studies
  • Jasleen, University of Petroleum and Energy Studies
  • Puneet Sinha, Bajaj Finserv
  • Rahul Kumar Singh, University of Petroleum and Energy Studies

DOI:

https://doi.org/10.4108/eetsis.5641

Keywords:

Truculent Post, Hindi language, Sentiment Analysis, BERT, LSTM, NLP

Abstract

INTRODUCTION: With the rise of social media platforms, the prevalence of truculent posts has become a major concern. These posts, which exhibit anger, aggression, or rudeness, not only foster a hostile environment but can also incite harm and violence.

OBJECTIVES: Efficient algorithms for detecting truculent posts are essential so that such content can be recognised and removed from social media sites automatically. To improve accuracy and efficiency, this study evaluates state-of-the-art truculent post detection techniques and proposes a method that combines deep learning and natural language processing. The main goal of the proposed methodology is to regulate hostile social media posts effectively by monitoring them.

METHODS: To identify the class labels accurately and build a deep-learning method, we concentrated on modelling negation words, sarcasm, and irony with an LSTM model. We used multilingual BERT to produce precise word embeddings that carry semantic information. The sentences were tokenized with the Indic NLP library, which takes the characteristics of the Hindi language into account.
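As a rough illustration of the tokenization step only, the sketch below shows what Hindi-aware tokenization has to handle: Devanagari vowel signs and other combining marks stay attached to their word, and the danda (।) is split off as punctuation. It is a simplified regular-expression stand-in written for this example, not the Indic NLP library's tokenizer and not the authors' code; the name tokenize_hindi and the sample sentence are hypothetical.

```python
import re

# Simplified stand-in tokenizer (an assumption for illustration; the paper
# uses the Indic NLP library, whose tokenizer is not reproduced here).
TOKEN_RE = re.compile(
    r"[\u0900-\u0963\u0966-\u097F]+"  # Devanagari letters, vowel signs, digits
    r"|[A-Za-z0-9]+"                  # Latin-script words and numbers
    r"|\S"                            # any other single symbol, e.g. the danda
)


def tokenize_hindi(text):
    """Split text into word and punctuation tokens.

    The danda (U+0964) and double danda (U+0965) are excluded from the
    Devanagari word range so that they come out as separate tokens.
    """
    return TOKEN_RE.findall(text)


# Hypothetical sample sentence containing the negation word "नहीं".
tokens = tokenize_hindi("यह अच्छा नहीं है।")
print(tokens)
```

Keeping a negation word such as "नहीं" as its own token matters for the downstream sequence model, which can only learn its effect if it is not merged with or stripped from its neighbours.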

RESULTS: The proposed approach yields the following per-class F1 scores: 84.22 for non-hostile, 49.26 for hostile, 68.69 for hatred, 49.81 for fake, and 39.92 for offensive.
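For readers unfamiliar with the metric, a per-class F1 score is the harmonic mean of precision and recall computed separately for each label. The minimal sketch below assumes a single-label setting and a 0 to 100 scale to match the figures reported here; the function name f1_per_class and the toy labels are hypothetical and are not the paper's data or evaluation code.

```python
def f1_per_class(y_true, y_pred, labels):
    """Return {label: F1 on a 0-100 scale} for single-label predictions."""
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never occurs.
        scores[label] = 100.0 * 2 * tp / denom if denom else 0.0
    return scores


# Toy example (made-up labels, not results from the paper).
y_true = ["non-hostile", "non-hostile", "fake", "fake", "offensive"]
y_pred = ["non-hostile", "fake", "fake", "fake", "non-hostile"]
scores = f1_per_class(y_true, y_pred, ["non-hostile", "fake", "offensive"])
print(scores)
```

Because F1 is computed per label, a frequent class such as non-hostile can score much higher than a rare one, which is consistent with the spread between the classes reported above.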

CONCLUSION: We focused on modelling negation words, sarcasm, and irony with the LSTM model in order to predict the class labels accurately and build a deep-learning strategy for truculent post detection.

Published

04-04-2024

How to Cite

1. Agarwal M, Sahu P, Singh N, Jasleen, Sinha P, Singh RK. Truculent Post Analysis for Hindi Text. EAI Endorsed Scal Inf Syst [Internet]. 2024 Apr. 4 [cited 2024 May 3]. Available from: https://publications.eai.eu/index.php/sis/article/view/5641

Section

Short communications