Capturing Racial & Gender Inequities on Social Media Platforms using Machine Learning


  • Sonika Malik, Department of Information Technology, Maharaja Surajmal Institute of Technology, Delhi, India
  • Harshita Chopra, Department of Information Technology, Maharaja Surajmal Institute of Technology, Delhi, India
  • Aniket Vashishtha, Department of Information Technology, Maharaja Surajmal Institute of Technology, Delhi, India



Social Media Analytics, Aspect Extraction, Machine Learning, Natural Language Processing


Online social media platforms provide a continuously evolving body of data, driven by their growing popularity and rapidly expanding user base. On these platforms, users anonymously share their experiences of inequity faced at the workplace on the basis of race or gender. We aim to use popular social media platforms to perform extensive analysis and classification of posts that capture instances of the various types of inequality prevalent in today’s workplace. We present a framework to mine opinions expressed about sexual harassment, mental health, racial injustice and gender-based bias in the corporate workplace using NLP techniques on social media data. Documents are represented by their semantic similarity to aspect embeddings captured using an attention-based framework for aspect extraction. In addition, we use scores from Empath categories to add information related to emotional facets.
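The document representation described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the aspect embeddings have already been learned, replaces the attention-based sentence encoding with a plain average of word vectors, and uses a tiny hand-written lexicon (`category_lexicon`, hypothetical) as a stand-in for Empath category scores. All names and toy values are illustrative.

```python
import numpy as np

def cosine_similarities(doc_vec, aspect_matrix):
    """Cosine similarity between one document vector and each aspect embedding (one per row)."""
    doc_norm = np.linalg.norm(doc_vec)
    aspect_norms = np.linalg.norm(aspect_matrix, axis=1)
    denom = np.maximum(doc_norm * aspect_norms, 1e-12)  # guard against division by zero
    return aspect_matrix @ doc_vec / denom

def document_features(tokens, word_vectors, aspect_matrix, category_lexicon):
    """Concatenate aspect similarities with normalised lexicon-category counts."""
    # Assumption: the document vector is an unweighted mean of word vectors,
    # a simplification of an attention-weighted encoding.
    vectors = [word_vectors[t] for t in tokens if t in word_vectors]
    dim = aspect_matrix.shape[1]
    doc_vec = np.mean(vectors, axis=0) if vectors else np.zeros(dim)
    sims = cosine_similarities(doc_vec, aspect_matrix)

    # Hypothetical stand-in for Empath: fraction of tokens falling in each category.
    n_tokens = max(len(tokens), 1)
    category_scores = np.array([
        sum(t in words for t in tokens) / n_tokens
        for words in category_lexicon.values()
    ])
    return np.concatenate([sims, category_scores])

# Toy inputs, invented purely for illustration.
word_vectors = {"boss": np.array([1.0, 0.0]), "unfair": np.array([0.0, 1.0])}
aspect_matrix = np.array([[1.0, 0.0],   # hypothetical aspect 0
                          [0.0, 1.0]])  # hypothetical aspect 1
category_lexicon = {"anger": {"unfair"}, "work": {"boss"}}

features = document_features(["boss", "unfair", "boss"],
                             word_vectors, aspect_matrix, category_lexicon)
print(features)
```

The resulting vector (two aspect similarities followed by two category scores) could then be passed to any standard classifier; which classifier is used is left open here.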






How to Cite

Malik S, Chopra H, Vashishtha A. Capturing Racial & Gender Inequities on Social Media Platforms using Machine Learning. EAI Endorsed Trans Creat Tech [Internet]. 2022 Jul. 6 [cited 2024 Apr. 18];9(31):e4. Available from: