Novel Semantic Relatedness Computation for Multi-Domain Unstructured Data

Rafeeq Ahmed; Pradeep Kumar Singh; Tanvir Ahmad

doi:10.4108/eai.13-7-2018.165503

Authors

Rafeeq Ahmed, Jamia Millia Islamia
Pradeep Kumar Singh, KNIT Sultanpur
Tanvir Ahmad, Jamia Millia Islamia

Keywords:

Text Mining, Semantic Similarity, Concept Extraction

Abstract

Semantic relatedness computation is a fundamental and essential step in fields such as Information Retrieval, Natural Language Processing, and the Semantic Web. Many techniques have been proposed for computing semantic relatedness within a single domain. However, these techniques give inappropriate results on massive multidomain datasets because they report relations between concepts from different domains that are not actually related to each other; the similarity of such cross-domain concepts should be minimized. This paper proposes a novel method, "modified Balanced Mutual Information" (MBMI), to calculate the semantic relatedness of multidomain data. In the proposed method, concepts are first extracted from a given corpus and a fuzzy vector is then constructed, from which semantic relatedness is obtained. The proposed method is compared with other existing methods on a dataset of medical and computer science articles, and it shows better results for multidomain data.

Published

30-06-2020

How to Cite

Ahmed R, Kumar Singh P, Ahmad T. Novel Semantic Relatedness Computation for Multi-Domain Unstructured Data. EAI Endorsed Trans Energy Web [Internet]. 2020 Jun. 30 [cited 2025 Oct. 13];8(31):e5. Available from: https://publications.eai.eu/index.php/ew/article/view/834

Issue

Vol. 8 No. 31 (2021): EAI Endorsed Transactions on Energy Web

Section

Research articles

License

This work is licensed under a Creative Commons Attribution 3.0 Unported License.

This is an open-access article distributed under the terms of the Creative Commons Attribution CC BY 4.0 license, which permits unlimited use, distribution, and reproduction in any medium so long as the original work is properly cited.