g-Sum: A Graph Summarization Approach for a Single Large Social Network
DOI:
https://doi.org/10.4108/eai.23-3-2021.169073Keywords:
Social Network, Social Graph, Graph Data, Graph Mining, Graph Summarization, Graph Partition, Reconstruction Error, Big GraphAbstract
With the advances in computing resources, the processing of huge amount of data becomes possible. However, the human ability to identify patterns from such data has not scaled accordingly. Consequently, efficient computational approaches for condensing and simplifying data are becoming essential for uncovering actionable insights, especially the big social networks. Many large datasets analyzed today are represented with the help of graphs. In many real-world applications, summarizing large graphs is beneficial to achieve a number of advantages, including 1) significant speedup for graph algorithms, 2) graph storage space reduction, 3) faster network transmission, 4) improved data privacy, 5) more effective graph visualization, etc. All these benefits can be obtained from the summarized graph, if it is summarized accurately. The quality of the summary graph is mostly measured using the Reconstruction Error (RE). In the existing literature, RE is a very challenging task. Most of the approaches investigated so farending up with high value of RE. Hence, the summarized graph does not express the original graph completely due to the high value of the RE. Therefore, in this work we have examined how can we summarize a big graph structure into a compact summary by reducing its RE and ensuring its usefulness. For this task, we have proposed a novel graph summarization approach, called g-Sum, to calculate the graph structure summaries while minimizing RE and compare to the existing work in the domain. The functionality of the g-Sum is decomposed into three different, but interlinked phases; (1)- graph dataset pre-processing; (2)- graph partitioning; and (3)- graph summarization. We have performed an extensive series of experiments on different real-world social graph dataset, including Ego-Facebook, Enron Email and Web-Stanford graph datasets. The computed results show the effectiveness of the g-Sum and also compared with the state-of-the-art graph summarization approaches, S2L and GraSS.
Downloads
Published
How to Cite
Issue
Section
License
Copyright (c) 2022 EAI Endorsed Transactions on Scalable Information Systems
This work is licensed under a Creative Commons Attribution 3.0 Unported License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.