High-Order Local Clustering on Hypergraphs
DOI:
https://doi.org/10.4108/eetsis.7431Keywords:
Local Clustering, Hypergraphs, ConductanceAbstract
Graphs are a commonly used model in data mining to represent complex relationships, with nodes representing entities and edges representing relationships. However, graphs have limitations in modeling high-order relationships. In contrast, hypergraphs offer a more versatile representation, allowing edges to join any number of nodes. This capability empowers hypergraphs to model multiple relationships and capture high-order information present in real-world applications. We focus on the problem of local clustering in hypergraphs, which computes a cluster near a given seed node. Although extensively explored in the context of graphs, this problem has received less attention for hypergraphs. Current methods often directly extend graph-based local clustering to hypergraphs, overlooking their inherent high-order features and resulting in low-quality local clusters. To address this, we propose an effective hypergraph local clustering model. This model introduces a novel conductance measurement that leverages the high-order properties of hypergraphs to assess cluster quality. Based on this new definition of hypergraph conductance, we propose a greedy algorithm to find local clusters in real time. Experimental evaluations and case studies on real-world datasets demonstrate the effectiveness of the proposed methods.
References
[1] Agarwal, S., Lim, J., Zelnik-Manor, L., Perona, P., Kriegman, D.J., Belongie, S.J.: Beyond pairwise clustering. In: 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society (2005)
[2] Arora, J., Tushir, M., Dadhwal, S.K.: A new suppressionbased possibilistic fuzzy c-means clustering algorithm. EAI Endorsed Transactions on Scalable Information Systems (2023)
[3] Backstrom, L., Huttenlocher, D., Kleinberg, J., Lan, X.: Group formation in large social networks: membership, growth, and evolution. In: Proceedings of the 12th ACM SIGKDD (2006)
[4] Chierichetti, F., Lattanzi, S., Panconesi, A.: Rumour spreading and graph conductance. In: Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms. SIAM (2010)
[5] Deng, N., Wang, Y., Huang, G., Zhou, Y., Li, Y.: Semantic coherence analysis of english texts based on sentence semantic graphs. EAI Endorsed Transactions on Scalable Information Systems (2023)
[6] Du, R., Drake, B.L., Park, H.: Hybrid clustering based on content and connection structure using joint nonnegative matrix factorization. J. Glob. Optim. (2019)
[7] Epasto, A., Feldman, J., Lattanzi, S., Leonardi, S., Mirrokni, V.: Reduce and aggregate: similarity ranking in multi-categorical bipartite graphs. In: Proceedings of the 23rd international conference onWorld wide web (2014)
[8] Gargi, U., Lu, W., Mirrokni, V., Yoon, S.: Large-scale community detection on youtube for topic discovery and exploration. In: Proceedings of the International AAAI Conference on Web and Social Media (2011)
[9] Gong, X., Wang, H., Wang, X., Chen, C., Zhang, W., Zhang, Y.: Influence maximization on hypergraphs via multi-hop influence estimation. Information Processing & Management (2024)
[10] Jeub, L.G., Balachandran, P., Porter, M.A., Mucha, P.J., Mahoney, M.W.: Think locally, act locally: Detection of small, medium-sized, and large communities in large networks. Physical Review E (2015)
[11] Jiang, B.B., Wang, J.G., Wang, Y., Xiao, J.: Gene prioritization for type 2 diabetes in tissue-specific protein interaction networks. Systems Biology (2009)
[12] Koutis, I., Miller, G.L.: Graph partitioning into isolated, high conductance clusters: Theory, computation and applications to preconditioning. In: Proceedings of the twentieth annual symposium on Parallelism in algorithms and architectures (2008)
[13] Lai, L., Qing, Z., Yang, Z., Jin, X., Lai, Z., Wang, R., Hao, K., Lin, X., Qin, L., Zhang, W., et al.: Distributed subgraph matching on timely dataflow. Proceedings of the VLDB Endowment 12(10) (2019)
[14] Lang, K., Rao, S.: A flow-based method for improving the expansion or conductance of graph cuts. In: International Conference on Integer Programming and Combinatorial Optimization. Springer (2004)
[15] Leskovec, J., Adamic, L.A., Huberman, B.A.: The dynamics of viral marketing. ACM Transactions on the Web (TWEB) 1(1), 5–es (2007)
[16] Li, J., He, J., Zhu, Y.: E-tail product return prediction via hypergraph-based local graph cut. In: SIGKDD. pp. 519–527 (2018)
[17] Li, Y., Yang, R., Shi, J.: Efficient and effective attributed hypergraph clustering via k-nearest neighbor augmentation. Proc. ACM Manag. Data 1(2) (2023)
[18] Lin, L., Li, R., Jia, T.: Scalable and effective conductancebased graph clustering. In: IAAI. pp. 4471–4478 (2023)
[19] Liu, B., Zhang, F., Zhang, W., Lin, X., Zhang, Y.: Efficient community search with size constraint. In: 37th IEEE International Conference on Data Engineering, ICDE 2021, Chania, Greece, April 19-22, 2021. IEEE (2021)
[20] Liu, K., Wang, S., Zhang, Y., Xing, C.: An efficient algorithm for distance-based structural graph clustering. Proc. ACM Manag. Data 1(1), 45:1–45:25 (2023)
[21] Luo, L., Fang, Y., Cao, X., Zhang, X., Zhang, W.: Detecting communities from heterogeneous graphs: A context path-based graph neural network model. In: Proceedings of the 30th ACM international conference on information & knowledge management (2021)
[22] Luo, Q., Yu, D., Cai, Z., Lin, X., Wang, G., Cheng, X.: Toward maintenance of hypercores in large-scale dynamic hypergraphs. The VLDB Journal 32(3) (2023)
[23] Luo, Q., Yu, D., Liu, Y., Zheng, Y., Cheng, X., Lin, X.: Finer-grained engagement in hypergraphs. In: (ICDE). IEEE (2023)
[24] Luo, Q., Zhang, W., Yang, Z., Wen, D., Wang, X., Yu, D., Lin, X.: Hierarchical structure construction on hypergraphs. In: Proceedings of the 33rd ACM International Conference on Information and Knowledge Management (2024)
[25] Nahi, H.A., Al-dolaimy, F., Abbas, F.H., Almohamadi, M., Hasan, M.A., Alkhafaji, M.A., Guneser, M.T.: A multi-objective optimization for enhancing the efficiency of service in flying ad-hoc network environment. EAI Endorsed Transactions on Scalable Information Systems (2023)
[26] Sarki, R., Ahmed, K., Wang, H., Zhang, Y., Wang, K.: Convolutional neural network for multi-class classification of diabetic eye disease. EAI Endorsed Transactions on Scalable Information Systems (2021)
[27] Singh, R., Subramani, S., Du, J., Zhang, Y., Wang, H., Miao, Y., Ahmed, K.: Antisocial behavior identification from twitter feeds using traditional machine learning algorithms and deep learning. EAI Endorsed Transactions on Scalable Information Systems (2023)
[28] Spielman, D.A., Teng, S.: A local clustering algorithm for massive graphs and its application to nearly linear time graph partitioning. SIAM J. Comput. (2013)
[29] Su, G., Wang, H., Zhang, Y., Coster, A.C., Wilkins, M., Canete, P.F., Yu, D., Yang, Y., zhang, w.: Inferring gene regulatory networks by hypergraph variational autoencoder. bioRxiv (2024)
[30] Takai, Y., Miyauchi, A., Ikeda, M., Yoshida, Y.: Hypergraph clustering based on pagerank. In: KDD. pp. 1970– 1978. ACM (2020)
[31] Voevodski, K., Teng, S.H., Xia, Y.: Spectral affinity in protein networks. BMC systems biology (2009)
[32] Yang, J., Leskovec, J.: Defining and evaluating network communities based on ground-truth. In: ACM SIGKDD Workshop (2012)
[33] Yang, R., Shi, J., Yang, Y., Huang, K., Zhang, S., Xiao, X.: Effective and scalable clustering on massive attributed graphs. In: WWW. ACM / IW3C2 (2021)
[34] Yang, Z., Zhang, W., Lin, X., Zhang, Y., Li, S.: Hgmatch: A match-by-hyperedge approach for subgraph matching on hypergraphs. In: (ICDE) (2023)
[35] Zhang, Y., Rohe, K.: Understanding regularized spectral clustering via graph conductance. Advances in Neural Information Processing Systems (2018)
[36] Zhou, Y., Cheng, H., Yu, J.X.: Clustering large attributed graphs: An efficient incremental approach. In: ICDM 2010, The 10th IEEE. IEEE (2010)
[37] Zhou, Z., Zhang, F., Lin, X., Zhang, W., Chen, C.: K-core maximization: An edge addition approach. In: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence, IJCAI 2019, Macao, China, August 10-16, 2019. ijcai.org (2019)
[38] Zien, J.Y., Schlag, M.D.F., Chan, P.K.: Multilevel spectral hypergraph partitioning with arbitrary vertex sizes. IEEE 18(9), 1389–1399 (1999)
Downloads
Published
How to Cite
Issue
Section
License
Copyright (c) 2024 Jingtian Wei, Zhengyi Yang, Qi Luo, Yu Zhang, Lu Qin, Wenjie Zhang
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
This is an open access article distributed under the terms of the CC BY-NC-SA 4.0, which permits copying, redistributing, remixing, transformation, and building upon the material in any medium so long as the original work is properly cited.