A PERFORMANCE EVALUATION OF COMMUNITY DETECTION ALGORITHMS USING MODULARITY AND NMI ACROSS DIVERSE SOCIAL NETWORK DATASETS
Abstract
Community detection is crucial in the analysis of social networks. Its main goal is to find groups of users that are densely connected internally and sparsely connected to the rest of the network. The large number of available algorithms makes it difficult for researchers to choose the best one for a specific dataset. In this paper, we provide a thorough comparison of five community detection (CD) algorithms: Girvan-Newman (GN), Clauset-Newman-Moore (CNM), the Label Propagation Algorithm (LPA), Louvain, and Leiden. We evaluate them on real social network datasets, from small ones such as Zachary's Karate Club and the Dolphin network to larger ones such as Facebook, Twitter, and LinkedIn, along with citation networks, using two metrics: modularity and Normalized Mutual Information (NMI). NMI quantifies the agreement of the detected communities with the ground-truth communities, and modularity evaluates the overall quality of the partitions produced by each method. Our experiments show that greedy and modularity-optimization algorithms are particularly well suited to this task. Notably, the Leiden algorithm achieved a higher modularity value than Louvain on most of the datasets (for example, Q = 0.9141 versus Q = 0.9051 on the LinkedIn network). The NMI plot further shows that Clauset-Newman-Moore (CNM), Louvain, and the Label Propagation Algorithm (LPA) are in good agreement, both in the communities they detect and in their NMI scores. These results provide quantitative benchmarks that support a more informed choice among community detection algorithms for social network analysis.
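The two evaluation metrics named in the abstract can be illustrated with a minimal, dependency-free sketch. It is not the authors' evaluation code. It assumes an undirected, unweighted graph given as an edge list, the standard Newman modularity, and NMI normalized by the arithmetic mean of the two entropies (one common convention; the paper does not state which normalization it uses). The function names `modularity` and `nmi` and the toy graph are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import log


def modularity(edges, membership):
    """Newman modularity Q of a partition of an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping every node to its community label.
    """
    m = len(edges)
    intra = Counter()   # number of edges with both endpoints in community c
    degree = Counter()  # sum of node degrees in community c
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] += 1
        degree[cv] += 1
        if cu == cv:
            intra[cu] += 1
    # Q = sum over communities of (L_c / m) - (d_c / 2m)^2
    return sum(intra[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


def nmi(labels_a, labels_b):
    """Normalized mutual information between two labelings of the same nodes.

    Assumption: normalization by the arithmetic mean of the two entropies.
    """
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    h_a = -sum(c / n * log(c / n) for c in count_a.values())
    h_b = -sum(c / n * log(c / n) for c in count_b.values())
    mi = sum(
        c / n * log(c * n / (count_a[a] * count_b[b]))
        for (a, b), c in joint.items()
    )
    if h_a + h_b == 0:
        # Both labelings put every node in a single community.
        return 1.0
    return 2 * mi / (h_a + h_b)


# Toy graph (illustrative only): two triangles joined by one bridge edge.
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
toy_partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

q_toy = modularity(toy_edges, toy_partition)
nmi_toy = nmi([0, 0, 0, 1, 1, 1], [1, 1, 1, 0, 0, 0])
```

On this toy graph the two-triangle partition gives Q = 5/14, roughly 0.357, and the NMI between the two labelings is 1 because they describe the same grouping under different label names. For the large networks in the study, a library implementation would be the practical choice.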