Accelerating Louvain community detection algorithm on graphic processing unit

Maryam Mohammadi, Mahmood Fazlali, Mehdi Hosseinzadeh

Research output: Contribution to journalArticlepeer-review

10 Citations (Scopus)


The Louvain community detection algorithm is a hierarchal clustering method categorized in the NP-hard problem. Its execution time to find communities in large graphs is, therefore, a challenge. Parallelization is an effective solution for amortizing Louvain's execution time. In this paper, we propose an adaptive CUDA Louvain method (ACLM) algorithm that benefits from the graphic processing unit (GPU). ACLM uses the shared memory in GPU, as well as the optimal number of threads in the GPU blocks. These features minimize parallelization overhead and accelerate the calculation of modularity parameters. The proposed algorithm allocates threads to each block based on the number of required streaming multiprocessors (SMs) and warps on GPU. The implementation results show that ACLM can effectively accelerate the execution time by 77% compared to the competitive method in the large graph benchmarks.

Original languageEnglish
Pages (from-to)6056-6077
Number of pages22
JournalJournal of Supercomputing
Issue number6
Publication statusPublished - Jun 2021


  • CUDA
  • Louvain community detection
  • Modularity
  • Parallel processing


Dive into the research topics of 'Accelerating Louvain community detection algorithm on graphic processing unit'. Together they form a unique fingerprint.

Cite this