TY - JOUR
T1 - Community-based replica management in distributed systems
AU - Nosrati, Masoud
AU - Fazlali, Mahmood
N1 - © 2018, Emerald Publishing Limited.
PY - 2018
Y1 - 2018
N2 - Purpose: One of the techniques for improving the performance of distributed systems is data replication, wherein new replicas are created to provide greater accessibility and fault tolerance and to lower the access cost of the data. In this paper, the authors propose a community-based solution for the management of data replication, based on a graph model of communication latency between computing and storage nodes. Communities are clusters of nodes within which the communication latency between nodes is minimal. The purpose of this study is to use this method to minimize the latency and access cost of the data. Design/methodology/approach: This paper used the Louvain algorithm to find the best communities. In the proposed algorithm, when a node in a community requested a file, the cost of accessing the file located outside the applicant’s community was calculated and the results were accumulated. When the accumulated cost exceeded a specified threshold, a new replica of the file was created in the applicant’s community. In addition, the number of replicas of each file should be limited to prevent the system from creating useless and redundant data. Findings: To evaluate the method, four metrics were introduced and measured: communication latency, response time, data access cost and data redundancy. The results indicated acceptable improvement in all four. Originality/value: So far, this is the first research that aims to manage replicas via community detection algorithms. It opens many opportunities for further studies in this area.
AB - Purpose: One of the techniques for improving the performance of distributed systems is data replication, wherein new replicas are created to provide greater accessibility and fault tolerance and to lower the access cost of the data. In this paper, the authors propose a community-based solution for the management of data replication, based on a graph model of communication latency between computing and storage nodes. Communities are clusters of nodes within which the communication latency between nodes is minimal. The purpose of this study is to use this method to minimize the latency and access cost of the data. Design/methodology/approach: This paper used the Louvain algorithm to find the best communities. In the proposed algorithm, when a node in a community requested a file, the cost of accessing the file located outside the applicant’s community was calculated and the results were accumulated. When the accumulated cost exceeded a specified threshold, a new replica of the file was created in the applicant’s community. In addition, the number of replicas of each file should be limited to prevent the system from creating useless and redundant data. Findings: To evaluate the method, four metrics were introduced and measured: communication latency, response time, data access cost and data redundancy. The results indicated acceptable improvement in all four. Originality/value: So far, this is the first research that aims to manage replicas via community detection algorithms. It opens many opportunities for further studies in this area.
KW - Community detection
KW - Distributed system
KW - Replication
KW - Resource management
UR - http://www.scopus.com/inward/record.url?scp=85046729357&partnerID=8YFLogxK
U2 - 10.1108/IJWIS-01-2017-0006
DO - 10.1108/IJWIS-01-2017-0006
M3 - Article
AN - SCOPUS:85046729357
SN - 1744-0084
VL - 14
SP - 41
EP - 61
JO - International Journal of Web Information Systems
JF - International Journal of Web Information Systems
IS - 1
ER -