Practical and fast causal consistent partial geo-replication

Pedro Fouto, João Leitão, Nuno Preguiça

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

1 Citation (Scopus)

Abstract

Distributed storage systems are a fundamental component of large-scale Internet services. To keep up with the increasing expectations of users regarding availability and latency, the design of data storage systems has evolved to achieve these properties by exploiting techniques such as partial replication, geo-replication, and weaker consistency models. Combining all these techniques in a single solution in a practical and efficient way is highly challenging. In this paper, we propose a novel replication scheme that can offer causal+ consistency in a geo-distributed scenario with partial replication, where datacenters replicate different portions of the entire database. We leverage a recently proposed methodology that decouples the propagation of data from that of causality-tracking metadata. Our solution presents a novel algorithm for tracking and enforcing causal consistency, focusing on maximizing parallelism in the execution of remote operations which, as we show, has a significant influence on the performance of a partially replicated system. We also propose and implement a design to integrate our solution into the popular Cassandra database. Experimental results show that, by exploring a new position in the trade-off between throughput and data visibility (by balancing the execution of local and remote operations, respectively), our solution presents overall good performance.

Original language: English
Title of host publication: NCA 2018 - 2018 IEEE 17th International Symposium on Network Computing and Applications
Publisher: Institute of Electrical and Electronics Engineers Inc.
ISBN (Electronic): 9781538676592
DOI: https://doi.org/10.1109/NCA.2018.8548067
Publication status: Published - 26 Nov 2018
Event: 17th IEEE International Symposium on Network Computing and Applications, NCA 2018 - Cambridge, United States
Duration: 1 Nov 2018 to 3 Nov 2018

Conference

Conference: 17th IEEE International Symposium on Network Computing and Applications, NCA 2018
Country: United States
City: Cambridge
Period: 1/11/18 to 3/11/18

Fingerprint

Metadata
Visibility
Throughput
Availability
Internet
Data storage equipment

Keywords

  • Causal Consistency
  • Geo-Replication
  • Partial Replication

Cite this

Fouto, P., Leitão, J., & Preguiça, N. (2018). Practical and fast causal consistent partial geo-replication. In NCA 2018 - 2018 IEEE 17th International Symposium on Network Computing and Applications [8548067]. Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/NCA.2018.8548067
@inproceedings{5bca97b29038441c915f84bf43c025ed,
title = "Practical and fast causal consistent partial geo-replication",
keywords = "Causal Consistency, Geo-Replication, Partial Replication",
author = "Pedro Fouto and Jo{\~a}o Leit{\~a}o and Nuno Pregui{\c c}a",
note = "info:eu-repo/grantAgreement/EC/H2020/732505/EU# info:eu-repo/grantAgreement/FCT/5876/147279/PT#",
year = "2018",
month = "11",
day = "26",
doi = "10.1109/NCA.2018.8548067",
language = "English",
booktitle = "NCA 2018 - 2018 IEEE 17th International Symposium on Network Computing and Applications",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
address = "United States",

}

TY - GEN

T1 - Practical and fast causal consistent partial geo-replication

AU - Fouto, Pedro

AU - Leitão, João

AU - Preguiça, Nuno

N1 - info:eu-repo/grantAgreement/EC/H2020/732505/EU# info:eu-repo/grantAgreement/FCT/5876/147279/PT#

PY - 2018/11/26

Y1 - 2018/11/26

KW - Causal Consistency

KW - Geo-Replication

KW - Partial Replication

UR - http://www.scopus.com/inward/record.url?scp=85059985366&partnerID=8YFLogxK

U2 - 10.1109/NCA.2018.8548067

DO - 10.1109/NCA.2018.8548067

M3 - Conference contribution

BT - NCA 2018 - 2018 IEEE 17th International Symposium on Network Computing and Applications

PB - Institute of Electrical and Electronics Engineers Inc.

ER -
