High Availability with Redis Cluster

Welcome to this blog. If you are coming here directly, it is highly recommended that you read through this story first. We shall be looking at the following topic in this blog :-

  • High-Availability with Redis Cluster.

Question:- How does Redis-Cluster provide High-Availability ?

Answer:- High availability refers to the cluster’s ability to remain operational even in the face of certain failures. For example, the cluster can detect when a primary shard fails and promote a replica to a primary without any manual intervention from the outside.

Question:- How does Redis-Cluster provide Automatic-Failover ?

Answer:- Redis-Cluster can quickly detect when a primary shard has failed, and it can then promote one of that shard's replicas to be the new primary.

  • Say we have one replica for every primary shard. If all our data is divided between three Redis servers, we would need a six-node cluster, with three primary shards and three replicas.
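The node count above is simple arithmetic, and a minimal sketch makes it easy to try other layouts. The function name `cluster_size` is hypothetical and used only for illustration; it is not part of any Redis tooling.

```python
def cluster_size(primaries: int, replicas_per_primary: int) -> int:
    """Total nodes: every primary shard plus its replicas."""
    return primaries * (1 + replicas_per_primary)


# Three primary shards with one replica each, as in the example above.
print(cluster_size(3, 1))  # 6
```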

Question:- Demonstrate how the Split-Brain situation can happen with Redis-Cluster ?

Answer:- Here is how the Split-Brain situation can arise :-

Step #1.) Imagine that we have a Redis-Cluster with THREE primary shards and one replica for every primary shard. Overall, our Redis cluster is a six-node cluster, with three primary shards and three replicas. Further imagine that a network partition has happened, i.e. the shards in the group on the left side are not able to talk to the shards in the group on the right side.

  • Now, each group will think that the nodes on the other side are offline, and both will trigger a fail-over for the primary shards they can no longer reach. As a result, the left side ends up with a full set of primary shards, and so does the right side.

Step #2.) Both sides, thinking they have all the primaries, will continue to accept client requests that modify data. That is a problem, because client A may set the key foo to bar on the left side, while client B sets the same key to baz on the right side.

Step #3.) When the network partition is removed and the shards try to rejoin, we will have a conflict: two shards hold different data, both claim to be the primary, and we have no way of knowing which data is valid. This is called a split-brain situation, and it is a very common issue in the world of distributed systems.
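The three steps can be mimicked with a toy model. This is only a sketch: each side of the partition is represented by a plain Python dict standing in for a primary shard, and the helper names `write` and `find_conflicts` are hypothetical. No real Redis calls are involved.

```python
# Toy model: each side of the partition believes it owns the primary
# for the same key, so both accept writes independently.
left_side = {}
right_side = {}


def write(side: dict, key: str, value: str) -> None:
    """Accept a client write on one side of the partition."""
    side[key] = value


def find_conflicts(a: dict, b: dict) -> dict:
    """Return keys that exist on both sides with different values."""
    return {k: (a[k], b[k]) for k in a.keys() & b.keys() if a[k] != b[k]}


write(left_side, "foo", "bar")   # client A talks to the left side
write(right_side, "foo", "baz")  # client B talks to the right side

# When the partition heals, the two "primaries" disagree about foo.
conflicts = find_conflicts(left_side, right_side)
print(conflicts)  # {'foo': ('bar', 'baz')}
```

Nothing in the data itself says which of the two values should win, which is exactly the conflict described in Step #3.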

Question:- What’s the solution to fix the Split-Brain situation ?

Answer:- Maintain an odd number of primary shards and two replicas per primary shard. Here is the detailed solution to this problem :-

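  • To prevent a split-brain situation in a Redis cluster, always keep an odd number of primary shards in your cluster. With an odd number, a split into two groups always leaves exactly one group holding a majority of the primary shards.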

Let's take the cluster below :-

Now, imagine a network split happens like this :-

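  • Here, the left-side group (set of nodes) is in the minority, and therefore it shall NOT trigger a fail-over and shall STOP accepting client write requests. The group on the other side can still reach a majority of the primary shards, so only that group is allowed to promote a replica and keep serving writes.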
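The majority rule can be sketched in a few lines. This is a simplified illustration, not how Redis implements the check internally: it only counts how many primary shards a group can reach, and the function names `has_majority` and `side_behaviour` are hypothetical.

```python
def has_majority(reachable_primaries: int, total_primaries: int) -> bool:
    """A group holds a majority if it reaches more than half of all primaries."""
    return reachable_primaries > total_primaries // 2


def side_behaviour(reachable_primaries: int, total_primaries: int) -> str:
    """Describe what a group of nodes may do during a partition."""
    if has_majority(reachable_primaries, total_primaries):
        return "majority: may fail over and keep accepting writes"
    return "minority: no fail-over, stops accepting writes"


# Three primaries split 1 / 2: exactly one side keeps working.
print(side_behaviour(1, 3))
print(side_behaviour(2, 3))

# Four primaries split 2 / 2: neither side has a majority,
# which is why an odd number of primary shards is recommended.
print(has_majority(2, 4))  # False
```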

That's all for this section. If you liked reading this blog, kindly press the clap button to show your appreciation. We will see you in the next part of this series, Hands-On with Redis-Cluster.


Software Engineer for Big Data distributed systems
