Deep dive into AWS for developers | Part 3 — RDS & ElastiCache

  • PostgreSQL
  • MySQL
  • MariaDB
  • Oracle
  • SQL Server
  • Amazon Aurora (AWS proprietary database) → Note that Aurora is not covered by the Free Tier.
  • AWS-managed RDS takes care of automatic provisioning of the database.
  • It also takes care of patching the underlying OS. Note that, as end-users, we cannot log in / SSH to the underlying instance at all, which is why it is termed a Managed service.
  • It takes care of continuous backups, which can be restored to a specific timestamp, i.e. Point-in-Time Recovery.
  • We get performance dashboards to view the performance of the database.
  • It supports horizontal scalability, i.e. Read Replicas for improved read performance. We can also scale AWS RDS vertically.
  • We can also set up a Multi-AZ deployment for a DR database.
  • Storage is backed by EBS, i.e. gp2 or io1 volumes.
  • Storage Auto Scaling is suitable for unpredictable workloads. RDS grows the storage automatically when all of the below conditions are met:
  • Free storage falls below 10% of the allocated storage.
  • The low-storage condition lasts for at least 5 minutes.
  • At least 6 hours have passed since the last storage modification.
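The three trigger conditions above can be expressed as a small predicate. The thresholds are the documented RDS Storage Auto Scaling triggers; the function itself is only an illustrative sketch, not an AWS API.

```python
def should_autoscale_storage(free_gib, allocated_gib,
                             minutes_low, hours_since_last_modification):
    """All three documented conditions must hold for RDS to grow the volume."""
    return (
        free_gib < 0.10 * allocated_gib            # <10% free storage left
        and minutes_low >= 5                       # low for at least 5 minutes
        and hours_since_last_modification >= 6     # 6h since last modification
    )

print(should_autoscale_storage(8, 100, 6, 7))   # → True
print(should_autoscale_storage(8, 100, 6, 2))   # → False (modified recently)
```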
  • Failover in case of a complete AZ failure / disaster.
  • Loss of network connectivity to the master database.
  • Storage failure of the RDS instance.
  • First, a snapshot is taken internally.
  • A new DB is restored from the recently taken snapshot in the new AZ.
  • Finally, synchronisation is established between the two databases.
  • At-rest data encryption :- This covers the security of data that is not in movement. We can encrypt the master data with AWS KMS (AES-256) encryption. Encryption has to be defined at launch time. Also, if the master is not encrypted, then the read replicas cannot be encrypted either.
  • In-flight data encryption :- This covers the security of data in movement, i.e. when data travels from applications/clients to the database. We use SSL certificates to encrypt the data in flight.
  • Our EC2 instance would have an IAM Role attached.
  • Using this IAM role, the EC2 instance issues an API call to the RDS service to get an “Auth Token”. This auth token is a short-lived credential (valid for 15 minutes).
  • Using this “Auth Token”, we then connect to the AWS RDS database instance. It is always recommended to encrypt the network in/out connection between the application and the DB using SSL.
  • One Aurora instance (the master) takes writes.
  • Automated failover for the master happens in less than 30 seconds.
  • There are 6 copies of your data across 3 Availability Zones, and storage is striped across a shared storage volume.
  • To make our application stateless, i.e. the state of the application can be stored in ElastiCache. Say a user logs into the application; the application can then write the session data to ElastiCache. Now, if another request from the same user lands on a different instance, that user’s session data can be retrieved from ElastiCache, and if the user’s data is found there, the user can be deemed already logged in. That is how we make our application stateless. Below is how the architecture for this looks :-
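The session-store idea above can be sketched as follows. A plain dict stands in for the ElastiCache (Redis) client, and the function names are illustrative; in a real deployment you would call `redis.Redis(...).setex` / `.get` against the cluster endpoint instead.

```python
import time
import uuid

# Stand-in for ElastiCache: any app instance can read/write this store.
session_store = {}

def login(user_id):
    """App instance A: authenticate and persist the session externally."""
    session_id = str(uuid.uuid4())
    session_store[session_id] = {"user": user_id, "issued_at": time.time()}
    return session_id  # returned to the client, e.g. as a cookie

def handle_request(session_id):
    """App instance B: any instance can validate the same session."""
    session = session_store.get(session_id)
    if session is None:
        return "please log in"
    return f"hello, {session['user']}"

sid = login("alice")
print(handle_request(sid))         # → hello, alice
print(handle_request("bogus-id"))  # → please log in
```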
  • To reduce the load on the database for read-intensive workloads. The idea is that common queries get cached, so the database is not queried again; results are served directly from the cache. Along with this, ElastiCache should also have an invalidation strategy to make sure that only the most recent data lives inside the cache. Below is how the architecture for this looks :-
  • MEMCACHED ElastiCache → It supports multi-node clusters for partitioning of data, i.e. it provides data sharding. In this mode there is no replication and hence no high availability. There are no backup and restore features, and it is not a persistent cache. It has a multi-threaded architecture.
  • REDIS ElastiCache → Just like the RDS database, it supports high availability and read replicas to scale reads. It supports Multi-AZ with auto-failover. It also provides data durability using AOF persistence, and therefore Redis can also be used as a database. There are also backup and restore features. Redis can additionally be used as a pub/sub message broker.
  • Cluster mode disabled :- In this mode there is a single shard, and all nodes are part of this single shard. Inside this shard we have ONE master and up to 5 replica nodes. In case of failover of the master node, one of the replicas can take over. Replication from master to replica nodes happens asynchronously. The primary node is used for read & write operations, whereas the replica nodes are used for read operations only. Here, all nodes in a cluster have all the data at all times, which safeguards against data loss in case of any node failure. It is quite helpful for scaling read operations. It also supports a Multi-AZ setup.
  • Cluster mode enabled :- In this mode, data is partitioned across multiple shards. Each shard has a primary node and up to 5 replica nodes. It also supports a Multi-AZ setup. It is quite helpful for scaling writes. We can have up to 500 nodes per cluster. For example, if we set up no replica nodes, then up to 500 shards are possible, each with a single master. Similarly, if we set up each shard with 1 master and 1 replica, then at most 250 shards are possible in total.
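The node arithmetic in the bullet above can be checked with a one-liner (the 500-node cap comes from the bullet itself; the helper function is just illustrative):

```python
MAX_NODES_PER_CLUSTER = 500  # cluster-mode-enabled node limit from above

def max_shards(replicas_per_shard):
    """Each shard = 1 primary + N replicas; the node cap bounds the shard count."""
    return MAX_NODES_PER_CLUSTER // (1 + replicas_per_shard)

print(max_shards(0))  # → 500 (primaries only)
print(max_shards(1))  # → 250 (1 primary + 1 replica per shard)
print(max_shards(5))  # → 83  (1 primary + 5 replicas per shard)
```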
  • Usually it is safe to cache the data, but it may sometimes be out of date, so the cache is eventually consistent.
  • Caching works well when data changes slowly and a few keys are needed frequently; but if the data changes too frequently and the whole large key space is needed frequently, then caching is considered an anti-pattern.
  • It is advisable to use caching when the data to be stored has an appropriate structure, for example key-value caching or caching of aggregation results.
  • Whenever the application needs some data, it queries the cache first; if the cache has the requested data, this is called a “Cache Hit”.
  • If the data is not present in the cache, it is a “Cache Miss”. In this scenario, the application queries the database and then writes the data back to the cache, so that subsequent requests (from the application) can find the data in the cache. Note that a “Cache Miss” carries a penalty on the request, since 3 round trips are involved in total. This can mean a worse user experience and additional latency for users.
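This lazy-loading (cache-aside) pattern, including the 3-round-trip miss penalty, can be sketched as below. Dicts stand in for the ElastiCache client and the database, and the round-trip counter is only there to make the penalty visible.

```python
cache = {}
database = {"user:1": "alice"}  # toy stand-in for RDS
round_trips = 0

def db_read(key):
    global round_trips
    round_trips += 1                 # trip 2: app -> database
    return database.get(key)

def get(key):
    """Lazy loading (cache-aside): read the cache first, fall back to the DB."""
    global round_trips
    round_trips += 1                 # trip 1: app -> cache
    if key in cache:
        return cache[key]            # cache hit: one round trip in total
    value = db_read(key)             # cache miss: go to the database
    round_trips += 1                 # trip 3: app -> cache (populate it)
    cache[key] = value
    return value

get("user:1")       # miss: 3 round trips
print(round_trips)  # → 3
get("user:1")       # hit: 1 round trip
print(round_trips)  # → 4
```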
  • Whenever the application sends a write request to the database, the same data is also written to ElastiCache.
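This write-through pattern is sketched below; because every database write also updates the cache, reads never see stale data for keys written this way. As before, dicts stand in for the database and the cache.

```python
cache = {}
database = {}

def write(key, value):
    """Write-through: the write goes to the DB and the cache together."""
    database[key] = value  # 1) write to the database
    cache[key] = value     # 2) write the same value through to the cache

def read(key):
    return cache[key] if key in cache else database.get(key)

write("user:1", "alice")
print(read("user:1"))   # → alice (served from the already-warm cache)
```

The trade-off versus lazy loading is that every write costs two operations, and data that is never read still occupies cache memory.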
  • We delete items explicitly from the cache.
  • Items can be evicted because memory is full; per the LRU principle, the least recently used entries are deleted first.
  • Items can be deleted because a TTL counter is set on the entry. TTL can range from a few seconds to hours. TTL is helpful for data such as comments, leaderboards, activity streams, etc.
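The two automatic eviction paths above (LRU when memory is full, and per-entry TTL expiry) can be sketched with a toy cache. This is purely illustrative; Redis and Memcached implement these policies internally.

```python
import time
from collections import OrderedDict

class TinyCache:
    """Toy cache showing LRU eviction on overflow and TTL expiry on read."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = OrderedDict()  # key -> (value, expires_at)

    def set(self, key, value, ttl=None):
        expires_at = time.monotonic() + ttl if ttl else None
        self.data[key] = (value, expires_at)
        self.data.move_to_end(key)
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)  # LRU: drop least recently used

    def get(self, key):
        if key not in self.data:
            return None
        value, expires_at = self.data[key]
        if expires_at is not None and time.monotonic() >= expires_at:
            del self.data[key]             # TTL: entry has expired
            return None
        self.data.move_to_end(key)         # mark as recently used
        return value

c = TinyCache(capacity=2)
c.set("a", 1)
c.set("b", 2)
c.set("c", 3)          # over capacity -> "a" (least recently used) evicted
print(c.get("a"))      # → None
print(c.get("b"))      # → 2
c.set("d", 4, ttl=0.01)
time.sleep(0.02)
print(c.get("d"))      # → None (expired via TTL)
```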
