Deep dive into AWS for developers | Part 2 — Scalability

  • Vertical-Scalability → It means increasing the size of a given instance. For example, say we have a system with a 1 GHz CPU and 2 GB RAM that is able to handle a load of 50 TPS; if we increase its capacity to a 2 GHz CPU and 4 GB RAM, it might be able to handle a load of 80 TPS. There is always a limit to how much we can vertically scale a particular application, depending on the hardware limit. Analogous to this: say we have a “t2.micro” instance currently and we replace it with a “t2.large” instance — that is termed vertical scaling. We use vertical scaling when we have non-distributed systems, like a database server. AWS services like RDS and ElastiCache can be vertically scaled by upgrading the underlying instance types. As of this writing, AWS instances go as small as “t2.nano” with 0.5 GB of RAM and 1 vCPU, and as large as “u-12tb1.metal” with 12.3 TB of RAM and 448 vCPUs.
  • Horizontal-Scalability → It means increasing the number of instances for our application. For example, say we have a system with a 1 GHz CPU and 2 GB RAM that is able to handle a load of 50 TPS; if we add another instance of the same capacity, the pair might be able to handle a load of around 100 TPS. Analogous to this: say we have a “t2.micro” instance currently and we add another identical “t2.micro” instance — that is termed horizontal scaling. Generally, horizontal scaling implies that we have a distributed system in place. This is very usual for web applications, but not every application or software system is a distributed system. It’s very easy to scale horizontally using EC2 instances, and this is exactly what Auto Scaling Groups and Load Balancers build upon.
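As a back-of-the-envelope illustration, the difference between the two approaches can be sketched in a few lines (the TPS figures are the hypothetical ones from the bullets above, not real benchmarks):

```python
# Hypothetical capacity figures from the discussion above -- not benchmarks.
VERTICAL_UPGRADED_TPS = 80   # one bigger instance (e.g. t2.micro -> t2.large)
BASE_INSTANCE_TPS = 50       # one baseline instance

def horizontal_capacity(num_instances: int, tps_per_instance: int = BASE_INSTANCE_TPS) -> int:
    """Horizontal scaling: total capacity grows roughly linearly with instance count."""
    return num_instances * tps_per_instance

# Two baseline instances already exceed the single vertically-upgraded instance,
# and unlike vertical scaling there is no hardware ceiling on adding more.
print(horizontal_capacity(2))  # 100 TPS from two baseline instances
```

The "roughly linear" growth is of course a simplification; real systems lose some capacity to coordination overhead.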
  • High-Availability can be in passive mode, meaning that the application in another data-center only gets activated if the primary application goes down. e.g. an RDS Multi-AZ setup.
  • High-Availability can be in active mode, meaning that the application has been horizontally scaled and all of the instances are receiving live traffic.
  • LBs help in spreading the load across multiple downstream EC2 instances, in a round-robin fashion.
  • LBs are very helpful in exposing a single entry access point (through DNS) for the application.
  • LBs also provide SSL termination (HTTPS) for connections to our sites/software.
  • LBs can also enforce stickiness with the help of cookies.
  • LBs provide high availability across multiple Availability Zones, i.e. an LB can be spread across multiple AZs as well.
  • LBs can also differentiate between public & private traffic.
  • LBs also handle failures of downstream EC2 instances by constantly doing health checks (the interval is configurable, e.g. every 5 seconds). A health check is usually done on a port & endpoint (e.g. /health). If the response is 200 OK, the instance is considered healthy; otherwise it is marked unhealthy and the LB stops routing traffic to it.
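The round-robin and health-check behaviour described above can be simulated as a toy model (this is an illustration of the idea, not the actual ELB implementation):

```python
from itertools import cycle

class ToyLoadBalancer:
    """Toy model of an LB: round-robin over instances that pass health checks."""

    def __init__(self, instances):
        self.instances = instances
        self.health = {i: True for i in instances}

    def health_check(self, instance, http_status):
        # An instance is healthy only if its /health endpoint returned 200 OK.
        self.health[instance] = (http_status == 200)

    def route(self, num_requests):
        healthy = [i for i in self.instances if self.health[i]]
        rr = cycle(healthy)  # round-robin over healthy targets only
        return [next(rr) for _ in range(num_requests)]

lb = ToyLoadBalancer(["i-aaa", "i-bbb", "i-ccc"])
lb.health_check("i-bbb", 500)   # i-bbb fails its health check
print(lb.route(4))              # ['i-aaa', 'i-ccc', 'i-aaa', 'i-ccc']
```

Note how the unhealthy instance is silently skipped; once its health check passes again it rejoins the rotation.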
  • AWS guarantees (via an SLA) that the ELB will be highly available.
  • AWS takes care of high availability, upgrades and maintenance of the ELB.
  • AWS also provides configuration knobs for the ELB.
  • It’s integrated with many AWS offerings and services.
  • An AWS LB can scale virtually without limit, but not instantaneously. For a sudden large traffic spike, we might need to contact AWS support (e.g. to pre-warm a Classic Load Balancer).
  • Internal Private LB → This LB is private within our VPC. We can’t access it from the public web.
  • External Public LB → This LB is publicly reachable on the web and users can access it directly.
  • A Load Balancer has its own security group. A typical rule-set allows HTTP-type incoming traffic on port 80 from anywhere on the web, and HTTPS-type incoming traffic on port 443 from anywhere on the web.
  • An EC2 instance also has its own security group. In a typical setup, the only source allowed for incoming traffic at the EC2 instance is the security group belonging to the LB, i.e. the EC2 security group references the security group of the Load Balancer.
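The rule structure can be shown as plain data (the group IDs and values below are made up for illustration). The key point is that the EC2 group's inbound rule names the LB's security group as its source, rather than a CIDR range:

```python
# Hypothetical security-group rules mirroring the setup described above.
lb_sg = {
    "id": "sg-lb-123",  # made-up ID
    "inbound": [
        {"protocol": "tcp", "port": 80,  "source": "0.0.0.0/0"},  # HTTP from anywhere
        {"protocol": "tcp", "port": 443, "source": "0.0.0.0/0"},  # HTTPS from anywhere
    ],
}
ec2_sg = {
    "id": "sg-ec2-456",
    # Only traffic originating from the LB's security group is allowed in.
    "inbound": [{"protocol": "tcp", "port": 80, "source": "sg-lb-123"}],
}

def allows(sg, port, source):
    """Check whether a security group admits traffic on `port` from `source`."""
    return any(r["port"] == port and r["source"] in (source, "0.0.0.0/0")
               for r in sg["inbound"])

print(allows(ec2_sg, 80, "sg-lb-123"))    # True  -- traffic arriving via the LB
print(allows(ec2_sg, 80, "203.0.113.9"))  # False -- direct public access blocked
```

This is why users cannot bypass the LB and hit the instances directly: the instances simply refuse traffic from any source other than the LB's group.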
  • ALB supports load-balancing HTTP applications across multiple machines (target groups).
  • ALB supports load-balancing multiple HTTP applications on the same machine (ex: containers).
  • ALB also supports redirects from HTTP to HTTPS.
  • ALB supports HTTP/2 and WebSockets.
  • ALB can route to multiple target groups.
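Routing to multiple target groups works through listener rules. A minimal sketch of ALB-style, path-based rule evaluation (the path patterns and target-group names here are invented for illustration):

```python
import fnmatch

# Hypothetical listener rules: path pattern -> target group.
RULES = [
    ("/api/*",    "tg-backend"),
    ("/static/*", "tg-assets"),
]
DEFAULT_TG = "tg-web"  # default action when no rule matches

def pick_target_group(path: str) -> str:
    """Return the first target group whose path pattern matches, else the default."""
    for pattern, tg in RULES:
        if fnmatch.fnmatch(path, pattern):
            return tg
    return DEFAULT_TG

print(pick_target_group("/api/users"))   # tg-backend
print(pick_target_group("/index.html"))  # tg-web
```

A real ALB can additionally match on hostname, HTTP headers and query strings, but the first-matching-rule idea is the same.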
  • For a Classic LB, cross-zone load balancing is disabled by default. There are no extra charges, even if we enable inter-AZ load balancing.
  • For an Application LB, cross-zone load balancing is enabled by default and it can’t be disabled. There are no extra charges for inter-AZ load balancing.
  • For a Network LB, cross-zone load balancing is disabled by default, and there are extra charges to pay if we enable inter-AZ load balancing on an NLB.
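The per-instance effect of cross-zone load balancing is easiest to see with unevenly sized AZs. The classic example is one AZ with a single instance and another with four (the numbers below are illustrative):

```python
def per_instance_share(az_sizes, cross_zone: bool):
    """Return each instance's share of total traffic, grouped per AZ, as fractions."""
    if cross_zone:
        # Traffic is spread evenly over ALL instances, regardless of AZ.
        total = sum(az_sizes)
        return [[1 / total] * n for n in az_sizes]
    # Without cross-zone, each AZ first gets an equal slice of traffic,
    # then splits that slice among its own instances only.
    az_share = 1 / len(az_sizes)
    return [[az_share / n] * n for n in az_sizes]

# AZ-A has 1 instance, AZ-B has 4 instances.
print(per_instance_share([1, 4], cross_zone=False))  # [[0.5], [0.125, 0.125, 0.125, 0.125]]
print(per_instance_share([1, 4], cross_zone=True))   # [[0.2], [0.2, 0.2, 0.2, 0.2]]
```

Without cross-zone balancing, the lone instance in AZ-A absorbs half of all traffic; enabling it evens everything out to 20% per instance.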
  • SSL refers to ‘Secure Sockets Layer’, used to encrypt the connections.
  • TLS refers to ‘Transport Layer Security’, which is the newer version of SSL.
  • Connection Draining (a.k.a. deregistration delay) controls how long in-flight requests are allowed to complete while an instance is being de-registered. If our connections are shorter, we can set this value to be smaller.
  • The range for this value is 0 to 3600 seconds.
  • Scaling-out i.e. addition of an EC2 instance, in case load increases.
  • Scaling-in i.e. removal of an EC2 instance, in case load decreases.
  • Make sure that we always have the minimum number of EC2 instances running.
  • Automatically register/de-register instances with the LB as well.
  • Target average CPU usage. For example, if average CPU usage goes above 40%, that raises an alarm and another EC2 instance is added.
  • Average Network-In and Average Network-Out.
  • Number of requests on the ELB per instance.
  • Pre-scheduled time, if we know the visitor patterns in advance.
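A CPU-based trigger like the 40% rule above boils down to a simple decision function. A sketch (the thresholds and min/max bounds are the illustrative values from this section, not recommendations):

```python
def scaling_decision(avg_cpu: float, current: int, minimum: int = 2, maximum: int = 10) -> int:
    """Return the desired instance count for a simple CPU-based trigger.

    Scale out above 40% average CPU, scale in below 20%, and always stay
    within the ASG's min/max bounds (all values here are illustrative).
    """
    desired = current
    if avg_cpu > 40:
        desired = current + 1   # scaling-out: add an EC2 instance
    elif avg_cpu < 20:
        desired = current - 1   # scaling-in: remove an EC2 instance
    return max(minimum, min(maximum, desired))

print(scaling_decision(55.0, current=3))  # 4
print(scaling_decision(10.0, current=2))  # 2 (already at the minimum)
```

The clamp at the end is what enforces the "minimum number of EC2 instances running" guarantee mentioned above.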
  • We first send the custom metric from our application (running on an EC2 instance) to CloudWatch, with the help of the ‘PutMetricData’ API.
  • We then set up a CloudWatch alarm to react to low/high values.
  • We then use those CloudWatch alarms as the scaling policy for the ASG.
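The PutMetricData call in the first step takes a namespace plus a list of metric datapoints. A sketch of building that payload (the namespace, metric name and dimension values below are made up; with boto3 the dict would be passed to `cloudwatch.put_metric_data(...)`):

```python
import datetime

def build_put_metric_payload(value: float) -> dict:
    """Build a PutMetricData-shaped payload for a custom application metric.

    Namespace, metric name and dimensions below are hypothetical examples.
    """
    return {
        "Namespace": "MyApp/Custom",
        "MetricData": [{
            "MetricName": "ActiveConnections",
            "Dimensions": [{"Name": "AutoScalingGroupName", "Value": "my-asg"}],
            "Timestamp": datetime.datetime.now(datetime.timezone.utc),
            "Value": value,
            "Unit": "Count",
        }],
    }

payload = build_put_metric_payload(950.0)
print(payload["MetricData"][0]["MetricName"])  # ActiveConnections
# With boto3, this would be sent roughly as:
#   boto3.client("cloudwatch").put_metric_data(**payload)
```

Once the datapoints are flowing, the CloudWatch alarm and the ASG scaling policy of the next two steps are pure console/API configuration on top of this metric.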
  • Target Tracking Scaling :- Say we want the average CPU usage for our ASG to stay at 40%; this policy then automatically adds EC2 instances whenever CPU usage goes above 40%. This is the most commonly used policy in practice. Another example of this policy: say an application is deployed with an Application Load Balancer and an Auto Scaling Group, and scaling is currently done manually; we would like to define a scaling policy that keeps the average number of connections per EC2 instance at around 1000.
  • Simple / Step Scaling :- We can set up this policy using CloudWatch. For e.g. when a CloudWatch alarm gets triggered (CPU usage goes above 70%), add 2 EC2 instances to the ASG. Another example: when a CloudWatch alarm gets triggered (CPU usage goes below 30%), remove 1 EC2 instance from the ASG.
  • Scheduled Actions :- Anticipate a scaling based on known usage pattern. Example :- Let’s increase the capacity by 10 more EC2 instances at 9 pm on next Friday.
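The step-scaling example above maps CPU bands to capacity changes. As a small sketch (thresholds and step sizes taken from the bullet above, purely illustrative):

```python
def step_scaling_adjustment(avg_cpu: float) -> int:
    """Capacity delta for the hypothetical step policy described above."""
    if avg_cpu > 70:
        return +2   # high-CPU alarm: add 2 EC2 instances to the ASG
    if avg_cpu < 30:
        return -1   # low-CPU alarm: remove 1 EC2 instance from the ASG
    return 0        # between the two alarms: no change

print(step_scaling_adjustment(85.0))  # 2
print(step_scaling_adjustment(25.0))  # -1
print(step_scaling_adjustment(50.0))  # 0
```

Contrast this with target tracking, where we state only the target (e.g. 40% CPU or 1000 connections per instance) and AWS derives the adjustments itself.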




aditya goel

Software Engineer for Big Data distributed systems