EC2 Auto Scaling

Amazon EC2 Auto Scaling Overview

  • Definition: Amazon EC2 Auto Scaling is a service that automatically adjusts the number of EC2 instances in a fleet to maintain application performance, availability, and cost efficiency based on defined conditions.
  • Key Features:
    • Scales instances in/out based on demand (e.g., CPU utilization).
    • Ensures minimum/maximum instance counts for availability.
    • Integrates with Elastic Load Balancers (ELB) and other AWS services.
    • Supports Spot Instances, On-Demand, and mixed fleets.
  • Use Cases: Web applications, batch processing, microservices, disaster recovery.

1. EC2 Auto Scaling Core Concepts

Components

  • Auto Scaling Group (ASG):
    • Collection of EC2 instances treated as a logical unit.
    • Defines min, max, and desired capacity.
    • Explanation: E.g., ASG with min=2, max=10, desired=4 instances.
  • Launch Template/Configuration:
    • Specifies instance details (AMI, instance type, security groups, user data).
    • Launch Template: Preferred, supports versioning.
    • Launch Configuration: Legacy, single version.
    • Explanation: E.g., template with t3.micro, Amazon Linux 2 AMI.
  • Scaling Policies:
    • Rules to add/remove instances based on metrics or schedules.
    • Types: Target Tracking, Step Scaling, Simple Scaling, Scheduled Scaling.
    • Explanation: E.g., scale out if CPU > 70%.
  • Health Checks:
    • Monitors instance health (EC2 status or ELB health).
    • Replaces unhealthy instances.
    • Explanation: E.g., terminate instance if ELB reports “OutOfService”.
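
The components above come together in a single group definition. The following is a minimal sketch that assembles the request parameters in plain Python. The key names follow the boto3 `create_auto_scaling_group` call; the group name, template name, and subnet IDs are hypothetical placeholders.

```python
def build_asg_params(name, template_name, subnets, min_size, max_size, desired):
    """Assemble keyword arguments for an Auto Scaling group request."""
    # Desired capacity must sit between the minimum and the maximum.
    if not (min_size <= desired <= max_size):
        raise ValueError("need min_size <= desired <= max_size")
    return {
        "AutoScalingGroupName": name,
        "LaunchTemplate": {"LaunchTemplateName": template_name, "Version": "$Latest"},
        "MinSize": min_size,
        "MaxSize": max_size,
        "DesiredCapacity": desired,
        # Subnets (one per AZ) are passed as one comma-separated string.
        "VPCZoneIdentifier": ",".join(subnets),
        # Switch to "ELB" once a load balancer target group is attached.
        "HealthCheckType": "EC2",
        "HealthCheckGracePeriod": 300,
    }


# Hypothetical names; min=2, max=10, desired=4 as in the ASG example above.
params = build_asg_params("web-asg", "web-template", ["subnet-aaa", "subnet-bbb"], 2, 10, 4)
# With credentials configured, these could be passed on as:
#   boto3.client("autoscaling").create_auto_scaling_group(**params)
```

Validating the capacity bounds locally is a design choice of this sketch; the service performs its own validation as well.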

Scaling Types

  • Horizontal Scaling:
    • Add/remove instances (scale out/in).
    • Explanation: E.g., add 2 instances during traffic spike.
  • Vertical Scaling:
    • Not handled by Auto Scaling policies (changing the instance type requires a stop/start, which causes downtime).
    • Explanation: Use larger instances manually if needed.

Key Notes:

  • Exam Relevance: Understand ASG setup, scaling policies, and health checks.
  • Mastery Tip: Practice creating an ASG with a launch template and ELB integration.

2. EC2 Auto Scaling Performance Features

EC2 Auto Scaling ensures high-performing applications.

Scaling Policies

  • Target Tracking:
    • Maintains a metric at a target (e.g., CPU at 50%).
    • Uses CloudWatch metrics (e.g., CPUUtilization, RequestCountPerTarget).
    • Explanation: Simplest—e.g., scale to keep 500 requests/target.
  • Step Scaling:
    • Scales based on metric thresholds (e.g., +2 instances if CPU > 70%, +4 if > 90%).
    • Explanation: Granular control—e.g., aggressive scaling for spikes.
  • Simple Scaling:
    • Scales by fixed amount (e.g., +1 instance if CPU > 60%).
    • Waits for cooldown (default 300s) before next action.
    • Explanation: Legacy—use for basic needs.
  • Scheduled Scaling:
    • Scales at specific times (e.g., +5 instances every Monday 9 AM).
    • Explanation: Predictable loads—e.g., payroll processing.
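
To make the difference between Target Tracking and Step Scaling concrete, here is a small simulation of both decisions. It is a sketch only: the proportional formula is a common approximation, not the service's exact algorithm, and the function names are made up for illustration. The step thresholds are the ones from the Step Scaling bullet.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value):
    """Capacity that would bring the per-instance metric back to the target."""
    # Proportional rule: scale capacity by (observed metric / target), rounded up.
    return math.ceil(current_capacity * metric_value / target_value)


def step_scaling_adjustment(cpu_percent):
    """Instances to add: +2 if CPU > 70%, +4 if CPU > 90%."""
    if cpu_percent > 90:
        return 4
    if cpu_percent > 70:
        return 2
    return 0


def clamp(capacity, min_size, max_size):
    """Keep the result inside the group's min/max bounds."""
    return max(min_size, min(max_size, capacity))


# 4 instances averaging 75% CPU with a 50% target -> 6 instances.
print(target_tracking_capacity(4, 75, 50))
# 4 instances at 95% CPU under the step policy -> 4 + 4 = 8 (within 2..10).
print(clamp(4 + step_scaling_adjustment(95), 2, 10))
```

Target Tracking needs only a target value, while Step Scaling needs one adjustment per threshold, which is where the extra control comes from.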

Predictive Scaling

  • Purpose: Anticipate demand.
  • Features: Uses ML to forecast load (e.g., based on CloudWatch history).
  • Explanation: E.g., scale out before Black Friday traffic.

Warm Pools

  • Purpose: Pre-initialize instances.
  • Features: Keep pre-initialized instances (stopped, hibernated, or running) ready to enter service, so scale-out is faster.
  • Explanation: E.g., reduce app startup time for web servers.

Key Notes:

  • Performance: Target Tracking + Predictive Scaling = responsive apps.
  • Exam Tip: Know when to use Target Tracking vs. Step Scaling.

3. EC2 Auto Scaling Resilience Features

Resilience ensures application availability.

Multi-AZ Distribution

  • Purpose: Survive AZ failures.
  • How It Works: Launches instances across specified AZs.
  • Explanation: E.g., spread 4 instances across us-east-1a, us-east-1b.
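
The even spread across AZs can be illustrated with a few lines of Python. This is a simplified model of balanced placement, assuming every listed AZ has capacity; `spread_across_azs` is a hypothetical helper, not an AWS API.

```python
def spread_across_azs(desired, zones):
    """Distribute `desired` instances as evenly as possible over `zones`."""
    base, extra = divmod(desired, len(zones))
    # The first `extra` zones each take one additional instance.
    return {zone: base + (1 if i < extra else 0) for i, zone in enumerate(zones)}


print(spread_across_azs(4, ["us-east-1a", "us-east-1b"]))
# {'us-east-1a': 2, 'us-east-1b': 2}
```

If one AZ fails in this example, half of the capacity remains in service while replacements launch in the surviving AZ.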

Health Checks

  • Purpose: Replace failed instances.
  • Types:
    • EC2: Checks instance status (running, not impaired).
    • ELB: Checks application health (e.g., HTTP 200).
    • Custom: User-defined; your own monitoring marks an instance unhealthy via the SetInstanceHealth API.
  • Explanation: E.g., terminate instance if ELB health check fails.
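
The replacement decision can be sketched as a small function. This is a simplified model under stated assumptions: a 300-second grace period, and ELB results that count only when the group's health check type is "ELB". The function name and parameters are illustrative.

```python
def should_replace(health_check_type, ec2_status_ok, elb_healthy,
                   seconds_since_launch, grace_period=300):
    """Return True if the instance would be marked unhealthy and replaced."""
    # During the grace period, failures are ignored so the app can finish booting.
    if seconds_since_launch < grace_period:
        return False
    # EC2 status checks always apply.
    if not ec2_status_ok:
        return True
    # ELB results only count when the group uses the ELB health check type.
    return health_check_type == "ELB" and not elb_healthy


# App is failing its ELB check, but the group only uses EC2 checks: not replaced.
print(should_replace("EC2", True, False, 600))
# Same instance with ELB health checks enabled: replaced.
print(should_replace("ELB", True, False, 600))
```

The first call shows why ELB health checks matter: an instance can pass EC2 status checks while the application on it is down.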

Instance Refresh

  • Purpose: Update fleet.
  • Features: Rolling replacement of instances (e.g., new AMI, launch template).
  • Explanation: E.g., update to latest app version with minimal downtime.
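
A rolling replacement can be pictured as batches sized so that a minimum share of the fleet stays in service. The sketch below is a simplified model of that idea, not the service's exact scheduling; `refresh_batches` and the instance IDs are hypothetical.

```python
import math


def refresh_batches(instance_ids, min_healthy_percent):
    """Split instances into replacement batches that keep enough capacity healthy."""
    # Number of instances that must stay in service at any time.
    must_stay = math.ceil(len(instance_ids) * min_healthy_percent / 100)
    # Always replace at least one instance per batch so the refresh can progress.
    batch_size = max(1, len(instance_ids) - must_stay)
    return [instance_ids[i:i + batch_size]
            for i in range(0, len(instance_ids), batch_size)]


fleet = ["i-1", "i-2", "i-3", "i-4"]
print(refresh_batches(fleet, 50))  # two batches of two
print(refresh_batches(fleet, 90))  # four batches of one
```

A higher minimum healthy percentage makes the refresh slower but keeps more capacity available during the rollout.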

Suspended Processes

  • Purpose: Pause scaling actions.
  • Features: Suspend individual processes such as HealthCheck, ReplaceUnhealthy, Launch, or Terminate.
  • Explanation: E.g., pause during maintenance to avoid terminations.

Key Notes:

  • Resilience: Multi-AZ + ELB health checks = high availability.
  • Exam Tip: Design an ASG for Multi-AZ with ELB integration.

4. EC2 Auto Scaling Security Features

Security aligns with SAA-C03’s secure architecture focus.

Encryption

  • EBS Volumes: Use KMS for at-rest encryption.
  • Network Traffic: HTTPS/TLS via ELB or app configuration.
  • Explanation: E.g., encrypt EBS root volume with KMS key.

Access Control

  • IAM:
    • Controls ASG operations (e.g., autoscaling:CreateAutoScalingGroup).
    • Instance role grants app permissions (e.g., s3:GetObject).
    • Example: {"Effect": "Allow", "Action": "cloudwatch:PutMetricData", "Resource": "*"}.
  • Security Groups:
    • Restrict instance traffic (e.g., port 80 from ELB).
  • Explanation: Least privilege—e.g., ASG role only scales, instance role accesses S3.
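
The single statement in the IAM example needs to be wrapped in a policy document before it can be attached to a role. A minimal sketch, assuming an instance role that publishes metrics and reads from one bucket; the bucket name is a hypothetical placeholder.

```python
import json


def policy_document(statements):
    """Wrap statements in the standard IAM policy envelope."""
    return {"Version": "2012-10-17", "Statement": statements}


# Instance role: only what the application itself needs (least privilege).
instance_policy = policy_document([
    {"Effect": "Allow", "Action": "cloudwatch:PutMetricData", "Resource": "*"},
    # Hypothetical bucket name; scope the resource to a single bucket's objects.
    {"Effect": "Allow", "Action": "s3:GetObject",
     "Resource": "arn:aws:s3:::example-app-bucket/*"},
])

print(json.dumps(instance_policy, indent=2))
```

Keeping the instance policy separate from the permissions used to manage the ASG itself is what the least-privilege note above refers to.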

VPC

  • Purpose: Isolate instances.
  • How It Works: Deploy in private subnets, route via ELB.
  • Explanation: E.g., app in private subnet, ELB in public subnet.

Key Notes:

  • Security: KMS + IAM + VPC = secure scaling.
  • Exam Tip: Practice IAM policy and security group for ASG.

5. EC2 Auto Scaling Cost Optimization

Cost efficiency is a key exam domain.

Pricing

  • Auto Scaling: Free (pay for EC2 instances, ELB, CloudWatch).
  • EC2:
    • On-Demand: ~$0.096/hour (m5.large).
    • Spot: Up to 90% savings (e.g., ~$0.03/hour).
    • Reserved Instances: ~50% savings for steady-state.
  • Free Tier: 750 hours/month of t2/t3.micro (shared with EC2).
  • Example: ASG with 4 m5.large (On-Demand) = ~$9.22/day.
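
The daily figure in the example above is simple arithmetic: instances × hourly rate × 24. A quick check in Python, using the approximate On-Demand rate quoted in this section (actual prices vary by Region and over time):

```python
def daily_cost(instance_count, hourly_rate, hours=24):
    """Cost of running `instance_count` instances for `hours` hours."""
    return instance_count * hourly_rate * hours


on_demand = daily_cost(4, 0.096)  # 4 x m5.large at ~$0.096/hour
print(f"${on_demand:.2f}/day")    # $9.22/day
```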

Cost Strategies

  • Spot Instances:
    • Use mixed instance policies (Spot + On-Demand).
    • Explanation: E.g., 80% Spot for batch jobs, 20% On-Demand for reliability.
  • Right-Sizing:
    • Set min/desired capacity conservatively.
    • Use t3/t4g for burstable workloads.
    • Explanation: E.g., t3.micro for low-traffic apps.
  • Predictive Scaling:
    • Provision ahead of forecast peaks instead of keeping excess capacity running all the time.
    • Explanation: E.g., scale before traffic spikes.
  • Cooldown Periods:
    • Prevent back-to-back scaling actions (default 300s; the default cooldown applies to simple scaling policies).
    • Explanation: E.g., avoid adding unneeded instances.
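
The effect of a Spot/On-Demand mix on cost can be estimated with a short calculation. The sketch below uses the approximate rates from the Pricing list and the 80/20 split from the Spot Instances bullet; the fleet size of 10 is an assumption for illustration.

```python
def blended_hourly_cost(total, spot_fraction, spot_rate, on_demand_rate):
    """Hourly cost of a fleet split between Spot and On-Demand instances."""
    spot_count = round(total * spot_fraction)
    on_demand_count = total - spot_count
    return spot_count * spot_rate + on_demand_count * on_demand_rate


mixed = blended_hourly_cost(10, 0.8, 0.03, 0.096)          # 8 Spot + 2 On-Demand
all_on_demand = blended_hourly_cost(10, 0.0, 0.03, 0.096)  # 10 On-Demand
print(f"mixed: ${mixed:.3f}/hour, On-Demand only: ${all_on_demand:.3f}/hour")
print(f"savings: {100 * (1 - mixed / all_on_demand):.0f}%")
```

With these assumed rates the mixed fleet costs a little under half as much per hour, at the price of possible Spot interruptions.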

Key Notes:

  • Cost Savings: Spot + t3 + Predictive Scaling = low costs.
  • Exam Tip: Calculate costs for Spot vs. On-Demand ASG.

6. EC2 Auto Scaling Advanced Features

Mixed Instance Policies

  • Purpose: Combine instance types and purchase options.
  • Features:
    • Mix On-Demand, Spot, and multiple instance types (e.g., m5, c5).
    • Allocate across AZs and types.
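  • Explanation: E.g., 50% On-Demand and 50% Spot, drawn from m5.large and c5.large (the purchase-option split applies to the whole group, not to a specific instance type).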
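
A mixed instances policy is expressed as a nested structure on the group. The sketch below builds one in plain Python; the key names follow the boto3 `MixedInstancesPolicy` parameter of `create_auto_scaling_group`, while the template name is a hypothetical placeholder and the allocation strategy is one possible choice.

```python
def mixed_instances_policy(template_name, instance_types, on_demand_percent,
                           on_demand_base=0):
    """Build a MixedInstancesPolicy structure for an Auto Scaling group."""
    return {
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": template_name,
                "Version": "$Latest",
            },
            # Each override adds an instance type the group may launch.
            "Overrides": [{"InstanceType": t} for t in instance_types],
        },
        "InstancesDistribution": {
            # Instances that are always On-Demand, before the percentage applies.
            "OnDemandBaseCapacity": on_demand_base,
            # Share of the remaining capacity that is On-Demand; the rest is Spot.
            "OnDemandPercentageAboveBaseCapacity": on_demand_percent,
            "SpotAllocationStrategy": "price-capacity-optimized",
        },
    }


policy = mixed_instances_policy("web-template", ["m5.large", "c5.large"], 50)
```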

Instance Weighting

  • Purpose: Normalize capacity.
  • Features: Assign weights to instance types (e.g., m5.large=1, m5.2xlarge=4).
  • Explanation: E.g., 4 m5.large = 1 m5.2xlarge for capacity.
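
Weights turn instance counts into capacity units. A minimal sketch using the weights from the example above; `fleet_capacity` is a hypothetical helper.

```python
# Capacity units per instance type, as in the example above.
WEIGHTS = {"m5.large": 1, "m5.2xlarge": 4}


def fleet_capacity(instances):
    """Total capacity units for a mapping of instance type -> count."""
    return sum(WEIGHTS[instance_type] * count
               for instance_type, count in instances.items())


print(fleet_capacity({"m5.large": 4}))                   # 4 units
print(fleet_capacity({"m5.2xlarge": 1}))                 # 4 units
print(fleet_capacity({"m5.large": 2, "m5.2xlarge": 1}))  # 6 units
```

With weights in place, the group's desired capacity is counted in units rather than in instances, so either fleet in the first two lines satisfies a desired capacity of 4.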

Custom Metrics

  • Purpose: Scale on app-specific metrics.
  • Features: Use CloudWatch custom metrics (e.g., queue depth).
  • Explanation: E.g., scale on SQS queue size.
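
For queue-based scaling, a common approach is to track the backlog per instance rather than the raw queue depth. The sketch below shows the arithmetic under assumed numbers (a 600-second latency budget and 2 seconds of processing per message); the function names are illustrative.

```python
import math


def acceptable_backlog_per_instance(max_latency_s, avg_processing_s):
    """Messages one instance can hold in backlog and still meet the latency goal."""
    return max_latency_s / avg_processing_s


def instances_for_queue(queue_depth, backlog_target):
    """Instances needed so that each one stays at or below the backlog target."""
    return max(1, math.ceil(queue_depth / backlog_target))


target = acceptable_backlog_per_instance(600, 2)  # 300 messages per instance
print(instances_for_queue(1500, target))          # 5
```

Publishing the backlog-per-instance value as a CloudWatch custom metric would let a Target Tracking policy hold it at the computed target.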

Lifecycle Hooks

  • Purpose: Customize scaling actions.
  • Features: Pause instance launch/termination (e.g., for bootstrapping).
  • Explanation: E.g., install software before joining ELB.
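
A launch hook holds a new instance in a wait state until bootstrapping reports a result. The sketch below assembles the hook parameters and the completion result in plain Python. The key names follow the boto3 `put_lifecycle_hook` call; the group and hook names are hypothetical.

```python
def launch_hook_params(group_name, hook_name, timeout_s=300):
    """Parameters for a hook that pauses instances while they launch."""
    return {
        "AutoScalingGroupName": group_name,
        "LifecycleHookName": hook_name,
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_LAUNCHING",
        # How long the instance may wait before the default result applies.
        "HeartbeatTimeout": timeout_s,
        # If bootstrapping never reports back, discard the instance.
        "DefaultResult": "ABANDON",
    }


def lifecycle_result(bootstrap_succeeded):
    """Result to send once bootstrapping finishes."""
    return "CONTINUE" if bootstrap_succeeded else "ABANDON"


hook = launch_hook_params("web-asg", "install-app")
print(hook["LifecycleTransition"], lifecycle_result(True))
```

Defaulting to ABANDON is a design choice of this sketch: an instance that never finishes bootstrapping is terminated rather than put in service behind the ELB.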

Key Notes:

  • Flexibility: Mixed policies + custom metrics = advanced scaling.
  • Exam Tip: Know lifecycle hooks for custom bootstrapping.

7. EC2 Auto Scaling Use Cases

Understand practical applications.

Web Applications

  • Setup: ASG + ALB + t3 instances.
  • Features: Scale with traffic, Multi-AZ.
  • Explanation: E.g., e-commerce site during sales.

Batch Processing

  • Setup: ASG + Spot Instances.
  • Features: Parallelize jobs, cost-efficient.
  • Explanation: E.g., image processing for media.

Microservices

  • Setup: ASG + ECS + custom metrics.
  • Features: Scale per service demand.
  • Explanation: E.g., scale payment service on transaction volume.

Disaster Recovery

  • Setup: ASG in each Region + Route 53 failover routing (an ELB is Regional and cannot span Regions).
  • Features: Maintain capacity in DR Region.
  • Explanation: E.g., failover to eu-west-1.

8. EC2 Auto Scaling vs. Other Services

Feature  | EC2 Auto Scaling | Elastic Beanstalk                       | Fargate
Type     | Instance Scaling | App Deployment                          | Serverless Containers
Workload | EC2-based apps   | Web apps                                | Containerized apps
Control  | Granular         | High-level                              | Serverless
Cost     | EC2-based        | Underlying resources (no extra charge)  | vCPU + memory
Use Case | Custom scaling   | Simplified deployment                   | No server management

Explanation:

  • EC2 Auto Scaling: Fine-grained control for EC2 fleets.
  • Elastic Beanstalk: Managed app platform with scaling.
  • Fargate: Serverless containers, no instance management.

Detailed Explanations for Mastery

  • Target Tracking:
    • Example: Target of 500 requests/target; if 4 instances receive 3,000 requests in total (750 each), the ASG adds 2 instances to reach 6.
    • Why It Matters: Simplest policy—exam favorite.
  • Spot Instances:
    • Example: Mixed ASG with 70% Spot and 30% On-Demand; the On-Demand share keeps baseline capacity running when Spot capacity is reclaimed.
    • Why It Matters: Cost optimization—key SAA-C03 scenario.
  • Lifecycle Hooks:
    • Example: Pause launch, install app, register with ELB.
    • Why It Matters: Custom scaling—common trap.

Quick Reference Table

Feature            | Purpose                  | Key Detail                      | Exam Relevance
Auto Scaling Group | Manage instances         | Min, max, desired capacity      | Core Concept
Launch Template    | Instance config          | AMI, instance type, SG          | Core Concept
Target Tracking    | Maintain performance     | Scale on metric (e.g., CPU 50%) | Performance
Predictive Scaling | Anticipate load          | ML-based forecasting            | Performance
Multi-AZ           | High availability        | Spread across AZs               | Resilience
Health Checks      | Replace failed instances | EC2, ELB, custom                | Resilience
Spot Instances     | Cost savings             | Mixed with On-Demand            | Cost
Lifecycle Hooks    | Customize scaling        | Pause launch/termination        | Flexibility