EC2 Auto Scaling
Amazon EC2 Auto Scaling Overview
- Definition: Amazon EC2 Auto Scaling is a service that automatically adjusts the number of EC2 instances in a fleet to maintain application performance, availability, and cost efficiency based on defined conditions.
- Key Features:
- Scales instances in/out based on demand (e.g., CPU utilization).
- Ensures minimum/maximum instance counts for availability.
- Integrates with Elastic Load Balancers (ELB) and other AWS services.
- Supports Spot Instances, On-Demand, and mixed fleets.
- Use Cases: Web applications, batch processing, microservices, disaster recovery.
1. EC2 Auto Scaling Core Concepts
Components
- Auto Scaling Group (ASG):
- Collection of EC2 instances treated as a logical unit.
- Defines min, max, and desired capacity.
- Explanation: E.g., ASG with min=2, max=10, desired=4 instances.
- Launch Template/Configuration:
- Specifies instance details (AMI, instance type, security groups, user data).
- Launch Template: Preferred, supports versioning.
- Launch Configuration: Legacy, single version.
- Explanation: E.g., template with t3.micro, Amazon Linux 2 AMI.
- Scaling Policies:
- Rules to add/remove instances based on metrics or schedules.
- Types: Target Tracking, Step Scaling, Simple Scaling, Scheduled Scaling.
- Explanation: E.g., scale out if CPU > 70%.
- Health Checks:
- Monitors instance health (EC2 status or ELB health).
- Replaces unhealthy instances.
- Explanation: E.g., terminate instance if ELB reports “OutOfService”.
Scaling Types
- Horizontal Scaling:
- Add/remove instances (scale out/in).
- Explanation: E.g., add 2 instances during traffic spike.
- Vertical Scaling:
- Not supported by Auto Scaling (requires instance type change, causes downtime).
- Explanation: Use larger instances manually if needed.
Key Notes:
- Exam Relevance: Understand ASG setup, scaling policies, and health checks.
- Mastery Tip: Practice creating an ASG with a launch template and ELB integration.
2. EC2 Auto Scaling Performance Features
EC2 Auto Scaling ensures high-performing applications.
Scaling Policies
- Target Tracking:
- Maintains a metric at a target (e.g., CPU at 50%).
- Uses CloudWatch metrics (e.g., CPUUtilization, RequestCountPerTarget).
- Explanation: Simplest—e.g., scale to keep 500 requests/target.
- Step Scaling:
- Scales based on metric thresholds (e.g., +2 instances if CPU > 70%, +4 if > 90%).
- Explanation: Granular control—e.g., aggressive scaling for spikes.
- Simple Scaling:
- Scales by fixed amount (e.g., +1 instance if CPU > 60%).
- Waits for cooldown (default 300s) before next action.
- Explanation: Legacy—use for basic needs.
- Scheduled Scaling:
- Scales at specific times (e.g., +5 instances every Monday 9 AM).
- Explanation: Predictable loads—e.g., payroll processing.
Predictive Scaling
- Purpose: Anticipate demand.
- Features: Uses ML to forecast load (e.g., based on CloudWatch history).
- Explanation: E.g., scale out before Black Friday traffic.
Warm Pools:
- Purpose: Pre-initialize instances.
- Features: Keep stopped instances ready to launch (faster scaling).
- Explanation: E.g., reduce app startup time for web servers.
Key Notes:
- Performance: Target Tracking + Predictive Scaling = responsive apps.
- Exam Tip: Know when to use Target Tracking vs. Step Scaling.
3. EC2 Auto Scaling Resilience Features
Resilience ensures application availability.
Multi-AZ Distribution
- Purpose: Survive AZ failures.
- How It Works: Launches instances across specified AZs.
- Explanation: E.g., spread 4 instances across us-east-1a, us-east-1b.
Health Checks
- Purpose: Replace failed instances.
- Types:
- EC2: Checks instance status (running, not impaired).
- ELB: Checks application health (e.g., HTTP 200).
- Custom: User-defined via API.
- Explanation: E.g., terminate instance if ELB health check fails.
Instance Refresh:
- Purpose: Update fleet.
- Features: Rolling replacement of instances (e.g., new AMI, launch template).
- Explanation: E.g., update to latest app version with minimal downtime.
Suspended Processes:
- Purpose: Pause scaling actions.
- Features: Suspend health checks, scaling, or replacements.
- Explanation: E.g., pause during maintenance to avoid terminations.
Key Notes:
- Resilience: Multi-AZ + ELB health checks = high availability.
- Exam Tip: Design an ASG for Multi-AZ with ELB integration.
4. EC2 Auto Scaling Security Features
Security aligns with SAA-C03’s secure architecture focus.
Encryption
- EBS Volumes: Use KMS for at-rest encryption.
- Network Traffic: HTTPS/TLS via ELB or app configuration.
- Explanation: E.g., encrypt EBS root volume with KMS key.
Access Control
- IAM:
- Controls ASG operations (e.g., autoscaling:CreateAutoScalingGroup).
- Instance role grants app permissions (e.g., s3:GetObject).
- Example: {"Effect": "Allow", "Action": "cloudwatch:PutMetricData", "Resource": "*"}.
- Security Groups:
- Restrict instance traffic (e.g., port 80 from ELB).
- Explanation: Least privilege—e.g., ASG role only scales, instance role accesses S3.
VPC:
- Purpose: Isolate instances.
- How It Works: Deploy in private subnets, route via ELB.
- Explanation: E.g., app in private subnet, ELB in public subnet.
Key Notes:
- Security: KMS + IAM + VPC = secure scaling.
- Exam Tip: Practice IAM policy and security group for ASG.
5. EC2 Auto Scaling Cost Optimization
Cost efficiency is a key exam domain.
Pricing
- Auto Scaling: Free (pay for EC2 instances, ELB, CloudWatch).
- EC2:
- On-Demand: ~$0.096/hour (m5.large).
- Spot: Up to 90% savings (e.g., ~$0.03/hour).
- Reserved Instances: ~50% savings for steady-state.
- Free Tier: 750 hours/month of t2/t3.micro (shared with EC2).
- Example: ASG with 4 m5.large (On-Demand) = ~$9.22/day.
Cost Strategies
- Spot Instances:
- Use mixed instance policies (Spot + On-Demand).
- Explanation: E.g., 80% Spot for batch jobs, 20% On-Demand for reliability.
- Right-Sizing:
- Set min/desired capacity conservatively.
- Use t3/t4g for burstable workloads.
- Explanation: E.g., t3.micro for low-traffic apps.
- Predictive Scaling:
- Avoid over-provisioning during peaks.
- Explanation: E.g., scale before traffic spikes.
- Cooldown Periods:
- Prevent rapid scaling (default 300s).
- Explanation: E.g., avoid adding unneeded instances.
Key Notes:
- Cost Savings: Spot + t3 + Predictive Scaling = low costs.
- Exam Tip: Calculate costs for Spot vs. On-Demand ASG.
6. EC2 Auto Scaling Advanced Features
Mixed Instance Policies
- Purpose: Combine instance types and purchase options.
- Features:
- Mix On-Demand, Spot, and multiple instance types (e.g., m5, c5).
- Allocate across AZs and types.
- Explanation: E.g., 50% m5.large On-Demand, 50% c5.large Spot.
Instance Weighting:
- Purpose: Normalize capacity.
- Features: Assign weights to instance types (e.g., m5.large=1, m5.2xlarge=4).
- Explanation: E.g., 4 m5.large = 1 m5.2xlarge for capacity.
Custom Metrics:
- Purpose: Scale on app-specific metrics.
- Features: Use CloudWatch custom metrics (e.g., queue depth).
- Explanation: E.g., scale on SQS queue size.
Lifecycle Hooks:
- Purpose: Customize scaling actions.
- Features: Pause instance launch/termination (e.g., for bootstrapping).
- Explanation: E.g., install software before joining ELB.
Key Notes:
- Flexibility: Mixed policies + custom metrics = advanced scaling.
- Exam Tip: Know lifecycle hooks for custom bootstrapping.
7. EC2 Auto Scaling Use Cases
Understand practical applications.
Web Applications
- Setup: ASG + ALB + t3 instances.
- Features: Scale with traffic, Multi-AZ.
- Explanation: E.g., e-commerce site during sales.
Batch Processing
- Setup: ASG + Spot Instances.
- Features: Parallelize jobs, cost-efficient.
- Explanation: E.g., image processing for media.
Microservices
- Setup: ASG + ECS + custom metrics.
- Features: Scale per service demand.
- Explanation: E.g., scale payment service on transaction volume.
Disaster Recovery
- Setup: ASG + cross-region ELB.
- Features: Maintain capacity in DR Region.
- Explanation: E.g., failover to eu-west-1.
8. EC2 Auto Scaling vs. Other Services
Feature | EC2 Auto Scaling | Elastic Beanstalk | Fargate |
---|---|---|---|
Type | Instance Scaling | App Deployment | Serverless Containers |
Workload | EC2-based apps | Web apps | Containerized apps |
Control | Granular | High-level | Serverless |
Cost | EC2-based | EC2 + management | vCPU + memory |
Use Case | Custom scaling | Simplified deployment | No server management |
Explanation:
- EC2 Auto Scaling: Fine-grained control for EC2 fleets.
- Elastic Beanstalk: Managed app platform with scaling.
- Fargate: Serverless containers, no instance management.
Detailed Explanations for Mastery
- Target Tracking:
- Example: Scale to maintain 500 requests/target; ASG adds 2 instances if demand rises.
- Why It Matters: Simplest policy—exam favorite.
- Spot Instances:
- Example: Mixed ASG with 70% Spot; fallback to On-Demand if Spot unavailable.
- Why It Matters: Cost optimization—key SAA-C03 scenario.
- Lifecycle Hooks:
- Example: Pause launch, install app, register with ELB.
- Why It Matters: Custom scaling—common trap.
Quick Reference Table
Feature | Purpose | Key Detail | Exam Relevance |
---|---|---|---|
Auto Scaling Group | Manage instances | Min, max, desired capacity | Core Concept |
Launch Template | Instance config | AMI, instance type, SG | Core Concept |
Target Tracking | Maintain performance | Scale on metric (e.g., CPU 50%) | Performance |
Predictive Scaling | Anticipate load | ML-based forecasting | Performance |
Multi-AZ | High availability | Spread across AZs | Resilience |
Health Checks | Replace failed instances | EC2, ELB, custom | Resilience |
Spot Instances | Cost savings | Mixed with On-Demand | Cost |
Lifecycle Hooks | Customize scaling | Pause launch/termination | Flexibility |