Amazon DynamoDB Overview

  • Definition: Amazon DynamoDB is a fully managed NoSQL database service designed for low-latency, scalable, and highly available key-value and document-based workloads.
  • Key Concepts:
    • Tables: Containers for data (items).
    • Items: Individual records (up to 400 KB).
    • Attributes: Key-value pairs within items.
  • Use Cases: Real-time applications (e.g., gaming leaderboards, IoT, e-commerce carts).

1. DynamoDB Core Concepts

Tables, Items, and Attributes

  • Tables: Logical collections of items (no fixed schema).
    • Explanation: Flexible—items in a table can have different attributes.
  • Items: Rows of data, like JSON documents (e.g., { "userId": "123", "name": "Alice" }).
    • Explanation: Max size 400 KB—split large data into multiple items or use S3.
  • Attributes: Name-value pairs within an item (e.g., "score": 95).
    • Explanation: Can be scalar (string, number, binary, Boolean, null), set (string, number, or binary set), or document (list, map).

Primary Keys

  • Types:
    • Partition Key: Single attribute (e.g., userId)—distributes data across partitions.
    • Composite Key: Partition key + sort key (e.g., userId + orderDate)—organizes data within a partition.
  • Explanation: Partition key determines data distribution; sort key enables range queries (e.g., all orders for userId in 2023).
  • Design Tip: Choose high-cardinality keys for even distribution (e.g., avoid a low-cardinality attribute such as status as the partition key; it concentrates traffic on a few hot partitions).
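
The effect of key cardinality on distribution can be illustrated with a small simulation. This is only a sketch: DynamoDB's internal hash function and partition count are not public, so SHA-256 and a fixed count of 8 partitions are stand-in assumptions, and the key values are made up.

```python
import hashlib
from collections import Counter

def partition_for(key, num_partitions):
    """Map a partition key value to a partition (illustrative hash, not DynamoDB's)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions

def distribution(keys, num_partitions=8):
    """Count how many requests land on each partition."""
    return Counter(partition_for(k, num_partitions) for k in keys)

# 10,000 requests keyed by a high-cardinality attribute (userId).
by_user = distribution(f"user-{i}" for i in range(10_000))

# 10,000 requests keyed by a low-cardinality attribute (status).
statuses = ["PENDING", "SHIPPED", "DELIVERED"]
by_status = distribution(statuses[i % 3] for i in range(10_000))

print("userId -> partitions used:", len(by_user), "busiest:", max(by_user.values()))
print("status -> partitions used:", len(by_status), "busiest:", max(by_status.values()))
```

With only three distinct status values, at most three partitions ever receive traffic, no matter how many partitions exist.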

Key Notes:

  • Exam Relevance: Know key types and their impact on scalability.
  • Mastery Tip: Practice designing keys for a use case (e.g., user sessions).

2. Capacity Modes

Provisioned Capacity

  • Purpose: Pre-allocate read/write throughput.
  • Units:
    • RCU (Read Capacity Unit): 1 strongly consistent read per second, or 2 eventually consistent reads per second, for an item up to 4 KB.
    • WCU (Write Capacity Unit): 1 write per second for an item up to 1 KB.
  • Auto-Scaling: Adjusts RCU/WCU based on demand (e.g., target 70% utilization).
  • Explanation: Predictable workloads—cheaper if usage is steady.
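
The unit definitions above translate into a short calculation. The sketch below assumes standard (non-transactional) reads and writes and rounds each item up to the next 4 KB (read) or 1 KB (write) boundary; the function names are illustrative.

```python
import math

def read_capacity_units(item_kb, reads_per_sec=1, strongly_consistent=True):
    """RCUs needed: each read costs ceil(size / 4 KB) units; eventual consistency halves the total."""
    units_per_read = max(1, math.ceil(item_kb / 4))
    total = units_per_read * reads_per_sec
    if not strongly_consistent:
        total = math.ceil(total / 2)
    return total

def write_capacity_units(item_kb, writes_per_sec=1):
    """WCUs needed: each write costs ceil(size / 1 KB) units."""
    return max(1, math.ceil(item_kb)) * writes_per_sec

print(read_capacity_units(10))                                 # one strongly consistent read of a 10 KB item
print(read_capacity_units(10, 10, strongly_consistent=False))  # ten eventually consistent reads per second
print(write_capacity_units(1.5, 10))                           # ten writes per second of a 1.5 KB item
```

A 10 KB item rounds up to three 4 KB blocks, which is where the 3 RCU figure for a strongly consistent read comes from.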

On-Demand Capacity

  • Purpose: Pay per request, no capacity planning.
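  • Features: Scales automatically with traffic; per-table throughput quotas still apply, and a spike far above the previous peak can be throttled briefly.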
  • Explanation: Flexible—ideal for unpredictable traffic (e.g., new app launches).

Key Notes:

  • Performance: On-Demand for bursty or unknown traffic, Provisioned for steady, predictable traffic.
  • Exam Tip: Calculate RCU/WCU for a workload (e.g., 10 KB item = 3 RCUs strongly consistent).

3. DynamoDB Indexes

Indexes enhance query flexibility.

Local Secondary Index (LSI)

  • Purpose: Alternate sort key under the same partition key as the base table.
  • Features:
    • Max 5 per table, created at table creation.
    • Shares capacity with base table.
  • Explanation: Query within a partition (e.g., userId with timestamp instead of orderId).
  • Use Case: Sort orders by date for a user.

Global Secondary Index (GSI)

  • Purpose: Alternate partition key and/or sort key.
  • Features:
    • Max 20 per table, add/remove anytime.
    • Separate capacity (RCU/WCU) from base table.
  • Explanation: Query across partitions (e.g., email as partition key instead of userId).
  • Use Case: Find users by email, not just ID.
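
As a rough sketch of how the two index types differ in a table definition, the dictionary below follows the shape of the CreateTable request. The table name, index names, and attributes are hypothetical, and the small checker is an illustration of the rules listed above, not an AWS tool.

```python
table_definition = {
    "TableName": "Orders",                      # hypothetical table
    "BillingMode": "PAY_PER_REQUEST",
    "AttributeDefinitions": [                   # only key attributes are declared
        {"AttributeName": "userId", "AttributeType": "S"},
        {"AttributeName": "orderId", "AttributeType": "S"},
        {"AttributeName": "orderDate", "AttributeType": "S"},
        {"AttributeName": "email", "AttributeType": "S"},
    ],
    "KeySchema": [
        {"AttributeName": "userId", "KeyType": "HASH"},    # partition key
        {"AttributeName": "orderId", "KeyType": "RANGE"},  # sort key
    ],
    "LocalSecondaryIndexes": [{                 # same partition key, different sort key
        "IndexName": "byOrderDate",
        "KeySchema": [
            {"AttributeName": "userId", "KeyType": "HASH"},
            {"AttributeName": "orderDate", "KeyType": "RANGE"},
        ],
        "Projection": {"ProjectionType": "ALL"},
    }],
    "GlobalSecondaryIndexes": [{                # different partition key
        "IndexName": "byEmail",
        "KeySchema": [{"AttributeName": "email", "KeyType": "HASH"}],
        "Projection": {"ProjectionType": "KEYS_ONLY"},
    }],
}

def key_names(key_schema):
    """Return {'HASH': name, 'RANGE': name} for a key schema."""
    return {k["KeyType"]: k["AttributeName"] for k in key_schema}

def check_definition(defn):
    """Return a list of rule violations (empty if none were found)."""
    problems = []
    defined = {a["AttributeName"] for a in defn["AttributeDefinitions"]}
    table_hash = key_names(defn["KeySchema"])["HASH"]
    lsis = defn.get("LocalSecondaryIndexes", [])
    gsis = defn.get("GlobalSecondaryIndexes", [])
    if len(lsis) > 5:
        problems.append("more than 5 LSIs")
    if len(gsis) > 20:
        problems.append("more than 20 GSIs")
    for index in lsis:
        if key_names(index["KeySchema"])["HASH"] != table_hash:
            problems.append(index["IndexName"] + ": LSI must reuse the table's partition key")
    for schema in [defn["KeySchema"]] + [i["KeySchema"] for i in lsis + gsis]:
        for key in schema:
            if key["AttributeName"] not in defined:
                problems.append(key["AttributeName"] + ": key attribute not defined")
    return problems

print(check_definition(table_definition))
```

A dictionary of this shape should be usable as keyword arguments to boto3's `create_table`, although that call is not exercised here.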

Key Notes:

  • Exam Relevance: LSI for intra-partition queries, GSI for cross-partition.
  • Mastery Tip: Design a table with LSI and GSI for a multi-access app.

4. DynamoDB Resilience Features

Resilience ensures data durability and availability.

Multi-AZ Replication

  • Purpose: High availability and durability.
  • How It Works: Data is automatically replicated across multiple AZs within a Region.
  • Explanation: Automatic—no config needed, survives AZ failures.

Global Tables

  • Purpose: Multi-region replication.
  • How It Works: Replicates the table to other Regions asynchronously; replicas are eventually consistent with each other by default.
  • Explanation: Enables low-latency reads globally and DR—e.g., us-east-1 + eu-west-1.
  • Requirement: On-Demand or Provisioned with auto-scaling.

Point-in-Time Recovery (PITR)

  • Purpose: Restore to any second in the last 35 days.
  • How It Works: Continuous backups; disabled by default and must be enabled per table.
  • Explanation: Protects against accidental deletes—restores to new table.

Key Notes:

  • Resilience: Global Tables for DR, PITR for recovery.
  • Exam Tip: Know PITR setup and Global Table consistency model.

5. DynamoDB Performance Features

DynamoDB excels in high-performing architectures.

Low Latency

  • Purpose: Single-digit millisecond reads/writes.
  • Explanation: SSD-based, distributed design—scales with partitions.

DynamoDB Accelerator (DAX)

  • Purpose: In-memory caching for reads.
  • Features:
    • Microsecond latency (vs. milliseconds).
    • Write-through cache (updates table + cache).
  • Explanation: Offloads read traffic—e.g., leaderboard queries.
  • Use Case: Read-heavy apps (e.g., social media).
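
The write-through behaviour can be sketched with a toy in-memory model. This is not the DAX client API; the class, its fields, and the plain dict standing in for the table are assumptions made purely for illustration.

```python
class WriteThroughCache:
    """Toy model: writes go to the table and the cache; reads hit the cache first."""

    def __init__(self, table):
        self.table = table        # plain dict standing in for the DynamoDB table
        self.cache = {}
        self.table_reads = 0      # how many reads fell through to the table

    def put(self, key, item):
        self.table[key] = item    # write to the table...
        self.cache[key] = item    # ...and keep the cache in step

    def get(self, key):
        if key in self.cache:     # cache hit: no table read
            return self.cache[key]
        self.table_reads += 1     # cache miss: read the table, then populate the cache
        item = self.table.get(key)
        if item is not None:
            self.cache[key] = item
        return item

store = WriteThroughCache({"player#1": {"score": 95}})
store.put("player#2", {"score": 88})
store.get("player#2")   # served from the cache because the write populated it
store.get("player#1")   # first read misses and goes to the table
store.get("player#1")   # second read is a cache hit
print("table reads:", store.table_reads)
```

Only the first read of an item that was never written through the cache reaches the table, which is the read-offloading effect described above.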

Streams

  • Purpose: Capture table changes in near real-time.
  • Features:
    • 24-hour retention.
    • Can trigger Lambda functions or be read by applications using the DynamoDB Streams Kinesis Adapter (KCL).
  • Explanation: Enables event-driven apps—e.g., update cache on item change.
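
A minimal sketch of a Lambda handler for stream records follows. It assumes the usual stream event layout (a `Records` list whose entries carry `eventName` and a `dynamodb` section with `Keys` in DynamoDB's typed JSON); the sample event and attribute names are invented for illustration.

```python
def handler(event, context=None):
    """Count change types and collect the keys of changed items."""
    counts = {"INSERT": 0, "MODIFY": 0, "REMOVE": 0}
    changed_keys = []
    for record in event.get("Records", []):
        name = record["eventName"]
        counts[name] = counts.get(name, 0) + 1
        keys = record["dynamodb"]["Keys"]
        # Unwrap typed values such as {"S": "123"} into plain values.
        changed_keys.append({attr: next(iter(typed.values())) for attr, typed in keys.items()})
    return {"counts": counts, "keys": changed_keys}

# Hypothetical event with one insert and one update.
sample_event = {
    "Records": [
        {
            "eventName": "INSERT",
            "dynamodb": {
                "Keys": {"userId": {"S": "123"}},
                "NewImage": {"userId": {"S": "123"}, "score": {"N": "95"}},
            },
        },
        {
            "eventName": "MODIFY",
            "dynamodb": {"Keys": {"userId": {"S": "456"}}},
        },
    ]
}

result = handler(sample_event)
print(result)
```

In a real function the loop body would refresh a cache entry or update an aggregate; whether `NewImage` or `OldImage` is present depends on the stream view type configured on the table.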

Key Notes:

  • Performance: DAX for reads, Streams for reactivity.
  • Exam Tip: Compare DAX vs. ElastiCache (DAX is DynamoDB-specific).

6. DynamoDB Security Features

Security is critical for SAA-C03.

Encryption

  • At Rest: Always encrypted using AWS KMS (AWS owned key by default, or an AWS managed or customer managed key).
  • In Transit: HTTPS/TLS.
  • Explanation: Meets compliance needs (e.g., GDPR).

Access Control

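  • IAM Policies: Fine-grained access (e.g., allow reads on specific attributes).
    • Example: {"Effect": "Allow", "Action": "dynamodb:GetItem", "Resource": "arn:aws:dynamodb:<region>:<account-id>:table/users", "Condition": {"ForAllValues:StringEquals": {"dynamodb:Attributes": ["userId", "name"]}, "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"}}} (replace the <region> and <account-id> placeholders; the attribute list must include the table's key attributes).
  • VPC Endpoints: Private access through a gateway endpoint (no additional charge) or an interface endpoint (AWS PrivateLink).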
  • Explanation: Keeps traffic off the internet—secure for enterprise apps.
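
For practice, an attribute-level policy can be assembled and printed as JSON. This is a sketch under stated assumptions: the account ID, Region, table name, and attribute names are placeholders, and the helper function is hypothetical. The condition keys `dynamodb:Attributes` and `dynamodb:Select` are the ones DynamoDB uses for fine-grained access control.

```python
import json

def attribute_read_policy(table_arn, attributes):
    """Build a policy document allowing GetItem on the listed attributes only."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Action": ["dynamodb:GetItem"],
            "Resource": table_arn,
            "Condition": {
                # Every attribute named in the request must be in this list.
                "ForAllValues:StringEquals": {"dynamodb:Attributes": list(attributes)},
                # Requests that use Select must ask for specific attributes, not all of them.
                "StringEqualsIfExists": {"dynamodb:Select": "SPECIFIC_ATTRIBUTES"},
            },
        }],
    }

# Placeholder ARN; the key attribute (userId) is included alongside the readable attribute.
policy = attribute_read_policy(
    "arn:aws:dynamodb:us-east-1:123456789012:table/users",
    ["userId", "name"],
)
print(json.dumps(policy, indent=2))
```

With a policy like this, a caller is expected to name the attributes it wants (for example through a projection expression), since a request for all attributes falls outside the allowed list.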

Key Notes:

  • Security: IAM policies enforce least privilege; VPC endpoints keep traffic on the AWS network.
  • Exam Tip: Practice IAM policy for attribute-level access.

7. DynamoDB Cost Optimization

Cost efficiency is a key exam domain.

Capacity Modes

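  • Provisioned: Cheaper for steady loads; billed per provisioned RCU/WCU-hour whether used or not (e.g., roughly $0.00065 per WCU-hour in us-east-1, about $47/month for 100 WCUs).
  • On-Demand: Flexible but pricier per request; billed per million request units (e.g., $1.25/million writes was the long-standing us-east-1 rate). Rates vary by Region and change over time.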
  • Explanation: Over-provisioning wastes money—use auto-scaling.
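
A rough cost comparison can be worked through in a few lines. The rates passed in below are assumptions for illustration only (actual prices vary by Region and change over time), every write is assumed to be 1 KB or smaller, and a month is taken as 730 hours.

```python
HOURS_PER_MONTH = 730  # common approximation for billing estimates

def provisioned_monthly_cost(wcus, price_per_wcu_hour):
    """Cost of keeping `wcus` write capacity units provisioned for a month."""
    return wcus * price_per_wcu_hour * HOURS_PER_MONTH

def on_demand_monthly_cost(writes_per_month, price_per_million_writes):
    """Cost of paying per write request (items up to 1 KB)."""
    return writes_per_month / 1_000_000 * price_per_million_writes

# Assumed example rates; look up current pricing before relying on them.
ASSUMED_WCU_HOUR = 0.00065
ASSUMED_PER_MILLION = 1.25

# A steady 100 writes per second, all month.
steady_writes = 100 * 3600 * HOURS_PER_MONTH
provisioned = provisioned_monthly_cost(100, ASSUMED_WCU_HOUR)
on_demand = on_demand_monthly_cost(steady_writes, ASSUMED_PER_MILLION)

print(f"provisioned: ${provisioned:.2f}  on-demand: ${on_demand:.2f}")
```

Under these assumed rates the fully utilized provisioned table is far cheaper. The comparison flips when the provisioned capacity sits mostly idle, which is the case for spiky or low-volume traffic.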

Time to Live (TTL)

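  • Purpose: Auto-delete expired items at no extra cost (TTL deletes consume no WCUs).
  • How It Works: Designate a Number attribute that holds the expiry as a Unix epoch timestamp in seconds (e.g., expireAt); deletion runs in the background and can lag the expiry time.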
  • Explanation: Reduces storage costs—e.g., session data cleanup.
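
Computing the TTL value is simple arithmetic on epoch seconds. The sketch below uses hypothetical attribute names (`sessionId`, `expireAt`) and adds an `is_expired` helper, since items past their expiry may remain readable until the background delete runs.

```python
import time

def session_item(session_id, ttl_seconds, now=None):
    """Build a session item whose expireAt is `ttl_seconds` from now (epoch seconds)."""
    now = time.time() if now is None else now
    return {"sessionId": session_id, "expireAt": int(now) + ttl_seconds}

def is_expired(item, now=None):
    """Client-side check for items that are past expiry but not yet deleted."""
    now = time.time() if now is None else now
    return item["expireAt"] <= int(now)

item = session_item("abc123", ttl_seconds=3600)
print(item, "expired:", is_expired(item))
```

The `now` parameter exists only to make the functions easy to test with a fixed clock.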

DAX

  • Cost: Extra charge per node-hour; the rate depends on node type and Region.
  • Explanation: Use only for high-read apps to justify expense.

Key Notes:

  • Cost Savings: TTL + right capacity mode = efficient usage.
  • Exam Tip: Calculate cost for Provisioned vs. On-Demand.

8. DynamoDB Use Cases

Understand practical applications.

Real-Time Analytics

  • Setup: Table + Streams + Lambda.
  • Features: Millisecond latency, event-driven.
  • Explanation: Leaderboards, live metrics.

Session Management

  • Setup: Table with TTL.
  • Features: Scalable, auto-expiry.
  • Explanation: Web app sessions—e.g., cart data.

IoT Data Store

  • Setup: Global Tables + DAX.
  • Features: Low latency, global access.
  • Explanation: Sensor data across regions.

9. DynamoDB Operations

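  • Batch Operations: BatchWriteItem accepts up to 25 put/delete requests per call; BatchGetItem retrieves up to 100 items per call.
    • Explanation: Reduces API calls and round trips; capacity is still consumed per item.
  • Transactions: Atomic, all-or-nothing writes/reads across up to 100 items per request.
    • Explanation: Ensures consistency (e.g., deduct stock + log sale); transactional operations consume twice the capacity of standard ones.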
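
Because a single BatchWriteItem call accepts at most 25 put/delete requests, larger writes have to be split. The sketch below builds request payloads in the shape of the BatchWriteItem API without sending them; the table name and items are hypothetical, and items use DynamoDB's typed JSON.

```python
BATCH_WRITE_LIMIT = 25  # maximum put/delete requests per BatchWriteItem call

def chunked(items, size=BATCH_WRITE_LIMIT):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def build_batch_requests(table_name, items):
    """Return one BatchWriteItem-shaped payload per chunk of items."""
    return [
        {"RequestItems": {table_name: [{"PutRequest": {"Item": item}} for item in chunk]}}
        for chunk in chunked(items)
    ]

# 60 hypothetical items in typed JSON form.
items = [{"userId": {"S": f"user-{i}"}, "score": {"N": str(i)}} for i in range(60)]
requests = build_batch_requests("users", items)

for request in requests:
    print(len(request["RequestItems"]["users"]), "put requests")
```

When sending such payloads, a caller would also need to retry anything returned under `UnprocessedItems`, since a batch can partially succeed.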

Detailed Explanations for Mastery

  • Partition Key Design:
    • Example: userId (good), status (bad—hot partitions).
    • Why It Matters: Uneven distribution throttles performance—key exam pitfall.
  • RCU/WCU Calculation:
    • Example: 10 KB item, strongly consistent read = 3 RCUs (10 ÷ 4, rounded up).
    • Why It Matters: Miscalculate, and you over/under-provision.
  • Streams Use Case:
    • Example: Item update → Lambda → update S3 aggregate.
    • Why It Matters: Event-driven pattern—common in SAA-C03.

Quick Reference Table

| Feature | Purpose | Key Detail | Exam Relevance |
|---|---|---|---|
| Primary Keys | Data access | Partition, Composite | Core Concept |
| Capacity Modes | Throughput | Provisioned, On-Demand | Cost, Performance |
| LSI/GSI | Query flexibility | 5 LSI, 20 GSI | Performance |
| Global Tables | Multi-region | Eventual consistency | Resilience |
| DAX | Read acceleration | Microsecond latency | Performance |
| Streams | Change capture | 24-hour retention | Performance |
| Encryption | Security | KMS, TLS | Security |
| TTL | Auto-delete | Free, attribute-based | Cost Optimization |