Skip to content

Others

1. AWS Data Exchange

  • Definition: A service to find, subscribe to, and use third-party data in the AWS Cloud.
  • Key Features:
    • Access 3,500+ data products from 300+ providers.
    • Integrates with S3, Redshift, Athena, and QuickSight for analytics.
    • Supports data delivery without infrastructure management.
  • Use Cases: Market research, financial analysis, IoT data enrichment.
  • Performance: Scales to deliver data to millions of AWS users.
  • Resilience: S3-based storage ensures 11 9s durability.
  • Security: IAM for access control, KMS for encryption, no sharing of sensitive private data (e.g., medical records).
  • Cost: Pay-per-use (subscription + data usage); no infrastructure costs.
  • Integrations: S3, Redshift, Athena, QuickSight, Glue.
  • Recent Updates: Enhanced data catalog integration (2024).
  • Exam Tip: Use for third-party data integration; compare with AWS Glue for ETL.

2. AWS Lake Formation

  • Definition: A service to build, manage, and secure data lakes on S3.
  • Key Features:
    • Creates data lakes by cataloging and moving data to S3.
    • Fine-grained access control (row/column-level) via IAM and Lake Formation permissions.
    • Integrates with Glue for ETL and Data Catalog.
  • Use Cases: Centralized data lakes for analytics, ML, and reporting.
  • Performance: Scales to petabytes; optimized for Athena, Redshift, EMR queries.
  • Resilience: S3 durability (11 9s), multi-AZ metadata storage.
  • Security: KMS encryption, Lake Formation permissions, Security Hub integration (2025).
  • Cost: $1/100K catalog objects, $0.44/DPU-hour for Glue ETL; optimize with partitioning.
  • Integrations: S3, Glue, Athena, Redshift, QuickSight, EMR, SageMaker.
  • Recent Updates: Enhanced row/column access, cross-account sharing (Jan 2025).
  • Exam Tip: Use for secure data lakes; master Lake Formation vs. IAM permissions.

3. Amazon Managed Streaming for Apache Kafka (Amazon MSK)

  • Definition: A fully managed service for running Apache Kafka to process real-time streaming data.
  • Key Features:
    • Supports Kafka APIs for producers/consumers.
    • Auto-manages brokers, ZooKeeper, and scaling.
    • Integrates with Kinesis Firehose for S3 delivery.
  • Use Cases: Streaming pipelines, log processing, IoT analytics.
  • Performance: Low-latency (milliseconds), scales to GB/s with brokers.
  • Resilience: Multi-AZ clusters, auto-recovery of failed brokers.
  • Security: IAM, KMS encryption, SASL/SCRAM, Security Hub (2025).
  • Cost: $0.21–$1.08/broker-hour, $0.02/GB-month storage; optimize brokers.
  • Integrations: Kinesis Firehose, S3, OpenSearch, Lambda, Redshift.
  • Recent Updates: Managed S3 delivery via Firehose (2023), throughput enhancements (2024).
  • Exam Tip: Compare MSK (Kafka) vs. Kinesis for streaming.

4. AWS Data Pipeline

  • Definition: A managed orchestration service for scheduling and automating data movement and transformation.
  • Key Features:
    • Schedules ETL workflows across AWS services (e.g., S3, RDS, Redshift).
    • Supports on-premises data sources.
    • Uses tasks runners (EC2, Lambda) for processing.
  • Use Cases: Legacy ETL pipelines, periodic data transfers.
  • Performance: Limited scalability compared to Glue; suitable for simple workflows.
  • Resilience: Retries failed tasks, S3 durability for data.
  • Security: IAM roles, KMS encryption for data.
  • Cost: $1–$2.50/pipeline/month; optimize with scheduling.
  • Integrations: S3, RDS, Redshift, DynamoDB, EMR.
  • Recent Updates: No major updates; considered legacy vs. Glue.
  • Exam Tip: Use for simple, scheduled ETL; prefer Glue for modern workloads.

5. Amazon OpenSearch Service

  • Definition: A managed service for deploying and scaling OpenSearch clusters for search and analytics.
  • Key Features:
    • Supports OpenSearch (up to 2.17) and legacy Elasticsearch (up to 7.10).
    • Ingests logs, metrics, and traces for real-time analytics.
    • Zero-ETL integrations with S3, CloudWatch, Security Lake.
  • Use Cases: Log analytics, application search, security analytics.
  • Performance: Millisecond query latency, scales to petabytes with OR1 instances (2024).
  • Resilience: Multi-AZ, auto-replacement of failed nodes, 11 9s S3 durability.
  • Security: IAM, KMS, fine-grained access, Security Hub (2025).
  • Cost: $0.03–$1.08/instance-hour, $0.024/GB-month storage; use serverless for auto-scaling.
  • Integrations: S3, Kinesis, Lambda, QuickSight, CloudWatch, Security Lake.
  • Recent Updates: Vector search on disk (2024), OpenSearch 2.17 support (2025).
  • Exam Tip: Use for search/log analytics; compare with Athena for SQL queries.

6. Amazon QuickSight

  • Definition: A serverless business intelligence (BI) service for creating interactive dashboards and visualizations.
  • Key Features:
    • Supports data from S3, Redshift, Athena, OpenSearch, and more.
    • ML-powered insights (e.g., anomaly detection, forecasting).
    • QuickSight Q for natural language queries (e.g., “Top sales in 2024”).
  • Use Cases: Business reporting, real-time dashboards, data exploration.
  • Performance: SPICE engine for fast queries; scales to thousands of users.
  • Resilience: Serverless, multi-AZ, durable S3 storage for datasets.
  • Security: IAM, KMS, row-level security, VPC connectivity.
  • Cost: $9–$18/user/month (standard), $0.30/GB SPICE; optimize with SPICE caching.
  • Integrations: S3, Redshift, Athena, OpenSearch, Lake Formation, Data Exchange.
  • Recent Updates: Enhanced Q natural language capabilities (2024).
  • Exam Tip: Use for BI; compare with OpenSearch for operational analytics.

Quick Comparison Table

Service Purpose Key Use Case Cost Model Key Integration
AWS Data Exchange Third-party data access Market research Subscription + usage S3, Redshift, Athena
AWS Lake Formation Secure data lake management Centralized analytics $1/100K objects, Glue ETL Glue, Athena, QuickSight
Amazon MSK Managed Kafka streaming Real-time pipelines $0.21–$1.08/broker-hour Firehose, S3, OpenSearch
AWS Data Pipeline ETL orchestration Legacy data transfers $1–$2.50/pipeline/month S3, RDS, Redshift
Amazon OpenSearch Search and log analytics Log monitoring $0.03–$1.08/instance-hour S3, Kinesis, QuickSight
Amazon QuickSight BI and visualizations Dashboards $9–$18/user/month S3, Athena, OpenSearch

Study Tips

  • Hands-On:
    • Data Exchange: Subscribe to a dataset, query with Athena.
    • Lake Formation: Create a data lake, apply row-level permissions.
    • MSK: Set up a Kafka cluster, stream to S3 via Firehose.
    • Data Pipeline: Schedule an S3-to-Redshift pipeline.
    • OpenSearch: Ingest logs from Kinesis, query with dashboards.
    • QuickSight: Build a dashboard from S3 data via Athena.
  • Scenarios:
    • Third-party data = Data Exchange.
    • Secure data lake = Lake Formation.
    • Kafka streaming = MSK.
    • Legacy ETL = Data Pipeline.
    • Log analytics = OpenSearch.
    • BI dashboards = QuickSight.
  • Memorize: Pricing, integrations, Lake Formation permissions, OpenSearch vs. QuickSight use cases.