Introduction
Database selection is one of the most heavily tested topics on the SAP-C02 exam. AWS offers more than 15 purpose-built database services, and the exam expects you to choose the right one for each workload based on data model, access patterns, performance requirements, cost constraints, and operational preferences.
Unlike the Solutions Architect Associate exam, the SAP-C02 goes beyond basic service descriptions. You need to understand advanced configurations like Aurora Global Database failover behavior, DynamoDB GSI projections and their cost implications, read replica promotion across regions, and migration strategies from on-premises databases to AWS.
This guide provides a decision framework and detailed comparison to help you navigate database selection questions on the SAP-C02.
AWS Database Services Overview
| Service | Type | Data Model | Best For |
|---|---|---|---|
| Amazon RDS | Relational (managed) | Tables, rows, columns | Traditional OLTP applications |
| Amazon Aurora | Relational (cloud-native) | Tables, rows, columns | High-performance relational workloads |
| Amazon DynamoDB | Key-value / Document | Items with attributes | High-scale, low-latency applications |
| Amazon ElastiCache | In-memory | Key-value | Caching, session stores, leaderboards |
| Amazon MemoryDB | In-memory (durable) | Key-value | Redis-compatible with durability |
| Amazon Neptune | Graph | Nodes and edges | Social networks, fraud detection, knowledge graphs |
| Amazon DocumentDB | Document | JSON documents | MongoDB-compatible workloads |
| Amazon Keyspaces | Wide-column | Partitioned rows | Cassandra-compatible workloads |
| Amazon Timestream | Time-series | Time-stamped records | IoT, DevOps monitoring, analytics |
| Amazon QLDB | Ledger | Journal with hash chain | Immutable audit trails, regulatory compliance |
| Amazon Redshift | Data warehouse | Columnar tables | Analytics, BI, large-scale aggregations |
| Amazon OpenSearch | Search and analytics | Documents with indexes | Full-text search, log analytics |
Database Decision Framework
When facing a SAP-C02 question about database selection, work through these decision points:
Step 1: Identify the Data Model
- Structured data with relationships and transactions? Relational (RDS, Aurora)
- Key-value lookups at massive scale? DynamoDB
- JSON documents with flexible schema? DocumentDB or DynamoDB
- Highly connected data with traversal queries? Neptune
- Time-stamped metrics or events? Timestream
- Immutable, cryptographically verifiable records? QLDB
- Large-scale analytics and aggregations? Redshift
- Full-text search with fuzzy matching? OpenSearch
Step 2: Define Performance Requirements
- Single-digit millisecond reads/writes at any scale? DynamoDB
- Microsecond read latency? ElastiCache or DAX (in front of DynamoDB)
- 5x throughput of standard MySQL, 3x of PostgreSQL? Aurora
- Sub-second query on petabytes of data? Redshift with RA3 nodes
- Real-time search across millions of documents? OpenSearch
Step 3: Consider Operational Requirements
- Fully serverless (no capacity management)? DynamoDB on-demand, Aurora Serverless v2, Keyspaces, Timestream, QLDB
- Minimal migration effort from on-premises MySQL/PostgreSQL? RDS or Aurora
- MongoDB compatibility required? DocumentDB
- Cassandra compatibility required? Keyspaces
- Redis compatibility with durability? MemoryDB for Redis
Step 4: Evaluate Multi-Region Requirements
- Active-active multi-region writes? DynamoDB Global Tables
- Low-latency reads in multiple regions with single-region writes? Aurora Global Database (read replicas in up to 5 secondary regions)
- Cross-region failover with RPO near zero? Aurora Global Database (RPO < 1 second)
- Cross-region replication for ElastiCache? ElastiCache Global Datastore
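As a study aid, the four steps above can be condensed into a toy lookup function. This is a deliberate simplification (the hypothetical `pick_database` helper ignores cost, compatibility, and operational factors that real exam questions layer in):

```python
def pick_database(data_model, needs_multi_region_writes=False,
                  needs_microsecond_reads=False):
    """Toy decision helper mirroring the four-step framework.

    Study aid only -- real selection also weighs cost, engine
    compatibility, and operational requirements together.
    """
    if needs_microsecond_reads:          # Step 2 shortcut
        return "ElastiCache or DAX"
    if data_model == "key-value":        # Steps 1 and 4 combined
        return ("DynamoDB Global Tables" if needs_multi_region_writes
                else "DynamoDB")
    choices = {                          # Step 1: match the data model
        "relational": "Aurora or RDS",
        "graph": "Neptune",
        "time-series": "Timestream",
        "ledger": "QLDB",
        "search": "OpenSearch",
        "analytics": "Redshift",
    }
    return choices.get(data_model, "re-examine the requirements")

print(pick_database("graph"))                                     # Neptune
print(pick_database("key-value", needs_multi_region_writes=True))  # DynamoDB Global Tables
```

Practicing with a mapping like this reinforces that most SAP-C02 database questions are eliminations: the data model rules out most options before performance details matter.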
Amazon RDS Deep Dive
When to Choose RDS
- Application uses standard relational engines (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server)
- Need managed patching, backups, and point-in-time recovery
- Workload fits within a single instance (vertical scaling)
- Licensing requirements necessitate Oracle or SQL Server (Bring Your Own License support)
Key SAP-C02 Concepts
- Multi-AZ deployments: Synchronous standby replica for high availability. Automatic failover in 60-120 seconds. No read traffic served from standby (except Multi-AZ DB Cluster for PostgreSQL and MySQL, which does support reads).
- Multi-AZ DB Cluster: Available for MySQL and PostgreSQL. Two readable standbys. Faster failover (~35 seconds). Better write performance with local storage on each instance.
- Read Replicas: Asynchronous replication. RDS for MySQL, MariaDB, and PostgreSQL supports up to 15 read replicas; Oracle and SQL Server support up to 5. Replicas can live in different regions for disaster recovery and low-latency reads, and can be promoted to standalone databases.
- Cross-Region Read Replicas: Serve read traffic in remote regions. Can be promoted during disaster recovery. Creates a new independent database when promoted (breaks replication).
- RDS Custom: For Oracle and SQL Server, provides access to the underlying OS and database. Use when you need custom plugins, patches, or configurations not supported by standard RDS.
Amazon Aurora Deep Dive
When to Choose Aurora over RDS
- Need higher performance than standard RDS (5x MySQL, 3x PostgreSQL throughput)
- Require more than 5 read replicas (Aurora supports up to 15)
- Need faster failover (typically under 30 seconds; Aurora promotes a replica automatically and clients reconnect through the cluster endpoint)
- Want auto-scaling read replicas
- Need Aurora-specific features: Global Database, Serverless v2, parallel query
Aurora Architecture
Aurora separates compute from storage. The storage layer replicates data six ways across three Availability Zones, providing durability without needing Multi-AZ compute configuration.
- Storage auto-scaling: Up to 128 TB, grows in 10 GB increments automatically
- Continuous backup to S3: Enables point-in-time recovery. Backtrack (Aurora MySQL-compatible only) additionally rewinds the database in place to an earlier point without restoring from a backup
- Cluster endpoint: Directs writes to the primary instance
- Reader endpoint: Load-balances reads across read replicas
Aurora Global Database
This is a critical SAP-C02 topic. Aurora Global Database enables:
- One primary region for read-write operations
- Up to five secondary regions with read replicas (up to 16 per region)
- Replication lag under 1 second (typically under 200ms) using dedicated infrastructure
- Managed planned failover for controlled region switches (no data loss)
- Unplanned failover (detach and promote) for disaster recovery. RPO measured in seconds. RTO typically under 1 minute after the secondary is promoted.
- Write forwarding: Secondary region readers can forward writes to the primary region (adds latency but simplifies application logic)
Aurora Serverless v2
- Scales compute capacity in fine-grained increments (0.5 ACU steps) from 0.5 to 256 ACUs
- Scales in seconds, not minutes
- Supports read replicas, Global Database, and Multi-AZ
- Use when workload is variable, intermittent, or unpredictable
- More cost-effective than provisioned Aurora for spiky workloads; more expensive for steady, high-throughput workloads
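The breakeven logic behind that last point can be sketched with back-of-envelope arithmetic. The per-unit prices below are hypothetical placeholders chosen to make the comparison readable, not current AWS rates:

```python
# Breakeven sketch: Serverless v2 vs provisioned Aurora.
# PRICES ARE HYPOTHETICAL PLACEHOLDERS -- check current AWS pricing.
ACU_HOUR = 0.12          # assumed $/ACU-hour for Serverless v2
PROVISIONED_HOUR = 1.00  # assumed $/hour for a fixed instance (~8 ACUs worth)

def monthly_cost_serverless(avg_acus, hours=730):
    """Serverless v2 bills on consumed ACUs, so cost tracks the average load."""
    return avg_acus * ACU_HOUR * hours

def monthly_cost_provisioned(hours=730):
    """A provisioned instance bills a flat rate whether busy or idle."""
    return PROVISIONED_HOUR * hours

# A spiky workload averaging 2 ACUs is cheaper serverless;
# a workload pinned near 12 ACUs favors a provisioned instance.
print(monthly_cost_serverless(2.0))    # ~175
print(monthly_cost_serverless(12.0))   # ~1051
print(monthly_cost_provisioned())      # 730.0
```

The exam rarely asks for dollar figures, but it does test this shape of reasoning: average utilization, not peak, determines whether serverless saves money.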
Amazon DynamoDB Deep Dive
When to Choose DynamoDB
- Need consistent single-digit millisecond performance at any scale
- Data model fits key-value or document access patterns
- Application requires active-active multi-region writes
- Want fully serverless with zero operational overhead
- Need a database for serverless architectures (Lambda, API Gateway)
Key SAP-C02 Concepts
- Partition key design: Distribute reads and writes evenly across partitions. Avoid hot partitions. Use high-cardinality attributes as partition keys.
- Sort key: Enables range queries within a partition. Combine with partition key for composite primary key.
- Global Secondary Index (GSI): Defines an alternative partition key and optional sort key for querying. Each GSI has its own provisioned capacity. Project only the attributes you need to minimize storage and write costs.
- Local Secondary Index (LSI): Same partition key as the table but different sort key. Must be created at table creation time. Shares the table’s provisioned capacity.
- On-demand vs Provisioned capacity: On-demand for unpredictable workloads (pay per request). Provisioned for steady workloads (cheaper, supports auto-scaling and reserved capacity).
- DynamoDB Streams: Capture item-level changes. Feed to Lambda for real-time processing. Essential for event-driven architectures and cross-service synchronization.
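The hot-partition advice above can be demonstrated locally. The sketch below models hash partitioning with Python's `hashlib`; DynamoDB's internal hash function is not public, so this illustrates the principle, not the service's actual behavior:

```python
import hashlib
from collections import Counter

def partition_for(key, num_partitions=8):
    """Map an item key to a partition the way a hash-partitioned store
    might (illustrative -- DynamoDB's internal hashing is not exposed)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_partitions

# High-cardinality keys (e.g. per-user IDs) spread load evenly...
spread = Counter(partition_for(f"user-{i}") for i in range(10_000))

# ...while a low-cardinality key (e.g. one tenant ID for every item)
# concentrates all traffic on a single partition -- a hot partition
# that caps throughput regardless of total provisioned capacity.
hot = Counter(partition_for("tenant-1") for _ in range(10_000))

print(sorted(spread.values()))  # roughly equal counts per partition
print(hot)                      # all 10,000 requests hit one partition
```

This is why the exam favors answers like "use a high-cardinality attribute as the partition key" or "add a random suffix to the key" when a scenario describes throttling on one partition.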
DynamoDB Global Tables
- Multi-active (active-active) replication across selected AWS regions
- Last-writer-wins conflict resolution based on timestamps
- Provides local read and write latency in each region
- Requires DynamoDB Streams to be enabled
- The current Global Tables version (2019.11.21) lets you add replica regions to a table that already contains data; the legacy 2017 version required empty tables
- Use for globally distributed applications requiring low-latency read and write access in multiple regions
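Last-writer-wins resolution is simple enough to model in a few lines (an illustrative sketch of the semantics, not the service's implementation):

```python
from dataclasses import dataclass

@dataclass
class Version:
    value: str
    timestamp: float   # write time recorded with the item
    region: str

def last_writer_wins(a: Version, b: Version) -> Version:
    """Resolve a concurrent write the way Global Tables does:
    the version with the later timestamp wins (illustrative sketch)."""
    return a if a.timestamp >= b.timestamp else b

# Two regions write the same item nearly simultaneously.
us = Version("cart=3 items", timestamp=1700000000.10, region="us-east-1")
eu = Version("cart=2 items", timestamp=1700000000.25, region="eu-west-1")

winner = last_writer_wins(us, eu)
print(winner.region, winner.value)  # eu-west-1 cart=2 items
```

Note the implication for exam scenarios: the losing write is silently discarded, so Global Tables suits workloads that tolerate occasional overwrite, not ones needing strict cross-region transactional consistency.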
DynamoDB Accelerator (DAX)
- In-memory cache for DynamoDB with microsecond read latency
- Write-through cache (writes go to DynamoDB, reads served from cache)
- API-compatible with DynamoDB (minimal code changes)
- Deployed within a VPC
- Use for read-heavy workloads with hot key patterns
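The write-through pattern can be illustrated with a minimal in-process model (a teaching sketch of the behavior, not the DAX client API):

```python
class WriteThroughCache:
    """Minimal write-through cache sketch illustrating DAX behavior:
    writes update the backing store and the cache together; repeated
    reads of the same key are served from the cache. Teaching model
    only -- not the actual DAX client."""

    def __init__(self, backing_store):
        self.store = backing_store   # stands in for the DynamoDB table
        self.cache = {}
        self.hits = 0

    def put(self, key, value):
        self.store[key] = value      # write lands in the "table"...
        self.cache[key] = value      # ...and writes through to the cache

    def get(self, key):
        if key in self.cache:
            self.hits += 1           # cache hit: no table read needed
            return self.cache[key]
        value = self.store[key]      # cache miss: fetch and populate
        self.cache[key] = value
        return value

table = {}
dax = WriteThroughCache(table)
dax.put("order#1", {"status": "shipped"})
print(dax.get("order#1"), dax.hits)  # served from cache; hits == 1
```

Because writes pass through to the table before being cached, reads after a write see the new value, which is why DAX suits read-heavy workloads with repeated access to hot keys.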
For a deeper dive into serverless patterns using DynamoDB, see our serverless architecture patterns guide.
ElastiCache Deep Dive
Redis vs Memcached on the SAP-C02
| Feature | ElastiCache for Redis | ElastiCache for Memcached |
|---|---|---|
| Data structures | Strings, lists, sets, sorted sets, hashes, streams | Simple key-value |
| Persistence | Backup and restore, append-only file | None |
| Replication | Cluster mode with read replicas | None (multi-node for sharding only) |
| Multi-AZ | Automatic failover with replicas | No |
| Global Datastore | Cross-region replication | No |
| Use case | Caching, session store, leaderboards, real-time analytics | Simple caching, large cache pools |
SAP-C02 guidance: When a question mentions caching with high availability, persistence, or advanced data structures, choose Redis. When it mentions only simple caching with maximum memory capacity, Memcached may appear as an option.
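The leaderboard use case in the table maps to Redis sorted sets (ZADD/ZREVRANGE). The sketch below models the same semantics with a plain dict so it runs without a Redis server (illustrative only, not the redis-py API):

```python
# Local model of Redis sorted-set leaderboard semantics.
# Illustrative sketch -- a real deployment would call ZADD/ZREVRANGE
# against an ElastiCache for Redis endpoint.
scores: dict[str, int] = {}

def zadd(member: str, score: int) -> None:
    """Insert or update a member's score (like Redis ZADD)."""
    scores[member] = score

def top(n: int) -> list[tuple[str, int]]:
    """Return the n highest-scoring members (like ZREVRANGE WITHSCORES)."""
    return sorted(scores.items(), key=lambda kv: -kv[1])[:n]

zadd("alice", 3200)
zadd("bob", 4100)
zadd("carol", 2900)
zadd("alice", 4500)          # updating a score reorders the ranking

print(top(2))  # [('alice', 4500), ('bob', 4100)]
```

Redis maintains this ordering incrementally in memory, which is why sorted sets give O(log N) updates and fast range reads where a relational `ORDER BY` would rescan or re-sort.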
MemoryDB for Redis
MemoryDB for Redis is a Redis-compatible, durable, in-memory database rather than a cache. Use it when the application treats Redis as its primary database and needs data durability without a separate backing store.
Specialty Database Services
Amazon Neptune
- Graph database supporting Apache TinkerPop Gremlin and W3C SPARQL query languages
- Use for social networks, recommendation engines, fraud detection, and knowledge graphs
- Supports up to 15 read replicas and Multi-AZ deployment
- Neptune Serverless for variable or intermittent graph workloads
SAP-C02 trigger: When a question mentions “relationships between entities,” “graph traversals,” “social network,” or “fraud ring detection,” Neptune is typically the answer.
Amazon DocumentDB
- MongoDB-compatible document database
- Separates storage and compute (similar to Aurora architecture)
- Up to 15 read replicas and 128 TB storage
- Use when migrating MongoDB workloads to AWS with minimal code changes
SAP-C02 trigger: When a question explicitly mentions MongoDB compatibility, choose DocumentDB. If the question mentions JSON documents without MongoDB requirements, DynamoDB is usually more appropriate.
Amazon Keyspaces
- Apache Cassandra-compatible serverless database
- Fully managed, no clusters to provision
- Use for migrating Cassandra workloads to AWS
Amazon Timestream
- Purpose-built for time-series data (IoT sensors, application metrics, DevOps monitoring)
- Automatic tiering between in-memory and magnetic storage
- Built-in time-series analytics functions
- Serverless, scales automatically
SAP-C02 trigger: IoT sensor data, application metrics with timestamps, or fleet telemetry.
Amazon QLDB
- Ledger database with immutable, cryptographically verifiable transaction log
- Uses an append-only journal with SHA-256 hash chain
- Serverless, no capacity management
- Cannot delete or modify history
SAP-C02 trigger: When a question mentions “immutable audit trail,” “regulatory compliance with verifiable history,” or “cryptographic verification of data integrity.”
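The hash-chain idea behind that trigger is easy to demonstrate: each journal entry's hash covers the previous entry's hash, so altering any historical record invalidates every later hash. Below is a toy model (not QLDB's actual journal format):

```python
import hashlib
import json

def append_entry(journal, document):
    """Append a document to an append-only journal, chaining each entry
    to its predecessor with SHA-256 (toy model of a ledger journal)."""
    prev_hash = journal[-1]["hash"] if journal else "0" * 64
    payload = json.dumps(document, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    journal.append({"doc": document, "prev": prev_hash, "hash": entry_hash})

def verify(journal):
    """Recompute every hash in order; any tampering breaks the chain."""
    prev = "0" * 64
    for entry in journal:
        payload = json.dumps(entry["doc"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

journal = []
append_entry(journal, {"account": "A-1", "balance": 100})
append_entry(journal, {"account": "A-1", "balance": 80})
print(verify(journal))              # True

journal[0]["doc"]["balance"] = 999  # tamper with history
print(verify(journal))              # False -- the chain no longer verifies
```

This is the property DynamoDB Streams lacks: Streams captures changes but cannot cryptographically prove that stored history was never altered.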
Database Migration Scenarios
The SAP-C02 tests migration from on-premises databases to AWS. Key services:
AWS Database Migration Service (DMS)
- Supports homogeneous (e.g., Oracle to Oracle) and heterogeneous (e.g., Oracle to PostgreSQL) migrations
- Continuous data replication (CDC) for near-zero-downtime migrations
- Use with AWS Schema Conversion Tool (SCT) for schema and code conversion in heterogeneous migrations
Migration Decision Table
| Source | Target | Approach |
|---|---|---|
| On-prem MySQL | Aurora MySQL | DMS with continuous replication, then cutover |
| On-prem Oracle | Aurora PostgreSQL | SCT for schema conversion + DMS for data migration |
| On-prem MongoDB | DocumentDB | DMS with MongoDB as source |
| On-prem Cassandra | Amazon Keyspaces | cqlsh COPY or DMS |
| On-prem SQL Server | RDS SQL Server | DMS or native backup/restore to S3 |
| On-prem Redis | ElastiCache Redis | Application-level migration or backup/restore |
SAP-C02 Migration Considerations
- Large databases (multi-TB): Use AWS Snowball Edge for initial data load, then DMS for CDC to catch up changes during migration.
- Minimize downtime: Use DMS continuous replication. Keep source and target in sync until cutover.
- Heterogeneous migrations: Always mention SCT for schema and stored procedure conversion alongside DMS for data movement.
SAP-C02 Database Decision Matrix
Use this quick-reference matrix during exam preparation:
| Requirement | Recommended Service |
|---|---|
| OLTP with SQL and transactions | Aurora or RDS |
| Key-value at massive scale | DynamoDB |
| Active-active multi-region writes | DynamoDB Global Tables |
| Multi-region reads with single-region writes | Aurora Global Database |
| Microsecond read latency | ElastiCache (Redis) or DAX |
| Graph relationships | Neptune |
| MongoDB compatibility | DocumentDB |
| Cassandra compatibility | Keyspaces |
| Time-series IoT data | Timestream |
| Immutable ledger | QLDB |
| Data warehouse analytics | Redshift |
| Full-text search | OpenSearch |
| Session storage with HA | ElastiCache Redis |
| Durable Redis as primary DB | MemoryDB for Redis |
Frequently Asked Questions
How do I choose between Aurora and standard RDS on the SAP-C02?
Choose Aurora when you need higher performance, more read replicas (up to 15), faster failover, storage auto-scaling, or features like Global Database, Serverless v2, or Backtrack. Choose standard RDS when you need Oracle, SQL Server, or MariaDB engines, or when Aurora’s higher per-instance cost is not justified for small workloads.
When should I use DynamoDB Global Tables vs Aurora Global Database?
Use DynamoDB Global Tables when you need active-active multi-region writes with automatic conflict resolution and a key-value/document data model. Use Aurora Global Database when you need relational data model with SQL, cross-region reads with sub-second replication lag, and single-region writes.
What is the difference between DAX and ElastiCache for caching DynamoDB?
DAX is a DynamoDB-specific cache that is API-compatible with DynamoDB, requiring minimal code changes. ElastiCache is a general-purpose cache that requires application-level cache logic. Use DAX when caching DynamoDB reads exclusively. Use ElastiCache when caching data from multiple sources or when you need advanced Redis features.
How does Aurora Serverless v2 differ from standard Aurora?
Aurora Serverless v2 scales compute automatically in 0.5 ACU increments based on workload demand. Standard Aurora requires you to select a specific instance size. Serverless v2 is ideal for variable workloads, development environments, and multi-tenant applications. Standard Aurora is more cost-effective for steady, high-throughput workloads.
When should I recommend QLDB vs DynamoDB with Streams?
Recommend QLDB when the requirement explicitly mentions immutable audit trails, cryptographic verification, or regulatory compliance requiring proof that data has not been tampered with. DynamoDB Streams provides change data capture but does not provide cryptographic hash-chain verification.
How do I handle database migration with minimal downtime on the SAP-C02?
Use AWS DMS with continuous data replication (CDC). Set up the DMS replication instance, create source and target endpoints, and run a full load followed by ongoing replication. Cut over when the replication lag is minimal. For heterogeneous migrations, use SCT first for schema conversion.
What database should I choose for a session store?
ElastiCache for Redis is the standard recommendation for session stores due to sub-millisecond latency, built-in expiration (TTL), and Multi-AZ failover. DynamoDB with TTL is an alternative when you want a fully managed serverless option without managing cache clusters.
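The TTL expiry pattern both options provide can be modeled in-process (an illustrative sketch; Redis EXPIRE and DynamoDB TTL handle expiry server-side for you):

```python
import time

class SessionStore:
    """Tiny in-process model of TTL-based session expiry -- the pattern
    ElastiCache (EXPIRE) and DynamoDB TTL implement on the server side.
    Illustrative only; not a client API for either service."""

    def __init__(self):
        self._items = {}

    def put(self, session_id, data, ttl_seconds):
        # Store the payload alongside its absolute expiry time.
        self._items[session_id] = (data, time.time() + ttl_seconds)

    def get(self, session_id):
        entry = self._items.get(session_id)
        if entry is None:
            return None
        data, expires_at = entry
        if time.time() >= expires_at:   # lazily expire on read
            del self._items[session_id]
            return None
        return data

store = SessionStore()
store.put("sess-42", {"user": "alice"}, ttl_seconds=0.05)
print(store.get("sess-42"))   # {'user': 'alice'}
time.sleep(0.1)
print(store.get("sess-42"))   # None -- expired
```

Note that DynamoDB TTL deletes expired items asynchronously (possibly hours later), so applications should still filter expired sessions on read, much like the lazy check above.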
Can Aurora handle NoSQL workloads?
Aurora is a relational database and is not designed for NoSQL access patterns. If the workload involves key-value lookups at high scale without complex joins, DynamoDB is the appropriate choice. If the workload requires JSON document storage with MongoDB compatibility, use DocumentDB.
Conclusion
Database selection questions are among the most nuanced on the SAP-C02 exam. The key is not memorizing every feature of every service, but understanding the decision framework: identify the data model, define performance requirements, consider operational preferences, and evaluate multi-region needs.
Use the decision matrix in this guide as a study reference and practice applying it to scenario-based questions. For cost implications of different database choices, review our cost optimization strategies guide.
Test your database selection skills with realistic exam scenarios using Sailor.sh’s SAP-C02 mock exams to build confidence before your exam.