Introduction
Database selection is one of the most heavily tested topics on the SAP-C02 exam. AWS offers more than 15 purpose-built database services, and the exam expects you to choose the right one for each workload based on data model, access patterns, performance requirements, cost constraints, and operational preferences.
Unlike the Solutions Architect Associate exam, the SAP-C02 goes beyond basic service descriptions. You need to understand advanced configurations like Aurora Global Database failover behavior, DynamoDB GSI projections and their cost implications, read replica promotion across regions, and migration strategies from on-premises databases to AWS.
This guide provides a decision framework and detailed comparison to help you navigate database selection questions on the SAP-C02.
AWS Database Services Overview
| Service | Type | Data Model | Best For |
|---|---|---|---|
| Amazon RDS | Relational (managed) | Tables, rows, columns | Traditional OLTP applications |
| Amazon Aurora | Relational (cloud-native) | Tables, rows, columns | High-performance relational workloads |
| Amazon DynamoDB | Key-value / Document | Items with attributes | High-scale, low-latency applications |
| Amazon ElastiCache | In-memory | Key-value | Caching, session stores, leaderboards |
| Amazon MemoryDB | In-memory (durable) | Key-value | Redis-compatible with durability |
| Amazon Neptune | Graph | Nodes and edges | Social networks, fraud detection, knowledge graphs |
| Amazon DocumentDB | Document | JSON documents | MongoDB-compatible workloads |
| Amazon Keyspaces | Wide-column | Partitioned rows | Cassandra-compatible workloads |
| Amazon Timestream | Time-series | Time-stamped records | IoT, DevOps monitoring, analytics |
| Amazon QLDB | Ledger | Journal with hash chain | Immutable audit trails, regulatory compliance |
| Amazon Redshift | Data warehouse | Columnar tables | Analytics, BI, large-scale aggregations |
| Amazon OpenSearch | Search and analytics | Documents with indexes | Full-text search, log analytics |
Database Decision Framework
When facing a SAP-C02 question about database selection, work through these decision points:
Step 1: Identify the Data Model
- Structured data with relationships and transactions? Relational (RDS, Aurora)
- Key-value lookups at massive scale? DynamoDB
- JSON documents with flexible schema? DocumentDB or DynamoDB
- Highly connected data with traversal queries? Neptune
- Time-stamped metrics or events? Timestream
- Immutable, cryptographically verifiable records? QLDB
- Large-scale analytics and aggregations? Redshift
- Full-text search with fuzzy matching? OpenSearch
Step 2: Define Performance Requirements
- Single-digit millisecond reads/writes at any scale? DynamoDB
- Microsecond read latency? ElastiCache or DAX (in front of DynamoDB)
- 5x throughput of standard MySQL, 3x of PostgreSQL? Aurora
- Sub-second query on petabytes of data? Redshift with RA3 nodes
- Real-time search across millions of documents? OpenSearch
Step 3: Consider Operational Requirements
- Fully serverless (no capacity management)? DynamoDB on-demand, Aurora Serverless v2, Keyspaces, Timestream, QLDB
- Minimal migration effort from on-premises MySQL/PostgreSQL? RDS or Aurora
- MongoDB compatibility required? DocumentDB
- Cassandra compatibility required? Keyspaces
- Redis compatibility with durability? MemoryDB for Redis
Step 4: Evaluate Multi-Region Requirements
- Active-active multi-region writes? DynamoDB Global Tables
- Low-latency reads in multiple regions with single-region writes? Aurora Global Database (read replicas in up to 5 secondary regions)
- Cross-region failover with RPO near zero? Aurora Global Database (RPO < 1 second)
- Cross-region replication for ElastiCache? ElastiCache Global Datastore
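As a study aid, the four steps above can be condensed into a toy lookup function. This is a deliberate simplification (the hypothetical `pick_database` helper ignores cost, compatibility, and operational factors that real exam questions layer in):

```python
def pick_database(data_model, needs_multi_region_writes=False,
                  needs_microsecond_reads=False):
    """Toy decision helper mirroring the four-step framework.

    Study aid only -- real selection also weighs cost, engine
    compatibility, and operational requirements together.
    """
    if needs_microsecond_reads:          # Step 2 shortcut
        return "ElastiCache or DAX"
    if data_model == "key-value":        # Steps 1 and 4 combined
        return ("DynamoDB Global Tables" if needs_multi_region_writes
                else "DynamoDB")
    choices = {                          # Step 1: match the data model
        "relational": "Aurora or RDS",
        "graph": "Neptune",
        "time-series": "Timestream",
        "ledger": "QLDB",
        "search": "OpenSearch",
        "analytics": "Redshift",
    }
    return choices.get(data_model, "re-examine the requirements")

print(pick_database("graph"))                                     # Neptune
print(pick_database("key-value", needs_multi_region_writes=True))  # DynamoDB Global Tables
```

Practicing with a mapping like this reinforces that most SAP-C02 database questions are eliminations: the data model rules out most options before performance details matter.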
Amazon RDS Deep Dive
When to Choose RDS
- Application uses standard relational engines (MySQL, PostgreSQL, MariaDB, Oracle, SQL Server)
- Need managed patching, backups, and point-in-time recovery
- Workload fits within a single instance (vertical scaling)
- Licensing requirements necessitate Oracle or SQL Server (Bring Your Own License support)
Key SAP-C02 Concepts
- Multi-AZ deployments: Synchronous standby replica for high availability. Automatic failover in 60-120 seconds. No read traffic served from standby (except Multi-AZ DB Cluster for PostgreSQL and MySQL, which does support reads).
- Multi-AZ DB Cluster: Available for MySQL and PostgreSQL. Two readable standbys. Faster failover (~35 seconds). Better write performance with local storage on each instance.
- Read Replicas: Asynchronous replication. RDS for MySQL, MariaDB, and PostgreSQL supports up to 15 read replicas; Oracle and SQL Server support up to 5. Replicas can live in different regions for disaster recovery and low-latency reads, and can be promoted to standalone databases.
- Cross-Region Read Replicas: Serve read traffic in remote regions. Can be promoted during disaster recovery. Creates a new independent database when promoted (breaks replication).
- RDS Custom: For Oracle and SQL Server, provides access to the underlying OS and database. Use when you need custom plugins, patches, or configurations not supported by standard RDS.
Amazon Aurora Deep Dive
When to Choose Aurora over RDS
- Need higher performance than standard RDS (5x MySQL, 3x PostgreSQL throughput)
- Require more than 5 read replicas (Aurora supports up to 15)
- Need faster failover (typically under 30 seconds; Aurora promotes a replica automatically and clients reconnect through the cluster endpoint)
- Want auto-scaling read replicas
- Need Aurora-specific features: Global Database, Serverless v2, parallel query
Aurora Architecture
Aurora separates compute from storage. The storage layer replicates data six ways across three Availability Zones, providing durability without needing Multi-AZ compute configuration.
- Storage auto-scaling: Up to 128 TB, grows in 10 GB increments automatically
- Continuous backup to S3: Enables point-in-time recovery. Backtrack (Aurora MySQL-compatible only) additionally rewinds the database in place to an earlier point without restoring from a backup
- Cluster endpoint: Directs writes to the primary instance
- Reader endpoint: Load-balances reads across read replicas
Aurora Global Database
This is a critical SAP-C02 topic. Aurora Global Database enables:
- One primary region for read-write operations
- Up to five secondary regions with read replicas (up to 16 per region)
- Replication lag under 1 second (typically under 200ms) using dedicated infrastructure
- Managed planned failover for controlled region switches (no data loss)
- Unplanned failover (detach and promote) for disaster recovery. RPO measured in seconds. RTO typically under 1 minute after the secondary is promoted.
- Write forwarding: Secondary region readers can forward writes to the primary region (adds latency but simplifies application logic)
Aurora Serverless v2
- Scales compute capacity in fine-grained increments (0.5 ACU steps) from 0.5 to 256 ACUs
- Scales in seconds, not minutes
- Supports read replicas, Global Database, and Multi-AZ
- Use when workload is variable, intermittent, or unpredictable
- More cost-effective than provisioned Aurora for spiky workloads; more expensive for steady, high-throughput workloads
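The breakeven logic behind that last point can be sketched with back-of-envelope arithmetic. The per-unit prices below are hypothetical placeholders chosen to make the comparison readable, not current AWS rates:

```python
# Breakeven sketch: Serverless v2 vs provisioned Aurora.
# PRICES ARE HYPOTHETICAL PLACEHOLDERS -- check current AWS pricing.
ACU_HOUR = 0.12          # assumed $/ACU-hour for Serverless v2
PROVISIONED_HOUR = 1.00  # assumed $/hour for a fixed instance (~8 ACUs worth)

def monthly_cost_serverless(avg_acus, hours=730):
    """Serverless v2 bills on consumed ACUs, so cost tracks the average load."""
    return avg_acus * ACU_HOUR * hours

def monthly_cost_provisioned(hours=730):
    """A provisioned instance bills a flat rate whether busy or idle."""
    return PROVISIONED_HOUR * hours

# A spiky workload averaging 2 ACUs is cheaper serverless;
# a workload pinned near 12 ACUs favors a provisioned instance.
print(monthly_cost_serverless(2.0))    # ~175
print(monthly_cost_serverless(12.0))   # ~1051
print(monthly_cost_provisioned())      # 730.0
```

The exam rarely asks for dollar figures, but it does test this shape of reasoning: average utilization, not peak, determines whether serverless saves money.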
Amazon DynamoDB Deep Dive
When to Choose DynamoDB
- Need consistent single-digit millisecond performance at any scale
- Data model fits key-value or document access patterns
- Application requires active-active multi-region writes
- Want fully serverless with zero operational overhead
- Need a database for serverless architectures (Lambda, API Gateway)
Key SAP-C02 Concepts
- Partition key design: Distribute reads and writes evenly across partitions. Avoid hot partitions. Use high-cardinality attributes as partition keys.
- Sort key: Enables range queries within a partition. Combine with partition key for composite primary key.
- Global Secondary Index (GSI): Defines an alternative partition key and optional sort key for querying. Each GSI has its own provisioned capacity. Project only the attributes you need to minimize storage and write costs.
- Local Secondary Index (LSI): Same partition key as the table but different sort key. Must be created at table creation time. Shares the table’s provisioned capacity.
- On-demand vs Provisioned capacity: On-demand for unpredictable workloads (pay per request). Provisioned for steady workloads (cheaper, supports auto-scaling and reserved capacity).
- DynamoDB Streams: Capture item-level changes. Feed to Lambda for real-time processing. Essential for event-driven architectures and cross-service synchronization.
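The hot-partition advice above can be demonstrated locally. The sketch below models hash partitioning with Python's `hashlib`; DynamoDB's internal hash function is not public, so this illustrates the principle, not the service's actual behavior:

```python
import hashlib
from collections import Counter

def partition_for(key, num_partitions=8):
    """Map an item key to a partition the way a hash-partitioned store
    might (illustrative -- DynamoDB's internal hashing is not exposed)."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % num_partitions

# High-cardinality keys (e.g. per-user IDs) spread load evenly...
spread = Counter(partition_for(f"user-{i}") for i in range(10_000))

# ...while a low-cardinality key (e.g. one tenant ID for every item)
# concentrates all traffic on a single partition -- a hot partition
# that caps throughput regardless of total provisioned capacity.
hot = Counter(partition_for("tenant-1") for _ in range(10_000))

print(sorted(spread.values()))  # roughly equal counts per partition
print(hot)                      # all 10,000 requests hit one partition
```

This is why the exam favors answers like "use a high-cardinality attribute as the partition key" or "add a random suffix to the key" when a scenario describes throttling on one partition.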
DynamoDB Global Tables
- Multi-active (active-active) replication across selected AWS regions
- Last-writer-wins conflict resolution based on timestamps
- Provides local read and write latency in each region
- Requires DynamoDB Streams to be enabled
- The current Global Tables version (2019.11.21) lets you add replica regions to a table that already contains data; the legacy 2017 version required empty tables
- Use for globally distributed applications requiring low-latency read and write access in multiple regions
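Last-writer-wins resolution is simple enough to model in a few lines (an illustrative sketch of the semantics, not the service's implementation):

```python
from dataclasses import dataclass

@dataclass
class Version:
    value: str
    timestamp: float   # write time recorded with the item
    region: str

def last_writer_wins(a: Version, b: Version) -> Version:
    """Resolve a concurrent write the way Global Tables does:
    the version with the later timestamp wins (illustrative sketch)."""
    return a if a.timestamp >= b.timestamp else b

# Two regions write the same item nearly simultaneously.
us = Version("cart=3 items", timestamp=1700000000.10, region="us-east-1")
eu = Version("cart=2 items", timestamp=1700000000.25, region="eu-west-1")

winner = last_writer_wins(us, eu)
print(winner.region, winner.value)  # eu-west-1 cart=2 items
```

Note the implication for exam scenarios: the losing write is silently discarded, so Global Tables suits workloads that tolerate occasional overwrite, not ones needing strict cross-region transactional consistency.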
DynamoDB Accelerator (DAX)
- In-memory cache for DynamoDB with microsecond read latency
- Write-through cache (writes go to DynamoDB, reads served from cache)
- API-compatible with DynamoDB (minimal code changes)
- Deployed within a VPC
- Use for read-heavy workloads with hot key patterns
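The write-through pattern can be illustrated with a minimal in-process model (a teaching sketch of the behavior, not the DAX client API):

```python
class WriteThroughCache:
    """Minimal write-through cache sketch illustrating DAX behavior:
    writes update the backing store and the cache together; repeated
    reads of the same key are served from the cache. Teaching model
    only -- not the actual DAX client."""

    def __init__(self, backing_store):
        self.store = backing_store   # stands in for the DynamoDB table
        self.cache = {}
        self.hits = 0

    def put(self, key, value):
        self.store[key] = value      # write lands in the "table"...
        self.cache[key] = value      # ...and writes through to the cache

    def get(self, key):
        if key in self.cache:
            self.hits += 1           # cache hit: no table read needed
            return self.cache[key]
        value = self.store[key]      # cache miss: fetch and populate
        self.cache[key] = value
        return value

table = {}
dax = WriteThroughCache(table)
dax.put("order#1", {"status": "shipped"})
print(dax.get("order#1"), dax.hits)  # served from cache; hits == 1
```

Because writes pass through to the table before being cached, reads after a write see the new value, which is why DAX suits read-heavy workloads with repeated access to hot keys.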
For a deeper dive into serverless patterns using DynamoDB, see our serverless architecture patterns guide.
ElastiCache Deep Dive
Redis vs Memcached on the SAP-C02
| Feature | ElastiCache for Redis | ElastiCache for Memcached |
|---|---|---|
| Data structures | Strings, lists, sets, sorted sets, hashes, streams | Simple key-value |
| Persistence | Backup and restore, append-only file | None |
| Replication | Cluster mode with read replicas | None (multi-node for sharding only) |
| Multi-AZ | Automatic failover with replicas | No |
| Global Datastore | Cross-region replication | No |
| Use case | Caching, session store, leaderboards, real-time analytics | Simple caching, large cache pools |
SAP-C02 guidance: When a question mentions caching with high availability, persistence, or advanced data structures, choose Redis. When it mentions only simple caching with maximum memory capacity, Memcached may appear as an option.
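The leaderboard use case in the table maps to Redis sorted sets (ZADD/ZREVRANGE). The sketch below models the same semantics with a plain dict so it runs without a Redis server (illustrative only, not the redis-py API):

```python
# Local model of Redis sorted-set leaderboard semantics.
# Illustrative sketch -- a real deployment would call ZADD/ZREVRANGE
# against an ElastiCache for Redis endpoint.
scores: dict[str, int] = {}

def zadd(member: str, score: int) -> None:
    """Insert or update a member's score (like Redis ZADD)."""
    scores[member] = score

def top(n: int) -> list[tuple[str, int]]:
    """Return the n highest-scoring members (like ZREVRANGE WITHSCORES)."""
    return sorted(scores.items(), key=lambda kv: -kv[1])[:n]

zadd("alice", 3200)
zadd("bob", 4100)
zadd("carol", 2900)
zadd("alice", 4500)          # updating a score reorders the ranking

print(top(2))  # [('alice', 4500), ('bob', 4100)]
```

Redis maintains this ordering incrementally in memory, which is why sorted sets give O(log N) updates and fast range reads where a relational `ORDER BY` would rescan or re-sort.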
MemoryDB for Redis
MemoryDB for Redis is a Redis-compatible, durable, in-memory database rather than a cache. Use it when the application treats Redis as its primary database and needs data durability without a separate backing store.
Specialty Database Services
Amazon Neptune
- Graph database supporting Apache TinkerPop Gremlin and W3C SPARQL query languages
- Use for social networks, recommendation engines, fraud detection, and knowledge graphs
- Supports up to 15 read replicas and Multi-AZ deployment
- Neptune Serverless for variable or intermittent graph workloads
SAP-C02 trigger: When a question mentions “relationships between entities,” “graph traversals,” “social network,” or “fraud ring detection,” Neptune is typically the answer.
Amazon DocumentDB
- MongoDB-compatible document database
- Separates storage and compute (similar to Aurora architecture)
- Up to 15 read replicas and 128 TB storage
- Use when migrating MongoDB workloads to AWS with minimal code changes
SAP-C02 trigger: When a question explicitly mentions MongoDB compatibility, choose DocumentDB. If the question mentions JSON documents without MongoDB requirements, DynamoDB is usually more appropriate.
Amazon Keyspaces
- Apache Cassandra-compatible serverless database
- Fully managed, no clusters to provision
- Use for migrating Cassandra workloads to AWS
Amazon Timestream
- Purpose-built for time-series data (IoT sensors, application metrics, DevOps monitoring)
- Automatic tiering between in-memory and magnetic storage
- Built-in time-series analytics functions
- Serverless, scales automatically
SAP-C02 trigger: IoT sensor data, application metrics with timestamps, or fleet telemetry.
Amazon QLDB
- Ledger database with immutable, cryptographically verifiable transaction log
- Uses an append-only journal with SHA-256 hash chain
- Serverless, no capacity management
- Cannot delete or modify history
SAP-C02 trigger: When a question mentions “immutable audit trail,” “regulatory compliance with verifiable history,” or “cryptographic verification of data integrity.”
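The hash-chain idea behind that trigger is easy to demonstrate: each journal entry's hash covers the previous entry's hash, so altering any historical record invalidates every later hash. Below is a toy model (not QLDB's actual journal format):

```python
import hashlib
import json

def append_entry(journal, document):
    """Append a document to an append-only journal, chaining each entry
    to its predecessor with SHA-256 (toy model of a ledger journal)."""
    prev_hash = journal[-1]["hash"] if journal else "0" * 64
    payload = json.dumps(document, sort_keys=True)
    entry_hash = hashlib.sha256((prev_hash + payload).encode()).hexdigest()
    journal.append({"doc": document, "prev": prev_hash, "hash": entry_hash})

def verify(journal):
    """Recompute every hash in order; any tampering breaks the chain."""
    prev = "0" * 64
    for entry in journal:
        payload = json.dumps(entry["doc"], sort_keys=True)
        expected = hashlib.sha256((prev + payload).encode()).hexdigest()
        if entry["prev"] != prev or entry["hash"] != expected:
            return False
        prev = entry["hash"]
    return True

journal = []
append_entry(journal, {"account": "A-1", "balance": 100})
append_entry(journal, {"account": "A-1", "balance": 80})
print(verify(journal))              # True

journal[0]["doc"]["balance"] = 999  # tamper with history
print(verify(journal))              # False -- the chain no longer verifies
```

This is the property DynamoDB Streams lacks: Streams captures changes but cannot cryptographically prove that stored history was never altered.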
Database Migration Scenarios
The SAP-C02 tests migration from on-premises databases to AWS. Key services:
AWS Database Migration Service (DMS)
- Supports homogeneous (e.g., Oracle to Oracle) and heterogeneous (e.g., Oracle to PostgreSQL) migrations
- Continuous data replication (CDC) for near-zero-downtime migrations
- Use with AWS Schema Conversion Tool (SCT) for schema and code conversion in heterogeneous migrations
Migration Decision Table
| Source | Target | Approach |
|---|---|---|
| On-prem MySQL | Aurora MySQL | DMS with continuous replication, then cutover |
| On-prem Oracle | Aurora PostgreSQL | SCT for schema conversion + DMS for data migration |
| On-prem MongoDB | DocumentDB | DMS with MongoDB as source |
| On-prem Cassandra | Amazon Keyspaces | cqlsh COPY or DMS |
| On-prem SQL Server | RDS SQL Server | DMS or native backup/restore to S3 |
| On-prem Redis | ElastiCache Redis | Application-level migration or backup/restore |
SAP-C02 Migration Considerations
- Large databases (multi-TB): Use AWS Snowball Edge for initial data load, then DMS for CDC to catch up changes during migration.
- Minimize downtime: Use DMS continuous replication. Keep source and target in sync until cutover.
- Heterogeneous migrations: Always mention SCT for schema and stored procedure conversion alongside DMS for data movement.
SAP-C02 Database Decision Matrix
Use this quick-reference matrix during exam preparation:
| Requirement | Recommended Service |
|---|---|
| OLTP with SQL and transactions | Aurora or RDS |
| Key-value at massive scale | DynamoDB |
| Active-active multi-region writes | DynamoDB Global Tables |
| Multi-region reads with single-region writes | Aurora Global Database |
| Microsecond read latency | ElastiCache (Redis) or DAX |
| Graph relationships | Neptune |
| MongoDB compatibility | DocumentDB |
| Cassandra compatibility | Keyspaces |
| Time-series IoT data | Timestream |
| Immutable ledger | QLDB |
| Data warehouse analytics | Redshift |
| Full-text search | OpenSearch |
| Session storage with HA | ElastiCache Redis |
| Durable Redis as primary DB | MemoryDB for Redis |
Frequently Asked Questions
How do I choose between Aurora and standard RDS on the SAP-C02?
Choose Aurora when you need higher performance, more read replicas (up to 15), faster failover, storage auto-scaling, or features like Global Database, Serverless v2, or Backtrack. Choose standard RDS when you need Oracle, SQL Server, or MariaDB engines, or when Aurora’s higher per-instance cost is not justified for small workloads.
When should I use DynamoDB Global Tables vs Aurora Global Database?
Use DynamoDB Global Tables when you need active-active multi-region writes with automatic conflict resolution and a key-value/document data model. Use Aurora Global Database when you need relational data model with SQL, cross-region reads with sub-second replication lag, and single-region writes.
What is the difference between DAX and ElastiCache for caching DynamoDB?
DAX is a DynamoDB-specific cache that is API-compatible with DynamoDB, requiring minimal code changes. ElastiCache is a general-purpose cache that requires application-level cache logic. Use DAX when caching DynamoDB reads exclusively. Use ElastiCache when caching data from multiple sources or when you need advanced Redis features.
How does Aurora Serverless v2 differ from standard Aurora?
Aurora Serverless v2 scales compute automatically in 0.5 ACU increments based on workload demand. Standard Aurora requires you to select a specific instance size. Serverless v2 is ideal for variable workloads, development environments, and multi-tenant applications. Standard Aurora is more cost-effective for steady, high-throughput workloads.
When should I recommend QLDB vs DynamoDB with Streams?
Recommend QLDB when the requirement explicitly mentions immutable audit trails, cryptographic verification, or regulatory compliance requiring proof that data has not been tampered with. DynamoDB Streams provides change data capture but does not provide cryptographic hash-chain verification.
How do I handle database migration with minimal downtime on the SAP-C02?
Use AWS DMS with continuous data replication (CDC). Set up the DMS replication instance, create source and target endpoints, and run a full load followed by ongoing replication. Cut over when the replication lag is minimal. For heterogeneous migrations, use SCT first for schema conversion.
What database should I choose for a session store?
ElastiCache for Redis is the standard recommendation for session stores due to sub-millisecond latency, built-in expiration (TTL), and Multi-AZ failover. DynamoDB with TTL is an alternative when you want a fully managed serverless option without managing cache clusters.
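The TTL expiry pattern both options provide can be modeled in-process (an illustrative sketch; Redis EXPIRE and DynamoDB TTL handle expiry server-side for you):

```python
import time

class SessionStore:
    """Tiny in-process model of TTL-based session expiry -- the pattern
    ElastiCache (EXPIRE) and DynamoDB TTL implement on the server side.
    Illustrative only; not a client API for either service."""

    def __init__(self):
        self._items = {}

    def put(self, session_id, data, ttl_seconds):
        # Store the payload alongside its absolute expiry time.
        self._items[session_id] = (data, time.time() + ttl_seconds)

    def get(self, session_id):
        entry = self._items.get(session_id)
        if entry is None:
            return None
        data, expires_at = entry
        if time.time() >= expires_at:   # lazily expire on read
            del self._items[session_id]
            return None
        return data

store = SessionStore()
store.put("sess-42", {"user": "alice"}, ttl_seconds=0.05)
print(store.get("sess-42"))   # {'user': 'alice'}
time.sleep(0.1)
print(store.get("sess-42"))   # None -- expired
```

Note that DynamoDB TTL deletes expired items asynchronously (possibly hours later), so applications should still filter expired sessions on read, much like the lazy check above.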
Can Aurora handle NoSQL workloads?
Aurora is a relational database and is not designed for NoSQL access patterns. If the workload involves key-value lookups at high scale without complex joins, DynamoDB is the appropriate choice. If the workload requires JSON document storage with MongoDB compatibility, use DocumentDB.
Conclusion
Database selection questions are among the most nuanced on the SAP-C02 exam. The key is not memorizing every feature of every service, but understanding the decision framework: identify the data model, define performance requirements, consider operational preferences, and evaluate multi-region needs.
Use the decision matrix in this guide as a study reference and practice applying it to scenario-based questions. For cost implications of different database choices, review our cost optimization strategies guide.
Test your database selection skills with realistic exam scenarios using Sailor.sh’s SAP-C02 mock exams to build confidence before your exam.