Can DynamoDB Replace Your SQL Database?

The pitch for DynamoDB is compelling: single-digit millisecond latency at any scale, no servers to manage, pay-per-request pricing that scales to zero. For serverless architectures on AWS, it's the natural database choice.

But "can DynamoDB handle my workload" is the wrong question. DynamoDB can handle almost any workload in terms of throughput and latency. The real question is whether your access patterns fit a key-value model, or whether you'd be fighting the database on every query.

The Core Difference

SQL databases let you query data however you want. You define the schema, load the data, and figure out the queries later. If a new report needs a join across three tables with a filter on a date range, you write the SQL and add an index if it's slow.

DynamoDB inverts this. You design the schema around your queries. Every access pattern needs to be accounted for in your key design upfront. If you need a query you didn't plan for, your options are adding a Global Secondary Index (you get 20 per table), doing a full table scan (expensive and slow), or restructuring your data.

This isn't a limitation in the traditional sense. It's a tradeoff. You give up query flexibility in exchange for predictable performance at any scale. The question is whether that tradeoff works for your application.

When DynamoDB Fits Well

Your access patterns are predictable

If you can enumerate the queries your application needs, DynamoDB is likely a good fit. Most transactional applications fall into this category:

Get user by ID
Get all orders for a customer
Get all items in an order
Get recent orders by status

These are all key-based lookups or queries on a known partition with a sort key condition. DynamoDB handles these in single-digit milliseconds regardless of table size.

Your data has natural hierarchies

This is where DynamoDB's sort key model starts to shine. In SQL, a customer with orders and order items requires three tables and two joins:

SELECT c.name, o.order_date, oi.product_name, oi.quantity
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
WHERE c.customer_id = '12345'

In DynamoDB, you can model this as a single partition with sort key prefixes:

PK	SK	Data
CUSTOMER#12345	PROFILE	name, email, ...
CUSTOMER#12345	ORDER#2024-001	order_date, status, ...
CUSTOMER#12345	ORDER#2024-001#ITEM#1	product, quantity, ...
CUSTOMER#12345	ORDER#2024-001#ITEM#2	product, quantity, ...
CUSTOMER#12345	ORDER#2024-002	order_date, status, ...

One query with pk = CUSTOMER#12345 returns the customer, all their orders, and all order items. No joins. The sort key hierarchy gives you the same result as the three-table SQL query, but in a single read operation.

You can also narrow it down. pk = CUSTOMER#12345 AND sk BEGINS_WITH ORDER#2024-001 returns just one order and its items. The sort key acts as both a type discriminator and a filter.

Your reads vastly outnumber your writes

DynamoDB's model pre-computes query results at write time. When you store an order item under the customer's partition key, you're essentially pre-joining the data. Reads are fast because the work happened during the write.

This is the right tradeoff when your application reads data far more often than it writes it. Most user-facing applications fit this pattern.

You need consistent performance at scale

SQL databases slow down as data grows. Queries that were fast with 10,000 rows become slow with 10 million rows unless you carefully manage indexes, query plans, and connection pools. DynamoDB's performance is the same whether your table has 1,000 items or 1 billion items, because every query is a key lookup or a range scan within a single partition.

How Sort Keys Replace Joins

The mental model shift that trips people up: in SQL, related data lives in separate tables and you join them at query time. In DynamoDB, related data lives together (same partition key) and you separate it using sort key prefixes.

Think of the partition key as "which group of related data" and the sort key as "which specific item within that group." A query always targets one partition and can filter on the sort key using begins_with, between, >, <, or exact match.

A few patterns that replace common SQL joins:

One-to-many: Parent and children share a partition key. The parent has a sort key like META or PROFILE, children have sort keys with a prefix like ORDER# or ITEM#.

Many-to-many: Use a GSI (Global Secondary Index) to invert the relationship. If users belong to groups, store items with pk = USER#123, sk = GROUP#456 and create a GSI with pk = GROUP#456, sk = USER#123. Query the table for "which groups does this user belong to" and query the GSI for "which users are in this group."

Filtering by status or date: Overload a GSI sort key with the field you need to filter on. If you need "all orders with status PENDING sorted by date," create a GSI where the partition key is TENANT#abc and the sort key is PENDING#2024-03-15. Query with sk BEGINS_WITH PENDING# to get all pending orders in date order.

None of these require a join operation. The data is pre-organized for the query you need.

When DynamoDB Does Not Fit

You need full-text search or partial string matching

DynamoDB's begins_with operator works on sort keys, but that's it. You can find all customers whose last name starts with "Smi" but you cannot find all customers whose email contains "gmail" without scanning the entire table.

If your application needs search (product search, document search, log querying), you need a search engine. OpenSearch (managed Elasticsearch) is the typical companion to DynamoDB for this. Store your transactional data in DynamoDB, replicate it to OpenSearch via DynamoDB Streams, and run search queries against OpenSearch.

Your access patterns are unpredictable

If your application needs ad-hoc queries that you can't enumerate upfront, DynamoDB will fight you. Business intelligence tools, data exploration interfaces, and admin dashboards where users can filter and sort on any combination of fields are poor fits.

This is the data warehousing problem. If someone needs to ask "show me all orders over $500 from customers in Minnesota who signed up in Q3 and haven't ordered in 90 days," that's a SQL query. Modeling that in DynamoDB would require a GSI for every possible filter combination, which isn't practical.

You need many different indexes on the same data

DynamoDB gives you 20 GSIs per table. That sounds like a lot until you realize that each distinct query pattern that doesn't align with your primary key or an existing GSI needs its own index. If your data needs to be queried 25 different ways, you'll run out of indexes.

More importantly, every GSI is a full copy of the projected attributes. If you have 20 GSIs projecting all attributes, you're storing your data 21 times and paying for 21 writes on every update. The cost model breaks down quickly.

You need complex transactions across unrelated data

DynamoDB supports transactions (up to 100 items per transaction), but they cost 2x a normal write and are limited to 100 items across tables in the same region. There's no stored procedure equivalent, no multi-statement transaction with conditional branching, and no savepoints.

If your application needs to atomically update many unrelated entities with complex rollback logic, or if transactions routinely involve more than a handful of items, a relational database with full ACID transactions is simpler and more natural.

You need aggregations at query time

DynamoDB has no SUM, COUNT, AVG, or GROUP BY. If you need "total revenue by product category for the last quarter," you either pre-compute that aggregation at write time (maintaining running totals) or you scan and compute it in application code.

Pre-computing works well for known aggregations. It breaks down when the aggregation requirements change frequently or when you need aggregations across dimensions you didn't anticipate.

The Decision Framework

Ask these questions about your workload:

Can you list your access patterns? If yes, DynamoDB is viable. If your queries are ad-hoc or user-defined, stick with SQL.
Does your data have natural parent-child relationships? If yes, sort key hierarchies will serve you well. If your relationships are many-to-many across many dimensions, it gets complex fast.
Do you need text search beyond prefix matching? If yes, you'll need OpenSearch alongside DynamoDB, not instead of it.
How many distinct query patterns do you have? Under 20 (including GSIs), you're fine. Over 20, you'll hit the index limit.
Do you need real-time aggregations on arbitrary dimensions? If yes, DynamoDB alone won't do it. You'll need a separate analytics layer.
Is predictable latency at scale more important than query flexibility? DynamoDB optimizes for the former. SQL optimizes for the latter.

The Hybrid Reality

In practice, most production systems aren't purely one or the other. DynamoDB handles the transactional workload (user requests, order processing, real-time operations) while other systems handle the analytical workload (reporting, search, ad-hoc queries).

A common architecture:

DynamoDB for transactional data (fast reads and writes, predictable access patterns)
DynamoDB Streams to replicate changes in real time
OpenSearch for full-text search and complex filtering
S3 + Athena or a data warehouse for analytics and reporting

The question isn't "DynamoDB or SQL" for your entire system. It's "which parts of your system benefit from DynamoDB's strengths, and which parts need something else."

Starting the Migration

If your workload fits, the migration path is:

Map every query your application makes against the SQL database
Group related queries by the entity they center on (customer, order, product)
Design partition keys around those entities
Design sort keys to support the specific queries within each partition
Identify queries that don't fit the primary key and design GSIs for them
If you run out of GSIs or can't model a query, that query needs a different system

The upfront design work is real. But the payoff is a database that doesn't slow down as you scale, costs nothing when idle, and requires zero operational maintenance. For serverless applications on AWS, that tradeoff is usually worth making.