About the author: I'm Charles Sieg, a cloud architect and platform engineer who builds apps, services, and infrastructure for Fortune 1000 clients through Vantalect. If your organization is rethinking its software strategy in the age of AI-assisted engineering, let's talk.
DynamoDB sits at the center of more AWS architectures than any other database service. I've used it for everything from mobile backends handling millions of daily active users to event-sourced systems processing tens of thousands of writes per second. Most teams treat it as a simple key-value store, plug it in, and move on. That works until they hit a hot partition at 3 AM, discover their GSI is throttling independently of the base table, or realize their on-demand table costs three times what provisioned capacity would have. After years of running DynamoDB at scale, I've accumulated enough operational scars to fill this reference. Patterns, trade-offs, cost traps, and the internal mechanics that explain why DynamoDB behaves the way it does.
This is an architecture reference for engineers who already know the basics. If you need a getting-started tutorial, AWS has plenty. What follows covers how DynamoDB actually works under the hood, how to design for its strengths, and which failure modes will find you in production.
How DynamoDB Works Internally
Understanding DynamoDB's internals changes how you design for it. AWS has published enough about the architecture (particularly through the 2022 USENIX paper and re:Invent talks) to build a solid mental model.
Request Routing
Every DynamoDB API call hits a request router first. The router authenticates the request, resolves which partition holds the target data by hashing the partition key, and forwards the request to the correct storage node. The router maintains a partition map that tells it which storage node group owns which key range. When partitions split, the router's map updates accordingly.
This architecture means the partition key is the single most important design decision you make. Every read and every write goes through a hash function that determines the physical location. A poorly chosen partition key funnels traffic to a single storage node group, and no amount of provisioned capacity fixes that.
Storage and Replication
Each partition stores data in a B-tree structure on SSD-backed storage. Every partition maintains three replicas spread across three Availability Zones within the region. One replica serves as the leader; the other two are followers. DynamoDB uses a Multi-Paxos consensus protocol to coordinate writes.
The write path works as follows: the leader generates a write-ahead log (WAL) entry, sends it to both followers, and acknowledges the write once two of three replicas (a quorum) persist the log record. This gives you durability across two AZs before the client receives a success response. The leader then applies the write to its B-tree.
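As a toy model (my simplification for intuition, not AWS's actual Multi-Paxos implementation), the quorum acknowledgment rule looks like this:

```python
# Toy model of the 2-of-3 quorum ack described above. This is an
# illustrative simplification, not DynamoDB's real replication code.

QUORUM = 2  # acks required out of 3 replicas

def write_durable(persisted: list[bool]) -> bool:
    """Ack the client once at least 2 of the 3 replicas persisted the WAL record."""
    return sum(persisted) >= QUORUM

# Leader plus one follower persisted: the client sees success even though
# the third replica is still catching up asynchronously.
assert write_durable([True, True, False])

# The leader alone is not enough for a durable ack.
assert not write_durable([True, False, False])
```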
Strongly consistent reads go to the leader replica. Eventually consistent reads can go to any replica, which is why they cost half as much (the load spreads across all three replicas instead of concentrating on the leader).
Control Plane vs. Data Plane
DynamoDB maintains a clean separation between control plane and data plane:
| Component | Responsibility | Operations | Availability Impact |
|---|---|---|---|
| Control Plane | Table management, configuration | CreateTable, UpdateTable, DeleteTable, DescribeTable | Configuration changes only |
| Data Plane | Read/write traffic | GetItem, PutItem, Query, Scan, BatchWrite | Direct application impact |
| Auto Admin | Partition management, splitting, health monitoring | Automatic (no API) | Background operations |
The data plane operates independently of the control plane. Once a table is configured, it keeps serving traffic even if the control plane is degraded. This is the same pattern you see across AWS services (AWS Elastic Load Balancing: An Architecture Deep-Dive covers the same separation for ELB). During a control plane outage, your existing tables keep working; you just cannot create new tables or modify existing ones.
Auto Admin and Partition Management
The auto-admin subsystem handles partition health monitoring, splitting, and rebalancing. It runs continuously, watching for partitions that are approaching size limits (10 GB per partition) or throughput limits. When a partition needs to split, auto-admin selects a split point, creates new partitions, migrates data, and updates the request router's partition map.
This process is transparent, but understanding it explains several DynamoDB behaviors that surprise teams in production. Partition splits take time. During splits, the old partition continues serving traffic while data migrates. If you see brief latency spikes during sustained high-throughput writes, partition splits are often the cause.
Capacity Modes and Throughput
DynamoDB offers two capacity modes. Picking the wrong one is the most common cost mistake I see.
On-Demand Mode
On-demand mode charges per request. No capacity planning required. DynamoDB automatically scales to handle your traffic. The service tracks your recent peak and can instantly handle double that peak. If sustained traffic exceeds the doubled peak, the new level becomes the baseline for future scaling.
| Metric | On-Demand Behavior |
|---|---|
| Scaling | Automatic, based on recent traffic peaks |
| Instant capacity | 2x the previous traffic peak |
| New table default | 4,000 WRU/s and 12,000 RRU/s |
| Throttling risk | Possible if traffic spikes beyond 2x previous peak instantly |
| Pricing (US East, Standard) | $1.25 per million WRU; $0.25 per million RRU |
On-demand tables can still throttle. If your traffic jumps from near-zero to 50,000 writes per second without any ramp-up, DynamoDB has not observed enough traffic history to pre-allocate capacity. The 2x scaling rule requires a previous peak to double. For launch-day scenarios or scheduled batch jobs that spike from zero, pre-warm by gradually ramping traffic or temporarily switching to provisioned mode.
Provisioned Mode
Provisioned mode lets you specify read capacity units (RCUs) and write capacity units (WCUs). You pay for the capacity you provision, whether you use it or not.
| Metric | Provisioned Behavior |
|---|---|
| Scaling | Manual or auto-scaling (CloudWatch-driven) |
| Cost (US East, Standard) | $0.00065/WCU/hour; $0.00013/RCU/hour |
| Reserved capacity | Up to 77% savings with 1-year or 3-year commitments |
| Burst capacity | 300 seconds of unused capacity banked |
| Decrease limits | 4 anytime per day, plus 1 per hour, up to 27 per 24-hour period |
Auto-scaling with provisioned mode is the cost-optimal choice for workloads with predictable patterns. Set a target utilization of 70%, configure reasonable min/max bounds, and let CloudWatch Alarms trigger scaling actions. The catch: auto-scaling reacts to CloudWatch metrics, which have a 1-2 minute delay. If your traffic spikes faster than that, you will see throttling before auto-scaling catches up.
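The auto-scaling setup above uses Application Auto Scaling under the hood. Here is a sketch of the two calls involved, with the table name and capacity bounds as hypothetical placeholders; the kwargs are built as plain dicts so they can be inspected before handing them to boto3:

```python
# Sketch of wiring up DynamoDB auto-scaling at 70% target utilization via
# Application Auto Scaling. Table name and min/max bounds are illustrative.

def scalable_target_kwargs(table: str, min_wcu: int, max_wcu: int) -> dict:
    """Kwargs for application-autoscaling register_scalable_target."""
    return {
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table}",
        "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
        "MinCapacity": min_wcu,
        "MaxCapacity": max_wcu,
    }

def scaling_policy_kwargs(table: str, target_pct: float = 70.0) -> dict:
    """Kwargs for put_scaling_policy with target tracking on WCU utilization."""
    return {
        "PolicyName": f"{table}-wcu-target-tracking",
        "ServiceNamespace": "dynamodb",
        "ResourceId": f"table/{table}",
        "ScalableDimension": "dynamodb:table:WriteCapacityUnits",
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": target_pct,
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "DynamoDBWriteCapacityUtilization"
            },
        },
    }

# With boto3 (not executed here):
# autoscaling = boto3.client("application-autoscaling")
# autoscaling.register_scalable_target(**scalable_target_kwargs("orders", 25, 1000))
# autoscaling.put_scaling_policy(**scaling_policy_kwargs("orders"))
```

Repeat both calls with `ReadCapacityUnits` for the read dimension, and again for each GSI (`index/...` in the `ResourceId`).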
Choosing Between Modes
```mermaid
flowchart TD
    A[New DynamoDB Table] --> B{Traffic pattern predictable?}
    B -->|Yes| C{Steady baseline with known peaks?}
    B -->|No| D[On-Demand Mode]
    C -->|Yes| E[Provisioned + Auto-Scaling]
    C -->|No| F{Budget constrained?}
    F -->|Yes| G[Provisioned + Reserved Capacity]
    F -->|No| D
    E --> H{Cost optimization priority?}
    H -->|Yes| I[Add Reserved Capacity for baseline]
    H -->|No| J[Provisioned with Auto-Scaling only]
```

The rule I follow: start with on-demand for new workloads where you do not know the traffic pattern. After 2-4 weeks of production data, evaluate whether provisioned mode with auto-scaling would cost less. For most steady-state workloads, provisioned mode saves 40-60% compared to on-demand.
Read and Write Unit Mechanics
Understanding how RCUs and WCUs map to actual operations prevents surprise throttling:
| Operation | Unit Cost | Size Unit |
|---|---|---|
| Strongly consistent read | 1 RCU | Per 4 KB |
| Eventually consistent read | 0.5 RCU | Per 4 KB |
| Transactional read | 2 RCU | Per 4 KB |
| Standard write | 1 WCU | Per 1 KB |
| Transactional write | 2 WCU | Per 1 KB |
Items larger than these thresholds consume proportionally more units, rounded up. A 6 KB strongly consistent read costs 2 RCUs. A 3.5 KB write costs 4 WCUs. Keeping items small directly reduces throughput consumption.
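The rounding rules from the table above can be captured in a few lines. This is a minimal calculator for the base cases; it does not cover batch operations or GSI write amplification:

```python
import math

# Capacity-unit arithmetic: reads round up to 4 KB units, writes to 1 KB
# units, with multipliers for consistency mode and transactions.

def rcu_for_read(item_kb: float, consistent: bool = True,
                 transactional: bool = False) -> float:
    units = math.ceil(item_kb / 4)          # 4 KB read unit, rounded up
    if transactional:
        return units * 2                    # transactional reads cost 2x
    return units if consistent else units * 0.5

def wcu_for_write(item_kb: float, transactional: bool = False) -> int:
    units = math.ceil(item_kb)              # 1 KB write unit, rounded up
    return units * 2 if transactional else units

assert rcu_for_read(6) == 2                 # 6 KB strongly consistent read
assert rcu_for_read(6, consistent=False) == 1.0
assert wcu_for_write(3.5) == 4              # 3.5 KB write rounds up to 4 WCUs
```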
Partition Mechanics and Scaling
Partitions are the fundamental scaling unit. Every performance problem I have debugged in DynamoDB traces back to partition behavior.
Partition Throughput Limits
Each partition supports a fixed maximum throughput:
| Resource | Limit per Partition |
|---|---|
| Read throughput | 3,000 RCU |
| Write throughput | 1,000 WCU |
| Storage | 10 GB |
Your table's total throughput is the sum of all partition throughputs. A table with 10 partitions supports up to 30,000 RCUs and 10,000 WCUs in aggregate, but only if traffic distributes evenly across partitions. If 80% of your reads target items in a single partition, that partition's 3,000 RCU limit becomes your effective ceiling regardless of how much capacity you provisioned at the table level.
Split for Heat
When DynamoDB detects a partition receiving sustained high throughput, it automatically splits that partition into two. The split point is chosen based on recent traffic patterns to distribute load evenly between the new partitions. This doubles the available throughput for that key range at no additional cost.
Split for heat cannot help in every scenario:
| Scenario | Split for Heat Effective? | Explanation |
|---|---|---|
| Many items, distributed hot keys | Yes | Split distributes items across new partitions |
| Single hot item | No | The item lives in one partition; splitting does not help |
| LSI present on table | Limited | Cannot split within an item collection |
| Ever-increasing sort key | Limited | New writes always target the latest partition |
The single hot item problem is the most common production issue I encounter. A counter, a leaderboard, a "latest" record: any pattern where the majority of writes target one partition key cannot benefit from split for heat. The solution is application-level sharding: append a random suffix to the partition key and aggregate on read.
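A minimal sketch of that sharding pattern, with the shard count and key format as illustrative choices (tune the shard count to your write rate divided by the 1,000 WCU per-partition ceiling):

```python
import random

# Application-level write sharding for a single hot key: spread writes
# across N suffixed partition keys, then scatter-gather on read.

SHARDS = 10  # illustrative; size to peak writes / 1,000 WCU

def sharded_key(base_pk: str) -> str:
    """Pick one of N shard keys at random for each write."""
    return f"{base_pk}#{random.randrange(SHARDS)}"

def all_shard_keys(base_pk: str) -> list[str]:
    """Every shard key the reader must query when aggregating."""
    return [f"{base_pk}#{i}" for i in range(SHARDS)]

def aggregate(counts_by_shard: dict[str, int]) -> int:
    """Sum the per-shard counts back into one logical counter."""
    return sum(counts_by_shard.values())

assert sharded_key("page-views").startswith("page-views#")
assert len(all_shard_keys("page-views")) == SHARDS
assert aggregate({"page-views#0": 40, "page-views#1": 2}) == 42
```

The cost of this pattern is N reads instead of one on the aggregation path, which is why it only pays off for write-dominated hot keys.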
Adaptive Capacity
Adaptive capacity complements split for heat. It automatically reallocates unused throughput from cold partitions to hot partitions. If partition A uses 200 WCUs of its 1,000 WCU allocation while partition B needs 1,500 WCUs, adaptive capacity shifts some of partition A's unused capacity to partition B.
This happens automatically with no configuration required. Adaptive capacity reduces, but does not eliminate, throttling from uneven access patterns. It works best when the overall table has enough provisioned capacity; it simply redistributes it more effectively.
Partition Key Design
Good partition key design is the single highest-leverage architecture decision for DynamoDB:
| Pattern | Example | Distribution |
|---|---|---|
| High cardinality, uniform access | UserID, OrderID, SessionID | Excellent |
| High cardinality, skewed access | CustomerID (some customers 1000x more active) | Needs write sharding |
| Low cardinality | Status (ACTIVE/INACTIVE), Region (5 values) | Poor; hot partitions guaranteed |
| Time-based | Date, Hour | Poor; all writes target current period |
For skewed access patterns, use write sharding: append a random number (0-N) to the partition key and scatter-gather on reads. For time-series data, combine the timestamp with a high-cardinality attribute (device ID, sensor ID) as the partition key.
Secondary Indexes
DynamoDB provides two types of secondary indexes, and choosing wrong creates problems that are expensive to fix.
Global Secondary Indexes (GSI)
A GSI creates a separate, fully independent partition structure with its own partition key and sort key. DynamoDB asynchronously replicates items from the base table to each GSI.
Key architecture details:
- GSIs have their own throughput capacity, separate from the base table
- GSI writes are eventually consistent (the replication is asynchronous)
- A throttled GSI back-pressures writes to the base table
- Maximum 20 GSIs per table (adjustable)
- GSIs can project a subset of attributes, reducing storage and throughput costs
Local Secondary Indexes (LSI)
An LSI shares the partition key with the base table but uses a different sort key. All items with the same partition key (across the base table and all LSIs) form an item collection.
| Characteristic | GSI | LSI |
|---|---|---|
| Partition key | Any attribute | Same as base table |
| Sort key | Any attribute | Different from base table |
| Throughput | Independent capacity | Shares base table capacity |
| Consistency | Eventually consistent only | Eventually or strongly consistent |
| When to create | Anytime | Table creation only |
| Item collection limit | None | 10 GB per partition key value |
| Maximum per table | 20 | 5 |
The 10 GB item collection limit on LSIs is a hard constraint. If any partition key's total data (base table items plus all LSI entries) exceeds 10 GB, writes for that partition key fail with an ItemCollectionSizeLimitExceededException. I have seen this kill production systems when a high-volume entity (a busy tenant in a multi-tenant system) crosses the threshold with no warning. If you use LSIs, set ReturnItemCollectionMetrics on writes and alert when collection size estimates approach the limit.
Single-Table Design
Single-table design stores multiple entity types in one DynamoDB table, using composite partition keys and sort keys to model relationships. The pattern gained popularity through Rick Houlihan's re:Invent talks and Alex DeBrie's "The DynamoDB Book."
Advantages: all related entities co-located in the same partitions, enabling efficient queries across entity types with a single Query operation. No joins needed.
Drawbacks: complex key design, harder to reason about, GSI overloading can make the table opaque to new team members. With the November 2025 launch of multi-attribute composite keys for GSIs (up to four attributes per key), some of the synthetic key concatenation complexity has been reduced.
My recommendation: use single-table design for access-pattern-heavy workloads where you know all query patterns upfront. Use multi-table design when your access patterns evolve frequently or when different entity types have vastly different throughput characteristics.
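To make the co-location idea concrete, here is a hypothetical single-table key layout for a customer/order model. The key prefixes (`CUST#`, `ORDER#`) are conventions, not anything DynamoDB mandates:

```python
# Hypothetical single-table layout: entity type is encoded in the sort key,
# so one Query on the partition key returns a customer plus all its orders.

def customer_item(customer_id: str, name: str) -> dict:
    return {"PK": f"CUST#{customer_id}", "SK": "PROFILE", "name": name}

def order_item(customer_id: str, order_id: str, total: int) -> dict:
    return {"PK": f"CUST#{customer_id}", "SK": f"ORDER#{order_id}", "total": total}

items = [
    customer_item("42", "Acme"),
    order_item("42", "1001", 250),
    order_item("42", "1002", 975),
]

# Query(KeyConditionExpression="PK = :pk") returns this whole item
# collection; begins_with(SK, "ORDER#") narrows it to orders only.
assert all(i["PK"] == "CUST#42" for i in items)
assert sum(1 for i in items if i["SK"].startswith("ORDER#")) == 2
```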
Global Tables
Global tables replicate a DynamoDB table across multiple AWS regions with sub-second replication latency.
Replication Architecture
Each region maintains a full, independent replica. Writes to any replica propagate to all other replicas asynchronously. DynamoDB uses last-writer-wins conflict resolution based on timestamps for concurrent writes to the same item in different regions.
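Last-writer-wins is easy to state but worth internalizing: the losing write vanishes silently. A toy model of the resolution rule, with the timestamps as illustrative values:

```python
# Toy model of last-writer-wins: when the same item is written concurrently
# in two regions, the version with the later timestamp survives everywhere
# once replication settles. The losing write is discarded without error.

def resolve(versions: list[dict]) -> dict:
    """Pick the winning version by highest write timestamp."""
    return max(versions, key=lambda v: v["ts"])

us_east = {"ts": 1700000000.120, "status": "SHIPPED"}
eu_west = {"ts": 1700000000.480, "status": "CANCELLED"}

# The eu-west write landed later, so it wins; the us-east write is lost.
assert resolve([us_east, eu_west])["status"] == "CANCELLED"
```

If losing a write is unacceptable, that is the signal to reach for MRSC (below) or to partition writes so each item has a single home region.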
Consistency Models
Global tables now support two consistency models:
| Model | Abbreviation | Behavior | Use Case |
|---|---|---|---|
| Multi-Region Eventually Consistent | MREC | Writes replicate asynchronously; brief inconsistency window | Most applications; highest availability |
| Multi-Region Strongly Consistent | MRSC | Reads guaranteed to reflect all prior writes globally | Financial transactions, inventory systems |
MRSC is a significant addition (launched at re:Invent 2024). Previously, global tables only supported eventual consistency, which made them unsuitable for workloads requiring guaranteed read-after-write consistency across regions. MRSC uses a coordination protocol across regions, which increases write latency (cross-region round trip) but guarantees consistency.
Multi-Account Global Tables
As of 2025, DynamoDB supports multi-account global tables. You can replicate table data across different AWS accounts and regions, adding account-level isolation. This is valuable for organizations using separate accounts for production, staging, and disaster recovery, or for regulated industries requiring strict account boundaries.
Global Tables and DAX
A critical operational gotcha: writes that arrive at a replica via global table replication bypass DAX. The DAX cache does not update when a replication write occurs. Your cache will serve stale data until the TTL expires. If you use both global tables and DAX, set aggressive TTLs on DAX and accept that reads may lag behind cross-region writes.
DynamoDB Streams and Change Data Capture
DynamoDB Streams captures a time-ordered sequence of item-level modifications. Every write (put, update, delete) generates a stream record.
Stream Architecture
Stream records are organized into shards (similar to Kinesis shards). Each shard has a parent-child relationship that reflects partition splits. Stream records are available for 24 hours. You configure stream view type at the table level:
| View Type | Contents | Use Case |
|---|---|---|
| KEYS_ONLY | Partition key and sort key only | Triggering downstream by key |
| NEW_IMAGE | Complete item after modification | Replication, search index updates |
| OLD_IMAGE | Complete item before modification | Audit trails, rollback |
| NEW_AND_OLD_IMAGES | Both before and after | Change comparison, CDC pipelines |
Integration Patterns
DynamoDB Streams integrates directly with Lambda for event-driven architectures. Each stream shard can support up to 2 simultaneous readers (1 reader for global tables to avoid throttling). Common patterns:
- Search indexing: Stream changes to OpenSearch (AWS OpenSearch Service: An Architecture Deep-Dive)
- Cross-service replication: Fan out changes via SNS/SQS (AWS Event-Driven Messaging: SNS, SQS, EventBridge, and Beyond)
- Analytics pipeline: Zero-ETL integration with Redshift and SageMaker Lakehouse (launched January 2025)
- Materialized views: Build aggregated views in another table
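A minimal Lambda handler for the stream-to-search-index pattern looks like this. The event shape follows the DynamoDB Streams event source mapping; the routing here just counts records, where a real handler would forward to OpenSearch or SNS:

```python
# Minimal sketch of a Lambda handler on a DynamoDB Streams event source
# (NEW_AND_OLD_IMAGES view). Real handlers forward images to a downstream
# system; this one only classifies records.

def handler(event: dict, context=None) -> dict:
    indexed, deleted = 0, 0
    for record in event["Records"]:
        if record["eventName"] in ("INSERT", "MODIFY"):
            # record["dynamodb"]["NewImage"] holds the post-write item
            indexed += 1
        elif record["eventName"] == "REMOVE":
            # record["dynamodb"]["OldImage"] holds the deleted item
            deleted += 1
    return {"indexed": indexed, "deleted": deleted}

sample_event = {"Records": [
    {"eventName": "INSERT", "dynamodb": {"NewImage": {"PK": {"S": "CUST#1"}}}},
    {"eventName": "REMOVE", "dynamodb": {"OldImage": {"PK": {"S": "CUST#2"}}}},
]}
assert handler(sample_event) == {"indexed": 1, "deleted": 1}
```

Remember that a failing batch blocks the shard: set a reasonable `MaximumRetryAttempts` and a dead-letter destination on the event source mapping, or one poison record stalls the whole pipeline.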
Kinesis Data Streams Integration
As an alternative to DynamoDB Streams, you can route change data capture records to a Kinesis Data Stream. This gives you longer retention (up to 365 days vs. 24 hours), more consumers per shard, and integration with the broader Kinesis ecosystem. The trade-off is additional cost for the Kinesis stream.
DynamoDB Accelerator (DAX)
DAX is a fully managed, in-memory cache that sits in front of DynamoDB. It provides microsecond read latency for cached items.
DAX Architecture
A DAX cluster runs within your VPC with one primary node and up to 10 read replica nodes. The primary handles writes; read replicas serve read traffic. DAX maintains two caches:
| Cache | Stores | Populated By | TTL Default |
|---|---|---|---|
| Item cache | Individual items by primary key | GetItem, BatchGetItem | 5 minutes |
| Query cache | Full result sets | Query, Scan | 5 minutes |
DAX is a write-through cache: writes go through DAX to DynamoDB, and the item cache updates immediately. The query cache does not invalidate on writes; it relies purely on TTL expiration.
When to Use DAX (and When Not To)
| Scenario | DAX Recommended? | Reason |
|---|---|---|
| Read-heavy, repeated key access | Yes | Microsecond latency, reduced RCU consumption |
| Write-heavy workloads | No | DAX adds latency to writes; minimal benefit |
| Strongly consistent reads required | No | DAX serves eventually consistent data only |
| Infrequent, unique key access | No | Cache miss rate too high; adds latency and cost |
| Global tables | Use with caution | Replication writes bypass DAX; stale cache risk |
DAX Pricing
DAX instance pricing varies by node type:
| Instance Type | vCPUs | Memory | Cost/Hour (US East) |
|---|---|---|---|
| dax.t3.small | 2 | 2 GB | ~$0.04 |
| dax.r5.large | 2 | 16 GB | ~$0.29 |
| dax.r5.xlarge | 4 | 32 GB | ~$0.58 |
| dax.r5.8xlarge | 32 | 256 GB | ~$4.64 |
A production DAX cluster (3 nodes across AZs using r5.large) costs approximately $630/month. Compare that against the RCU cost it replaces to determine ROI.
Pricing and Cost Optimization
DynamoDB pricing catches teams off guard because the cost model is fundamentally different from traditional databases. You pay for throughput, storage, and features independently.
Cost Breakdown
| Component | On-Demand (US East) | Provisioned (US East) |
|---|---|---|
| Writes | $1.25/million WRU | $0.00065/WCU/hour (~$0.47/WCU/month) |
| Reads | $0.25/million RRU | $0.00013/RCU/hour (~$0.09/RCU/month) |
| Storage (Standard) | $0.25/GB/month | $0.25/GB/month |
| Storage (Standard-IA) | $0.10/GB/month | $0.10/GB/month |
| Backups (warm) | $0.10/GB/month | $0.10/GB/month |
| Backups (cold) | $0.03/GB/month | $0.03/GB/month |
| Streams reads | $0.02/100K read requests | $0.02/100K read requests |
| Global table rWRUs | $1.875/million rWRU | N/A (billed as replicated WCUs, rWCUs) |
Cost Optimization Strategies
1. Right-size capacity mode. On-demand costs 5-7x more per unit than provisioned capacity for steady workloads. Run on-demand for the first month to establish baselines, then switch to provisioned with auto-scaling.
2. Use reserved capacity. For predictable baseline throughput, reserved capacity (1-year or 3-year term) saves up to 77% compared to on-demand pricing.
3. Database Savings Plans. Launched December 2025, these plans offer committed-use discounts across DynamoDB (including on-demand tables), RDS, and other managed databases. Unlike reserved capacity, they apply automatically across accounts and regions.
4. Standard-IA table class. For tables with infrequent access (archival data, configuration stores), Standard-IA cuts storage costs by 60% ($0.10/GB vs. $0.25/GB). Read and write unit costs are approximately 25% higher, so this only saves money when storage dominates your bill.
5. Minimize item size. Compress large attribute values. Store large blobs in S3 and keep only a reference in DynamoDB. Every KB matters because it directly multiplies your RCU/WCU consumption.
6. Project only needed attributes in GSIs. Full attribute projection on GSIs doubles storage and write costs. Use KEYS_ONLY or INCLUDE with specific attributes.
7. Use eventually consistent reads. If your application tolerates it, eventually consistent reads cost half of strongly consistent reads. For many use cases (product catalogs, user profiles displayed on dashboards), eventual consistency is perfectly acceptable.
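A back-of-the-envelope comparison makes strategy #1 concrete. This uses the US East write prices quoted above and a hypothetical steady 500 writes/sec workload:

```python
# On-demand vs provisioned write cost for a steady workload, using the
# US East prices from the table above. Workload numbers are illustrative.

HOURS_PER_MONTH = 730

def on_demand_write_cost(writes_per_sec: float) -> float:
    monthly_writes = writes_per_sec * 3600 * HOURS_PER_MONTH
    return monthly_writes / 1_000_000 * 1.25      # $1.25 per million WRUs

def provisioned_write_cost(wcu: float) -> float:
    return wcu * 0.00065 * HOURS_PER_MONTH        # $0.00065 per WCU-hour

# Steady 500 writes/sec of 1 KB items. Provisioned at a 70% auto-scaling
# target needs ~715 WCUs; on-demand bills every write at the per-request rate.
od = on_demand_write_cost(500)        # ~$1,640/month
prov = provisioned_write_cost(500 / 0.7)  # ~$340/month
assert prov < od / 3
```

The gap narrows as traffic gets spikier, because provisioned capacity sized for the peak sits idle off-peak; that is the evaluation the 2-4 week baseline period is for.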
Failure Modes and Operational Lessons
Throttling
Throttling is the most common operational issue. DynamoDB returns ProvisionedThroughputExceededException when a partition exceeds its throughput limit. The AWS SDKs implement exponential backoff with jitter by default, but sustained throttling degrades application performance.
Root causes I see most frequently:
| Cause | Symptom | Fix |
|---|---|---|
| Hot partition key | Throttling despite low table-level utilization | Redesign partition key; add write sharding |
| GSI throttling | Base table writes rejected | Increase GSI capacity; review GSI key design |
| On-demand cold start | Throttling on new or idle table receiving sudden traffic | Pre-warm with gradual ramp; consider provisioned mode |
| Under-provisioned auto-scaling | Brief throttling during rapid scale-up | Lower target utilization; increase minimum capacity |
| Scan operations | Broad throttling across partitions | Replace Scans with Queries; use parallel scan with rate limiting |
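The SDKs' backoff is typically "full jitter": sleep a random duration between zero and an exponentially growing cap. A sketch, with base and cap values as illustrative defaults rather than any SDK's exact numbers:

```python
import random

# Full-jitter exponential backoff, the shape the AWS SDKs use when retrying
# ProvisionedThroughputExceededException. Base and cap are illustrative.

def backoff_delay(attempt: int, base: float = 0.05, cap: float = 20.0) -> float:
    """Sleep between 0 and min(cap, base * 2^attempt) seconds."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))

delays = [backoff_delay(a) for a in range(10)]
assert all(0 <= d <= 20.0 for d in delays)
assert backoff_delay(0) <= 0.05   # first retry is nearly immediate
```

The jitter matters as much as the exponent: without it, every throttled client retries in lockstep and re-creates the traffic spike that caused the throttling.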
The October 2025 Outage
On October 19-20, 2025, a race condition in an internal DynamoDB microservice that manages DNS records for regional cells caused a roughly 3-hour DynamoDB outage in US-EAST-1. The failure cascaded: because EC2 instance creation depends on DynamoDB for metadata, EC2 could not launch new instances for an additional 12 hours. Users filed over 17 million outage reports across services including Snapchat, Roblox, Reddit, and Venmo.
Lessons from that incident:
- Multi-region is real DR. Single-region DynamoDB (even with multi-AZ replication) does not protect against regional failures. Global tables provide genuine regional independence.
- Understand cascading dependencies. Your application depends on DynamoDB. Other AWS services also depend on DynamoDB internally. A DynamoDB outage affects services you did not realize were coupled.
- Pre-provision, do not scale-on-demand during recovery. After the DynamoDB outage resolved, EC2 could not scale because instance creation was still impaired. If your recovery plan involves launching new compute, and the outage affects compute provisioning, your plan fails.
Item Size Limits
The 400 KB item size limit is hard. DynamoDB rejects any write that would create an item larger than 400 KB. This includes all attribute names and values. Long attribute names consume your item size budget. Use short, abbreviated attribute names for high-volume tables (store the human-readable mapping in your application code).
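A rough size estimator shows why attribute names matter. This is a simplification of the real sizing rules (numbers, sets, and nested types have their own encodings), but it captures the name-length effect:

```python
# Rough item-size estimate for string attributes: roughly the UTF-8 length
# of every attribute name plus its value. Simplified; real DynamoDB sizing
# has per-type encodings for numbers, sets, and nested documents.

LIMIT = 400 * 1024  # the hard 400 KB item size limit

def approx_size(item: dict[str, str]) -> int:
    return sum(len(k.encode()) + len(v.encode()) for k, v in item.items())

verbose = {"customerShippingAddressLine1": "221B Baker Street"}
terse = {"a1": "221B Baker Street"}  # mapping kept in application code

# Same value; the long name alone costs 26 extra bytes on every item.
assert approx_size(verbose) - approx_size(terse) == 26
assert approx_size(terse) < LIMIT
```

Those bytes also count toward RCU/WCU rounding on every read and write, so the savings compound at high throughput.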
Transaction Limits
DynamoDB transactions support up to 100 items per transaction, with a total transaction size limit of 4 MB. Transactions cost 2x the standard read/write units. Design your data model to minimize transaction scope; if you regularly need to update more than 100 items atomically, DynamoDB may not be the right fit for that specific operation.
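A typical transaction is a conditional debit/credit pair. Here is a sketch of the TransactWriteItems request shape, with table and attribute names as hypothetical placeholders, built as plain kwargs so the structure can be inspected without calling AWS:

```python
# Sketch of a two-item TransactWriteItems request (debit one account,
# credit another). Table and attribute names are hypothetical.

def transfer_kwargs(table: str, src: str, dst: str, amount: int) -> dict:
    return {
        "TransactItems": [
            {"Update": {
                "TableName": table,
                "Key": {"PK": {"S": f"ACCT#{src}"}},
                "UpdateExpression": "SET balance = balance - :a",
                # The condition makes overdrafts fail the whole transaction.
                "ConditionExpression": "balance >= :a",
                "ExpressionAttributeValues": {":a": {"N": str(amount)}},
            }},
            {"Update": {
                "TableName": table,
                "Key": {"PK": {"S": f"ACCT#{dst}"}},
                "UpdateExpression": "SET balance = balance + :a",
                "ExpressionAttributeValues": {":a": {"N": str(amount)}},
            }},
        ]
    }

kwargs = transfer_kwargs("accounts", "alice", "bob", 100)
assert len(kwargs["TransactItems"]) == 2   # well under the 100-item limit

# With boto3 (not executed here):
# boto3.client("dynamodb").transact_write_items(**kwargs)
```

Note that both updates are billed at the 2x transactional WCU rate, and a failed condition on either item rolls back both.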
Service Quotas Quick Reference
| Quota | Default | Adjustable |
|---|---|---|
| Tables per account per region | 2,500 | Yes (max 10,000) |
| Item size | 400 KB | No |
| Partition key length | 2,048 bytes | No |
| Sort key length | 1,024 bytes | No |
| GSIs per table | 20 | Yes |
| LSIs per table | 5 | No |
| LSI item collection size | 10 GB | No |
| Partition throughput (read) | 3,000 RCU | No |
| Partition throughput (write) | 1,000 WCU | No |
| Partition storage | 10 GB | No |
| Account read throughput per region | 80,000 RCU | Yes |
| Account write throughput per region | 80,000 WCU | Yes |
| Table throughput (on-demand) | 40,000 RRU/WRU | Yes |
| Projected attributes across all indexes | 100 | No |
| Concurrent backups | 50 | Yes |
| Stream readers per shard | 2 | No |
| Transaction items | 100 | No |
| Transaction size | 4 MB | No |
DynamoDB vs. Alternatives
Choosing DynamoDB requires understanding where it fits and where it does not.
```mermaid
flowchart TD
    A[Data Storage Decision] --> B{Need relational queries, joins, complex SQL?}
    B -->|Yes| C[Aurora / RDS]
    B -->|No| D{Access patterns known and stable?}
    D -->|Yes| E{Need single-digit ms latency at any scale?}
    D -->|No| F{Need flexible queries on document data?}
    E -->|Yes| G[DynamoDB]
    E -->|No| H{Need full-text search or analytics?}
    F -->|Yes| I[DocumentDB / MongoDB]
    F -->|No| G
    H -->|Yes| J[OpenSearch]
    H -->|No| G
```

| Criteria | DynamoDB | Aurora (MySQL/PostgreSQL) | Cassandra (Self-Managed) |
|---|---|---|---|
| Operational burden | Zero (fully managed) | Low (managed, some tuning) | High (cluster ops, compaction, repairs) |
| Scaling model | Automatic partitioning | Vertical + read replicas | Horizontal (add nodes) |
| Consistency | Eventual or strong (per-read) | Strong (ACID) | Tunable per query |
| Query flexibility | Partition key + sort key + GSIs | Full SQL | CQL (SQL-like, no joins) |
| Cost at scale | High for write-heavy workloads | Moderate (compute-based) | Low (commodity hardware) |
| Multi-region | Global tables (managed) | Aurora Global Database | Built-in (manual config) |
| Vendor lock-in | High (proprietary API) | Moderate (standard SQL) | None (open source) |
DynamoDB excels when you need a fully managed, infinitely scalable database with predictable single-digit millisecond latency and you can design your access patterns around partition keys. It struggles when you need ad-hoc queries, complex aggregations, or when write volumes make the per-request cost model prohibitive.
Key Patterns
After years of building on DynamoDB, these are the patterns that consistently matter:
Design for partitions first. Every performance characteristic flows from partition key design. Invest time upfront modeling your access patterns and validating that your partition key distributes traffic evenly. Fixing a partition key in production means migrating data.
Start on-demand, migrate to provisioned. On-demand removes the risk of under-provisioning during early development and launch. After you have production traffic data, provisioned mode with auto-scaling and reserved capacity typically saves 50-70%.
Monitor GSI throttling independently. GSI throughput is separate from base table throughput. A throttled GSI throttles your base table writes. Set CloudWatch alarms on WriteThrottleEvents for every GSI.
Use eventually consistent reads by default. Strongly consistent reads cost twice as much and concentrate load on the partition leader. Only use strong consistency when your application genuinely requires read-after-write guarantees.
Keep items small. Every additional KB in item size multiplies your RCU and WCU consumption. Use short attribute names for high-throughput tables. Store large values in S3.
Plan for the 400 KB limit. Applications that store growing lists or embedded arrays in a single item will hit the 400 KB wall. Design your data model with item growth in mind; use a one-to-many pattern with separate items instead of unbounded lists within a single item.
Test your DR story with global tables. The October 2025 US-EAST-1 outage proved that single-region deployments have a regional blast radius. If DynamoDB availability matters to your business, deploy global tables and validate that your application actually fails over correctly.
Additional Resources
- Amazon DynamoDB Developer Guide
- Best Practices for Designing and Architecting with DynamoDB
- DynamoDB Pricing
- Scaling DynamoDB: How Partitions, Hot Keys, and Split for Heat Impact Performance
- The DynamoDB Book by Alex DeBrie
- Amazon DynamoDB re:Invent 2024 Recap
- DynamoDB Service Quotas
- Optimizing Costs on DynamoDB Tables
- Reliability Lessons from the 2025 AWS DynamoDB Outage
Let's Build Something!
I help teams ship cloud infrastructure that actually works at scale. Whether you're modernizing a legacy platform, designing a multi-region architecture from scratch, or figuring out how AI fits into your engineering workflow, I've seen your problem before. Let me help.
Currently taking on select consulting engagements through Vantalect.

