
AWS S3 Cost Optimization: The Complete Savings Playbook

Tags: AWS, Cost Optimization, S3

About the author: I'm Charles Sieg, a cloud architect and platform engineer who builds apps, services, and infrastructure for Fortune 1000 clients through Vantalect. If your organization is rethinking its software strategy in the age of AI-assisted engineering, let's talk.

S3 is the most used service on AWS and, for many organizations, the single largest line item on the bill after compute. The insidious thing about S3 costs is that they creep. Nobody notices when a bucket grows from 10 TB to 50 TB over six months because the data is "just sitting there." Then the bill arrives and the storage line has tripled. I have audited AWS accounts where S3 spending dropped 60-70% after a week of lifecycle policies, storage class changes, and cleaning up forgotten multipart uploads. The savings were always there. Nobody had looked.

This is the first in a series on AWS cost optimization. I'm starting with S3 because it is the service where the gap between what teams pay and what they should pay is consistently the widest. What follows covers every lever available for reducing S3 spend, with specific pricing numbers, break-even calculations, and the operational gotchas that bite teams who optimize too aggressively.

Where S3 Money Actually Goes

Before optimizing anything, you need to understand what S3 actually charges for. Most engineers think of S3 as "storage costs." Storage is usually less than half the bill.

The Three Cost Pillars

S3 pricing has three independent dimensions, and ignoring any one of them leaves money on the table:

| Cost Dimension | What It Covers | Typical Share of Bill |
|---|---|---|
| Storage | GB stored per month, varies by storage class | 30-50% |
| Requests | PUT, GET, LIST, HEAD, DELETE API calls | 20-40% |
| Data transfer | Egress to internet, cross-region, cross-AZ | 15-30% |

The exact split depends on your workload. A data lake with infrequent reads is storage-dominated. A CDN origin bucket serving millions of requests is request-dominated. A cross-region replication setup is transfer-dominated. Know your split before choosing optimization strategies.

How to Find Your Split

S3 Storage Lens gives you the full picture. Enable the default dashboard (free tier covers 28 usage metrics) and look at the cost breakdown by bucket. For request-level detail, enable S3 server access logging or use CloudTrail data events, then analyze with Athena. I run a monthly query against CloudTrail logs that breaks down API call volume by bucket, operation type, and source. The results consistently surprise teams who assumed their costs were all storage.
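Before pulling real numbers, you can sanity-check a split with the list prices quoted throughout this article. A minimal sketch (the function name and example workload are mine; rates are US East S3 Standard):

```python
# Estimate the S3 bill split across the three cost pillars.
# Rates are US East (N. Virginia) list prices for S3 Standard.
STORAGE_PER_GB = 0.023   # $/GB-month
PUT_PER_1K = 0.005       # $ per 1,000 PUT/LIST/POST requests
GET_PER_1K = 0.0004      # $ per 1,000 GET requests
EGRESS_PER_GB = 0.09     # $/GB, first 10 TB/month to the internet

def s3_cost_split(storage_gb, puts, gets, egress_gb):
    """Return {pillar: (dollars, share_of_bill)} for one month."""
    storage = storage_gb * STORAGE_PER_GB
    requests = puts / 1000 * PUT_PER_1K + gets / 1000 * GET_PER_1K
    transfer = egress_gb * EGRESS_PER_GB
    total = storage + requests + transfer
    return {name: (cost, cost / total) for name, cost in
            [("storage", storage), ("requests", requests), ("transfer", transfer)]}

# A 10 TB bucket with 5M PUTs, 50M GETs, and 2 TB of egress:
split = s3_cost_split(10_240, 5_000_000, 50_000_000, 2_048)
```

For this workload, storage is only about half the bill; the rest is requests and egress, which is exactly the blind spot the Athena analysis exposes.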

Storage Class Selection

S3 offers seven storage classes. Choosing the wrong one for your access pattern is the most common source of overspending.

The Storage Class Spectrum

| Storage Class | Storage $/GB/mo | PUT Cost (per 1K) | GET Cost (per 1K) | Retrieval Fee | Min Duration | Min Object Size |
|---|---|---|---|---|---|---|
| S3 Standard | $0.023 | $0.005 | $0.0004 | None | None | None |
| S3 Intelligent-Tiering | $0.023 (frequent) | $0.005 | $0.0004 | None | None | None* |
| S3 Standard-IA | $0.0125 | $0.01 | $0.001 | $0.01/GB | 30 days | 128 KB |
| S3 One Zone-IA | $0.01 | $0.01 | $0.001 | $0.01/GB | 30 days | 128 KB |
| Glacier Instant Retrieval | $0.004 | $0.02 | $0.01 | $0.03/GB | 90 days | 128 KB |
| Glacier Flexible Retrieval | $0.0036 | $0.03 | $0.0004 | $0.01-0.03/GB | 90 days | None |
| Glacier Deep Archive | $0.00099 | $0.05 | $0.0004 | $0.02/GB | 180 days | None |

*Intelligent-Tiering auto-tiers only objects larger than 128 KB. Smaller objects are never monitored (and incur no monitoring fee); they are always billed at the Frequent Access rate.

The pricing gap between Standard and Deep Archive is 23x. A 100 TB dataset sitting in S3 Standard costs $2,300/month. The same data in Deep Archive costs $99/month. That $2,201/month difference is $26,412/year. For data you access once a quarter at most, there is no reason to keep it in Standard.

Intelligent-Tiering: My Default Recommendation

For any bucket where access patterns are unpredictable or mixed, S3 Intelligent-Tiering is the right default. It automatically moves objects between tiers based on access frequency:

| Tier | Trigger | Storage Rate | Savings vs. Standard |
|---|---|---|---|
| Frequent Access | Default (or accessed recently) | $0.023/GB | 0% |
| Infrequent Access | 30 days without access | $0.0125/GB | 46% |
| Archive Instant Access | 90 days without access | $0.004/GB | 83% |
| Archive Access (opt-in) | 90 days without access | $0.0036/GB | 84% |
| Deep Archive Access (opt-in) | 180 days without access | $0.00099/GB | 96% |

The monitoring fee is $0.0025 per 1,000 objects per month, i.e. $0.0000025 per object. Against the Infrequent Access tier's $0.0105/GB savings, that fee breaks even at an object size of roughly 250 KB; once an object ages into Archive Instant Access ($0.019/GB savings), break-even drops to about 138 KB. The math only breaks down for buckets whose monitored objects (those over 128 KB) are all accessed frequently: they pay the monitoring fee and get no tiering benefit.
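Working that fee math out explicitly, under the rates in the tables above (the helper name is mine; 1 GB taken as 1,048,576 KB):

```python
# Break-even object size at which Intelligent-Tiering's monitoring fee
# is covered by a tier's storage savings versus S3 Standard.
MONITORING_FEE = 0.0025 / 1000   # $ per monitored object per month
STANDARD = 0.023                 # S3 Standard, $/GB-month
KB_PER_GB = 1024 * 1024

def break_even_kb(tier_rate):
    """Smallest object (in KB) whose monthly tier savings cover the fee."""
    return MONITORING_FEE / (STANDARD - tier_rate) * KB_PER_GB

ia_kb = break_even_kb(0.0125)    # Infrequent Access tier  -> ~250 KB
aia_kb = break_even_kb(0.004)    # Archive Instant Access  -> ~138 KB
```

In other words: an object stuck in the Infrequent Access tier needs to be about 250 KB before the monitoring fee is a clear win, but once it reaches Archive Instant Access the threshold drops close to the 128 KB monitoring cutoff itself.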

I now set Intelligent-Tiering as the default storage class on every new bucket unless I have a specific reason not to. The opt-in Archive Access and Deep Archive tiers add even more savings for cold data without requiring lifecycle policies.

When to Use Specific Classes Instead

| Scenario | Recommended Class | Why |
|---|---|---|
| Hot data, frequent reads | S3 Standard | No retrieval fees, lowest request costs |
| Predictably cold after N days | Standard-IA or Glacier via lifecycle | Lifecycle rules are cheaper than IT monitoring for known patterns |
| Reproducible data (can regenerate) | One Zone-IA | 20% cheaper; single-AZ risk acceptable |
| Compliance archives (7+ year retention) | Glacier Deep Archive | $0.00099/GB; 23x cheaper than Standard |
| Mixed/unknown access patterns | Intelligent-Tiering | Automatic optimization; no operational overhead |

Lifecycle Policies

Lifecycle policies automate storage class transitions and object expiration. They are the highest-impact cost optimization tool for S3, and most teams either do not use them or configure them too conservatively.

Transition Rules

A lifecycle transition rule moves objects from one storage class to another after a specified number of days. The transition waterfall typically follows this pattern:

```mermaid
flowchart LR
  A["S3 Standard<br/>$0.023/GB"] -->|30 days| B["Standard-IA<br/>$0.0125/GB"]
  B -->|60 days| C["Glacier Instant<br/>$0.004/GB"]
  C -->|90 days| D["Glacier Flexible<br/>$0.0036/GB"]
  D -->|180 days| E["Deep Archive<br/>$0.00099/GB"]
```

S3 lifecycle transition waterfall

Each transition incurs a per-request fee. The fee varies by destination class:

| Transition Target | Cost per 1,000 Transitions |
|---|---|
| Standard-IA | $0.01 |
| One Zone-IA | $0.01 |
| Intelligent-Tiering | $0.01 |
| Glacier Instant Retrieval | $0.02 |
| Glacier Flexible Retrieval | $0.03 |
| Glacier Deep Archive | $0.05 |

These transition fees matter for buckets with millions of objects. Transitioning 10 million objects to Glacier Flexible costs $300 in transition fees alone. For small objects, the transition fee can exceed the storage savings. Rule of thumb: do not transition objects smaller than 128 KB to IA or Glacier classes. The IA classes and Glacier Instant Retrieval bill a 128 KB minimum per object anyway (a 1 KB object is charged as 128 KB), and Glacier Flexible and Deep Archive add roughly 40 KB of per-object metadata overhead, so small objects see little or no savings.
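A quick payback sketch makes the size sensitivity concrete (my own helper; it ignores Glacier's per-object metadata overhead and minimum-duration charges, both of which make small objects look even worse):

```python
# Months until one object's storage savings repay its one-time transition fee.
def payback_months(object_gb, fee_per_1k, old_rate, new_rate):
    """object_gb: size in GB; fee_per_1k: transition fee per 1,000 requests;
    rates in $/GB-month."""
    fee = fee_per_1k / 1000                        # one-time cost per object
    monthly_savings = object_gb * (old_rate - new_rate)
    return fee / monthly_savings

# 1 GB object, Standard ($0.023) -> Glacier Flexible ($0.0036), $0.03/1K fee:
big = payback_months(1.0, 0.03, 0.023, 0.0036)
# The same transition for a 128 KB object:
small = payback_months(128 / (1024 * 1024), 0.03, 0.023, 0.0036)
```

The 1 GB object repays its transition fee within the first day; the 128 KB object needs over a year of storage savings just to break even on the fee.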

Expiration Rules

Lifecycle expiration permanently deletes objects after a specified age. This is the simplest and most impactful cost optimization for data with known retention requirements.

Common expiration targets:

| Data Type | Typical Retention | Annual Cost per TB at Standard |
|---|---|---|
| Application logs | 30-90 days | $276 (if kept all year) |
| Build artifacts | 14-30 days | $276 (if kept all year) |
| Temporary uploads | 1-7 days | $276 (if kept all year) |
| Analytics staging data | 7-14 days | $276 (if kept all year) |
| Database backups | 30-365 days | $276 (if kept all year) |

The "Annual Cost" column shows what a single TB costs if it sits in Standard all year. The real win from expiration is capping growth: a 30-day expiration on a log bucket that ingests 1 TB per month holds storage steady at about 1 TB (~$23/month), while without it the bucket reaches 12 TB by the end of the first year and the monthly run rate climbs past $270 and keeps climbing.
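A sketch of that growth math, using the table's rounding of 1 TB = 1,000 GB (function and parameter names are mine):

```python
# Monthly storage bill for a bucket ingesting a fixed amount each month,
# with and without a lifecycle expiration.
RATE = 23.0  # $/TB-month at S3 Standard (1,000 GB x $0.023)

def monthly_costs(months, ingest_tb=1.0, expire_after_months=None):
    """Return the storage bill for each month of the simulation."""
    costs = []
    for m in range(1, months + 1):
        # With expiration, only the most recent N months of data survive.
        retained = m if expire_after_months is None else min(m, expire_after_months)
        costs.append(retained * ingest_tb * RATE)
    return costs

no_expiry = monthly_costs(12)                           # grows every month
with_expiry = monthly_costs(12, expire_after_months=1)  # capped at ~1 TB
```

Over the first year alone the difference is about $1,500, and it widens every month the unexpired bucket keeps growing.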

Transition Cost Traps

I've seen three lifecycle misconfigurations repeatedly:

Trap 1: Too many transitions. Going Standard to IA to Glacier Instant to Glacier Flexible to Deep Archive means paying four transition fees per object. Skip intermediate steps if the data's access pattern supports it. Go directly from Standard to Glacier Flexible after 60 days if you rarely need the data.

Trap 2: Minimum duration charges. Standard-IA has a 30-day minimum. If you delete or overwrite an object after 15 days, AWS charges you for the full 30 days. Glacier Flexible has a 90-day minimum. Deep Archive has a 180-day minimum. Transitioning data that gets deleted within the minimum duration costs more than leaving it in Standard.

Trap 3: Small object overhead. Objects smaller than 128 KB are billed as 128 KB in the IA classes and Glacier Instant Retrieval, and Glacier Flexible and Deep Archive add roughly 40 KB of metadata overhead per object. A 1 KB object in Standard-IA costs the same storage as a 128 KB object. If your bucket has millions of small files, the storage "savings" from transitioning are illusory.

Versioning Cost Control

S3 versioning is mandatory for cross-region replication and useful for accidental deletion protection. It also silently doubles or triples your storage costs if you do not manage noncurrent versions.

The Hidden Cost of Versioning

When versioning is enabled, every overwrite creates a new version while the old version persists. Deleting an object does not actually delete it; S3 places a delete marker on top while all previous versions remain. An application that overwrites a 1 MB object daily in a versioned bucket accumulates 365 MB of noncurrent versions per year for that single object.
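The accumulation is easy to model (helper name is mine; it counts the current version plus retained noncurrent versions for a single key):

```python
# Storage held by one versioned key that is overwritten once per day.
def versioned_storage_mb(days, object_mb=1.0, noncurrent_expiry_days=None):
    """Total MB stored after `days` of daily overwrites of one object."""
    noncurrent = days - 1                # every overwrite leaves one behind
    if noncurrent_expiry_days is not None:
        # A lifecycle rule expires noncurrent versions past this age.
        noncurrent = min(noncurrent, noncurrent_expiry_days)
    return object_mb * (1 + noncurrent)  # current version + noncurrent pile

unmanaged = versioned_storage_mb(365)                           # 365.0 MB
managed = versioned_storage_mb(365, noncurrent_expiry_days=30)  # 31.0 MB
```

One key, one year: a 30-day noncurrent expiration cuts the footprint by more than 10x, and the ratio keeps worsening the longer the bucket runs unmanaged.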

I audited a client's account where versioned buckets held 300 TB of noncurrent versions. Nobody had configured noncurrent version expiration. They were paying roughly $7,000/month in Standard storage for data they could never access without knowing the specific version ID. After adding a lifecycle rule to expire noncurrent versions after 30 days, their storage dropped by 280 TB over the following month.

Noncurrent Version Expiration

Add this lifecycle rule to every versioned bucket:

| Configuration | Recommended Value | Rationale |
|---|---|---|
| NoncurrentDays (in NoncurrentVersionExpiration) | 30 | Covers most "oops, I need the old version" scenarios |
| NewerNoncurrentVersions | 3 | Keeps the last 3 noncurrent versions for rollback |
| ExpiredObjectDeleteMarker | true | Cleans up orphaned delete markers |

For compliance workloads that require longer retention, transition noncurrent versions to Glacier Deep Archive after 30 days instead of expiring them. Storage drops from $0.023/GB to $0.00099/GB while maintaining version history.
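Expressed as an actual lifecycle configuration, using the field names the S3 PutBucketLifecycleConfiguration API accepts (the rule ID is my own choice):

```python
# Lifecycle rule for versioned buckets: expire noncurrent versions after
# 30 days while always retaining the 3 newest, and remove orphaned
# delete markers once no noncurrent versions remain behind them.
noncurrent_cleanup = {
    "Rules": [
        {
            "ID": "ExpireNoncurrentVersions",  # arbitrary rule name
            "Status": "Enabled",
            "Filter": {},                      # applies to the whole bucket
            "NoncurrentVersionExpiration": {
                "NoncurrentDays": 30,          # expire 30 days after becoming noncurrent
                "NewerNoncurrentVersions": 3,  # but keep the newest 3 for rollback
            },
            "Expiration": {
                "ExpiredObjectDeleteMarker": True,  # clean up orphaned markers
            },
        }
    ]
}
```

This dict can be passed as the `LifecycleConfiguration` argument to boto3's `put_bucket_lifecycle_configuration`.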

Request Cost Optimization

Request costs are the sleeper expense in S3. At $0.0004 per 1,000 GETs, 100 million GET requests per month is only $40, but the request line scales linearly with volume, and PUT and LIST requests cost 12.5x more per call. Workloads that touch billions of objects, or that write and list constantly, rack up request fees that rival storage. Many teams never check this because they assume "storage is the cost."

Consolidating Small Objects

The most impactful request optimization: reduce the number of objects. An application that stores one JSON record per S3 object and reads them individually generates one GET per record. Batching 1,000 records into a single object cuts GET requests by a factor of 1,000, assuming readers consume whole batches.

| Pattern | Objects | Monthly GETs | Monthly GET Cost |
|---|---|---|---|
| One record per object | 100M | 100M | $40.00 |
| 100 records per batch | 1M | 1M | $0.40 |
| 1,000 records per batch | 100K | 100K | $0.04 |

The same principle applies to writes. Buffering data and writing larger objects less frequently reduces PUT costs proportionally.
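One batching approach is newline-delimited JSON: pack many records into one object body so a whole batch costs a single PUT and a single GET (a sketch; the function names are mine):

```python
import json

def pack_batch(records):
    """Serialize records as newline-delimited JSON for a single PUT."""
    return "\n".join(json.dumps(r, separators=(",", ":")) for r in records)

def unpack_batch(payload):
    """Recover the records from a single GET's response body."""
    return [json.loads(line) for line in payload.splitlines()]

records = [{"id": i, "value": i * i} for i in range(1000)]
payload = pack_batch(records)   # 1,000 records -> one S3 object
```

The trade-off is read granularity: fetching one record now means fetching (or byte-range reading) its whole batch, so batch along the axis your readers actually query.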

Reducing LIST Operations

LIST operations cost $0.005 per 1,000 requests, 12.5x more expensive than GET. Applications that poll S3 for new files by listing bucket contents generate surprisingly large bills. I worked on a data pipeline that called ListObjectsV2 every 10 seconds across 50 prefixes. That is 432,000 LIST requests per day, costing $65/month just for listing.

Alternatives to polling with LIST:

| Approach | Cost | Latency |
|---|---|---|
| S3 Event Notifications to SQS | $0.40 per million messages | Seconds |
| S3 Event Notifications to Lambda | Lambda invocation cost | Seconds |
| EventBridge integration | $1.00 per million events | Seconds |
| S3 Inventory (batch) | $0.0025 per million objects listed | Daily/weekly |

Event-driven architectures eliminate LIST polling entirely. See AWS Event-Driven Messaging: SNS, SQS, EventBridge, and Beyond for the full pattern.

CloudFront for Read-Heavy Workloads

Serving S3 objects through CloudFront reduces both request costs and data transfer costs. Data transfer from S3 to CloudFront is free. CloudFront then serves cached responses from edge locations without hitting S3.

For a bucket serving 10 million GETs per month with an 80% cache hit rate, CloudFront reduces S3 GET requests from 10M to 2M, saving $3.20/month in request costs plus all the egress savings. The math improves as request volume grows. See Amazon CloudFront: An Architecture Deep-Dive for CloudFront architecture and pricing details.

Data Transfer Cost Reduction

S3 data transfer charges apply to data leaving S3. Inbound data transfer (uploads to S3) is free. Outbound follows AWS's standard egress pricing tiers.

Transfer Pricing Tiers

| Destination | Cost per GB |
|---|---|
| Same region, via VPC gateway endpoint | Free |
| Same region, via NAT Gateway | $0.045 (NAT data processing) |
| S3 to CloudFront | Free |
| S3 to internet (first 10 TB/mo) | $0.09 |
| S3 to internet (next 40 TB/mo) | $0.085 |
| S3 to internet (next 100 TB/mo) | $0.07 |
| Cross-region replication | $0.02 (varies by destination region) |

VPC Gateway Endpoints

If your EC2 instances or Lambda functions access S3 within the same region, a VPC Gateway Endpoint routes traffic over AWS's internal network at no cost. Without the endpoint, traffic routes through a NAT Gateway at $0.045/GB. For a workload transferring 10 TB/month from S3 to EC2, a VPC endpoint saves $450/month in NAT Gateway data processing charges alone.

VPC Gateway Endpoints for S3 are free to create and free to use. There is no reason not to have one in every VPC that accesses S3. See Cutting AWS Egress Costs with a Centralized VPC and Transit Gateway for the full egress optimization architecture.

```mermaid
flowchart TD
  S3[S3 Bucket] -->|Free| CF["CloudFront<br/>Edge Locations"]
  CF -->|$0.085/GB| INT["Internet<br/>End Users"]
  S3 -->|$0.09/GB| INT2["Internet<br/>Direct Egress"]
  S3 -->|"Free via<br/>VPC Endpoint"| EC2["EC2 / Lambda<br/>Same Region"]
  S3 -->|"$0.045/GB via<br/>NAT Gateway"| EC2B["EC2 / Lambda<br/>No VPC Endpoint"]
  S3 -->|$0.02/GB| S3B["S3 Bucket<br/>Other Region"]
  style EC2B fill:#ff6b6b,color:#fff
  style INT2 fill:#ff6b6b,color:#fff
```

S3 data transfer cost optimization paths

The red paths are the expensive ones. Eliminating NAT Gateway routing and direct internet egress are the two highest-impact transfer optimizations.

S3 Transfer Acceleration

S3 Transfer Acceleration uses CloudFront edge locations to speed up uploads over long distances. It costs $0.04-0.08/GB on top of standard transfer pricing. Only use it when upload speed from distant clients genuinely matters. I have seen teams enable Transfer Acceleration "just in case" and add thousands per month to their bill for uploads that originate from the same region as the bucket.

Incomplete Multipart Uploads

This is the easiest money to recover. Incomplete multipart uploads are partial file uploads that started and never finished. They sit in your bucket invisibly, consuming storage, and S3 charges you for every byte. You cannot see them in the S3 console's normal object listing. They do not appear in bucket size metrics. They show up only in S3 Storage Lens or through the ListMultipartUploads API.

How They Accumulate

Any application using the multipart upload API (required for files over 5 GB, commonly used for files over 100 MB) can leave incomplete uploads behind. Network failures, application crashes, timeout misconfigurations, and abandoned large file transfers all contribute. I have found buckets with terabytes of incomplete multipart uploads dating back years.

The Fix: One Lifecycle Rule

Add this lifecycle rule to every bucket:

```json
{
  "Rules": [
    {
      "ID": "AbortIncompleteMultipartUploads",
      "Status": "Enabled",
      "Filter": {},
      "AbortIncompleteMultipartUpload": {
        "DaysAfterInitiation": 7
      }
    }
  ]
}
```

Seven days is safe for virtually all workloads. If a multipart upload has not completed in seven days, it never will. Some teams use 1-3 days. I default to 7 to be conservative.

This single rule, applied account-wide, consistently saves 5-15% of total S3 storage costs in accounts that have never configured it.
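Rolling the rule out across an account can be scripted with boto3. A minimal sketch (the function name is mine; note that `put_bucket_lifecycle_configuration` replaces a bucket's entire existing lifecycle configuration, so merge with the current rules on buckets that already have them):

```python
# Apply the abort-incomplete-multipart rule to every bucket in an account.
ABORT_RULE = {
    "Rules": [
        {
            "ID": "AbortIncompleteMultipartUploads",
            "Status": "Enabled",
            "Filter": {},  # whole bucket
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        }
    ]
}

def apply_abort_rule(s3):
    """Apply ABORT_RULE to every bucket the caller can list.

    `s3` is a boto3 S3 client (boto3.client("s3")); it is passed in so
    the function can be exercised without AWS credentials.
    CAUTION: this overwrites each bucket's existing lifecycle rules;
    fetch them with get_bucket_lifecycle_configuration and merge first
    if any bucket already has lifecycle rules you want to keep.
    """
    for bucket in s3.list_buckets()["Buckets"]:
        s3.put_bucket_lifecycle_configuration(
            Bucket=bucket["Name"],
            LifecycleConfiguration=ABORT_RULE,
        )
```

Run it once, then confirm in Storage Lens that the incomplete-multipart-upload bytes metric starts falling over the following week.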

Monitoring and Visibility

S3 Storage Lens

Storage Lens is the most useful S3 monitoring tool that most teams never enable. The free tier provides 28 metrics including storage by class, request counts, and incomplete multipart upload tracking. The advanced tier ($0.20 per million objects monitored per month) adds 35 additional metrics including cost efficiency scores and activity metrics.

| Storage Lens Feature | Free Tier | Advanced Tier |
|---|---|---|
| Storage metrics | 28 metrics | 63 metrics |
| Activity metrics | No | Yes |
| Cost efficiency metrics | Basic (4 metrics) | Comprehensive |
| Prefix-level aggregation | No | Yes |
| CloudWatch publishing | No | Yes |
| Data retention | 14 days | 15 months |
| Cost | Free | $0.20 per million objects/month |

For accounts with fewer than 50 million objects, the advanced tier costs less than $10/month and provides the visibility needed to identify every optimization opportunity. Enable it.

Storage Class Analysis

S3 Storage Class Analysis monitors access patterns for individual buckets or prefixes over 30+ days and recommends whether objects should transition to a different storage class. It generates actionable recommendations based on actual access data rather than guesswork.

Enable Storage Class Analysis on your largest buckets first. After 30 days, check the recommendations. In my experience, 40-60% of the objects sitting in S3 Standard in a typical account belong in a lower-cost tier.

Key Savings Patterns

After optimizing dozens of AWS accounts, these are the consistently highest-impact actions, ranked by typical savings:

| Priority | Action | Typical Savings | Effort |
|---|---|---|---|
| 1 | Lifecycle expiration for temporary data | 20-40% of storage | Low |
| 2 | Noncurrent version expiration | 10-30% of storage | Low |
| 3 | Abort incomplete multipart uploads | 5-15% of storage | Low |
| 4 | Intelligent-Tiering as default class | 20-50% of storage | Low |
| 5 | VPC Gateway Endpoints | Eliminates NAT costs for S3 traffic | Low |
| 6 | CloudFront for public content | 30-50% of transfer + request costs | Medium |
| 7 | Object consolidation (small files) | 50-99% of request costs | High |
| 8 | Lifecycle transitions to Glacier | 70-95% of storage for cold data | Medium |
| 9 | Event-driven architecture (replace LIST polling) | Variable; eliminates LIST costs | High |

The first five items take less than an hour to implement across an entire account and typically reduce the S3 bill by 30-50%. Start there.


Let's Build Something!

I help teams ship cloud infrastructure that actually works at scale. Whether you're modernizing a legacy platform, designing a multi-region architecture from scratch, or figuring out how AI fits into your engineering workflow, I've seen your problem before. Let me help.

Currently taking on select consulting engagements through Vantalect.