About the author: I'm Charles Sieg, a cloud architect and platform engineer who builds apps, services, and infrastructure for Fortune 1000 clients through Vantalect. If your organization is rethinking its software strategy in the age of AI-assisted engineering, let's talk.
Amazon CloudFront is one of the most underestimated services in the AWS portfolio. Most teams think of it as a caching layer you put in front of your S3 bucket or Application Load Balancer to speed up static asset delivery. That understanding was roughly correct in 2015. It is incomplete today. CloudFront has evolved into a globally distributed edge compute and security platform that handles request routing, WAF enforcement, DDoS mitigation, authentication, A/B testing, header manipulation, and serverless compute, all before a request ever reaches your origin. This article covers the architectural patterns and operational lessons I have accumulated from architecting systems that serve traffic through CloudFront across dozens of AWS accounts.
This article serves as an architecture reference for engineers and architects who need to understand how CloudFront works under the hood and how to optimize it.
What CloudFront Actually Is
CloudFront started life in 2008 as a straightforward content delivery network: cache objects at edge locations close to users, reduce latency, offload origin traffic. Over the past seventeen years, AWS has systematically expanded CloudFront's capabilities until it has become something fundamentally different from what it was originally designed to be.
Today, CloudFront is better understood as a globally distributed request processing platform with caching as one of its capabilities. A request arriving at a CloudFront edge location can be:
- Served from cache (the traditional CDN function)
- Transformed by CloudFront Functions (lightweight JavaScript at the edge)
- Processed by Lambda@Edge (full Lambda functions at regional edge caches)
- Inspected and filtered by AWS WAF rules
- Authenticated via signed URLs or signed cookies
- Routed to different origins based on path patterns, headers, or cookies
- Failed over to a backup origin if the primary is unhealthy
- Logged in real-time to Kinesis Data Streams
All of this happens before the request reaches your origin, and often the request never reaches the origin at all. For many workloads, the majority of requests are handled entirely at the edge.
The practical implication for architects: evaluating CloudFront purely on its caching performance misses most of its value. The edge compute, security integration, and origin architecture capabilities are where CloudFront differentiates itself, particularly for AWS-native workloads.
CloudFront Architecture Internals
Understanding CloudFront's physical and logical architecture is essential for making good decisions about cache behavior, origin configuration, and performance optimization.
The Three-Tier Topology
CloudFront operates a three-tier infrastructure:
| Tier | Count | Purpose | Characteristics |
|---|---|---|---|
| Edge locations | 600+ globally | Serve cached content, execute CloudFront Functions, terminate TLS | Lowest latency to end users, smallest cache capacity |
| Regional edge caches (RECs) | 13 globally | Second-level cache between edge and origin | Larger cache capacity, longer TTLs, reduce origin fetches |
| Origin Shield | Optional, per-region | Centralized cache layer in front of origin | Single point of origin contact, maximizes cache hit ratio |
When a request arrives at an edge location and the content is not cached locally, the edge does not go directly to your origin. Instead, it checks the regional edge cache. If the REC has the content, it serves it to the edge location, which caches it locally and serves the user. Only if the REC also has a cache miss does the request go to the origin (or to Origin Shield, if enabled).
This tiered architecture means that even a cache miss at the edge does not necessarily result in an origin fetch. In practice, regional edge caches absorb a significant percentage of cache misses from individual edge locations, because they aggregate cache from dozens of edge locations in their region.
```mermaid
flowchart LR
    U[User] --> E[Edge Location<br/>600+ globally]
    E -->|Cache Hit| U
    E -->|Miss| R[Regional Edge Cache<br/>13 globally]
    R -->|Cache Hit| E
    R -->|Miss| OS{Origin Shield<br/>enabled?}
    OS -->|Yes| OSN[Origin Shield<br/>single region]
    OS -->|No| O[Origin<br/>S3 / ALB]
    OSN -->|Cache Hit| R
    OSN -->|Miss| O
    O --> OSN
    O --> R
    style E fill:#4a9,stroke:#333
    style R fill:#38b,stroke:#333
    style OSN fill:#e94,stroke:#333
```
Origin Shield
Origin Shield is the most impactful CloudFront feature that most teams do not enable. It adds an additional caching layer (a single, centralized cache in front of your origin) that collapses all origin-bound requests into a single point.
Without Origin Shield, each regional edge cache independently fetches from your origin on a cache miss. If you have content requested from ten RECs simultaneously (during a cache invalidation or a new deployment, for example), your origin receives ten requests for the same object. With Origin Shield, all ten RECs route their misses through a single Origin Shield node, which makes one request to the origin and fans the response out.
The impact is significant for workloads with:
- High cardinality content (millions of unique URLs) where cache hit ratios at individual RECs are low
- Expensive origin computation where each origin request triggers database queries or API calls
- Origin scaling constraints where the origin cannot handle high concurrent request rates
Origin Shield adds approximately $0.0090 per 10,000 requests in the US/Europe. For any origin that is not trivially cheap to operate, this pays for itself immediately.
Request Routing
CloudFront uses a combination of anycast IP addresses and latency-based DNS routing to direct users to the nearest edge location. When a user resolves a CloudFront domain name, Route 53 (which handles CloudFront's DNS) returns the IP address of the edge location with the lowest measured latency from the user's DNS resolver location.
CloudFront uses latency-based routing rather than simple geography. AWS continuously measures latency between DNS resolver networks and edge locations, so a user in a city with two nearby edge locations will be routed to the one with better current network performance, even if it is farther away geographically.
How CloudFront Scales
CloudFront edge locations scale horizontally within each PoP. During traffic spikes, CloudFront automatically provisions additional capacity at affected edge locations. This scaling is invisible to you; there is no pre-warming step, no capacity reservation, no scaling policy to configure.
However, there is an important nuance: your origin still needs to handle the traffic that misses cache. CloudFront scaling protects you from the full force of user traffic, but origin requests still scale proportionally to cache miss rate × total traffic. If your cache hit ratio drops (due to an invalidation, a deployment, or a change in traffic patterns), your origin can suddenly see a spike in requests. Origin Shield and proper cache key design are your primary defenses against this.
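It helps to write this relationship down: origin load is total traffic multiplied by the miss rate, so a small drop in hit ratio produces a large relative jump in origin requests. A quick sketch with illustrative numbers:

```python
def origin_rps(total_rps: float, cache_hit_ratio: float) -> float:
    """Requests per second that reach the origin for a given edge hit ratio."""
    return total_rps * (1.0 - cache_hit_ratio)

# At 10,000 req/s, a hit-ratio drop from 98% to 90% quintuples origin load
# even though user-facing traffic has not changed at all.
steady = origin_rps(10_000, 0.98)    # ~200 req/s to the origin
degraded = origin_rps(10_000, 0.90)  # ~1,000 req/s to the origin
```

This is why a seemingly minor cache key change or invalidation can translate into an origin incident.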
Origin Architecture
The origin is where CloudFront fetches content that is not in cache. CloudFront's origin capabilities have expanded substantially and now support sophisticated failover, access control, and connection management patterns.
S3 Origins
S3 is the most common CloudFront origin, and the integration has become increasingly tight over the years.
Origin Access Control (OAC) is the current best practice for securing S3 origins. OAC uses IAM-based authentication: CloudFront signs requests to S3 using SigV4, and the S3 bucket policy grants access only to the CloudFront distribution. This means your S3 bucket can remain completely private (no public access) while CloudFront serves its content globally.
Origin Access Identity (OAI) is the legacy mechanism that OAC replaces. OAI used a special CloudFront identity rather than IAM signing. It still works, but it has limitations: no support for S3 server-side encryption with KMS (SSE-KMS), no support for S3 bucket policies that require specific conditions, and no support for requests that use the S3 POST method. If you are still using OAI, migrate to OAC.
| Feature | OAC | OAI |
|---|---|---|
| SSE-KMS support | Yes | No |
| S3 bucket policy conditions | Full support | Limited |
| POST method support | Yes | No |
| SigV4 signing | Yes | No |
| New distributions | Recommended | Legacy |
ALB and NLB Origins
For dynamic content, CloudFront origins are typically Application Load Balancers or Network Load Balancers. The key architectural consideration here is origin protocol policy: CloudFront can connect to ALB/NLB origins over HTTP or HTTPS.
For ALB origins, I always configure HTTPS between CloudFront and the ALB, even though the traffic traverses AWS's internal network. This provides defense-in-depth and satisfies compliance requirements. The ALB should use a certificate from ACM (free, auto-renewing), and the CloudFront origin configuration should verify the certificate.
A critical security pattern: configure the ALB's security group to only accept traffic from CloudFront's managed prefix list (com.amazonaws.global.cloudfront.origin-facing). This prevents anyone from bypassing CloudFront by hitting the ALB directly, which would bypass your WAF rules, caching, and edge security.
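As a sketch, the ingress permission payload for this pattern looks like the following. The prefix-list ID is a placeholder; you would resolve the real ID for `com.amazonaws.global.cloudfront.origin-facing` in your account (for example via EC2's `describe-managed-prefix-lists`) and pass the dict to `authorize-security-group-ingress`:

```python
def cloudfront_only_https_ingress(prefix_list_id: str) -> dict:
    """Security-group ingress permission allowing HTTPS only from the
    CloudFront origin-facing managed prefix list."""
    return {
        "IpProtocol": "tcp",
        "FromPort": 443,
        "ToPort": 443,
        "PrefixListIds": [{
            "PrefixListId": prefix_list_id,
            "Description": "CloudFront origin-facing only",
        }],
    }

# "pl-00000000" is a placeholder, not a real prefix-list ID.
rule = cloudfront_only_https_ingress("pl-00000000")
```

With this rule in place (and no broader ingress rules), direct requests to the ALB's DNS name are dropped at the security group.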
Origin Groups and Failover
Origin groups allow you to configure automatic failover between a primary and secondary origin. When CloudFront receives a configurable set of HTTP status codes from the primary origin (typically 500, 502, 503, 504), it automatically retries the request against the secondary origin.
Common failover patterns:
| Pattern | Primary Origin | Secondary Origin | Use Case |
|---|---|---|---|
| S3 cross-region | S3 bucket in us-east-1 | S3 bucket in us-west-2 | Static site disaster recovery |
| ALB multi-region | ALB in primary region | ALB in DR region | Dynamic app disaster recovery |
| S3 static fallback | ALB (dynamic) | S3 (static maintenance page) | Graceful degradation during outages |
Connection Management
CloudFront maintains persistent connections to your origins, reusing TCP and TLS connections across multiple requests. This matters because TLS handshakes are expensive: a full handshake adds one round trip under TLS 1.3 and two under TLS 1.2, on top of the TCP handshake. By reusing connections, CloudFront amortizes this cost across many requests.
The Origin Keep-Alive Timeout (default 5 seconds, configurable up to 60 seconds) controls how long CloudFront keeps idle connections open to the origin. For origins with steady traffic, increasing this to 30-60 seconds reduces connection setup overhead. For origins with very spiky traffic, the default is fine; you do not want CloudFront holding thousands of idle connections.
The Origin Read Timeout (default 30 seconds, configurable up to 60 seconds) controls how long CloudFront waits for a response from the origin. If your origin has endpoints that take longer than 30 seconds (large report generation, for example), you must increase this. Otherwise, CloudFront returns a 504 to the user.
Cache Behavior
Cache behavior configuration is where most CloudFront complexity lives, and it is where the most common misconfigurations occur.
Cache Keys
The cache key determines what CloudFront uses to identify a unique cached object. By default, the cache key includes only the URL path and the query string. Two requests with the same URL but different headers or cookies will return the same cached response.
This is usually what you want for static content. For dynamic content, however, you need to include additional components in the cache key: specific headers (like Accept-Language for localized content), specific cookies (like a session cookie for per-user caching), or specific query string parameters.
The cardinal rule of cache key design: include the minimum necessary to produce a correct response. Every additional component in the cache key reduces your cache hit ratio. Including the Authorization header in the cache key, for example, effectively creates a per-user cache. Your cache hit ratio drops to near zero, and every request goes to the origin.
Cache Policies vs. Origin Request Policies
This distinction is one of CloudFront's most important and most confusing concepts.
| Policy Type | Controls | Purpose |
|---|---|---|
| Cache policy | What goes into the cache key + TTL settings | Determines cache hit/miss behavior |
| Origin request policy | What headers/cookies/query strings are forwarded to the origin | Determines what the origin sees on a cache miss |
These are independent. You can include a header in the origin request policy (so the origin receives it) without including it in the cache policy (so it does not affect caching). This separation matters: you can forward the User-Agent header to your origin for analytics without making every unique user agent create a separate cache entry.
Before this separation existed (the "legacy cache settings" model), forwarding a header to the origin automatically added it to the cache key. This forced a trade-off between origin functionality and cache efficiency. The policy model eliminates that trade-off entirely.
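The separation can be modeled in a few lines. The names and structures below are illustrative, not the CloudFront API; the point is that the cache key and the forwarded request are derived independently from the same incoming request:

```python
def build_cache_key(url: str, headers: dict, cache_policy_headers: set) -> tuple:
    """Cache key = URL plus only the headers named in the cache policy."""
    selected = tuple(sorted(
        (h, v) for h, v in headers.items() if h in cache_policy_headers
    ))
    return (url, selected)

def build_origin_request(headers: dict, origin_policy_headers: set) -> dict:
    """The origin sees only the headers named in the origin request policy."""
    return {h: v for h, v in headers.items() if h in origin_policy_headers}

headers = {"User-Agent": "Mozilla/5.0", "Accept-Language": "de-DE"}
# User-Agent is forwarded for analytics but kept OUT of the cache key,
# so every distinct browser string does not create its own cache entry.
key = build_cache_key("/products", headers, {"Accept-Language"})
fwd = build_origin_request(headers, {"User-Agent", "Accept-Language"})
```

Under the legacy model, forwarding `User-Agent` would have forced it into `key` as well, fragmenting the cache.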
```mermaid
flowchart TD
    REQ[Incoming Request<br/>Headers, Cookies, Query Strings] --> CP[Cache Policy<br/>Determines cache key]
    REQ --> ORP[Origin Request Policy<br/>Determines what origin sees]
    CP --> CK[Cache Key Components<br/>URL + selected headers<br/>+ selected cookies<br/>+ selected query strings]
    CK --> CD{Cache<br/>Lookup}
    CD -->|Hit| RES[Cached Response]
    CD -->|Miss| FWD[Forward to Origin]
    ORP --> FH[Forwarded to Origin<br/>All specified headers<br/>+ cookies + query strings]
    FH --> FWD
    style CP fill:#38b,stroke:#333
    style ORP fill:#e94,stroke:#333
```
TTL Hierarchy
CloudFront determines how long to cache an object using a priority hierarchy:
1. Origin `Cache-Control` or `Expires` headers. Highest priority. If your origin sends `Cache-Control: max-age=3600`, CloudFront caches for 3600 seconds.
2. Cache policy minimum/maximum/default TTL. CloudFront enforces these as bounds. If the origin says `max-age=86400` (24 hours) but your cache policy has a maximum TTL of 3600 (1 hour), CloudFront caches for 1 hour.
3. Default TTL. Used when the origin does not send `Cache-Control` or `Expires` headers.
My recommendation: set Cache-Control headers at the origin (this is the most explicit and portable approach), configure the cache policy with a reasonable default TTL (3600 seconds for most content), and set a maximum TTL that prevents accidentally caching something forever.
Cache Invalidation
Cache invalidation removes objects from all CloudFront edge caches before their TTL expires. You can invalidate specific paths (`/images/logo.png`) or wildcard paths (`/images/*`).
Key operational details:
- Invalidations propagate globally in 60-300 seconds (not instantaneous)
- The first 1,000 invalidation paths per month are free; additional paths cost $0.005 each
- A wildcard invalidation (`/*`) counts as one path regardless of how many objects it affects
- Invalidations are eventually consistent; there is no way to guarantee that all edge locations have purged the object at a specific point in time
For deployments, I prefer cache-busting filenames (appending a hash to the filename, like `app.a1b2c3d4.js`) over invalidations. Cache-busting is instantaneous (the new filename is a new cache key), free, and deterministic. Invalidation is slow, eventually consistent, and costs money at scale.
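A minimal sketch of the pattern, assuming a build step that renames assets by content hash (the naming scheme is illustrative):

```python
import hashlib
import pathlib

def busted_filename(path: str, content: bytes, digest_len: int = 8) -> str:
    """Rename an asset by content hash: app.js -> app.<hash>.js.
    A changed build yields a new filename, i.e. a brand-new cache key."""
    p = pathlib.PurePosixPath(path)
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    return f"{p.stem}.{digest}{p.suffix}"

v1 = busted_filename("assets/app.js", b"console.log('v1')")
v2 = busted_filename("assets/app.js", b"console.log('v2')")
# v1 != v2: clients fetch the new build immediately, with no invalidation,
# no propagation delay, and no per-path invalidation charges.
```

The hashed assets can then be served with a very long `max-age`, since a stale copy can never be fetched under the new name.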
Real-Time Logs
CloudFront real-time logs stream request-level data to Kinesis Data Streams within seconds of the request. This is fundamentally different from standard access logs (which are delivered to S3 with a delay of minutes to hours).
Real-time logs include cache hit/miss status, edge location, time-to-first-byte, and other metrics that are essential for understanding CDN performance. I use real-time logs with Kinesis Data Firehose to stream to OpenSearch for real-time dashboards, and to S3 for long-term analysis.
The sampling rate is configurable (1-100%), which controls cost. For most workloads, a 10-25% sample rate provides sufficient signal for performance analysis without the cost of logging every request.
Security at the Edge
CloudFront's security capabilities are extensive and deeply integrated with the broader AWS security ecosystem. This integration is one of the strongest arguments for CloudFront over third-party CDNs for AWS-native workloads.
AWS WAF Integration
CloudFront is a first-class WAF attachment point. You can associate an AWS WAF web ACL directly with a CloudFront distribution, and WAF rules are evaluated at the edge, before the request reaches your origin or even your VPC.
This is architecturally significant because it means:
- DDoS traffic is dropped at the edge, not at your origin
- Bot management happens globally, at the point closest to the bot
- Rate limiting is enforced per-edge-location with global aggregation
- Geo-blocking prevents requests from restricted regions from ever entering your infrastructure
- Custom rules using IP sets, regex patterns, and SQL injection detection run at CloudFront's scale
For AWS-native architectures, WAF on CloudFront is the perimeter. Everything behind it (ALBs, API Gateways, Lambda functions) serves as a second line of defense.
AWS Shield
Every CloudFront distribution automatically receives Shield Standard at no additional cost. Shield Standard provides protection against the most common Layer 3 and Layer 4 DDoS attacks: SYN floods, UDP reflection attacks, and other volumetric attacks.
Shield Advanced ($3,000/month per organization, not per distribution) adds:
- 24/7 access to the AWS DDoS Response Team (DRT)
- Advanced attack detection and mitigation
- Cost protection (AWS credits your account for scaling charges caused by DDoS attacks)
- Health-based detection using Route 53 health checks
- Detailed attack forensics and reporting
For any organization where a DDoS attack would cause significant business impact, Shield Advanced on CloudFront is the most cost-effective DDoS protection available in AWS. The $3,000/month covers all CloudFront distributions in the organization, plus all ALBs, NLBs, Elastic IPs, and Global Accelerator endpoints.
Signed URLs and Signed Cookies
For content that should only be accessible to authorized users (premium video, paid downloads, private APIs), CloudFront supports signed URLs and signed cookies. Both use RSA key pairs managed through CloudFront key groups.
| Method | Use Case | Granularity |
|---|---|---|
| Signed URLs | Individual file access, one-time downloads | Per-URL, time-limited |
| Signed cookies | Access to multiple files (video segments, image sets) | Per-cookie, covers URL patterns |
The key architectural decision: signed URLs are simpler but require generating a unique URL for each asset. Signed cookies are more complex to implement but allow seamless access to sets of related files (like all segments of an HLS video stream) with a single authentication event.
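For illustration, here is how the canned policy document and CloudFront's URL-safe base64 variant fit together. A complete signed URL also needs an RSA signature of this policy made with your key group's private key (botocore's `CloudFrontSigner` handles that part); this sketch covers only policy construction and encoding, and the domain is a documentation-style example:

```python
import base64
import json

def canned_policy(url: str, expires_epoch: int) -> str:
    """The canned policy document CloudFront evaluates for a signed URL."""
    return json.dumps({
        "Statement": [{
            "Resource": url,
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
        }],
    }, separators=(",", ":"))

def cloudfront_safe_b64(data: bytes) -> str:
    """CloudFront's URL-safe base64: '+' -> '-', '=' -> '_', '/' -> '~'."""
    return (base64.b64encode(data).decode("ascii")
            .replace("+", "-").replace("=", "_").replace("/", "~"))

policy = canned_policy("https://d111111abcdef8.cloudfront.net/video.mp4", 1767225600)
encoded = cloudfront_safe_b64(policy.encode("utf-8"))
```

The substitution exists because standard base64 characters (`+`, `=`, `/`) are not safe inside a URL query string.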
Field-Level Encryption
Field-level encryption allows you to encrypt specific POST body fields at the edge using your own RSA public key. The encrypted fields traverse CloudFront, your ALB, and your application in encrypted form. Only the service with the private key can decrypt them.
This provides defense-in-depth for sensitive data like credit card numbers or social security numbers. Even if an attacker compromises your application server, the encrypted fields are opaque.
Edge Compute
CloudFront offers two edge compute options with fundamentally different architectures, capabilities, and cost models.
CloudFront Functions
CloudFront Functions run at edge locations (the tier closest to users) and execute in a lightweight JavaScript runtime purpose-built for high-volume, latency-sensitive transformations.
| Characteristic | CloudFront Functions |
|---|---|
| Runtime | JavaScript (ECMAScript 5.1 compliant) |
| Execution location | Edge locations (600+) |
| Maximum execution time | 1 ms |
| Maximum memory | 2 MB |
| Maximum package size | 10 KB |
| Network access | No |
| File system access | No |
| Trigger events | Viewer request, viewer response |
| Price | $0.10 per million invocations |
CloudFront Functions are ideal for lightweight, high-frequency transformations: URL rewrites, header manipulation, cache key normalization, redirect logic, request/response header addition, and simple A/B testing based on cookies or headers.
Lambda@Edge
Lambda@Edge runs at regional edge caches (not edge locations) and executes in the full Lambda runtime: Node.js or Python, with network access and far larger memory and package-size limits than CloudFront Functions.
| Characteristic | Lambda@Edge |
|---|---|
| Runtime | Node.js, Python |
| Execution location | Regional edge caches (13) |
| Maximum execution time | 5 seconds (viewer triggers), 30 seconds (origin triggers) |
| Maximum memory | 128-3,008 MB |
| Maximum package size | 1 MB (viewer), 50 MB (origin) |
| Network access | Yes |
| File system access | Yes (read-only /tmp) |
| Trigger events | Viewer request, viewer response, origin request, origin response |
| Price | $0.60 per million invocations + duration charges |
Lambda@Edge is appropriate for operations that require network calls (authentication against an external IdP, fetching configuration from DynamoDB), complex logic (content personalization, dynamic origin selection), or response generation (rendering HTML at the edge, generating redirects based on database lookups).
Choosing Between Them
| Use Case | CloudFront Functions | Lambda@Edge |
|---|---|---|
| URL rewrite/redirect | Yes | Overkill |
| Header manipulation | Yes | Overkill |
| Cache key normalization | Yes | Overkill |
| Simple A/B test (cookie-based) | Yes | Overkill |
| JWT validation (local, no network) | Yes (if < 1ms) | Yes |
| Authentication against external IdP | No (needs network) | Yes |
| Dynamic origin selection | No (needs origin trigger) | Yes |
| HTML rendering at edge | No (too constrained) | Yes |
| Image transformation | No | Yes |
| Geolocation-based content | Yes (header-based) | Yes |
My general rule: start with CloudFront Functions. They are 6x cheaper and run at the edge (lower latency). Only move to Lambda@Edge when you need network access, origin-phase triggers, or the execution time exceeds 1ms.
```mermaid
flowchart LR
    U[User] --> VReq["Viewer Request<br/>(CF Function or L@E)"]
    VReq --> Cache{Cache<br/>Lookup}
    Cache -->|Hit| VRes["Viewer Response<br/>(CF Function or L@E)"]
    Cache -->|Miss| OReq["Origin Request<br/>(Lambda@Edge only)"]
    OReq --> Origin[Origin]
    Origin --> ORes["Origin Response<br/>(Lambda@Edge only)"]
    ORes --> VRes
    VRes --> U
    style VReq fill:#4a9,stroke:#333
    style VRes fill:#4a9,stroke:#333
    style OReq fill:#e94,stroke:#333
    style ORes fill:#e94,stroke:#333
```
Pricing Deep-Dive
CloudFront Security Savings Bundle
The Security Savings Bundle is a 1-year commitment that provides up to 30% savings on CloudFront charges. You commit to a monthly spend level (minimum $100/month), and AWS credits your account for CloudFront usage up to that commitment at a discounted rate.
The bundle also includes AWS WAF usage for requests processed by CloudFront at no additional charge (normally $0.60 per million WAF requests). For workloads that use WAF on CloudFront, the bundle savings are even more significant.
My recommendation: if your CloudFront bill has been stable above $100/month for three consecutive months, enable the Security Savings Bundle. The commitment is minimal and the savings are automatic.
Price Classes
CloudFront lets you restrict which edge locations serve your content through price classes:
| Price Class | Edge Locations Included | Relative Cost |
|---|---|---|
| All | All 600+ global locations | Highest |
| 200 | US, Europe, Asia, Middle East, Africa | Moderate |
| 100 | US, Europe | Lowest |
If your users are primarily in North America and Europe, Price Class 100 reduces costs by excluding more expensive edge locations in South America, Asia Pacific, and other regions. Users in excluded regions still receive service from the nearest included edge location, which may be farther away.
Free Tier
CloudFront's free tier is permanent (not a 12-month trial) and includes:
- 1 TB of data transfer out per month
- 10,000,000 HTTP/HTTPS requests per month
- 2,000,000 CloudFront Functions invocations per month
For small sites and development environments, this free tier often covers the entire CloudFront cost.
Common Failure Modes
Cache Miss Storms
When you invalidate a popular object or deploy new content, every edge location simultaneously requests the object from the origin. For high-traffic sites, this can overwhelm the origin with thousands of concurrent requests for the same content.
Mitigation: Enable Origin Shield, which collapses concurrent requests into a single origin fetch. Also consider staggered deployments (updating content gradually rather than invalidating everything at once) and request collapsing at the origin (which Origin Shield does automatically).
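Request collapsing can also be sketched in code. This hypothetical in-process version shows the idea Origin Shield applies globally: concurrent misses for the same key wait on a single origin fetch instead of each hitting the origin:

```python
import threading

class CollapsingCache:
    """Collapse concurrent misses for a key into one origin fetch."""

    def __init__(self, fetch_origin):
        self._fetch = fetch_origin
        self._lock = threading.Lock()
        self._inflight = {}  # key -> Event signalled when the fetch lands
        self._cache = {}     # key -> cached value

    def get(self, key):
        with self._lock:
            if key in self._cache:
                return self._cache[key]      # plain cache hit
            event = self._inflight.get(key)
            if event is None:                # first miss: become the leader
                event = threading.Event()
                self._inflight[key] = event
                leader = True
            else:                            # follower: wait for the leader
                leader = False
        if leader:
            value = self._fetch(key)         # the single origin fetch
            with self._lock:
                self._cache[key] = value
                del self._inflight[key]
            event.set()
            return value
        event.wait()
        with self._lock:
            return self._cache[key]
```

Ten concurrent `get("/logo.png")` calls result in exactly one call to `fetch_origin`, which is precisely what protects the origin during a miss storm.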
502 and 504 Errors
502 (Bad Gateway): CloudFront could not connect to the origin, or the origin returned an invalid response. Common causes:
- Origin is down or unreachable
- Origin SSL certificate is invalid or expired
- Security group on the ALB does not allow traffic from CloudFront
- Origin is returning a response larger than the maximum (20 GB for download distributions)
504 (Gateway Timeout): CloudFront connected to the origin but did not receive a response within the Origin Read Timeout (default 30 seconds). Common causes:
- Origin processing time exceeds the timeout
- Network issues between CloudFront and the origin
- Origin is overloaded and dropping connections
For 504 errors, the fix is often increasing the Origin Read Timeout or optimizing the origin's response time. Avoid blindly increasing the timeout to 60 seconds. If your origin routinely takes more than 30 seconds to respond, that is an origin performance problem that should be addressed directly.
Invalidation Propagation Delays
After issuing an invalidation, it takes 60-300 seconds to propagate to all edge locations. During this window, some users may receive stale content while others receive fresh content. This is expected behavior but can cause issues for deployments that require atomic cutover.
Mitigation: Use cache-busting filenames for assets (CSS, JS, images) and only use invalidation for HTML documents or URLs that cannot be versioned. For truly atomic deployments, use CloudFront's continuous deployment feature to create a staging distribution that shares the same origin, test against it, and then promote it.
Origin Failover Timing
When using origin groups for failover, CloudFront waits for the primary origin to fail (timeout or return a configured error code) before trying the secondary. This means failover adds the primary origin's timeout to the total response time for the first failed request. If your Origin Read Timeout is 30 seconds, users may wait 30 seconds before the failover kicks in.
Mitigation: Reduce the Origin Read Timeout for the primary origin in failover configurations (10 seconds is often appropriate), and ensure the secondary origin can handle the redirected traffic.
Operational Patterns
Blue/Green Deployments
CloudFront's continuous deployment feature enables blue/green deployments at the CDN layer. You create a staging distribution that shares the same production origin, route a percentage of traffic to it using a header-based or weight-based policy, validate behavior, and then promote the staging configuration to production.
This is particularly useful for testing cache behavior changes, new WAF rules, or Lambda@Edge function updates without affecting all users.
Multi-Origin Architectures
A single CloudFront distribution can route to multiple origins based on path patterns (cache behaviors). Common pattern:
| Path Pattern | Origin | Use Case |
|---|---|---|
| `/api/*` | ALB | Dynamic API requests |
| `/static/*` | S3 bucket | Static assets (CSS, JS, images) |
| `/media/*` | S3 bucket (different bucket) | User-uploaded media |
| Default (`*`) | ALB | HTML pages |
This allows you to serve an entire application (static assets, dynamic API, and media) through a single domain and a single CloudFront distribution, with each path pattern optimized for its content type.
A/B Testing at the Edge
CloudFront Functions can implement A/B testing by examining or setting cookies on viewer request:
- On viewer request, check for an A/B cookie
- If absent, assign the user to a variant (A or B) and set the cookie
- Modify the request URL or add a header based on the variant
- Origin serves the appropriate variant based on the modified request
This pattern moves A/B test assignment to the edge (sub-millisecond) and ensures consistent variant assignment through cookies, without any origin-side logic.
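A deterministic assignment function is the heart of this pattern. In production that logic lives inside the CloudFront Function itself (ES5.1 JavaScript); the Python below is only an illustration of the hashing approach, with hypothetical names:

```python
import hashlib

def assign_variant(user_id: str, experiment: str, weights: dict) -> str:
    """Hash user+experiment into one of 10,000 buckets, then walk the
    cumulative weights. Same user -> same variant, with no stored state."""
    seed = f"{experiment}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(seed).hexdigest(), 16) % 10_000
    cumulative = 0
    for variant, weight in weights.items():
        cumulative += int(weight * 10_000)
        if bucket < cumulative:
            return variant
    return next(iter(weights))  # guard against rounding leaving a gap

variant = assign_variant("user-123", "homepage-test", {"A": 0.5, "B": 0.5})
```

Keying the hash on the experiment name as well as the user means a given user can land in different variants across different experiments, which keeps tests statistically independent.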
Conclusion
CloudFront's value for AWS-native architectures extends far beyond caching. The patterns I have found most effective in production:
- Enable Origin Shield for any non-trivial origin. The cost is minimal and the origin traffic reduction is significant. This is the single highest-impact CloudFront optimization.
- Separate cache policies from origin request policies. This lets you forward headers to the origin without polluting the cache key, which significantly improves cache hit ratios for dynamic content.
- Use OAC for all S3 origins. OAC provides full IAM-based security and supports SSE-KMS. OAI is a legacy mechanism; migrate any remaining OAI configurations to OAC.
- Lock down your ALB to CloudFront traffic only. Use the CloudFront managed prefix list in your ALB security group. Without this, anyone can bypass your CDN and WAF by hitting the ALB directly.
- Use cache-busting filenames instead of invalidations. Invalidations are slow, eventually consistent, and cost money at scale. Versioned filenames are instantaneous and free.
- Start with CloudFront Functions, escalate to Lambda@Edge only when needed. CloudFront Functions are 6x cheaper and run closer to the user. Only use Lambda@Edge when you need network access or origin-phase triggers.
CloudFront continues to evolve. VPC origins, continuous deployment, and CloudFront Functions have all been added in recent years. For architects building on AWS, it is a platform worth understanding at the architectural level, and treating as more than a checkbox on a deployment diagram.
Additional Resources
- Amazon CloudFront Developer Guide
- CloudFront Pricing
- CloudFront Security Savings Bundle
- Origin Access Control (OAC) Documentation
- CloudFront Functions Programming Model
- Lambda@Edge Developer Guide
- AWS WAF on CloudFront
- AWS Architecture Blog - CloudFront Posts
Let's Build Something!
I help teams ship cloud infrastructure that actually works at scale. Whether you're modernizing a legacy platform, designing a multi-region architecture from scratch, or figuring out how AI fits into your engineering workflow, I've seen your problem before. Let me help.
Currently taking on select consulting engagements through Vantalect.

