Lambda Behind ALB Behind CloudFront: An Architecture Deep-Dive

About the author: I'm Charles Sieg, a cloud architect and platform engineer who builds apps, services, and infrastructure for Fortune 1000 clients through Vantalect. If your organization is rethinking its software strategy in the age of AI-assisted engineering, let's talk.

Five ways to expose a Lambda function over HTTP. At least. AWS keeps adding more. Most teams pick API Gateway on day one and never revisit that decision. Fine. API Gateway handles a lot.

I kept hitting walls, though. The 29-second timeout? Killed my report-generation endpoints. Per-request pricing ate my budget once volume got real. The shared regional throttle turned into a coordination mess across teams. So I landed on CloudFront in front of an Application Load Balancer with Lambda targets. Routing flexibility. CDN edge capabilities. Serverless compute. No per-request pricing tax. No hard timeout ceiling.

Architecture reference. I'm assuming you already know Lambda, ALB, and CloudFront as individual services. How they work together, why you'd combine them, where the sharp edges hide: that's what this covers.

Why This Pattern Exists

API Gateway is the default front door for Lambda APIs. Auth, throttling, request validation, deployment stages. Good defaults. Three constraints bite at scale.

The 29-second hard timeout on REST APIs (30 seconds on HTTP APIs) can't be increased. Full stop. Report generation, batch processing, large data transformations. Anything that might run longer? Can't serve it through API Gateway. I hit this on a financial reporting system that needed 45 seconds for complex aggregations. No workaround.

Per-request pricing scales linearly. REST API: $3.50 per million requests. HTTP API: $1.00 per million. Hundreds of millions of requests per month and that line item alone dwarfs your Lambda compute cost.

Then there's the 10,000 requests-per-second regional throttle. Shared. Account-level. Across all API Gateway APIs in a region. You can request an increase, sure. Still a shared ceiling. Still forces coordination across teams. I burned two weeks once tracking down intermittent 429 errors. Another team's API was eating the shared quota. Two weeks.

November 2018, AWS announced ALB support for Lambda targets. ALB uses LCU-based pricing: pay for capacity consumed, not individual requests. No hard timeout beyond Lambda's own 15-minute max. No shared regional throttle. Each ALB scales on its own.

What does CloudFront add? Global edge caching (eliminating redundant Lambda invocations). AWS WAF integration. DDoS protection via Shield. TLS termination at the edge. An ALB alone gives you none of that.

For deeper discussion of the individual services, see Amazon API Gateway: An Architecture Deep-Dive, AWS Elastic Load Balancing: An Architecture Deep-Dive, and Amazon CloudFront: An Architecture Deep-Dive.

Signals that should make you evaluate this pattern:

| Signal | Why It Matters |
| --- | --- |
| Monthly request volume exceeds 30M | Per-request API Gateway pricing begins to dominate your serverless bill |
| Endpoints that may exceed 29 seconds | API Gateway's hard timeout is fixed; ALB has no equivalent limit |
| Multiple independent Lambda functions behind one domain | ALB listener rules route to independent target groups with no shared throttle |
| Need for WAF and DDoS protection | CloudFront provides AWS WAF and Shield Standard at the edge |
| Read-heavy workloads with cacheable responses | CloudFront caching eliminates origin requests entirely for cached content |
| Authentication offloading | ALB natively integrates with Cognito and OIDC providers, removing auth logic from Lambda |
| Hitting API Gateway throttle limits | ALB has no shared regional throttle ceiling |

How ALB Lambda Targets Work

Register a Lambda function as a target in an ALB target group and here's what happens. The ALB invokes your function synchronously via the Lambda Invoke API with the RequestResponse invocation type. Constructs a JSON event from the HTTP request. Calls the function. Waits for the complete response. Translates the JSON return value back into HTTP.

Totally different mental model from EC2 targets. No persistent connections. No connection pooling. No draining. Every request: independent Lambda invocation.
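A minimal handler makes the contract concrete. This sketch shows the response shape ALB translates back into HTTP; statusCode is required, the other fields are recommended. The greeting logic is purely illustrative.

```python
import json

def handler(event, context):
    # ALB event fields: httpMethod, path, queryStringParameters,
    # headers, body, isBase64Encoded.
    name = (event.get("queryStringParameters") or {}).get("name", "world")
    return {
        "statusCode": 200,
        "statusDescription": "200 OK",
        "isBase64Encoded": False,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"hello, {name}"}),
    }
```

If the returned dict lacks statusCode, or the body isn't a string, ALB answers the client with a 502 — the function itself succeeds, which makes this class of bug easy to miss.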

Why does the event format matter? Because ALB sends its own format to your Lambda function. Distinct from both API Gateway event formats. Writing handlers that work across multiple invocation paths means dealing with these differences.

| Field | ALB Event | API Gateway v1 (REST) | API Gateway v2 (HTTP) |
| --- | --- | --- | --- |
| Request context | requestContext.elb with target group ARN | requestContext with API ID, stage, identity | requestContext with API ID, route key, HTTP method |
| HTTP method | httpMethod | httpMethod | requestContext.http.method |
| Path | path | path and resource | rawPath |
| Query parameters | queryStringParameters | queryStringParameters | rawQueryString and queryStringParameters |
| Headers | headers | headers | headers (comma-delimited multi-value) |
| Body | body | body | body |
| Base64 encoding | isBase64Encoded | isBase64Encoded | isBase64Encoded |
| Path parameters | Not available | pathParameters | pathParameters |
| Stage variables | Not available | stageVariables | stageVariables |
| Cookies | Via headers only | Via headers | cookies (dedicated array) |
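Those differences can be smoothed over with a few accessors. A sketch — the field names come from the AWS event formats above; the helper functions themselves are illustrative, not a library API.

```python
def http_method(event):
    # ALB and API Gateway v1 use a top-level httpMethod;
    # API Gateway v2 nests it under requestContext.http.method.
    if "httpMethod" in event:
        return event["httpMethod"]
    return event["requestContext"]["http"]["method"]

def request_path(event):
    # ALB and v1 use "path"; v2 uses "rawPath".
    return event.get("path") or event.get("rawPath")

def is_alb_event(event):
    # Only ALB events carry requestContext.elb (the target group ARN).
    return "elb" in event.get("requestContext", {})
```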

Multi-value headers and query parameters. This one will bite you. By default, ALB sends only the last value when a header or query parameter appears multiple times. Breaks Set-Cookie responses (which require multiple values). Breaks any API using repeated query parameters like ?id=1&id=2. Fix? Enable multi-value headers on the target group. Changes the event format to include multiValueHeaders and multiValueQueryStringParameters. Always do this. Zero downside. Skip it and enjoy a subtle production bug that takes hours to find.
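Both halves of the fix, sketched with boto3 — flipping the lambda.multi_value_headers.enabled target group attribute and returning multiValueHeaders so two Set-Cookie values survive the trip. The target group ARN is a placeholder.

```python
def enable_multi_value_headers(target_group_arn):
    # Flip the target group attribute; after this, ALB sends
    # multiValueHeaders / multiValueQueryStringParameters and expects
    # multiValueHeaders in the response.
    import boto3  # imported here so the handler below stays testable offline
    boto3.client("elbv2").modify_target_group_attributes(
        TargetGroupArn=target_group_arn,
        Attributes=[{"Key": "lambda.multi_value_headers.enabled", "Value": "true"}],
    )

def handler(event, context):
    # With the attribute enabled, both Set-Cookie values reach the client.
    return {
        "statusCode": 200,
        "isBase64Encoded": False,
        "multiValueHeaders": {
            "Content-Type": ["text/plain"],
            "Set-Cookie": ["session=abc; HttpOnly", "theme=dark; Path=/"],
        },
        "body": "ok",
    }
```

Without the attribute, the same handler's second Set-Cookie value is silently dropped — which is exactly the hours-to-find bug described above.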

Health checks on Lambda targets. Different cost story than EC2. Every health check is a real Lambda invocation you pay for. At the default 35-second interval, each load balancer node checks independently; with nodes across three AZs that works out to roughly 7,400 health check invocations per day per target group. At $0.20 per million requests, that cost is negligible. Cold start risk, though. That's the real concern. A health check hitting a cold Lambda can temporarily mark the target unhealthy.

| Parameter | Default | Recommended for Lambda | Notes |
| --- | --- | --- | --- |
| Interval | 35 seconds | 60–300 seconds | Reduces invocation cost and cold start exposure |
| Timeout | 30 seconds | 10 seconds | Lambda health checks should respond fast; a slow response signals a problem |
| Healthy threshold | 5 | 3 | Faster recovery after transient cold start failures |
| Unhealthy threshold | 2 | 3 | Avoids flapping from a single cold start timeout |
| Path | / | /health | Use a dedicated lightweight handler that avoids database calls |
| Success codes | 200 | 200 | Keep it simple |
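A dedicated health handler per the table above might look like this — standard library only, no database calls, so a cold start is the only latency component. The /health path and response body are conventions, not requirements.

```python
def health_handler(event, context):
    # Intentionally trivial: no downstream dependencies, so the check
    # reports "is this function invocable," nothing more.
    return {
        "statusCode": 200,
        "statusDescription": "200 OK",
        "isBase64Encoded": False,
        "headers": {"Content-Type": "text/plain"},
        "body": "healthy",
    }
```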

The Full Request Flow

You need to understand the full request path. User to Lambda and back. Otherwise debugging latency, configuring timeouts, and designing cache strategies is guesswork. I've spent more time than I'd like to admit tracing requests through this stack. CloudWatch logs open in three tabs. Notebook full of scribbled latency numbers.

```mermaid
flowchart LR
  U[User] --> CF["CloudFront Edge Location"]
  CF -->|Cache Miss| ALB["Application Load Balancer"]
  ALB --> L443["HTTPS Listener :443"]
  L443 --> R1["Rule 1: /api/users/*"]
  L443 --> R2["Rule 2: /api/orders/*"]
  L443 --> RD["Default Rule"]
  R1 --> TG1[users-tg]
  R2 --> TG2[orders-tg]
  RD --> TGD[default-tg]
  TG1 --> LA["user-service Lambda"]
  TG2 --> LB["order-service Lambda"]
  TGD --> LD["default-handler Lambda"]
```

Request flow: CloudFront to ALB to Lambda with multi-function routing

Each stage:

  1. DNS resolution. Browser resolves your domain to a CloudFront edge location via Route 53 (or whatever DNS provider you use). Anycast routing picks the nearest edge.
  2. TLS termination at CloudFront. This is the single largest latency win in the whole architecture. TLS handshake happens within a few milliseconds of the user. Not hundreds of milliseconds away at your ALB's region.
  3. Cache lookup. CloudFront checks its edge cache. Origin Shield enabled? Also checks the regional cache. Cache hit means the response returns immediately. No ALB contact. No Lambda invocation. This is where cost savings live for read-heavy workloads.
  4. Origin request to ALB. Cache miss. CloudFront opens (or reuses) a persistent HTTPS connection to the ALB. Forwards the request with original headers plus whatever you've configured in your origin request policy.
  5. Listener rule evaluation. ALB evaluates rules in priority order. First match wins.
  6. Lambda invocation. ALB invokes the Lambda in the matching target group. Synchronous. Holds the connection open until Lambda returns or times out.
  7. Response propagation. Response flows back through ALB to CloudFront. Cache behavior and response headers permit it? CloudFront caches the response, sends it to the user.

| Stage | Typical Latency | Notes |
| --- | --- | --- |
| DNS resolution | 1–50 ms | Cached after first lookup; Route 53 alias records add zero latency |
| TLS handshake (edge) | 5–30 ms | Happens at nearest edge location; TLS 1.3 reduces round trips |
| CloudFront cache check | <1 ms | In-memory lookup at the edge |
| CloudFront to ALB | 1–20 ms | Persistent connections; same-region ALB is lowest latency |
| ALB rule evaluation | <1 ms | Rule matching is in-memory and extremely fast |
| Lambda execution | 5–500+ ms | Dominated by cold start (if any) and function logic |
| Response propagation | 2–30 ms | Return path through ALB and CloudFront to user |

Warm Lambda, same-region ALB, cache miss: total overhead from CloudFront and ALB runs 10–40 ms. Cache hit? Under 5 ms straight from the edge. Your ALB and Lambda locations stop mattering entirely.

CloudFront Configuration for ALB Origins

Lots of settings to get wrong here. Failure modes are subtle.

Origin configuration. HTTPS-only protocol. Keep-alive timeout of 30–60 seconds (connection reuse between CloudFront and ALB). Origin domain points to the ALB's DNS name (the *.region.elb.amazonaws.com hostname). Add a custom origin header (something like X-Origin-Verify with a secret value) that your ALB or Lambda validates. That's what ensures requests only arrive through CloudFront.

Cache behavior design. Where most complexity lives. A typical API mixes cacheable and non-cacheable paths. Design cache behaviors matching URL structure from day one. I learned the hard way. Retrofitting cache policies onto an existing API? Painful.

| Path Pattern | Cache Policy | Origin Request Policy | TTL | Use Case |
| --- | --- | --- | --- | --- |
| /api/catalog/* | Custom: cache on path, Accept header | Forward Host, Authorization excluded | 300s | Product catalog, read-heavy, cacheable |
| /api/users/* | CachingDisabled | AllViewerExceptHostHeader | 0 | User-specific data, never cache |
| /api/auth/* | CachingDisabled | AllViewerExceptHostHeader | 0 | Authentication endpoints, never cache |
| /assets/* | CachingOptimized | None | 86400s | Static assets, long TTL |
| Default (*) | CachingDisabled | AllViewerExceptHostHeader | 0 | Catch-all for dynamic content |

WAF placement. Three options: CloudFront, ALB, or both. Put WAF on CloudFront for rate limiting, geo-blocking, bot control. Edge evaluation before traffic hits your region. Only attach WAF to the ALB if you need rules inspecting headers or body content that CloudFront doesn't forward. Both layers? Doubles your WAF cost. Be deliberate about which rules go where. For a deeper look at CloudFront edge security, see the Amazon CloudFront: An Architecture Deep-Dive.

Multi-Function Routing

This is why the architecture earns its complexity. ALB's routing model.

API Gateway can route to multiple Lambda functions. ALB gives you way more flexibility. With API Gateway, all Lambda functions behind a single API share a regional throttle limit. ALB target groups? Independent invocation paths. No shared throttle. order-service takes a traffic spike and user-service keeps running. Unaffected. I watched API Gateway's shared throttle cause cascading failures across completely unrelated services once. That single experience pushed me toward ALB for multi-function workloads.

Path-based routing sends different URL prefixes to different Lambda functions, each in its own target group. Host-based routing means a single ALB serves multiple domains backed by different functions. Pay the ALB fixed cost once. Consolidation adds up fast.

```mermaid
flowchart TD
  ALB[Application Load Balancer] --> L["HTTPS Listener :443"]
  L --> R1["Priority 1: Host api.example.com, Path /users/*"]
  L --> R2["Priority 2: Host api.example.com, Path /orders/*"]
  L --> R3["Priority 3: Host admin.example.com, Path /*"]
  L --> RD["Default Rule: Fixed 404 Response"]
  R1 --> TG1["users-tg"] --> F1["user-service Lambda"]
  R2 --> TG2["orders-tg"] --> F2["order-service Lambda"]
  R3 --> TG3["admin-tg"] --> F3["admin-service Lambda"]
```

Multi-function ALB routing with independent target groups

Seven condition types on ALB listener rules. All work with Lambda targets.

| Condition Type | Example | Notes |
| --- | --- | --- |
| host-header | api.example.com | Enables multi-tenant or multi-domain routing on a single ALB |
| path-pattern | /api/users/* | Most common condition; supports wildcards |
| http-header | X-Api-Version: v2 | Route by custom header value; useful for API versioning |
| http-request-method | GET | Route reads vs. writes to different functions |
| query-string | action=export | Route specific query patterns to specialized handlers |
| source-ip | 10.0.0.0/8 | Route internal vs. external traffic differently |
| Combined conditions | Host + Path + Method | Up to 5 conditions per rule for precise routing |

Up to five conditions per rule, AND logic. A single listener supports 100 rules (adjustable via quota request). API Gateway can't touch this routing granularity.
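The condition model sketches cleanly with boto3's create_rule. The ARNs below are placeholders; the payload builder mirrors the Conditions/Actions structure the API expects, limited here to the two most common condition types.

```python
def rule_payload(priority, target_group_arn, host=None, path=None):
    # Build the Conditions/Actions structure create_rule expects.
    # Up to 5 conditions per rule, combined with AND logic.
    conditions = []
    if host:
        conditions.append({"Field": "host-header", "Values": [host]})
    if path:
        conditions.append({"Field": "path-pattern", "Values": [path]})
    return {
        "Priority": priority,
        "Conditions": conditions,
        "Actions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }

def create_rule(listener_arn, **kwargs):
    import boto3  # imported here so rule_payload stays testable offline
    return boto3.client("elbv2").create_rule(
        ListenerArn=listener_arn, **rule_payload(**kwargs)
    )
```

Lower priority numbers are evaluated first, so leave gaps (10, 20, 30) to make room for later rules without renumbering.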

| Routing Capability | ALB | API Gateway REST | API Gateway HTTP |
| --- | --- | --- | --- |
| Path-based routing | Yes | Yes | Yes |
| Host-based routing | Yes | Custom domains only | Custom domains only |
| Header-based routing | Yes | No | No |
| Query string routing | Yes | No | No |
| HTTP method routing | Yes | Yes | Yes |
| Source IP routing | Yes | No | No |
| Weighted routing | Yes (weighted target groups) | Canary deployments | No |
| Fixed response | Yes (custom status + body) | Mock integration | No |
| Redirect actions | Yes (301/302) | No | No |
| Multiple conditions per rule | Up to 5 AND conditions | One resource path | One route |
| Independent throttle per route | Yes (separate target groups) | No (shared regional) | No (shared regional) |
| Max rules per endpoint | 100 per listener | Unlimited routes | Unlimited routes |

Cost Analysis

Depends heavily on request volume and cache hit ratio. Raw numbers first, then the factors that shift things.

Each component in the CloudFront + ALB + Lambda path has its own pricing model.

| Component | Pricing Model | Key Metric | Approximate Rate (US East) |
| --- | --- | --- | --- |
| CloudFront requests | Per-request | HTTPS requests | $1.00 per 1M requests |
| CloudFront data transfer | Per-GB | Data out to internet | $0.085 per GB (first 10 TB) |
| ALB fixed | Hourly | Running hours | $16.20 per month |
| ALB LCU | Hourly per LCU | Highest of 4 dimensions | $0.008 per LCU-hour |
| Lambda requests | Per-request | Invocations | $0.20 per 1M requests |
| Lambda compute | Per-GB-second | Memory x duration | $0.0000166667 per GB-second |

Total monthly cost comparison across five invocation paths. Assumptions: 128 MB Lambda, 100 ms average duration, 2 KB average request+response size, HTTPS, US East, no free tier, no CloudFront caching (worst case for CloudFront paths).

| Monthly Requests | API GW REST | API GW HTTP | CF + ALB | Function URL | CF + Function URL |
| --- | --- | --- | --- | --- | --- |
| 1M | $4 | $1 | $18 | <$1 | $2 |
| 10M | $39 | $14 | $33 | $4 | $16 |
| 100M | $391 | $141 | $184 | $41 | $158 |
| 500M | $1,838 | $685 | $853 | $205 | $790 |
| 1B | $3,443 | $1,340 | $1,689 | $410 | $1,580 |

Without caching, CloudFront + ALB breaks even with API Gateway REST around 10–20 million requests per month. On raw per-request cost alone? Never beats API Gateway HTTP API. The advantage comes from two places: CloudFront caching (eliminates origin requests entirely) and operational capabilities (WAF, DDoS protection, edge TLS; stuff you'd otherwise pay for separately or can't get at all through API Gateway).

Now add caching. 80% cache hit ratio is realistic for read-heavy APIs. Product catalogs, config data, public content. Only 20% of requests actually reach ALB and Lambda. At 500 million requests per month that drops CloudFront + ALB from $853 to roughly $648. Cheaper than HTTP API's $685. Higher hit ratio, more savings. I've run APIs at 90%+ hit ratios. The numbers get very compelling at that point.
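A back-of-envelope sketch of that caching effect, using the US East rates from the pricing table. Deliberately simplified: ALB LCU charges and data transfer are workload-dependent and omitted, so treat the output as directional, not billing-grade.

```python
def origin_cost(requests, hit_ratio, memory_gb=0.125, duration_s=0.1):
    """Monthly CloudFront + Lambda cost under a given cache hit ratio.

    Only CloudFront request pricing and Lambda request + compute
    pricing are modeled; ALB LCU and data transfer are omitted.
    """
    misses = requests * (1 - hit_ratio)
    cf_requests = requests / 1e6 * 1.00           # $1.00 per 1M CF requests
    lambda_requests = misses / 1e6 * 0.20         # $0.20 per 1M invocations
    lambda_compute = misses * memory_gb * duration_s * 0.0000166667
    return cf_requests + lambda_requests + lambda_compute
```

Every point of cache hit ratio shaves Lambda (and, in reality, ALB LCU) cost; CloudFront's per-request charge is the floor the hit ratio can't touch.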

| Hidden Cost Factor | Impact | Notes |
| --- | --- | --- |
| CloudFront caching | Reduces origin costs 50–90% | Every cache hit eliminates ALB and Lambda charges entirely |
| ALB health checks | $0.50–2/month per target group | Each health check invokes Lambda; use longer intervals |
| Provisioned Concurrency | $0.0000041667/GB-second | Eliminates cold starts but adds steady-state cost |
| CloudFront real-time logs | $0.01 per 1M log lines | Optional; standard logs are free but delayed |
| WAF rules | $5/month per web ACL + $1/rule + $0.60/M requests | Same cost regardless of attachment point |
| Data transfer (ALB to Lambda) | Free | Lambda invocation within the same region incurs no data transfer charge |

Comparison With Alternatives

Five distinct paths to invoke a Lambda function over HTTP now. Different trade-offs on each one.

```mermaid
flowchart LR
  subgraph P1["Path 1: REST API"]
    direction LR
    C1[Client] --> APIR["API Gateway REST API"] --> L1[Lambda]
  end
  subgraph P2["Path 2: HTTP API"]
    direction LR
    C2[Client] --> APIH["API Gateway HTTP API"] --> L2[Lambda]
  end
  subgraph P3["Path 3: CloudFront + ALB"]
    direction LR
    C3[Client] --> CF1[CloudFront] --> ALB1[ALB] --> L3[Lambda]
  end
  subgraph P4["Path 4: Function URL"]
    direction LR
    C4[Client] --> FU1["Function URL"] --> L4[Lambda]
  end
  subgraph P5["Path 5: CloudFront + Function URL"]
    direction LR
    C5[Client] --> CF2[CloudFront] --> FU2["Function URL"] --> L5[Lambda]
  end
```

Five Lambda HTTP invocation paths
| Capability | REST API | HTTP API | CF + ALB | Function URL | CF + Function URL |
| --- | --- | --- | --- | --- | --- |
| Max timeout | 29s | 30s | 15 min | 15 min | 15 min |
| Max request payload | 10 MB | 10 MB | 1 MB | 6 MB | 6 MB |
| Max response payload | 10 MB | 10 MB | 1 MB | 6 MB | 6 MB |
| Response streaming | No | No | No | Yes | Yes |
| WebSocket | Yes (dedicated type) | No | No | No | No |
| Built-in caching | Yes (REST only) | No | Yes (CloudFront) | No | Yes (CloudFront) |
| WAF support | Yes (auto-CF on edge) | No | Yes (CF and/or ALB) | No | Yes (CloudFront) |
| DDoS protection | Shield Standard | None | Shield Standard (CF) | None | Shield Standard (CF) |
| Custom domain + TLS | Yes | Yes | Yes | Yes (partial) | Yes |
| Auth offloading | Authorizers, IAM, Cognito | Authorizers, IAM, JWT | ALB Cognito/OIDC | IAM only | IAM only |
| Request transformation | Yes (VTL mapping) | No | No | No | No |
| Request validation | Yes | No | No | No | No |
| Multi-function routing | Yes | Yes | Yes | No (single function) | No (single function) |
| Per-route throttling | Yes | No | No | No | No |
| Usage plans / API keys | Yes | No | No | No | No |
| Regional throttle limit | 10K RPS (shared) | 10K RPS (shared) | None | None | None |
| mTLS | Yes | Yes | Yes (CF to ALB) | No | No |
| gRPC | No | Yes | No (Lambda targets) | No | No |
| OpenAPI import | Yes | Yes | No | No | No |
| Canary deployments | Yes | No | Weighted target groups | Alias routing | Alias routing |
| Pricing model | Per-request | Per-request | LCU + fixed + per-request (CF) | Per-request (Lambda only) | Per-request (Lambda + CF) |

When to use each path:

  • API Gateway REST API. You need request validation, VTL transformations, usage plans, or WebSocket support. Keep volume under 100M requests/month or the bill gets ugly.
  • API Gateway HTTP API. Simplest, cheapest API Gateway option. JWT auth built in. Good default for greenfield.
  • CloudFront + ALB. Multi-function routing, caching, WAF, auth offloading, no timeout or throttle limits. Volume needs to justify the ALB fixed cost.
  • Function URL. Single function, direct HTTP, minimal overhead. Webhooks. Internal service-to-service. Streaming. Dead simple.
  • CloudFront + Function URL. Single function needing edge caching, WAF, or a custom domain. No multi-function routing. Simpler than ALB. No fixed cost.

Cold Starts and Performance

Straightforward. Also unforgiving. ALB invokes Lambda synchronously and waits. Cold start takes 3 seconds? User waits 3 seconds. No automatic retry on cold start latency. Same as API Gateway there. Health checks add a wrinkle, though.

Here's what burned me in production. Lambda goes cold. Health check fires. Times out because the cold start exceeds the timeout window. Enough consecutive failures and ALB marks the target unhealthy. Stops routing traffic. 503 errors until subsequent checks succeed. But the function needs traffic to stay warm. And the ALB stopped sending traffic because the function was cold. Deadlock.

Three mitigations:

  • Increase the health check timeout to at least 10 seconds. Unhealthy threshold to 3. Give the function time to finish a cold start before ALB gives up.
  • Increase the health check interval to 60–300 seconds. Fewer invocations, less cold start exposure. Yes, the function is more likely to be cold when a check arrives. Acceptable trade-off. Real user traffic keeps functions warm. Health checks don't.
  • Use Provisioned Concurrency. Eliminates cold starts entirely. Keeps a specified number of execution environments initialized and ready. Health checks always hit a warm function. Problem solved.
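Provisioned Concurrency itself is a single put_provisioned_concurrency_config call, sketched here alongside a cost helper that uses the rate from the hidden-cost table. Function name and alias are placeholders; PC attaches to a published version or alias, never $LATEST.

```python
def provisioned_concurrency_monthly_cost(environments, memory_gb=0.125,
                                         rate=0.0000041667):
    # Steady-state cost of keeping N environments warm all month
    # (730 hours), at the per-GB-second Provisioned Concurrency rate.
    seconds_per_month = 730 * 3600
    return environments * memory_gb * seconds_per_month * rate

def enable_provisioned_concurrency(function_name, alias, environments):
    import boto3  # imported here so the cost helper stays testable offline
    boto3.client("lambda").put_provisioned_concurrency_config(
        FunctionName=function_name,
        Qualifier=alias,  # a version or alias, not $LATEST
        ProvisionedConcurrentExecutions=environments,
    )
```

Five warm 128 MB environments run well under $10 a month — usually a bargain against the health-check flapping they prevent.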

Connection draining with Lambda targets is simpler than EC2. Deregister a Lambda from a target group during deployment and the ALB just completes in-flight invocations. No persistent connections to drain. Set deregistration delay slightly longer than your function's max execution time. 30–60 seconds for most API workloads.

Security Architecture

Five security layers. Each one addresses a different threat category.

```mermaid
flowchart TD
  CF["CloudFront Edge: TLS termination, geo-restriction, Shield Standard DDoS protection"]
  WAF["AWS WAF: rate limiting, IP reputation, managed rule groups, bot control"]
  SG["Security Group: CloudFront managed prefix list blocks all non-CloudFront traffic"]
  AUTH["ALB Authentication: Cognito user pool or OIDC provider integration"]
  LAMBDA["Lambda Function: IAM execution role, least-privilege permissions, optional VPC placement"]
  CF --> WAF
  WAF --> SG
  SG --> AUTH
  AUTH --> LAMBDA
```

Security layers from edge to compute

Preventing ALB bypass. Most critical security concern here. Your ALB has a public DNS name by default. Anyone can call it directly. Skip CloudFront, skip WAF, skip geo-restrictions, skip all edge security. I've watched pen testers find exposed ALBs within minutes. Two layers of defense.

First: configure the ALB's security group to allow inbound traffic only from the CloudFront managed prefix list (com.amazonaws.global.cloudfront.origin-facing). AWS-managed list of IP ranges. Only CloudFront can reach your ALB at the network layer.

Second: add a custom origin header in CloudFront origin configuration (X-Origin-Verify: <secret-value>) and validate it in your ALB listener rules or Lambda. Even if an attacker routes traffic from CloudFront IP ranges somehow, they can't forge the secret header. Rotate periodically with Secrets Manager.
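A sketch of that validation in the Lambda handler. The x-origin-verify header name and the accept-current-or-previous rotation window are conventions from this article, not an AWS feature; note that ALB lowercases header names in the event.

```python
import hmac

def is_from_cloudfront(event, valid_secrets):
    supplied = event.get("headers", {}).get("x-origin-verify", "")
    # Constant-time comparison against both the current and the
    # previous secret, so rotation doesn't break in-flight traffic.
    return any(hmac.compare_digest(supplied, secret) for secret in valid_secrets)

def handler(event, context):
    # In production, load the secrets from Secrets Manager and cache
    # them across invocations; the literals here are placeholders.
    if not is_from_cloudfront(event, ["current-secret", "previous-secret"]):
        return {"statusCode": 403, "isBase64Encoded": False,
                "headers": {"Content-Type": "text/plain"}, "body": "Forbidden"}
    return {"statusCode": 200, "isBase64Encoded": False,
            "headers": {"Content-Type": "text/plain"}, "body": "ok"}
```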

ALB native authentication. Underappreciated feature. ALB authenticates users against Cognito user pools or any OIDC provider (Okta, Auth0, Azure AD) before the request touches Lambda. Handles the full OAuth 2.0 / OIDC flow: redirect to provider, token exchange, token validation. Passes authenticated claims to Lambda in the x-amzn-oidc-data header. No auth logic in your function code. Entire class of security bugs gone.
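Reading those claims is a few lines. The x-amzn-oidc-data header is a JWT; this sketch decodes the payload only — production code must verify the ES256 signature against ALB's regional public-key endpoint before trusting anything in it.

```python
import base64
import json

def oidc_claims(event):
    # ALB lowercases header names. The JWT payload is the middle
    # dot-separated segment; restore base64 padding before decoding.
    token = event["headers"]["x-amzn-oidc-data"]
    payload = token.split(".")[1]
    padded = payload + "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

Typical claims include sub, email, and whatever scopes the provider issued — enough for per-user logic without any auth code of your own.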

| Permission Relationship | Resource | Principal | Purpose |
| --- | --- | --- | --- |
| CloudFront to ALB | ALB listener | CloudFront (via prefix list + custom header) | Allow CloudFront to forward requests to ALB |
| ALB to Lambda | Lambda function resource policy | elasticloadbalancing.amazonaws.com | Allow ALB to invoke the Lambda function |
| Lambda execution role | AWS services (DynamoDB, S3, etc.) | Lambda function | Allow Lambda to access backend resources |
| WAF to CloudFront | CloudFront distribution | WAF web ACL | Attach WAF rules to the distribution |
| ALB to Cognito | Cognito user pool | ALB | Allow ALB to authenticate users against Cognito |

Limitations and Gotchas

Constraints. Several of them. Design around them from the start or pay later.

| Limitation | Constraint | API Gateway Equivalent | Impact |
| --- | --- | --- | --- |
| Request payload | 1 MB | 10 MB | The single biggest constraint; affects file uploads, large JSON bodies |
| Response payload | 1 MB | 10 MB | Limits large API responses; pagination becomes mandatory |
| No WebSocket | Lambda targets only support HTTP | REST API supports WebSocket API type | Use API Gateway WebSocket API or AppSync for real-time |
| No response streaming | ALB waits for complete Lambda response | API Gateway also does not stream | Use Function URLs if streaming is required |
| No request validation | Must validate in Lambda code | REST API has built-in JSON schema validation | Move validation to Lambda or a shared middleware layer |
| No request transformation | Must transform in Lambda code | REST API has VTL mapping templates | Implement in Lambda; arguably cleaner than VTL anyway |
| No usage plans or API keys | Not available | REST API has full usage plan support | Implement in Lambda or use a third-party API management layer |
| No OpenAPI import | Must configure listener rules manually or via IaC | Both REST and HTTP APIs support OpenAPI import | Use CloudFormation, CDK, or Terraform for rule management |
| No per-route throttling | ALB does not throttle | REST API supports per-method throttling | Implement in Lambda or use WAF rate-based rules per path |
| No built-in canary | Must use weighted target groups | REST API supports canary deployments | Use ALB weighted target groups with Lambda aliases |
Note: The 1 MB payload limit is the most impactful constraint. Both request and response bodies are limited to 1 MB when Lambda is an ALB target. This is a hard limit that cannot be increased. If your API accepts file uploads, returns large datasets, or handles substantial JSON payloads, you must design around this: use presigned S3 URLs for file uploads, implement pagination for large responses, and compress payloads where possible. If the 1 MB limit is a deal-breaker for even a single endpoint, consider routing that specific path through API Gateway or a Function URL while keeping the rest of your API on ALB.
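The presigned-URL pattern in sketch form: the API hands back a short-lived S3 upload URL instead of accepting the bytes through ALB. The bucket name is a placeholder, and the size check shows where the 1 MB decision belongs.

```python
MAX_ALB_BODY = 1_000_000  # approximate ALB Lambda payload ceiling, in bytes

def needs_presigned_upload(content_length, limit=MAX_ALB_BODY):
    # Decide early, from Content-Length, whether the body can travel
    # through ALB at all or must go straight to S3.
    return content_length > limit

def upload_url(bucket, key, expires_in=300):
    import boto3  # imported here so the size check stays testable offline
    return boto3.client("s3").generate_presigned_url(
        "put_object",
        Params={"Bucket": bucket, "Key": key},
        ExpiresIn=expires_in,
    )
```

The client PUTs directly to the presigned URL; Lambda only ever sees the small JSON envelope, never the payload.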

Multi-value headers. Another quiet failure. Forget to enable multi-value header support on the target group and Lambda silently receives only the last value of any repeated header. Missing cookies. Broken CORS when Access-Control-Allow-Origin appears in both origin and ALB. Wrong query parameters. Fix takes 30 seconds (flip the flag). Finding the problem takes hours. Nothing errors out. Data just vanishes.

Response streaming. Doesn't exist for Lambda behind ALB. ALB buffers the complete Lambda response before sending anything to the client. Need server-sent events? Chunked transfer encoding? Lambda response streaming? Function URLs with InvokeWithResponseStream. That's your only option.

Infrastructure as Code

More AWS resources than API Gateway. That's the trade-off. Each resource is independently manageable. The total count is just higher. Plan your IaC accordingly.

| Resource | Purpose | Key Configuration |
| --- | --- | --- |
| CloudFront distribution | Edge caching, WAF, TLS termination | Origin pointing to ALB DNS name, cache behaviors per path pattern |
| CloudFront origin request policy | Controls which headers/cookies reach ALB | Forward only necessary headers to maximize cache hit ratio |
| CloudFront cache policy | Defines cache key composition | Varies per path: caching for reads, disabled for writes |
| WAF web ACL | Request filtering and rate limiting | Attach to CloudFront distribution; managed rule groups recommended |
| ALB | Request routing to Lambda targets | Internet-facing or internal; HTTPS listener with ACM certificate |
| ALB listener | TLS termination and rule evaluation | HTTPS on port 443; default action returns 404 or routes to catch-all |
| ALB listener rules | Path/host routing to target groups | One rule per route pattern; priority ordering matters |
| ALB target groups | Lambda function registration | One target group per Lambda function; multi-value headers enabled |
| Security group | Network-level access control | Inbound HTTPS from CloudFront managed prefix list only |
| Lambda functions | Business logic | Handler must return ALB-compatible response format |
| Lambda resource policies | Authorization for ALB invocation | Allow elasticloadbalancing.amazonaws.com to invoke each function |

Compare that to API Gateway. Single AWS::ApiGateway::RestApi or AWS::ApiGatewayV2::Api resource gives you routing, throttling, TLS. Done.

With this pattern you manage the routing layer yourself. Full control over listener rules, target group weights, health check parameters. All independently modifiable without redeploying your entire API. I prefer that control. More resources to wrangle in CloudFormation or Terraform, though. No getting around it.

Common Failure Modes

| Failure Mode | Symptom | Root Cause | Mitigation |
| --- | --- | --- | --- |
| ALB returns 502 | Bad Gateway error to client | Lambda function returned an invalid response (wrong JSON structure, missing statusCode) | Validate Lambda response format; log raw responses during development |
| ALB returns 503 | Service Unavailable | All Lambda targets marked unhealthy; or Lambda throttled by concurrency limit | Increase health check tolerance; increase Lambda reserved concurrency |
| ALB returns 504 | Gateway Timeout | Lambda execution exceeded ALB idle timeout (default 60s) | Increase ALB idle timeout; optimize Lambda execution time |
| Intermittent 403 from CloudFront | Forbidden errors on some requests | WAF rule blocking legitimate traffic; or custom origin header mismatch after rotation | Review WAF logs; coordinate header rotation between CloudFront and ALB |
| Cold start health check failures | Target flaps between healthy and unhealthy | Health check timeout shorter than cold start duration | Increase health check timeout and unhealthy threshold; use Provisioned Concurrency |
| Missing cookies in response | Set-Cookie headers lost | Multi-value headers not enabled on target group | Enable multi_value_headers.enabled on the target group |
| Cache serving stale data | Users see outdated content after Lambda update | CloudFront TTL longer than acceptable staleness | Use cache invalidation on deploy; or use short TTLs with stale-while-revalidate |
| CloudFront bypass | Attackers call ALB directly, skipping WAF | Security group allows traffic from non-CloudFront IPs | Restrict security group to CloudFront managed prefix list; validate custom origin header |
| 1 MB payload rejection | Large requests or responses fail silently | Request or response exceeds ALB Lambda payload limit | Implement presigned URL pattern for large payloads; add payload size validation early in the request |

Key Architectural Recommendations

Built and operated this pattern across multiple production systems. Ten recommendations I keep coming back to.

  1. Use this pattern when monthly volume exceeds 30 million requests and you need routing, caching, or WAF. Below that? API Gateway HTTP API. Simpler. Cheaper. Above it, ALB's LCU pricing and CloudFront caching deliver real savings.
  2. Always put CloudFront in front of the ALB. ALB serving Lambda without CloudFront gives you routing. That's it. No caching. No WAF. No DDoS protection. No edge TLS. CloudFront adds all of that. Per-request cost often offset by cache hits alone.
  3. Lock down the ALB. CloudFront managed prefix list plus custom origin header. Security group blocks network-level bypass. Custom header blocks application-level bypass. Both. Always.
  4. Enable multi-value headers on every Lambda target group. Zero downside. Silently dropping duplicate headers will cost you hours of debugging. I know this firsthand.
  5. Health check intervals: 60 seconds or longer for Lambda targets. Default 35-second interval generates unnecessary invocations. Increases cold start exposure. Real traffic keeps functions warm. Health checks don't.
  6. Provisioned Concurrency for latency-sensitive endpoints. Eliminates cold starts. Prevents health check flapping. Predictable performance. Cost is modest relative to reliability improvement. On-call engineers will thank you.
  7. Design for the 1 MB payload limit from day one. Presigned S3 URLs for uploads. Pagination for large responses. Compressed payloads. Don't discover this limit at 2 AM in production.
  8. Consolidate APIs onto a single ALB. Host-based and path-based routing. Pay the ALB fixed cost once. Multiple target groups cost nothing extra beyond LCU consumption. Separate API Gateways per service? Costs significantly more.
  9. CloudFront + Function URLs for single-function use cases. One Lambda function needing edge caching and WAF? No multi-function routing needed? Skip the ALB. Simpler. No fixed cost.
  10. Three metrics to monitor: CloudFront cache hit ratio, ALB target response time, Lambda concurrent executions. Cache hit ratio tells you origin traffic savings. Target response time reveals cold starts. Concurrent executions warns you before hitting Lambda's regional concurrency limit. Everything else is secondary.

Let's Build Something!

I help teams ship cloud infrastructure that actually works at scale. Whether you're modernizing a legacy platform, designing a multi-region architecture from scratch, or figuring out how AI fits into your engineering workflow, I've seen your problem before. Let me help.

Currently taking on select consulting engagements through Vantalect.