About the author: I'm Charles Sieg, a cloud architect and platform engineer who builds apps, services, and infrastructure for Fortune 1000 clients through Vantalect. If your organization is rethinking its software strategy in the age of AI-assisted engineering, let's talk.
Having spent years packaging Lambda functions as zip archives, I hit the wall that every team eventually hits: the 250 MB deployment package limit. The first time it happened was an ML inference function with a PyTorch model and its dependency tree. We burned weeks trying to strip binaries, use Lambda Layers creatively, and shave megabytes from scipy. When AWS launched container image support for Lambda in December 2020, it raised the size ceiling to 10 GB and fundamentally changed how I think about Lambda packaging, base image standardization, CI/CD pipelines, and the boundary between serverless and container workloads. Container images let you use the same Dockerfile, the same build toolchain, and the same base image across Lambda, ECS, and Fargate, which eliminates an entire category of "works in my container but not in Lambda" problems.
This article is an architecture reference for engineers and architects who need to understand how Lambda container images work under the hood, how to design images that are fast, secure, and cost-effective, and how to build CI/CD pipelines that deploy them reliably. It covers the architecture, trade-offs, performance characteristics, and operational patterns that inform how I design container-based Lambda workloads in production.
Why Container Images for Lambda
The zip deployment model served Lambda well for years, but it imposes constraints that become painful as functions grow in complexity. The 250 MB limit (50 MB for direct upload, 250 MB unzipped from S3) excludes many legitimate workloads: ML models, scientific computing libraries, functions with large native dependency trees, and custom runtimes with embedded interpreters. Lambda Layers help but introduce their own complexity. You are limited to five layers, each layer counts against the 250 MB total, and layer versioning creates a matrix of compatibility that is tedious to manage.
Container images raise the ceiling to 10 GB and bring the entire Docker ecosystem to Lambda. You get multi-stage builds for minimal images, layer caching for fast rebuilds, the same Dockerfile across Lambda and ECS, and the ability to run and debug locally with the exact same image that runs in production.
For small, simple functions, zip archives remain the right choice.
| Dimension | Zip Archive | Container Image |
|---|---|---|
| Maximum size | 250 MB (unzipped) | 10 GB |
| Artifact format | .zip file in S3 or direct upload | OCI image in ECR |
| Cold start | Generally faster for small packages | Comparable; optimized by Lambda's chunk caching |
| Supported runtimes | AWS-managed runtimes only | Any runtime: bring your own |
| Lambda Layers | Up to 5 layers | Not supported (use Docker layers instead) |
| Local testing | Requires SAM CLI or manual invocation | docker run with RIE (identical to production) |
| Build toolchain | zip, SAM, CDK asset bundling | Docker/Buildx, any CI/CD system |
| Registry | S3 (managed by Lambda) | ECR (you manage) |
| Image sharing | Not applicable | Same base image across Lambda, ECS, Fargate |
| Versioning | S3 object versioning or CodeUri hash | Image tags and SHA256 digests |
| Rollback | Redeploy previous zip | Point Lambda to previous image digest |
| Pricing | Lambda compute only | Lambda compute + ECR storage ($0.10/GB/month) |
The decision framework: use container images when your deployment package exceeds 250 MB, when you need a custom runtime, when you want to share base images across compute platforms, or when your team already has Docker-based CI/CD and wants a unified build pipeline. Use zip archives when your function is small, your dependencies are minimal, and you value the simplicity of sam deploy or inline code editing in the console.
Architecture Internals
Lambda's internal handling of container images drives the performance characteristics and operational behaviors you observe in production. Lambda performs a one-time optimization when you create or update a function, and then uses a caching system to assemble the runtime environment quickly rather than pulling the full container image from ECR on every cold start.
When you update a Lambda function's image URI, the Lambda service pulls the image from ECR, decompresses it, encrypts the layers, and breaks them into small chunks. These chunks are stored in a Lambda-managed cache distributed across the fleet of workers in the region. On cold start, the worker assembles only the chunks it needs (starting with the handler and its immediate dependencies) rather than downloading the entire image. A 2 GB image therefore does not necessarily produce a 2 GB cold start penalty; Lambda loads chunks on demand and caches them at the fleet level.
The Runtime Interface Client (RIC) is the critical piece that makes a container image compatible with Lambda. The RIC implements the Lambda Runtime API, the HTTP-based protocol that Lambda workers use to send invocation events to your function and receive responses. AWS provides RIC implementations for Python, Node.js, Java, .NET, Go, Ruby, C++, and Rust. When you use an AWS base image, the RIC is pre-installed. When you build from scratch, you must install the RIC yourself.
The Runtime Interface Emulator (RIE) is a companion tool for local testing. It emulates the Lambda Runtime API on your development machine, allowing you to invoke your containerized function with curl and receive the same JSON response format you would get in production. The RIE is included in AWS base images and can be added to custom images for local development.
When you update a Lambda function to point to a new image (or a new digest behind the same tag), warm execution environments continue running the old image until they are recycled. Lambda does not terminate warm environments to pick up the new image. New cold starts use the new image, but you may observe both old and new versions serving traffic simultaneously during the transition period. I recommend deploying via Lambda aliases with CodeDeploy traffic shifting rather than updating the function directly.
graph TD
A[Docker Build] --> B[Push to ECR]
B --> C[Update Lambda Function Config]
C --> D[Lambda Pulls Image from ECR]
D --> E[Decompress, Encrypt, and Chunk Layers]
E --> F[Store Chunks in Regional Cache]
F --> G["Cold Start: Assemble from Cached Chunks"]
G --> H[RIC Initializes Runtime]
H --> I[Handler Invoked]

| Runtime | RIC Package | Install Command |
|---|---|---|
| Python | awslambdaric | pip install awslambdaric |
| Node.js | aws-lambda-ric | npm install aws-lambda-ric |
| Java | aws-lambda-java-runtime-interface-client | Maven/Gradle dependency |
| .NET | Amazon.Lambda.RuntimeSupport | NuGet package |
| Go | aws-lambda-go | go get github.com/aws/aws-lambda-go |
| Ruby | aws_lambda_ric | gem install aws_lambda_ric |
| C++ | aws-lambda-cpp | CMake build from source |
| Rust | lambda_runtime | cargo add lambda_runtime |
AWS Base Images vs. Custom Images
AWS publishes official Lambda base images in ECR Public Gallery at public.ecr.aws/lambda/. These images include the Lambda runtime, the RIC, the RIE, and a minimal Amazon Linux operating system. They are the fastest path to a working container-based Lambda function and the approach I recommend for most teams starting out.
| Runtime | Image URI | Approximate Size | OS |
|---|---|---|---|
| Python 3.12 | public.ecr.aws/lambda/python:3.12 | ~580 MB | Amazon Linux 2023 |
| Python 3.13 | public.ecr.aws/lambda/python:3.13 | ~590 MB | Amazon Linux 2023 |
| Node.js 20 | public.ecr.aws/lambda/nodejs:20 | ~490 MB | Amazon Linux 2023 |
| Node.js 22 | public.ecr.aws/lambda/nodejs:22 | ~500 MB | Amazon Linux 2023 |
| Java 21 | public.ecr.aws/lambda/java:21 | ~620 MB | Amazon Linux 2023 |
| .NET 8 | public.ecr.aws/lambda/dotnet:8 | ~560 MB | Amazon Linux 2023 |
| Ruby 3.3 | public.ecr.aws/lambda/ruby:3.3 | ~510 MB | Amazon Linux 2023 |
Building from a custom base image gives you full control over the operating system, system libraries, and image size. This is the right choice when you need a specific Linux distribution for compliance reasons, need to minimize image size, or need system-level packages that are not available in Amazon Linux. The trade-off: you are responsible for installing and maintaining the RIC and for applying OS patches that AWS otherwise handles automatically.
| Dimension | AWS Base Image | Custom Base Image |
|---|---|---|
| Maintenance | AWS patches the OS and runtime | You patch everything |
| RIC included | Yes | You install it |
| RIE included | Yes | You install it |
| OS choice | Amazon Linux 2023 only | Any Linux distribution |
| Size control | Limited: base image is fixed | Full control via multi-stage builds |
| Compliance | AWS-managed, SOC2/HIPAA eligible | You validate compliance |
| Cold start | AWS-optimized | Depends on your optimization |
| Native libraries | Amazon Linux packages | Any packages you install |
| FIPS 140-2 | Not available in base images | Can use FIPS-validated OS |
Example: Dockerfile using AWS base image (Python)
FROM public.ecr.aws/lambda/python:3.12
COPY requirements.txt ${LAMBDA_TASK_ROOT}
RUN pip install -r requirements.txt --target "${LAMBDA_TASK_ROOT}"
COPY app.py ${LAMBDA_TASK_ROOT}
CMD ["app.handler"]
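The CMD above points at app.handler, meaning the handler function in app.py. A minimal handler that would satisfy it might look like this (the greeting logic and event shape are purely illustrative):

```python
import json

def handler(event, context):
    # Lambda invokes this with the parsed event payload; the RIC handles
    # the HTTP plumbing between the Lambda worker and this function.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```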
Example: Dockerfile from custom base image (Python)
FROM python:3.12-slim
RUN pip install awslambdaric
WORKDIR /var/task
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY app.py .
ENTRYPOINT ["python", "-m", "awslambdaric"]
CMD ["app.handler"]
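Either image can be exercised locally before it ever touches ECR. A sketch using the RIE (the my-func:local tag is arbitrary; the AWS base image variant works as-is because the RIE is bundled, while the custom-base variant needs the RIE added first):

```shell
# Build and run the image locally; AWS base images ship with the RIE,
# which listens on port 8080 inside the container.
docker build -t my-func:local .
docker run --rm -p 9000:8080 my-func:local

# From another terminal, invoke the handler through the emulated Runtime API.
curl -X POST \
  "http://localhost:9000/2015-03-31/functions/function/invocations" \
  -d '{"name": "local-test"}'
```

The curl response is the same JSON your handler would return in production, which makes this loop useful for smoke tests in CI as well as local debugging.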
Building Lambda Container Images
The ENTRYPOINT and CMD instructions in your Dockerfile have specific meaning for Lambda container images. The ENTRYPOINT specifies the executable that implements the Runtime Interface Client, and CMD specifies the handler function in module.handler format. When using AWS base images, the ENTRYPOINT is pre-configured; you only need to set CMD. When using custom base images, you must set both.
Multi-stage builds are essential for keeping images small. A common pattern is to use a full build image (with compilers, headers, and build tools) to compile native extensions, then copy only the compiled artifacts into a minimal runtime image. This can reduce image size by 50–80% compared to a single-stage build.
Layer ordering matters for Docker build cache efficiency. Docker invalidates the cache for a layer and all subsequent layers when any file in a COPY instruction changes. By copying dependency files (like requirements.txt or package.json) before copying application code, you ensure that the expensive dependency installation step is cached across builds where only application code changes.
The /var/task working directory is a Lambda convention. Both AWS base images and the Lambda execution environment expect your function code to be in /var/task. While you can use a different directory with custom images, sticking with the convention avoids subtle issues with relative path resolution.
| Best Practice | Rationale | Impact |
|---|---|---|
| Use multi-stage builds | Exclude build tools, headers, and intermediate artifacts from runtime image | 50–80% size reduction |
| Order layers by change frequency | Copy dependency manifests before source code | Faster rebuilds via cache hits |
| Pin base image tags | Use python:3.12.4-slim not python:3.12-slim | Reproducible builds |
| Use .dockerignore | Exclude tests, docs, .git, __pycache__ | Smaller build context, faster uploads |
| Set WORKDIR /var/task | Match Lambda's expected function directory | Consistent path resolution |
| Minimize layer count | Combine related RUN commands with && | Smaller image, fewer cache layers |
| Remove package manager caches | Add --no-cache-dir (pip) or npm cache clean | 50–200 MB savings |
| Use --target for build artifacts | Install Python packages to ${LAMBDA_TASK_ROOT} | Clean separation of runtime deps |
Example: Optimized multi-stage Python Dockerfile
FROM python:3.12-slim AS builder
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt --target /build/deps
FROM public.ecr.aws/lambda/python:3.12
COPY --from=builder /build/deps ${LAMBDA_TASK_ROOT}
COPY src/ ${LAMBDA_TASK_ROOT}
CMD ["app.handler"]
Example: Optimized Node.js Dockerfile
FROM public.ecr.aws/lambda/nodejs:20
COPY package.json package-lock.json ${LAMBDA_TASK_ROOT}/
RUN npm ci --production
COPY src/ ${LAMBDA_TASK_ROOT}/
CMD ["app.handler"]
ECR as the Lambda Image Registry
Lambda container images must be stored in Amazon Elastic Container Registry. This is a hard requirement. Lambda cannot pull images from Docker Hub, GitHub Container Registry, or any other registry. Your CI/CD pipeline must push images to ECR, and your Lambda function must reference an ECR image URI.
Image tag strategy matters more than most teams realize. Lambda resolves an image tag to a specific SHA256 digest at the time you create or update the function configuration. If you subsequently push a new image to the same tag (a mutable tag like latest or v1), Lambda continues running the old digest until you explicitly update the function. Mutable tags therefore provide no automatic rollout and create confusion about which version is actually deployed. I recommend using immutable tags (ECR supports an immutable tag policy) or deploying by digest (123456789.dkr.ecr.us-east-1.amazonaws.com/my-func@sha256:abc123...).
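A pipeline can recover the digest-form URI from the RepoDigests field of docker inspect output. The parser below is a hypothetical helper, but the repo@sha256:... format it handles is what ECR reports:

```python
def digest_uri_from_repo_digest(repo_digest: str) -> tuple[str, str]:
    # RepoDigests entries look like:
    #   123456789.dkr.ecr.us-east-1.amazonaws.com/my-func@sha256:abc123...
    repo, sep, digest = repo_digest.partition("@")
    if sep != "@" or not digest.startswith("sha256:"):
        raise ValueError(f"not a digest reference: {repo_digest!r}")
    return repo, digest

repo, digest = digest_uri_from_repo_digest(
    "123456789.dkr.ecr.us-east-1.amazonaws.com/my-func@sha256:abc123"
)
```

Deploying `repo + "@" + digest` instead of a tag makes the deployment immune to later pushes against the same tag.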
ECR lifecycle policies are essential for cost control. Without them, every image you push accumulates at $0.10/GB/month. A team pushing 500 MB images daily will accumulate 15 GB/month, costing $1.50/month. That amount seems small, but it compounds across dozens of functions and environments. A sensible lifecycle policy retains the last N tagged images and expires untagged images after a few days.
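A policy along those lines, expressed in ECR's lifecycle policy JSON format (the v tag prefix is an assumption about your tagging scheme; adjust it to match your pipeline):

```json
{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Expire untagged images after 3 days",
      "selection": {
        "tagStatus": "untagged",
        "countType": "sinceImagePushed",
        "countUnit": "days",
        "countNumber": 3
      },
      "action": { "type": "expire" }
    },
    {
      "rulePriority": 2,
      "description": "Keep only the last 10 tagged images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["v"],
        "countType": "imageCountMoreThan",
        "countNumber": 10
      },
      "action": { "type": "expire" }
    }
  ]
}
```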
| Setting | Recommendation | Rationale |
|---|---|---|
| Tag immutability | Enabled | Prevents accidental overwrite; deploy by digest for safety |
| Image scanning on push | Enabled | Catches known vulnerabilities before deployment |
| Lifecycle policy | Keep last 10 tagged images; expire untagged after 3 days | Controls storage costs |
| Encryption | AWS-managed KMS key (default) or customer-managed CMK | At-rest encryption; CMK for compliance |
| Cross-region replication | Enable for multi-region Lambda deployments | Images available in each region Lambda runs |
| Repository policy | Grant lambda.amazonaws.com ECR pull access | Required for Lambda to pull images |
| Pull-through cache | Not needed for Lambda | Lambda only pulls from ECR directly |
ECR offers two scanning modes. Basic scanning uses the open-source Clair project and scans for OS package vulnerabilities. Enhanced scanning uses Amazon Inspector and adds programming language package scanning (pip, npm, Maven, etc.) plus continuous monitoring that re-scans images when new CVEs are published.
| Dimension | Basic Scanning | Enhanced Scanning |
|---|---|---|
| Engine | Clair (open source) | Amazon Inspector |
| OS package CVEs | Yes | Yes |
| Language package CVEs | No | Yes (pip, npm, Maven, Go, .NET) |
| Scan trigger | On push or manual | Continuous: re-scans on new CVEs |
| Cost | Free | Inspector pricing ($0.09/image/month for continuous) |
| Findings format | ECR native | Inspector findings + Security Hub |
| Severity levels | Critical, High, Medium, Low, Informational | Same, with CVSS scoring |
Cold Start Performance
Cold starts are the primary concern teams raise when evaluating container images for Lambda. The performance gap between zip and container deployments has narrowed since the feature launched. Lambda's chunk-based caching system means that cold start time is not linearly proportional to image size. In practice, I see container cold starts ranging from 200 ms to 2 seconds for typical workloads, with image size being the dominant (but not the only) factor.
Lambda's cold start sequence for container images involves six phases: downloading cached image chunks to the worker, setting up the container filesystem, initializing the runtime environment, starting the RIC, executing your function's initialization code (module-level imports, global variable setup), and finally invoking the handler. The chunk download phase is where container images differ from zip, but Lambda's caching means that popular images (including your own, after the first invocation in a region) are already cached on the worker fleet.
graph LR
A[Chunk Download] --> B[Container Setup] --> C[Runtime Init] --> D[RIC Startup] --> E[Function Init Code] --> F[Handler Ready]

| Runtime | Image Size | Cold Start (p50) | Cold Start (p99) | Notes |
|---|---|---|---|---|
| Python 3.12 | ~580 MB (base only) | ~300 ms | ~800 ms | AWS base image, no additional deps |
| Python 3.12 | ~1.2 GB (with numpy/pandas) | ~600 ms | ~1.5 s | Common data science stack |
| Python 3.12 | ~3 GB (with PyTorch) | ~1.2 s | ~3 s | ML inference workload |
| Node.js 20 | ~490 MB (base only) | ~250 ms | ~700 ms | AWS base image, no additional deps |
| Node.js 20 | ~800 MB (with dependencies) | ~450 ms | ~1.2 s | Typical API function |
| Java 21 | ~620 MB (base only) | ~3 s | ~8 s | JVM startup dominates; use SnapStart |
| Java 21 + SnapStart | ~620 MB (base only) | ~200 ms | ~500 ms | SnapStart eliminates JVM init |
| .NET 8 | ~560 MB (base only) | ~400 ms | ~1 s | AOT compilation improves this |
| Go (custom) | ~50 MB (Alpine + binary) | ~80 ms | ~200 ms | Compiled binary, minimal runtime |
SnapStart, originally available only for Java 11 and 17 zip deployments, now supports Java 21 container images. It takes a Firecracker microVM snapshot after initialization and restores from that snapshot on cold start, reducing Java cold starts from seconds to milliseconds. If you are running Java on Lambda, SnapStart with containers is the single most impactful optimization available.
Provisioned Concurrency eliminates cold starts entirely by keeping a specified number of execution environments warm. This works identically for zip and container deployments. At $0.0000041667 per GB-second of provisioned concurrency, it is cost-effective for latency-sensitive workloads with predictable traffic patterns.
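The cost is easy to estimate. Keeping 10 environments warm at 1 GB each for a 30-day month, at the provisioned concurrency rate quoted above (invocation compute and request charges are additional):

```python
rate_per_gb_second = 0.0000041667   # provisioned concurrency, per GB-second
environments = 10
memory_gb = 1.0
seconds_per_month = 30 * 24 * 3600  # 2,592,000 seconds

monthly_cost = rate_per_gb_second * environments * memory_gb * seconds_per_month
print(f"${monthly_cost:.2f}/month")  # roughly $108/month
```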
For a detailed look at Lambda networking, invocation patterns, and integration with ALB and CloudFront, see the Lambda Behind ALB Behind CloudFront: An Architecture Deep-Dive.
Multi-Architecture Support
Lambda supports both x86_64 and ARM64 (Graviton2) architectures. ARM64 Lambda functions are priced 20% lower than x86_64 ($0.0000133334 per GB-second versus $0.0000166667) and in my experience deliver comparable or better performance for most workloads. For container-based functions, supporting both architectures requires building multi-architecture images using Docker's buildx and manifest lists.
A multi-architecture image is a single ECR repository tag that resolves to different image digests depending on the requesting platform's architecture. When you configure a Lambda function with arm64 architecture and point it to a multi-arch image tag, Lambda automatically pulls the ARM64 variant. This lets you maintain a single image tag across architectures while Lambda handles the selection.
graph LR
A[Source Code] --> B[x86_64 Build]
A --> C[ARM64 Build]
B --> D[Manifest List]
C --> D
D --> E[Push to ECR]
E --> F[x86_64 Lambda]
E --> G[ARM64 Lambda]

The critical CI/CD consideration is build speed. Building ARM64 images on x86_64 build hosts requires QEMU emulation, which is 5–10x slower than native builds. For production pipelines, I recommend using ARM64-native CodeBuild instances (available since 2023) or a build matrix that runs each architecture on its native hardware.
| Dimension | x86_64 (Intel/AMD) | ARM64 (Graviton2) |
|---|---|---|
| Lambda pricing | $0.0000166667/GB-s | $0.0000133334/GB-s (20% cheaper) |
| Compute performance | Baseline | Comparable; better for some workloads |
| Native build speed | Fast on x86 hosts | Fast on ARM hosts; slow via QEMU |
| CodeBuild support | All compute types | ARM-native instances available |
| Base image availability | All AWS base images | All AWS base images |
| Third-party library support | Universal | Most major libraries; verify native extensions |
| Docker buildx required | Yes (for multi-arch) | Yes (for multi-arch) |
| ECR storage | Per-architecture layers | Per-architecture layers |
| Graviton3 support | Not applicable | Not yet available for Lambda |
| Migration effort | None (default) | Test native extensions; rebuild images |
To build multi-architecture images with buildx:
# Create and use a buildx builder
docker buildx create --name lambda-builder --use
# Build and push both architectures
docker buildx build \
--platform linux/amd64,linux/arm64 \
--tag 123456789.dkr.ecr.us-east-1.amazonaws.com/my-func:v1.0.0 \
--push .
CI/CD Patterns
The canonical CI/CD pipeline for Lambda container images follows a predictable flow: source code triggers a build, the build produces a container image and pushes it to ECR, and a deployment step updates the Lambda function to use the new image. The details (how you trigger builds, how you tag images, how you deploy safely) determine whether this pipeline is robust or fragile.
graph LR
A[GitHub Push] --> B[CodePipeline Source]
B --> C["CodeBuild Build & Push"]
C --> D[ECR Image]
D --> E[CloudFormation or CDK Deploy]
E --> F[Lambda New Version]
F --> G[CodeDeploy Alias Shift]

The most important CI/CD principle for container-based Lambda: always deploy by image digest, never by mutable tag. When CodeBuild pushes a new image, capture the digest from the docker push output and pass it to the deployment step. This guarantees that the Lambda function runs exactly the image that was built and tested, not whatever happens to be tagged latest at deployment time.
| Deployment Strategy | How It Works | Rollback Speed | Risk Level | Use Case |
|---|---|---|---|---|
| All-at-once | Update function image URI directly | Manual revert | Highest | Development and testing |
| Lambda versioning + alias | Publish version, shift alias | Point alias to previous version | Medium | Simple production deployments |
| CodeDeploy Linear10PercentEvery1Minute | Shift 10% traffic per minute | Automatic on alarm | Low | Production APIs with health metrics |
| CodeDeploy Canary10Percent5Minutes | 10% for 5 min, then 100% | Automatic on alarm | Lowest | Production with strict SLAs |
| CodeDeploy AllAtOnce | Shift all traffic immediately via CodeDeploy | Automatic on alarm | Medium | When you want alarm-based rollback without gradual shift |
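If you deploy with AWS SAM, the alias-plus-CodeDeploy pattern in the table above is declarative. A sketch of a canary configuration with alarm-based rollback (FunctionErrorAlarm is a placeholder for a CloudWatch alarm resource defined elsewhere in the template):

```yaml
MyFunction:
  Type: AWS::Serverless::Function
  Properties:
    PackageType: Image
    ImageUri: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-func@sha256:abc123...
    AutoPublishAlias: live
    DeploymentPreference:
      Type: Canary10Percent5Minutes
      Alarms:
        - !Ref FunctionErrorAlarm
```

AutoPublishAlias makes SAM publish a new version on every deploy and shift the alias via CodeDeploy, so warm environments drain onto the new image gradually instead of all at once.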
Example: CodeBuild buildspec for Lambda container image
version: 0.2
env:
variables:
ECR_REPO: 123456789.dkr.ecr.us-east-1.amazonaws.com/my-func
FUNCTION_NAME: my-function
phases:
pre_build:
commands:
- aws ecr get-login-password --region $AWS_DEFAULT_REGION | docker login --username AWS --password-stdin $ECR_REPO
- IMAGE_TAG=$(echo $CODEBUILD_RESOLVED_SOURCE_VERSION | cut -c 1-8)
build:
commands:
- docker build -t $ECR_REPO:$IMAGE_TAG .
- docker push $ECR_REPO:$IMAGE_TAG
- IMAGE_URI=$(docker inspect --format='{{index .RepoDigests 0}}' $ECR_REPO:$IMAGE_TAG)
post_build:
commands:
- aws lambda update-function-code --function-name $FUNCTION_NAME --image-uri $IMAGE_URI
- printf '{"ImageUri":"%s"}' "$IMAGE_URI" > imageDetail.json
artifacts:
files:
- imageDetail.json
For deeper coverage of the build, deployment, and pipeline services referenced here, see the AWS CodeBuild: An Architecture Deep-Dive, the AWS CodeDeploy: An Architecture Deep-Dive, and the AWS CodePipeline: An Architecture Deep-Dive.
Infrastructure as Code
Defining container-based Lambda functions in infrastructure as code differs from zip-based functions in a few important ways. The most notable: you do not specify a runtime or handler in the Lambda function configuration. Both are embedded in the container image via the ENTRYPOINT and CMD instructions. You specify PackageType: Image and the ImageUri. That is all.
Lambda does allow you to override CMD, ENTRYPOINT, and WORKDIR at the function configuration level, which is useful for running different handlers from the same image (for example, separate Lambda functions for API handling and background processing, all built from a single image with multiple handler modules).
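For example, a single image might ship one module with two entry points (the module and handler names here are hypothetical); one Lambda function overrides CMD to ["handlers.api_handler"], another to ["handlers.worker_handler"]:

```python
# handlers.py -- one image, two entry points selected via the CMD override
def _process(records):
    # Shared business logic used by both entry points.
    return [r.upper() for r in records]

def api_handler(event, context):
    # Synchronous API path: process and return the results inline.
    return {"statusCode": 200, "results": _process(event.get("records", []))}

def worker_handler(event, context):
    # Background path (e.g. queue-triggered): same logic, different response shape.
    return {"processed": len(_process(event.get("records", [])))}
```

Both functions share one build, one scan, and one ECR repository, which keeps the image matrix small as the number of handlers grows.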
| Property | Zip Deployment | Container Deployment |
|---|---|---|
| Package type | Zip (default) | Image |
| Code location | S3 bucket/key or inline | ECR image URI |
| Runtime | Required (e.g., python3.12) | Not set: embedded in image |
| Handler | Required (e.g., app.handler) | Not set: embedded in image CMD |
| Layers | Up to 5 layer ARNs | Not supported |
| Architecture | x86_64 or arm64 | x86_64 or arm64 |
| CMD override | Not applicable | Optional: overrides Dockerfile CMD |
| ENTRYPOINT override | Not applicable | Optional: overrides Dockerfile ENTRYPOINT |
Example: CDK TypeScript
import { DockerImageFunction, DockerImageCode } from 'aws-cdk-lib/aws-lambda';
import { Repository } from 'aws-cdk-lib/aws-ecr';
const repo = Repository.fromRepositoryName(this, 'Repo', 'my-func');
new DockerImageFunction(this, 'MyFunction', {
code: DockerImageCode.fromEcr(repo, {
tagOrDigest: 'sha256:abc123...',
}),
memorySize: 1024,
timeout: Duration.seconds(30),
architecture: Architecture.ARM_64,
environment: {
TABLE_NAME: table.tableName,
},
});
Example: Terraform HCL
resource "aws_lambda_function" "my_function" {
function_name = "my-function"
package_type = "Image"
image_uri = "123456789.dkr.ecr.us-east-1.amazonaws.com/my-func@sha256:abc123..."
role = aws_iam_role.lambda_role.arn
memory_size = 1024
timeout = 30
architectures = ["arm64"]
environment {
variables = {
TABLE_NAME = aws_dynamodb_table.my_table.name
}
}
}
Security Architecture
Container images introduce a different security surface compared to zip deployments. The image itself becomes an artifact that must be secured across its lifecycle: build, storage, deployment, and runtime. The key principle is establishing an immutable chain from source to running function. You should be able to trace any running Lambda back to the exact source commit and Dockerfile that produced it.
ECR repository policies control who can push and pull images. Lambda needs pull access (granted to the lambda.amazonaws.com service principal), and your CI/CD role needs push access. Avoid granting broad ecr:* permissions. Scope push access to CI/CD roles and pull access to Lambda's service principal and your deployment roles.
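A minimal repository policy along those lines might look like this (the account ID and region in the source-ARN condition are placeholders; the condition scopes the grant to functions in your own account):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "LambdaECRImageRetrieval",
      "Effect": "Allow",
      "Principal": { "Service": "lambda.amazonaws.com" },
      "Action": [
        "ecr:BatchGetImage",
        "ecr:GetDownloadUrlForLayer"
      ],
      "Condition": {
        "StringLike": {
          "aws:sourceArn": "arn:aws:lambda:us-east-1:123456789:function:*"
        }
      }
    }
  ]
}
```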
graph TD
A[CI/CD Role] -->|ecr:PutImage, ecr:InitiateLayerUpload| B[ECR Repository]
C[Lambda Service] -->|ecr:GetDownloadUrlForLayer, ecr:BatchGetImage| B
B --> D[Image Scan on Push]
D -->|Findings| E{Severity Gate}
E -->|Critical/High| F[Block Deployment]
E -->|Low/Medium| G[Alert and Continue]
B --> H[Immutable Tags + Digest Deploy]

| Security Control | Implementation | Layer |
|---|---|---|
| Image scanning | ECR basic or enhanced scanning on push | Build/Registry |
| Deployment gate | Block deployments with Critical/High CVE findings | CI/CD |
| Immutable tags | Enable ECR immutable tag policy | Registry |
| Digest-based deployment | Deploy by @sha256:..., not by tag | CI/CD/IaC |
| Least-privilege ECR policy | Separate push (CI/CD) and pull (Lambda) permissions | Registry |
| No secrets in images | Use Lambda environment variables, Secrets Manager, or Parameter Store | Build |
| Base image patching | Scheduled weekly rebuilds pulling latest base image | CI/CD |
| Image signing | AWS Signer or Notation for supply chain integrity | Build/Registry |
| VPC deployment | Run Lambda in VPC for private resource access | Runtime |
| Execution role scoping | Minimal IAM permissions per function | Runtime |
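The deployment gate in the table above can be a few lines in the pipeline. The sketch below assumes you have already fetched the findingSeverityCounts map that aws ecr describe-image-scan-findings returns, and decides whether to fail the build:

```python
def should_block(severity_counts: dict, blocked=("CRITICAL", "HIGH")) -> bool:
    # severity_counts example: {"CRITICAL": 0, "HIGH": 2, "MEDIUM": 7}
    # Block the deployment if any finding at a blocked severity exists.
    return any(severity_counts.get(level, 0) > 0 for level in blocked)
```

In a CodeBuild post_build phase, a non-zero exit when should_block returns True is enough to stop the pipeline before the image reaches Lambda.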
Never bake secrets, API keys, or database credentials into your container image. Images are stored in ECR and can be inspected by anyone with repository read access. Use Lambda environment variables (encrypted at rest with KMS), AWS Secrets Manager, or Systems Manager Parameter Store for runtime secrets. If you need secrets during build time (for example, to access a private package registry), use Docker BuildKit secret mounts (--mount=type=secret) which are excluded from the final image layers.
Scheduled base image rebuilds are essential for security hygiene. Even if your application code has not changed, the base image's OS packages may have new CVE patches. A weekly CI/CD job that rebuilds all Lambda images from the latest base image tag and runs a vulnerability scan catches these patches automatically.
Cost Considerations
Lambda pricing is identical for zip and container deployments; you pay the same per-GB-second rate regardless of package type. The additional costs specific to container images come from ECR storage, image builds, and (optionally) enhanced scanning.
| Cost Component | Rate | Example | Monthly Cost |
|---|---|---|---|
| Lambda compute (x86_64) | $0.0000166667/GB-s | 1M invocations, 512 MB, 200 ms avg | ~$1.67 |
| Lambda compute (ARM64) | $0.0000133334/GB-s | Same workload as above | ~$1.33 |
| Lambda requests | $0.20/million | 1M invocations | $0.20 |
| ECR storage | $0.10/GB/month | 10 images × 1 GB each | $1.00 |
| ECR data transfer (same region) | Free | Lambda pulls from same-region ECR | $0.00 |
| ECR cross-region transfer | $0.01/GB | Replication to second region | Varies |
| CodeBuild (general1.small) | $0.005/minute | 100 builds × 5 min | $2.50 |
| Enhanced scanning | $0.09/image/month (continuous) | 10 images | $0.90 |
The most common cost surprise is ECR storage accumulation. Without lifecycle policies, every build pushes a new image that persists indefinitely. A single function with daily deployments and 1 GB images accumulates 30 GB/month ($3/month), and across 50 functions that reaches $150/month of dead images. Lifecycle policies that retain only the last 10 tagged images and expire untagged images after 3 days eliminate this waste entirely.
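The arithmetic behind that estimate, as a quick check:

```python
image_gb = 1.0          # size of each pushed image
pushes_per_month = 30   # daily deployments
functions = 50
ecr_rate = 0.10         # ECR storage, $/GB/month

monthly_storage_cost = image_gb * pushes_per_month * functions * ecr_rate
print(f"${monthly_storage_cost:.2f}/month")  # $150.00/month
```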
ARM64 (Graviton2) provides the easiest cost optimization: a 20% reduction in Lambda compute cost with a one-line configuration change (set the architecture to arm64). For most workloads, the performance is equivalent or better. The only prerequisite is that your container image and all native dependencies are built for the ARM64 architecture.
Common Failure Modes
| Failure Mode | Symptom | Root Cause | Mitigation |
|---|---|---|---|
| Image exceeds 10 GB | ImageTooLargeException on function create/update | Unoptimized image with unnecessary dependencies | Multi-stage builds; remove build tools, caches, docs |
| Missing RIC | Function times out immediately; no logs | Custom image does not include the Runtime Interface Client | Install the language-specific RIC package |
| Wrong architecture | exec format error in logs | ARM64 function referencing x86_64 image or vice versa | Build for target architecture; use multi-arch manifest |
| ECR permission denied | AccessDeniedException on function create/update | Lambda service principal lacks ecr:GetDownloadUrlForLayer and ecr:BatchGetImage | Add ECR repository policy granting Lambda pull access |
| Cross-account pull failure | ImageNotFoundException or access denied | ECR repository in different account lacks cross-account policy | Add cross-account resource policy on ECR repository |
| Cold start timeout | Function timeout on first invocation; subsequent invocations succeed | Init duration exceeds function timeout; large image or slow module imports | Increase timeout; optimize imports; use Provisioned Concurrency |
| Stale image after tag update | Old behavior persists after pushing new image to same tag | Lambda resolves tag to digest at deploy time, not at invocation time | Redeploy function with update-function-code; prefer digest-based deploys |
| Image not found | ImageNotFoundException | Typo in image URI, deleted image, or wrong region | Verify URI, check ECR lifecycle policies, confirm region |
| ENTRYPOINT conflict | Runtime.InvalidEntrypoint or handler not found | Custom ENTRYPOINT overrides RIC; or CMD missing handler | Ensure ENTRYPOINT runs RIC; CMD specifies module.handler |
| Slow QEMU builds | CI/CD pipeline takes 20+ minutes for ARM64 builds | Building ARM64 on x86_64 via QEMU emulation | Use ARM64-native CodeBuild instances or build matrix |
Key Architectural Recommendations
- Use container images when your deployment package exceeds 250 MB or when you need a custom runtime. For small, simple functions with standard runtimes, zip archives remain simpler and faster to deploy.
- Start with AWS base images unless you have a specific reason to build from scratch. They include the RIC, the RIE, and receive automatic OS patches. Move to custom base images only when you need a different OS, aggressive size optimization, or FIPS compliance.
- Deploy by image digest, never by mutable tag. Capture the digest from `docker push` in your CI/CD pipeline and pass it through to the deployment step. This guarantees reproducibility and makes rollback deterministic.
- Enable immutable tags on ECR repositories. This prevents accidental overwrites and forces your pipeline to use unique tags, which makes audit trails clear and rollback reliable.
- Optimize image size. Use multi-stage builds, remove build tools and caches, pin slim base images, and use `.dockerignore`. Every 100 MB you remove saves 50–150 ms of cold start time.
- Use Graviton2 (ARM64) by default. The 20% cost saving requires minimal effort for most workloads. Build multi-architecture images with `docker buildx` so you can switch architectures with a single configuration change.
- Implement ECR lifecycle policies from day one. Retain the last 10 tagged images, expire untagged images after 3 days. This prevents unbounded storage cost growth.
- Use CodeDeploy traffic shifting for production deployments. The `Canary10Percent5Minutes` or `Linear10PercentEvery1Minute` strategies with CloudWatch alarm-based rollback provide safe, automated deployments. See the AWS CodeDeploy: An Architecture Deep-Dive for details.
- Scan images on push and gate deployments on findings. At minimum, enable ECR basic scanning. For production workloads, use enhanced scanning with Amazon Inspector for continuous monitoring and language-package CVE detection.
- Never bake secrets into container images. Use Lambda environment variables with KMS encryption, Secrets Manager, or Parameter Store. Use Docker BuildKit secret mounts for build-time secrets.
Additional Resources
- AWS Lambda Container Image Support Documentation
- AWS Lambda Container Image Best Practices
- ECR User Guide
- Lambda Runtime Interface Client for Python
- Lambda Runtime Interface Emulator
- Docker Multi-Platform Builds
- AWS Lambda Pricing
- ECR Pricing
- AWS Lambda SnapStart
- CodeDeploy with Lambda
Let's Build Something!
I help teams ship cloud infrastructure that actually works at scale. Whether you're modernizing a legacy platform, designing a multi-region architecture from scratch, or figuring out how AI fits into your engineering workflow, I've seen your problem before. Let me help.
Currently taking on select consulting engagements through Vantalect.

