What Are Lambda MicroVMs?
Lambda MicroVMs are a new compute primitive within AWS Lambda that gives each user or session its own isolated, stateful execution environment. Each MicroVM is a dedicated Firecracker virtual machine — the same technology that's powered trillions of Lambda function invocations — but exposed as a long-running, addressable sandbox rather than an event-driven function.
You bring a Dockerfile, Lambda builds a Firecracker snapshot from it, and every subsequent launch resumes from that snapshot near-instantly. The MicroVM gets a dedicated HTTPS endpoint, retains memory and disk state across requests, and can suspend/resume automatically to save costs during idle periods.
This is NOT a replacement for Lambda Functions. Lambda Functions remain the right choice for event-driven, request-response workloads. MicroVMs are purpose-built for multi-tenant applications that need to run user-generated or AI-generated code in per-session isolation.
Announced June 2026.
When to Use Lambda MicroVMs
Good fit:
- AI coding assistants running user-submitted code
- Interactive code environments (notebooks, REPLs, sandboxes)
- Data analytics platforms where each user gets a workspace
- Vulnerability scanners that need isolated execution
- Game servers running user-supplied scripts
- Any multi-tenant app running untrusted code per-user
Use Lambda Functions instead when:
- Your workload is event-driven and stateless
- You don't need per-session isolation
- Execution is short-lived (under 15 minutes)
- You don't need to retain state between requests
Use ECS/Fargate instead when:
- You need more than 8 hours of runtime
- You need more than 32 GB memory or 16 vCPUs
- You need WebSocket support
- You want full networking control (VPC, load balancers, service mesh)
- Your workload isn't multi-tenant sandbox-style
How It Works
Image Creation
You supply a Dockerfile and code (zipped in S3). Lambda:
- Pulls the zip artifact
- Runs the Dockerfile (base image: public.ecr.aws/lambda/microvms:al2023-minimal)
- Initializes the application
- Takes a Firecracker snapshot of the running memory and disk state
That snapshot becomes your MicroVM Image. Every MicroVM launched from it resumes from the pre-initialized state — your app is already running the moment it comes up.
```shell
aws lambda-microvms create-microvm-image \
  --code-artifact uri=s3://my-bucket/my-app.zip \
  --name my-sandbox-image \
  --base-image-arn arn:aws:lambda:us-east-1:aws:microvm-image:al2023-1 \
  --build-role-arn arn:aws:iam::123456789012:role/MicroVMBuildRole
```
Launching a MicroVM
One API call gives you an initialized, running environment with a dedicated HTTPS endpoint:
```shell
aws lambda-microvms run-microvm \
  --image-identifier arn:aws:lambda:us-east-1:123456789012:microvm-image:my-sandbox-image \
  --execution-role-arn arn:aws:iam::123456789012:role/MicroVMExecutionRole \
  --idle-policy '{"maxIdleDurationSeconds":900,"suspendedDurationSeconds":300,"autoResumeEnabled":true}'
```
No VPC setup. No load balancer. No networking config. The service returns a unique MicroVM ID and a dedicated endpoint URL.
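Hand-writing the `--idle-policy` JSON is easy to get wrong, so a small helper can build it. The sketch below is a minimal example, not part of any AWS tooling: the `idle_policy` and `is_positive_int` functions are hypothetical, and the rule that both durations must be positive integers is an assumption. Only the three field names come from the command above.

```shell
# Hypothetical helper: returns 0 only for a positive integer without leading zeros.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*|0*) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: prints an idle-policy JSON document.
# Usage: idle_policy MAX_IDLE_SECONDS SUSPENDED_SECONDS [true|false]
idle_policy() {
  local max_idle="$1" suspended="$2" auto_resume="${3:-true}"
  # Assumption: both durations must be positive integers.
  is_positive_int "$max_idle" || { echo "max idle duration must be a positive integer" >&2; return 1; }
  is_positive_int "$suspended" || { echo "suspended duration must be a positive integer" >&2; return 1; }
  case "$auto_resume" in
    true|false) ;;
    *) echo "auto-resume must be true or false" >&2; return 1 ;;
  esac
  printf '{"maxIdleDurationSeconds":%s,"suspendedDurationSeconds":%s,"autoResumeEnabled":%s}\n' \
    "$max_idle" "$suspended" "$auto_resume"
}
```

The output can then be passed straight to the launch command, for example `--idle-policy "$(idle_policy 900 300 true)"`.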
Sending Traffic
Generate a short-lived auth token and send HTTPS requests with the X-aws-proxy-auth header:
```shell
TOKEN=$(aws lambda-microvms get-auth-token --microvm-id mvm-abc123)
curl -H "X-aws-proxy-auth: $TOKEN" https://mvm-abc123.lambda-microvms.us-east-1.amazonaws.com/
```
Your application receives the request as normal HTTP traffic. It's just a web server running inside a Firecracker VM.
Lifecycle Management
- Active: MicroVM is running, accepting requests. You pay for compute.
- Suspended: Memory and disk snapshotted, no compute charges. Triggered by idle timeout or explicit API call.
- Resumed: When auto-resume is enabled, the MicroVM resumes from its snapshot on the next request. Resumption is near-instant, so the user doesn't notice the pause.
- Terminated: MicroVM destroyed. State is gone.
Maximum total runtime: 8 hours. Idle timeout is configurable. Auto-resume can be enabled/disabled.
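The lifecycle above can be read as a small state machine. The sketch below models only the transitions this section describes. The `next_state` function and its event names (`idle_timeout`, `suspend_call`, `request`, `terminate`) are hypothetical labels chosen for illustration, not service identifiers.

```shell
# Hypothetical model of the lifecycle described above.
# Usage: next_state STATE EVENT [AUTO_RESUME]
# Prints the new state, or returns 1 if the transition is not described.
next_state() {
  local state="$1" event="$2" auto_resume="${3:-true}"
  case "$state:$event" in
    # Idle timeout or an explicit API call suspends a running MicroVM.
    active:idle_timeout|active:suspend_call) echo suspended ;;
    # A running MicroVM keeps serving requests.
    active:request) echo active ;;
    # A request wakes a suspended MicroVM only when auto-resume is enabled.
    suspended:request)
      [ "$auto_resume" = true ] || return 1
      echo active ;;
    # Termination destroys the MicroVM and its state.
    active:terminate|suspended:terminate) echo terminated ;;
    # Anything else, including any event after termination, is rejected.
    *) return 1 ;;
  esac
}
```

For example, `next_state suspended request true` prints `active`, while the same call with `false` fails because nothing in this model wakes the MicroVM.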
Key Characteristics
| Property | Value |
|---|---|
| Architecture | ARM64 (Graviton) only |
| Max vCPUs | 16 |
| Max memory | 32 GB |
| Max disk | 32 GB |
| Max runtime | 8 hours |
| Isolation | VM-level (Firecracker) |
| Base OS | Amazon Linux 2023 |
| Language/runtime | Any (you bring a Dockerfile) |
| Invocation model | HTTPS endpoint (not event-driven) |
| State | Stateful (memory + disk persist across requests) |
| Networking | VPC supported; inbound on any port, Layer 7 protocols only (HTTP, WebSocket, gRPC) |
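A quick pre-flight check against these limits can catch an oversized request before any API call is made. The following is a rough sketch: `check_microvm_limits` is a hypothetical helper, it expects whole numbers, and the limits are hard-coded from the table above rather than fetched from the service.

```shell
# Hypothetical pre-flight check against the limits in the table above.
# Usage: check_microvm_limits VCPUS MEMORY_GB DISK_GB RUNTIME_HOURS
# Prints one line per violated limit and returns 1 if any limit is exceeded.
check_microvm_limits() {
  local vcpus="$1" memory_gb="$2" disk_gb="$3" runtime_hours="$4" status=0
  if [ "$vcpus" -gt 16 ]; then
    echo "vCPUs: $vcpus exceeds the maximum of 16"; status=1
  fi
  if [ "$memory_gb" -gt 32 ]; then
    echo "memory: $memory_gb GB exceeds the maximum of 32 GB"; status=1
  fi
  if [ "$disk_gb" -gt 32 ]; then
    echo "disk: $disk_gb GB exceeds the maximum of 32 GB"; status=1
  fi
  if [ "$runtime_hours" -gt 8 ]; then
    echo "runtime: $runtime_hours hours exceeds the maximum of 8 hours"; status=1
  fi
  return "$status"
}
```

A workload that fails this check is a candidate for ECS/Fargate instead, in line with the guidance earlier in this post.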
What Makes It Different
vs. Lambda Functions
Lambda Functions are event-driven, stateless, and capped at 15 minutes. MicroVMs are session-based, stateful, and last up to 8 hours. Functions respond to triggers. MicroVMs respond to HTTP requests on a dedicated endpoint. Functions share your account's concurrency pool. Each MicroVM is independently addressable.
vs. ECS/Fargate
Fargate gives you containers with full networking control, long-running tasks, and service discovery. But you configure VPCs, security groups, load balancers, task definitions, and scaling policies. MicroVMs give you one API call to get an isolated environment with an endpoint — no infrastructure to configure. The tradeoff: less control, shorter max duration, smaller resource limits.
vs. Lambda Managed Instances
Lambda Managed Instances (LMI) runs your Lambda functions on EC2 with multi-concurrency for cost efficiency. It's still the Lambda programming model (event handlers, triggers). MicroVMs are a different model entirely: you bring any web server in a Dockerfile and get a per-session sandbox.
vs. Rolling Your Own
Before MicroVMs, building per-user sandboxes meant either ECS tasks per user (expensive, slow to start, complex networking) or custom Firecracker deployments (requires deep virtualization expertise). With MicroVMs, AWS packages that capability as a managed service.
Architecture Pattern: AI Coding Assistant
User Code Submission
↓
API Gateway → Lambda Function (orchestrator)
↓
Lambda MicroVMs API (run-microvm)
↓
Dedicated MicroVM (per user session)
↓
Code executes in isolation, results stream back
The Lambda Function handles authentication, session lookup, and routing. The MicroVM handles untrusted code execution. Each user gets their own VM — one user's code can't affect another's environment.
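The session-lookup step amounts to a get-or-create mapping from user to MicroVM ID. The sketch below illustrates the idea with a directory of files as the session store. Everything here is an assumption made for illustration: the `session_microvm` function, the `SESSION_DIR` store, and the `launch_microvm` placeholder, which stands in for a real `run-microvm` call. A production orchestrator would more likely use a shared database and handle concurrent requests, which this sketch does not.

```shell
# Hypothetical session store: one file per user, holding that user's MicroVM ID.
SESSION_DIR="${SESSION_DIR:-$(mktemp -d)}"

# Placeholder launcher. A real orchestrator would call run-microvm here
# and return the MicroVM ID from the response.
launch_microvm() {
  echo "mvm-placeholder-$1"
}

# Prints the MicroVM ID for a user, launching one only on the first request.
session_microvm() {
  local user="$1" file id
  # Reject empty or unsafe user IDs so they cannot escape SESSION_DIR.
  case "$user" in
    ''|*[!A-Za-z0-9_-]*) echo "invalid user id" >&2; return 1 ;;
  esac
  file="$SESSION_DIR/$user"
  if [ -s "$file" ]; then
    # Existing session: route to the MicroVM already assigned to this user.
    cat "$file"
  else
    # New session: launch a dedicated MicroVM and remember its ID.
    id=$(launch_microvm "$user") || return 1
    printf '%s\n' "$id" > "$file"
    printf '%s\n' "$id"
  fi
}
```

Calling `session_microvm alice` twice returns the same ID and launches only once, which is what keeps each user pinned to their own VM.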
Architecture Pattern: Ephemeral Coordination Layer
Step Functions → Launch MicroVM (Redis image, pre-snapshotted)
↓
Fan out Lambda invocations
(hit MicroVM endpoint for sub-ms coordination)
↓
Collect results → Terminate MicroVM
Use a MicroVM as on-demand infrastructure for batch workloads. Snapshot a Redis instance (or any in-memory service) pre-configured with the right data structures. Launch it only when the batch job runs. Get sub-millisecond coordination without paying for an always-on ElastiCache cluster.
This pattern also applies as a lighter-weight alternative to AWS Batch for jobs under 8 hours — no compute environment warmup, no container pulls, just a pre-initialized environment that's ready immediately.
Regions
Available at launch: US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Tokyo).