Lambda MicroVMs: On-Demand Stateful Compute

AWS launched Lambda MicroVMs last week. It's positioning them as "isolated sandboxes for running untrusted code": AI coding assistants, interactive notebooks, that sort of thing. And they do solve that problem well.

But within a few minutes of reading the announcement, I had use cases that have nothing to do with untrusted code or multi-tenant sandboxing. What they've actually built is on-demand stateful compute with near-instant startup and zero idle cost. That's broader than the sandbox framing.

The Gap That Existed

Lambda is stateless and capped at 15 minutes. Fargate is stateful but requires VPC setup and task definitions, and it bills for as long as a task is running, idle or not. There was nothing in between: instant-start, stateful, zero idle cost, no infrastructure to configure.

What Lambda MicroVMs Actually Are

Lambda MicroVMs are Firecracker VMs as a managed service.

The model is:

You write a Dockerfile. Any language. Any runtime. The base image is AL2023.
Lambda builds it, runs it, and takes a Firecracker snapshot of the running state (memory + disk).
When you need a sandbox, one API call launches a MicroVM from that snapshot. Your app is already running. No cold boot.
The MicroVM gets a dedicated HTTPS endpoint. You send it traffic. It responds.
When idle, it suspends: Lambda snapshots the state and stops charging. On the next request, it resumes.
Max lifetime: 8 hours.

That's it. No VPC config. No load balancer. No task definitions. No service discovery. One API call gets you a running environment with an endpoint.
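To make the suspend/resume behavior concrete, here is a toy state model of a single MicroVM session. It is a sketch for reasoning about billing, not the AWS API: the class, the `idle_timeout_s` parameter, and the assumption that the 8-hour cap is wall-clock time since launch (suspended time included) are all mine, not from the announcement.

```python
from enum import Enum

MAX_LIFETIME_S = 8 * 60 * 60  # the 8-hour cap described above


class State(Enum):
    RUNNING = "running"
    SUSPENDED = "suspended"
    TERMINATED = "terminated"


class MicroVMModel:
    """Toy model of one MicroVM session (hypothetical, not an AWS client)."""

    def __init__(self, launched_at, idle_timeout_s):
        self.launched_at = launched_at
        self.idle_timeout_s = idle_timeout_s  # assumed knob, value is illustrative
        self.state = State.RUNNING
        self.last_request_at = launched_at
        self.billed_s = 0.0
        self._running_since = launched_at

    def _advance(self, now):
        """Apply idle suspension and the lifetime cap up to time `now`."""
        if self.state is State.TERMINATED:
            return
        deadline = self.launched_at + MAX_LIFETIME_S
        if self.state is State.RUNNING:
            suspend_at = self.last_request_at + self.idle_timeout_s
            end = min(now, suspend_at, deadline)
            self.billed_s += end - self._running_since  # only running time is billed
            self._running_since = end
            if now >= suspend_at:
                self.state = State.SUSPENDED
        if now >= deadline:
            self.state = State.TERMINATED

    def request(self, now):
        """Handle a request at time `now`; returns False once the VM has expired."""
        self._advance(now)
        if self.state is State.TERMINATED:
            return False
        if self.state is State.SUSPENDED:
            self.state = State.RUNNING  # resume from the snapshot
            self._running_since = now
        self.last_request_at = now
        return True
```

With a 5-minute idle timeout, a request at t=60s followed by one at t=1000s bills 360 seconds in this model: the first 60 seconds plus the 300-second idle window before suspension. The 640 seconds spent suspended cost nothing.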

Where This Sits in the Compute Spectrum

Here's how I think about the Lambda family now:

Primitive	Model	State	Max Duration	Designed For
Lambda Functions	Event-driven, request/response	Stateless	15 min	Backend logic, event processing
Lambda Durable Functions	Checkpoint-and-replay	Checkpointed	1 year	Multi-step workflows
Lambda MicroVMs	HTTPS endpoint, per-session	Stateful (memory + disk)	8 hours	Multi-tenant sandboxes
Lambda Managed Instances	Multi-concurrency on EC2	Shared across requests	15 min per invocation	High-throughput APIs

MicroVMs aren't competing with Lambda Functions. They're competing with "we'll build it ourselves on ECS" or "we'll use a third-party sandbox service." The target customer isn't someone running CRUD APIs. It's someone building a platform where end users submit and execute code.
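The comparison table above can also be read as a filter. The sketch below encodes its State and Max Duration columns as data and returns the primitives that fit a requirement. The `keeps_memory` flag is my reading of the State column (only MicroVMs are listed as holding memory and disk per session), and the function name is made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Primitive:
    name: str
    state: str
    max_minutes: int
    keeps_memory: bool  # my interpretation of the table's State column


PRIMITIVES = [
    Primitive("Lambda Functions", "Stateless", 15, False),
    Primitive("Lambda Durable Functions", "Checkpointed", 365 * 24 * 60, False),
    Primitive("Lambda MicroVMs", "Stateful (memory + disk)", 8 * 60, True),
    Primitive("Lambda Managed Instances", "Shared across requests", 15, False),
]


def candidates(minutes, needs_memory_state):
    """Names of primitives whose duration cap and state model fit the workload."""
    return [
        p.name
        for p in PRIMITIVES
        if p.max_minutes >= minutes and (p.keeps_memory or not needs_memory_state)
    ]
```

A two-hour session that needs live in-memory state leaves only MicroVMs; a ten-hour stateless workflow leaves only Durable Functions.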

The ARM64-Only Decision

MicroVMs launch on ARM64 (Graviton) exclusively. No x86 option at launch. For the target use cases (Python, Node.js, Go, Rust running in sandboxes) this is fine, and you get the 20% Graviton cost advantage. If you need x86-specific compiled libraries or legacy binaries, that's a constraint worth knowing about.
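One cheap guard against the x86 constraint is to check the architecture at startup, so that an image built on the wrong platform fails loudly instead of crashing on a native library later. This is a minimal sketch; the helper name is mine, and where you call it (image build, app startup) is up to you.

```python
import platform

# Linux reports "aarch64" and macOS reports "arm64" for the same architecture.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine=None):
    """True if the given (or current) machine string is an ARM64 variant."""
    machine = platform.machine() if machine is None else machine
    return machine.lower() in ARM64_NAMES
```

Calling `is_arm64()` with no argument inspects the current host, so a startup check can raise if it returns False.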

The Interesting Architectural Implications

The obvious use cases (AI coding sandboxes, per-tenant SaaS isolation, security scanning) are well-covered in AWS's announcement. Those work as advertised. What caught my attention:

As a lighter-weight AWS Batch alternative

For short-duration batch jobs (under 8 hours), MicroVMs could replace AWS Batch for workloads that don't need Batch's scheduling and dependency orchestration. Batch is managed job scheduling on top of ECS or EKS, so you still wait for container pulls, Fargate cold starts, and compute environment warmup. A MicroVM launches from a pre-initialized snapshot with your processing environment already running. For simple fan-out-and-process patterns that Step Functions can orchestrate, Batch's managed compute environment starts to look like unnecessary overhead.
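The fan-out-and-process shape is simple enough to sketch. The version below uses a local thread pool as a stand-in for the orchestrator, and `process_chunk` is a hypothetical callable: in a real setup it would send one chunk to a MicroVM's HTTPS endpoint and return the results.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def fan_out(items, chunk_size, process_chunk, max_workers=8):
    """Process chunks concurrently and return the results in input order.

    `process_chunk` is a placeholder for "POST this chunk to one MicroVM".
    """
    chunks = chunk(items, chunk_size)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the order the chunks were submitted.
        results = list(pool.map(process_chunk, chunks))
    return [item for chunk_result in results for item in chunk_result]
```

Nothing here needs a job queue, a compute environment, or dependency tracking, which is the point: when the workload is this shape, those Batch features are not buying much.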

On-demand services, with caveats

My first thought was on-demand coordination layers: spin up a MicroVM running Redis (or any in-memory service) for the duration of a batch job, use it for sub-millisecond dedup/coordination across parallel Lambda invocations, then shut it down. No always-on ElastiCache for a workload that runs 20 minutes every few hours.
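The idle-cost argument is easy to put numbers on. Taking "every few hours" to mean every 3 hours (my assumption, purely for illustration), the arithmetic looks like this:

```python
def duty_cycle(active_minutes, period_hours):
    """Fraction of wall-clock time the service is actually needed."""
    return active_minutes / (period_hours * 60)


def idle_hours_per_day(active_minutes, period_hours):
    """Hours per day an always-on node would sit idle while still billed."""
    return 24 * (1 - duty_cycle(active_minutes, period_hours))
```

For 20 minutes every 3 hours, the duty cycle is 1/9, about 11%, which leaves an always-on cache idle for roughly 21.3 hours a day.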

The catch: inbound access only supports OSI Layer 7 protocols (HTTP, WebSocket, gRPC). No raw TCP, no UDP. Native Redis clients won't work. You'd need an HTTP wrapper. Fire-and-forget UDP patterns (which we used for high-throughput status updates at a previous job) are off the table entirely.

The on-demand services pattern is real, but today it's limited to services you can express as HTTP APIs. A custom in-memory coordination service with a REST interface works. Dropping in off-the-shelf Redis or Postgres doesn't.
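As a sketch of what "a custom in-memory coordination service with a REST interface" could look like, here is a dedup service built only on the Python standard library. The `/claim/<key>` route, the status codes, and the class names are my own choices for illustration: the first caller to claim a key gets 200 and every later caller gets 409.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class ClaimStore:
    """In-memory set guarded by a lock; the first caller to claim a key wins."""

    def __init__(self):
        self._seen = set()
        self._lock = threading.Lock()

    def claim(self, key):
        with self._lock:
            if key in self._seen:
                return False
            self._seen.add(key)
            return True


def make_handler(store):
    """Build a request handler class bound to one ClaimStore."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            prefix = "/claim/"  # hypothetical route: POST /claim/<key>
            if not self.path.startswith(prefix) or len(self.path) == len(prefix):
                self._reply(404, {"error": "not found"})
                return
            key = self.path[len(prefix):]
            won = store.claim(key)
            self._reply(200 if won else 409, {"key": key, "claimed": won})

        def _reply(self, status, body):
            payload = json.dumps(body).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, format, *args):
            pass  # keep the sketch quiet

    return Handler


def serve(port=8080):
    """Run the service until interrupted; the port is an arbitrary default."""
    server = ThreadingHTTPServer(("0.0.0.0", port), make_handler(ClaimStore()))
    server.serve_forever()
```

Each parallel Lambda invocation would POST to `/claim/<item-id>` before processing an item and skip it on a 409. It is not sub-millisecond the way a native Redis client on a local network can be, since every call pays HTTPS overhead, but it needs nothing beyond Layer 7.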

What I'd Want to See Next

Lower-layer protocol support. The Layer 7 restriction is the biggest limitation for on-demand services. Native Redis and Postgres clients speak TCP. Fire-and-forget status updates (the kind you'd use for high-throughput coordination) often use UDP. TCP/UDP inbound on a private subnet would open this up considerably.

Streaming support. For AI coding assistants, streaming code execution output in real time would be more natural than polling HTTPS endpoints.

Bottom Line

AWS is marketing Lambda MicroVMs as sandboxes for untrusted code execution. That's a real use case, and it solves it well. But the primitive they've built (on-demand stateful compute with snapshot-based instant start and suspend-to-zero) is more general than their framing.