What Are Lambda Managed Instances?
Lambda Managed Instances (LMI) let you run Lambda functions on EC2 instances in your account while AWS manages the infrastructure: instance lifecycle, OS patching, load balancing, auto-scaling, and routing. You keep the Lambda programming model (event handlers, IAM roles, CloudWatch integration) but get EC2's pricing advantages and hardware flexibility.
The key difference: multi-concurrency. Traditional Lambda runs one invocation per execution environment. LMI environments handle multiple concurrent requests, sharing resources across invocations. This cuts compute waste for workloads with sustained traffic.
LMI was announced at re:Invent 2025 and is now available in all major commercial regions.
When to Use Lambda Managed Instances
Good fit:
- Steady-state, high-traffic workloads where per-invocation Lambda pricing is expensive but you don't want to manage ECS/Fargate
- Cost optimization: apply Compute Savings Plans or Reserved Instances for a discount of up to 72% over on-demand EC2
- Latency-sensitive APIs: pre-provisioned environments eliminate cold starts entirely
- Workloads needing more compute: up to 32GB memory and 16 vCPUs (vs Lambda's 10GB limit)
- Teams that want serverless DX without managing containers, load balancers, or scaling policies
Stick with standard Lambda when:
- Traffic is spiky and unpredictable (Lambda's per-invocation pricing wins for low utilization)
- You want true scale-to-zero (LMI has minimum capacity)
- Your function isn't thread-safe (multi-concurrency requires thread safety)
- You're doing simple event processing that doesn't justify the setup
Use ECS/Fargate when:
- You need WebSocket support, sidecars, or service mesh
- You want full control over the runtime environment
- Your workload doesn't fit the Lambda programming model
How It Works
Architecture
Client → API Gateway → Lambda (routing) → Capacity Provider → EC2 Instances
↓
Execution Environments
(multi-concurrency)
- You create a capacity provider, which defines the VPC, subnets, security groups, instance types, and scaling parameters
- You attach Lambda functions to the capacity provider
- AWS provisions and manages EC2 instances in your account
- Each execution environment handles multiple concurrent requests
- AWS handles scaling (adding/removing instances within tens of seconds)
Capacity Provider
The capacity provider is the new concept. It controls:
- Which VPC and subnets to place instances in
- Allowed EC2 instance types (or "all" for maximum flexibility)
- Maximum vCPU count for auto-scaling limits
- Scaling policy (automatic or CPU-based)
Multi-Concurrency
The biggest conceptual shift. In standard Lambda, one invocation = one environment. In LMI, one environment handles many concurrent invocations:
- Shared memory, file system, and SDK clients across requests
- Your code must be thread-safe: no writing to shared state without synchronization
- Reduces total compute consumption
This is similar to how a traditional web server (ASP.NET, Express) handles requests. Multiple concurrent requests share one process.
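To make the shared-state hazard concrete, here is a minimal sketch in TypeScript. It does not use any LMI API: it simulates two concurrent invocations in one process with `Promise.all`, and the names `unsafeHandler`, `safeHandler`, and `demo` are hypothetical. In Node.js the concurrency shows up as interleaved async code rather than parallel threads, but the failure mode is the same: state stored outside the request can be overwritten by another in-flight request.

```typescript
// Illustrative only: simulates two concurrent invocations sharing one process.
// None of these names come from the Lambda API.

type OrderEvent = { orderId: string; delayMs: number };

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// UNSAFE: module-level state is visible to every in-flight invocation.
let currentOrderId = "";

async function unsafeHandler(event: OrderEvent): Promise<string> {
  currentOrderId = event.orderId; // another invocation can overwrite this...
  await sleep(event.delayMs);     // ...while this one is waiting on I/O
  return `processed ${currentOrderId}`;
}

// SAFE: per-request data stays in local variables, so nothing is shared.
async function safeHandler(event: OrderEvent): Promise<string> {
  const orderId = event.orderId;
  await sleep(event.delayMs);
  return `processed ${orderId}`;
}

// Runs both handlers against the same pair of overlapping requests.
async function demo(): Promise<{ unsafe: string[]; safe: string[] }> {
  const events: OrderEvent[] = [
    { orderId: "A", delayMs: 20 },
    { orderId: "B", delayMs: 5 },
  ];
  const unsafe = await Promise.all(events.map(unsafeHandler));
  const safe = await Promise.all(events.map(safeHandler));
  return { unsafe, safe };
}
```

In this simulation the unsafe handler reports order B for both requests, because request B overwrites the shared variable while request A is still waiting. The safe handler returns A and B as expected.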
Pricing
Three components:
- Request charge: $0.20 per million invocations (same as standard Lambda)
- EC2 instance charges: Standard EC2 pricing for provisioned capacity (Savings Plans and RIs apply)
- Management fee: 15% on the EC2 on-demand price
No duration charge per request (unlike standard Lambda). You're paying for the instance capacity, not per-millisecond execution.
Cost comparison example
A function handling 10M requests/month, 200ms average duration, 512MB memory:
| | Standard Lambda | Lambda Managed Instances |
|---|---|---|
| Requests | $2.00 | $2.00 |
| Compute | $16.67 | ~$8-12 (depends on instance utilization) |
| Savings Plans | Limited (up to 17%) | Up to 72% off EC2 |
| Estimated monthly | ~$18.67 | ~$5-14 |
The savings increase with higher traffic and deeper Savings Plans commitments.
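The arithmetic behind the comparison can be sketched as a small TypeScript cost model. This is a back-of-the-envelope calculation, not an official calculator. It assumes the x86 Lambda duration rate of $0.0000166667 per GB-second, ignores the free tier, and assumes the 15% management fee is computed on the undiscounted on-demand price. The function names and the instance price in the usage note are hypothetical placeholders.

```typescript
// Rough monthly cost model. Check current AWS pricing before relying on it.

const REQUEST_PRICE_PER_MILLION = 0.2;           // USD, same for Lambda and LMI
const LAMBDA_PRICE_PER_GB_SECOND = 0.0000166667; // USD, assumed x86 duration rate
const LMI_MANAGEMENT_FEE = 0.15;                 // 15% of the EC2 on-demand price

function standardLambdaMonthlyCost(
  requests: number,
  avgDurationSec: number,
  memoryGb: number,
): number {
  const requestCost = (requests / 1_000_000) * REQUEST_PRICE_PER_MILLION;
  const computeCost = requests * avgDurationSec * memoryGb * LAMBDA_PRICE_PER_GB_SECOND;
  return requestCost + computeCost;
}

function lmiMonthlyCost(
  requests: number,
  onDemandHourly: number, // hypothetical on-demand price per instance-hour
  instanceCount: number,
  discount = 0,           // Savings Plans / RI discount on the EC2 charge, 0..1
  hoursPerMonth = 730,
): number {
  const requestCost = (requests / 1_000_000) * REQUEST_PRICE_PER_MILLION;
  const onDemandCost = onDemandHourly * instanceCount * hoursPerMonth;
  const ec2Cost = onDemandCost * (1 - discount);
  // Assumption: the fee is charged on the on-demand price, before discounts.
  const managementFee = onDemandCost * LMI_MANAGEMENT_FEE;
  return requestCost + ec2Cost + managementFee;
}
```

For the example workload, `standardLambdaMonthlyCost(10_000_000, 0.2, 0.5)` gives about $18.67, matching the table. The LMI side depends on the instance price you plug in: `lmiMonthlyCost(10_000_000, 0.01, 1)` uses a made-up $0.01/hour rate purely to show the shape of the calculation.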
What Changes in Your Code
Nothing (mostly). Your handler, event sources, IAM permissions, and monitoring remain the same. The one requirement: your code must be thread-safe for multi-concurrency.
Things to verify:
- No shared mutable state across invocations without synchronization
- File paths should be unique per request (don't write to `/tmp/output.json` from all requests)
- SDK clients can be shared (they're thread-safe) but custom connection pools need review
- Static variables used as caches need `ConcurrentDictionary` or similar thread-safe structures
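As one way to handle the file-path item in the checklist above, the sketch below gives each invocation its own scratch directory. It is written in TypeScript against Node's standard library only. The helper `withRequestScratchDir` and the `handler` shape are hypothetical and not part of any Lambda SDK.

```typescript
import { mkdtemp, readFile, rm, writeFile } from "node:fs/promises";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical helper: creates a unique directory per invocation so that
// concurrent requests never write to the same file, then cleans it up.
async function withRequestScratchDir<T>(work: (dir: string) => Promise<T>): Promise<T> {
  // mkdtemp appends random characters to the prefix, so each call gets a new directory.
  const dir = await mkdtemp(join(tmpdir(), "req-"));
  try {
    return await work(dir);
  } finally {
    // Disk space is shared across invocations too, so remove the directory when done.
    await rm(dir, { recursive: true, force: true });
  }
}

// Illustrative handler: writes and reads back a per-request output file.
async function handler(event: { orderId: string }): Promise<string> {
  return withRequestScratchDir(async (dir) => {
    const outputPath = join(dir, "output.json");
    await writeFile(outputPath, JSON.stringify({ orderId: event.orderId }));
    return readFile(outputPath, "utf8");
  });
}
```

Two overlapping calls such as `handler({ orderId: "A" })` and `handler({ orderId: "B" })` each read back their own order ID, because neither touches the other's directory.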
CDK Example
import { Function, Runtime, Architecture, Code } from 'aws-cdk-lib/aws-lambda';
// Assumes a CDK release that ships the capacity provider L1 construct
import { CfnCapacityProvider } from 'aws-cdk-lib/aws-lambda';
import { SubnetType } from 'aws-cdk-lib/aws-ec2';

// Note: CDK support for LMI capacity providers is via L1 constructs
// or higher-level constructs as they become available.
// Property names below are illustrative; check them against the current
// CloudFormation/CDK reference. `vpc` and `appSg` are assumed to be
// defined elsewhere in the stack.
const capacityProvider = new CfnCapacityProvider(this, 'AppCapacity', {
  vpcConfig: {
    subnetIds: vpc.selectSubnets({ subnetType: SubnetType.PRIVATE_WITH_EGRESS }).subnetIds,
    securityGroupIds: [appSg.securityGroupId],
  },
  instanceTypes: ['c7g.large', 'c7g.xlarge', 'm7g.large'], // Graviton3 for good price/performance
  scalingConfig: {
    maxVcpus: 64,
    scalingPolicy: 'AUTO',
  },
});

// Attach function to capacity provider
const orderApi = new Function(this, 'OrderApi', {
  runtime: Runtime.DOTNET_8,
  architecture: Architecture.ARM_64, // Graviton instance types are arm64
  handler: 'OrderApi::OrderApi.Function::Handler',
  code: Code.fromAsset('./src/OrderApi/publish'),
  memorySize: 1024,
  capacityProviderArn: capacityProvider.attrArn, // connects to LMI
});
Comparison: Lambda vs LMI vs ECS
| | Standard Lambda | Lambda Managed Instances | ECS/Fargate |
|---|---|---|---|
| Scaling | Per-invocation | Instance-based (auto) | Task-based (manual config) |
| Cold starts | Yes (except provisioned) | None | None |
| Multi-concurrency | No (1 req/environment) | Yes | Yes (web server) |
| Pricing | Per-request + duration | EC2 + 15% management | Fargate per-hour |
| Savings Plans | Limited (17%) | Full EC2 (up to 72%) | Fargate (20%) |
| Max resources | 10GB / 6 vCPU | 32GB / 16 vCPU | 120GB / 16 vCPU |
| Infrastructure management | None | None | Minimal (task defs, scaling) |
| Code model | Event handler | Event handler | Web server |