Lambda Managed Instances

What Are Lambda Managed Instances?

Lambda Managed Instances (LMI) let you run Lambda functions on EC2 instances in your account while AWS manages the infrastructure: instance lifecycle, OS patching, load balancing, auto-scaling, and routing. You keep the Lambda programming model (event handlers, IAM roles, CloudWatch integration) but get EC2's pricing advantages and hardware flexibility.

The key difference: multi-concurrency. Traditional Lambda runs one invocation per execution environment. LMI environments handle multiple concurrent requests, sharing resources across invocations. This cuts compute waste for workloads with sustained traffic.
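To make the "compute waste" point concrete, here is a rough sizing sketch based on Little's law (requests in flight = arrival rate x average duration). The function name and the per-environment concurrency value are illustrative assumptions, not AWS defaults or an AWS API.

```typescript
// Estimate how many execution environments a steady workload keeps busy.
// concurrencyPerEnvironment is a hypothetical tuning value, not an AWS default.
function environmentsNeeded(
  requestsPerSecond: number,
  avgDurationMs: number,
  concurrencyPerEnvironment: number,
): number {
  // Little's law: requests in flight = arrival rate x time each request takes.
  const inFlight = (requestsPerSecond * avgDurationMs) / 1000;
  return Math.ceil(inFlight / concurrencyPerEnvironment);
}

const oneRequestPerEnv = environmentsNeeded(500, 200, 1);  // standard Lambda model
const tenRequestsPerEnv = environmentsNeeded(500, 200, 10); // multi-concurrency model
console.log({ oneRequestPerEnv, tenRequestsPerEnv });
```

With these made-up numbers, 500 requests/second at 200ms each means 100 requests in flight: 100 environments at one request each, or 10 environments at ten requests each. Real savings depend on how much of each request is spent waiting on I/O rather than using CPU.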

LMI was announced at re:Invent 2025 and is now available in all major commercial regions.

When to Use Lambda Managed Instances

Good fit:

Steady-state, high-traffic workloads where per-invocation Lambda pricing is expensive but you don't want to manage ECS/Fargate
Cost optimization: apply Compute Savings Plans or Reserved Instances for a discount of up to 72% off on-demand EC2 pricing
Latency-sensitive APIs: pre-provisioned environments eliminate cold starts entirely
Workloads needing more compute: up to 32GB memory and 16 vCPUs (vs Lambda's 10GB limit)
Teams that want serverless DX without managing containers, load balancers, or scaling policies

Stick with standard Lambda when:

Traffic is spiky and unpredictable (Lambda's per-invocation pricing wins for low utilization)
You want true scale-to-zero (LMI has minimum capacity)
Your function isn't thread-safe (multi-concurrency requires thread safety)
You're doing simple event processing that doesn't justify the setup

Use ECS/Fargate when:

You need WebSocket support, sidecars, or service mesh
You want full control over the runtime environment
Your workload doesn't fit the Lambda programming model

How It Works

Architecture

Client → API Gateway → Lambda (routing) → Capacity Provider → EC2 Instances
                                              ↓
                                     Execution Environments
                                     (multi-concurrency)

You create a capacity provider, which defines the VPC, subnets, security groups, instance types, and scaling parameters
You attach Lambda functions to the capacity provider
AWS provisions and manages EC2 instances in your account
Each execution environment handles multiple concurrent requests
AWS handles scaling (adding/removing instances within tens of seconds)

Capacity Provider

The capacity provider is the new concept. It controls:

Which VPC and subnets to place instances in
Allowed EC2 instance types (or "all" for maximum flexibility)
Maximum vCPU count for auto-scaling limits
Scaling policy (automatic or CPU-based)
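One way to picture the maximum vCPU setting is as a hard ceiling on how many instances scaling can add. The sketch below is a toy model under that assumption; it is not AWS's scaling algorithm, and `CapacityLimits` and `instancesToRun` are hypothetical names.

```typescript
// Toy model of a vCPU ceiling. Not AWS's actual scaling logic.
interface CapacityLimits {
  vcpusPerInstance: number; // vCPUs on the chosen instance type
  maxVcpus: number;         // the capacity provider's maximum vCPU count
}

function instancesToRun(demandVcpus: number, limits: CapacityLimits): number {
  // Instances needed to cover the demand, rounding up to whole instances.
  const wanted = Math.ceil(demandVcpus / limits.vcpusPerInstance);
  // Whole instances that fit under the vCPU ceiling.
  const ceiling = Math.floor(limits.maxVcpus / limits.vcpusPerInstance);
  return Math.min(wanted, ceiling);
}

const limits: CapacityLimits = { vcpusPerInstance: 4, maxVcpus: 64 };
console.log(instancesToRun(10, limits));  // demand fits under the ceiling
console.log(instancesToRun(100, limits)); // demand exceeds the ceiling and is capped
```

In this model the ceiling doubles as a cost guardrail: demand beyond it is not served by new instances, so it is worth setting with peak traffic in mind.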

Multi-Concurrency

The biggest conceptual shift. In standard Lambda, one invocation = one environment. In LMI, one environment handles many concurrent invocations:

Shared memory, file system, and SDK clients across requests
Your code must be thread-safe: no writing to shared state without synchronization
Reduces total compute consumption

This is similar to how a traditional web server (ASP.NET, Express) handles requests: multiple concurrent requests share one process.
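The shared-state hazard is easiest to see in a small example. The sketch below assumes the runtime interleaves invocations within one process at `await` points; the handler names and event shape are made up for illustration.

```typescript
interface OrderEvent {
  orderId: string;
}

// Module-level state is shared by every invocation in the same environment.
let currentOrderId = "";

// UNSAFE: per-request data is parked in shared state across an await.
async function unsafeHandler(event: OrderEvent): Promise<string> {
  currentOrderId = event.orderId;
  await Promise.resolve(); // stands in for any awaited I/O; other invocations run here
  return `processed ${currentOrderId}`;
}

// SAFE: per-request data stays in local variables.
async function safeHandler(event: OrderEvent): Promise<string> {
  const orderId = event.orderId;
  await Promise.resolve();
  return `processed ${orderId}`;
}

// Start two invocations before either one finishes.
function runConcurrently(
  handler: (event: OrderEvent) => Promise<string>,
): Promise<string[]> {
  return Promise.all([handler({ orderId: "A" }), handler({ orderId: "B" })]);
}

runConcurrently(unsafeHandler).then((results) => console.log("unsafe:", results));
runConcurrently(safeHandler).then((results) => console.log("safe:", results));
```

In the unsafe version both invocations report order B, because the second invocation overwrites the shared variable before the first resumes. With one invocation per environment this bug never shows up, which is why code that has run fine on standard Lambda still needs a review before moving to LMI.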

Pricing

Three components:

Request charge: $0.20 per million invocations (same as standard Lambda)
EC2 instance charges: Standard EC2 pricing for provisioned capacity (Savings Plans and RIs apply)
Management fee: 15% on the EC2 on-demand price

No duration charge per request (unlike standard Lambda). You're paying for the instance capacity, not per-millisecond execution.
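The three components can be combined into a small estimator. This is a sketch under stated assumptions: the management fee is computed on the on-demand price even when a discount applies to the EC2 charge (one reading of the text above), and the instance price, instance count, and discount in the example are made-up inputs rather than real AWS prices.

```typescript
interface LmiCostInput {
  requestsPerMonth: number;
  instanceCount: number;
  onDemandHourlyUsd: number; // hypothetical on-demand price per instance
  hoursPerMonth: number;
  ec2DiscountRate: number;   // 0 for on-demand; e.g. 0.3 for a 30% commitment discount
}

const REQUEST_PRICE_PER_MILLION_USD = 0.2;
const MANAGEMENT_FEE_RATE = 0.15;

function lmiMonthlyCost(input: LmiCostInput) {
  const requests = (input.requestsPerMonth / 1_000_000) * REQUEST_PRICE_PER_MILLION_USD;
  const onDemand = input.instanceCount * input.onDemandHourlyUsd * input.hoursPerMonth;
  // Assumption: Savings Plans / RI discounts reduce the EC2 charge only.
  const ec2 = onDemand * (1 - input.ec2DiscountRate);
  // Assumption: the 15% fee is based on the undiscounted on-demand price.
  const managementFee = onDemand * MANAGEMENT_FEE_RATE;
  return { requests, ec2, managementFee, total: requests + ec2 + managementFee };
}

const estimate = lmiMonthlyCost({
  requestsPerMonth: 10_000_000,
  instanceCount: 2,
  onDemandHourlyUsd: 0.05,
  hoursPerMonth: 730,
  ec2DiscountRate: 0.3,
});
console.log(estimate);
```

For these inputs the on-demand instance cost is $73.00, so the estimate is $2.00 requests + $51.10 EC2 + $10.95 fee, about $64.05 in total. Duration never appears in the formula, which is why utilization matters more than per-request latency here.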

Cost comparison example

A function handling 10M requests/month, 200ms average duration, 512MB memory:

	Standard Lambda	Lambda Managed Instances
Requests	$2.00	$2.00
Compute	$16.67	~$8-12 (depends on instance utilization)
Savings Plans	Up to 17%	Up to 72% off EC2
Estimated monthly	~$18.67	~$5-14

The savings increase with higher traffic and deeper Savings Plans commitments.
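The standard Lambda column can be reproduced with a few lines of arithmetic. The sketch assumes the x86 duration price of $0.0000166667 per GB-second (it varies by region and architecture) and ignores the free tier; the function name is illustrative.

```typescript
const GB_SECOND_PRICE_USD = 0.0000166667;   // assumption: x86 duration price
const REQUEST_PRICE_PER_MILLION = 0.2;

function standardLambdaMonthlyCost(
  requestsPerMonth: number,
  avgDurationMs: number,
  memoryMb: number,
) {
  const requestCharge = (requestsPerMonth / 1_000_000) * REQUEST_PRICE_PER_MILLION;
  // GB-seconds = invocations x seconds per invocation x memory in GB.
  const gbSeconds = requestsPerMonth * (avgDurationMs / 1000) * (memoryMb / 1024);
  const computeCharge = gbSeconds * GB_SECOND_PRICE_USD;
  return { requestCharge, computeCharge, total: requestCharge + computeCharge };
}

const lambdaCost = standardLambdaMonthlyCost(10_000_000, 200, 512);
console.log(lambdaCost.total.toFixed(2));
```

10M requests at 200ms and 512MB come to 1,000,000 GB-seconds, which gives the $16.67 compute and $18.67 total shown in the table.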

What Changes in Your Code

Nothing (mostly). Your handler, event sources, IAM permissions, and monitoring remain the same. The one requirement: your code must be thread-safe for multi-concurrency.

Things to verify:

No shared mutable state across invocations without synchronization
File paths should be unique per request (don't write to /tmp/output.json from all requests)
SDK clients can be shared (they're thread-safe) but custom connection pools need review
Static variables used as caches need ConcurrentDictionary or similar thread-safe structures
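Two of these checks can be sketched in TypeScript. The per-request path derives a scratch location from the invocation's request ID, and the cache stores the in-flight promise so concurrent callers share one load, in the spirit of ConcurrentDictionary's GetOrAdd. `scratchPath` and `SharedCache` are hypothetical helpers, not part of any AWS SDK.

```typescript
// Hypothetical helper: give every request its own scratch location under /tmp.
function scratchPath(requestId: string, fileName: string): string {
  return `/tmp/${requestId}/${fileName}`;
}

// Cache shared across invocations. It stores the promise rather than the value,
// so two requests that miss at the same time trigger only one load.
class SharedCache<V> {
  private readonly entries = new Map<string, Promise<V>>();

  getOrLoad(key: string, loader: () => Promise<V>): Promise<V> {
    let entry = this.entries.get(key);
    if (entry === undefined) {
      // Drop failed loads so a later request can retry.
      entry = loader().catch((err: unknown) => {
        this.entries.delete(key);
        throw err;
      });
      this.entries.set(key, entry);
    }
    return entry;
  }
}

const configCache = new SharedCache<string>();
const loadConfig = async (): Promise<string> => "config-v1"; // stands in for a real fetch

Promise.all([
  configCache.getOrLoad("app-config", loadConfig),
  configCache.getOrLoad("app-config", loadConfig),
]).then((values) => console.log(values, scratchPath("req-123", "output.json")));
```

The check-then-set in `getOrLoad` has no `await` between the lookup and the write, so it cannot be interleaved on a single-threaded event loop. Runtimes with real threads (such as .NET or Java) need their own concurrent collections instead.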

CDK Example

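import { Function, Runtime, Architecture, Code, CfnCapacityProvider } from 'aws-cdk-lib/aws-lambda';
import { SubnetType } from 'aws-cdk-lib/aws-ec2';

// Note: CDK support for LMI capacity providers is via L1 constructs
// or higher-level constructs as they become available.
// CfnCapacityProvider is assumed to be exported from aws-cdk-lib/aws-lambda,
// and the property names below are illustrative: check them against the
// CloudFormation reference for your CDK version.
// `vpc` and `appSg` are assumed to be defined elsewhere in the stack.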

const capacityProvider = new CfnCapacityProvider(this, 'AppCapacity', {
  vpcConfig: {
    subnetIds: vpc.selectSubnets({ subnetType: SubnetType.PRIVATE_WITH_EGRESS }).subnetIds,
    securityGroupIds: [appSg.securityGroupId],
  },
  instanceTypes: ['c7g.large', 'c7g.xlarge', 'm7g.large'], // Graviton3 (c7g/m7g); the c8g/m8g families are Graviton4
  scalingConfig: {
    maxVcpus: 64,
    scalingPolicy: 'AUTO',
  },
});

// Attach function to capacity provider
const orderApi = new Function(this, 'OrderApi', {
  runtime: Runtime.DOTNET_8,
  architecture: Architecture.ARM_64, // match the Graviton (arm64) instance types above
  handler: 'OrderApi::OrderApi.Function::Handler',
  code: Code.fromAsset('./src/OrderApi/publish'),
  memorySize: 1024,
  capacityProviderArn: capacityProvider.attrArn, // connects to LMI (illustrative property name)
});

Comparison: Lambda vs LMI vs ECS

	Standard Lambda	Lambda Managed Instances	ECS/Fargate
Scaling	Per-invocation	Instance-based (auto)	Task-based (manual config)
Cold starts	Yes (except provisioned)	None	None
Multi-concurrency	No (1 req/environment)	Yes	Yes (web server)
Pricing	Per-request + duration	EC2 + 15% management	Fargate per-hour
Savings Plans	Limited (17%)	Full EC2 (up to 72%)	Fargate (20%)
Max resources	10GB / 6 vCPU	32GB / 16 vCPU	120GB / 16 vCPU
Infrastructure management	None	None	Minimal (task defs, scaling)
Code model	Event handler	Event handler	Web server