Lambda's pricing model is elegant: pay for what you use. But "what you use" depends heavily on how you build your functions. Unoptimized code running on oversized memory configurations can cost 5-10x more than it needs to.
I've spent years optimizing .NET Lambda functions—building open source libraries with AOT compilation as a first-class concern, implementing caching layers, and learning when Lambda isn't the right tool at all. Here's what actually moves the needle.
Cold Start Reduction
Cold starts are the tax you pay for serverless. For .NET, they can be brutal—JIT compilation, assembly loading, and dependency injection setup all happen on first invocation. The solution is eliminating as much of that work as possible.
AOT Compilation (Ahead-of-Time)
Native AOT compiles your .NET code to native binaries at build time, eliminating JIT compilation at runtime. Cold starts drop from seconds to milliseconds.
- With AOT: ~200-400ms typical cold start
- Without AOT: ~2-5 seconds typical cold start
All of my open source libraries (FluentDynamoDB, LambdaOpenApi, LambdaGraphQL) are built with AOT as a first-class concern. If a library doesn't support AOT, it's not going in my Lambda functions.
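Enabling Native AOT is mostly a project-file change; the handler code itself stays the same. A minimal sketch of the relevant .csproj properties (net8.0 and the specific property set here are assumptions; verify against your SDK version and Lambda tooling):

```xml
<!-- Sketch: MSBuild properties for a Native AOT Lambda publish -->
<PropertyGroup>
  <TargetFramework>net8.0</TargetFramework>
  <!-- Compile to a native binary at build time; no JIT at runtime -->
  <PublishAot>true</PublishAot>
  <!-- Smaller binary; the flip side is less useful stack traces -->
  <StripSymbols>true</StripSymbols>
  <!-- Drop ICU data if you don't need culture-specific formatting -->
  <InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>
```

Publish with `dotnet publish -r linux-arm64` (or `linux-x64`) so the native binary matches the Lambda architecture you deploy to.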
AOT Trade-offs
- Reflection-heavy code needs rework
- Build times increase significantly
- Some NuGet packages don't support it
- Stack traces are less useful; invest in better logging
Worth it for production workloads. Not worth it for quick prototypes.
The Memory/Cost Paradox
Lambda pricing is per GB-second. Intuitively, lower memory = lower cost. But Lambda allocates CPU proportionally to memory, so a 128MB function runs on a fraction of a vCPU while a 1769MB function gets a full vCPU.
The Math That Surprises People
A function that takes 1000ms at 128MB might take 100ms at 1024MB. Same work, slightly lower cost, and 10x better latency.
| Memory | Duration | GB-seconds | Relative Cost |
|---|---|---|---|
| 128 MB | 1000ms | 0.125 | 1.0x |
| 512 MB | 300ms | 0.15 | 1.2x |
| 1024 MB | 100ms | 0.1 | 0.8x ✓ |
The optimal memory setting depends on your workload. CPU-bound functions benefit from more memory. I/O-bound functions (waiting on DynamoDB, external APIs) often don't. Test with real traffic patterns, not synthetic benchmarks.
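The table's numbers fall straight out of the GB-second formula. A quick sketch (durations are the illustrative figures from the table; the pricing constant is the published x86 rate at time of writing and may change):

```python
# Cost per invocation is proportional to memory (GB) x duration (s).
# The durations below are the illustrative figures from the table above.
PRICE_PER_GB_SECOND = 0.0000166667  # x86 Lambda rate; check current pricing

def gb_seconds(memory_mb: int, duration_ms: int) -> float:
    """Billed GB-seconds for one invocation."""
    return (memory_mb / 1024) * (duration_ms / 1000)

configs = [(128, 1000), (512, 300), (1024, 100)]
baseline = gb_seconds(*configs[0])
for mem, dur in configs:
    gbs = gb_seconds(mem, dur)
    print(f"{mem:>5} MB  {dur:>5} ms  {gbs:.3f} GB-s  {gbs / baseline:.1f}x")
```

Running this reproduces the table: the 1024MB configuration bills 0.100 GB-seconds against 0.125 at 128MB, so the faster configuration is also the cheaper one.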
Caching with ElastiCache
The fastest database call is the one you don't make. For data that's read frequently and changes infrequently, caching in ElastiCache (Valkey/Redis) dramatically reduces both latency and cost.
Good Cache Candidates
- User profiles and permissions
- Configuration and feature flags
- Reference data (lookup tables)
- Computed aggregations
- Session data
Poor Cache Candidates
- Frequently changing data
- Data with strict consistency needs
- Large objects (>1MB)
- Rarely accessed data
- Security-sensitive data
Cache Invalidation
The hard part isn't caching—it's knowing when to invalidate. TTL-based expiration works for most cases. For data that must be immediately consistent, use write-through patterns or event-driven invalidation via DynamoDB Streams or EventBridge.
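The TTL pattern can be sketched with a plain in-process dict; in production the dict would be replaced by ElastiCache (Valkey/Redis), but the read-through shape is the same. All names here are illustrative:

```python
import time

class TtlCache:
    """Minimal read-through cache with TTL expiration. An in-process
    stand-in for ElastiCache; the get-or-load shape is identical over
    Valkey/Redis."""

    def __init__(self, ttl_seconds: float):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get_or_load(self, key, loader):
        entry = self._store.get(key)
        if entry and entry[1] > time.monotonic():
            return entry[0]  # fresh hit: skip the database call
        value = loader(key)  # miss or expired: go to the source of truth
        self._store[key] = (value, time.monotonic() + self.ttl)
        return value

    def invalidate(self, key):
        # Event-driven invalidation, e.g. called from a DynamoDB Streams
        # or EventBridge handler when the underlying record changes.
        self._store.pop(key, None)
```

For a write-through variant, the write path would update both the store and the cache in the same operation instead of relying on `invalidate`.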
When Lambda Isn't the Answer
Lambda is great for event-driven, bursty workloads. It's not great for everything. Knowing when to reach for ECS or Fargate instead saves money and headaches.
Real Example: The API Router
For the Unified API Router, I chose ECS over Lambda despite Lambda being "serverless." Why?
- Consistent latency required: cold starts on the critical path were unacceptable
- Always-on traffic: steady request volume made Lambda's per-invocation pricing more expensive
- Long-lived connections: the router maintains connection pools to backend services
- Predictable performance: consistent behavior was needed for SLA commitments
Lambda Wins
- Bursty, unpredictable traffic
- Event-driven processing
- Low baseline with occasional spikes
- Short-lived operations
- Rapid scaling requirements
ECS/Fargate Wins
- Steady, predictable traffic
- Latency-sensitive workloads
- Long-running processes
- Connection pooling needed
- Cost optimization at scale
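The "steady traffic" point can be made concrete with a back-of-the-envelope break-even. The rates below are approximations of published prices and the Fargate task size is an assumption; check current pricing before making a real decision:

```python
# Rough monthly cost: Lambda per-invocation vs one always-on Fargate task.
# All rates are illustrative approximations; check current AWS pricing.
LAMBDA_PER_REQUEST = 0.20 / 1_000_000   # per request
LAMBDA_PER_GB_S = 0.0000166667          # per GB-second
FARGATE_VCPU_HOUR = 0.04048             # per vCPU-hour
FARGATE_GB_HOUR = 0.004445              # per GB-hour
HOURS_PER_MONTH = 730

def lambda_monthly(requests_per_s: float, mem_gb: float, dur_s: float) -> float:
    requests = requests_per_s * 3600 * HOURS_PER_MONTH
    return requests * (LAMBDA_PER_REQUEST + mem_gb * dur_s * LAMBDA_PER_GB_S)

def fargate_monthly(vcpu: float, mem_gb: float) -> float:
    return HOURS_PER_MONTH * (vcpu * FARGATE_VCPU_HOUR + mem_gb * FARGATE_GB_HOUR)

# At a steady 100 req/s (512 MB, 50 ms per request), Lambda costs several
# times more than a single 1 vCPU / 2 GB Fargate task; at 1 req/s the
# comparison flips and Lambda is far cheaper.
print(lambda_monthly(100, 0.5, 0.05), fargate_monthly(1.0, 2.0))
```

The crossover point moves with memory size, duration, and how many tasks you need for redundancy, which is why the decision is about traffic shape rather than a single magic number.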
Outcomes
Cold Starts Under 500ms
AOT compilation brings .NET Lambda cold starts from seconds to sub-second
Right-Sized Costs
Memory tuning and caching reduce per-invocation costs without sacrificing performance
Predictable Latency
Caching and compute selection eliminate latency surprises
Informed Trade-offs
Clear criteria for Lambda vs ECS decisions based on workload characteristics