Serverless Optimization Patterns

Cold start reduction, memory tuning, caching strategies, and knowing when Lambda isn't the answer

Lambda's pricing model is elegant: pay for what you use. But "what you use" depends heavily on how you build your functions. Unoptimized code running on oversized memory configurations can cost 5-10x more than it needs to.

I've spent years optimizing .NET Lambda functions—building open source libraries with AOT compilation as a first-class concern, implementing caching layers, and learning when Lambda isn't the right tool at all. Here's what actually moves the needle.

Cold Start Reduction

Cold starts are the tax you pay for serverless. For .NET, they can be brutal—JIT compilation, assembly loading, and dependency injection setup all happen on first invocation. The solution is eliminating as much of that work as possible.

AOT Compilation (Ahead-of-Time)

Native AOT compiles your .NET code to native binaries at build time, eliminating JIT compilation at runtime. Cold starts drop from seconds to milliseconds.

  • With AOT: ~200-400ms typical cold start
  • Without AOT: ~2-5 seconds typical cold start

All of my open source libraries (FluentDynamoDB, LambdaOpenApi, LambdaGraphQL) are built with AOT as a first-class concern. If a library doesn't support AOT, it's not going in my Lambda functions.
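Enabling Native AOT is mostly a project-file change. A minimal sketch of the relevant .csproj properties (standard .NET 8 publish settings; the ARM64 runtime identifier is one common choice for Lambda, not a requirement):

```xml
<!-- .csproj fragment: enable Native AOT publishing for a .NET 8 Lambda -->
<PropertyGroup>
  <TargetFramework>net8.0</TargetFramework>
  <PublishAot>true</PublishAot>
  <!-- Smaller binaries; trade-off: stack traces lose symbol detail -->
  <StripSymbols>true</StripSymbols>
  <!-- Publish for the Lambda execution environment's architecture -->
  <RuntimeIdentifier>linux-arm64</RuntimeIdentifier>
  <!-- Dropping ICU globalization data shrinks the binary further -->
  <InvariantGlobalization>true</InvariantGlobalization>
</PropertyGroup>
```

Trim warnings at publish time are the early signal that reflection-heavy code (or a dependency) needs rework before it will behave correctly under AOT.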

AOT Trade-offs

  • Reflection-heavy code needs rework
  • Build times increase significantly
  • Some NuGet packages don't support it
  • Stack traces are less useful—invest in better logging

Worth it for production workloads. Not worth it for quick prototypes.

The Memory/Cost Paradox

Lambda pricing is per GB-second. Intuitively, lower memory = lower cost. But Lambda allocates CPU proportionally to memory, so a 128MB function runs on a fraction of a vCPU while a 1769MB function gets a full vCPU.

The Math That Surprises People

A function that takes 1000ms at 128MB might take 100ms at 1024MB. Same work, comparable or even lower cost, and 10x better latency.

Memory     Duration    GB-seconds    Relative cost
128 MB     1000 ms     0.125         1.0x
512 MB     300 ms      0.15          1.2x
1024 MB    100 ms      0.1           0.8x ✓

The optimal memory setting depends on your workload. CPU-bound functions benefit from more memory. I/O-bound functions (waiting on DynamoDB, external APIs) often don't. Test with real traffic patterns, not synthetic benchmarks.
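The arithmetic behind those numbers is worth internalizing. A quick sketch of the GB-second formula (Python used purely for the arithmetic; the memory/duration pairs are the illustrative figures from the table above):

```python
# Lambda compute cost is proportional to GB-seconds:
#   GB-seconds = (memory_mb / 1024) * (duration_ms / 1000)
def gb_seconds(memory_mb: float, duration_ms: float) -> float:
    return (memory_mb / 1024) * (duration_ms / 1000)

baseline = gb_seconds(128, 1000)  # 0.125 GB-s per invocation
for mb, ms in [(128, 1000), (512, 300), (1024, 100)]:
    cost = gb_seconds(mb, ms)
    print(f"{mb:>5} MB  {ms:>5} ms  {cost:.3f} GB-s  {cost / baseline:.1f}x")
```

The counterintuitive part falls out of the formula: if extra CPU shrinks duration faster than memory grows, the bigger configuration is cheaper in absolute terms.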

Caching with ElastiCache

The fastest database call is the one you don't make. For data that's read frequently and changes infrequently, caching in ElastiCache (Valkey/Redis) dramatically reduces both latency and cost.

Good Cache Candidates

  • User profiles and permissions
  • Configuration and feature flags
  • Reference data (lookup tables)
  • Computed aggregations
  • Session data

Poor Cache Candidates

  • Frequently changing data
  • Data with strict consistency needs
  • Large objects (>1MB)
  • Rarely accessed data
  • Security-sensitive data

Cache Invalidation

The hard part isn't caching—it's knowing when to invalidate. TTL-based expiration works for most cases. For data that must be immediately consistent, use write-through patterns or event-driven invalidation via DynamoDB Streams or EventBridge.
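Both halves of that pattern fit in a few lines. A minimal in-process illustration of cache-aside with TTL plus event-driven invalidation (Python for brevity; in the real system the dictionary would be an ElastiCache Valkey/Redis client and the loader a DynamoDB read — both names here are placeholders):

```python
import time

_cache: dict[str, tuple[float, object]] = {}  # key -> (expires_at, value)

def get_with_cache(key: str, loader, ttl_seconds: float = 60.0):
    """Cache-aside: return the cached value if still fresh, else load and cache."""
    entry = _cache.get(key)
    if entry is not None and entry[0] > time.monotonic():
        return entry[1]                       # cache hit within TTL
    value = loader(key)                       # cache miss: hit the database
    _cache[key] = (time.monotonic() + ttl_seconds, value)
    return value

def invalidate(key: str) -> None:
    """Event-driven invalidation: call from a stream/event handler on writes."""
    _cache.pop(key, None)
```

A DynamoDB Streams or EventBridge consumer calling `invalidate` on every write is what keeps a generous TTL from serving stale data after an update.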

When Lambda Isn't the Answer

Lambda is great for event-driven, bursty workloads. It's not great for everything. Knowing when to reach for ECS or Fargate instead saves money and headaches.

Real Example: The API Router

For the Unified API Router, I chose ECS over Lambda despite Lambda being "serverless." Why?

  • Consistent latency required — cold starts on the critical path were unacceptable
  • Always-on traffic — steady request volume meant Lambda's per-invocation pricing was more expensive
  • Long-lived connections — the router maintains connection pools to backend services
  • Predictable performance — needed consistent behavior for SLA commitments
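The "always-on traffic" point is easy to sanity-check with back-of-envelope numbers. A rough sketch (Python for the arithmetic; the prices are illustrative us-east-1 list prices and the traffic profile is invented, so treat every constant as an assumption and verify against the current pricing pages):

```python
# Steady 500 req/s at 100 ms per request on 512 MB Lambda,
# compared against a month of always-on Fargate capacity.
LAMBDA_GB_SECOND = 0.0000166667      # assumed USD per GB-second
LAMBDA_REQUEST   = 0.20 / 1_000_000  # assumed USD per invocation
FARGATE_VCPU_HR  = 0.04048           # assumed USD per vCPU-hour
FARGATE_GB_HR    = 0.004445          # assumed USD per GB-hour

seconds_per_month = 30 * 24 * 3600
invocations = 500 * seconds_per_month

lambda_cost = invocations * (0.5 * 0.1 * LAMBDA_GB_SECOND + LAMBDA_REQUEST)
# Three 1 vCPU / 2 GB tasks for redundancy, running 24/7:
fargate_cost = 3 * (FARGATE_VCPU_HR * 1 + FARGATE_GB_HR * 2) * 24 * 30

print(f"Lambda:  ${lambda_cost:,.0f}/month")
print(f"Fargate: ${fargate_cost:,.0f}/month")
```

At this steady volume Lambda comes out roughly an order of magnitude more expensive; drop the average to a bursty few requests per second and the comparison reverses just as sharply, which is exactly the decision boundary the lists below describe.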

Lambda Wins

  • Bursty, unpredictable traffic
  • Event-driven processing
  • Low baseline with occasional spikes
  • Short-lived operations
  • Rapid scaling requirements

ECS/Fargate Wins

  • Steady, predictable traffic
  • Latency-sensitive workloads
  • Long-running processes
  • Connection pooling needed
  • Cost optimization at scale

Technology Stack

Lambda, .NET 8 AOT, ElastiCache (Valkey), DynamoDB, ECS Fargate, CloudWatch, X-Ray

Outcomes

Cold Starts Under 500ms

AOT compilation brings .NET Lambda cold starts from seconds to sub-second

Right-Sized Costs

Memory tuning and caching reduce per-invocation costs without sacrificing performance

Predictable Latency

Caching and compute selection eliminate latency surprises

Informed Trade-offs

Clear criteria for Lambda vs ECS decisions based on workload characteristics

Need Help Optimizing Your Serverless Workloads?

Let's review your Lambda functions and find the quick wins.

Get in Touch