
AWS Cost Optimization Strategies

Finding the 20-40% waste hiding in most AWS environments

Most AWS environments have significant waste—oversized instances, unused resources, suboptimal pricing models, and traffic patterns that cost more than they should. The good news: most of it is fixable without major architectural changes.

Here's where I typically find savings, roughly ordered from quick wins to longer-term optimizations.

Quick Wins: The Low-Hanging Fruit

Unused Elastic IPs

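AWS charges $0.005/hour (about $3.65/month) for every public IPv4 address, Elastic IPs included, whether or not it is attached to a running instance. An EIP that isn't associated with anything is pure waste, and a handful of forgotten ones adds up.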

Check: EC2 → Elastic IPs → look for "not associated"
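
The same check can be scripted. This is a minimal sketch, not a finished tool: it assumes the input is shaped like the `Addresses` list returned by boto3's `ec2.describe_addresses()`, where an associated address carries an `AssociationId` key. The helper names, the sample data, and the $0.005/hour rate are illustrative assumptions.

```python
# Assumed rate for a public IPv4 address; verify against current AWS pricing.
HOURLY_IPV4_RATE_USD = 0.005
HOURS_PER_MONTH = 730


def unassociated_eips(addresses):
    """Return Elastic IPs that have no AssociationId, i.e. are attached to nothing."""
    return [a for a in addresses if "AssociationId" not in a]


def monthly_cost(count):
    """Approximate monthly charge for `count` idle addresses."""
    return round(count * HOURLY_IPV4_RATE_USD * HOURS_PER_MONTH, 2)


# Hypothetical sample shaped like ec2.describe_addresses()["Addresses"].
sample = [
    {"PublicIp": "203.0.113.10", "AllocationId": "eipalloc-aaa", "AssociationId": "eipassoc-111"},
    {"PublicIp": "203.0.113.11", "AllocationId": "eipalloc-bbb"},
    {"PublicIp": "203.0.113.12", "AllocationId": "eipalloc-ccc"},
]

idle = unassociated_eips(sample)
print(f"{len(idle)} idle EIPs, about ${monthly_cost(len(idle)):.2f}/month")
```

In a real account you would pass in the live `describe_addresses()` result per region instead of the sample list.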

Unattached EBS Volumes

Volumes left behind after instance termination. You're paying for storage nobody's using.

Check: EC2 → Volumes → filter by "available" state
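
A rough way to put a dollar figure on this, sketched under assumptions: the input mimics the `Volumes` list from boto3's `ec2.describe_volumes()` (each entry has `State`, `Size` in GiB, and `VolumeType`), and the per-GiB prices are approximate us-east-1 list prices that you should replace with your own.

```python
# Assumed approximate prices in USD per GiB-month; adjust for your region.
ASSUMED_PRICE_PER_GIB_MONTH = {"gp3": 0.08, "gp2": 0.10}


def unattached_volumes(volumes):
    """Return volumes in the 'available' state (not attached to any instance)."""
    return [v for v in volumes if v["State"] == "available"]


def estimated_monthly_cost(volumes, default_rate=0.10):
    """Sum storage cost, using default_rate for volume types not in the price map."""
    total = sum(
        v["Size"] * ASSUMED_PRICE_PER_GIB_MONTH.get(v["VolumeType"], default_rate)
        for v in volumes
    )
    return round(total, 2)


# Hypothetical sample shaped like ec2.describe_volumes()["Volumes"].
sample_volumes = [
    {"VolumeId": "vol-aaa", "State": "available", "Size": 100, "VolumeType": "gp3"},
    {"VolumeId": "vol-bbb", "State": "available", "Size": 500, "VolumeType": "gp2"},
    {"VolumeId": "vol-ccc", "State": "in-use", "Size": 50, "VolumeType": "gp3"},
]

orphans = unattached_volumes(sample_volumes)
print([v["VolumeId"] for v in orphans], estimated_monthly_cost(orphans))
```

Snapshot a volume before deleting it if there is any doubt about its contents; the snapshot is usually much cheaper to keep than the volume.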

Old EBS Snapshots

Snapshots accumulate over time. Many are from instances that no longer exist or backups that exceed retention needs.

Check: EC2 → Snapshots → sort by age, review anything older than your retention policy
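
The age review can be sketched as a simple filter. The example assumes snapshot records shaped like the `Snapshots` list from boto3's `ec2.describe_snapshots(OwnerIds=["self"])`, where `StartTime` is a timezone-aware datetime. The 90-day retention value and the sample data are placeholders, not a recommendation.

```python
from datetime import datetime, timedelta, timezone


def snapshots_past_retention(snapshots, retention_days, now=None):
    """Return snapshots older than the retention window, oldest first."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    old = [s for s in snapshots if s["StartTime"] < cutoff]
    return sorted(old, key=lambda s: s["StartTime"])


# Hypothetical sample shaped like ec2.describe_snapshots()["Snapshots"].
sample_snapshots = [
    {"SnapshotId": "snap-new", "StartTime": datetime(2024, 12, 15, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-mid", "StartTime": datetime(2024, 9, 1, tzinfo=timezone.utc)},
    {"SnapshotId": "snap-old", "StartTime": datetime(2023, 6, 1, tzinfo=timezone.utc)},
]

fixed_now = datetime(2025, 1, 1, tzinfo=timezone.utc)
for snap in snapshots_past_retention(sample_snapshots, retention_days=90, now=fixed_now):
    print(snap["SnapshotId"], snap["StartTime"].date())
```

Treat the output as a review list rather than a delete list: snapshots that back a registered AMI cannot be deleted until the AMI is deregistered.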

Idle Load Balancers

ALBs and NLBs have hourly charges whether they're handling traffic or not. Dev/test environments often have load balancers that should be torn down.

Check: CloudWatch metrics for RequestCount = 0 (ALB) or ActiveFlowCount = 0 (NLB) over extended periods
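
A sketch of that check for ALBs, assuming you have already pulled daily `Sum` datapoints for `RequestCount` in the `AWS/ApplicationELB` namespace (for example with boto3's `cloudwatch.get_metric_statistics`). The dictionary layout and load balancer names are hypothetical. An empty datapoint list is treated as idle, since no datapoints usually means no requests were recorded.

```python
def total_requests(datapoints):
    """Add up the Sum statistic across all datapoints in the window."""
    return sum(dp.get("Sum", 0) for dp in datapoints)


def idle_load_balancers(datapoints_by_lb):
    """Return names of load balancers with zero requests over the whole window."""
    return sorted(
        name for name, dps in datapoints_by_lb.items() if total_requests(dps) == 0
    )


# Hypothetical 14-day results keyed by load balancer name.
sample_metrics = {
    "app/prod-web/abc123": [{"Sum": 1200.0}, {"Sum": 900.0}],
    "app/dev-test/def456": [],
    "app/staging/ghi789": [{"Sum": 0.0}, {"Sum": 0.0}],
}

print(idle_load_balancers(sample_metrics))
```

Pick a window long enough to cover monthly or quarterly jobs before tearing anything down.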

Right-Sizing Compute

Most instances I look at average 10-30% CPU utilization. That's money left on the table.

EC2 Right-Sizing

  • Use AWS Compute Optimizer recommendations
  • Review CloudWatch CPU metrics, plus memory metrics via the CloudWatch agent (EC2 doesn't report memory by default)
  • Consider Graviton (ARM) instances, which are often around 20% cheaper
  • Match instance family to workload (compute vs memory optimized)

ECS/Fargate Right-Sizing

  • Review task CPU/memory reservations vs actual usage
  • Container Insights shows actual utilization
  • Over-provisioned tasks are common, so start smaller
  • Use Fargate Spot for fault-tolerant workloads

The Catch

Right-sizing requires understanding your workload patterns. A server that's 10% utilized most of the time but spikes to 80% during batch jobs needs headroom. Look at P95/P99 metrics, not just averages.
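
To make the average-versus-percentile point concrete, here is a small sketch using only the standard library. The CPU samples are invented: 90% of readings at 10% utilization and 10% of readings at 80%, roughly what a nightly batch job might look like.

```python
import statistics


def cpu_summary(samples):
    """Summarize CPU samples with the average and the high percentiles."""
    # quantiles(n=100) returns 99 cut points; index 94 is P95, index 98 is P99.
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return {
        "avg": round(statistics.fmean(samples), 1),
        "p95": round(cuts[94], 1),
        "p99": round(cuts[98], 1),
        "max": max(samples),
    }


# Invented workload: mostly idle, with a batch spike.
samples = [10.0] * 90 + [80.0] * 10

summary = cpu_summary(samples)
print(summary)
```

The average here is 17%, which suggests halving the instance twice over, while P95 sits at 80%, which says the current size is about right during the batch window. Scheduling or moving the batch job may save more than downsizing.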

Commitment Discounts

Once you've right-sized, commit to what you're actually using. AWS rewards commitment with significant discounts.

Compute Savings Plans

Commit to a $/hour spend on compute (EC2, Fargate, Lambda) for a 1- or 3-year term. The discount follows your usage across instance families, sizes, regions, and operating systems. Savings run up to 66% off on-demand.

Best for: Most workloads. Start with your baseline steady-state usage.
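
One way to turn "baseline steady-state usage" into a number is sketched below. It is an assumption-heavy illustration, not AWS's recommendation logic: it takes hourly on-demand-equivalent compute spend, picks a low percentile as the baseline, and converts it to a commitment. The conversion matters because a Savings Plan commitment is expressed in discounted dollars per hour. The 10th percentile and the 30% discount are placeholders.

```python
def baseline_commitment(hourly_on_demand_spend, assumed_discount, percentile=10):
    """Suggest an hourly commitment covering the given percentile of hourly spend.

    hourly_on_demand_spend: on-demand-equivalent USD per hour, one value per hour.
    assumed_discount: expected Savings Plan discount as a fraction (0.30 = 30%).
    """
    ordered = sorted(hourly_on_demand_spend)
    index = (len(ordered) - 1) * percentile // 100
    covered_on_demand = ordered[index]
    # The commitment is paid at the discounted rate, not the on-demand rate.
    return round(covered_on_demand * (1 - assumed_discount), 2)


# Invented day: $4/hour overnight, $10/hour during business hours.
hourly_spend = [4.0] * 16 + [10.0] * 8

print(baseline_commitment(hourly_spend, assumed_discount=0.30))
```

Committing to the floor rather than the average means the plan stays fully used even in quiet hours; anything above it keeps running at on-demand rates until you are confident enough to add a second plan.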

EC2 Instance Savings Plans

Commit to a specific instance family in a single region. Less flexible than Compute Savings Plans, but the discount is deeper (up to 72%).

Best for: Stable workloads where you know the instance family won't change.

Reserved Instances

The original commitment model. Still useful for RDS, ElastiCache, OpenSearch, and Redshift where Savings Plans don't apply.

Best for: Databases and caches with predictable, steady usage.

ElastiCache Reserved Nodes

If you're running ElastiCache (Redis/Valkey) 24/7, reserved nodes can save 30-50% over on-demand.

Best for: Production caches that run continuously.

CloudFront Security Savings Bundle

Commit to a monthly CloudFront spend for one year and get up to 30% off that CloudFront usage, plus AWS WAF credits worth up to 10% of the commitment. Good if you're using both anyway.

Best for: Workloads already using CloudFront + WAF together.

Network Cost Optimization

Data transfer charges are often the surprise line item on AWS bills. The key principle: keep traffic inside AWS, and ideally inside the same region.

Data Transfer Cost Hierarchy

Traffic path              Cost
Same AZ                   Free
Cross-AZ (same region)    $0.01/GB each way
Cross-region              $0.02/GB+
To internet               $0.09/GB (first 10 TB)
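
The rates above are easy to turn into a rough monthly estimate. This sketch hard-codes them as assumptions (they vary by region and volume tier), and the traffic volumes are invented. Cross-AZ is priced at $0.02/GB here because each gigabyte is billed $0.01 leaving one AZ and $0.01 entering the other.

```python
# Assumed per-GB rates in USD, taken from the hierarchy above.
RATE_PER_GB = {
    "same_az": 0.00,
    "cross_az": 0.02,      # $0.01 out of one AZ plus $0.01 into the other
    "cross_region": 0.02,  # lower bound; some region pairs cost more
    "internet": 0.09,      # first 10 TB tier only
}


def monthly_transfer_cost(gb_by_path):
    """Return a per-path cost breakdown plus a total, in USD."""
    breakdown = {
        path: round(gb * RATE_PER_GB[path], 2) for path, gb in gb_by_path.items()
    }
    breakdown["total"] = round(sum(breakdown.values()), 2)
    return breakdown


# Invented monthly traffic in GB.
usage = {"same_az": 20000, "cross_az": 5000, "cross_region": 2000, "internet": 1000}

print(monthly_transfer_cost(usage))
```

In this made-up example, chatty cross-AZ traffic costs more than internet egress, which is a pattern worth looking for in the Cost Explorer data transfer breakdown.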

VPC Endpoints

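Traffic to S3, DynamoDB, and other AWS services can go through VPC endpoints instead of out through a NAT gateway or the internet. That avoids NAT gateway data processing charges and keeps the traffic on the AWS network. Gateway endpoints (S3 and DynamoDB) are free; interface endpoints carry an hourly and per-GB charge, so check that the volume justifies them.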
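
A back-of-the-envelope sketch of the endpoint math, assuming the traffic currently leaves through a NAT gateway. All prices are assumptions based on approximate us-east-1 list prices: $0.045/GB NAT data processing, and $0.01/hour per AZ plus $0.01/GB for an interface endpoint. Gateway endpoints for S3 and DynamoDB are assumed to be free.

```python
HOURS_PER_MONTH = 730
# Assumed prices in USD; check the pricing pages for your region.
NAT_PROCESSING_PER_GB = 0.045
INTERFACE_ENDPOINT_PER_HOUR = 0.01  # per endpoint, per AZ
INTERFACE_ENDPOINT_PER_GB = 0.01


def gateway_endpoint_saving(gb_per_month):
    """Monthly NAT processing cost avoided by a free S3/DynamoDB gateway endpoint."""
    return round(gb_per_month * NAT_PROCESSING_PER_GB, 2)


def interface_endpoint_breakeven_gb(az_count):
    """Monthly GB above which an interface endpoint is cheaper than the NAT path."""
    fixed_cost = INTERFACE_ENDPOINT_PER_HOUR * HOURS_PER_MONTH * az_count
    saving_per_gb = NAT_PROCESSING_PER_GB - INTERFACE_ENDPOINT_PER_GB
    return fixed_cost / saving_per_gb


print(gateway_endpoint_saving(10_000))
print(round(interface_endpoint_breakeven_gb(az_count=2)))
```

Under these assumptions a gateway endpoint pays off from the first gigabyte, while an interface endpoint deployed in two AZs needs a few hundred GB a month per service before it beats the NAT gateway on cost alone.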

Same-Region Architecture

Keep services that talk to each other in the same region. Cross-region replication has its place, but don't do it by accident.

CloudFront for Egress

CloudFront data transfer is cheaper than direct EC2/S3 egress. For high-traffic APIs, putting CloudFront in front can reduce costs even without caching.

S3 Storage Tiering

S3 Standard is the default, but most data doesn't need instant access forever.

Storage savings are approximate, relative to S3 Standard at us-east-1 list prices.

Tier                    Use Case                                   Savings
Intelligent-Tiering     Unknown access patterns                    Automatic, ~40%
Infrequent Access       Accessed < 1x/month                        ~45%
Glacier Instant         Rarely accessed, need instant retrieval    ~83% (~68% vs Infrequent Access)
Glacier Flexible        Archives, minutes-hours retrieval OK       ~84%
Glacier Deep Archive    Compliance archives, 12-hour retrieval     ~95%

Lifecycle Policies

Set up lifecycle policies to automatically transition objects to cheaper tiers based on age. Most logs, for example, can move to IA after 30 days and Glacier after 90.
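
The log example maps onto a lifecycle configuration like the one sketched here. The structure follows what boto3's `s3.put_bucket_lifecycle_configuration` expects; the helper name, rule ID, and `logs/` prefix are hypothetical. The 30-day check reflects S3's minimum object age for transitions to Standard-IA, and the ordering check is just a sanity guard added for this example.

```python
def log_lifecycle_config(prefix, ia_days=30, glacier_days=90):
    """Build a lifecycle configuration: Standard -> Standard-IA -> Glacier Flexible."""
    if ia_days < 30:
        raise ValueError("objects must be at least 30 days old to move to STANDARD_IA")
    if glacier_days <= ia_days:
        raise ValueError("glacier_days must be greater than ia_days")
    return {
        "Rules": [
            {
                "ID": f"tier-{prefix.strip('/')}",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = log_lifecycle_config("logs/")
print(config)

# Applying it would look roughly like this (bucket name is a placeholder):
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-log-bucket", LifecycleConfiguration=config
# )
```

Watch out for buckets full of tiny objects: IA and Glacier tiers have minimum billable sizes and minimum storage durations, so transitions can cost more than they save for small, short-lived files.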

Tools & Services

Cost Explorer, Compute Optimizer, Trusted Advisor, Savings Plans, Reserved Instances, S3 Intelligent-Tiering, VPC Endpoints, CloudWatch

Want to Find the Waste in Your AWS Bill?

Get a free 25-minute assessment and I'll identify 3-5 quick wins.
