Cost Optimization on AWS for Java Workloads
Cost awareness is a feature. Treat it like performance: measure, set targets, iterate.
Quick Wins
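- Migrate to Graviton-based instances for up to 40% better price/performance (AWS's figure; benchmark your own workload)
- Right-size first, then autoscale with target tracking; ALB request count per target or CPU usually tracks capacity better than raw ALB latency
- Use EBS gp3 volumes; delete unattached EBS volumes and stale snapshots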
JVM Tuning Essentials
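- Prefer G1 for general APIs, or Generational ZGC (JDK 21+) when tail latency matters
- Set a container-aware heap (UseContainerSupport is on by default since JDK 10; MaxRAMPercentage does the actual sizing):
    -XX:+UseContainerSupport -XX:MaxRAMPercentage=70
- Emit structured logs; sample traces instead of full capture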
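To sanity-check the heap flags, it can help to compare the heap you expect with what the JVM actually reports at startup. The following is a minimal sketch, not a definitive tool: the class name HeapCheck, the 2 GiB container limit, and the 70% figure are assumptions for illustration. Runtime.maxMemory() can report slightly less than the configured maximum with some collectors, so treat the comparison as approximate.

```java
import java.util.Locale;

class HeapCheck {
    /** Expected max heap in bytes for a container memory limit and a MaxRAMPercentage value. */
    static long expectedMaxHeapBytes(long containerLimitBytes, double maxRamPercentage) {
        if (maxRamPercentage <= 0 || maxRamPercentage > 100) {
            throw new IllegalArgumentException("percentage must be in (0, 100]");
        }
        return (long) (containerLimitBytes * (maxRamPercentage / 100.0));
    }

    public static void main(String[] args) {
        // Hypothetical container limit of 2 GiB; substitute your task or pod memory limit.
        long limitBytes = 2L * 1024 * 1024 * 1024;
        long expected = expectedMaxHeapBytes(limitBytes, 70.0);
        // What the running JVM believes its maximum heap is.
        long actual = Runtime.getRuntime().maxMemory();
        System.out.printf(Locale.ROOT, "expected about %d MiB, JVM reports %d MiB%n",
                expected >> 20, actual >> 20);
    }
}
```

If the reported value is far below the expected one, the flags are probably not reaching the JVM (for example, JAVA_TOOL_OPTIONS is not set in the container environment).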
Purchasing Options
- Savings Plans for steady-state compute
- Spot for stateless batch and non-prod
- S3 lifecycle rules and Intelligent-Tiering for object storage
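One way to reason about how much steady-state compute to cover with a Savings Plan is to take a low percentile of hourly on-demand spend and treat it as the candidate commitment. This is a heuristic sketch under stated assumptions, not AWS's recommendation algorithm: the class name BaselineCommit, the sample data, and the choice of the 10th percentile are all hypothetical.

```java
import java.util.Arrays;

class BaselineCommit {
    /** Nearest-rank percentile of the samples, for 0 < p <= 100. */
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) throw new IllegalArgumentException("no samples");
        if (p <= 0 || p > 100) throw new IllegalArgumentException("p must be in (0, 100]");
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Nearest-rank: smallest value with at least p% of samples at or below it.
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical hourly on-demand compute spend in USD.
        double[] hourlySpend = {4.0, 4.2, 4.1, 9.5, 12.0, 4.0, 4.3, 8.8, 4.1, 4.0};
        double candidate = percentile(hourlySpend, 10);
        System.out.printf("candidate hourly commitment: %.2f USD%n", candidate);
    }
}
```

A low percentile keeps the commitment below almost every observed hour, so spikes stay on on-demand or Spot capacity instead of being over-committed.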
Regular cost reviews prevent drift and keep performance predictable.
A Repeatable Monthly Cost Review (2 hours)
- Pull Costs by Service and Tag
  - Use Cost Explorer, or query the CUR with Athena, to list spend by Service, LinkedAccount, Tag:env, Tag:app
  - Identify the top 5 services by spend (typically EC2/ECS/EKS, RDS, S3, NAT, Data Transfer)
- Compute: Rightsize and Purchase Options
  - Compare CPU/memory utilization against instance sizing
  - Move to Graviton where supported (Java 17/21 runs well on arm64; check that any JNI or other native dependencies ship arm64 builds)
  - Commit Savings Plans for the baseline; use Spot for non-prod/batch
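- Java Runtime Tuning
  - Recommended container-aware flags (AlwaysActAsServerClassMachine keeps the JVM from defaulting to the Serial collector in small containers):
      JAVA_TOOL_OPTIONS="-XX:+UseContainerSupport -XX:MaxRAMPercentage=70 -XX:+AlwaysActAsServerClassMachine"
  - Prefer G1 for general APIs, or Generational ZGC (JDK 21+) when tail latency matters
  - Emit metrics to Prometheus and watch GC pause times against throughput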
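- Storage and Data
  - Switch EBS to gp3 and set IOPS and throughput explicitly (the gp3 baseline is 3,000 IOPS and 125 MiB/s)
  - Use S3 lifecycle policies for known access patterns; use Intelligent-Tiering for objects with unknown or changing access patterns
  - RDS: enable storage autoscaling; review instance class quarterly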
- Networking
  - Evaluate NAT costs; route traffic to AWS services (e.g., S3, DynamoDB, ECR) through VPC endpoints where possible
  - Cache outbound calls; batch and compress payloads
- CI/CD Hygiene
  - Delete unused ECR images and old task definitions
  - Tear down ephemeral environments automatically (TTL)
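The first review step (spend grouped by service or tag, then the top few) can be sketched in a few lines once the cost data has been exported. This is an illustrative sketch only: the LineItem record and its field names are hypothetical and do not mirror the CUR schema, and the sample figures are made up.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.stream.Collectors;

class SpendReport {
    /** One exported cost row; field names are illustrative, not the CUR schema. */
    record LineItem(String service, String env, String app, double usd) {}

    /** Sums spend per key (service, env, app, ...) and returns the n largest, highest first. */
    static List<Map.Entry<String, Double>> topBy(
            List<LineItem> items, Function<LineItem, String> key, int n) {
        Map<String, Double> totals = items.stream()
                .collect(Collectors.groupingBy(key, Collectors.summingDouble(LineItem::usd)));
        return totals.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue(Comparator.reverseOrder()))
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical month of spend.
        List<LineItem> items = List.of(
                new LineItem("EC2", "prod", "api", 1200.0),
                new LineItem("EC2", "dev", "api", 300.0),
                new LineItem("RDS", "prod", "api", 800.0),
                new LineItem("NAT", "prod", "shared", 450.0),
                new LineItem("S3", "prod", "reports", 120.0));
        System.out.println("Top services:");
        topBy(items, LineItem::service, 5)
                .forEach(e -> System.out.println("  " + e.getKey() + " " + e.getValue()));
        System.out.println("By environment:");
        topBy(items, LineItem::env, 5)
                .forEach(e -> System.out.println("  " + e.getKey() + " " + e.getValue()));
    }
}
```

Passing the grouping key as a function keeps one aggregation routine for every dimension in the review (service, account, env tag, app tag).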
Cost Guardrails to Automate
- AWS Budgets alerts at 50%, 80%, and 100% of the monthly budget
- CloudWatch alarms on unusual spikes (e.g., NAT bytes, S3 GETs)
- Weekly report: cost per request, per environment, per team
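The threshold and unit-cost arithmetic behind these guardrails is simple enough to sketch. The example below is an assumption-laden illustration, not a replacement for AWS Budgets: the class name Guardrails, the dollar amounts, and the request count are hypothetical, and cost is reported per 1,000 requests to keep the numbers readable.

```java
import java.util.ArrayList;
import java.util.List;

class Guardrails {
    /** Alert thresholds as a percentage of the monthly budget. */
    static final int[] THRESHOLDS_PERCENT = {50, 80, 100};

    /** Returns the thresholds (in percent) that month-to-date spend has reached. */
    static List<Integer> crossedThresholds(double spendUsd, double budgetUsd) {
        if (budgetUsd <= 0) throw new IllegalArgumentException("budget must be positive");
        List<Integer> crossed = new ArrayList<>();
        for (int t : THRESHOLDS_PERCENT) {
            if (spendUsd >= budgetUsd * t / 100.0) crossed.add(t);
        }
        return crossed;
    }

    /** Cost per 1,000 requests; returns 0 when there was no traffic. */
    static double costPerThousandRequests(double spendUsd, long requests) {
        return requests == 0 ? 0.0 : spendUsd / requests * 1000.0;
    }

    public static void main(String[] args) {
        // Hypothetical month-to-date spend, budget, and request volume.
        System.out.println("thresholds reached: " + crossedThresholds(850.0, 1000.0));
        System.out.println("USD per 1k requests: " + costPerThousandRequests(850.0, 34_000_000L));
    }
}
```

Tracking cost per request alongside absolute spend separates growth-driven increases from genuine efficiency regressions.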