The Bill That Did Not Match the Performance
A SaaS company's cloud bill grew 65 percent year over year while customer-facing performance metrics improved by 4 percent. The engineering team had been making infrastructure decisions one workload at a time, each one favoring performance over cost, each one defensible in isolation. The aggregate was a cloud bill that no longer matched the value being produced.
Flexera's 2024 State of the Cloud report found 32 percent of cloud spend is waste, with most enterprises significantly underestimating their own waste percentage (Flexera, "2024 State of the Cloud Report," 2024). The number has been roughly stable for five years. The underlying pattern is that cost and performance look like a single tradeoff but are actually three independent tradeoffs.
If your cloud bill is growing faster than your customer-facing performance is improving, the tradeoff curve has been moving in the wrong direction. Three levers, used in the right order, move it back.
Coined Frame: The Three Levers
Cloud cost-performance is not one curve. It is three independent curves on compute, storage, and network. Each one has its own optimization moves.
Lever 1 - Compute. CPU, GPU, memory. The largest line item for most workloads and the place where the largest savings usually live. Reserved instances, spot pricing, right-sizing, autoscaling, ARM-based instances (AWS Graviton, Azure Cobalt, GCP Axion) all move the compute curve. Most enterprises have pulled some of these levers and few have pulled all of them.
Lever 2 - Storage. Block, object, archive, database storage. Storage costs grow continuously because data accumulates. Tier management (hot, warm, cold), lifecycle policies, compression, deduplication, and retention enforcement move the storage curve. The biggest gains usually come from deleting or archiving data that should not have been kept on hot tiers.
Lever 3 - Network. Egress, cross-region transfer, NAT gateway charges, load balancer hours. Network costs hide in many line items and together account for 15-25 percent of total cloud spend for most enterprises. Moving compute closer to storage, consolidating regions, and reducing unnecessary cross-region replication all move the network curve.
A team that optimizes one lever well and ignores the others typically captures 20-30 percent of available savings. A team that addresses all three captures 50-70 percent.
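The arithmetic behind addressing one lever versus all three can be sketched as a spend-weighted sum. This is a rough illustration only: the bill composition and the per-lever reductions below are hypothetical inputs, not figures from any report, and the result is the reduction in the total bill rather than the share of available savings captured.

```python
def blended_reduction(spend_share, lever_reduction):
    """Overall bill reduction as the spend-weighted sum of per-lever reductions."""
    if abs(sum(spend_share.values()) - 1.0) > 1e-9:
        raise ValueError("spend shares must sum to 1.0")
    return sum(spend_share[k] * lever_reduction.get(k, 0.0) for k in spend_share)


# Hypothetical bill composition (fractions of total spend).
share = {"compute": 0.60, "storage": 0.20, "network": 0.20}

# Compute lever only, versus all three levers (reductions are illustrative).
compute_only = blended_reduction(share, {"compute": 0.40})
all_three = blended_reduction(share, {"compute": 0.40, "storage": 0.30, "network": 0.30})

print(f"compute only: {compute_only:.0%}, all three: {all_three:.0%}")
```

Under these assumed inputs, ignoring storage and network leaves a third of the achievable reduction on the table, even though compute is the largest line item.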
What Actually Moves the Compute Curve
Six moves capture most of the compute savings available.
Right-sizing. Most instances run at 20-40 percent utilization. Right-sizing tools (AWS Compute Optimizer, Azure Advisor, GCP Recommender) identify oversized instances and recommend smaller alternatives. The savings are mechanical once the team commits to acting on the recommendations.
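A minimal sketch of a right-sizing check, assuming utilization figures have already been exported from a monitoring tool. The instance names, the thresholds, and the "halve the vCPUs" rule are illustrative assumptions; this is not the logic any vendor tool uses.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str            # hypothetical instance identifier
    vcpus: int
    avg_cpu_pct: float   # average CPU utilization over the review window
    peak_cpu_pct: float  # peak CPU utilization over the same window


def rightsizing_candidates(instances, avg_threshold=40.0, peak_threshold=80.0):
    """Return (name, current_vcpus, suggested_vcpus) for oversized instances.

    An instance is flagged when its average utilization is below
    avg_threshold and its projected peak after halving the vCPU count
    stays below peak_threshold.
    """
    candidates = []
    for inst in instances:
        if inst.vcpus < 2:
            continue  # nothing smaller to move to in this simple model
        projected_peak = inst.peak_cpu_pct * 2  # halving vCPUs roughly doubles utilization
        if inst.avg_cpu_pct < avg_threshold and projected_peak < peak_threshold:
            candidates.append((inst.name, inst.vcpus, inst.vcpus // 2))
    return candidates


fleet = [
    Instance("api-1", 8, 22.0, 35.0),
    Instance("batch-1", 16, 30.0, 70.0),
    Instance("db-1", 4, 65.0, 90.0),
]
result = rightsizing_candidates(fleet)
print(result)
```

Checking the projected peak as well as the average keeps the sketch from recommending a downsize for a workload like the hypothetical batch-1, which idles on average but needs its headroom at peak.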
Reserved capacity. Reserved instances, savings plans, committed use discounts. For steady workloads with predictable utilization, reserved capacity saves 30-60 percent over on-demand pricing. Most enterprises under-buy reserved capacity.
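The reason reserved capacity suits steady workloads can be sketched with a break-even calculation. The assumption here is a simple commitment that bills every hour of the term at a discounted rate whether or not the capacity is used; the hourly price and the discount are hypothetical.

```python
def break_even_utilization(discount):
    """Fraction of the term a workload must run for a commitment to match on-demand cost."""
    return 1 - discount


def reserved_vs_on_demand(on_demand_hourly, discount, hours_used, hours_in_term):
    """Return (reserved_cost, on_demand_cost) for one instance over the term.

    Reserved capacity is billed for every hour in the term at the discounted
    rate; on-demand is billed only for the hours actually used.
    """
    reserved_cost = on_demand_hourly * (1 - discount) * hours_in_term
    on_demand_cost = on_demand_hourly * hours_used
    return reserved_cost, on_demand_cost


HOURS_PER_YEAR = 8760

# Hypothetical $0.10/hour instance with a 40 percent commitment discount.
steady = reserved_vs_on_demand(0.10, 0.40, hours_used=8760, hours_in_term=HOURS_PER_YEAR)
bursty = reserved_vs_on_demand(0.10, 0.40, hours_used=3000, hours_in_term=HOURS_PER_YEAR)

print(f"break-even utilization: {break_even_utilization(0.40):.0%}")
print(f"steady: reserved ${steady[0]:.2f} vs on-demand ${steady[1]:.2f}")
print(f"bursty: reserved ${bursty[0]:.2f} vs on-demand ${bursty[1]:.2f}")
```

In this model a 40 percent discount pays off only when the workload runs more than 60 percent of the term, which is why the commitment helps the steady case and hurts the bursty one.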
Spot capacity. Spot instances are priced 60-90 percent below on-demand in exchange for interruption risk. Batch workloads, dev environments, and stateless compute can run on spot if the architecture tolerates interruption through checkpointing, retries, and graceful shutdown.
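Interruptions are not free, so a fair comparison includes the work that has to be repeated. The sketch below is one simple way to model that; the rework fraction is a modelling assumption that depends on how often the job checkpoints, and the prices are hypothetical.

```python
def spot_effective_cost(on_demand_hourly, spot_discount, job_hours, rework_fraction):
    """Cost of a batch job on spot, including hours repeated after interruptions.

    rework_fraction is the share of work redone because of interruptions
    (0.10 means 10 percent extra hours).
    """
    spot_hourly = on_demand_hourly * (1 - spot_discount)
    return spot_hourly * job_hours * (1 + rework_fraction)


# A 100-hour job at a hypothetical $1.00/hour on-demand price.
on_demand = 1.00 * 100
spot = spot_effective_cost(1.00, spot_discount=0.70, job_hours=100, rework_fraction=0.10)

print(f"on-demand ${on_demand:.2f} vs spot ${spot:.2f}")
```

Even with 10 percent of the work repeated, the assumed 70 percent discount leaves the spot run at roughly a third of the on-demand cost.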
ARM-based instances. AWS Graviton, Azure Cobalt, and GCP Axion offer 20-40 percent better price-performance than equivalent x86 instances for workloads that run on ARM. Most major languages and runtimes (Node, Python, Go, Java) support ARM natively; compiled code and native dependencies need to be rebuilt for the ARM architecture. The migration cost is small for cloud-native workloads.
Autoscaling. Workloads that scale down at trough and up at peak pay only for what they use. Most enterprises have basic autoscaling and few have tuned it well enough to capture the available savings.
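The gap between peak-sized fixed capacity and capacity that follows demand can be sketched with instance-hours. The demand profile and hourly rate are hypothetical, and the model assumes instant scaling and per-instance-hour billing, so real savings will be somewhat lower.

```python
def fixed_vs_autoscaled(hourly_demand, hourly_rate, min_instances=1):
    """Return (fixed_cost, scaled_cost) for a period.

    hourly_demand: number of instances needed in each hour of the period.
    Fixed capacity is sized for the peak hour; autoscaled capacity follows
    demand but never drops below min_instances.
    """
    peak = max(hourly_demand)
    fixed_cost = peak * len(hourly_demand) * hourly_rate
    scaled_cost = sum(max(d, min_instances) for d in hourly_demand) * hourly_rate
    return fixed_cost, scaled_cost


# Hypothetical 24-hour profile: quiet overnight, busy mid-day, moderate evening.
demand = [2] * 8 + [10] * 8 + [4] * 8
fixed, scaled = fixed_vs_autoscaled(demand, hourly_rate=0.5)

print(f"fixed ${fixed:.2f}/day vs autoscaled ${scaled:.2f}/day")
```

The wider the gap between trough and peak, the more a tuned scaling policy is worth; a flat demand profile gains nothing from it.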
Workload consolidation. Multiple workloads sharing infrastructure (container platforms, serverless) get better utilization than dedicated infrastructure per workload. Migration to consolidated platforms produces structural savings.
The teams that make all six moves typically reduce compute spend by 35-55 percent. The teams that make two or three see 15-25 percent.
What Moves the Storage Curve
Four moves capture most of the storage savings.
Tier management. Hot storage costs 5-10x cold storage. Most enterprises keep too much data on hot tiers because tier management was never built into the ingestion process.
Lifecycle policies. Automated movement of data from hot to warm to cold to archive based on age. The policies have to be designed once and then enforce themselves.
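An age-based lifecycle rule and its effect on the bill can be sketched as follows. The age cutoffs, tier names, and relative prices are illustrative assumptions (hot is set to 5x cold), not any provider's actual tiers or rates.

```python
def tier_for_age(age_days, rules=((30, "hot"), (90, "warm"), (365, "cold"))):
    """Return the storage tier for an object of the given age.

    rules is an ordered sequence of (max_age_days, tier); anything older
    than the last cutoff goes to "archive".
    """
    for max_age, tier in rules:
        if age_days <= max_age:
            return tier
    return "archive"


def monthly_cost(objects, price_per_gb):
    """Sum monthly storage cost for (age_days, size_gb) pairs by assigned tier."""
    return sum(size_gb * price_per_gb[tier_for_age(age)] for age, size_gb in objects)


# Relative prices per GB-month (hypothetical units).
prices = {"hot": 1.0, "warm": 0.5, "cold": 0.2, "archive": 0.05}

# (age_days, size_gb) for a hypothetical data set.
data = [(10, 100), (60, 200), (200, 300), (800, 400)]

tiered = monthly_cost(data, prices)
all_hot = sum(size for _, size in data) * prices["hot"]

print(f"all hot: {all_hot:.0f}, tiered: {tiered:.0f}")
```

In this example most of the bytes are old, so the same data costs well under a third as much once the policy assigns tiers by age.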
Retention enforcement. Most enterprises store data longer than they need to. Regulatory retention requirements are real; data hoarding beyond them is voluntary cost.
Compression and deduplication. Especially for object storage, modern compression algorithms reduce stored bytes 30-60 percent with minimal access overhead.
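A quick way to estimate what compression would save is to measure it on a sample. The sketch below uses Python's standard zlib module on synthetic log records; the records are invented and highly repetitive, so the ratio it prints will be better than most real data and should not be read as a benchmark.

```python
import json
import zlib

# Synthetic, repetitive log records stand in for a sample of real object data.
records = [
    {"ts": 1700000000 + i, "level": "INFO", "service": "checkout", "msg": "request ok"}
    for i in range(1000)
]
raw = "\n".join(json.dumps(r) for r in records).encode("utf-8")

compressed = zlib.compress(raw, level=6)
saved = 1 - len(compressed) / len(raw)

# Round-trip check: compression must be lossless.
assert zlib.decompress(compressed) == raw

print(f"{len(raw)} -> {len(compressed)} bytes ({saved:.0%} saved)")
```

Running the same measurement on a representative sample of production objects gives a far more reliable estimate than any general percentage.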
What Moves the Network Curve
Three moves capture most of the network savings.
Co-location of compute and storage. The biggest network cost driver is unnecessary cross-region or cross-AZ traffic. Architectural decisions that keep compute and storage co-located eliminate this cost line.
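The cost of traffic that crosses a zone or region boundary can be estimated with simple arithmetic. The per-GB rate and the traffic volume below are placeholders to be replaced with values from the provider's current price list; whether both directions are billed also varies by provider and traffic type.

```python
def transfer_cost(gb_per_month, rate_per_gb, charged_both_directions=True):
    """Monthly cost of data transfer across a zone or region boundary.

    Some providers bill the sender and the receiver separately, which
    doubles the effective rate; charged_both_directions models that.
    """
    multiplier = 2 if charged_both_directions else 1
    return gb_per_month * rate_per_gb * multiplier


# Hypothetical: 50 TB/month between a service and its database in another zone.
before = transfer_cost(50_000, rate_per_gb=0.01)

# After co-locating the two in the same zone, the chargeable volume drops to zero.
after = transfer_cost(0, rate_per_gb=0.01)

print(f"before ${before:.2f}/month, after ${after:.2f}/month")
```

The per-GB rate looks negligible in isolation, which is how this line item grows unnoticed until the volume is multiplied through.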
NAT gateway consolidation. Most enterprises have more NAT gateways than they need. Consolidation through VPC architecture refresh captures meaningful savings.
Egress optimization. Use a CDN for traffic that leaves the cloud, and private connectivity (AWS PrivateLink, Azure Private Link, GCP Private Service Connect) for partner integrations to avoid public egress charges.
The Performance Side of the Curve
Cost optimization that degrades performance does not survive the next budget review. The performance bar has to be maintained or improved while costs come down.
The teams that do this well measure performance continuously through the optimization work, set explicit thresholds that cannot be crossed, and roll back changes that violate the thresholds. Performance regression is treated as a release blocker, the same way functional regression is.
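An explicit threshold with a rollback decision can be sketched as a small check on latency samples. The 5 percent allowed regression, the p95 metric, and the sample data are assumptions chosen for illustration; a real program would pick the metrics and limits that match its own service-level objectives.

```python
def p95(samples):
    """95th percentile using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (95 * len(ordered) + 99) // 100  # integer ceil(0.95 * n)
    return ordered[rank - 1]


def should_roll_back(baseline_ms, candidate_ms, max_regression=0.05):
    """True when candidate p95 latency exceeds baseline p95 by more than max_regression."""
    return p95(candidate_ms) > p95(baseline_ms) * (1 + max_regression)


# Hypothetical latency samples in milliseconds.
baseline = list(range(100, 200))              # before the change
after_rightsizing = [x + 5 for x in baseline]  # small shift, within threshold
after_bad_change = [x + 20 for x in baseline]  # larger shift, violates threshold

print(should_roll_back(baseline, after_rightsizing))
print(should_roll_back(baseline, after_bad_change))
```

Wiring a check like this into the deployment pipeline is what turns a performance threshold from a guideline into a release blocker.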
The pattern that works best is making performance improvements and cost improvements together rather than treating them as opposites. Right-sizing typically improves performance by removing oversized infrastructure with poor utilization patterns. Tier management often improves storage performance by moving cold data off systems that were not designed to hold it. The framing of cost versus performance is usually a false dichotomy.
What This Costs
A serious cost-performance optimization program for a mid-market enterprise typically requires one senior cloud engineer for one quarter for the initial audit and optimization push, plus 10-20 percent ongoing capacity for sustained operations.
The savings depend on starting state. Teams that have done little optimization typically capture a 30-50 percent reduction in one quarter. Teams that have already done basic optimization typically capture an additional 10-20 percent. The math justifies the work at almost any scale above $50K per month in cloud spend.
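The payback math can be sketched in a few lines. The one-time and ongoing cost figures below are hypothetical placeholders for the loaded cost of the engineering time described above; only the $50K monthly spend and the 30 percent reduction come from the ranges in this section.

```python
def payback_months(monthly_spend, reduction, one_time_cost, ongoing_monthly_cost=0.0):
    """Months until cumulative net savings cover the one-time program cost.

    Returns None when the ongoing cost meets or exceeds the monthly savings,
    meaning the program never pays back.
    """
    net_monthly = monthly_spend * reduction - ongoing_monthly_cost
    if net_monthly <= 0:
        return None
    return one_time_cost / net_monthly


# $50K/month spend, 30 percent reduction, hypothetical $60K one-time cost
# and hypothetical $3K/month of ongoing engineering capacity.
months = payback_months(50_000, 0.30, one_time_cost=60_000, ongoing_monthly_cost=3_000)

print(f"payback in {months:.1f} months")
```

Under these assumptions the program pays for itself in about five months, and the payback period shortens in direct proportion as monthly spend rises.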
What Logiciel Does Here
Logiciel works with engineering teams whose cloud costs have grown beyond what their workload justifies. The work is structured around the three-lever model with priority on the lever that produces the largest savings for the lowest effort.
The AI FinOps Framework covers the AI-specific cost optimization that sits inside this broader cloud cost discipline. The Data Pipeline Cost Optimization framework covers the data-specific costs that often dominate enterprise cloud bills.
A 30-minute working session is enough to assess your current spend against the three levers and identify the highest-leverage starting point.
Frequently Asked Questions
Which lever should I pull first?
Almost always compute, because it is the largest line item for most workloads. Right-sizing and reserved capacity together typically produce visible savings within 30 days. Storage and network optimization follow after compute is addressed.
How do I avoid breaking workloads while optimizing?
Three patterns. Continuous performance monitoring through the optimization work. Phased rollout of changes with explicit rollback criteria. Test environments that mirror production before production changes.
When should I migrate to ARM-based instances?
For cloud-native workloads written in modern languages, almost always. For workloads with compiled native dependencies or proprietary software, evaluate carefully. The price-performance gap justifies the migration cost in most cases.
How does this work for AI workloads?
AI workloads have their own cost dynamics covered in the AI FinOps Framework. The three-lever model applies but the specific moves are different (GPU vs CPU optimization, inference vs training cost shapes, embedding storage patterns).
What is the right organizational owner for cloud cost optimization?
Joint between engineering platform and finance, with dedicated FinOps capacity at scale. Engineering owns the technical levers. Finance owns the budget envelope and contract negotiation. The two have to coordinate; pure ownership by either function produces worse outcomes.
Sources:
- Flexera, "2024 State of the Cloud Report," 2024
- AWS Compute Optimizer documentation, 2024