A glimpse into cloud spending patterns via the Vantage customer base. Well done. This is pretty awesome, and based on comments from the team that published it, I expect it will continue to get even better.
As much as I love it, one downside of vendors publishing data from their platforms is the lack of demographic data on the vendor’s customers.
Vantage cloud cost customers
Fresh off the quarterly earnings for Amazon, Microsoft, and Google — where cloud spend is slowing as companies focus on cost optimization — we are releasing our analysis into cloud usage based on Vantage usage. Vantage is a cloud cost management and optimization platform with a unique view into industry trends thanks to thousands of connected infrastructure accounts across 11 billing integrations.
This Q1 2023 Cloud Cost Report uses anonymized, real-world data to quantify how cloud spending is shifting and changing across the tech industry. For the first time this quarter, we are highlighting cross-platform spending patterns. To discuss this report in more detail, join our growing Slack Community of over 1,000 engineering leaders, FinOps professionals, and CFOs. View past reports here or download a PDF of this report.
Cloud Spending Profiles Return to Normal
On Demand spend as a percentage of all compute spend came in at just 31.36% at the end of Q4, but rebounded about 12 points this quarter to represent 43.35% of EC2 compute costs.
This is in line with spending patterns from earlier last year and suggests the cloud infrastructure market is recovering after low growth rates at the end of 2022. While AWS earnings forecast lower growth rates, Azure and Google Cloud both reported Q1 growth and guidance that exceeded expectations.
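The shift in On Demand share can be checked with a little arithmetic. The sketch below is only an illustration: the function name `share_change` is hypothetical, and the two inputs are the Q4 and Q1 shares quoted above. It separates the change in percentage points from the relative change, which are easy to confuse.

```python
def share_change(previous_pct: float, current_pct: float) -> tuple[float, float]:
    """Return (percentage-point change, relative change in percent)."""
    point_change = current_pct - previous_pct
    relative_change = point_change / previous_pct * 100
    return point_change, relative_change


# Figures quoted in the report; everything else here is illustrative.
q4_on_demand = 31.36  # On Demand share of compute spend at the end of Q4
q1_on_demand = 43.35  # On Demand share at the end of Q1

points, relative = share_change(q4_on_demand, q1_on_demand)
print(f"{points:.2f} percentage points, {relative:.1f}% relative increase")
```

A share that moves from roughly 31% to roughly 43% has grown by about a dozen points, but by more than a third in relative terms.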
Sparks of AI Costs? Not Yet.
One indicator for AI costs is GPU spending as a share of EC2 costs. This was relatively flat from Q4, albeit a couple of percentage points higher than Q3 of 2022. AI-driven costs were not yet detectable in our data, and we suspect GPU supply constraints may have played a role. Vantage launched its OpenAI cost integration earlier this month, which will help increase cost visibility into this emerging industry.
One trend we did notice? Spending on inf1 instances was the highest ever recorded in March 2023, growing 500%+ from February. Our guess is that next quarter's numbers will reflect the AI wave. Microsoft is attributing 1% of growth for Q2 to AI. Sign up for the Q2 report at the bottom of the page.
Top 10 Services across AWS, Google Cloud, and Azure
We are breaking out cloud spending across multiple clouds in this report for the first time. After launching support for Google Cloud and Azure in 2022, this graphic provides a preliminary look at how spending on services stacks up across clouds. For a complete view of the rankings of every AWS service, refer to the Cloud Cost Leaderboard.
Although the ranking of services differs a little across clouds, the common pattern is that Compute, Data, Storage, Monitoring, and Bandwidth drive share of costs, in that order. BigQuery is famously the jewel of Google Cloud at the #3 spot, while EC2 remains the king of compute at AWS. On Azure, Databricks actually shows up in the list of top services due to its unique integration with Microsoft. AWS has no data warehouse in its list of top services.
S3 Optimization Hits a Ceiling
In Q4 we recorded a sharp uptick in organizations on Vantage using Intelligent Tiering, with 16% of S3 costs coming from tiers which are managed with Intelligent Tiering. For Q1 we are reporting almost the exact same percentage breakdowns in costs across S3 storage tiers.
This indicates that Intelligent Tiering has diminishing returns: roughly two-thirds of an average company's storage costs in S3 may not be helped by turning it on.
S3 Storage Class | Share of Costs | S3 Intelligent Tiering |
Standard Storage | 64.40% | |
Standard Infrequent Access | 10.94% | |
Standard Frequent Access | 9.62% | Yes |
Glacier Flexible Retrieval | 5.80% | |
Archive Instant Access | 3.30% | Yes |
Infrequent Access | 2.43% | Yes |
Glacier Deep Archive | 1.03% | |
One Zone Infrequent Access | 0.84% | |
Archive Access | 0.53% | Yes |
Other | 1.11% | |
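As a quick check on the 16% figure, the rows of the table can be summed by whether a storage class is managed by Intelligent Tiering. This is a minimal sketch: the shares are copied from the table above, while the variable and function names are made up for illustration.

```python
# (storage class, share of S3 costs in %, managed by Intelligent Tiering)
# Values copied from the Q1 2023 table above.
Q1_2023_S3 = [
    ("Standard Storage", 64.40, False),
    ("Standard Infrequent Access", 10.94, False),
    ("Standard Frequent Access", 9.62, True),
    ("Glacier Flexible Retrieval", 5.80, False),
    ("Archive Instant Access", 3.30, True),
    ("Infrequent Access", 2.43, True),
    ("Glacier Deep Archive", 1.03, False),
    ("One Zone Infrequent Access", 0.84, False),
    ("Archive Access", 0.53, True),
    ("Other", 1.11, False),
]


def intelligent_tiering_share(rows):
    """Sum the cost share of classes managed by Intelligent Tiering."""
    return sum(share for _, share, managed in rows if managed)


tiering_pct = intelligent_tiering_share(Q1_2023_S3)
total_pct = sum(share for _, share, _ in Q1_2023_S3)
standard_pct = next(share for name, share, _ in Q1_2023_S3 if name == "Standard Storage")

print(f"Intelligent Tiering: {tiering_pct:.2f}% of S3 costs")
print(f"Standard Storage:    {standard_pct:.2f}% of S3 costs")
print(f"All rows:            {total_pct:.2f}%")
```

The managed tiers add up to just under 16%, while Standard Storage alone accounts for close to two-thirds of S3 costs.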
Graviton Adoption on EC2 Accelerates to over 5% of Spend
Graviton adoption is increasing faster on EC2 than when we first started recording this metric in the Q3 report in 2022. Intel has also made a comeback in this dataset with the release of the first new Intel-based EC2 instances in two years (more on that below).
“Server and networking markets have yet to reach their bottoms as cloud and enterprise remain weak.”
- Patrick Gelsinger, CEO at Intel
While Intel's market share in our dataset remains strong, custom chips like Graviton and an upcoming one from Microsoft are a significant challenge. Intel posted its largest quarterly loss ever in its Q1 earnings.
Data Warehouse and Caching Spend Increased in Q1
Compared to the holiday season, when we observed greater on-demand database costs paired with well-committed data warehouse spend, Q1 saw increased on-demand ElastiCache and Redshift spending. With Autopilot, many companies loaded up on reservations which were subsequently sold off in Q1. This behavior matches the trend in the graph, which shows RDS, ElastiCache, OpenSearch, and Redshift on-demand cost as a percentage of all spend on those services.
GP3 Volume Upgrades are in Full Force
GP3 storage volumes for RDS had only just come out in the Q4 report and so GP2 volumes were the majority of block storage spend. In that report we noted what others like Corey Quinn said, namely that the GP3 volumes cost the same as GP2 volumes on a price/performance basis.
Nevertheless, Vantage customers adopted GP3 on RDS for their workloads, and GP2 saw a sharp drop, falling below the high-performance IO1 drives as the volume of choice for RDS. We are still a long way from GP2 being completely deprecated, and questions remain about whether the economics of GP3 may slow its adoption curve.
Upgrade Cycles can be 3 Years
The most notable change in the mix of instance types generating costs for Q1 is that the c6, r6, and m6 instance types are rapidly starting to take over workloads from the 5 series instance types. m5 and c5 instances both had sharp drops in their share of spend.
The introduction of new c6in instances and r6in based on Intel Ice Lake processors certainly helped, as x86 workloads had new options to upgrade to.
Datadog vs AWS
As a second look at multi-cloud workloads, Datadog offers nearly 20 observability, security, and telemetry products. AWS has a different mix of services targeting the same workloads. Teams are spending money on Metrics, APM, and Security in that order.
We can see why observability continues to be a hot space, as costs from various CloudWatch categories dominate the monitoring cost leaderboard. Indeed, CloudWatch is the #7 service for costs on AWS in the other cross-cloud comparison graphic in this report. By anticipating a healthy spend for observability tooling, you can be ready to take advantage of committed use discounts from Datadog.
Cloud costs are a growing and often mysterious budget item that engineering, operations, and finance teams grapple with even as their organizations reap the cloud's many benefits. Vantage is a cloud cost management and optimization platform with a unique dataset of cloud spending patterns. To discuss this report in more detail, join our growing Slack Community of over 1,000 FinOps and cloud professionals here.
Today Vantage is releasing the inaugural Cloud Cost Report for Q3, which uses anonymized, real-world data to quantify how cloud spending is shifting and changing. This report analyzes interesting trends to provide engineering leaders, FinOps professionals, and CFOs with emerging and evolving patterns. This inaugural report focuses initially on AWS, with hopes to extend to other providers in subsequent quarters.
Committed spend is on the rise as bottom lines come into focus.
On-demand compute costs are decreasing as a percentage of total compute spend, despite still being at relatively high levels in aggregate. On-demand compute as a share of costs has decreased from 50.19% a year ago to 44.66% at the end of Q3.
As financial markets correct and bottom lines come more into focus, we've seen more customers opt to make commitments in the form of AWS Savings Plans and Reserved Instances.
We expect this trend to continue as compute spend remains the major area of optimization for most organizations.
Intel Still Dominates.
Intel remains dominant among traditional servers even with the much-hyped rollout of ARM Graviton processors on AWS. AMD holds onto 20% of the compute market share.
While we see more forward-thinking organizations make shifts over to AWS Graviton, its adoption is still low in comparison, at less than 1% of EC2 costs.
Companies are Slow to Upgrade to Newer Instances.
Although newer generation instances have often been available for a while, companies are slow to move to them. In our data the c5, m5, and r5 instances make up the vast majority of compute costs, with c6, m6, and r6 instances only beginning to make a dent this past quarter.
In contrast with serverless workloads, EC2 instance upgrades are not happening as quickly even though newer generations offer better performance and lower costs than previous generation instances.
GP2 vs GP3: Older Storage Volumes are Driving Higher Costs.
Newer and more cost-effective GP3 volumes have been available since 2020, but the majority of block storage costs are still on older GP2 volumes, presumably because most customers don't fully grasp the price-to-performance implications.
Despite the fact that customers can save 20% on block storage and achieve higher I/O performance for their volumes by upgrading, many customers are slow to transition.
The Share of GPU Spend In the Era of AI.
There are many GPU types available in the cloud. While AI training and inference are leading use cases, a large share of spend is devoted to g4 instances, which are best for graphics workloads. For example, virtual workstations (or "cloud desktops") help organizations adapt to remote work quickly and in some cases provide repeatable development environments for developers.
Today's large models need p3 or p4 machines, which contain NVIDIA's latest server-class boards and provide the amount of video memory and tensor cores needed to train large language models like GPT-3 or generative image networks like Stable Diffusion.
Graviton Adoption on Lambda is Growing Quickly.
For serverless workloads on AWS, the CPU picture is much different than on EC2. Since these applications are as highly abstracted from the hardware as possible, switching to the best price-to-performance CPU is easier. Here we find that the share of Lambda costs on Graviton is on track to exceed the share of costs on x86 among Vantage users.
The belief here is that adopters of AWS Lambda tend to be more forward-thinking and modern organizations and are cognizant of the price-to-performance benefits that Graviton offers.
More teams should use S3 Storage Tiers.
The vast majority of S3 storage costs were accrued on the standard storage tier, which is the default tier for S3. AWS offers services like S3 Intelligent Tiering, which automatically moves objects to cheaper storage tiers based on access patterns and has been available since 2018.
There is an added monitoring cost with Intelligent Tiering, but this was de minimis as a share of total S3 costs.
S3 Storage Class | Share of Costs | S3 Intelligent Tiering |
Standard Storage | 80.95% | |
Standard Infrequent Access | 6.07% | |
Standard Frequent Access | 4.17% | Yes |
Glacier Deep Archive | 1.85% | |
Glacier Flexible Retrieval | 2.51% | |
Infrequent Access | 1.91% | Yes |
Archive Instant Access | 0.52% | Yes |
One Zone Infrequent Access | 0.52% | |
Archive Access | 0.11% | Yes |
Other | 1.39% | |
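Setting this table next to the Q1 2023 table earlier in the post shows how the S3 mix moved between the two reports. The sketch below is a rough comparison only: both sets of shares are copied from the tables in this post, and the names `Q3_2022`, `Q1_2023`, and `tiering_share` are illustrative.

```python
# Share of S3 costs (%) by storage class, copied from the two tables in this post.
Q3_2022 = {
    "Standard Storage": 80.95,
    "Standard Infrequent Access": 6.07,
    "Standard Frequent Access": 4.17,
    "Glacier Deep Archive": 1.85,
    "Glacier Flexible Retrieval": 2.51,
    "Infrequent Access": 1.91,
    "Archive Instant Access": 0.52,
    "One Zone Infrequent Access": 0.52,
    "Archive Access": 0.11,
    "Other": 1.39,
}
Q1_2023 = {
    "Standard Storage": 64.40,
    "Standard Infrequent Access": 10.94,
    "Standard Frequent Access": 9.62,
    "Glacier Deep Archive": 1.03,
    "Glacier Flexible Retrieval": 5.80,
    "Infrequent Access": 2.43,
    "Archive Instant Access": 3.30,
    "One Zone Infrequent Access": 0.84,
    "Archive Access": 0.53,
    "Other": 1.11,
}

# Classes marked "Yes" in the S3 Intelligent Tiering column.
INTELLIGENT_TIERING = {
    "Standard Frequent Access",
    "Infrequent Access",
    "Archive Instant Access",
    "Archive Access",
}


def tiering_share(shares):
    """Total cost share of the classes managed by Intelligent Tiering."""
    return sum(shares[name] for name in INTELLIGENT_TIERING)


# Percentage-point change per class, largest drop first.
changes = {name: Q1_2023[name] - Q3_2022[name] for name in Q3_2022}
for name, delta in sorted(changes.items(), key=lambda item: item[1]):
    print(f"{name:<28}{delta:+.2f}")

print(f"Intelligent Tiering share: {tiering_share(Q3_2022):.2f}% -> {tiering_share(Q1_2023):.2f}%")
```

Standard Storage is the only class with a large decline, and the share managed by Intelligent Tiering more than doubles across the two reports.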