The Right Stack

The State of AI Infrastructure at Scale 2024 - AI Infrastructure Alliance

Vendor Sponsor
Clear ML, AI Infrastructure Alliance, Furiosa
Research Published
March 13, 2024
Link to research
https://ai-infrastructure.org/the-state-of-ai-infrastructure-at-scale-2024/
Demographic or Methodology comments

Topic Tags
AI, Generative AI, Infrastructure, App Development, AI App Development
Sample
Survey Contacts
Sample Size
1000
Data Source
Survey
Gate
Gated
Demographics
Created time
Mar 19, 2024 1:03 AM
Directory name

The Rightstack Research DB

In this report, we surveyed 1,000 enterprises to see how they’re adapting their infrastructure to the growing demands of AI. 96% of companies plan to expand their AI compute capacity and investment to embrace the new possibilities of AI. We discovered:

  • Open Source AI solutions and model customization are top priorities, with 96% of companies focused on customizing primarily Open Source models.
  • Optimizing GPU utilization is a major concern for 2024-2025, with the majority of GPUs underutilized during peak times.
  • A staggering 74% of companies are dissatisfied with their current job scheduling tools and face resource allocation constraints regularly, while limited on-demand and self-serve access to GPU compute inhibits productivity.

We focused on C-suite and team leaders with job titles like CIO, CTO, Head of AI, VP of Data or VP of AI, across a range of verticals.

Key Charts

Estimate your current allocation of existing GPU resources (i.e., non-idle GPUs) during peak periods.

When asked about peak periods for GPU usage, 15% of respondents report that fewer than 50% of available GPUs are in use. 53% believe 51-70% of GPU resources are utilized, and 25% believe their GPU utilization reaches 85%. Only 7% of companies believe their GPU infrastructure achieves more than 85% utilization during peak periods.

What is your organization’s greatest concern about deploying Generative AI?

The biggest concern about deploying Generative AI was moving too fast and missing important considerations (e.g., prioritizing the wrong use cases), whereas the second most important concern was moving too slowly due to a lack of ability to execute, which exposes ambiguity among leadership. It appears that executives are caught between the desire to move quickly and the danger of costly mistakes.

Governance was also on respondents’ minds, with upcoming regulations and lack of control over usage and scaling as the next two most important concerns.

Rank your organization’s compute concerns for 2024

When asked about their organization’s compute concerns, latency was the top-ranked answer for 28% of respondents, followed by power consumption, which was the top-ranked issue for 21% of respondents. Delays in getting access to compute are also weighing on respondents’ minds; although this concern was top-ranked for only 14% of respondents, it received 30% of the votes as the second-ranked concern.
