Executive summary

GPU clouds and neoclouds are the most capital-intensive part of the AI infrastructure market. They exist because demand for capacity has outpaced what many buyers can get from the hyperscalers on the right timeline, contract shape, price, or operational model. That opening has created room for specialist providers such as CoreWeave, Lambda, Crusoe, Runpod, Nebius, Applied Digital, IREN, Vultr, Paperspace, Fluidstack, and Voltage Park.

The category is real, but the durable business is harder than the demand story makes it sound. Renting scarce GPUs during a shortage can create rapid growth. Earning durable returns requires something more demanding: chip allocation, power, financing, depreciation discipline, utilization, reliability, customer diversification, and enough workflow depth that customers see the provider as an AI infrastructure operator rather than a broker with expensive cards.

Owning GPUs does not make a neocloud interesting; turning scarce capacity into trusted production infrastructure does.

Why now

AI training and inference have turned compute supply into a strategic bottleneck. Frontier model labs need large clusters for training and post-training. AI application companies need capacity for fine-tuning, batch workloads, production inference, evaluation, and synthetic data. Enterprises need accelerator access for internal AI platforms and private workloads.

The hyperscalers remain the default infrastructure providers. AWS P5, Azure ND H100 v5, Google A3, Oracle GPU instances, AWS Trainium, and Google TPU are serious incumbent options, not straw men. The opening for neoclouds appears when buyers need capacity faster, want a committed cluster outside the main cloud contract, need better AI-specific operations, or want a second supplier for strategic resilience.

Power drives the timing. AI clusters are more than cloud software; they require data center electricity, cooling, racks, networking, server supply, and grid access. The EIA's material on data center electricity demand, Vertiv's AI infrastructure material, and Equinix's AI-ready data center positioning all point to the same reality: the AI compute market is constrained by physical infrastructure as much as by software abstraction.

Market definition

GPU clouds and neoclouds sit between NVIDIA and the AI application layer. They buy or finance accelerators, place them into power-dense facilities, operate clusters, expose capacity through cloud interfaces, and sell access to AI teams. The customer may buy a GPU instance, a managed cluster, a reserved-capacity contract, or a private deployment.

The included segments are pure neoclouds, developer GPU clouds, public-market AI infrastructure companies, hyperscaler GPU products, NVIDIA DGX Cloud, and private AI infrastructure. The excluded segments are pure model APIs and inference gateways unless the provider also controls meaningful GPU capacity.

NVIDIA is the upstream center of gravity. Its data center platform and DGX Cloud matter because hardware, networking, software libraries, and cloud distribution are increasingly coupled. A neocloud can abstract some of that complexity for customers, but it cannot ignore where platform control sits.

Value chain

The upstream value chain starts with accelerators, server OEMs, high-speed networking, HBM memory, power equipment, real estate, and utilities. Dell, Supermicro, Vertiv, and Equinix show how much of the AI infrastructure stack lives outside the cloud console.

The build layer is where the business becomes physically hard. Providers need power, cooling, racks, cabling, cluster networking, storage, monitoring, and operational processes. A weak cluster can turn expensive hardware into a poor customer experience; a reliable one can make a specialist provider feel safer than a nominally larger platform with less available capacity.

The cloud layer covers scheduling, images, storage, Kubernetes, billing, support, security, and capacity management. This is where neoclouds can differentiate from generic GPU resale. The best version feels less like renting a card and more like getting a working AI factory.

The downstream layer is the customer's workload. Training, fine-tuning, batch inference, real-time inference, synthetic data generation, evaluation, rendering, simulation, and research all stress infrastructure differently. That variation is why utilization matters so much. A provider with expensive idle GPUs has a balance-sheet problem; a provider with high utilization and trustworthy operations has a business.
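
The utilization point can be made concrete with a small break-even sketch. Every figure below is a hypothetical placeholder chosen for illustration, not data from any provider, and the function names are assumptions introduced here.

```python
HOURS_PER_YEAR = 24 * 365


def annual_cost_per_gpu(capex, useful_life_years, annual_opex):
    """Straight-line depreciation plus yearly operating cost for one GPU."""
    return capex / useful_life_years + annual_opex


def breakeven_utilization(capex, useful_life_years, annual_opex, price_per_hour):
    """Fraction of available hours that must be billed to cover annual cost."""
    max_revenue = price_per_hour * HOURS_PER_YEAR
    return annual_cost_per_gpu(capex, useful_life_years, annual_opex) / max_revenue


# Hypothetical inputs: $30,000 all-in cost per GPU, 5-year useful life,
# $4,000 per year of power and operations, $2.50 per GPU-hour.
base = breakeven_utilization(30_000, 5, 4_000, 2.50)

# Same assumed asset, rented at half the hourly price.
cheap = breakeven_utilization(30_000, 5, 4_000, 1.25)

print(f"break-even utilization at $2.50/hr: {base:.1%}")
print(f"break-even utilization at $1.25/hr: {cheap:.1%}")
```

Under these made-up inputs the break-even sits a little under half of available hours, and halving the hourly price pushes it above 90%. That is the sense in which idle or cheaply rented GPUs become a balance-sheet problem rather than a pricing detail.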

Buyer and budget

The first buyer group is AI labs and model companies. They need large clusters, committed capacity, and operational reliability. CoreWeave, Lambda, Crusoe, and Nebius all position around AI compute for higher-intensity workloads.

The second group is AI application companies. They care about inference COGS, burst capacity, fine-tuning, evaluation workloads, and the ability to avoid being trapped by a single cloud supplier. The third group is enterprises building internal AI platforms, where security, procurement, support, data gravity, and private deployment options matter. The fourth group is developers and researchers, who care most about quick self-serve access and price-performance.

The budget can sit in research infrastructure, cloud infrastructure, product COGS, enterprise AI platform spend, or capex planning. That makes the sales motion unusually varied. Runpod, Vultr, Paperspace, and Lambda can serve self-serve users. The investor relations materials of CoreWeave, Nebius, Applied Digital, and IREN are better places to watch the larger version of the thesis, the one built around capacity, utilization, and capital markets.

Incumbents and challengers

The incumbents are the hyperscalers and NVIDIA. AWS, Azure, Google, and Oracle have procurement, identity, storage, networking, security, and enterprise trust. NVIDIA has the upstream platform position through GPUs, networking, libraries, and distribution. Together, they make direct competition difficult.

The challengers win by being more specialized. CoreWeave is the flagship neocloud archetype, with public investor materials and a clear AI cloud positioning. Lambda combines GPU cloud with a broader hardware and infrastructure path. Crusoe ties AI cloud to energy and data center strategy. Runpod, Vultr, Paperspace, Fluidstack, and Voltage Park compete through accessible or specialized GPU access. Nebius, Applied Digital, and IREN show public-market and infrastructure-adjacent versions of the same supply-side AI compute story.

That mix is why the category can be misleading. Some companies are cloud software businesses wrapped around expensive infrastructure. Some are data center and power plays with AI branding. Some are developer clouds. Some are strategic capacity partners. The common thread is not company type; it is the attempt to turn accelerator scarcity into a repeatable infrastructure business.

Where control accrues

Control accrues at seven points. The first is GPU allocation. The second is power-secured data center capacity. The third is financing. The fourth is cluster operations. The fifth is customer contracts. The sixth is networking and storage performance. The seventh is workflow depth around AI frameworks, procurement, security, and enterprise support.

NVIDIA controls the most important upstream platform position. Hyperscalers control procurement, data gravity, and surrounding cloud services. Neoclouds control speed and specialization only when they can secure hardware, operate clusters well, and build enough trust that customers run important workloads outside the default cloud account.

Power may become the least glamorous control point and one of the most important. If power, cooling, and interconnect are the bottlenecks, the strongest providers may be the ones with secured data center capacity rather than the slickest interface.

Where profit accrues

Profit accrues where scarce capacity meets high utilization. That is the simple version. The fuller version is more demanding: the provider must finance hardware, fill clusters, manage depreciation, secure power, keep customers happy, and avoid being squeezed by hyperscaler capacity additions or NVIDIA allocation changes.

Generic GPU rental is risky because it can become a spread business. The stronger profit pool is reliable AI compute plus operational depth: managed clusters, reserved capacity, private deployments, enterprise support, customer workflow integration, and enough platform glue that switching away is operationally annoying.

Public investor materials from CoreWeave, Nebius, Applied Digital, and IREN are the best places to test whether the public-market version of this thesis is improving. Rather than simple revenue growth, look at utilization, contract duration, customer concentration, power-secured expansion, financing structure, and evidence that the provider is moving beyond one-off capacity resale.
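
Two of the metrics above, customer concentration and contract duration, reduce to simple arithmetic once a provider discloses enough detail. The sketch below is one way to compute them; the customer names, revenue figures, and contract lengths are hypothetical, and real disclosures rarely arrive in this form.

```python
def top_customer_share(revenue_by_customer, n=1):
    """Share of total revenue that comes from the n largest customers."""
    total = sum(revenue_by_customer.values())
    largest = sorted(revenue_by_customer.values(), reverse=True)[:n]
    return sum(largest) / total


def herfindahl_index(revenue_by_customer):
    """Sum of squared revenue shares; 1.0 means a single customer."""
    total = sum(revenue_by_customer.values())
    return sum((r / total) ** 2 for r in revenue_by_customer.values())


def weighted_contract_years(contracts):
    """Revenue-weighted average remaining contract length, in years."""
    total = sum(revenue for revenue, _ in contracts)
    return sum(revenue * years for revenue, years in contracts) / total


# Hypothetical revenue book (arbitrary units) for an imaginary provider.
revenue = {"lab_a": 60.0, "lab_b": 25.0, "app_c": 10.0, "self_serve": 5.0}

# (revenue, remaining contract years) pairs for the same imaginary book.
contracts = [(60.0, 4.0), (25.0, 2.0), (10.0, 1.0), (5.0, 0.1)]

print(f"top customer share: {top_customer_share(revenue):.0%}")
print(f"top two share:      {top_customer_share(revenue, n=2):.0%}")
print(f"Herfindahl index:   {herfindahl_index(revenue):.3f}")
print(f"weighted duration:  {weighted_contract_years(contracts):.2f} years")
```

In this invented book the long weighted duration comes almost entirely from one customer, which is why duration and concentration are worth reading together rather than separately.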

Regulation and constraints

Export controls shape where advanced compute can be sold or used, and the BIS advanced computing controls are relevant background for the category. AI governance can also shape customer procurement and workload requirements through rules and frameworks such as the White House AI executive order and the NIST AI Risk Management Framework.

Power and data center constraints are core parts of the product, not side issues. If a provider cannot secure power and operate high-density clusters, it cannot serve the workloads that made the category attractive.

Bear case

The first bear case is that neoclouds are a capacity-cycle trade. GPU shortages create room for fast growth. Then hyperscalers add capacity, supply normalizes, prices fall, and utilization weakens. In that world, thin GPU renters get squeezed.

The second bear case is custom silicon. AWS Trainium and Google TPU are examples of non-NVIDIA alternatives that can absorb some workloads inside hyperscaler environments. If custom accelerators improve enough for more workloads, NVIDIA-heavy neoclouds face a narrower wedge.

The third bear case is financing. This is not SaaS. Providers must fund expensive assets before they earn revenue from them. If contract duration, utilization, or pricing weakens, depreciation and financing costs can turn growth into stress.
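
A rough sensitivity sketch shows how quickly that stress can appear. All inputs are hypothetical, interest is simplified to a flat charge on the financed share of the purchase price with no amortization, and the names are assumptions introduced for illustration.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class Scenario:
    price_per_hour: float
    utilization: float  # fraction of hours billed, between 0 and 1


def annual_margin_per_gpu(scenario, capex, life_years, debt_fraction,
                          interest_rate, annual_opex):
    """Revenue minus opex, straight-line depreciation, and simplified interest."""
    revenue = scenario.price_per_hour * scenario.utilization * HOURS_PER_YEAR
    depreciation = capex / life_years
    interest = capex * debt_fraction * interest_rate
    return revenue - annual_opex - depreciation - interest


# Hypothetical asset: $30,000 per GPU, 5-year life, 70% debt-financed at 10%,
# $4,000 per year of operating cost.
asset = dict(capex=30_000, life_years=5, debt_fraction=0.7,
             interest_rate=0.10, annual_opex=4_000)

shortage = Scenario(price_per_hour=2.50, utilization=0.80)
normalized = Scenario(price_per_hour=1.75, utilization=0.60)

shortage_margin = annual_margin_per_gpu(shortage, **asset)
normalized_margin = annual_margin_per_gpu(normalized, **asset)

print(f"shortage scenario margin per GPU:   {shortage_margin:,.0f}")
print(f"normalized scenario margin per GPU: {normalized_margin:,.0f}")
```

With these assumed numbers, a 30% price decline combined with a 20-point drop in utilization turns a positive per-GPU margin into a loss, even though the depreciation and interest charges have not changed.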

What would change the thesis

The thesis weakens if hyperscalers satisfy most AI GPU demand at acceptable availability and price. It also weakens if public neoclouds fail to show customer diversification, utilization discipline, and contract durability. A major shift toward custom silicon would narrow the wedge for NVIDIA-heavy capacity providers.

The thesis strengthens if neoclouds disclose long-term contracts, high utilization, power-secured expansion, improving unit economics, and evidence that AI labs and enterprises treat multi-source GPU capacity as a strategic norm rather than a temporary shortage workaround.

Watch next

Watch CoreWeave's public disclosures for utilization, customer concentration, contract duration, and capital intensity. Watch Nebius, Applied Digital, and IREN for evidence that public-market AI infrastructure can become more than data-center speculation. Watch Lambda, Crusoe, Runpod, Fluidstack, Voltage Park, Vultr, and Paperspace for signs of workflow depth beyond capacity access.

Also watch NVIDIA allocation and DGX Cloud relationships, AWS Trainium and Google TPU as pressure on NVIDIA-heavy economics, and power-secured data center expansion. In this market, the real bottleneck may be less visible than the GPU brand on the invoice.

Sources