The GPU Shortage Aftermath

How the AI boom reshaped compute access


Remember when getting an H100 meant waiting nearly a year? The GPU shortage of 2023-2024 broke something in the AI industry that cheap compute won't fix. The scars run deeper than anyone wants to admit.

TL;DR

Assess your compute access strategy. Multi-cloud, reserved capacity, and inference optimization are now table stakes.

The crisis is technically over. According to Tom's Hardware, lead times dropped from 11 months to 8 weeks. H100 rental prices fell from $10 per hour to under $3. But the shortage's legacy lives on in hoarded compute, elevated cloud costs, and an entire generation of AI startups that never got the chance to scale.

I've watched technology constraints shape industries before. The results are rarely what anyone predicts.

The Year of GPU Poverty

In 2023, the AI world split into GPU-rich and GPU-poor. The dividing line was brutal. SemiAnalysis documented the chasm: companies with fewer than 20,000 A/H100 GPUs were fundamentally constrained, regardless of their talent or ambition. That included household names like Hugging Face, Databricks, and Together AI.

Meanwhile, hyperscalers bought everything. AWS, Google Cloud, and Microsoft Azure controlled roughly 66% of the cloud market and had first claim on supply. If you weren't a strategic partner, you joined the waitlist. NVIDIA couldn't make enough chips to satisfy demand, and the major cloud providers decided who got access.

The math was simple but devastating. Spending on GPUs jumped from $30 billion in 2022 to $50 billion in 2023. Everyone wanted in. Not everyone could get in.

The Startups That Couldn't Scale

For established AI labs, the shortage was an inconvenience. For startups, it was existential. I've seen the same pattern across multiple cycles: capital-intensive technology waves favor incumbents.

Here's what happened. A startup would raise a seed round, build a promising demo, find product-market fit, and then hit a wall. Scaling required compute they couldn't acquire at any price. Investors grew skeptical of backing companies going up against NVIDIA, Amazon, Microsoft, and Google simultaneously. Why fund the underdog when the compute moat was this wide?

The pattern echoes what happened in the broader AI startup landscape: companies building on rented infrastructure with no defensible advantage. Except during the shortage, even the rental option wasn't available.

Enterprise customers had AI ambitions too. Many needed thousands of H100s for training real models at scale. The bottleneck wasn't budget or talent. It was simply access to silicon.

The Hoarding Behavior

Scarcity creates hoarding. When GPUs became precious, rational actors started stockpiling. Not because they needed the compute immediately, but because they might need it later and couldn't risk being locked out.

This behavior persists even as supply improves. Companies that secured allocations during the shortage aren't giving them back. They're building internal compute reserves, maintaining relationships with cloud providers, and treating GPU access as a strategic asset rather than an operational expense.

Some early H100 buyers are now reselling their allocations as supply eases. That tells you something about how distorted the market became. People bought GPUs as speculation, not infrastructure.

The hoarding mentality won't disappear overnight. Anyone who lived through 2023 knows how quickly access can evaporate. The shortage may be over, but the fear isn't.

Cloud Costs: Down But Not Reasonable

Yes, prices dropped. H100 rental fell from $8-10 per hour to $2-3 on specialized providers. As MIT Technology Review reported, AWS cut prices by 44% in mid-2025. The worst of the gouging is over.

But context matters. Before the shortage, GPU cloud computing was already expensive. The price drops brought costs back toward pre-shortage levels, not to some new accessible baseline. Training a large language model still costs millions. Running inference at scale still burns cash.
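
To put a number on "millions," here's a back-of-envelope sketch using the common ~6 × parameters × tokens FLOPs approximation for dense transformer training. Every figure below is an assumption chosen for illustration, not a quote from any vendor:

```python
# Back-of-envelope training cost using the common ~6 * params * tokens
# FLOPs approximation for dense transformers. All figures are illustrative
# assumptions, not measured or quoted numbers.

params = 70e9               # model parameters (e.g., a 70B dense model)
tokens = 1.4e12             # training tokens
peak_flops = 989e12         # H100 BF16 dense peak, FLOPs/sec
mfu = 0.40                  # assumed model FLOPs utilization
price_per_gpu_hour = 3.00   # assumed post-shortage rental rate, $/GPU-hour

total_flops = 6 * params * tokens                    # ~5.9e23 FLOPs
gpu_hours = total_flops / (peak_flops * mfu) / 3600  # ~410,000 GPU-hours
cost = gpu_hours * price_per_gpu_hour

print(f"{gpu_hours:,.0f} GPU-hours, ~${cost:,.0f}")  # ~$1.2M for one run
```

And that's one clean run at favorable pricing. Real budgets multiply it by failed runs, ablations, and evaluation.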

The bigger issue is that the shortage accelerated enterprise AI spending to unsustainable levels. According to CloudZero's State of AI Costs report, average enterprise AI spending hit $85,521 monthly in 2025, up 36% from the previous year. Organizations planning to spend over $100,000 monthly more than doubled, from 20% to 45%. That's not AI becoming more valuable. That's budgets spiraling because acquisition was harder and timelines were longer.
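
For a sense of scale, here's a purely illustrative conversion of that average monthly spend into raw GPU time at post-shortage rates. Actual enterprise AI budgets span API calls, storage, and staff, so treat this as a ceiling on compute, not a breakdown of the CloudZero figure:

```python
# Purely illustrative: what ~$85K/month buys in raw GPU time at
# post-shortage rates. Real enterprise AI spend includes APIs, storage,
# and people, so this is a ceiling on compute, not a cost breakdown.

price_per_gpu_hour = 3.00   # assumed H100 rental rate
hours_per_month = 730
monthly_budget = 85_521     # CloudZero's reported monthly average

gpus_24_7 = monthly_budget / (price_per_gpu_hour * hours_per_month)
print(f"~{gpus_24_7:.0f} H100s running around the clock")  # ~39 GPUs
```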

Companies built cost structures around shortage-era pricing. Those costs don't automatically reset when supply normalizes. The financial scars remain in bloated budgets and embedded expectations.

The Memory Bottleneck Nobody Mentions

Just as GPU supply improved, a new constraint emerged: high-bandwidth memory. HBM3E became the binding constraint on AI infrastructure globally. Samsung, SK Hynix, and Micron are operating near full capacity with lead times stretching to 6-12 months.
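
Why memory rather than raw FLOPs? In LLM serving, low-batch decoding is bandwidth-bound: every generated token streams the full weight set through HBM. A minimal roofline sketch, with illustrative numbers:

```python
# Why HBM, not FLOPs, caps LLM serving: at low batch sizes, generating each
# token streams the entire weight set through memory once, so throughput is
# bounded by bandwidth. Figures below are illustrative assumptions.

hbm_bandwidth_gb_s = 3350   # H100 SXM HBM3, roughly 3.35 TB/s
model_params = 70e9         # 70B-parameter model
bytes_per_param = 1         # 8-bit quantized weights

weight_gb = model_params * bytes_per_param / 1e9       # ~70 GB
tokens_per_s_ceiling = hbm_bandwidth_gb_s / weight_gb  # bandwidth roofline

print(f"~{tokens_per_s_ceiling:.0f} tokens/sec ceiling at batch size 1")
# ~48 tokens/sec: more FLOPs don't raise this ceiling; only more
# bandwidth (i.e., more/faster HBM) or bigger batches do.
```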

DRAM supplier inventories fell to 2-4 weeks by late 2025, down from 13-17 weeks the year before. SK Hynix told analysts the shortage may persist until late 2027, and memory slated for 2026 production is already sold out.

This is the pattern I've watched repeat across technology cycles. Solve one bottleneck, expose another. The constraint moves but doesn't disappear. TSMC's CoWoS packaging has lead times past 52 weeks. The semiconductor supply chain remains fragile.

For anyone betting that the AI compute crunch is over, the memory crisis is a warning. The industry outgrew its infrastructure, and infrastructure takes years to catch up.

What the Shortage Changed Permanently

The GPU shortage of 2023-2024 wasn't just a supply chain hiccup. It restructured competitive dynamics in ways that persist even with adequate supply.

Vertical integration accelerated. Big tech companies now design their own chips. Google has TPUs. Amazon has Trainium and Inferentia. Microsoft is developing Maia. The shortage proved that depending on NVIDIA alone was strategically risky. Expect more custom silicon and less commodity dependence.

Geographic risk became real. 90% of advanced chips are manufactured in Taiwan. The shortage made cross-strait tensions an existential risk to the entire AI chip supply chain. That awareness isn't going away, even if the immediate crisis has passed.

The startup landscape thinned. Some companies that would have scaled didn't. They pivoted, folded, or were acquired at distressed valuations. The alternate history where compute was abundant would have produced different winners. We're living with the winners the shortage selected for.

Cloud provider leverage increased. When you couldn't get GPUs anywhere else, you went to AWS or Google or Azure. Those relationships are sticky. The hyperscalers converted a temporary supply advantage into durable customer lock-in.
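
The defensive posture buyers adopted in response is easy to sketch: never depend on a single provider's allocation. The provider names and capacity figures below are hypothetical placeholders, not any real cloud SDK:

```python
# Toy sketch of a multi-cloud fallback: walk providers in priority order
# and take the first with capacity. Names, numbers, and the AVAILABLE map
# are hypothetical placeholders, not a real cloud SDK.

AVAILABLE = {
    "primary-hyperscaler": 0,       # preferred, but allocation is exhausted
    "secondary-hyperscaler": 256,
    "specialist-gpu-cloud": 1024,
}

def acquire_gpus(gpus_needed: int) -> str | None:
    """Return the first provider that can fill the request, else None."""
    for provider, free_gpus in AVAILABLE.items():
        if free_gpus >= gpus_needed:
            return provider
    return None  # waitlist territory -- the 2023 experience

print(acquire_gpus(512))  # -> specialist-gpu-cloud
```

Trivial in code; the hard part in 2023 was that every branch returned None.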

The Lessons for What Comes Next

If there's one thing the GPU shortage should teach us, it's that the AI boom has real constraints beyond just hype cycles. Technology adoption is gated by physical infrastructure, and infrastructure follows its own timeline.

For startups, the lesson is brutal: capital-intensive technology waves favor the already-rich. If your model requires massive compute to train and scale, you're competing against Microsoft's balance sheet. That was true before the shortage; the shortage just made it undeniable.

For enterprises, the lesson is that AI costs aren't stabilizing soon. Memory constraints, packaging bottlenecks, and geopolitical risk all point toward continued supply pressure. Budget for volatility, not normalization.

For the industry broadly, the lesson is that the semiconductor supply chain is both essential and fragile. We're building world-changing technology on infrastructure that takes years to expand and can be disrupted overnight. The next constraint is already forming. We just don't know which one yet.

The Bottom Line

The GPU shortage is easing, but what it revealed isn't. The AI industry depends on a brittle supply chain, cloud providers with outsized leverage, and infrastructure investment cycles measured in years rather than months.

Companies that secured compute during the shortage emerged stronger. Companies that couldn't are gone or diminished. The market didn't select for the best ideas or the best teams. It selected for the best access to silicon.

As memory becomes the next bottleneck and geopolitical risk grows, the dynamics that made 2023-2024 brutal for small players haven't changed. They've just shifted to a different constraint. The shortage taught us that in AI, compute access isn't just an operational detail. It's a strategic advantage that determines who gets to play and who watches from the sidelines.

"The GPU shortage of 2023-2024 broke something in the AI industry that cheap compute won't fix. The scars run deeper than anyone wants to admit."


