Infrastructure & Compute

GPU (Graphics Processing Unit)

Definition
The specialized processor that powers AI training and inference. NVIDIA's H100 and B200 GPUs are the most sought-after compute in the industry, with wait times and pricing driving major strategic decisions.
Why it matters
GPUs are the bottleneck of the AI industry. Every model trained, every inference served, and every AI product deployed depends on GPU availability. NVIDIA controls approximately 80-90% of the AI accelerator market, creating a single-vendor dependency that keeps CEOs awake at night. GPU access determines who can train frontier models, how fast inference can scale, and what AI products are economically viable. The GPU shortage of 2023-2024 shaped the entire industry: companies with reserved GPU capacity had a strategic advantage, while those without were locked out of frontier training. Understanding GPU economics is essential for anyone making AI infrastructure decisions.
In practice
NVIDIA's H100 GPUs shipped in volume starting in 2023 at approximately $30,000 each, with cloud rental at $2-3/hour. The B200 (Blackwell architecture, 2024-2025) offered roughly 2.5x the performance. The 'GPU rich' companies (Microsoft, Google, Meta, Amazon) secured tens of thousands of GPUs through multi-billion-dollar orders. Startups and smaller companies relied on cloud providers or specialized GPU clouds like CoreWeave, Lambda, and Together AI. AMD's MI300X emerged as a credible alternative, and custom silicon from Google (TPU), Amazon (Trainium), and Microsoft (Maia) is diversifying the market. The GPU bottleneck is gradually easing as supply ramps up.
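The pricing figures above lend themselves to a quick rent-vs-buy calculation. The sketch below uses the article's rough numbers (~$30,000 to buy an H100, ~$2.50/hour to rent one) plus a hypothetical cluster size and run length chosen purely for illustration; real costs also include power, hosting, networking, and depreciation, which are ignored here.

```python
# Back-of-envelope GPU cost comparison (illustrative numbers only).
H100_PURCHASE_PRICE = 30_000   # USD per GPU (approximate, per the text)
CLOUD_RATE_PER_HOUR = 2.50     # USD per GPU-hour (midpoint of the $2-3 range)
NUM_GPUS = 64                  # hypothetical cluster size (assumption)
RUN_HOURS = 720                # hypothetical one-month training run (assumption)

gpu_hours = NUM_GPUS * RUN_HOURS
rental_cost = gpu_hours * CLOUD_RATE_PER_HOUR
purchase_cost = NUM_GPUS * H100_PURCHASE_PRICE

# Hours of continuous use per GPU at which renting costs as much as buying
# (ignoring power, hosting, and depreciation).
breakeven_hours = H100_PURCHASE_PRICE / CLOUD_RATE_PER_HOUR

print(f"GPU-hours for the run:  {gpu_hours:,}")
print(f"Cloud rental cost:      ${rental_cost:,.0f}")
print(f"Upfront purchase cost:  ${purchase_cost:,.0f}")
print(f"Rent-vs-buy breakeven:  {breakeven_hours:,.0f} hours per GPU")
```

Under these assumptions the one-month run costs about $115,000 in cloud rental versus roughly $1.9 million to buy the cluster outright, which is why sustained, high-utilization training workloads push large players toward owning GPUs while occasional users rent.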
