Abhay Bhat

Life | Technology | Investing


The AI Chip Wars: GPU vs TPU vs LPU

Why the future of AI isn’t one chip to rule them all


The 60-Second Primer

Three chips are fighting for AI’s soul. GPUs (Graphics Processing Units) — the Swiss Army knife that trains most AI models today. TPUs (Tensor Processing Units) — Google’s secret weapon, hoarded for its own data centers. And LPUs (Language Processing Units) — the new kid, optimized purely for inference speed. Understanding which chip wins where isn’t just hardware trivia — it’s the difference between a startup burning cash on the wrong infrastructure and an enterprise shipping AI that actually responds in real time.


The Hardware Stack, Decoded

GPUs: The Developer’s Best Friend

NVIDIA’s GPUs dominate AI training for one reason: CUDA. This software layer lets developers write parallel code without a PhD in hardware engineering. Combined with PyTorch — the framework that won the hearts of ML researchers — GPUs offer unmatched flexibility. You can run anything from experimental research models to production workloads. The H100 and H200 top most public training benchmarks, but the real moat is the ecosystem, not raw performance.
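To make “parallel code without a PhD in hardware engineering” concrete, here’s a minimal sketch (assuming PyTorch is installed; the model and batch are toy placeholders, not anything from this article) of a training step that runs identically on CPU and GPU:

```python
# One training step that works on any device. All CUDA plumbing
# (kernels, memory transfers, scheduling) hides behind .to(device).
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"

model = nn.Linear(128, 10).to(device)      # toy model; real ones just swap this line
opt = torch.optim.SGD(model.parameters(), lr=0.01)

x = torch.randn(32, 128, device=device)    # a batch of 32 fake samples
y = torch.randint(0, 10, (32,), device=device)

loss = nn.functional.cross_entropy(model(x), y)
loss.backward()                            # gradients via autograd
opt.step()                                 # parameter update
```

The only device-specific line is the `device` selection — which is exactly why “time to first train” favors the CUDA ecosystem.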

TPUs: Google’s Walled Garden

Google built TPUs specifically for tensor operations — the math that powers neural networks. They’re beasts at large-batch training, especially with TensorFlow and JAX. The catch? They’re essentially Google-only. While Google Cloud offers TPU access, the tight vertical integration (hardware + software + data centers) means you’re renting Google’s competitive advantage. Meta is reportedly in talks to deploy TPUs in their data centers (November 2025), which tells you something about the performance — but also about who controls the keys.

LPUs: The Inference Insurgent

Groq’s LPU architecture flips the script entirely. Instead of a conventional cache hierarchy backed by external DRAM, it keeps model weights in hundreds of megabytes of on-chip SRAM, eliminating the memory-bandwidth bottleneck. The result? Deterministic, blazing-fast inference. In December 2025, NVIDIA acquired Groq — a $1.5 billion validation that inference-specific silicon is the future. The LPU thesis: training and inference are fundamentally different workloads, and optimizing for both with one chip leaves performance on the table.
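A back-of-envelope sketch shows why on-chip SRAM changes the deployment model (the per-chip SRAM figure and model size below are illustrative assumptions, not numbers from this article):

```python
# How many SRAM-only chips does one model need?
# Assumptions (hypothetical round numbers, not vendor-confirmed):
#   - 7 billion parameters stored at 1 byte each (8-bit quantization)
#   - ~230 MB of on-chip SRAM per chip
params = 7_000_000_000
bytes_per_param = 1               # int8 weights
sram_per_chip = 230 * 1024**2     # bytes of on-chip SRAM

chips_needed = -(-params * bytes_per_param // sram_per_chip)  # ceiling division
print(chips_needed)  # 30
```

Dozens of chips pipelined together versus one GPU with tens of gigabytes of HBM — that’s the trade: massive parallel racks in exchange for deterministic, bandwidth-unconstrained inference.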


Here’s what most people miss: enterprises don’t pick chips based on benchmarks. They pick based on time to first train — how fast can a developer go from chip-in-hand to training-initiated? GPUs with CUDA win this every time because developer experience is the real moat.

But inference is a different game. When you’re serving millions of API calls, cost per token per watt becomes existential. Users expect sub-200ms responses. The irony? API contracts are deterministic, but LLM outputs are not — so you need raw speed to iterate toward good outputs through eval loops.
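To see how latency targets and cost-per-token-per-watt connect, here’s a rough calculation (every number is a hypothetical round figure for illustration):

```python
# Back-of-envelope inference economics.
# Assume users expect ~200 ms to first token, then the full reply within ~1 s.
reply_tokens = 150                 # typical short chat reply (assumed)
decode_window_s = 1.0              # time allowed after the first token
tokens_per_s = reply_tokens / decode_window_s   # required decode throughput

chip_watts = 500                   # assumed accelerator board power
usd_per_kwh = 0.10                 # assumed electricity price
joules_per_token = chip_watts / tokens_per_s
usd_per_million_tokens = joules_per_token / 3.6e6 * usd_per_kwh * 1e6
print(round(tokens_per_s), round(usd_per_million_tokens, 4))  # 150 0.0926
```

Under these assumptions a single chip must sustain ~150 tokens/second per user, and electricity alone runs about nine cents per million tokens — multiply by millions of daily calls and the per-watt efficiency of the silicon stops being a footnote.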

A classic analogy: building a manufacturing mould is slow, technically intensive, and expensive — that’s training. Mass production with that mould requires a different skill set and different machinery — that’s inference.

CUDA is the new x86. It won the developer war. But for inference at scale, purpose-built silicon like LPUs will quietly take over while everyone’s still arguing about training benchmarks.


The Bottom Line

The AI chip wars won’t have one winner. GPUs will own training (thanks to CUDA’s lock-in). Google will keep TPUs close to its chest. And LPUs — now under NVIDIA’s roof — will reshape how we think about inference economics.

What’s your bet? Are we heading toward specialized silicon for every workload, or will one chip eventually rule them all?
