Substack / RSS Intelligence

SemiAnalysis

The Great GPU Shortage – Rental Capacity – Launching our H100 1 Year Rental Price Index

Daniel Nishball · 2026-04-02 · 3732 words

Anthropic’s Claude 4.6 Opus and Claude Code have soared in demand. Anthropic’s ARR has nearly tripled in just a single quarter from $9B at the end of last year to over $25B today. Open models such as GLM and Kimi K2.5 caused open model use cases to soar. Capital raises by firms like Anthropic, OpenA

SemiAnalysis

Dissecting Nvidia Blackwell - Tensor Cores, PTX Instructions, SASS, Floorsweep, Yield

Kimbo Chen · 2026-03-31 · 3718 words

Nvidia’s Datacenter Blackwell GPU (SM100) represents one of the largest GPU microarchitecture changes in a generation, yet no detailed whitepaper exists. Until today, there has been no public datacenter Blackwell architecture microbenchmarking study on PTX and SASS instructions, such as UMMA and TMA, with

SemiAnalysis

Nvidia – The Inference Kingdom Expands

Dylan Patel · 2026-03-24 · 5635 words

At GTC 2026, Nvidia delivered an event packed full of groundbreaking announcements. Nvidia’s pace of innovation is not showing any signs of slowing, as they introduced three entirely new systems this year: Groq LPX, Vera ETL256, and STX. Also announced were updates to Nvidia’s Kyber

SemiAnalysis

The Great AI Silicon Shortage

Ivan Chiam · 2026-03-12 · 2662 words

SemiAnalysis x Fluidstack Hackathon: We’re hosting a hackathon ahead of GTC this year on Sunday the 15th, in partnership with Fluidstack, and would love to have you join us! Apply here: https://luma.com/SAxFSHack. The Compute Shortage: Token demand is skyrocketing and the need for AI compute continues

SemiAnalysis

Are AI Datacenters Increasing Electric Bills for American Households?

Aishwarya Mahesh · 2026-03-03 · 4890 words

SemiAnalysis x Fluidstack is launching GTC with a 48-hour, full-stack AI infrastructure hackathon on March 15th from Power to Prefill, Dirt to Decode. With speakers from OpenAI, GPU MODE, and Thinking Machines, plus compute grants

SemiAnalysis

Vera Rubin – Extreme Co-Design: An Evolution from Grace Blackwell Oberon

Wega Chu · 2026-02-25 · 14303 words

At CES 2026, Nvidia officially announced in detail all 6 Rubin platform products: the Rubin GPU, Vera CPU, NVLink 6 Switch, ConnectX-9, BlueField-4, and Spectrum-6. VR NVL72 is the second generation of Nvidia’s rack scale Oberon architecture that takes the stage. With competition catching up on rack

SemiAnalysis

InferenceX v2: NVIDIA Blackwell Vs AMD vs Hopper - Formerly InferenceMAX

Dylan Patel · 2026-02-16 · 11626 words

Introduction: InferenceXv2 (formerly InferenceMAX) builds on the foundation established by InferenceMAXv1, our open-source, continuously updated inference benchmark that has set a new standard for AI inference performance and economics. InferenceMAXv1 moved beyond static, point-in-time benchmarks by

SemiAnalysis

CPUs are Back: The Datacenter CPU Landscape in 2026

Gerald Wong · 2026-02-09 · 10092 words

Since 2023, the datacenter story has been simple. GPUs and networking are king. The arrival and subsequent explosion of AI Training and Inference have shifted compute demands away from the CPU. This meant that Intel, the primary supplier of server CPUs, failed to ride the wave of datacenter buildout

SemiAnalysis

Memory Mania: How a Once-in-Four-Decades Shortage Is Fueling a Memory Boom

Dylan Patel · 2026-02-06 · 3045 words

Prices of memory are going crazy. SemiAnalysis has been calling this out for over a year since late 2024. The scariest thing is that we aren't even close to the peak. We go through fab by fab production and expansion versus detailed end market demand by memory type to forecast memory revenue, pricin

SemiAnalysis

Claude Code is the Inflection Point

Doug O'Laughlin · 2026-02-05 · 3209 words

4% of GitHub public commits are being authored by Claude Code right now. At the current trajectory, we believe that Claude Code will be 20%+ of all daily commits by the end of 2026. While you blinked, AI consumed all of software development. Our sister publication Fabricated Knowledge described soft

SemiAnalysis

From Tokens to Burgers: A Water Footprint Face-Off

Nicolas Bontigui · 2026-01-15 · 2252 words

Tokens and burgers are not two ideas you often see paired in the same title. If you’re wondering what they could possibly have in common, stick with us for this short piece, because today we’re putting them face to face in a duel over a hot topic: datacenter water usage. If you want a different per

SemiAnalysis

Interconnects Beyond Copper, 1,000 CFETs, SK Hynix Next-Gen NAND, 2D Materials, and More

Gerald Wong · 2026-01-13 · 4457 words

It’s an odd time in the chipmaking industry. On one hand, we are ramping into the biggest supercycle ever seen. Advanced logic, DRAM, NAND: customers can’t get enough, chipmakers can’t build capacity fast enough, and we may soon be limited by fab equipment supply. On the other hand, the rapid techno

SemiAnalysis

Apple-TSMC: The Partnership That Built Modern Semiconductors

Sravan Kundojjala · 2026-01-08 · 4642 words

In 2013, TSMC made a $10 billion bet on a single customer. Morris Chang committed to building 20nm capacity with uncertain economics on the promise that Apple would fill those fabs. “I bet the company, but I didn’t think I would lose,” Chang later said. He was right. Apple’s A8 chip launched in 2014

SemiAnalysis

RL Environments and RL for Science: Data Foundries and Multi-Agent Architectures

AJ Kourabi · 2026-01-06 · 5756 words

Last June, we argued that scaling RL is the critical path to unlocking further AI capabilities. As we will show, the past several months have affirmed our thesis: major capability gains are coming from rampi

SemiAnalysis

Co-Packaged Optics (CPO) Book – Scaling with Light for the Next Wave of Interconnect

Dylan Patel · 2026-01-01 · 22850 words

Co-Packaged Optics (CPO) has long promised to transform datacenter connectivity, but it has taken a long time for the technology to come to market, with tangible deployment-ready products only arriving in 2025. In the meantime, pluggable transceivers have kept pace with networking requirements and r

SemiAnalysis

How AI Labs Are Solving the Power Crisis: The Onsite Gas Deep Dive

Ajey Pandey · 2025-12-30 · 7567 words

The Grid is Old and Tired: Nearly two years ago, we were the first to predict a looming power crunch. In our report AI Datacenter Energy Dilemma - Race for AI Datacenter Space, we forecasted AI Power Demand in the US to grow from ~3GW in 2023 to over 28GW by 2026 – a pressure that would overwhelm Ame

SemiAnalysis

AWS Trainium3 Deep Dive | A Potential Challenger Approaching

Dylan Patel · 2025-12-04 · 16084 words

Trainium3: A New Challenger Approaching! Hot on the heels of our 10K word deep dive on TPUs, Amazon launched Trainium3 (Trn3) general availability and announced Trainium4 (Trn4) at its annual AWS re:Invent. Amazon has had the longest and broadest history of custom silicon in the datacenter. While th

SemiAnalysis

TSMC Overseas Fabs – A Success?

Dylan Patel · 2025-12-01 · 3162 words

Earlier this year, U.S. President Donald Trump called TSMC “one of the most important businesses in the world.” Although he was incorrect (TSMC is the most important business in the world), his government and others do realize the importance of bringing TSMC into their countries. TSMC CEO C.C. Wei,

SemiAnalysis

TPUv7: Google Takes a Swing at the King

Dylan Patel · 2025-11-28 · 9439 words

The two best models in the world, Anthropic’s Claude 4.5 Opus and Google’s Gemini 3, have the majority of their training and inference infrastructure on Google’s TPUs and Amazon’s Trainium. Now Google is selling TPUs physically to multiple firms. Is this the end of Nvidia’s dominance? The dawn of the

SemiAnalysis

Microsoft's AI Strategy Deconstructed - From Energy to Tokens

Jeremie Eliahou Ontiveros · 2025-11-12 · 5794 words

Microsoft was at the top of AI in 2023 and 2024, but then a year ago they changed course drastically. They paused their datacenter construction significantly and slowed down their commitments to OpenAI. We called this out a year ago to datacenter model clients and later wrote a newsletter piece abou

SemiAnalysis

xAI’s Colossus 2 – First Gigawatt Datacenter In The World, Unique RL Methodology, Capital Raise

Jeremie Eliahou Ontiveros · 2025-09-16 · 1904 words

Much has been written about xAI’s Colossus 1. The Memphis build belongs in the history books: the largest AI training cluster, erected from scratch in 122 days. With roughly 200,000 H100/H200s and ~30,000 GB200 NVL72, it remains, today, the largest fully operational, single-coherent cluster (setting

SemiAnalysis

Another Giant Leap: The Rubin CPX Specialized Accelerator & Rack

Dylan Patel · 2025-09-10 · 6404 words

Nvidia announced the Rubin CPX, a solution specifically optimized for the prefill phase, with the single-die Rubin CPX heavily emphasizing compute FLOPS over memory bandwidth. This is a game changer for inference, and its significance is surpassed only by the March 2024 announ

SemiAnalysis

Huawei Ascend Production Ramp: Die Banks, TSMC Continued Production, HBM is The Bottleneck

Dylan Patel · 2025-09-08 · 5666 words

Compute is the lifeblood of AI. He who controls the compute will control the production of tokens and reap the benefits of AI. Without compute you do not have a seat at the table. The United States technology community is all in on compute and AI as the next platform

SemiAnalysis

Amazon’s AI Resurgence: AWS & Anthropic’s Multi-Gigawatt Trainium Expansion

Jeremie Eliahou Ontiveros · 2025-09-03 · 2574 words

Two-and-a-half years ago, we flagged a looming “cloud crisis” at AWS. Today, the evidence has mounted. AWS is the crown jewel of the Amazon empire, generating ~60% of group profits, and dominating the lucrative Cloud Computing market. But it struggles to translate this strength into the new GPU/XPU

SemiAnalysis

H100 vs GB200 NVL72 Training Benchmarks – Power, TCO, and Reliability Analysis, Software Improvement Over Time

Dylan Patel · 2025-08-20 · 3366 words

Frontier model training has pushed GPUs and AI systems to their absolute limits, making cost, efficiency, power, performance per TCO, and reliability central to the discussion on effective training. The Hopper vs Blackwell comparisons are not as simple as Nvidia would have you believe. In this repo

SemiAnalysis

GPT-5 Set the Stage for Ad Monetization and the SuperApp

Doug O'Laughlin · 2025-08-13 · 2674 words

To many power users (Pro and Plus), GPT-5 was a disappointing release. But on closer inspection, the real release is focused on the vast majority of ChatGPT’s users: the rapidly growing free userbase of 700m+. Power users should be disappointed; this release wasn’t for them. The real

SemiAnalysis

Scaling the Memory Wall: The Rise and Roadmap of HBM

Dylan Patel · 2025-08-12 · 4671 words

The first portion of this report will explain HBM, the manufacturing process, dynamics between vendors, KVCache offload, disaggregated prefill decode, and wide / high-rank EP. The rest of the report will dive deeply into the future of HBM. We will cover the revolutionary change coming to HBM4 with c

SemiAnalysis

Robotics Levels of Autonomy

Reyk Knuhtsen · 2025-07-30 · 11131 words

Robots have powered manufacturing for decades, yet they stayed single-purpose and thrived only in perfect settings. Previous attempts at intelligent machines overpromised and underdelivered. But they were too early. Today, modern AI paradigms convert most robot roadblocks into data problems and push

SemiAnalysis

Intel 18A Details & Cost, Future of DRAM 4F2 vs 3D, Backside Power Adoption (or Not), China’s FlipFET, Digital Twins from Atoms to Fabs, and More

Dylan Patel · 2025-07-21 · 3428 words

Longtime readers will recall that SemiAnalysis covers more than just datacenters and AMD. Today we’re back to semiconductors with a tech-focused roundup of the best from this year’s VLSI conference, the premier design and integration conference. That includes the latest in chip manufacturing: fab digital tw

SemiAnalysis

Meta Superintelligence – Leadership Compute, Talent, and Data

Dylan Patel · 2025-07-11 · 3151 words

Meta’s shocking purchase of 49% of Scale AI at a ~$30B valuation shows that money is of no concern for the $100B annual cashflow ad machine. Despite seemingly unlimited resources, Meta has been falling behind foundation labs in model performance. The real wake-up call came when Meta lost its lead i