Google Unveils Ironwood TPUs and Axion for Inference Era

Google Unveils Ironwood, Its ‘Most Powerful’ and ‘Energy-Efficient’ AI Chip to Date

Google unveils Ironwood, its most powerful TPU, for the age of inference, and Axion Arm VMs promising up to 2× better price-performance for AI workloads.

Written By
Liz Ticong
Nov 7, 2025

Google is turning up the heat in the AI hardware race.

The tech titan has unveiled Ironwood, its newest AI chip, which it describes as its “most powerful and energy-efficient to date.” Google says the chip delivers up to 10× the peak performance of the earlier TPU v5p for large-scale inference and model training.

Announced by Google Cloud executives Amin Vahdat and Mark Lohmeyer, Ironwood TPUs are purpose-built for the most demanding workloads and mark a shift toward what Google calls “the age of inference.”

Inference takes over as AI’s new arena

Google is framing that shift as a turning point for the industry, moving from teaching AI to keeping it running around the clock. In this “age of inference,” the spotlight falls on performance, responsiveness, and the seamless coordination between general-purpose compute and machine learning accelerators.

As models evolve to handle real-time reasoning and decision-making, Google says the next breakthroughs will come from system-level design, rather than just larger datasets or more complex architectures. That philosophy underpins Ironwood: a chip built to power AI that lives in motion.

Pushing AI performance to a new extreme

Google’s new Ironwood TPU is engineered to handle the heaviest AI workloads, from large-scale model training to rapid-fire inference, with a leap in speed and efficiency that redefines its silicon line.

The chip delivers 10× the peak performance of TPU v5p and more than 4× the performance per chip of its predecessor, Trillium (v6e), making it Google’s most advanced processor for both training and serving AI models.

Built with enhanced cooling, reliability, and power efficiency, Ironwood is designed for “planet-scale” deployment, capable of scaling across thousands of chips without losing stability.

Early adopters are already putting that promise to the test. Anthropic plans to tap into up to 1 million TPUs to serve its Claude models, while Lightricks and Essential AI report major boosts in generation quality and training efficiency.

Anthropic Head of Compute James Bradbury said, “Ironwood’s improvements in both inference performance and training scalability will help us scale efficiently while maintaining the speed and reliability our customers expect.”

Where 9,000 chips think as one

Ironwood doesn’t stand alone — it’s the beating heart of Google’s AI Hypercomputer, a system built to make thousands of processors work together as one.

Each superpod links up to 9,216 TPUs via a 9.6 terabit-per-second inter-chip network, enabling the chips to communicate almost instantly and operate as a unified system. Together, the chips in a superpod share 1.77 petabytes of high-bandwidth memory, easing the data bottlenecks that typically slow large-scale AI processing.
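As a rough back-of-the-envelope check, the quoted figures can be divided out to estimate memory per chip. The sketch below assumes the 1.77 petabytes is pooled across the 9,216 chips of a single superpod and uses decimal units (1 PB = 1,000,000 GB); the constant and function names are illustrative, not Google's.

```python
# Figures quoted in the article.
CHIPS_PER_SUPERPOD = 9_216
SHARED_MEMORY_PB = 1.77

# Assumption: decimal units, so 1 PB = 1,000,000 GB.
GB_PER_PB = 1_000_000


def memory_per_chip_gb(total_pb: float, chips: int) -> float:
    """Split a pooled memory figure evenly across all chips."""
    return total_pb * GB_PER_PB / chips


per_chip = memory_per_chip_gb(SHARED_MEMORY_PB, CHIPS_PER_SUPERPOD)
print(f"{per_chip:.0f} GB per chip")  # prints: 192 GB per chip
```

Under those assumptions, each chip contributes roughly 192 GB to the shared pool.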

In practice, this means that enormous models, such as chatbots, image generators, or research systems, can run faster, more efficiently, and without interruption. By enabling thousands of chips to work together seamlessly, Google can deliver faster responses, lower latency, and smoother performance for businesses and developers using its AI infrastructure.

To keep that vast web running smoothly, Google relies on optical circuit switching, a self-healing fabric that reroutes traffic around interruptions almost instantly. The company says its fleet has maintained about 99.999% uptime since 2020, supported by advanced liquid cooling and automated cluster management.
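To put “five nines” in perspective, an availability percentage can be converted into allowable downtime. This is a minimal sketch that assumes a 365-day year and treats the 99.999% figure as a simple annual average; the function name is hypothetical.

```python
def downtime_minutes_per_year(availability_pct: float,
                              days_per_year: float = 365.0) -> float:
    """Convert an availability percentage into minutes of downtime per year."""
    minutes_per_year = days_per_year * 24 * 60
    return minutes_per_year * (1 - availability_pct / 100)


five_nines = downtime_minutes_per_year(99.999)
print(f"{five_nines:.1f} minutes of downtime per year")  # prints: 5.3 minutes of downtime per year
```

By that measure, 99.999% uptime leaves room for only a little over five minutes of downtime in a year.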

A co-designed software layer, including Cluster Director for Google Kubernetes Engine (GKE), MaxText, vLLM, and GKE Inference Gateway, helps squeeze more performance from the hardware, cutting latency and lowering serving costs for customers operating at planetary scale.

Axion steps in where power meets practicality

Alongside Ironwood, Google expanded Axion, its line of Arm-based CPUs built to power the everyday computing that keeps AI systems running smoothly. The new additions are N4A, now in preview, and C4A Metal, coming soon. Google says N4A offers up to twice the price-performance of comparable x86-based virtual machines.

In simpler terms, they promise more computing power for less cost and energy, making it easier and cheaper for businesses to run the supporting tasks that AI depends on, from data processing and analytics to app hosting and system management.

Companies testing Axion say the improvements are already tangible. Vimeo, for instance, reported a 30% boost in video transcoding performance, and ZoomInfo measured a 60% improvement in price-performance for core data workloads. Rise said the new instances helped cut its compute consumption by 20% while maintaining low latency and strong margins.
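Price-performance gains translate into cost savings less directly than the headline numbers suggest. The sketch below assumes price-performance means work done per dollar, so the cost of a fixed workload scales as the inverse of the improvement; the function name is hypothetical, and the inputs are the figures quoted above.

```python
def relative_cost(price_perf_gain: float) -> float:
    """Cost of the same workload after a fractional price-performance gain.

    Assumes price-performance = work per dollar, so a gain of g
    means the same work costs 1 / (1 + g) of what it did before.
    """
    return 1 / (1 + price_perf_gain)


# "Up to twice the price-performance" is a 100% gain.
print(f"{relative_cost(1.0):.1%} of the original cost")  # prints: 50.0% of the original cost

# ZoomInfo's reported 60% improvement.
print(f"{relative_cost(0.6):.1%} of the original cost")  # prints: 62.5% of the original cost
```

On that reading, a 60% price-performance improvement cuts the bill for the same work by about 37.5%, not 60%.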

Together, Ironwood and Axion give Google a one-two punch: raw acceleration for AI at scale, paired with efficient, general-purpose compute for everything surrounding it. It’s a full-stack strategy built for a future where intelligence never pauses, and where the cloud itself learns to think faster.

Google’s latest energy ambitions are just as audacious, with a new plan to harness solar power from orbit to keep its AI infrastructure running.
