Liquid Cooling: The Next Bottleneck in AI Infrastructure

The explosive growth of artificial intelligence (AI) is pushing data centers to their thermal limits. While GPUs and CPUs have largely transitioned to liquid cooling for optimal performance, storage systems lag behind, creating an inefficient hybrid architecture that undermines the benefits of modern cooling solutions. This isn’t merely a cost issue; it’s a fundamental structural liability that impacts rack density, sustainability, and ultimately, the ability to scale AI deployments.

The Inefficiency of Hybrid Cooling

Currently, many AI deployments rely on a patchwork system: liquid-cooled processors alongside air-cooled storage. This approach is operationally inefficient. Organizations end up maintaining two entirely separate and expensive cooling infrastructures – liquid loops for compute and traditional CRAC units for storage – without fully realizing the total cost of ownership (TCO) benefits of a unified system.

The problem is exacerbated by physical constraints. Bulky liquid-cooling components obstruct airflow within server chassis, concentrating thermal stress on air-cooled drives, memory, and networking hardware. Fans struggle to adequately dissipate heat around liquid plumbing, forcing the most heat-sensitive components into the worst possible thermal environment.

Water Consumption: An Overlooked Crisis

Beyond cost and performance, the environmental impact is significant. Air-cooled systems rely heavily on evaporative cooling towers, which can consume millions of gallons of water over time. As rack power densities increase, this water penalty becomes unsustainable. According to Hardeep Singh, thermal-mechanical hardware team manager at Solidigm, the current reliance on evaporative cooling is “environmentally and economically indefensible” in the long run.
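To put "millions of gallons" in perspective, here is a back-of-the-envelope sketch. The WUE figure of 1.8 liters per kWh of IT energy is an assumed industry-typical value for evaporative cooling, not a number from this article:

```python
# Rough annual water use of an evaporative-cooled data center.
# WUE (Water Usage Effectiveness) of ~1.8 L/kWh is an assumed
# industry-typical figure, used here only for illustration.

def annual_water_liters(it_load_kw: float, wue_l_per_kwh: float = 1.8) -> float:
    """Water consumed per year for a given IT load, assuming 24/7 operation."""
    hours_per_year = 8760
    it_energy_kwh = it_load_kw * hours_per_year
    return it_energy_kwh * wue_l_per_kwh

liters = annual_water_liters(1000)  # a modest 1 MW IT load
gallons = liters / 3.785
print(f"{liters:,.0f} L/year (~{gallons:,.0f} gallons)")
```

Even a modest 1 MW facility lands in the millions of gallons per year under these assumptions, which is why the water penalty scales so badly with rack density.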

The Shift to System-Level Thermal Design

Modern AI infrastructure isn’t built server by server; it’s engineered as tightly integrated rack- and pod-level systems. Power delivery, cooling distribution, and component placement are now inseparable. This means that storage architectures designed for airflow-dependent data centers are becoming a limiting factor. As GPUs move toward fully liquid-cooled, fanless designs, storage must adapt or become a bottleneck.

Storage: From Passive to Active Participant

Historically, storage was treated as a passive subsystem. This is no longer viable. Scaling AI now depends on whether storage can integrate cleanly into liquid-cooled GPU systems without fragmenting cooling architectures or constraining rack-level design.

Scott Shadley, director of leadership narrative and evangelist at Solidigm, emphasizes that the race to scale AI is no longer just about GPU count; it's about who can keep those GPUs cool reliably and efficiently. Techniques like KV cache offload, which moves attention key/value data between GPU memory and high-speed storage, make storage latency and thermal performance critical to model-serving efficiency.
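As a loose illustration of the offload pattern described above, here is a toy two-tier cache: a small "GPU memory" tier evicts least-recently-used entries to a larger "storage" tier and promotes them back on access. The class name, eviction policy, and dict-based tiers are hypothetical stand-ins, not NVIDIA's or Solidigm's actual implementation (real systems move key/value tensors over NVMe):

```python
from collections import OrderedDict

class KVCacheOffloader:
    """Toy model of KV cache offload between a small fast tier and a
    large slow tier. Purely illustrative; capacities are in entries,
    not bytes, and 'storage' is just an in-memory dict."""

    def __init__(self, gpu_capacity: int):
        self.gpu_capacity = gpu_capacity
        self.gpu = OrderedDict()   # stands in for HBM (LRU-ordered)
        self.storage = {}          # stands in for an NVMe SSD

    def put(self, seq_id: str, kv_block: bytes) -> None:
        self.gpu[seq_id] = kv_block
        self.gpu.move_to_end(seq_id)
        while len(self.gpu) > self.gpu_capacity:
            victim, block = self.gpu.popitem(last=False)  # evict LRU entry
            self.storage[victim] = block                  # offload to "SSD"

    def get(self, seq_id: str) -> bytes:
        if seq_id in self.gpu:
            self.gpu.move_to_end(seq_id)   # mark as recently used
            return self.gpu[seq_id]
        block = self.storage.pop(seq_id)   # fetch back from storage
        self.put(seq_id, block)            # promote to the GPU tier
        return block
```

The point of the sketch: every miss in the fast tier becomes a storage read on the serving critical path, which is why SSD latency and thermal throttling translate directly into GPU idle time.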

The Road to Integrated Liquid Cooling

Moving to fully integrated liquid-cooled racks improves power usage effectiveness (PUE) and reduces operational costs. It also eliminates the need for noisy computer room air handlers (CRAHs), replacing them with modern, efficient coolant distribution units (CDUs) capable of cooling racks with coolant temperatures as high as 45 °C.
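The PUE arithmetic itself is simple: total facility power divided by IT power, with 1.0 as the ideal. The overhead figures below (roughly 1.5 for an air-cooled facility versus about 1.15 for a liquid-cooled one) are illustrative assumptions, not measurements from this article:

```python
# PUE = total facility power / IT equipment power; 1.0 is the ideal.
# The PUE values used below are assumed for illustration only.

def pue(total_facility_kw: float, it_kw: float) -> float:
    return total_facility_kw / it_kw

it_load = 1000.0                      # 1 MW of IT equipment
air_cooled_total = it_load * 1.5      # assume PUE ~1.5 with CRAHs/air cooling
liquid_total = it_load * 1.15         # assume PUE ~1.15 with CDUs/liquid loops

savings_kw = air_cooled_total - liquid_total
print(f"air: PUE {pue(air_cooled_total, it_load):.2f}, "
      f"liquid: PUE {pue(liquid_total, it_load):.2f}, "
      f"overhead saved: {savings_kw:.0f} kW")
```

Under these assumptions, the same megawatt of IT load needs 350 kW less facility overhead, which is where the operational-cost claim comes from.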

However, seamless integration requires a fundamental redesign of storage. Traditional SSD designs assume airflow for thermal management and often distribute components across both sides of a PCB – assumptions that don’t hold in a liquid-cooled environment. Serviceability is also critical; liquid cooling must not introduce leakage risks during drive insertion or removal.

The Future of Storage: Redesigned for Liquid

Solidigm has collaborated with NVIDIA to address these challenges, focusing on hot-swap compatibility and single-sided cooling solutions. The company advocates for redesigning SSDs with low-resistance heat transfer paths to efficiently conduct heat to a dedicated cold plate.
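Why a low-resistance heat transfer path matters can be sketched with a series thermal-resistance model: steady-state device temperature is the coolant temperature plus power times the summed resistances along the path to the cold plate. All resistance and power values below are illustrative assumptions, not Solidigm specifications:

```python
# Series thermal path to a cold plate:
#   T_device = T_coolant + P * sum(R_i), with each R_i in K/W.
# Power and resistance values are assumed for illustration only.

def device_temp_c(coolant_c: float, power_w: float,
                  resistances_k_per_w: list) -> float:
    """Steady-state device temperature for a series conduction path."""
    return coolant_c + power_w * sum(resistances_k_per_w)

# A hypothetical 25 W SSD on a 45 °C coolant loop:
baseline = device_temp_c(45.0, 25.0, [0.7, 0.5])    # TIM + spreader path
improved = device_temp_c(45.0, 25.0, [0.35, 0.25])  # halved path resistance
print(f"baseline: {baseline:.1f} °C, improved: {improved:.1f} °C")
```

Halving the path resistance in this sketch drops the device from 75 °C to 60 °C at the same 45 °C coolant temperature, illustrating why single-sided layouts with short conduction paths to the cold plate are the design goal.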

The industry is coalescing around standards to ensure interoperability. Solidigm leads the charge, working with the Storage Networking Industry Association (SNIA) and the Open Compute Project (OCP) to develop production-ready designs that integrate cleanly into liquid-cooled GPU platforms.

The shift is clear: storage is no longer an isolated engineering problem. It’s a direct variable in GPU utilization, system reliability, and operational efficiency. The future of AI scaling hinges on embracing this reality.