New data center cooling system submerges servers in a liquid that boils

This article is courtesy of TechRepublic Premium. For more content like this, as well as a full library of ebooks and whitepapers, sign up for Premium today. Read more about it here.

Two-phase immersion cooling system based on 3M's Novec liquid could dramatically shrink data center footprints and reduce energy costs.

Credit: 3M

Energy bills for data centres can account for almost one third of the cost of building a facility and running it long term.

A significant amount of the electricity used within a data centre powers its cooling, and there is a perennial effort to find more efficient ways to stop servers overheating. Web giants such as Facebook and Google are leading this efficiency drive, running their data centres with a Power Usage Effectiveness (PUE) that hovers around 1.1, strides ahead of a global average of 1.8. The pair have cut power consumption through various techniques, including running cold aisles above 80F, minimising the use of chillers and cutting the number of AC/DC conversions.

To knock data centre energy draw down further, researchers are investigating less common approaches to drawing heat away from servers, such as two-phase immersion cooling.

The method submerges servers in a bath of fluid, which is warmed to boiling point by heat from the motherboard. As the fluid boils and changes from liquid to vapour it absorbs most of the energy radiating from the board.

The hope is that such a system could help a data centre achieve a PUE close to 1, meaning almost all of the power the centre consumes would go to computing rather than to overhead such as cooling.
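PUE is total facility power divided by the power delivered to IT equipment, so a PUE of 1.0 would mean every watt goes to computing. The sketch below is a minimal illustration of that ratio using the PUE figures quoted in this article. The function names and the 1 MW IT load are assumptions made for illustration, not figures from the trial.

```python
def overhead_fraction(pue: float) -> float:
    """Fraction of total facility power that does NOT reach IT equipment.

    PUE = total facility power / IT equipment power, so the IT share of
    the total is 1 / PUE and the overhead share is 1 - 1 / PUE.
    """
    if pue < 1.0:
        raise ValueError("PUE cannot be below 1.0")
    return 1.0 - 1.0 / pue


def facility_power_kw(it_load_kw: float, pue: float) -> float:
    """Total facility draw needed to run a given IT load at a given PUE."""
    return it_load_kw * pue


if __name__ == "__main__":
    it_load_kw = 1000.0  # hypothetical 1 MW IT load, not from the article
    examples = [("global average", 1.8), ("Facebook and Google", 1.1), ("PUE of 2", 2.0)]
    for label, pue in examples:
        total = facility_power_kw(it_load_kw, pue)
        share = overhead_fraction(pue)
        print(f"{label}: PUE {pue} -> {total:.0f} kW total, {share:.0%} overhead")
```

At a PUE of 2.0 the overhead share is exactly one half, which is the situation Osburn describes below when she talks about using half of a facility's power for cooling.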

Dr. Jeanie Osburn works at the Naval Research Laboratory in Washington, DC, where a new open bath immersion cooling system will be trialled, with the help of a consortium of companies including multinational 3M, chip maker Intel, high performance computer specialist SGI, and Schneider Electric.

"People are hitting a ceiling on how much power they can put into their facilities," she said.

"We've got people with several megawatts of power pulled into their centre. If you're using half of that power to cool stuff then you're looking for a much more efficient way of cooling, so you can actually field more resources to do work for you. This holds a lot of promise."

How the system works

Each test site will house half a rack of SGI Ice X servers in a sealed tank of Novec, completely immersed in the dielectric fluid made by 3M. Although submerging electronic equipment in liquid may seem counterintuitive, the servers function normally while submerged, as Novec neither conducts electricity nor corrodes metals on the boards.

Credit: Allied Control
Novec can boil at various temperatures depending on how it's made up, but the version used during initial tests will boil at 49C. Novec is a non-flammable, non-toxic liquid also used in fire extinguishing systems inside data centres.

The heat removed from each server as the Novec boils and changes from liquid to vapour is significantly greater than that absorbed by just passing cold water or air over the surface of the board.

"The energy absorbed in the fluid during boiling is orders of magnitude greater than just a cool fluid flowing over a warm component, so we get significant better cooling with two-phase," said Michael Patterson, senior architect for power, packaging, and cooling in Intel's HPC systems pathfinding team.

The Novec vapour floats up to a condenser that sits above the bath of liquid and condenses the Novec back into a liquid, which then falls back into the bath, where the process begins again.

Cool water or a fluid with a higher boiling point, such as glycol, is pumped to the condensing plate where it absorbs energy from the Novec vapour and triggers condensation. The fluid is pumped to a dry cooler tower where it is air cooled, before being pumped back to the condensing plate.

"We're hoping to get as close to 1 PUE as possible," said Osburn, adding this rating would be a substantial improvement over the PUE of almost 2 that the lab currently achieves using a mixture of air cooling and water-cooled doors on server enclosures.

"It gets really hot and muggy in Washington, DC and we're thinking if you can make a dry cooler work in a hot and muggy climate then you're probably going to be able to make it work just about anywhere."

"One of the advantages of this technology is you basically can cool this without having to make a huge investment in infrastructure," said Osburn, referring to the ability to dispense with much of the infrastructure associated with air and water cooling technologies, such as raised floors, chilled water plants, air handlers and hot/cold aisles. Removing these elements from the data centre lowers running costs and frees up floor space, she said.

Installation in a new data centre with correctly configured servers should be straightforward, said Patterson.

"It's really simple: you bring in a box and put the servers with the Novec in it and you connect water, power, the fabric and that's it," he said.

"The other nice part is it doesn't have to be in a raised floor environment. You could argue it would use less space because you don't have to worry about the hot-cold aisle air-flow management and all of that."

The tanks at both sites will house half an IRU chassis laid on its back, designed to hold 144 Intel Sandy Bridge core sockets and accommodate 80kW, though neither site will operate at this level initially.

The effectiveness of the two-phase heat transfer and lack of bulky fans or water cooling equipment on boards could allow far more servers to be packed into a smaller space in future deployments of the system, as Osburn explains.

"You can imagine you could eventually build motherboards so you could have a very high density of chips on the motherboard, and in the same footprint have twice, three, four, eight - who knows how many times denser computing capacity," said Osburn.

Promotional materials for the Novec-based, two-phase cooling system claim that, with an optimal setup, it could support up to 100kW of computing power per square metre, allowing ten times as much computing power to be packed into the same space as a typical air-cooled system.

The two sites will test the expectation that the system could cost 95 percent less to run than older air cooling systems in data centres with a PUE of 2.0 or above. The system at NRL in Washington is expected to be up and running by this summer and researchers will spend one year studying factors such as the system's energy usage. The PUE that will be measured at NRL is a modified PUE and will include the fans and pumps of the dedicated APC dry tower.

Another advantage of immersion over a conventional water-based system is that cooling is not limited to specific components that have cold water pumped over their surface.

Credit: Allied Control
"It's an elegant solution in the sense that it picks up all the heat from the board," said Patterson, adding that the greatest absorption of heat will be concentrated on the hottest components, as the fluid touching these areas will boil.

"[With water-cooled systems] you've still got the majority of the board and all the other components left without a cooling solution, so often times those systems line up being a hybrid solution," he said, where air cooling is run alongside to pick up residual heat and "essentially you're paying for two cooling systems".

Heat from Novec immersion systems could also be harvested for use elsewhere, helping improve a facility's PUE.

The immersion systems can be run using Novec configured to have differing boiling points. The second test system, based at 3M's corporate headquarters in St. Paul, Minnesota, will switch to using a Novec mixture that boils at 74C. The vapour from the system will warm 60C water to 70C and test the viability of extracting enough heat energy to do useful work elsewhere. The system will dissipate about 78kW and target a partial PUE of less than 1.01.

"That 70 - 80C Novec that goes to a heat exchanger where you can then pick that heat up in the water. As the water goes across the condensing tubes it will warm up to that temperature to condense the Novec. Now all of a sudden you've got some pretty good quality energy in the fluid that you could possibly use for a heat recovery system," said Intel's Patterson.
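The 3M figures above imply a water flow rate for the heat recovery loop, which can be estimated from a steady-state energy balance, Q = m × c_p × ΔT. The sketch below is a rough illustration only: it assumes all 78kW ends up in the water, uses an approximate specific heat for liquid water, and the function name is hypothetical.

```python
# Approximate specific heat of liquid water, in J/(kg*K). This is a
# textbook value assumed for illustration, not a figure from the trial.
WATER_SPECIFIC_HEAT_J_PER_KG_K = 4186.0


def water_mass_flow_kg_per_s(heat_load_w: float, inlet_c: float, outlet_c: float) -> float:
    """Mass flow of water needed to absorb heat_load_w for the given temperature rise.

    Rearranges the steady-state balance Q = m_dot * c_p * (T_out - T_in).
    """
    delta_t = outlet_c - inlet_c
    if delta_t <= 0:
        raise ValueError("outlet temperature must exceed inlet temperature")
    return heat_load_w / (WATER_SPECIFIC_HEAT_J_PER_KG_K * delta_t)


if __name__ == "__main__":
    # Figures quoted for the St. Paul system: about 78kW, water warmed from 60C to 70C.
    flow = water_mass_flow_kg_per_s(78_000.0, 60.0, 70.0)
    print(f"Required water flow: {flow:.2f} kg/s")
```

Under these assumptions the loop would need a little under 2 kg of water per second. Real losses and a different temperature rise would change that figure.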

Immersion cooling is also suited to future hardware designs, as its cooling apparatus isn't tied to existing server motherboard and CPU designs in the same way that custom water blocks, heat sinks and fans are, claims 3M.

Maintenance of systems cooled by Novec immersion is simpler than systems that bathe servers in other fluids, for example mineral oil, said Osburn, because Novec evaporates from the board within "minutes" of it being removed from the tank and doesn't drip and pool on the floor. The 3M open bath immersion tank can also be opened to allow components to be hot swapped.

Real-world deployments

A 3M Novec two-phase, immersion cooling system has already been deployed in an Allied Control 500kW data centre in Hong Kong.

The Immersion-2 system differs from the test systems in that it houses 80W application-specific integrated circuit boards designed for Bitcoin mining inside smaller tanks of Novec. Inside the tanks is a condensing coil filled with facility water that flows through to a dry tower.

Credit: Allied Control
The system uses 20 conventional 19-inch racks, each housing three of these tanks. Each tank holds 92 boards arranged vertically on a backplane with 10mm distance between each board. The overall system occupies a floor space of less than 15 square metres.

The enclosure and rack design allows for 18 - 225kW per rack. The PUE of the Hong Kong system was 1.01 when it debuted and is expected to be 1.02 or less in the summer.
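The figures quoted for the Hong Kong deployment can be cross-checked with some simple arithmetic. The sketch below assumes that the 80W figure applies to each board and treats the floor area as the full 15 square metres, so the density it reports is a lower bound. The function name is hypothetical.

```python
RACKS = 20
TANKS_PER_RACK = 3
BOARDS_PER_TANK = 92
WATTS_PER_BOARD = 80.0  # assumes the quoted 80W is the draw of each board
FLOOR_AREA_M2 = 15.0    # the article says "less than 15", so density is a lower bound


def hong_kong_summary() -> dict:
    """Derive board count, IT load, per-rack load and power density from the quoted figures."""
    boards = RACKS * TANKS_PER_RACK * BOARDS_PER_TANK
    it_load_kw = boards * WATTS_PER_BOARD / 1000.0
    return {
        "boards": boards,
        "it_load_kw": it_load_kw,
        "kw_per_rack": it_load_kw / RACKS,
        "kw_per_m2": it_load_kw / FLOOR_AREA_M2,
    }


if __name__ == "__main__":
    for name, value in hong_kong_summary().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the 5,520 boards draw about 442kW in total, which is consistent with a 500kW facility, at roughly 22kW per rack and at least 29kW per square metre of floor space.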

"This is a true facility PUE but do take into account that it is not a conventional datacenter," said Phil Tuma, of 3M Electronics Materials Solutions division.

"They have parasitic power draw for lighting but their site lacks much of the overhead that would add onto a conventional datacenter's PUE. One could, therefore think of it as a partial PUE even though, in their case, it is a true PUE."

Limitations of immersion cooling

Bathing servers in fluid is a radical departure from the approach taken by most current cooling systems, and some of the idiosyncrasies of the two-phase immersion system mean it is better suited to being built into new facilities than retrofitted in existing racks.

For instance, while Novec doesn't prevent servers from operating, the fluid does dissolve the thermal grease that sits on top of air- or water-cooled CPUs and helps conduct heat away from their surfaces.

The CPUs used inside the Novec test systems will be stripped of heat sinks and other conventional cooling apparatus and instead be coated with a material designed to promote boiling.

"Essentially it's a roughened surface full of nooks and crannies, that actually allows the boiling to start with tiny bubbles in that rougher surface," he said.

Another consideration is that spinning platter hard drives are not suited to being bathed in Novec, although solid state drives should work fine.

"If there's local storage it has to be solid state," said Patterson.

Other obstacles to adoption may be more psychological, said Osburn, such as wariness about placing a liquid near electronic equipment or about breathing in Novec vapours escaping from the sealed system.

There are also unanswered questions about the cost of running the immersion system.

What will be of particular interest to researchers is how often the Novec fluid needs replacing over time.

Patterson said the Novec fluid is "not cheap" and that replacing it on a regular basis could damage the return on investment of such a system.

"We've done TCO models and it truly depends on the Novec retention rate. What I mean by that is the retention rate as it boils. The system is sealed but it's not like it's hermetically sealed, so there might be a small amount of Novec that leaves the system. The Novec itself is fairly expensive so if we're replacing Novec on a frequent basis then it will not pan out in terms of an economic positive impact.

"On the other hand if the [loss of] Novec can be limited, in terms of its departure from the system when it evaporates then it has a positive ROI.

"We are going to be very carefully measuring and monitoring the capacity of Novec that has to be made up."

The rate at which Novec is retained - and the cost of running the system - could depend on the type of processing jobs being carried out by the servers, he said.

"It's fascinating because the retention rate will depend on workloads. If you're heating and cooling a lot and changing states in the hardware you'll get a different response than if you were just at a steady state, so there's a lot of different aspects to this that we're going to be working through to understand whether the model is going to pan out for us."
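The trade-off Patterson describes, energy savings weighed against the cost of topping up lost fluid, can be sketched as a simple break-even calculation. This is not the consortium's TCO model. Every number in the example is a hypothetical placeholder, since the article gives no fluid price, fluid inventory or savings figure, and the function names are invented for illustration.

```python
def annual_net_saving(energy_saving_per_year: float,
                      fluid_inventory_kg: float,
                      annual_loss_fraction: float,
                      fluid_price_per_kg: float) -> float:
    """Energy saving minus the yearly cost of replacing fluid that escapes."""
    makeup_cost = fluid_inventory_kg * annual_loss_fraction * fluid_price_per_kg
    return energy_saving_per_year - makeup_cost


def break_even_loss_fraction(energy_saving_per_year: float,
                             fluid_inventory_kg: float,
                             fluid_price_per_kg: float) -> float:
    """Yearly loss fraction at which fluid top-ups cancel out the energy saving."""
    return energy_saving_per_year / (fluid_inventory_kg * fluid_price_per_kg)


if __name__ == "__main__":
    # All inputs below are hypothetical placeholders, not figures from the trial.
    saving = 50_000.0      # yearly energy saving, in any currency
    inventory_kg = 1_000.0  # fluid held in the tank
    price_per_kg = 100.0    # fluid price in the same currency

    limit = break_even_loss_fraction(saving, inventory_kg, price_per_kg)
    print(f"Break-even loss: {limit:.0%} of the fluid per year")
    print(f"Net saving at 10% loss: {annual_net_saving(saving, inventory_kg, 0.10, price_per_kg):.0f}")
```

A fuller model would also have to cover capital costs and the workload dependence Patterson mentions, which this sketch ignores.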

For the project to be considered a success, Patterson said, the system would need to be economical to use for all types of computing workloads, something he believes will be possible.

"We would not go forward with a product where it has some benefits but only if you do this kind of workload," he said.

"We're starting in the high performance computing space because it's high density with sophisticated customers, and we think this will be the easiest spot to display the benefits.

"But we wouldn't have got involved with support on the processor side if we didn't see a strong potential for a positive outcome."
