NVIDIA bringing supercomputer tools to Arm, but who can actually use it?

There's a new integration for NVIDIA CUDA, but the Arm server ecosystem is still relatively small.


NVIDIA announced support for its CUDA GPGPU platform on Arm CPUs at the International Supercomputing Conference in Frankfurt on Monday, bringing to the architecture its artificial intelligence (AI), machine learning, and high-performance computing (HPC) libraries, which execute instructions in parallel on NVIDIA GPUs.

NVIDIA's CUDA platform is utilized extensively in supercomputers, as GPUs are quite effective accelerators for calculations which CPUs are not well suited to perform. Reliance on purpose-built compute accelerators is likely to increase as Moore's Law—the doubling of transistors in integrated circuits about every two years—comes to an end. While GPUs are naturally NVIDIA's forte, other technology firms are pushing alternative compute accelerators, such as quantum computers and memory-driven computing.

For currently deployed supercomputers, NVIDIA GPUs are found in "22 of the world's 25 most energy-efficient supercomputers," according to the Green500 list, also published Monday. Once this integration becomes generally available later this year, NVIDIA will support x86-64, POWER9, and Arm CPUs.


Arm is not particularly new territory for NVIDIA, as the company previously opened the NVIDIA Deep Learning Accelerator (NVDLA) architecture for Arm's Project Trillium machine learning initiative in 2018. NVIDIA's Tegra X1 system-on-a-chip (SoC) is also used in products such as the Nintendo Switch and Google Pixel C tablet.

So, are there Arm server processors that can actually use this?

NVIDIA's announcement is a trifle peculiar—the ecosystem of Arm CPUs for servers is not particularly strong, and it has suffered some substantive setbacks over the last few years.

Qualcomm, the leader in (comparatively) low-power Arm SoCs for smartphones, was previously all-in on Arm servers, announcing a collaboration with Microsoft in 2017 on a port of Windows 10 Server for the Centriq 2400. That port appears not to have materialized, and Qualcomm was reported to have laid off 95% of its server SoC team as of December 2018.

Huawei announced the Kunpeng 920 Arm-based server processor at CES 2019, though with vendors, including Arm itself, distancing themselves from Huawei following its effective trade blacklisting by the Trump administration, this is also not likely a viable integration.

Amazon's Graviton CPU is a plausible answer, and the company already offers NVIDIA GPUs for Deep Learning AMI on AWS. That said, the debut generation of Graviton is somewhat underpowered, as it uses the older Cortex-A72 microarchitecture.

Interestingly, NVIDIA's press release includes a quote from Ampere Computing's CEO Renee James, stating that "We are thrilled that NVIDIA is moving CUDA and the rich ecosystem built around NVIDIA to Arm. This will accelerate our work in building out the software ecosystem for Arm-based servers and enable breakthrough Ampere platforms with NVIDIA GPUs for efficiency and performance." Ampere purchased AppliedMicro's X-Gene server technology in late 2018, and counts Arm itself as an investor.

Cavium is also seeing continued interest, particularly as toolchain improvements continue to be developed for its ThunderX2 Arm CPU. The ThunderX2 is used in the Mont-Blanc supercomputer project.

For more on Arm servers in the datacenter, learn why Linus Torvalds recently praised Arm servers but claimed the economics and ecosystem are still missing, and check out SolidRun's new developer-focused Arm workstation.
