When we talk about cloud compute, we’re usually thinking of the familiar x64 processor, whether it’s from Intel or AMD. But another technology is becoming increasingly important: the GPU, or graphics processing unit.
There are two reasons why. The first is that the modern GPU is a powerful parallel computing platform that’s ideal for the neural networks underpinning much of modern machine learning, powering familiar frameworks like TensorFlow. The other is a return to the GPU’s roots: providing cloud-scale render farms for visualising data and for creating 3D images for augmented reality, so the graphics work is offloaded from the headsets themselves. GPUs are also part of a new generation of VDI (Virtual Desktop Infrastructure), delivering all the features users expect from their desktops, streamed from the cloud.
Microsoft has been at the forefront of this move, with GPU-based virtual machines in its Azure cloud. It’s been working with partner companies to build arrays of GPUs that can be shared by many VMs, in the same racks as its Olympus servers, with high-bandwidth connections between GPUs and CPUs. Getting started is as easy as launching any VM, with versions for compute, for machine learning, and for graphics.
Azure’s hybrid cloud
Azure’s hyperscale cloud is only part of Microsoft’s cloud. Redmond is clear that its vision embraces a hybrid cloud — one that reaches from simple IoT endpoints to edge networks, through data centres and on up to Azure itself. That has led to the launch of a family of Azure Stack products that provide Azure-consistent extensions of the cloud into your data centres and beyond. The heart of the Azure Stack family is Azure Stack Hub, the original Microsoft-certified data centre hardware running a stack of software that’s managed by Microsoft.
It’s been interesting watching the Azure Stack platform evolve, and the latest platform release has added support for a new generation of hardware that now includes GPU support. The initial versions of the Azure Stack software were focused on supporting core Azure services as well as virtual infrastructures to handle compute tasks. That limited the range of virtual machine sizes and functions, supporting Azure’s A, D, and F-series VMs.
Adding GPUs to Azure Stack Hub
The 2002 release of Azure Stack Hub (2002 is a YYMM-style version number, not a year) brings in a public preview of N-series virtual machines. Azure has been running a range of N-series VMs based on Nvidia GPUs for some time now, with NC-series VMs handling compute tasks, ND-series for big data-driven machine learning, and NV-series for visualization, rendering and VDI. They’ve proven to be popular, supporting both Windows and Linux guest operating systems.
Microsoft has been working with its Azure Stack hardware partners to deliver GPU support to server nodes. Adding support to existing Azure Stack Hub implementations will mean swapping out server nodes for newer hardware, as older nodes won’t have the appropriate GPUs. Microsoft has certified two different Nvidia GPUs with Azure Stack Hub for the public preview of the service, the V100 Tensor Core and T4 Tensor Core.
The V100 will support NCv3 virtual machines, with support for machine learning and visualisation, similar to Azure’s Standard NC6s v3. These have 6 virtual CPUs, up to 112GB of RAM and 736GB of local storage. You have access to a single GPU with 16GB of video memory. You’ll need to install the Nvidia GPU Driver Extension in your VM images to take advantage of the new hardware, as this lets you install appropriate drivers for GPU compute or rendering from the Azure Portal (or through Azure Resource Manager templates for preconfigured images).
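As a rough illustration of the Azure Resource Manager route, the Python sketch below builds the kind of extension resource a template might carry. The publisher and type names are the ones used by the Nvidia GPU Driver Extension in public Azure; whether the same names and versions are offered on a given Azure Stack Hub stamp is an assumption to check against its marketplace. The function name `gpu_driver_extension` and its `handler_version` and `api_version` parameters are hypothetical, and the caller has to supply values that match the target environment.

```python
import json

# Extension type names as used by the Nvidia GPU Driver Extension in public
# Azure. Assumption: your Azure Stack Hub marketplace exposes the same names.
EXTENSION_TYPES = {
    "linux": "NvidiaGpuDriverLinux",
    "windows": "NvidiaGpuDriverWindows",
}


def gpu_driver_extension(vm_name, os_type, handler_version, api_version):
    """Build an ARM template resource that attaches the GPU driver extension.

    handler_version and api_version are placeholders supplied by the caller;
    look up the values your environment actually supports.
    """
    ext_type = EXTENSION_TYPES[os_type.lower()]
    return {
        "type": "Microsoft.Compute/virtualMachines/extensions",
        "apiVersion": api_version,
        "name": f"{vm_name}/{ext_type}",
        "location": "[resourceGroup().location]",
        # Install the extension only after the VM itself has been created.
        "dependsOn": [
            f"[resourceId('Microsoft.Compute/virtualMachines', '{vm_name}')]"
        ],
        "properties": {
            "publisher": "Microsoft.HpcCompute",
            "type": ext_type,
            "typeHandlerVersion": handler_version,
            "autoUpgradeMinorVersion": True,
            "settings": {},
        },
    }


def to_template_json(resource):
    """Serialise the resource so it can be pasted into a template's resources list."""
    return json.dumps(resource, indent=2)
```

Generating the fragment in code rather than by hand keeps the Linux and Windows variants in one place, which helps if you maintain preconfigured images for both guest operating systems.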
The T4 GPU is a newer card than the V100, but it’s hard to compare the two directly as they’re targeted at different workloads. The T4 is certainly less capable at pure compute tasks, but it’s a much lower-power device, rated at 70W compared to the V100’s 250W. Nvidia suggests the T4 is more suitable for inferencing than for training ML models.
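Once the drivers are installed, one way to confirm which card a VM has been given is to ask Nvidia’s `nvidia-smi` tool, which ships with the driver. The sketch below is a minimal example, assuming `nvidia-smi` is on the path inside the guest; the helper names `parse_gpu_csv` and `query_gpus` are hypothetical, and some virtualised GPUs may not report a power limit, which is handled as a missing value.

```python
import subprocess

# Fields requested from nvidia-smi: model name, total memory, power limit.
QUERY_FIELDS = ["name", "memory.total", "power.limit"]


def parse_gpu_csv(text):
    """Parse 'csv,noheader,nounits' output from nvidia-smi into a list of dicts."""
    gpus = []
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split from the right so a comma inside the model name cannot
        # shift the numeric columns.
        name, memory_mib, power_w = [p.strip() for p in line.rsplit(",", 2)]
        try:
            power = float(power_w)
        except ValueError:
            power = None  # value not reported, e.g. "[N/A]"
        gpus.append(
            {"name": name, "memory_mib": int(memory_mib), "power_limit_w": power}
        )
    return gpus


def query_gpus():
    """Run nvidia-smi inside the VM and return one dict per visible GPU."""
    result = subprocess.run(
        [
            "nvidia-smi",
            "--query-gpu=" + ",".join(QUERY_FIELDS),
            "--format=csv,noheader,nounits",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gpu_csv(result.stdout)


if __name__ == "__main__":
    for gpu in query_gpus():
        print(gpu)
```

Keeping the parsing separate from the subprocess call means the parser can be exercised on captured output from a machine that has no GPU at all.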
With GPU support in Azure Stack Hub you can start bringing ML and other GPU-based workloads from the cloud to your data centre. Latency should be reduced, along with bandwidth costs, as data won’t need to travel over expensive WAN connections to Azure. You’ll also be able to build and test your models working against sensitive data, keeping your operations compliant with local regulations and your data under your control.
Putting GPUs at the edge
It’s not just Azure Stack Hub that gets GPU support. Microsoft’s own Azure Stack Edge hardware is now offering Nvidia T4 Tensor Core GPUs. Adding GPUs to these single-rack units makes a lot of sense, especially if you’re planning on running machine-learning workloads in edge compute scenarios. GPU acceleration here will make it easier to run predictive models in IoT applications, allowing you to build, test, train and run your machine learning on-premises without having to push potentially sensitive training data sets gathered from industrial hardware to the public cloud. Keeping that data inside your own data centre and at the edge of your network improves security and helps control both data costs and bandwidth usage.
Using T4 GPUs on the edge of the network makes a lot of sense. You’re unlikely to be developing complex machine-learning models; instead, you’ll be using prebuilt models — either your own or in containers like Microsoft’s Cognitive Services. Putting GPU compute in a single rack like this is a good way to deliver it to smaller metro-scale data centres, or even a cage at a cellular transmitter. Microsoft is also working on putting Azure Stack Edge in rugged cases for use in extreme conditions, making it possible to deliver ML to ships, oil and gas platforms, and even as part of emergency and disaster response.
Machine learning is an increasingly important technology, but it’s also hard to build and deliver your own ML platforms. The ability to take advantage of Azure’s tools to build, test and deploy models to your own data centres and racks is a significant advantage for the Azure Stack family. The fact that you can use the same hardware to improve VDI performance and offload image rendering from mixed reality and mobile devices is an added bonus.