When Microsoft first announced in 2017 that it was putting Windows Server onto Arm processors, it seemed like a great way to get servers that could be cheaper, smaller and less power hungry. Server processors are passing 200W, and the trend is to put multiple GPUs or custom accelerators in a chassis that could then consume up to 10kW of power (for a heavy analytics or machine learning workload). Balancing that out with lower-power Arm-based servers is an attractive idea for the data centre.

But while the Project Olympus specification that Microsoft uses for both x86 and Arm servers inside Azure is open source and commercially available from server manufacturers like Wiwynn, Windows Server for Arm isn’t available outside Microsoft. Arm processors have been showing up in NAS boxes for a while, but there’s no sign of a hyperconverged Storage Spaces Direct appliance to compete with them.

At the time, Microsoft distinguished engineer Leendert van Doorn said that Microsoft didn’t see an enterprise opportunity for Windows Server on Arm. Could the rise of IoT and edge computing, and Arm’s upcoming Neoverse N1 architecture, change that?

Missing motherboards

The idea of Arm servers has been around for several years, but Intel has continued to own the server market. The first obstacle was the lack of 64-bit Arm processors with enough performance; the next was the need to port server software and workloads to the platform.

An increasing number of Linux distributions, tools and workloads run on ARM64 (and VMware ESXi support is promised). Performance was good enough for Azure by 2017; in fact, it was so good that performance, rather than power or cost, was why Microsoft adopted Arm.

But because Arm only designs processors and leaves it up to customers to manufacture them, the problem became finding suppliers. Arm’s Server Base System Architecture (SBSA) spec was designed to reduce fragmentation and it allowed Microsoft to use both Qualcomm and Cavium processors in Project Olympus chassis. Building processors is expensive, and many hopeful Arm server processor makers have gone out of business or been acquired (Cavium is now part of Marvell, while Qualcomm dropped its Centriq development).

That means that even companies like Cloudflare who were planning to move completely to Arm hardware have found themselves staying on Intel silicon. “We’d ported our entire stack to Arm, to Qualcomm Centriq processors,” Cloudflare CEO Matthew Prince told us. The Linux projects they needed already ran on ARM64. “The performance was really good, we’d negotiated the deals for who was going to build the servers for us, we were going full steam ahead — and then Qualcomm shut the whole thing down.”

Cloudflare is now using Intel processors that combine a high core count, suited to the highly parallel workloads it runs, with a relatively low price and power envelope. The experience has left Prince questioning whether suppliers can deliver Arm server hardware: much of the Centriq team has moved to Ampere, but the performance doesn’t match the Centriq systems Cloudflare had tested. The best-performing Arm servers currently come from Huawei, Prince noted, but Arm has been forced to pull Huawei’s processor licence.

AWS got around the supplier problem by building its own Graviton Arm processors so it could offer Arm-based IaaS as part of EC2. Graviton is based on the work Annapurna Labs did to create network and storage accelerators for AWS, and so far many Arm processors showing up in servers have been used for offload rather than as the main CPU.

The AWS A1 Arm instances are for scale-out workloads like microservices, web hosting and apps written in Ruby and Python. Like Cloudflare’s workloads, those are tasks that benefit from the massive parallelisation and high memory bandwidth that Arm provides. Inside Azure, Windows Server on Arm is not running virtual machines, because emulating x86 would trade away performance for the sake of low power. Instead it runs highly parallel PaaS workloads like Bing search index generation, storage and big data processing.

For the first time, an Arm-based supercomputer (built by HPE with Marvell ThunderX2 processors) is on the list of the top 500 systems in HPC — another highly parallel workload. And the next-generation Arm Neoverse N1 architecture is designed specifically for servers and infrastructure.

Making a server

Part of that is Arm delivering a whole server processor reference design, not just a CPU spec, making it easier to build N1 servers. The first products based on N1 should be available in late 2019 or early 2020, with a second generation following in late 2020 or early 2021.

Plus, the new CPU architecture is designed for data-heavy server workloads like Memcached, NGINX, MySQL, Kubernetes, .NET and Java. N1 also supports what Arm calls ‘server-class’ virtualisation for both type 1 and type 2 hypervisors, with enhancements for context switching. That means bare-metal hypervisors like Hyper-V and Xen as well as hosted hypervisors like KVM, and Arm’s senior director of product management Brian Jeff told us that “we’ve built in all the hooks to support Hyper-V”.

With up to 128 cores in a CPU, the cores need a more efficient interconnection than the ring Arm has used before. With N1, the cores are now directly interconnected in a mesh. “That means you can have half the CPU dedicated to one task and the other half to another,” Jeff explained to us. “At boot time you set up your memory map and it’s partitioned between the two halves; each side has its own dedicated cache and memory and it’s totally separate from the other. It looks like NUMA, you have cache affinity for each side, so if you have a high priority VM or a task you want to keep isolated from the general-purpose side of the system, you can do that.”
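
The partitioning Jeff describes is set up in hardware and firmware at boot time, and nothing in his description gives a programming interface for it. As a loose software analogue only, the sketch below splits a set of core IDs into two disjoint halves and pins a process to one of them using Linux CPU affinity. The function names `split_cores` and `pin_to`, and the 128-core count in the example, are illustrative assumptions rather than part of the N1 design.

```python
import os


def split_cores(cpus):
    """Split a collection of CPU ids into two disjoint halves."""
    ordered = sorted(cpus)
    mid = len(ordered) // 2
    return set(ordered[:mid]), set(ordered[mid:])


def pin_to(cpus):
    """Pin the current process to the given CPU ids.

    os.sched_setaffinity is only available on Linux, so this returns
    False (and does nothing) elsewhere or when no CPUs are given.
    """
    if hasattr(os, "sched_setaffinity") and cpus:
        os.sched_setaffinity(0, cpus)  # pid 0 means the calling process
        return True
    return False


# Hypothetical 128-core processor: one half reserved for an isolated
# workload, the other half left for general-purpose tasks.
isolated, general = split_cores(range(128))
```

Calling `pin_to(isolated)` from a high-priority service would confine that service to the first half of the cores. Unlike the N1 partitioning, though, OS-level affinity does not give the service its own dedicated cache or memory.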

The new architecture is definitely something Windows Server could exploit. Microsoft has also been extending Windows on Arm (which is only available on laptops today). Hyper-V isn’t currently supported, but it will soon be available, not just for running virtual machines or Hyper-V containers but also for virtualisation-based security (VBS) features like Windows Defender System Guard. VBS is also needed for SQL Server’s Always Encrypted feature. At Build 2019, Microsoft also demonstrated native ARM64 versions of the Sysinternals tools, as well as Firefox and the new Chromium-based Edge browser. Recompiling 64-bit Windows applications for ARM64 is now fairly simple.

Edge servers like Lenovo’s ThinkSystem SE350 could be a suitable platform for an Arm version of Windows Server IoT 2019. (The SE350 is currently Xeon-based.)
Image: Lenovo

That doesn’t mean that a general-purpose Arm version of Windows Server is any more likely now than it was in 2017. But one place that Arm servers make particular sense is at the edge, and Azure SQL Database Edge will run on both Arm and x64 processors, initially on Linux and then also on Windows. This is a ‘small footprint’ (about 250MB) version of SQL Server that runs in a container and is optimised for time series data (like a series of sensor readings) and for running machine learning models. It streams data to Azure Stream Analytics, and to a database like Azure Data Warehouse, Cosmos DB or SQL Server for storage.

The idea is that IoT devices produce so much data that you can’t stream it all to the cloud, and when you add machine learning like image recognition to check bottles of beer going down a bottling line or packages going along a conveyor belt, you need to make decisions in real time rather than waiting for the data to go to the cloud and the decision to come back. So you need a database where you can run local machine learning to handle that, using a model that was trained in the cloud. But you might also want to send a portion of the data to the cloud to improve the machine learning model.
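
That pattern (decide locally, upload only a sample) can be sketched in a few lines. This is a minimal illustration and not how Azure SQL Database Edge is implemented: `local_model` is a hypothetical stand-in for a model trained in the cloud, here just a fixed threshold, and `every_nth` is an assumed sampling policy for choosing which readings go back to the cloud.

```python
def local_model(reading, threshold=0.8):
    """Hypothetical stand-in for a cloud-trained model: a simple threshold."""
    return "reject" if reading > threshold else "accept"


def process_at_edge(readings, every_nth=10):
    """Decide on every reading locally; keep every Nth one for upload."""
    decisions = []
    to_cloud = []
    for i, reading in enumerate(readings):
        # The real-time decision is made on the device, with no round trip.
        decisions.append(local_model(reading))
        # Only a fraction of the raw data is queued for the cloud,
        # where it could be used to retrain the model.
        if i % every_nth == 0:
            to_cloud.append(reading)
    return decisions, to_cloud


readings = [0.1, 0.9, 0.5, 0.95, 0.2]
decisions, to_cloud = process_at_edge(readings, every_nth=2)
print(decisions)  # ['accept', 'reject', 'accept', 'reject', 'accept']
print(to_cloud)   # [0.1, 0.5, 0.2]
```

Every reading gets an immediate local decision, while only three of the five readings would be sent upstream.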

Azure SQL Database Edge can run on an IoT device, which would mean Windows 10 IoT Core with the Azure IoT Edge runtime managed through Azure IoT Central, rather than Windows Server. But Microsoft also wants to have it run on gateway servers that aggregate multiple IoT devices and edge servers that connect to the cloud or your other servers.

Lenovo, for example, is making edge servers designed to go in the kind of environments where traditional servers don’t work well, like a factory floor with vibration and dust, or the cupboard in a retail store with no security and no Ethernet cable. If Lenovo offers an Arm version of the ThinkSystem SE350 edge server rather than just a Xeon model, and if it runs Windows rather than just Linux, that system might run Windows Server rather than Windows 10 IoT Enterprise (the replacement for Windows Embedded, which is designed for fixed-function devices like ATMs and PoS systems).

ARM64 Process Explorer running on a Qualcomm Snapdragon 850 notebook at Build 2019.
Image: Mary Branscombe/TechRepublic

Microsoft already has Windows Server IoT 2019, which is the new name for Windows Embedded Server. It has high-performance networking and high availability storage with two-node clustering for more demanding edge scenarios, like running real-time image recognition from 16 cameras at once to monitor an entire line. An Arm version of that could make sense, and the N1 architecture would fit right in.

But equally, some OEMs are interested in using Windows 10 Enterprise running on Qualcomm Snapdragon 850 devices (which is what Windows on Arm laptops use); the built-in LTE is ideal for sending data from places that don’t have traditional network connections.

Whether it’s Windows 10 Enterprise or a specialised IoT version of Windows Server, Windows on Arm is going to be an option for industrial and edge computing. But unless the N1 architecture really does revitalise the Arm server market, don’t expect a general-purpose Windows Server on Arm just because the silicon could run it well.