Inside The Machine: Hewlett Packard Labs' mission to remake computing

Hewlett Packard Labs reveals the progress it's making in its attempt to reinvent computing for the era of big data.

Hewlett Packard Labs CTO and director Martin Fink at HPE Discover in London
Image: HPE
While the power of modern computers dwarfs that of their ancestors, the design of today's digital devices is still bound by that of the earliest, room-sized machines.

Hewlett Packard Enterprise (HPE) wants to remove these decades-old constraints on how machines store data and in doing so create a computer able to handle tasks vastly more complex than is possible today.

HPE is building what it calls The Machine, a system it hopes will be able to store and retrieve huge amounts of data far more rapidly than is currently feasible.

Director of Hewlett Packard Labs Martin Fink describes what such a machine would be capable of - giving the example of how it could resolve surprisingly complex everyday problems, such as there being no available airport gates when your plane lands early.

"You think to yourself 'How hard could this be, just turn the plane and park', because you look out the window and there's plenty of open gates," he told the recent HPE Discover event in London.

"The reality is this is an extremely hard problem to orchestrate. But now what if you could take every pilot, every flight attendant, every single plane, every baggage handler, every handler for every gate, for every airport in the world and put it in memory all at the same time."

Getting passengers off a plane ahead of schedule is just one outcome that could be made possible if a machine were able to store all these variables in a way that captures the relationships between them, says Fink.

To achieve this goal, Hewlett Packard Labs - the central research organization for HPE - wants The Machine to introduce a new architecture for computing, one that changes how machines store data.

Today computers typically rely on small pools of memory from which they fetch and temporarily store data. This memory can be accessed rapidly but is limited in size. To store large amounts of data and retain it when the system is powered down, machines have to rely on hard disk or solid-state storage, which is far slower to access than memory.

In large computer systems with many processors, this architecture also has the effect of creating lots of isolated islands of memory, each tied to a different processor, which then have to swap data between them.

This design fundamentally limits the efficiency of modern machines when handling very large datasets, said Fink.

"You in effect have to chop your data into chunks in order to match the limitations of the processors, which act as a gatekeeper," he said.

"The processors dictate how much memory you can have and as soon as you want to scale to more memory you need to add another processor and then often another server and so on and so forth."

The Machine would change this approach, allowing processors to share access to a large pool of "universal memory" - initially hundreds of terabytes, eventually petabytes. This memory would differ from that typically used today in that it would be non-volatile, retaining data even when power is lost, while still reading and writing data far faster than hard disk or solid-state storage.
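
HPE has not published an API for this pool, but the closest everyday analogue on a Linux machine today is memory-mapping a file: the data persists, yet the program reads and writes it through ordinary pointers rather than explicit storage I/O. A minimal sketch under that assumption, with the path and size as placeholders:

```cpp
// Rough present-day analogue of byte-addressable, persistent memory: an mmap'd
// file accessed through pointers. Not The Machine's actual interface.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>

int main() {
    const size_t kPoolBytes = 1UL << 30;  // 1GB stand-in for a much larger pool
    int fd = open("/tmp/universal_pool.bin", O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, kPoolBytes) != 0) { perror("setup"); return 1; }

    // Map the persistent region into the address space: from here on it is
    // read and written through pointers, not read()/write() calls.
    char* pool = static_cast<char*>(
        mmap(nullptr, kPoolBytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    if (pool == MAP_FAILED) { perror("mmap"); return 1; }

    std::strcpy(pool, "this string survives a restart of the program");
    msync(pool, kPoolBytes, MS_SYNC);  // flush to the backing store

    munmap(pool, kPoolBytes);
    close(fd);
}
```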

"The Machine is by far and away the single most important research project we have at Hewlett Packard Labs," said Fink.

"The goal here is with this architecture we can ingest, store, manipulate truly massive datasets while simultaneously achieving multiple orders of magnitude less energy per bit."

One of the first uses that Hewlett Packard Labs envisions for The Machine is security, with Fink giving the example of a DNS analyzer for network traffic capable of handling far more data than today's products, which he said typically handle 50,000 events per second and hold five minutes' worth of events.

"This is what we want to deliver as our first machine product and we call it The Security Machine. Being able to deal with ten million events per second and hold 14 days worth of events. This is what a 640 terabyte machine allows you to do."

If the technology works, HPE has talked of ambitions to eventually shrink the architecture so it can be used for data-intensive tasks such as voice recognition inside smartphones or document analysis inside printers.

Ambition vs reality

Eighteen months ago, the then-HP Labs announced its plan to build The Machine - with a view to releasing it a few years from now. However, The Machine is currently an ambition, with the technologies needed to fully realize it at various stages of readiness.

Perhaps the biggest missing piece is the memristor - a form of resistor that retains information without power and that would allow computers to store and retrieve large datasets far more rapidly than is possible today. However, attempts to develop a commercially viable memristor have been underway for years, and in the summer of this year HP admitted it was unsure when it would be ready to bring a memristor to market.

In light of the question mark over memristors, HP decided to base the universal memory pool in prototypes of The Machine on DRAM, the conventional memory found inside computers today. Unlike memristors, DRAM loses its data when it loses power, so the universal memory pool inside The Machine prototypes will rely on a power supply separate from the system's CPUs. Until memristors are ready, subsequent prototypes could exploit phase-change memory, such as the "memristor-like" storage-class memory that HP and SanDisk are planning to release, albeit at an unspecified date.

A single-rack prototype of The Machine with 2,500 CPU cores and 320TB of main memory will launch next year.

Hewlett Packard Labs will release a prototype of The Machine next year.
Image: Richard Lewington
Connecting The Machine's many systems-on-a-chip (SoCs) to its vast pool of memory requires fast data transfer, and for this HPE is counting on optical interconnects between components. On this front Fink told HPE Discover that the "first silicon for our 100 Gigabit fiber optical engines have arrived and our initial turn-ons are looking really, really good". The Machine's customized chips are also progressing, with Fink revealing that "our printed circuit boards have started to arrive, our programmable silicon FPGAs [Field Programmable Gate Arrays] are also getting ready".

The final part of the puzzle is the operating system. The Machine will rely on a custom Linux-based operating system that will run on each SoC in the system and that can work with the universal memory.

"We've done a lot of work with Linux for The Machine because we needed to do the work to allow it to scale to massive amounts of main memory," said Fink.

To avoid The Machine spending too much time shuttling data around, each SoC also has its own local memory, and the system uses what HPE refers to as a "shared something" architecture, where data is replicated around the nodes in The Machine.

Programming The Machine

Ahead of the hardware for The Machine becoming available, Sharad Singhal, director of Machine applications and software at HPE, said the firm has been emulating The Machine on an HPE Superdome X, "the largest machine HP makes", with 24TB of memory and 288 processor cores.

He believes developers will be able to program for The Machine in three different ways - saying that cluster computing frameworks such as Spark and Hadoop will be "bolted" onto The Machine architecture so code for these frameworks can be ported directly.

Another option for developers will be exploiting APIs that allow them to address the memory directly, using popular programming languages such as C++ and Java, without having to "worry about all of the lower-level details of how the actual system works, how transactionality is maintained inside The Machine, how fault management happens underneath".
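
HPE has not published these APIs, so the following is only a hypothetical sketch of the programming style Singhal describes: a typed record is constructed directly inside a large mapped pool and found again by offset, with no serialization to a file or database in between. The pool path, record layout and offset are invented for illustration.

```cpp
// Hypothetical sketch of "addressing the memory directly" from C++: a record is
// built in place inside a big mapped pool, so the data structure itself is the
// stored form. The pool path and record type are invented, not HPE's API.
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstdio>
#include <new>

struct DnsEvent {                     // example record living directly in the pool
    unsigned long long timestampNs;
    unsigned int       clientIp;
    unsigned int       responseCode;
};

int main() {
    const size_t kPoolBytes = 1UL << 30;                          // placeholder size
    int fd = open("/tmp/fam_pool.bin", O_RDWR | O_CREAT, 0600);   // placeholder path
    if (fd < 0 || ftruncate(fd, kPoolBytes) != 0) return 1;

    void* pool = mmap(nullptr, kPoolBytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (pool == MAP_FAILED) return 1;

    // Construct the record in place at a known offset: nothing is loaded or
    // saved later, because the pool is both working memory and storage.
    auto* ev = new (static_cast<char*>(pool)) DnsEvent{1700000000ULL, 0x0A000001u, 0u};
    std::printf("event at offset 0: ts=%llu rcode=%u\n", ev->timestampNs, ev->responseCode);

    munmap(pool, kPoolBytes);
    close(fd);
}
```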

Finally, developers who want full control and are prepared to deal with the complexity will have "complete control over how data is managed" via low-level system APIs. These different approaches will give developers "control over all of the stack", he said.

To tame the complexity of wrangling data at this scale, HPE also wants The Machine to be manageable using a declarative language, where the user specifies their desired outcome and leaves it to the system to work out how to achieve it - as is the case with web page layout and HTML.

To give a taste of what it will be like to use The Machine, last week HPE released what it calls a "fabric-attached memory emulator", which allows people to begin experimenting with "what it means to have multiple processing engines attacking a single pool of shared memory". The open-source emulator is available to download from HPE.
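
The emulator's interface isn't described here, but the general idea it lets developers explore - several processing engines working on one pool of shared memory - can be sketched on any Linux machine with a shared mapping and two processes (the sizes and counts below are arbitrary):

```cpp
// Not HPE's emulator, just the general idea it exposes: two processes acting as
// separate "processing engines" update one shared pool of memory directly.
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>
#include <atomic>
#include <cstdio>
#include <new>

int main() {
    // Anonymous shared mapping standing in for a fabric-attached memory pool.
    void* mem = mmap(nullptr, sizeof(std::atomic<long>), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    auto* counter = new (mem) std::atomic<long>(0);

    if (fork() == 0) {                                        // processing engine #2
        for (int i = 0; i < 1000000; ++i) counter->fetch_add(1);
        _exit(0);
    }
    for (int i = 0; i < 1000000; ++i) counter->fetch_add(1);  // processing engine #1

    wait(nullptr);
    std::printf("shared counter: %ld\n", counter->load());    // prints 2000000
    munmap(mem, sizeof(std::atomic<long>));
}
```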

Will The Machine work?

The Machine is as yet unproven and Hewlett Packard Labs' vision is not without its critics. Some point to the struggles HP has had in delivering a commercially viable memristor and to the difficulty of developing a new computing architecture and operating system, while others claim that software compatibility issues would leave it of little practical use.

The two companies created by splitting HP, HPE and HP Inc, are also facing a variety of challenges in their key businesses. In the most recent earnings for HPE and HP Inc, five of HP's six original core groups (Personal Systems/PCs, Printers, Software, Enterprise Services and Financial Services) saw revenue slump. Only the Enterprise Group, which includes servers and networking, saw an uptick, with revenue two percent higher year-on-year.

For its part, HPE has inherited an enterprise services business that, despite still generating billions of dollars, saw operating earnings fall nearly 60 percent in the five years to Q2 2015.

But Fink insists The Machine is more than the vaporware some claim it to be, and that it will be in the hands of enterprises in the coming years.

"The Machine is very real, the parts are in, the software is there and it's coming."

About Nick Heath

Nick Heath is chief reporter for TechRepublic. He writes about the technology that IT decision makers need to know about, and the latest happenings in the European tech scene.
