Microsoft began delivering HoloLens 2 late in 2019, with a focus on building mixed reality applications for what it calls ‘first line workers’. They need software that supports manufacturing, construction, medicine, and retail, using the HoloLens tools to overlay 3D virtual objects onto the physical world. HoloLens 2’s new sensors make it a powerful device, feeding on-board computer vision hardware and fusing sensor data to help position users in a room, blending physical and virtual environments.
SEE: Augmented reality for business: Cheat sheet (free PDF) (TechRepublic)
A device like HoloLens 2 is attractive to more than its initially targeted markets. Mixed reality is a powerful tool in many different environments and markets, and the underlying hardware can support much more than simply blending the physical and the virtual.
HoloLens and the HPU
Much of that comes down to HoloLens’ custom hardware, especially its built-in computer vision silicon. Where most cameras can manage only one computer vision task at a time, the HoloLens silicon supports multiple streams of data and parallel image processing and recognition tasks. And that’s all without resorting to batch processes, as HoloLens is designed for continuous workloads.
The Holographic Processing Unit (HPU) is a custom ASIC. Its design mixes various modules for digital signal processing, a deep neural network AI core, and hardware to manage the computationally intensive task of keeping rendered images stable and at the correct depth as a user’s head moves. By putting a DNN core in the HoloLens, Microsoft can avoid latency and lag when running computer vision algorithms.
If you’re building applications for HoloLens 2, you’re limited in the information you can get from it. It’s a powerful tool, but Microsoft has simplified much of the development experience, wrapping and consolidating sensor data into a mixed reality toolkit and a set of tightly defined APIs. That’s not a bad thing; for most purposes you don’t need low-level access to sensors, all you need is the data that helps you build your applications.
Microsoft has often described designing software as delivering pizza for more than a billion people. Not everyone is going to get the toppings they want, but everyone is going to get melted cheese and tomato sauce. However, the other side of Microsoft’s design philosophy is that you can usually take those pizzas and start to customise them, adding the software equivalent of your own toppings.
Using HoloLens in research
One area where access to all HoloLens 2’s sensors is necessary is scientific research, where mixed reality and computer vision are powerful tools. Bringing the two together in a portable, head-mounted computer makes HoloLens 2 attractive, especially when you remember that it has multiple cameras and depth sensors, as well as accelerometers, gyroscopes, and magnetometers. So it’s not surprising to learn that Microsoft is opening up access to many of these features in HoloLens 2’s new Research Mode.
Opening up all HoloLens 2’s sensors to research scientists makes a lot of sense. It’s not only computer vision that can benefit: there’s the option of using HoloLens as a head- and eye-tracking platform that can help solve many human interaction issues. For example, tracking all the head motions and eye movements used by pilots in a modern cockpit can help researchers understand their cognitive load and how the cockpit environment might be redesigned to keep passengers safe.
Accessing the HoloLens data streams
HoloLens sensors go beyond traditional head tracking, adding cameras that can be used to track hands and map the environment, giving researchers a much clearer view of the environment users are in and how they interact with it. They also offer better light sensitivity, making it easier to bring in data from darker areas. By opening up access to all the cameras and sensors, Research Mode can produce a model of the wearer’s environment that can be replayed on demand, with full depth perception thanks to both its time-of-flight sensors and the stereoscopic positions of two of the head-tracking cameras. You also still have access to HoloLens 2’s colour camera, allowing you to build software that brings greyscale and colour images together.
You’re not limited to raw data. The built-in computer vision tools support a key algorithm called SLAM (Simultaneous Localisation And Mapping), which gives you access to real-time information about how the device is moving through its generated 3D room mesh. Other consolidated and processed data includes hand- and eye-tracking APIs, as well as access to the device’s eight microphones.
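To give a sense of what that consolidated data looks like to a developer, here’s a minimal C++/WinRT sketch that reads the SLAM-derived device pose through the standard Windows perception APIs. It assumes a UWP app running on the headset and leaves out the error handling you’d want in real code.

```cpp
// Minimal sketch: reading the SLAM-derived headset pose through the standard
// Windows perception APIs. Assumes a UWP app running on the device.
#include <winrt/Windows.Foundation.Numerics.h>
#include <winrt/Windows.Perception.h>
#include <winrt/Windows.Perception.Spatial.h>

using namespace winrt;
using namespace winrt::Windows::Foundation::Numerics;
using namespace winrt::Windows::Perception;
using namespace winrt::Windows::Perception::Spatial;

void ReadDevicePose()
{
    // The locator tracks the headset as it moves through the mapped space.
    SpatialLocator locator = SpatialLocator::GetDefault();

    // A stationary frame of reference anchors a coordinate system to the room.
    SpatialStationaryFrameOfReference frame =
        locator.CreateStationaryFrameOfReferenceAtCurrentLocation();
    SpatialCoordinateSystem coordinateSystem = frame.CoordinateSystem();

    // Ask where the device is 'now'; TryLocateAtTimestamp returns nothing if
    // tracking has been lost.
    PerceptionTimestamp timestamp =
        PerceptionTimestampHelper::FromHistoricalTargetTime(clock::now());
    SpatialLocation location = locator.TryLocateAtTimestamp(timestamp, coordinateSystem);

    if (location)
    {
        float3 position = location.Position();
        quaternion orientation = location.Orientation();
        // position and orientation describe the headset's pose in the room's
        // coordinate system, updated continuously as the wearer moves.
    }
}
```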
How does Research Mode work?
The Mixed Reality Toolkit (MRTK) fuses sensor data into specific functions; when you switch into Research Mode and enable its new APIs, you get access to much lower-level data alongside the MRTK APIs. It’s not quite the raw data from the sensors, as it’s initially processed by the HPU. So for positioning, you get head pose and anchor data as with the MRTK, but you also have access to the raw sensor data as well as image data from the head-tracking cameras. Similarly, data from the cameras is processed and linked with data from the depth sensors to give you the familiar room mesh and hand articulations. At the same time, you get raw depth data, as well as a measure of how infrared light from the sensor emitters is being reflected.
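As an illustration, here’s a rough sketch of reading one frame of that raw depth and infrared-reflectivity data from the long-throw depth sensor. The interface and method names follow the headers Microsoft publishes with its Research Mode samples, and the code assumes the sensor device object has already been created (the next section sketches that step), so treat it as a guide rather than finished code.

```cpp
// Sketch: reading raw depth and infrared-reflectivity (active brightness) data
// from the long-throw depth sensor. Interface names follow Microsoft's published
// Research Mode sample headers; check the GitHub repo for exact signatures.
#include <windows.h>
#include "ResearchModeApi.h"  // header shipped with the Research Mode samples

void ReadOneDepthFrame(IResearchModeSensorDevice* pSensorDevice)
{
    IResearchModeSensor* pDepthSensor = nullptr;
    pSensorDevice->GetSensor(DEPTH_LONG_THROW, &pDepthSensor);

    pDepthSensor->OpenStream();

    // GetNextBuffer blocks until the sensor delivers a frame.
    IResearchModeSensorFrame* pFrame = nullptr;
    pDepthSensor->GetNextBuffer(&pFrame);

    IResearchModeSensorDepthFrame* pDepthFrame = nullptr;
    pFrame->QueryInterface(IID_PPV_ARGS(&pDepthFrame));

    // Raw depth samples, plus the active-brightness buffer, which measures how
    // strongly the emitted infrared light is reflected back to the sensor.
    const UINT16* pDepth = nullptr;
    const UINT16* pAbImage = nullptr;
    size_t depthLength = 0, abLength = 0;
    pDepthFrame->GetBuffer(&pDepth, &depthLength);
    pDepthFrame->GetAbDepthBuffer(&pAbImage, &abLength);

    // ... process the buffers here ...

    pDepthFrame->Release();
    pFrame->Release();
    pDepthSensor->CloseStream();
    pDepthSensor->Release();
}
```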
Getting started is easy enough: you need a HoloLens 2 that’s in Developer mode. Once that’s enabled, turn on the device portal and log into the device’s management web console from a PC. In the console, navigate to the Research Mode option and enable access to the device’s sensor streams. Once the HoloLens reboots, you can start working with all the sensors, writing code against the Research Mode APIs that can be downloaded from GitHub. These provide access to the sensors, with all position data centred on the device; so if you want to map a room, say, you need to provide external anchors.
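The basic pattern for getting hold of the sensors looks roughly like this. It follows the approach used in Microsoft’s sample code: the DLL name, the exported factory function, and the descriptor calls are taken from the published samples, so check them against the current GitHub release before relying on them.

```cpp
// Sketch: creating the Research Mode sensor device and listing the available
// sensor streams, following the pattern in Microsoft's sample code.
#include <windows.h>
#include <vector>
#include "ResearchModeApi.h"

typedef HRESULT(__cdecl* PFN_CREATEPROVIDER)(IResearchModeSensorDevice** ppSensorDevice);

IResearchModeSensorDevice* CreateSensorDevice()
{
    // The Research Mode API lives in a system DLL on the device.
    HMODULE hModule = LoadLibraryA("ResearchModeAPI");
    if (!hModule) return nullptr;

    auto pfnCreate = reinterpret_cast<PFN_CREATEPROVIDER>(
        GetProcAddress(hModule, "CreateResearchModeSensorDevice"));
    if (!pfnCreate) return nullptr;

    IResearchModeSensorDevice* pSensorDevice = nullptr;
    pfnCreate(&pSensorDevice);
    return pSensorDevice;
}

std::vector<ResearchModeSensorDescriptor> ListSensors(IResearchModeSensorDevice* pSensorDevice)
{
    // Each descriptor identifies one stream: head-tracking cameras, depth
    // sensors, or the IMU (accelerometer, gyroscope, magnetometer).
    size_t sensorCount = 0;
    pSensorDevice->GetSensorCount(&sensorCount);

    std::vector<ResearchModeSensorDescriptor> descriptors(sensorCount);
    pSensorDevice->GetSensorDescriptors(descriptors.data(), descriptors.size(), &sensorCount);
    return descriptors;
}
```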
SEE: 91% of businesses already using or planning to adopt AR or VR technology
The Research Mode APIs provide well-documented access to the device sensors. That does mean you’re going to write a lot of code to process and use that data. It’s not a bad thing; if you need that level of access, you’re clearly not planning on using the built-in apps or even working with the MRTK! There are trade-offs, of course, as your code will need to run on the HoloLens 2’s ARM processor, and you might need to track down ARM versions of any binary modules or libraries you plan to use in your code. To help you get started, Microsoft provides sample code for some basic sensor-recording apps on the Research Mode GitHub.
Microsoft’s Azure Kinect Developer Kit works as an adjunct to the HoloLens in Research Mode. It’s based around the same depth sensor, and several can be chained together to give you a clear 3D picture of a room. Mixing the two devices can help build a set of sensors that can track users through a room or space, providing location data that can be used in conjunction with headset sensors.
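If you want to experiment with that kind of rig, the Azure Kinect Sensor SDK’s C API is a reasonable starting point. Here’s a short sketch of opening one unit as the master in a chain; the synchronisation settings shown are the standard way to daisy-chain devices, but modes, resolutions, and delays will need tuning for your own setup.

```cpp
// Sketch: opening an Azure Kinect Developer Kit as the master in a multi-device
// chain, using the Azure Kinect Sensor SDK's C API.
#include <k4a/k4a.h>

k4a_device_t OpenMasterDevice()
{
    k4a_device_t device = nullptr;
    if (k4a_device_open(K4A_DEVICE_DEFAULT, &device) != K4A_RESULT_SUCCEEDED)
        return nullptr;

    k4a_device_configuration_t config = K4A_DEVICE_CONFIG_INIT_DISABLE_ALL;
    config.depth_mode = K4A_DEPTH_MODE_NFOV_UNBINNED;     // same time-of-flight depth sensor family as HoloLens 2
    config.color_resolution = K4A_COLOR_RESOLUTION_720P;
    config.camera_fps = K4A_FRAMES_PER_SECOND_30;
    config.wired_sync_mode = K4A_WIRED_SYNC_MODE_MASTER;  // chained units run as SUBORDINATE

    if (k4a_device_start_cameras(device, &config) != K4A_RESULT_SUCCEEDED)
    {
        k4a_device_close(device);
        return nullptr;
    }
    return device;
}

void GrabOneCapture(k4a_device_t device)
{
    // Pull one synchronised capture; depth images from each chained device can
    // be fused with HoloLens pose data to track users through the space.
    k4a_capture_t capture = nullptr;
    if (k4a_device_get_capture(device, &capture, 1000) == K4A_WAIT_RESULT_SUCCEEDED)
    {
        k4a_image_t depth = k4a_capture_get_depth_image(capture);
        if (depth) k4a_image_release(depth);
        k4a_capture_release(capture);
    }
}
```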
HoloLens 2’s Research Mode is a powerful tool. By removing the filters of the MRTK it opens up many more scenarios for working with device sensors, providing a wearable computer vision platform that can do a lot more than simple cameras. The results can be impressive; just be prepared to write the code you need to achieve them!