IBM is unveiling new hardware this week at the International Electron Devices Meeting (IEDM) and the Conference on Neural Information Processing Systems (NeurIPS) that brings power efficiency and improved training times to artificial intelligence (AI) projects, with 8-bit precision for both its analog and digital AI chips.
Over the last decade, computing performance for AI has improved at a rate of 2.5x per year, due in part to the use of GPUs to accelerate deep learning tasks, the company noted in a press release. However, that rate of improvement is not sustainable: the GPU design model, a general-purpose computing solution adapted to AI, will not be able to keep pace with hardware designed exclusively for AI training and development. Per the press release, “Scaling AI with new hardware solutions is part of a wider effort at IBM Research to move from narrow AI, often used to solve specific, well-defined tasks, to broad AI, which reaches across disciplines to help humans solve our most pressing problems.”
SEE: Malicious AI: A guide for IT leaders (Tech Pro Research)
While traditional computing has been on a decades-long path toward ever-wider word sizes, with most consumer, professional, and enterprise-grade hardware now using 64-bit processors, AI is heading in the opposite direction. IBM researchers are presenting a paper at NeurIPS detailing how to overcome the challenges inherent in reducing training precision below 16 bits. Per the report, the techniques provide “the ability to train deep learning models with 8-bit precision while fully preserving model accuracy across all major AI dataset categories: image, speech, and text,” delivering two to four times faster training of deep neural networks than current 16-bit systems.
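IBM has not released reference code for these techniques, but the general idea of reduced-precision training can be sketched in a few lines. The toy NumPy example below is an illustration only, not IBM’s method: the fake_quantize helper and the 256-level grid are assumptions made for the demo. It rounds weights and gradients to 8-bit resolution inside a simple linear-regression training loop and still converges to roughly the right answer.

```python
# Toy sketch (not IBM's technique): simulated 8-bit "fake quantization"
# applied inside a tiny training loop, to illustrate the general idea of
# training with reduced numeric precision.
import numpy as np

def fake_quantize(x, bits=8):
    """Round values onto a uniform grid with 2**bits levels over x's range."""
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()
    levels = 2 ** bits - 1
    scale = (hi - lo) / levels
    return np.round((x - lo) / scale) * scale + lo

rng = np.random.default_rng(0)
true_w = np.array([2.0, -3.0, 0.5])
X = rng.normal(size=(256, 3))
y = X @ true_w + 0.01 * rng.normal(size=256)

w = np.zeros(3)
lr = 0.1
for step in range(200):
    pred = X @ fake_quantize(w)             # quantized weights in the forward pass
    grad = 2.0 / len(X) * X.T @ (pred - y)  # gradient of the mean-squared error
    w -= lr * fake_quantize(grad)           # quantized gradient in the update

print("learned weights:", np.round(w, 3))   # stays close to [2.0, -3.0, 0.5]
```

The hard part, and the focus of IBM’s paper, is doing this on real image, speech, and text models without losing accuracy, where naive 8-bit rounding typically degrades results.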
IBM is also promoting the use of analog AI hardware, which is on the opposite development path from traditional digital systems. Designers are presently working to reduce precision in digital AI solutions, whereas analog systems start with comparatively low intrinsic precision, which impacts the accuracy of computing models. IBM’s newest solution uses in-memory computing, which the company touts as “roughly doubling the accuracy of previous analog chips,” while it “consumed 33x less energy than a digital architecture of similar precision” in its tests.
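To see why intrinsic precision matters, the short sketch below is a crude model, not a description of IBM’s chips: it perturbs a weight matrix with random noise, standing in for analog cell imprecision, and measures how far the output of a matrix-vector multiply drifts from the exact digital result.

```python
# Crude illustration (not IBM's chip design): model analog imprecision as a
# random relative perturbation of stored weights and watch the output error grow.
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(64, 64))      # "ideal" weight matrix
x = rng.normal(size=64)            # a sample input vector
reference = W @ x                  # exact digital result

for noise in (0.001, 0.01, 0.05):  # relative imprecision of the analog cells
    W_analog = W * (1 + noise * rng.normal(size=W.shape))
    err = np.linalg.norm(W_analog @ x - reference) / np.linalg.norm(reference)
    print(f"cell imprecision {noise:.3f} -> relative output error {err:.3f}")
```

The tighter the tolerance of the analog cells, the closer the result stays to the digital reference, which is the accuracy gap IBM says its latest chips narrow.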
In-memory computing is an emerging model of system design that aims to increase performance by moving compute tasks closer to RAM. Similar principles are already in use, most notably RAM disks, which bypass the bottleneck of transferring data from SSDs or HDDs into RAM, though that approach does not scale well due to density limitations in DRAM.
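As a back-of-the-envelope illustration of that transfer bottleneck (not a benchmark of any in-memory hardware), the Python sketch below times summing an array that already sits in RAM against reloading the same array from disk first. The absolute numbers depend entirely on the machine and its caches; the gap is the point.

```python
# Rough illustration of the transfer bottleneck in-memory designs target:
# summing an array already held in RAM versus reloading it from disk first.
import numpy as np
import tempfile, time, os

data = np.random.default_rng(2).normal(size=10_000_000)  # ~80 MB of float64

with tempfile.NamedTemporaryFile(suffix=".npy", delete=False) as f:
    path = f.name
np.save(path, data)

t0 = time.perf_counter()
total_disk = np.load(path).sum()   # disk -> RAM transfer, then compute
t1 = time.perf_counter()
total_ram = data.sum()             # compute on data already in RAM
t2 = time.perf_counter()

print(f"load-from-disk + sum: {t1 - t0:.3f}s, in-memory sum: {t2 - t1:.3f}s")
os.remove(path)
```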
In IBM’s case, the company has developed a customized type of phase-change memory, which it claims is well suited for low-power environments, making it possible to bring AI to Internet of Things (IoT) devices and edge computing applications.
In-memory computing is likely to become an increasingly widespread trend outside of AI as well over the next several years. Intel’s new Cascade Lake series of processors is the first to support Optane DIMMs, which hold up to 512 GB in a single module. Optane, also known as 3D XPoint, is a non-volatile memory type that shares design principles with phase-change memory. Intel is marketing Cascade Lake-powered solutions for database applications, which can benefit significantly from the faster transaction speeds that come from bypassing disk-to-memory transfers.
The big takeaways for tech leaders:
- AI development has relied extensively on GPUs for performance improvements over the last decade, but dedicated AI hardware is necessary for continued increases in performance, according to IBM.
- In-memory computing is key to performance improvements in AI and general computing applications.
