Security researchers from Trail of Bits identified a GPU memory vulnerability they called LeftoverLocals. Some affected GPU vendors have issued fixes.
Researchers at cybersecurity research and consulting firm Trail of Bits have discovered a vulnerability that could allow attackers to read GPU local memory from affected Apple, Qualcomm, AMD and Imagination GPUs. In particular, the vulnerability, which the researchers named LeftoverLocals, can expose conversations with large language models and machine learning models running on affected GPUs.
Apple, Qualcomm, AMD and Imagination GPUs are affected. All four vendors have released some remediations, as follows:
Put simply, a GPU memory region called local memory can act as a channel between two GPU kernels, even if the two kernels aren’t in the same application or run by the same user. The attacker can use a GPU compute API such as OpenCL, Vulkan or Metal to write a GPU kernel that dumps uninitialized local memory on the target device.
CPUs typically isolate memory in a way that makes an exploit like this impossible; GPUs sometimes do not.
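The mechanism described above can be illustrated with a toy model in plain Python. This is only a sketch, not GPU code and not the researchers' proof of concept: a Python list stands in for local memory, and the names ToyGpu, victim_kernel and listener_kernel are hypothetical, invented for this illustration.

```python
# Toy model only: a Python list stands in for a GPU's local memory.
# ToyGpu, victim_kernel and listener_kernel are hypothetical names.

class ToyGpu:
    def __init__(self, local_words, clear_between_kernels=False):
        self.local = [0] * local_words
        self.clear_between_kernels = clear_between_kernels

    def launch(self, kernel, *args):
        # The same local memory is handed to every kernel as-is,
        # without being initialized first.
        result = kernel(self.local, *args)
        if self.clear_between_kernels:
            # An isolating device would wipe the region between kernels.
            self.local[:] = [0] * len(self.local)
        return result


def victim_kernel(local, secret):
    # Stages its working data in local memory, then computes on it.
    for i, value in enumerate(secret):
        local[i] = value
    return sum(local[:len(secret)])


def listener_kernel(local):
    # Writes nothing; simply copies out whatever was left behind.
    return list(local)


secret = [ord(c) for c in "hello"]

# Device that does not clear local memory between kernels.
leaky = ToyGpu(local_words=8)
leaky.launch(victim_kernel, secret)
leaked = leaky.launch(listener_kernel)
recovered = "".join(chr(v) for v in leaked[:len(secret)])

# Device that clears local memory between kernels.
isolated = ToyGpu(local_words=8, clear_between_kernels=True)
isolated.launch(victim_kernel, secret)
clean = isolated.launch(listener_kernel)
```

In this model, the listener on the first device recovers the victim's data even though the two kernels never share a variable, while the listener on the second device sees only zeros.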
In the case of open-source large language models, the LeftoverLocals process can be used to “listen” for the linear algebra operations performed by the LLM and to identify the LLM using training weights or memory layout patterns. As the attack continues, the attacker can see the interactive LLM conversation.
The listener can sometimes return incorrect tokens or other errors, such as words that are semantically close to the ones the LLM actually produced. In one case, Trail of Bits found that its listener extracted the word “Facebook” where the LLM had actually produced a similar named-entity token such as “Google” or “Amazon.”
LeftoverLocals is tracked by NIST as CVE-2023-4969.
Other than applying the updates from the GPU vendors listed above, researchers Tyler Sorensen and Heidy Khlaaf of Trail of Bits warn that mitigating and verifying this vulnerability on individual devices may be difficult.
GPU binaries are not stored explicitly, and not many analysis tools exist for them. Programmers will need to modify the source code of all GPU kernels that use local memory. They should ensure that, before the kernel ends, GPU threads clear (for example, by storing zeros to) any local memory locations that were used in the kernel, and check that the compiler doesn’t remove these memory-clearing instructions afterward.
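The shape of that source-level fix can be sketched with another toy model in plain Python. It is an illustration under stated assumptions rather than real kernel code: a list stands in for local memory, and the names run_then_dump, unpatched_kernel and patched_kernel are hypothetical. Python has no optimizing compiler that could drop the clearing loop, so the compiler check described above has no counterpart here and would still need to be done on real GPU code.

```python
# Toy model only: a Python list stands in for a GPU's local memory.
# All function names here are hypothetical, chosen for illustration.

def unpatched_kernel(local, data):
    # Stages data in local memory and leaves it there on exit.
    for i, value in enumerate(data):
        local[i] = value
    return sum(local[:len(data)])


def patched_kernel(local, data):
    used = len(data)
    for i, value in enumerate(data):
        local[i] = value
    result = sum(local[:used])
    # Before the kernel ends, overwrite every location it used with zeros.
    for i in range(used):
        local[i] = 0
    return result


def run_then_dump(kernel, data, local_words=8):
    # Runs one kernel, then reports what a later kernel would find
    # in the same local memory region.
    local = [0] * local_words
    result = kernel(local, data)
    return result, list(local)


data = [3, 1, 4, 1, 5]
unpatched_result, unpatched_leftover = run_then_dump(unpatched_kernel, data)
patched_result, patched_leftover = run_then_dump(patched_kernel, data)
```

Both kernels return the same result, but only the patched one leaves nothing behind for a later kernel to read.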
Developers working in machine learning or application owners using ML apps should take special care. “Many parts of the ML development stack have unknown security risks and have not been rigorously reviewed by security experts,” wrote Sorensen and Khlaaf.
Trail of Bits sees this vulnerability as an opportunity for the GPU systems community to harden the GPU system stack and corresponding specifications.
Megan Crouse has a decade of experience in business-to-business news and feature writing, including as first a writer and then the editor of Manufacturing.net. Her news and feature stories have appeared in Military & Aerospace Electronics, Fierce Wireless, TechRepublic, and eWeek. She copyedited cybersecurity news and features at Security Intelligence. She holds a degree in English Literature and minored in Creative Writing at Fairleigh Dickinson University.