Apple has developed algorithms that allow it to collect sensitive user data while still protecting the privacy of customers, according to a new post on the tech giant’s Machine Learning Journal.
Understanding how customers use Apple devices can help improve the user experience, the post noted. However, accessing the data that would provide such insights, such as which websites users visit, may compromise their privacy.
To gain this information without revealing who users are, Apple developed a system architecture that uses local differential privacy, which it released with iOS 10. “It is rooted in the idea that carefully calibrated noise can mask a user’s data,” according to the post. “When many people submit data, the noise that has been added averages out and meaningful information emerges.”
With local differential privacy, user data is randomized before being sent from a device, so servers never see or receive raw data.
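To make the idea of noise that "averages out" concrete, here is a minimal sketch of randomized response, a textbook local differential privacy technique. It is not taken from Apple's post and is not presented as Apple's algorithm; the function names, the privacy parameter `epsilon`, and the simulated population are all assumptions made for illustration.

```python
import math
import random


def randomize_bit(true_bit: bool, epsilon: float, rng: random.Random) -> bool:
    """Runs on the device: report the true bit with probability
    e^eps / (e^eps + 1), otherwise report the flipped bit."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    return true_bit if rng.random() < p_truth else not true_bit


def estimate_fraction(reports: list, epsilon: float) -> float:
    """Runs on the server: recover the population fraction from noisy reports.

    If f is the true fraction and p the probability of telling the truth,
    the expected observed fraction is f*p + (1 - f)*(1 - p).
    Solving for f gives the unbiased estimate below.
    """
    p = math.exp(epsilon) / (math.exp(epsilon) + 1.0)
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)


def simulate(n_users: int, true_fraction: float, epsilon: float, seed: int = 0) -> float:
    """Hypothetical experiment: each simulated user privatizes one bit locally."""
    rng = random.Random(seed)
    n_true = round(n_users * true_fraction)
    true_bits = [True] * n_true + [False] * (n_users - n_true)
    reports = [randomize_bit(bit, epsilon, rng) for bit in true_bits]
    return estimate_fraction(reports, epsilon)


if __name__ == "__main__":
    # The server only ever sees the randomized reports, never the true bits.
    print(simulate(n_users=100_000, true_fraction=0.3, epsilon=1.0))
```

Any single report is deniable, since it may have been flipped, yet with enough users the estimate lands close to the true fraction. A smaller `epsilon` means more noise per report and therefore more users needed for the same accuracy.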
This system is transparent and lets users opt in, Apple noted. “No data is recorded or transmitted before the user explicitly chooses to report usage information,” according to the post. When information is transmitted, it is sent over an encrypted channel once per day, with no device identifiers. The data arrives at a restricted-access service, where IP identifiers are discarded, along with any association between records.
“At this point, we cannot distinguish, for example, if an emoji record and a Safari web domain record came from the same user,” according to the post. “The records are processed to compute statistics. These aggregate statistics are then shared internally with the relevant teams at Apple.”
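The server-side handling described above can be sketched roughly as follows. This is an illustrative assumption, not Apple's implementation: the record layout (`ip`, `kind`, `value`), the `ingest` function, and the shuffling step are all hypothetical, and in a real deployment the values would already have been randomized on the device.

```python
import random
from collections import Counter, defaultdict


def ingest(records: list, rng: random.Random) -> dict:
    """Drop identifying metadata, break any linkage between records,
    and return only aggregate counts per record kind."""
    # Keep only the kind and the value; the IP field is discarded here.
    stripped = [(r["kind"], r["value"]) for r in records]

    # Shuffling removes arrival order, so records that came in together
    # can no longer be tied to the same sender (an assumed mechanism).
    rng.shuffle(stripped)

    stats = defaultdict(Counter)
    for kind, value in stripped:
        stats[kind][value] += 1
    return {kind: dict(counts) for kind, counts in stats.items()}


if __name__ == "__main__":
    incoming = [
        {"ip": "203.0.113.5", "kind": "emoji", "value": "heart"},
        {"ip": "203.0.113.5", "kind": "domain", "value": "example.com"},
        {"ip": "198.51.100.7", "kind": "emoji", "value": "heart"},
    ]
    print(ingest(incoming, random.Random(0)))
```

After ingestion, only per-kind counts remain, so nothing indicates that the first emoji record and the domain record arrived from the same address.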
Apple has taken this approach to determine which emojis are most popular, to identify which websites require high memory usage or cause excessive energy drain from CPU usage, and to discover new words to improve auto-correct functionality.
“We believe that our paper is one of the first to demonstrate the successful deployment of local differential privacy, in a real-world setting across multiple use cases. We have shown that we could find popular abbreviations and slang words typed, popular emojis, popular health data types while satisfying local differential privacy,” the post noted. “Further, we can identify websites that are consuming too much energy and memory, and websites where users want Auto-play. This information has been used to improve features for the benefit of the user experience.”
The 3 big takeaways for TechRepublic readers
1. In a post on its Machine Learning Journal, Apple explained how it developed algorithms that allow it to collect sensitive user data while still protecting the privacy of customers.
2. Users can allow Apple to collect information such as emoji usage, website history, and typed words, which the company analyzes without including any identifying information from the user.
3. Collecting such data can help Apple improve the user experience and device functionality.