A new system from researchers at Google and MIT taps machine learning to professionally retouch photos taken on a smartphone in real time.
Artificial intelligence (AI) may soon make photo editing easier: This week at digital graphics conference Siggraph, researchers from Google and MIT announced a new system that uses machine learning to professionally retouch photos taken on a smartphone in real time—before you actually take the picture.
With the technology, a photographer can see the final version of an image on the screen while they are still framing the shot on their phone, according to MIT News.
The system is energy-efficient, and won't run down your phone's battery. It can also speed up existing image processing algorithms, MIT News said. In testing another Google algorithm for producing images with high dynamic ranges—which can capture the intricacies of color that are traditionally lost in digital photos—this new system produced results that were "visually indistinguishable" from those created with the algorithm, in just one-tenth of the time.
The technology leverages machine learning to train on thousands of pairs of images, in which one was raw and one was retouched, to determine how best to professionally edit the photo before it is actually taken.
A previous project from the MIT researchers reduced the bandwidth consumed by server-based image processing by up to 98.5%, and the power consumption by up to 85%. Researchers were able to send the server a highly compressed version of an image, and the server sent back an even smaller file with a "transform recipe," or simple instructions for modifying the original image, according to MIT News.
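To make the "transform recipe" idea concrete, here is a minimal sketch, not the researchers' actual method. It assumes a grayscale image stored as a list of values between 0 and 1, a recipe that is just one gain and one offset, and a hypothetical `server_edit` function standing in for the expensive server-side filter. All function names are illustrative.

```python
from statistics import mean

def fit_recipe(low_in, low_out):
    """Least-squares fit of out = gain * in + offset over paired pixel values."""
    mx, my = mean(low_in), mean(low_out)
    var = sum((x - mx) ** 2 for x in low_in)
    cov = sum((x - mx) * (y - my) for x, y in zip(low_in, low_out))
    gain = cov / var if var else 1.0
    return gain, my - gain * mx

def apply_recipe(pixels, recipe):
    """Apply the two-number recipe to every pixel, clamped to [0, 1]."""
    gain, offset = recipe
    return [min(1.0, max(0.0, gain * p + offset)) for p in pixels]

def server_edit(pixels):
    """Hypothetical stand-in for the server's filter: lower contrast, lift shadows."""
    return [0.8 * p + 0.1 for p in pixels]

full_res = [i / 15 for i in range(16)]   # the original image stays on the phone
low_res = full_res[::4]                  # small copy that is uploaded

# Server side: edit the small copy, then describe the edit as a recipe.
recipe = fit_recipe(low_res, server_edit(low_res))

# Phone side: apply the downloaded recipe to the full-size original.
result = apply_recipe(full_res, recipe)
```

In this toy case the phone uploads four values and downloads two, yet the result matches what the server would have produced from the full image. Real edits are not a single gain and offset, which is why this is only an illustration of the idea.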
"Google heard about the work I'd done on the transform recipe," said Michaël Gharbi, an MIT graduate student in electrical engineering and computer science and first author on both papers. "They themselves did a follow-up on that, so we met and merged the two approaches. The idea was to do everything we were doing before but, instead of having to process everything on the cloud, to learn it. And the first goal of learning it was to speed it up."
The majority of the image processing is done on a low-resolution image, which reduces time and energy consumption, MIT News noted. In the past, it was difficult to use machine learning to increase an image's resolution, but MIT and Google researchers were able to address this using two methods.
The first is that the output of the machine learning system is a formula for modifying the colors of image pixels, not the image itself, which can better approximate the retouched version. The second is a technique for applying the formula to individual pixels in the high-resolution image. Researchers trained the system on a data set created by MIT and Adobe Systems, the maker of Photoshop. The data set included 5,000 images, each retouched by five different professional photographers. The system was also trained on thousands of pairs of images produced by image-processing algorithms, such as the one for creating high-dynamic-range (HDR) images.
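The two methods described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the published system: it assumes the "formula" is a gain and a bias per location, that those coefficients are predicted on a coarse grid, and that bilinear interpolation is how they are spread over the full-resolution pixels. The article does not specify any of these details, and the names `bilinear` and `apply_coefficients` are hypothetical.

```python
def bilinear(grid, y, x):
    """Sample a coarse grid of (gain, bias) pairs at fractional coordinates."""
    h, w = len(grid), len(grid[0])
    y0, x0 = min(int(y), h - 1), min(int(x), w - 1)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    fy, fx = y - y0, x - x0

    def mix(a, b, t):
        return tuple((1 - t) * u + t * v for u, v in zip(a, b))

    top = mix(grid[y0][x0], grid[y0][x1], fx)
    bottom = mix(grid[y1][x0], grid[y1][x1], fx)
    return mix(top, bottom, fy)

def apply_coefficients(image, grid):
    """Apply low-resolution (gain, bias) coefficients to a full-resolution image."""
    rows, cols = len(image), len(image[0])
    h, w = len(grid), len(grid[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            # Map the full-resolution pixel position onto the coarse grid.
            gy = r * (h - 1) / (rows - 1) if rows > 1 else 0.0
            gx = c * (w - 1) / (cols - 1) if cols > 1 else 0.0
            gain, bias = bilinear(grid, gy, gx)
            out_row.append(min(1.0, max(0.0, gain * image[r][c] + bias)))
        out.append(out_row)
    return out

# Hypothetical 2x2 coefficient grid: brighten the left edge, leave the right alone.
coarse = [[(1.0, 0.2), (1.0, 0.0)],
          [(1.0, 0.2), (1.0, 0.0)]]
image = [[0.5] * 5 for _ in range(4)]    # flat gray 4x5 single-channel image
result = apply_coefficients(image, coarse)
```

Because the expensive step only has to produce the small `coarse` grid, the per-pixel work at full resolution is a cheap lookup and one multiply-add, which is the intuition behind doing most of the processing at low resolution.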
Google and MIT compared their technology to a machine learning system that processes images at full resolution rather than low resolution. The full-res version needed about 12GB of memory to operate, while the researchers' version needed about 100MB, roughly one-hundredth as much space. The full-res version also took about 10 times as long to produce an image as the original algorithm, or 100 times as long as the researchers' system.
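The ratios in this comparison follow from the figures reported in the article. The short calculation below spells them out. It assumes decimal units (1GB = 1,000MB) and uses arbitrary time units in which the researchers' system takes 1 unit per image.

```python
# Memory: about 12GB for the full-resolution system vs. about 100MB.
full_res_memory_mb = 12 * 1000      # assumption: decimal gigabytes
new_system_memory_mb = 100
memory_ratio = full_res_memory_mb / new_system_memory_mb

# Time: the new system takes one-tenth as long as the original algorithm,
# and the full-resolution system takes 10 times as long as the original.
new_system_time = 1                 # arbitrary unit
original_time = 10 * new_system_time
full_res_time = 10 * original_time
time_ratio = full_res_time / new_system_time

print(f"memory ratio: about {memory_ratio:.0f}x")
print(f"time ratio: about {time_ratio:.0f}x")
```

The memory ratio comes out to about 120, which the article rounds to "roughly one-hundredth," and the two tenfold time differences multiply to the stated factor of 100.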
Google has applied AI to other types of vision projects: In 2014, the company purchased Word Lens, an app that translates foreign languages in real time using the built-in camera on a smartphone. Additionally, Google announced in July that it had advanced its machine learning algorithm to provide a more personalized news feed to Google Search mobile app users.
If this system becomes widely available, it could save photographers and marketing departments time and money spent editing photos for social media or other marketing materials. If expanded, it could have other business uses, such as letting interior designers show clients in real time what a room could look like.
"This technology has the potential to be very useful for real-time image enhancement on mobile platforms," Jon Barron, senior research scientist at Google Research, told MIT News. "Using machine learning for computational photography is an exciting prospect but is limited by the severe computational and power constraints of mobile phones. This paper may provide us with a way to sidestep these issues and produce new, compelling, real-time photographic experiences without draining your battery or giving you a laggy viewfinder experience."
The 3 big takeaways for TechRepublic readers
1. This week at Siggraph, researchers from Google and MIT announced a new system that uses machine learning to professionally retouch photos taken on a smartphone in real time.
2. The majority of the image processing is done on a low-resolution image, which reduces time and energy consumption.
3. If made widely available, the technology could save photographers and marketing departments time and money on photo editing.