
Why AI and machine learning are so hard: Facebook and Google weigh in

AI remains elusive in part because the humans programming it struggle to get outside themselves. But it still holds promise for the future.


Pundits are quick to hype AI and machine learning as the future of everything. But anyone who has been caught screaming at Siri for failing to understand the most basic of queries knows that we have a long, ponderous way to go before "we have arrived." That's why I find Gil Press' summary of the recent O'Reilly AI Conference so helpful and important.

Some of the observations are banal ("AI is not going to exterminate us, AI is going to empower us"), but others capture the essence of what makes AI so promising...and beguiling.

AI is hard...get over it

The first observation ("AI is difficult") seems obvious, yet for all the wrong reasons. The first thing that makes AI and machine learning difficult comes down to trust. The reason, as Press captured in a statement made by Peter Norvig, director of research at Google, is that we can't see inside the machine to really understand what is happening: "What is produced [by machine learning] is not code but more or less a black box—you can peek in a little bit, we have some idea of what's going on, but not a complete idea."


The second reason, according to Press, comes down to the difficulty inherent in "teaching" a machine enough about the world to allow it to "understand" context. Yann LeCun, director of AI research at Facebook, indicated that to truly grok the world "machines need to understand how the world works, learn a large amount of background knowledge, perceive the state of the world at any given moment, and be able to reason and plan."

No small feat.

Compounding the difficulty is that any data we feed into a machine is necessarily biased by the person, or people, supplying it. In the very act of trying to set machines free to objectively process data about the world around them, we imbue them with our subjectivities. It's hard to see how to escape this reality.

Lastly, machine learning proves so difficult because of the programming that goes into it, as Norvig noted:

"Lack of clear abstraction barriers"—debugging is harder because it's difficult to isolate a bug; "non-modularity"—if you change anything, you end up changing everything; "nonstationarity"—the need to account for new data; "whose data is this?"—issues around privacy, security, and fairness; lack of adequate tools and processes—existing ones were developed for traditional software.

Despite all these seemingly intractable difficulties, real promise remains in machine learning and AI.

Finding structure in chaos

Whereas we used to live in a comfortable existence of relational databases, with neat and tidy rows and columns of data, today's world is a morass of unstructured or semi-structured data. It was always thus, of course, but we lacked the data infrastructure to process it. With NoSQL databases, big data frameworks like Apache Hadoop and Apache Spark, and more, we finally have the right tools at the right price point (free and open source) to tackle our data.


However, we still struggle to uncover patterns in this chaotic haystack of data. This is where machine learning becomes so important. As Naveen Rao, co-founder and CEO of Nervana, said at the event: "The part that seems to me 'intelligent' is the ability to find structure in data." It's not that machines are interpreting the world in any remotely human way today. No, what they're doing is uncovering structure in a seemingly structure-free mountain of data, picking out patterns that no human brain could hope to find in any comparable amount of time.

The trick, then, is to enable machines and humans to operate in tandem. This is the challenge of the next decade of AI and machine learning, and the big reason that, despite their inherent difficulty, AI and machine learning are worth the hype.


About Matt Asay

Matt Asay is a veteran technology columnist who has written for CNET, ReadWrite, and other tech media. Asay has also held a variety of executive roles with leading mobile and big data software companies.
