For firms, a computer that could turn every employee into an expert in their field is a tempting prospect.

That dream of universal expertise is what IBM says its Watson question-answering, machine-learning system makes possible.

Watson can be trained to answer questions on any subject you choose. The system uses natural language processing to read huge numbers of documents, extracts and organises information about a particular topic and then refines its understanding of that subject based on human feedback.
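The loop described above — extract information from documents, then refine confidence with human feedback — can be sketched in miniature. This is a toy illustration of the pattern, not Watson's actual API; the fact extraction, confidence values and example sentences are all assumptions made for the sake of the sketch.

```python
# Toy sketch of an ingest-and-refine loop (illustrative only, not
# Watson's real pipeline): extract candidate facts from documents,
# then adjust confidence in each fact from human feedback.
from collections import defaultdict


def extract_facts(document):
    """Stand-in for NLP extraction: treat each sentence as a 'fact'."""
    return [s.strip() for s in document.split(".") if s.strip()]


class ExpertModel:
    def __init__(self):
        # Prior belief of 0.5 in any fact we have not yet reviewed.
        self.confidence = defaultdict(lambda: 0.5)

    def ingest(self, documents):
        for doc in documents:
            for fact in extract_facts(doc):
                # Seeing a fact in the corpus nudges confidence upward.
                self.confidence[fact] = min(1.0, self.confidence[fact] + 0.1)

    def feedback(self, fact, correct):
        # Human review moves confidence up or down.
        delta = 0.2 if correct else -0.3
        self.confidence[fact] = min(1.0, max(0.0, self.confidence[fact] + delta))


model = ExpertModel()
model.ingest(["Imatinib targets BCR-ABL. Fatigue is a common side-effect."])
model.feedback("Fatigue is a common side-effect", correct=True)
```

The point of the sketch is the shape of the process: extraction gives a starting belief, and human feedback — the part Chin's team supplies — is what actually trains the system.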


But how useful are the answers given by Watson and how difficult is it to train? One person who’s well-placed to talk about using the Jeopardy!-winning system is Lynda Chin, director of the Institute for Health Transformation at the University of Texas System.

For the past four years, the university’s MD Anderson Cancer Center has been working with IBM Watson to develop a system to advise oncologists on the diagnosis and treatment of cancer patients.

Starting with leukaemia, the Oncology Expert Advisor (OEA) has been learning about cancer, absorbing information about treatments, symptoms and side-effects from patient records from the center. The OEA is offering advice on diagnoses and treatments to MD Anderson’s network of clinicians, while continuing to learn, and the center is preparing to train Watson to specialise in lung cancer.

This is what Chin has learned about what’s needed to get the best out of Watson.

You’ll need to spruce up your data

Watson can read and, to an extent, understand documents written in everyday and specialised language. But you’ll likely still need to tidy up the data you use to train Watson, if you want to maximise the useful information extracted.

“You need real data and that’s a huge challenge in healthcare,” said Chin.

“You’ve probably heard that our healthcare data is far from being aggregated and normalised.”

When feeding patient records to Watson, the center needed to work out what kind of data was suited to training, how to clean it up and how to address ambiguity in human language or unclear sentences.

“Reading the medical chart to understand the problem of the patient is one of the hardest parts of training a system. You don’t know what you’ll encounter in the medical records and you have to be able to understand if a piece of information is relevant or not,” said Chin.
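The kind of clean-up Chin describes might include normalising whitespace and expanding clinical shorthand before a record is fed to the system. The sketch below is a minimal illustration under assumed rules; the abbreviation map is hypothetical and is not MD Anderson's actual pipeline.

```python
# Minimal sketch of tidying a free-text clinical note before training.
# The abbreviation expansions here are illustrative assumptions.
import re

ABBREVIATIONS = {  # hypothetical shorthand map
    "pt": "patient",
    "hx": "history",
    "tx": "treatment",
}


def normalise_note(text):
    text = text.lower()
    text = re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace
    words = [ABBREVIATIONS.get(w, w) for w in text.split(" ")]
    return " ".join(words)


cleaned = normalise_note("Pt   hx of AML,\nstarted tx")
# → "patient history of aml, started treatment"
```

Real medical records need far more than this — resolving ambiguous sentences and judging relevance, as Chin notes, is the hard part — but even mechanical normalisation like this reduces the noise a system has to learn through.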

You’ll need a broad range of data

Watson and machine learning systems in general are only as good as the data you train them on.

If you want Watson to be an all-rounder, the training data not only needs to be high quality but also to be diverse.

When training the OEA, Chin said that researchers at MD Anderson, and those training a similar system at the Memorial Sloan Kettering Cancer Center in New York, realised that they were feeding the systems medical records of a very specific type of cancer patient.

The patients treated at the centers are usually those with advanced cancers who have failed second-, third- and fourth-line therapies.

For Watson’s advice to be useful to oncologists in general, it will need data relating to a far broader spectrum of cancer patients, said Chin.

“What we realised is that we need more data. The kind of patient that ends up at a speciality hospital like MD Anderson and Memorial Sloan Kettering represent about two percent. We need more data that represents the other 98 percent.”
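The skew Chin describes can be made concrete with a quick distribution check over the training records. This is a toy sketch, not the center's actual tooling; the record structure and the 90 percent threshold are assumptions for illustration.

```python
# Illustrative check: how skewed is a training set toward one
# patient category? (Assumed record format; not real tooling.)
from collections import Counter


def category_shares(records):
    counts = Counter(r["stage"] for r in records)
    total = sum(counts.values())
    return {stage: n / total for stage, n in counts.items()}


# Mirroring the 2% / 98% split Chin describes, but inverted in the
# training data: specialty-hospital records dominate.
records = [{"stage": "advanced"}] * 98 + [{"stage": "early"}] * 2
shares = category_shares(records)

# A set dominated by one category will generalise poorly to the rest.
skewed = max(shares.values()) > 0.9
```

A system trained almost entirely on advanced-cancer records, as this check would flag, has little basis for advising on the other 98 percent of patients.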

The center has been working with consultants PricewaterhouseCoopers to identify new data sources that could make Watson a better adviser for clinicians – with Chin hoping to one day pull data from many different hospitals, clinics and pharmacies, as well as from devices such as Fitbit, which provide information about the patient’s behavior at home.

“The intent is that we should have the secure flowing of data, not just from hospital and clinics but importantly from the real-world environment.”

Don’t expect the system to have a broad range of expertise

If Watson or rival machine learning-based systems are to possess general expertise about a job, they will need to understand the multiple domains of knowledge that relate to that role.

For instance, within healthcare, doctors are required to understand many different illnesses, from cancer to diabetes, and sub-domains, from lung to brain cancer.

However, training systems to understand each of these many different domains can be challenging and time-consuming, which is why initial efforts have focused on leukaemia, lung cancer and other specific types of cancer.

“Think about a patient. I don’t want a doctor who just has expertise in cancer. I want them to know about my hypertension and my diabetes. The chances are I have all of them. So how could we build multiple expert systems that can support the same doctor, so the doctor can look at the patient as a whole person?” said Chin.

“What we need to do to get there is to have a common infrastructure that allows us to break down the silos of different specialities.”

It’ll likely take months, even years to train an expert adviser

Training Watson and turning it into an expert adviser is a long process, even if you’re focusing on just one domain of knowledge related to a role.

“It will take six, nine, 18 months to train a domain. It depends on how complicated the domain is, how much data you have,” said Chin, adding that training is an ongoing process.

“There will always be the training requirement. We’re looking at learning to build an engine while the airplane is in the air, so there are a whole lot of moving parts.”

The length of time it takes to train a system like Watson is a barrier to widespread adoption of machine learning within businesses, Harrick Vin, chief scientist at Indian outsourcer Tata Consultancy Services, recently argued, particularly when that lengthy training may need to be repeated for every domain of knowledge related to a role.

“If you really want to see benefits, like the ones that are being projected, then you’ve got to figure out how to scale and not take six months, one year, 18 months to train an engine to perform one task, because a typical large business performs hundreds, if not thousands, of different activities,” he said at the time.

Chin disagrees that the need for protracted up-front and ongoing training means the technology isn’t scalable and points out that time spent training is being reduced as the underlying technology, processes for training and knowledge of different domains mature.

In the center’s case, the goal of helping doctors decide on the best treatments based on the latest medical evidence — much of which the physician would likely otherwise be unaware of — makes that upfront investment worthwhile.

“We want to support a practising physician as if they have access to the world’s experts for that tumor type for that disease.”

You won’t necessarily get black or white answers

In many instances you shouldn’t expect Watson to give you a definitive ‘yes’ or ‘no’ answer, particularly in complex fields where the correct decision comes down to a judgement call.

In these scenarios, Watson should be viewed as a tool for highlighting the best options available, so the human can make the decision.

“I believe that if we’re talking about human-like systems…they are never 100 percent correct,” said Chin.

“There is almost never a black and white, yes and no, answer in medicine, whether you ask a machine or ask a human.”

When asked a question, Watson parses the query and produces a shortlist of the best possible answers, based on what it has learned.

The idea was never for Watson to make the final call on a diagnosis or treatment, said Chin, but to flag possibilities to the clinician.

“People immediately get defensive, they don’t want to be replaced by the machine. That was never the intent.

“The system is intended to sort through a vast amount of information and present the thing you should be paying attention to because the human brain has limited capacity and a doctor has limited time.”
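The "shortlist, not verdict" pattern described above can be sketched simply: rank candidate answers by confidence and surface the top few for the clinician to weigh, never a single yes or no. The candidate names and scores below are made up for illustration.

```python
# Hedged sketch of presenting ranked options rather than a verdict.
# Candidates and confidence scores are illustrative assumptions.

def shortlist(candidates, top_n=3, min_score=0.2):
    """candidates: list of (answer, confidence) pairs."""
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    return [(answer, score) for answer, score in ranked[:top_n]
            if score >= min_score]


options = [
    ("Treatment A", 0.72),
    ("Treatment B", 0.55),
    ("Treatment C", 0.31),
    ("Treatment D", 0.08),  # below threshold, dropped from the list
]
best = shortlist(options)
```

The human stays in the loop by design: the system narrows a vast space of possibilities to the few worth a doctor's limited attention, and the doctor makes the call.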

You may need more infrastructure

You may need to deploy more infrastructure than you expect to support the collection, storage and analysis of diverse datasets, and to make the resulting expert systems widely available.

To securely gather as much relevant data as possible and share the expertise from the resulting expert systems far and wide, the center has spent the past 24 months working with IBM, PricewaterhouseCoopers and AT&T on building a storage platform and dedicated network.

Chin said: “It’s like the Oncology Expert Advisor is the car, a vehicle to share knowledge and expertise, and then you find out we have no road. It goes nowhere. We need a highway system. A secure network that allows the car to go to a remote city where you only have one oncologist for 100 miles, so that they get support.

“Until you have that system you cannot really put OEA to use as I intended, which is to democratise and share expertise, to enable the practising physician to do the best job in taking care of their patients.”
