Amazon HealthLake uses natural language processing and ontology mapping to extract data from multiple sources.
Image: AWS

Doctors and health researchers finally have the right tools to defeat a longtime foe: interoperability. The combined power of cloud compute, natural language processing and ontology mapping is strong enough to tear down data silos and reassemble the information in a useful format.

Rush University Medical Center got a jump-start on this work during the height of the pandemic in 2020. Doctors used beta access to Amazon HealthLake from AWS to track the ups and downs of COVID-19 cases in Chicago. The hospital worked with the city to combine three data sets from 28 hospitals around the region. City leaders and public health experts used the data hub to understand the spread of the illness and ensure a standard of care for all patients.

AWS released Amazon HealthLake to general availability at the end of July. Healthcare and life sciences organizations can use the service to ingest, store, query and analyze health data at scale. HealthLake uses the industry-standard Fast Healthcare Interoperability Resources (FHIR) format to store data and make information exchange possible across healthcare systems, pharmaceutical companies, clinical researchers, health insurers, patients and other sources of patient data. Organizations also can move FHIR-formatted health data from on-premises systems to a secure data lake in the cloud.

Dr. Bala Hota, vice president and chief analytics officer at Rush University Medical Center, said using HealthLake’s data infrastructure created a turnkey solution for structures his team thought they would have to build themselves, such as a FHIR server.

“What this gives us out of the box is enterprise level quality and security,” he said.

Hota said another key part of the solution is the microservices, such as SageMaker, that connect to the data.

“The data is stored in an S3 bucket and now you have a couple of ways of connecting to the data,” he said. “What we’re excited about is the potential for other health systems to use this.”

Hota’s team has deep experience in working with backend data systems. Rush was the third site in the world to achieve the HIMSS level 7 AMAM certification, indicating the highest level of analytics use in a healthcare setting. Even with that level of expertise, not everyone on his staff had worked with the FHIR standard.

“Building a FHIR server or app from prototype to production for many users at a big hospital is not easy,” he said. “Any health department could use this platform to ingest data from different sources, put it all together and have a FHIR API layer on top of it.”
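To make the idea of a FHIR API layer concrete, here is a minimal sketch of how a client might build a search request against such a layer. FHIR search uses URLs of the form [base]/[resource type]?name=value. The base URL and patient identifier below are hypothetical placeholders, not a real Rush or HealthLake endpoint, and authentication (which any production server would require) is left out.

```python
from urllib.parse import urlencode

# Hypothetical base URL; a real FHIR server exposes its own endpoint.
FHIR_BASE = "https://fhir.example.org/r4"


def build_search_url(resource_type: str, **params: str) -> str:
    """Build a FHIR search URL of the form [base]/[type]?name=value&..."""
    base = f"{FHIR_BASE}/{resource_type}"
    if not params:
        return base
    # Sort the parameters so the resulting URL is deterministic.
    return f"{base}?{urlencode(sorted(params.items()))}"


# Search for Condition resources for one (made-up) patient,
# filtered by U07.1, the ICD-10 code for COVID-19.
url = build_search_url("Condition", patient="example-patient-id", code="U07.1")
print(url)
```

Because every FHIR server accepts the same request shape, an app written against one health system's API layer can, in principle, be pointed at another's by changing only the base URL.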

Services like Amazon HealthLake and Google Cloud Healthcare Data Engine provide a great head start on a cloud deployment for healthcare companies that have already committed to HL7 and Fast Healthcare Interoperability Resources (FHIR), according to Castlight Health CTO and Chief Architect Robert Stewart.

“These services can accelerate a cloud deployment by many months and should simplify security audits and certifications,” he said.

Castlight’s platform makes it easy to compare costs and quality scores for healthcare services. The company sells its service to employers and healthcare plans for individuals to use when making healthcare decisions.

Hota said the HealthLake work feels different because the projects are not just another one-off.

“This feels like a long-term solution and something that feels reproducible, not just something you build for one use case,” he said. “The other value proposition is being able to have apps that connect to the data; that’s where we get excited about FHIR.”

Making interoperability a reality

Dr. Taha Kass-Hout, director of machine learning at Amazon Web Services, said interoperability is still one of the most important factors in healthcare.

“HealthLake also stores data in the FHIR format so that it is easy for organizations, researchers, and practitioners to collaborate and accelerate breakthroughs in treatments, deliver vaccines to market faster, and discover health trends in patient populations,” Kass-Hout said.

Wellframe CTO Mohammad Jouni said technology has not had a transformative impact on the end user in the healthcare system as it has for other sectors because data was not easy to access.

“The challenge with healthcare is that data has always been locked behind silos with two strong limitations — one, no incentives for the data owners to share the data externally and two, no common technology and services that make it easy to store and move the data around,” Jouni said.

Wellframe is a digital health management service that guides individuals through treatment programs. Healthcare systems, individuals and insurance companies use the service.

Internal data from electronic medical records could be shared with other hospital systems for billing and care coordination, but could not be used externally by innovators.

The 21st Century Cures Act requires healthcare providers to share data with patients securely on a web portal or mobile app. This eases the first limitation, Jouni said, which leaves the second problem: the tooling, systems and standards needed to store and move this data. He sees FHIR as the ideal standard to drive data sharing.

Ready-made data infrastructure from cloud providers is the other key element to speed up the process. This dynamic is motivating the big cloud providers to position themselves as the entities that can commoditize that layer for payers, providers and vendors, according to Jouni.

“With the release of Amazon HealthLake, now all three vendors (Google, Amazon and Microsoft) have competing offerings that will ensure that whichever cloud vendor you are on, you now have a way to support the Cures Act,” he said. “This is excellent for innovators in health tech as it removes barriers for the big players to adopt these services.”

How machine learning powers data sharing and analysis

HealthLake uses natural language processing and ontology mapping to analyze and connect information from multiple healthcare sources such as medical histories, physician notes, prescriptions and medical imaging reports.

“We’ve developed integrated medical natural language processing using machine learning that has been pre-trained to understand and extract meaningful information from unstructured healthcare data,” said Kass-Hout of AWS.

The extracted information falls into distinct categories: entities (such as a procedure or medication), entity relationships (a medication and its dosage), entity traits (a test result or the time of the procedure), and Protected Health Information found in medical text.
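One way to picture those four categories is as a small data structure. The sketch below is purely illustrative: the class names, field names and the sample values are assumptions made for this example and do not reflect HealthLake's actual output schema.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    text: str                                   # the span found in the note
    category: str                               # e.g. "MEDICATION", "PROCEDURE", "TEST"
    traits: list = field(default_factory=list)  # e.g. a test result or a time


@dataclass
class Relationship:
    source: Entity       # the entity the relationship belongs to
    kind: str            # e.g. "DOSAGE"
    target_text: str     # the related span, e.g. "500 mg"


@dataclass
class NoteExtraction:
    entities: list = field(default_factory=list)
    relationships: list = field(default_factory=list)
    phi: list = field(default_factory=list)     # Protected Health Information spans


# Hand-written example for a made-up note; nothing here comes from a real model.
metformin = Entity(text="metformin", category="MEDICATION")
hba1c = Entity(text="HbA1c", category="TEST", traits=["RESULT: 7.2%"])
extraction = NoteExtraction(
    entities=[metformin, hba1c],
    relationships=[Relationship(metformin, "DOSAGE", "500 mg")],
    phi=["Jane Doe"],
)
```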

The algorithm extracts the information and then organizes, indexes, and stores it in chronological order. This creates a comprehensive view of a patient’s medical history that doctors and other healthcare providers can use to understand relationships in the data, to improve patient experiences and guide treatment decisions.
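The chronological ordering described above can be sketched in a few lines. This is a simplified illustration with invented patient IDs, dates and events, not a description of how HealthLake indexes data internally.

```python
from collections import defaultdict
from datetime import date

# Made-up extracted events: (patient id, date, description).
events = [
    ("patient-1", date(2020, 6, 15), "Follow-up visit"),
    ("patient-2", date(2020, 5, 1), "Prescription: lisinopril"),
    ("patient-1", date(2020, 4, 2), "Positive COVID-19 test"),
]


def build_timelines(records):
    """Group events by patient, with each patient's events in date order."""
    timelines = defaultdict(list)
    for patient_id, when, description in sorted(records, key=lambda r: r[1]):
        timelines[patient_id].append((when, description))
    return dict(timelines)


timelines = build_timelines(events)
for when, description in timelines["patient-1"]:
    print(when.isoformat(), description)
```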

Kass-Hout used the example of how HealthLake processes a medical note. The standard format for medical records contains a lot of information and reflects doctor behavior as well as patient data.

“Medical notes have no structure and contain typos, abbreviations and spelling errors, which makes it difficult to build a generic linking system that works across the domains,” he said.

HealthLake puts all this data in a structured format that retains all the relations.

“Having this level of detailed granularity improves the efficiency and throughput for a variety of use cases such as medication reconciliation, revenue cycle management, population and health analytics,” he said. “Associated medical codes are added to help identify medications and their brand names, or ICD-10 codes used for further analysis or billing.”

HealthLake extracts an average of 25 ICD-10 codes per note, Kass-Hout said, and flags any codes appropriate for billing, based on the context of the note.

The service also uses ontology mapping powered by machine learning to map the data to the appropriate ontology code (such as standard descriptions for illnesses, medications and insurance billing codes) with high accuracy and low latency.

“This type of entity linking overcomes many of today’s rule-based healthcare ontology mapping where customers spend hours sifting through and codifying,” he said.
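For contrast, the rule-based approach Kass-Hout describes can be sketched as a hand-maintained lookup table. This is explicitly not HealthLake's method; it is a toy example of the older technique, with a three-entry table written for this illustration, and it shows why such tables are brittle when notes use abbreviations.

```python
# A tiny hand-curated table mapping phrases to ICD-10 codes.
ICD10_LOOKUP = {
    "essential hypertension": "I10",
    "type 2 diabetes": "E11.9",
    "covid-19": "U07.1",
}


def map_to_icd10(phrase: str):
    """Return the ICD-10 code for an exact (case-insensitive) phrase, else None."""
    return ICD10_LOOKUP.get(phrase.strip().lower())


print(map_to_icd10("COVID-19"))  # found: exact match after lowercasing
print(map_to_icd10("HTN"))       # not found: the abbreviation is not in the table
```

Every abbreviation, misspelling or synonym needs its own entry in a table like this, which is the manual codifying work that machine learning-based entity linking aims to reduce.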

Rush is also using HealthLake to run a health equity study funded by the Robert Wood Johnson Foundation. The idea is to add blood pressure readings from machines in pharmacies to a patient’s health record.

The study is measuring access to healthcare in certain neighborhoods to identify gaps in care and ways to improve doctors’ prescribing habits.
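As an illustration of what a pharmacy blood pressure reading might look like once it is expressed in FHIR, the sketch below builds an Observation resource as a Python dictionary. The LOINC codes and the overall shape follow the standard FHIR pattern for blood pressure; the patient ID, values and timestamp are invented, and the function name is hypothetical. How Rush's study actually structures its data is not described in the source.

```python
import json


def blood_pressure_observation(patient_id, systolic, diastolic, taken_at):
    """Build a FHIR Observation dict for one blood pressure reading."""

    def component(loinc_code, display, value):
        # Each component carries one measurement in millimeters of mercury.
        return {
            "code": {"coding": [{"system": "http://loinc.org",
                                 "code": loinc_code, "display": display}]},
            "valueQuantity": {"value": value, "unit": "mmHg",
                              "system": "http://unitsofmeasure.org",
                              "code": "mm[Hg]"},
        }

    return {
        "resourceType": "Observation",
        "status": "final",
        "category": [{"coding": [{
            "system": "http://terminology.hl7.org/CodeSystem/observation-category",
            "code": "vital-signs"}]}],
        "code": {"coding": [{"system": "http://loinc.org", "code": "85354-9",
                             "display": "Blood pressure panel"}]},
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": taken_at,
        "component": [
            component("8480-6", "Systolic blood pressure", systolic),
            component("8462-4", "Diastolic blood pressure", diastolic),
        ],
    }


# Made-up reading for a made-up patient.
obs = blood_pressure_observation("example-patient-id", 128, 82, "2021-07-20T10:30:00Z")
print(json.dumps(obs, indent=2))
```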

Hota said his plan is to use this initial study to build a reference architecture for other conditions, such as diabetes and cancer. The Rush team plans to make the analysis tools available via open source when the project is finished.