AWS Announces Amazon HealthLake for Healthcare Organizations to Store, Transform and Analyze Data


At AWS re:Invent, Amazon Web Services, Inc. (AWS), an Amazon company, announced Amazon HealthLake, a HIPAA-eligible service for healthcare and life sciences organizations. Amazon HealthLake aggregates an organization’s complete data across various silos and disparate formats into a centralized AWS data lake and automatically normalizes this information using machine learning. The service identifies each piece of clinical information, tags and indexes events in a timeline view with standardized labels so they can be easily searched, and structures all of the data into the Fast Healthcare Interoperability Resources (FHIR) industry standard format for a complete view of the health of individual patients and entire populations.

As a result, Amazon HealthLake makes it easier for customers to query the newly normalized data, perform analytics on it, and run machine learning against it to derive meaningful value. Organizations such as healthcare systems, pharmaceutical companies, clinical researchers, and health insurers can use Amazon HealthLake to help spot trends and anomalies in health data so they can make more precise predictions about the progression of disease, the efficacy of clinical trials, the accuracy of insurance premiums, and many other applications.

As machine learning becomes more mainstream, companies in every business vertical are trying to apply it to their data to deliver meaningful business value. Healthcare is applying machine learning to improve operations and patient care, with AWS customers like 3M, Anthem, AstraZeneca, Bristol Myers Squibb, Cerner, the Fred Hutchinson Cancer Research Center, GE Healthcare, Infor, Pfizer, and Philips embracing the cloud and machine learning to get more value out of their vast data troves. From family history and clinical observations to diagnoses and medications, healthcare organizations are creating huge volumes of patient information every day with the goal of getting a full view of a patient’s health and applying analytics and machine learning to improve care, analyze population health trends, and improve operational efficiency. However, clinical data is complex and is notoriously siloed, incomplete, incompatible, and stored in on-premises systems spread across multiple locations. Aggregating all this information and converting it to the FHIR format is a start toward the goal of standardizing structured data, but the majority of data remains unstructured and still needs to be tagged, indexed, and structured in chronological order before all of it can be understood and queried.

Some healthcare organizations build rule-based tools to automate the process of transforming unstructured data (e.g., medical histories, physician notes, and medical imaging reports) and tagging clinical information (e.g., diagnoses, medications, and procedures), but these solutions often fail because the data needs to be normalized across disparate systems and because the tools can’t account for every possible variation in spelling, typo, and grammatical error. Other organizations use general-purpose optical character recognition (OCR) software to process data sources, but these tools lack the medical expertise to be effective, so organizations resort to manual data entry by medical professionals, which adds expense to the digitization process.
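
To illustrate why exact-match rules are brittle, the following is a minimal Python sketch that compares a rule-based tagger with a tolerant one. It is an illustration only and does not represent how Amazon HealthLake works; the vocabulary `KNOWN_MEDICATIONS`, the functions `tag_exact` and `tag_fuzzy`, and the 0.8 similarity cutoff are all hypothetical choices made for this example.

```python
import difflib
from typing import List

# Hypothetical vocabulary, for illustration only.
KNOWN_MEDICATIONS = ["metformin", "lisinopril", "atorvastatin"]


def _tokens(note: str) -> List[str]:
    """Lowercase the note and split it into words, dropping commas and periods."""
    return note.lower().replace(",", " ").replace(".", " ").split()


def tag_exact(note: str) -> List[str]:
    """Rule-based tagging: a word counts only if it matches the vocabulary exactly."""
    return [t for t in _tokens(note) if t in KNOWN_MEDICATIONS]


def tag_fuzzy(note: str, cutoff: float = 0.8) -> List[str]:
    """Tolerant tagging: map each word to its closest vocabulary entry, if any."""
    found = []
    for t in _tokens(note):
        match = difflib.get_close_matches(t, KNOWN_MEDICATIONS, n=1, cutoff=cutoff)
        if match:
            found.append(match[0])
    return found


if __name__ == "__main__":
    note = "Patient started on metfromin and lisinipril."  # two misspellings
    print(tag_exact(note))  # the exact-match rule finds nothing
    print(tag_fuzzy(note))  # the tolerant matcher recovers both medications
```

Even the tolerant version handles only simple misspellings of a fixed word list; abbreviations, brand names, and context-dependent terms would still be missed, which is the gap the paragraph above describes.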

Even if organizations are able to aggregate and structure their data, they still need to build their own analytics and machine learning applications to uncover relationships in the data, discover trends, and make precise predictions. The cost and operational complexity of doing all this work is prohibitive for most organizations. As a result, the vast majority miss out on the potential to use their data to improve the health of patients and communities.

Amazon HealthLake offers medical providers, health insurers, and pharmaceutical companies a service that brings together and makes sense of all their patient data, so healthcare organizations can make more precise predictions about the health of patients and populations. The new HIPAA-eligible service enables organizations to store, tag, index, standardize, query, and apply machine learning to analyze data at petabyte scale in the cloud.