November 14, 2023 | 5 minute

How Data Lakehouses Are Improving Healthcare Systems Across Nearly Every Category


Author: Ted Healey, VP of Engineering

Today, healthcare systems are facing serious headwinds. Patients are unhappy with their care, and staff and nurses are overworked, underpaid, and often in harm’s way. Plus, profits are trending in the wrong direction.

It’s critical that healthcare leadership leverage technology, data, and analytics to best understand where they stand, and what actions to take to improve their business and patient care.

To accomplish this, healthcare systems should take note of many other industries that have built and benefited from a data lakehouse.

What is a Data Lakehouse?

A data lakehouse leverages modern tools to aggregate data from all enterprise sources into a single location, on an ongoing, real-time, or near real-time basis, so that any data is available (with security and permissions, of course) for any kind of analysis today and in the future.

In the past, data was siloed in databases and applications that each served a single purpose, typically for the benefit of one team or group within the organization. For example, information about the structure or facilities of a hospital or division of hospitals would be stored, managed, and used by the group overseeing real estate and facilities management, whereas staff data would be stored and managed in an HR database.

Suppose you want to study commute times for your nurses in the Chicago metropolitan area, where you have three hospitals. In a siloed environment, you can’t go straight to the analysis. Instead, you need to determine what data to get from which database, extract it, and make it available for this single purpose. For example, staff addresses come from your HR system, and building addresses from facilities data. And once the extract is made, it starts going stale, so it is of little use for any other purpose.
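With both sources already sitting side by side in a lakehouse, the commute question becomes a simple join. The following is a minimal Python sketch, not a production method: the record layouts, IDs, and coordinates are hypothetical, addresses are assumed to be geocoded already, and straight-line distance stands in for real commute time.

```python
import math

# Hypothetical records. In a lakehouse these would be two tables,
# one sourced from the HR system and one from facilities data.
staff = [
    {"staff_id": "N1", "hospital_id": "H1", "lat": 41.95, "lon": -87.65},
    {"staff_id": "N2", "hospital_id": "H2", "lat": 41.75, "lon": -87.70},
]
hospitals = {
    "H1": {"lat": 41.88, "lon": -87.63},
    "H2": {"lat": 41.79, "lon": -87.60},
}


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def commute_distances(staff_rows, hospital_lookup):
    """Join each staff member to their hospital and compute the distance."""
    distances = {}
    for row in staff_rows:
        site = hospital_lookup[row["hospital_id"]]
        distances[row["staff_id"]] = haversine_km(
            row["lat"], row["lon"], site["lat"], site["lon"]
        )
    return distances


if __name__ == "__main__":
    for staff_id, km in commute_distances(staff, hospitals).items():
        print(f"{staff_id}: {km:.1f} km")
```

Because the joined data stays live in the lakehouse, the same tables could later feed a staffing or retention analysis without a fresh extract.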

How Can Data Lakehouses Help Healthcare Systems?

Hospitals and healthcare facilities generate tremendous amounts of data in various categories including:

    • Patient Data – Demographic information, medical history, lab results, diagnostic images, and other clinical documentation.

    • Electronic Health Records (EHR) – Comprehensive information about a patient’s medical history, diagnoses, medications, allergies, and treatment plans.

    • Financial Data – Billing and payment information, insurance claims, and other financial records.

    • Human Resources Data – Information about hospital employees, such as job titles, salaries, work schedules, performance evaluations, and benefits.

    • Operational Data – Patient admissions and discharges, bed utilization, staffing levels, and asset/inventory management.

    • Research Data – Clinical trials, medical research studies, and other scientific investigations conducted at the hospital.

    • Quality Data – Patient satisfaction surveys, clinical outcomes, and quality improvement initiatives.

    • Administrative Data – Budgets, strategic plans, marketing initiatives, and regulatory compliance.

    • Asset Data – The location, condition, and par levels of critical equipment.

    • Staff and Patient Location Data – This comes from IoT solutions that offer tags for assets as well as badges for staff and patients.

It has never been easier or more economical to aggregate this data into a data lakehouse. With a unified data lakehouse, healthcare systems can better understand clinical outcomes, operational efficiency, patient satisfaction, and more using advanced analytics across multiple data sources. Steps to accomplish this and utilize the full potential of your data include:

    • Data Ingestion – There are many ELT (Extract, Load, Transform) tools that can easily copy data from point A to point B.

    • Data Cleansing – While the data is being copied, it can be cleaned up and converted into a different format or schema.

    • Data Anonymization – Data that should not be widely shared can be “scrubbed” before it reaches the shared areas of the data lakehouse. For example, instead of copying a person’s Social Security number into a shared space, a placeholder like ***-**-**** can be substituted. The real Social Security number can still be copied into a special location within the lakehouse that requires additional permissions to access. This way, no data is lost, and data that requires additional safeguards is protected appropriately.

    • Machine Learning and Business Intelligence – Once the data is live and aggregated from all enterprise data sources, techniques like machine learning can be used to mine it for many different business purposes. Business intelligence tools can then extract the precise data you need for reporting, dashboards, notifications, and more.
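The ingestion, cleansing, and anonymization steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the field names (`staff_id`, `ssn`), the two in-memory "zones", and the helper functions are hypothetical, and a real deployment would rely on an ELT tool and the lakehouse's own access controls rather than plain lists.

```python
SSN_MASK = "***-**-****"  # placeholder written to the shared zone


def cleanse(record):
    """Normalize key names and trim stray whitespace from string values."""
    return {
        key.strip().lower(): (value.strip() if isinstance(value, str) else value)
        for key, value in record.items()
    }


def split_sensitive(record, sensitive_field="ssn", key_field="staff_id"):
    """Return (shared_record, restricted_record_or_None).

    The shared copy carries a mask; the real value is kept separately,
    keyed by the record's ID, for a zone with stricter permissions.
    """
    shared = dict(record)
    restricted = None
    if shared.get(sensitive_field) is not None:
        restricted = {
            key_field: shared[key_field],
            sensitive_field: shared[sensitive_field],
        }
        shared[sensitive_field] = SSN_MASK
    return shared, restricted


def ingest(source_rows):
    """Copy rows from a source system into shared and restricted zones."""
    shared_zone, restricted_zone = [], []
    for row in source_rows:
        shared, restricted = split_sensitive(cleanse(row))
        shared_zone.append(shared)
        if restricted is not None:
            restricted_zone.append(restricted)
    return shared_zone, restricted_zone


if __name__ == "__main__":
    hr_rows = [{" Staff_ID ": "N1", "SSN": " 123-45-6789 ", "Name": " Ann "}]
    shared_zone, restricted_zone = ingest(hr_rows)
    print(shared_zone)
    print(restricted_zone)
```

Keeping the real value in a separate, keyed record is what allows nothing to be lost: an authorized user can rejoin it to the shared record by `staff_id`, while everyone else sees only the mask.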

As new data sources emerge, like IoT sensor data, they can easily be incorporated into the existing data lakehouse infrastructure. By using these techniques, a healthcare company can progress along a maturity curve:

    • Chaos. No data. Confusion – Are staffing levels correct to deliver the best patient care within budget expectations? Do we know where the patients and assets are, and how many do we have in each ward or zone of the hospital?

    • Initial Data – Investing in IoT allows the organization to begin to understand where people and things are, in space and time. It’s a great first step!

    • APIs and Applications – Kontakt, as an open platform, offers this spatial data via APIs so the client can easily integrate it with their internal systems. Alternatively, Kontakt offers applications that keep track of people, equipment, interactions, nurse rounds, and many other workflows.

    • Data Lakehouse – By combining this real-time location data with the rest of a company’s enterprise data, the full value of the data can be “unlocked” and provide insights that directly benefit everyone – patients, nurses, doctors, management, and more.
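As one illustration of that last stage, the Python sketch below combines location data with asset par levels to flag wards that are short on equipment. All records and field names are hypothetical, and a par level is treated here simply as the minimum count a ward should have on hand; a real system would read both tables from the lakehouse rather than from literals.

```python
from collections import Counter

# Hypothetical real-time location readings (for example, from asset tags).
asset_locations = [
    {"asset_id": "P1", "asset_type": "infusion_pump", "ward": "ICU"},
    {"asset_id": "P2", "asset_type": "infusion_pump", "ward": "ER"},
    {"asset_id": "P3", "asset_type": "infusion_pump", "ward": "ER"},
    {"asset_id": "W1", "asset_type": "wheelchair", "ward": "ER"},
]

# Hypothetical par levels from the asset-management system.
par_levels = [
    {"ward": "ICU", "asset_type": "infusion_pump", "par": 2},
    {"ward": "ER", "asset_type": "infusion_pump", "par": 1},
    {"ward": "ER", "asset_type": "wheelchair", "par": 3},
]


def below_par(locations, pars):
    """List every (ward, asset type) whose on-hand count is under par."""
    on_hand = Counter((a["ward"], a["asset_type"]) for a in locations)
    shortfalls = []
    for p in pars:
        count = on_hand[(p["ward"], p["asset_type"])]  # 0 if none seen
        if count < p["par"]:
            shortfalls.append({
                "ward": p["ward"],
                "asset_type": p["asset_type"],
                "on_hand": count,
                "par": p["par"],
            })
    return shortfalls


if __name__ == "__main__":
    for row in below_par(asset_locations, par_levels):
        print(row)
```

Neither dataset answers the question alone: the location feed knows where each pump is, and the asset system knows how many each ward needs.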

Unlocking enterprise data and making it easy to access and analyze in an integrated fashion enables any healthcare company to run much more efficiently. Real-time location data for people and assets, combined with other data sources, can improve the lives and day-to-day workflows of patients, staff, and the administration, all at once. Kontakt’s data lakehouse technology provides health systems with a unified platform for data storage, processing, and analytics, making it easier to derive insights from operating data. You can scale your data infrastructure on demand, break down data silos, avoid vendor lock-in, reduce data management costs, and ultimately improve time-to-insight with a single source of truth.