From Firefighting to Foresight: How Hospitals Can Stop Cascade Failures From Spreading
Key takeaways
- Hospitals are tightly coupled systems where operational failures (like delayed discharges) rapidly propagate through unrelated departments, degrading care for patients with no connection to the original problem.
- Most hospitals can’t prevent cascade failures because departments operate on siloed data, leaving no single operator with visibility across the full system until the chain reaction has already spread.
- Fusing EHR and RTLS data gives hospital teams the unified, real-time picture they need to identify failure nodes early and intervene before small delays become system-wide disruptions.
It’s 11 a.m., and Mr. Smith is cleared to go home, but the discharge can’t be finalized because an order-entry step is still open in the Electronic Health Record (EHR). Someone has to circle back and close it out, so it waits.
One hour passes.
During that hour, his bed remains occupied. Downstairs, Ms. Doe has been admitted and needs exactly that bed, so she boards on a hallway gurney instead. Her ED bay stays full, so the next ambulance gets diverted across town. In the PACU, a post-op patient is ready to move but has nowhere to go, so the next surgery starts late, and the OR slips behind.
When Mr. Smith finally leaves at noon, his room needs turning over all at once, in a window that collides with three other discharges stacked up behind the same bottleneck. Transport and EVS scramble to respond.

What are cascade failures?
Anyone who has spent time in a hospital, whether as a patient, staff member, or visitor, has witnessed an operational failure. Even a small breakdown (a one-hour delay in an inpatient discharge) can have outsized effects in departments with no obvious connection to the original delay: ED boarding, ambulance diversion, PACU holds that back up the OR schedule, and bed-turnover crunches that overwhelm transport and environmental services.
These cascade failures, whose impacts rapidly radiate beyond the departments where they start, are daunting challenges. But they are solvable: with the right tools and visibility, hospital teams can trace the root cause of a cascade failure, interrupt the chain reaction before it escalates, and ultimately prevent the consequences from reverberating through the hospital.
Hospitals are complex, and vulnerable to cascade failures
Cascade failures are not a failure of leadership or employees, but a byproduct of how hospitals are structured.
Unlike factories, which use a narrow range of specialized assembly lines to produce a limited range of high-volume outputs for the largest market possible, hospitals deliver a wide range of low-volume services to a broader, more complex population. While a factory might produce thousands of brake valves with the same tooling, a hospital might treat several hundred patients who each need something very different (foot surgery, chemotherapy, or an appendectomy). Not only does this increase the complexity of a hospital, it also removes the advantages of economies of scale.
Hospitals are also tightly coupled: a subsystem (such as a department or wing) depends on the smooth operation of other subsystems in order to function efficiently. If a single subsystem is stretched to its limits, then there is little or no margin for error. Just as a handful of rubbernecking drivers can create hours of congestion for thousands of vehicles, one small incident can degrade the experience of many other patients.
Take ED boarding, whose upstream causes are well established: delayed inpatient discharges leave beds unavailable, forcing clinicians to board admitted patients in the ED. But its impacts can spread laterally, from inpatients to completely different populations such as non-admitted ED patients. Boarded patients tie up beds and nursing capacity, so non-admitted patients wait longer to be seen, treated, and hopefully discharged, slowed by a bottleneck that has nothing to do with their care.
Visibility is the key
To stop these problems from escalating, hospitals need to solve a core problem: a lack of visibility, which in turn delays resolution. If each department or team has its own data, systems, and view of the hospital, then no one has the complete operational picture. This also means that no one can see a chain reaction forming, let alone stop it before its effects spread.
This is confirmed by research into how hospitals operate: despite being tightly coupled operationally, hospitals are loosely coupled informationally. A Kaiser Permanente study found that clinicians spend 10% of their time working around operational failures, which stemmed from a lack of interconnectedness among interdependent departments.
Unifying these operational and organizational silos, so that teams can identify problems and act on them, is vital to preventing cascade failures. For instance, a patient flow coordinator sees that the ED is congested, while a case manager notices that three discharges are pending from the ICU. If the two can connect their insights, they can understand how one improvement (faster ICU discharges) creates benefits elsewhere in the hospital (lower ED congestion), and work together to improve both of their departments.
Given the scale and complexity at which hospitals operate, it is unrealistic to expect the coordinator or case manager to remain in constant contact with team members across other departments and recognize these opportunities to work in tandem. Instead, this is a job for AI, which could identify the link between the upcoming discharges and ED traffic and bring the information to the right person to act.
From reaction to prevention
Yet unified, real-time visibility is only part of the puzzle. To transition to a more proactive, preventive approach, hospitals also need to fuse two parallel data streams: EHR data, which carries clinical context and care progressions, and RTLS signals, which continuously track people, devices, and rooms. Only this complete picture can help teams find and fix failure nodes before they spread through the hospital.
This is what Kontakt.io’s Intelligent Orchestration platform is built to do. Patient Flow Agent applies this directly to the capacity cascade, predicting census hours ahead, flagging discharge barriers as they form, and surfacing bottlenecks before they affect care or the experience of patients and providers. When teams intervene earlier, a small adjustment (such as discharging patients a few hours earlier in the day) can keep a domino effect (such as ED boarding) from starting. Patient Flow Agent makes this intervention possible by giving the right person the right information to act before the window closes.
To be clear, Patient Flow Agent only handles operational issues, and doesn’t supersede any clinical judgments or make decisions for human staff. For instance, it can identify patients who have already been cleared to leave but are kept waiting in their rooms because of paperwork or communication silos, and surface recommendations to patient flow coordinators or case managers for action.
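To make the idea concrete, here is a minimal Python sketch of how EHR and RTLS records could be combined to spot a stalled discharge like Mr. Smith’s. It is an illustration only, not a description of how Patient Flow Agent is implemented: the record shapes (EhrStatus, RtlsSighting), the function name stalled_discharges, and the 30-minute threshold are all assumptions made for this example, and real EHR and RTLS feeds will look different.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


# Hypothetical record shapes for illustration; real feeds will differ.
@dataclass
class EhrStatus:
    patient_id: str
    assigned_room: str
    cleared_at: Optional[datetime]  # None if not yet cleared for discharge
    open_steps: List[str]           # e.g. an unfinished order-entry step


@dataclass
class RtlsSighting:
    patient_id: str
    room: str
    seen_at: datetime


def stalled_discharges(
    ehr: List[EhrStatus],
    sightings: List[RtlsSighting],
    now: datetime,
    threshold: timedelta = timedelta(minutes=30),  # assumed cutoff
) -> List[Tuple[str, str, timedelta, List[str]]]:
    """Return (patient_id, room, time waited, open steps) for patients who
    are cleared in the EHR but whose latest RTLS sighting is still in their
    assigned room, longest wait first."""
    # Keep only the most recent sighting per patient.
    latest: Dict[str, RtlsSighting] = {}
    for s in sightings:
        current = latest.get(s.patient_id)
        if current is None or s.seen_at > current.seen_at:
            latest[s.patient_id] = s

    flagged = []
    for record in ehr:
        if record.cleared_at is None:
            continue  # not cleared yet, so nothing is stalled
        waited = now - record.cleared_at
        last_seen = latest.get(record.patient_id)
        still_in_room = (
            last_seen is not None and last_seen.room == record.assigned_room
        )
        if still_in_room and waited >= threshold:
            flagged.append(
                (record.patient_id, record.assigned_room, waited, record.open_steps)
            )
    return sorted(flagged, key=lambda item: item[2], reverse=True)
```

Neither data source is enough on its own in this sketch: the EHR record says the patient is cleared and names the open step, while the location signal confirms the bed is still occupied. The output is a list for a coordinator or case manager to review, not an automated action.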
Cascade failures are common in hospitals; however, they are not inevitable, nor are they signs of leadership or employee inadequacies. After all, modern hospitals are complex, highly interdependent environments, and even the best teams and leaders will struggle without the right information.
The hospitals that truly succeed will be the ones that can identify impending problems and break the chain reaction before it spreads through the system.
