Microsoft-CrowdStrike Outage: Global IT Disruption, Its Wide-Ranging Impacts, and How to Recover


On July 19, 2024, a faulty software update from CrowdStrike Holdings, a cybersecurity company, caused a massive IT outage affecting Microsoft Windows systems worldwide. The incident disrupted essential services across sectors including healthcare, banking, and travel, making it one of the most significant outages since Amazon’s 2017 cloud errors and Fastly’s 2021 media network disruptions.

Table of Contents

The Scope of the Outage

Impact on Healthcare

Government Agencies

Airlines and Travel

Financial Sector

Automotive Industry

Media and Broadcasting

Corporate Impact

The Cause of the Issue

What Went Wrong?

Fixing the Problem

How to Manually Fix Affected Computers

The Bigger Picture

Lessons Learned

Conclusion

The Scope of the Outage

The outage crashed critical systems at hospitals, banks, airports, and government agencies. Microsoft and CrowdStrike quickly released fixes, but machines that had already crashed had to be rebooted and repaired by hand, which slowed recovery.

Impact on Healthcare

In the UK, the National Health Service (NHS) experienced widespread disruptions. Doctors couldn’t access blood tests, patient histories, or scans. In the US, notable healthcare institutions like Memorial Sloan Kettering Cancer Center in New York and Boston’s Mass General Brigham warned patients about potential delays. Outages affected call centers and patient portals, causing some appointments to be rescheduled.

Government Agencies

Federal agencies in the US, including the FBI and the Department of Justice, were also affected. The US Cybersecurity and Infrastructure Security Agency reported that hackers attempted to exploit the situation for phishing and other malicious activities. Internationally, the Dutch and UAE foreign ministries experienced disruptions, and various state and local government operations, including emergency call centers, were impacted.

Airlines and Travel

Airports around the world, from Berlin to Los Angeles, faced delays and cancellations, affecting over 21,000 flights. US-based airlines like United Airlines and Delta Air Lines were severely impacted, with some flights temporarily halted.

Financial Sector

The London Stock Exchange Group’s (LSEG) regulatory news service, RNS, was disrupted, delaying the publication of regulatory announcements. Major banks, including Bank of America and JPMorgan Chase, had to rely on backup systems. Thousands of JPMorgan’s ATMs and teller stations went offline but were restored by late July 19.

Automotive Industry

Renault’s plants in Maubeuge and Douai in France had to shut down due to system issues affecting parts suppliers. Tesla’s CEO, Elon Musk, announced that the company had removed CrowdStrike from all its systems due to the outage’s impact on the automotive supply chain.

Media and Broadcasting

In the UK, major broadcaster Sky News was taken off air. In the US, several TV stations, including KSHB-TV in Kansas City, experienced broadcasting issues. Australian national news outlets like ABC and Sky News Australia also faced hours-long disruptions.

Corporate Impact

Companies like FedEx and Facebook (Meta) experienced operational difficulties. FedEx saw shipment disruptions, while Facebook’s content moderators faced challenges. In the Baltic port of Gdansk, Poland, and the twin ports of Los Angeles and Long Beach, marine terminals encountered minor disruptions. American Express and Starbucks also reported temporary operational difficulties.

The Cause of the Issue

The problem started with a faulty update from CrowdStrike. CrowdStrike’s software, called Falcon, is designed to protect computer systems from cyberattacks and malware. However, instead of protecting systems, the update caused Windows computers to crash and display the “blue screen of death” (BSOD). This BSOD error prevents computers from booting up normally, making them unusable.

What Went Wrong?

CrowdStrike’s Falcon software runs deep within the operating system, at the kernel level. A fault at this level can crash Windows during startup and prevent it from booting, and that is exactly what the faulty update did. Organizations that rely on CrowdStrike’s software to protect their Windows machines were hit hardest.

Fixing the Problem

CrowdStrike and Microsoft quickly released patches to fix the issue. However, the fix required manual reboots of the affected systems, which delayed the recovery process. IT teams had to physically access each affected computer to apply the fix, making it a time-consuming and labor-intensive process.

How to Manually Fix Affected Computers

For those with administrative access, CrowdStrike recommended the following steps to manually fix affected Windows computers:

1. Boot Windows into safe mode or the Windows Recovery Environment.

2. Navigate to the C:\Windows\System32\drivers\CrowdStrike directory.

3. Locate the file matching “C-00000291*.sys” and delete it.

4. Boot the machine normally.

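While these steps are straightforward, they generally require hands-on access to each affected machine, because a computer that crashes during startup cannot be reached through normal remote management tools. IT teams therefore had to track down and repair machines one by one, including those at remote sites, which was a significant challenge.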
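For illustration only, the delete step above can be sketched as a small Python script. This is a minimal sketch, not CrowdStrike’s tooling. It assumes Python is available on the machine once it is in safe mode, which often will not be the case, and that the directory and file pattern are exactly those given in the steps above. The function name remove_faulty_channel_files and its dry_run parameter are hypothetical names chosen for this example.

```python
from pathlib import Path

# Directory and file pattern taken from the manual steps above.
DEFAULT_DRIVER_DIR = Path(r"C:\Windows\System32\drivers\CrowdStrike")
FAULTY_PATTERN = "C-00000291*.sys"


def remove_faulty_channel_files(driver_dir=DEFAULT_DRIVER_DIR, dry_run=True):
    """Return the files matching FAULTY_PATTERN in driver_dir.

    With dry_run=True (the default) nothing is deleted; the matches are
    only reported. With dry_run=False each matching file is deleted.
    Hypothetical helper written for this article.
    """
    driver_dir = Path(driver_dir)
    if not driver_dir.is_dir():
        # Nothing to do if the CrowdStrike directory is not present.
        return []

    matches = sorted(p for p in driver_dir.glob(FAULTY_PATTERN) if p.is_file())
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Report what would be removed, without deleting anything.
    for path in remove_faulty_channel_files(dry_run=True):
        print(f"would delete: {path}")
```

The function defaults to a dry run so that an administrator can first check which files match before calling it again with dry_run=False. Deleting the file still requires administrative rights, and the machine must be rebooted normally afterwards, as in the final step above.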

The Bigger Picture

This outage highlights the critical role that cybersecurity software plays in protecting computer systems. It also shows how a single faulty update can cause widespread disruptions across various sectors. While the issue caused significant headaches for IT teams, it could have been much worse had it been a deliberate attack by a criminal gang or a hostile state.

Lessons Learned

The incident underscores the importance of robust cybersecurity measures and the need for quick response strategies to address widespread IT failures. It also highlights the potential vulnerabilities in the IT supply chain and the need for continuous monitoring and improvement of cybersecurity practices.

Conclusion

The Microsoft-CrowdStrike outage caused significant disruptions across multiple sectors, affecting healthcare, government agencies, airlines, financial institutions, and more. While the issue has been largely resolved, it serves as a reminder of the critical importance of cybersecurity and the potential risks of relying on complex IT systems. By understanding the causes and impacts of such outages, organizations can better prepare for future incidents and minimize their impact.
