Thursday, December 19, 2024

A fatal program update: How CrowdStrike crashed global computer systems

A botched update from one of the world’s preeminent software security companies wreaked more havoc on global business in one day than all but the very worst of hacking groups have ever managed to inflict.

CrowdStrike built its name and a more than $70 billion market value by catching and publicly identifying malicious electronic campaigns by Russian and Chinese spies and organized criminal gangs that take in hundreds of millions of dollars.

But the company depends on deep access to millions of computers to defend them against new attacks, and instructions CrowdStrike sent to those machines running Microsoft’s Windows operating system overnight rendered them useless by Friday morning.

As banking, airline and 911 emergency call systems struggled to recover, CrowdStrike apologized and blamed an error rather than a hacking attack on its internal systems.

“This was not a cyberattack,” CrowdStrike said on its blog. The Austin-based company said it identified the problem and provided a fix for customers to help their employees get working again.

Yet the failure was so extensive and its impact so profound that not all security experts were convinced it was merely human error. CrowdStrike has grown rapidly in the last year and just last month joined the S&P 500 index of top publicly traded companies. But it has made worldwide enemies by calling out hacking operations such as those by Russian intelligence that stole emails from the Democratic National Committee and Hillary Clinton’s campaign chair in 2016.

“I doubt this was accidental. Too many shortcomings,” said Matthew Hickey, founder of Hacker House training company. He said the offending file contained random data, had not been digitally signed and had not been adequately tested.

A U.S. federal official speaking on the condition of anonymity to discuss national security matters said there was no evidence of sabotage or foreign involvement.

Some analysts said they were waiting to hear more from CrowdStrike and that the complexity of state-of-the-art hacking defenses made them dangerously fragile.

Jake Williams, a onetime hacker for the National Security Agency, said “endpoint detection” products like CrowdStrike’s Falcon tool often send out not just updated identifiers for malicious programs to block but also lines of active code to foil more complicated attack scenarios. He said it was possible that CrowdStrike’s systems for testing code before installing it everywhere might not have been “sufficiently diverse” to catch the mistake.

While computer network outages aren’t unusual, experts were stunned Friday that one company’s error rippled through so many systems.

“We haven’t seen a cascading failure like this — maybe ever,” said Chuck Herrin, an executive with the digital security firm F5 Inc.

The sheer extent of the tech crashes around the world Friday exposed the risks inherent in the sort of security software that many see as essential for businesses to ward off ransomware and other devastating hacks.

To be effective, such programs need to be able to see everything that is happening on a machine. But that access can make their failure catastrophic, as it was Friday, and the fix the company later provided was complex: Many organizations had to manually reboot each machine one at a time and delete the bad update file.

That privileged access also makes security programs a top target for spies and ordinary hackers. Just last month, U.S. officials banned Russian anti-virus software company Kaspersky Lab from new business in the country, after it was accused of playing a role in the theft of secrets from NSA employees and others.

Friday’s problems canceled or delayed thousands of flights and forced hospitals to postpone operations. The worst cyberattacks, such as the Russian NotPetya assault on Ukrainian businesses and the North Korean WannaCry virus, have done more lasting harm by permanently damaging computers. But not even those spread so rapidly and so far.

The extent of the financial damage from the outages, as well as who will bear those costs, will not be known for some time. Most software providers are free from legal liability for the harm caused by their programs, which are licensed instead of being sold. But they typically have service agreements with their largest customers that could require help with remediation, discounts or other compensation.

The failure at CrowdStrike is striking in part because the company’s executives have been among the industry’s most prominent voices faulting Microsoft for repeated security lapses. The software giant was blamed for recent major intrusions at U.S. agencies, including the theft of email last year from officials including Commerce Secretary Gina Raimondo. A scathing April report by the Cyber Safety Review Board, which is led by an official at the Cybersecurity and Infrastructure Security Agency, cited “corporate culture that deprioritized both enterprise security investments and rigorous risk management.”

Beyond those lapses at Microsoft, CrowdStrike has said that company’s dominant market position in operating systems and productivity software gives any weakness a potentially catastrophic impact.

Some experts are now saying the same about CrowdStrike, one of a small set of network security companies with such broad reach and power.

“Obviously this is very serious, it’s going to be weeks. You have to get hands on keyboards,” said Bryan Palma, chief executive of rival security company Trellix. “This speaks to the need for redundancy and defense in depth.”

The Cybersecurity and Infrastructure Security Agency said it was helping with recovery efforts and warned that criminals pretending to be from CrowdStrike were trying to talk customers into downloading malicious programs or giving up access to their computers.

Marie Vasek, an assistant professor at University College London’s computer science department, said the widespread computer meltdowns showed how reliant global technology systems are on a small number of companies’ software, including that of Microsoft and CrowdStrike.

“The issue here is that Microsoft is a standard bit of software that everybody uses, and the bug in CrowdStrike is deployed to every single system,” she said.

Vasek said technology networks have become so sprawling, complex and interrelated that a single botched line of software code is increasingly likely to bring down entire computer networks.

This defect only affected computers that use Windows, which powers hundreds of millions of personal computers and many back-end systems for airlines, digital payment, emergency services, call centers and much more.

In a statement, CrowdStrike said it is “working with all impacted customers to ensure that systems are back up and they can deliver the services their customers are counting on.”

Some companies affected by the CrowdStrike glitch, including banks and emergency service centers, said Friday that they had implemented CrowdStrike’s repaired software and were starting to recover.

Vasek said both Microsoft and CrowdStrike need to examine their procedures to prevent a repeat of such widespread technology failures.

She said CrowdStrike should consider how to safely deliver software updates to many millions of computer networks. And Microsoft, she said, needed to do more to ensure that updates to software from other companies don’t cripple Windows machines.

“Microsoft needs to think about how to check that software is as it should be,” she said.

Microsoft didn’t directly address that criticism but said in a statement that the company is “actively supporting customers to assist in their recovery.”

The company had also reported outages with some of its popular web-connected software for corporate and government technology networks.

It wasn’t immediately clear how many of Friday’s computer network collapses resulted from the defective CrowdStrike software update and which were the result of problems that started Thursday with Microsoft online services and its corporate cloud computing service, Azure.

A spokesman for Microsoft said the company didn’t believe the CrowdStrike software bug was related to the outage that impacted a “subset of Azure customers.” It has been resolved, he said.

correction

A previous version of this article incorrectly spelled Bryan Palma’s first name as Ryan. The article has been corrected.
