I’d also say “no, they can’t” given crowdstrike is at risk of being litigated into non-existence right now. Also NDAs because security.
I had never heard of CrowdStrike before. The name seems to imply that it's a military cybersecurity weapon, such as something for launching a massive DDoS on command.
This post has some info but seems more from a customer:
Sussing things out, they uploaded a "config" file that had a .sys extension that caused all the trouble.
.sys files are supposed to be protected system files and require special privileges to touch. I imagine Crowdstrike requires some special type of Windows "root access" to operate effectively (like many antivirus packages) in order to detect and block low-level attacks.
So where things likely went pear-shaped: Crowdstrike's QA process for config updates is probably less stringent than the one for core code updates, but because the update shipped as a .sys file it was still given elevated privileges and installed during boot.
As for the actual bug, I expect it was either something like the .sys file referencing itself or some sort of stack overflow somewhere, both of which I would pin on Microsoft for not being able to detect and recover from during boot-up.
All of this is straight guesswork based solely on experience as a longtime Windows user.
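To make that guess concrete, here's a tiny, purely hypothetical user-space C sketch of the failure class being described: a parser that trusts whatever offsets it finds in a config blob and dereferences them unchecked. The file layout and names are made up (nobody outside Crowdstrike knows what the driver actually does), and running it crashes by design.

    /* Hypothetical sketch of the speculated failure class: trusting a
     * "config" blob and dereferencing whatever it contains. Made-up
     * layout and names; not Crowdstrike's actual code. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    int main(void) {
        /* Stand-in for the channel/config file read off disk. An
         * all-zero (corrupt) blob is exactly the bad case. */
        unsigned char blob[64];
        memset(blob, 0, sizeof blob);

        /* The header supposedly stores a pointer/offset to a rule string. */
        uintptr_t rule_addr;
        memcpy(&rule_addr, blob + 8, sizeof rule_addr);   /* reads 0 */
        const char *rule = (const char *)rule_addr;       /* i.e. NULL */

        /* No sanity check before the dereference: in user space this is
         * a segfault, in a boot-start kernel driver it's a bugcheck
         * (BSOD) on every restart. */
        printf("rule starts with: %c\n", rule[0]);
        return 0;
    }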
This is a simple problem to understand and solve. Corporations need to buy a license from CrowdStrike.com. This allows large corporations that do not know CrowdStrike is a security nightmare built on top of the Microsoft security nightmare to assure themselves they are now secure.
My guess is it’s (phased rollout) like all safety precautions. You can flout it and you’ll be fine most of the time, possibly for a while. It works until it doesn’t.
In this case it’s a bit ironic since their whole business is mitigating risk.
The script of the film is being sold at this very moment. :D
They’ll never admit the obvious truth, which is that it was caused by a combination of incompetent administration and incompetent engineering.
I am not from Crowdstrike, but they were under pressure in 2022-2023 to show a profit. They did that by cutting expense growth across the board, and I am now guessing QA got hit particularly hard.
I expect we will see detailed forensics published by various third parties in the coming days and weeks. As to what CS itself will publish, that remains to be seen.
I saw a post somewhere that said configuration data was added after the QE cycle, but before the broad push to the world.
I don't think you will get the introductory "As a (so-called) senior software engineer at Crowdstrike" brags here at this time since they are internally screaming and swallowing their pride over that massive outage.
Why would any of them come forward right now?
I think they don't share specific details because it would reveal how badly they handle QA and update rollout processes for a product that expensive; maybe at some point law enforcement will reveal the embarrassing truth.
An internet-connected kernel-level driver, with a bricking update that passed QA and could be rolled out via a non-staged method, at global scale, in under 24 hours? That is unthinkable.
I'd guess they were missing a
if err != nil { return err }
I'm sure Crowdstrike will not take kindly to their employees spilling the beans here, just saying... Especially a very high-profile issue like this that is likely to end up in the courts.
The core issue? Using Wintel.
The world deserves a detailed post mortem and an apology!!! Considering the scale of this shit show, there's no way it didn't affect a significant number of people very seriously. I would bet that people died because of it, or suffered some kind of calamity, personal or medical.
> That other thread with more than 3000 msgs may or may not have this info, hard to read...
Feed it to an LLM to summarize /s
Oh it was a NULL pointer access... again.
I wonder if this is a preventable problem, hmmmm.
No sir, not at all! It was inevitable! Let's just leave this behind us! Those who think it's preventable are part of a certain "evangelism strike force", I tell you! They make zero good arguments, never listen to them!
Back to business as usual, boys! C/C++ are without a single flaw!
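For what it's worth, the boring fix for that whole class of bug is the same in any language: validate before dereferencing. Here's a hypothetical sketch of the same kind of offset-in-a-blob parse as in the sketch further up, done defensively (again, made-up layout and names, not anyone's real code):

    /* Hypothetical defensive version: validate the offset read from the
     * blob before ever turning it into a pointer. */
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Returns 0 on success, -1 if the blob is malformed. */
    static int read_rule(const unsigned char *blob, size_t blob_len,
                         const char **rule_out)
    {
        uint32_t offset;
        if (blob_len < 8 + sizeof offset)
            return -1;                      /* header truncated */
        memcpy(&offset, blob + 8, sizeof offset);
        if (offset == 0 || offset >= blob_len)
            return -1;                      /* offset points outside the blob */
        *rule_out = (const char *)blob + offset;
        return 0;
    }

    int main(void) {
        unsigned char blob[64];
        memset(blob, 0, sizeof blob);       /* the corrupt all-zero case */

        const char *rule;
        if (read_rule(blob, sizeof blob, &rule) != 0) {
            fprintf(stderr, "malformed config, refusing to load it\n");
            return 1;                       /* fail safe instead of crashing */
        }
        printf("rule starts with: %c\n", rule[0]);
        return 0;
    }

Whether a driver should fail open or fail closed when a config doesn't validate is a separate design question, but either option beats a boot loop.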
To people who say this is not a Microsoft issue... it absolutely is a Microsoft issue. Microsoft allowed third parties to muck with the Windows kernel in a way that makes the computer unbootable. How is that not a Microsoft issue?
Apple has a vetting process before they will allow an app to be added to their app store. Why doesn't Microsoft have a vetting process before allowing a third party to mess with the Windows kernel? Does Crowdstrike have SOC2 or some other certification to make sure they are following secure practices, with third-party verification that they are following their documented practices? If not, why not? Why doesn't Microsoft require that?
It is clear that the status quo can't continue. Think about the 911 calls that didn't get answered and the surgeries that had to be postponed. How many people lost their lives because of this? How does the industry make sure this doesn't happen again? Just rely on Crowdstrike to get their act together? Is it enough to trust them to do so?
Putting on my tin foil hat for a moment…
I suspect the US government might have pressured them to push this update because they’ve found out that Crowdstrike’s system has either already been breached or has a significant zero-day security vulnerability.
Now, taking off my tin foil hat…
By the way, Crowdstrike has shared "technical details", discussed here: https://news.ycombinator.com/item?id=41013198
Unfortunately it's rather lacking in both technicality and detail.