On July 19, 2024, a routine update from cybersecurity firm CrowdStrike unexpectedly began causing problems for IT systems around the world. The issue was particularly severe because it affected systems running Windows, the operating system that many businesses and individuals around the world rely on. Microsoft, though not directly responsible for the glitch, found itself involved because Windows underpins the IT infrastructure of many of the affected businesses.
How Were Industries Affected By The Outage?
We’ve asked experts to comment on how exactly this outage affected their operations over the last few days. Although the outage was addressed, many organisations, such as the NHS, reported delays. The NHS said it faced administrative delays, with digital processes having to be carried out manually on paper.
The NHS experienced issues with its appointment booking and records systems, but no actual treatments were affected. On July 22, it shared an update, saying, “Systems are now back online, and patients with an NHS appointment this week should continue to attend unless told not to.
“Thanks to the hard work of NHS staff throughout this incident we are hoping to keep further disruption to a minimum, however there still may be some delays as services recover, particularly with GPs needing to rebook appointments, so please bear with us.”
Our Experts:
Mark Lomas, Technical Architect, Probrand
Evgeny Shirinkin, Chief Product And Technology Officer, Trevolution Group
Adam Smart, Director of Product, Gaming, AppsFlyer
Kate Needham-Bennett, Senior Director, Resilience Innovation, Fusion Risk Management
Mat Westergreen-Thorne, CEO, Grantify
Mark Lomas, Technical Architect, Probrand
“We’re already familiar with the initial wave which has hit everything from global travel (with flights grounded), through to some TV broadcasts being knocked off-air. However, many other businesses and organisations have been affected in a secondary manner. This has included retail, with some businesses unable to take payments throughout the course of the day, either online, or in physical stores. Some delivery organisations were unable to proceed with their normal logistics operations, delaying services that – in some cases – might have been time critical. Shipping has also been affected, both locally and internationally.
“For example, some distributors faced challenges in processing and releasing orders, leading to procurement headaches. That these problems impacted businesses right down to the smallest single-person outfits, speaks to the integrated nature of our world, and the interdependency not just on technology, but on business systems beyond our own. The ongoing fallout will likely take a while to fully understand, but in the meantime, there are many lessons to be learned.”
Evgeny Shirinkin, Chief Product And Technology Officer, Trevolution Group
“Microsoft’s outage caused numerous ‘hiccups’ in banking, healthcare and the travel industry. As for Trevolution Group, air passengers worldwide were faced with delays and cancellations, and for us this resulted in an extra need for manpower to support travellers worldwide.
“On Friday and Saturday, we had issues with the global outage with an increased amount of calls for customer service and reprotection teams. This was due to increased volume of schedule change cases, a lot more than usual. However, on Sunday the situation stabilised.
“Some people have compared this to the Y2K situation, but I don’t agree. Back then, a lot of people were concerned beforehand and expected more damage but nothing major happened. This was the other way around. The mistake of just a few developers led to such huge consequences worldwide.
“Going forward, it is evident that for the essential sectors resilience and caution when it comes to their IT infrastructure must be a top priority. Technology, no matter how smart, has potential to fail at times. The key is to ensure that there are mechanisms in place to manage the impact when smart technology fails and to mitigate the effects of such failures. These industries must be very conservative about what they receive from bundled technologies, and how many permissions and access rights they grant them.”
Adam Smart, Director of Product, Gaming, AppsFlyer
“The recent CrowdStrike outage caused a significant issue with Microsoft Azure, driving widespread disruption across many apps relying on Azure. While we’ve seen impacts across so many industries, this outage came with significant implications for mobile marketers and their user acquisition campaigns. When apps go down, the user experience takes a direct hit, tarnishing the app’s reputation and often leading to user abandonment.
“Every minute an app is down translates to lost revenue, user churn, and wasted advertising spend. The amount of money you are losing every minute an app, particularly gaming apps, is down is insane. For example, AppsFlyer works with many gaming app developers who spend well over $1 million per day in user acquisition – if their app is down for half the day due to this outage, they are unlikely to reap much benefits from that half a million dollars they’ve spent.
“The uncertainty of how long these outages could take to fix leaves advertisers facing a dilemma: do you continue running campaigns that direct users to a non-functional app or do you halt these campaigns altogether? This is particularly challenging on platforms where stopping or pausing campaigns can disrupt historical performance data, leading to higher costs and reduced effectiveness once campaigns resume. From conversations across the ecosystem throughout the day today, it seems that most marketers would, and have, chosen to keep their campaigns running and eat the costs to save the short- and long-term headaches of rejiggering campaigns.
“This outage serves as a stark reminder of the importance of robust contingency planning and transparent communication with users. By understanding and mitigating the impact of these disruptions, mobile marketers can better navigate the challenges posed by unforeseen technical issues.”
Kate Needham-Bennett, Senior Director, Resilience Innovation, Fusion Risk Management
“The CrowdStrike outage – the biggest global IT outage – has impacted organisations across industries, causing some to resort to makeshift manual processes (if possible), while others were forced to halt services altogether. Without access to critical systems, computers, and applications, the outage has had a significant impact on major airlines and airports, grounding flights and leaving travellers stranded at airports on what has proven to be, in many parts of the world, one of the hottest days of the year and one of the busiest air travel days since 2019. It affected healthcare organisations and services, forcing hospitals to delay and reschedule operations and treatments, as well as impacting imaging and other critical health services and emergency response teams. It also impacted banking and financial services, including slowed or inaccessible online banking services.
“This has really highlighted the need across industries for more exercising and testing against severe but plausible scenarios. Historically firms have been reluctant to test against scenarios they deemed to be outside of their control such as global IT outages from third parties that underpin all services, such as Microsoft, AWS, Citrix, etc, or they simply declared them implausible or unlikely.
“Every industry needs to have contingency plans in place to deal with the impact of disruptions like this, even if they have no control over the root cause. There is an inclination to simply sit back and wait for IT to get the systems back up and running, but organisations will still have to deal with the operational, financial, and reputational fallouts. They must make it a priority to scrutinise their supply chains and understand which services are reliant on IT systems and third parties that business users may be unaware of, and run scenario testing regularly, at all levels, and against a wide range of scenarios – even the scary ones.”
Mat Westergreen-Thorne, CEO, Grantify
“At Grantify, we frequently help businesses with their risk management processes, to ensure they stand a better chance of securing funding, and it’s clear that the CrowdStrike incident will spark change across multiple industries within this area.
“We’ll likely see a surge in demand for more comprehensive cyber insurance that covers software update vulnerabilities in addition to attacks, especially from those in the financial sector, where compensation will be top of mind. From our understanding some insurance companies do not use this wording, which could leave organisations unprotected in eventualities like the one we witnessed last week.
“I imagine there will also be a renewed focus on robust backup systems and failsafes, with stricter protocols for regular data updates becoming the norm, and possible staggered update procedures to ensure operational continuity.
“SMEs and start-ups should take note of this when planning their risk management structures, and ensure that they have contingencies in place to continue BAU if something like this were to happen again.
“Essentially, this incident has highlighted the interconnectedness of our digital ecosystem, and should push businesses to conduct more thorough due diligence on their technology partners. While challenging, this situation will drive innovation in cybersecurity and risk management, leading to more resilient business models and new opportunities in these sectors.
“Ultimately, we envisage that we’ll see a more secure, albeit more cautious, business environment emerge from this wake-up call.”
How Was The Outage Handled?
Once the issue was detected, both CrowdStrike and Microsoft quickly began to deal with the situation. CrowdStrike said the issue stemmed from a defect in a Falcon sensor content update for Windows hosts. The company rolled back the problematic update and replaced it with a fixed version that was tested and confirmed to stabilise the affected systems.
Microsoft shared, “We currently estimate that CrowdStrike’s update affected 8.5 million Windows devices, or less than one percent of all Windows machines. While the percentage was small, the broad economic and societal impacts reflect the use of CrowdStrike by enterprises that run many critical services.”
How Microsoft Is Working With Tech Giants To Resolve This
Microsoft shared in its statement that it is working with other tech giants to resolve the issue and make sure it does not occur again. The company said, “We’re working around the clock and providing ongoing updates and support. Additionally, CrowdStrike has helped us develop a scalable solution that will help Microsoft’s Azure infrastructure accelerate a fix for CrowdStrike’s faulty update. We have also worked with both AWS and GCP to collaborate on the most effective approaches.”