What Actually Happens When A Server Goes Down?

Earlier this week, Amazon Web Services (AWS) suffered a server outage, dragging over 1,000 websites down with it.

Sites like Reddit and Snapchat, as well as banks like Halifax and Lloyds, were all affected, leaving customers waiting for them to come back online.

And whilst it was certainly inconvenient, it was also damaging to AWS’ reputation as a reliable server provider.

But what actually happens when a server fails? And how can businesses better safeguard themselves against it?

 

What Happened With AWS?

 

On Monday, AWS’ northern Virginia server experienced an outage. And whilst outages can be caused by cyberattacks or human error, AWS announced that the failure happened in one of its internal systems.

The good news is that AWS was back up and running within 24 hours, but before then, the outage caused a period of chaos online, with many websites and services fully down.

For many businesses, this caused huge disruption to both operations and revenue.

 

What Does It Mean When a Server “Goes Down”?

 

When a server goes down, to put it simply, the following usually happens:

1. A website becomes unresponsive: When the server initially goes down, the website might load slowly, stop loading altogether or show errors as it fails to connect.

2. The links break: Every time you use an app or website to perform an action, it sends the server a ‘request’, and the server replies with the data. During a server crash, this back and forth breaks, so the website is unable to return what you are after.

3. The system stops working: Any action taken on a website fails until the server comes back online and the chain of requests and replies is restored.

4. Everyone feels the pain: For users, things stop working, which can be annoying. Businesses can lose sales, trust and the ability to operate as normal.
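
To make the ‘request’ and reply back and forth a little more concrete, here is a minimal Python sketch of how a client might tell a healthy server from one that is down. The function name check_site, the three-second timeout and the status labels are all assumptions made for illustration; they are not taken from AWS or from any of the affected sites.

```python
import urllib.error
import urllib.request


def check_site(url, timeout=3.0, opener=urllib.request.urlopen):
    """Send one request and describe what came back.

    `opener` is a hypothetical hook so the function can be exercised
    without a real network; by default it is urllib's own urlopen.
    """
    try:
        # A healthy server replies before the timeout expires.
        with opener(url, timeout=timeout):
            return "ok"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (e.g. 503).
        return "server error" if exc.code >= 500 else "client error"
    except OSError:
        # No usable answer at all: connection refused, DNS failure or
        # timeout. URLError and socket timeouts are OSError subclasses.
        return "unreachable"
```

During an outage, a call such as check_site("https://example.com") would be expected to come back as "server error" or "unreachable" rather than "ok", which is roughly what a user experiences as a page that will not load.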

 

 

Why Are Server Outages So Bad For Businesses?

 

For businesses, server outages are not just inconvenient. They affect much more, including:

Revenue: For businesses that rely on online operations, every minute of downtime means money lost. The more a business relies on servers, the more vulnerable it is. Additionally, for companies that guarantee connectivity for a percentage of time, a long outage could mean they have to pay out compensation.

Trust: For companies like banks or internet providers, outages might mean that consumers lose trust in the company and switch to competitors – especially if it happens regularly.

Operations: It’s not just customers who are affected; employees can be too. Many businesses rely on internal communications apps and APIs to work. A server failure could pause work and leave projects falling behind.
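
On the revenue point, it can help to see how little downtime a connectivity guarantee actually leaves room for. The sketch below works through the arithmetic in Python. The 99.9% figure and the 30-day month are assumptions chosen for illustration, not terms from any real contract, and the function name is hypothetical.

```python
def allowed_downtime_minutes(uptime_percent, days=30):
    """Minutes of downtime permitted by an uptime guarantee over `days`."""
    total_minutes = days * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


# An assumed 99.9% guarantee over an assumed 30-day month.
print(round(allowed_downtime_minutes(99.9), 1))  # prints 43.2
```

Under those assumptions, an outage lasting most of a day would use up the month's allowance many times over, which is why a long outage can trigger compensation.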

 

What Causes Servers To Fail?

 

Whilst most servers are designed to be robust and reliable, they can fail for a number of reasons, including:

Being overloaded: When demand is higher than the server’s capacity, it might not be able to keep up.

Hardware breakages: Bits of hardware like cables, wires and chips can break, causing the server to stop working.

Bugs: An update or misconfiguration could stop the server from working as intended.

Network issues: The internal networks that connect servers could fail, causing outages.

Cyberattacks: In some cases, cybercriminals may purposefully bring down servers as part of a targeted attack.

 

What Can Businesses Do To Protect Themselves?

 

In order to stay resilient and protect themselves from outages, businesses can:

1. Balance their users across different servers and providers so that if one fails, another can keep working.

2. Use monitoring tools that alert teams to outages as soon as they happen, and keep people on hand to resolve the issue as quickly as possible.

3. Learn from their mistakes. If a server outage occurs for a specific reason, work to safeguard the business from being affected by the same issue twice.
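
The first two steps above can be sketched in a few lines of Python. This is a simplified illustration rather than a production design: the names first_healthy and OutageMonitor, the provider labels and the alert threshold are all hypothetical, and a real setup would normally rely on a load balancer and a dedicated monitoring service.

```python
from typing import Callable, Iterable, Optional


def first_healthy(backends: Iterable[str],
                  is_healthy: Callable[[str], bool]) -> Optional[str]:
    """Return the first backend that passes its health check, if any."""
    for backend in backends:
        if is_healthy(backend):
            return backend
    return None  # every server or provider is down


class OutageMonitor:
    """Signal an alert once a run of failed checks reaches a threshold."""

    def __init__(self, threshold: int = 3):
        self.threshold = threshold
        self.consecutive_failures = 0

    def record(self, healthy: bool) -> bool:
        """Record one check result; return True when an alert should fire."""
        if healthy:
            self.consecutive_failures = 0
            return False
        self.consecutive_failures += 1
        # Fire exactly once, at the moment the threshold is reached.
        return self.consecutive_failures == self.threshold


# Hypothetical health-check results: the first provider is down.
status = {"provider-a": False, "provider-b": True}
print(first_healthy(["provider-a", "provider-b"], status.get))  # prints provider-b
```

Waiting for several failed checks in a row before alerting is a design choice: it avoids waking a team up over a single slow response, at the cost of reacting slightly later to a real outage.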

 

What We Can Learn From The AWS Outage

 

So, what can we learn from the AWS outage? Well, firstly, even the biggest server providers can fail, so having all your server eggs in one basket can lead to problems.

It’s also a sign that even seemingly robust infrastructure can be vulnerable, so teams should always monitor and have a back-up plan ready.

But mostly, we can learn that a server outage isn’t just an admin headache; it’s an issue that can affect a business’ ability to make money, build trust and continue to operate.

In summary, for businesses around the world, the AWS outage should be more than a headache; it should be a wake-up call.