Rogers Communications Outage Shows Need for Carrier Diversity
If you were impacted by the Rogers Communications network outage that started on Friday, July 8th, 2022, you’re probably wondering what happened. More importantly, if your business was affected, it’s likely that you’re also wondering what can be done to prevent outages like this from happening in the future.
The post-outage press release from Rogers’ CEO, Tony Staffieri, points to maintenance on the company’s core network, which caused a router malfunction, as the root cause of the outage. In network outage situations, the hard work of restoration can begin only once the root cause has been identified.
While it’s commendable that CEO Tony Staffieri took responsibility and that Rogers credited customer accounts with five days’ worth of service credits, it’s also worrisome that a maintenance upgrade went this badly. Major telecommunications carriers perform maintenance and upgrades on their networks far more frequently than most people realize. The larger a company is, the more maintenance its network requires.
There’s a standard method of procedure for maintenance on core telecommunications infrastructure, which documents the work and defines a comprehensive scope. At least two technicians are responsible for the initial plan: one documents it, and a second reviews and approves it. The plan should also include a “backout” or “rollback” process in case of an unexpected outcome. For highly critical infrastructure, changes should be tested in a lab environment before they are run in the live environment. Without intimate knowledge of what happened, it’s impossible to say whether these steps were taken. They lower the potential for a catastrophic failure, but the odds never go all the way to 0%.
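The checks described above can be expressed as a small pre-flight gate. The following is a minimal sketch, not a description of any carrier’s actual tooling: the `MaintenancePlan` fields and the `readiness_issues` function are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MaintenancePlan:
    """Hypothetical record of a planned change to core infrastructure."""
    author: str                      # technician who documented the plan
    reviewer: Optional[str]          # second technician who approved it
    scope_of_work: str               # what will be changed
    rollback_steps: List[str] = field(default_factory=list)
    lab_tested: bool = False         # was the change rehearsed in a lab?
    critical: bool = False           # does it touch highly critical gear?


def readiness_issues(plan: MaintenancePlan) -> List[str]:
    """Return the reasons a plan is not ready; an empty list means ready."""
    issues = []
    if not plan.scope_of_work.strip():
        issues.append("missing scope of work")
    if not plan.reviewer:
        issues.append("plan has not been reviewed")
    elif plan.reviewer == plan.author:
        issues.append("reviewer must differ from author")
    if not plan.rollback_steps:
        issues.append("no rollback procedure")
    if plan.critical and not plan.lab_tested:
        issues.append("critical change not lab tested")
    return issues
```

A gate like this only confirms that the steps were recorded. It says nothing about whether the review or lab test was thorough, which is why the odds of failure never reach zero.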
It’s worth noting that the world has undergone a significant shift toward dependency on internet access and cloud-hosted applications, which unfortunately makes our dependence on ISPs stronger than ever. Because mission-critical business applications are reached over the internet, your ISP’s network stability is a key ingredient in your business’s ability to operate. It also affects your ability to work from home and to access non-business-related content. Of considerable concern, several people reported being unable to contact emergency services (dial 911) during the outage. Less consequential, but still impactful, even superstars like The Weeknd were significantly affected by the outage.
It’s also worth noting that outages like this don’t just occur once in a blue moon. Although infrequent, they certainly can happen at any time. Verizon had a huge outage in January 2021, and Comcast had one in November 2021. Lumen had a similarly notable outage in 2020.
What can be done to mitigate the impact of outages like this in the future? In moments like this, it becomes clear that strong carrier competition as well as the utilization of multiple ISPs for internet access (network diversity) is crucial to business continuity. In theory, if everyone using Rogers’ network also had a secondary connection from a competitor, then everyone would have been able to switch over to the secondary network and continue to work, stream, and be merry. For these reasons, competition between telecommunications providers makes the community stronger. Unfortunately, in many regions, true carrier diversity is difficult to come by due to a lack of competition.
Anything man-made has the potential to fail. If you or your business relies heavily on the internet, then it’s in your best interest to have a secondary internet connection from a second, independent internet service provider. The secondary connection is essential to keeping your connectivity up and running if an outage occurs on your primary service provider’s network. Although a secondary connection comes at a cost, it’s up to you to determine how costly potential downtime would be to your business.
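One rough way to frame that decision is to ask how many hours of outage per year would cost as much as the backup circuit. The sketch below assumes a flat monthly circuit price and a constant hourly cost of downtime. Both are simplifications, and the function name and the figures in the example are hypothetical.

```python
def breakeven_outage_hours(monthly_circuit_cost: float,
                           downtime_cost_per_hour: float) -> float:
    """Hours of outage per year at which a backup circuit pays for itself.

    Assumes the backup fully prevents the downtime, which is optimistic.
    """
    if downtime_cost_per_hour <= 0:
        raise ValueError("downtime_cost_per_hour must be positive")
    annual_circuit_cost = monthly_circuit_cost * 12
    return annual_circuit_cost / downtime_cost_per_hour


# Hypothetical figures: a $500/month backup circuit and $2,000 lost
# per hour of downtime.
hours = breakeven_outage_hours(500, 2000)
print(f"Backup pays for itself after {hours:.1f} outage hours per year")
```

With these made-up numbers the break-even point is three hours per year. If a single outage could last longer than that, the secondary circuit is likely the cheaper option.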
How do you go about getting a secondary, redundant connection? That’s the hard part. Often, it’s difficult to tell whether you’re getting true network redundancy or whether a different carrier is quoting you a resold version of your primary carrier’s infrastructure. When quoting redundant circuits, we recommend asking the carrier whether it is selling a “layer 2” service that rides another carrier’s infrastructure or whether it actually has its own infrastructure at the site. Sometimes it may even be worth investigating each circuit’s point of entry into the building, since two circuits that share one entry point still share a single point of failure.
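Once you have those answers, comparing two circuits is simple. The sketch below assumes you have recorded, for each quote, who owns the underlying network and where the circuit enters the building. The `Circuit` fields, the `shared_risks` function, and the carrier names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Circuit:
    """Hypothetical summary of a carrier quote for one site."""
    seller: str              # company quoting the circuit
    underlying_network: str  # company that owns the infrastructure
    building_entry: str      # where the circuit enters the building


def shared_risks(primary: Circuit, secondary: Circuit) -> List[str]:
    """List the single points of failure two circuits have in common."""
    risks = []
    if primary.underlying_network == secondary.underlying_network:
        risks.append(
            f"both circuits ride {primary.underlying_network}'s network"
        )
    if primary.building_entry == secondary.building_entry:
        risks.append(
            f"both circuits enter the building at {primary.building_entry}"
        )
    return risks


# Made-up example: a reseller quoting the primary carrier's own network.
primary = Circuit("Carrier A", "Carrier A", "north conduit")
resold = Circuit("Reseller B", "Carrier A", "north conduit")
for risk in shared_risks(primary, resold):
    print("Shared risk:", risk)
```

An empty result only means the two recorded attributes differ. Circuits can still share risk further upstream, so treat this as a first filter, not proof of diversity.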
If you’d like these questions answered for a specific site without having to call a bunch of ISPs and wait for days, feel free to try quoting a circuit using Lightyear. When you receive quotes, we’ll share detail on each carrier’s infrastructure so that you get true redundancy if that’s what you’re looking for.