I came home from work today and sat down to check my Facebook messages and immediately noticed something strange. The shapes and formatting of the page would load but none of the images or text were loading.
Grrrrr... Facebook's refusing to load..
— igmrlm (@igmrlm) March 31, 2014
My next thought was to check the ping times between all the tier one ISPs on
http://www.internethealthreport.com/ This site shows a breakdown of the ping times between the largest fibre optic backbone providers, whose links carry tens or hundreds of gigabits per second. At this point evening traffic in North America was just starting to pick up, since I live in the Eastern time zone and work a normal 9-5 schedule. What I saw showed that something was definitely wrong and it wasn’t my computer or my ISP.
@Level3 seems to be having issues.. I wonder if that's what's wrong with my #ping to #google and #facebook pic.twitter.com/mAY5t7pJ6c
— igmrlm (@igmrlm) March 31, 2014
Packet loss... @Level3 pic.twitter.com/QdSHM0HKB5
— igmrlm (@igmrlm) March 31, 2014
Then Netflix went down and there wasn’t much else to do but figure out what was going wrong. I also decided to pick my own hashtag so I could group all my tweets more easily in the future.
Now #netflix is down too.. yay xd #northamericanfiberopticproblems pic.twitter.com/CsPIye6sBk
— igmrlm (@igmrlm) March 31, 2014
Right away you can see that the latency is starting to yellow-line and the packet loss has significantly red-lined. This usually means something has crashed or broken, or a cable has been cut, or it could even be a symptom of a DDoS attack, but how do you figure this out for sure when you don’t work for any of those companies?
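As a rough illustration of how a dashboard like this could arrive at its colours, the sketch below computes packet loss and average latency from a batch of probe results and then flags them. The function names, the sample numbers, and the thresholds (5% loss, 100 ms) are assumptions made up for illustration; they are not the values internethealthreport.com actually uses.

```python
from statistics import mean


def summarize_probes(rtts_ms):
    """Return (loss_percent, average_rtt_ms) for a list of probe results.

    Each entry is a round-trip time in milliseconds, or None for a probe
    that never got a reply.
    """
    if not rtts_ms:
        raise ValueError("need at least one probe")
    answered = [rtt for rtt in rtts_ms if rtt is not None]
    loss_percent = 100.0 * (len(rtts_ms) - len(answered)) / len(rtts_ms)
    average_rtt = mean(answered) if answered else None
    return loss_percent, average_rtt


def classify(loss_percent, average_rtt, loss_red=5.0, rtt_yellow=100.0):
    """Map the numbers to a colour; the thresholds are illustrative assumptions."""
    if average_rtt is None or loss_percent >= loss_red:
        return "red"
    if average_rtt >= rtt_yellow:
        return "yellow"
    return "green"


# Hypothetical sample: 10 probes, 3 of them lost, the rest slower than usual.
sample = [110.0, None, 125.0, 118.0, None, 130.0, 122.0, None, 115.0, 120.0]
loss, avg = summarize_probes(sample)
print(f"loss={loss:.0f}% avg={avg:.0f} ms -> {classify(loss, avg)}")
```

Loss is checked before latency here because a link that drops packets is in worse shape than one that is merely slow, which matches the red-before-yellow reading of the graphs.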
Before long, to my surprise, Netflix customer service tweeted back with a troubleshooting guide. (Yeah, I’m a bit new to Twitter.. leave me alone.) Now I felt obligated to tell Netflix that it wasn’t their fault, since they were so nice to try to help, and that it was just an IP transit issue or something.
@Netflixhelps its not your fault, level 3 in Toronto is loosing a lot of packets and timing out partially #northamericanfiberopticproblems
— igmrlm (@igmrlm) March 31, 2014
I went back to check the ping times and saw that another network had started dropping packets from Level 3.
@Level3 Wow it actually got worse #savvis #level3 #cogent #sprint #sbc #verizon #xo pic.twitter.com/idOgjpR7j3
— igmrlm (@igmrlm) March 31, 2014
I started searching for tools that had error reports for Canada to see if it was localized and found one, http://canadianoutages.com/, that showed a spike across every big Internet name I knew of, such as Facebook, YouTube, PSN, Google and Xbox Live, and more besides. At the time of this writing one can still see the spike in outages and issues across the board on that site.
@Netflixhelps @TekSavvyCSR @RingCentral pic.twitter.com/uJ1G3M5xI0
— igmrlm (@igmrlm) April 1, 2014
I wanted to make a call to tell my boss we might be having issues tomorrow, but when I tried to load my app on my phone it was unable to sync with the network. I tried calling my own number from a landline and it rang through, but the service still might not have been working properly, so I decided to use the landline instead.
I thought perhaps Bell Canada's Internet tech department might be able to tell me whether they had been notified of any fibre cuts or lines going down, but I got the usual "if you can't give me an account or anything I'm not allowed to tell you anything", which in this case I guess is good business practice but annoying nonetheless. I tried calling TekSavvy after not finding much on their website (nothing had been posted yet), but hung up while on hold when I realized they must be swamped with calls from actual customers and I shouldn't bother them. I decided I'd try more tweeting.
It looks like @RingCentral might be down too #northamericanfiberopticproblems
— igmrlm (@igmrlm) March 31, 2014
I washed a few dishes, then went back to Twitter and noticed RingCentral had replied. I let them know, too, that something was wrong with a main peering hub.
@igmrlm if the problem happens again please let us know and we'll be sure to look into it
— RingCentral (@RingCentral) March 31, 2014
@RingCentral heads up if you get any support calls, level 3 is loosing packets to nearly all their peering hubs
— igmrlm (@igmrlm) March 31, 2014
This includes #netflix #bell #rogers #teksavvy #cogeco #yahoo #psn #shaw #xboxlivedown and many others #northamericanfiberopticproblems
— igmrlm (@igmrlm) March 31, 2014
See http://t.co/YWWK3w7sb6 http://t.co/mg1ao4PqFK who knows how long it will last, probably only a few hrs #northamericanfiberopticproblems
— igmrlm (@igmrlm) March 31, 2014
TekSavvy must have noticed the hashtag usage somehow and replied.
@igmrlm I'd be glad to look in to that for you! Please follow and DM with your account info so I can better assist. JD
— TekSavvy Assistance (@TekSavvyCSR) March 31, 2014
I sent them a message and they referred me to a forum thread they had just started as a staging ground for information as it developed. At this point it looked more like a fibre cut than anything else.
http://www.dslreports.com/forum/r29144871-Slow-Service
At the time of the message there was one reply; at the time of writing there were 13 pages.
Not long after this, the CEO of TekSavvy posted on the thread and confirmed that it was indeed a fibre cable cut, from Hurricane Electric, affecting 100 gigabits of fibre: effectively half of their upstream network and that of every other ISP in Ontario and Quebec, with the knock-on congestion spreading beyond that across the whole North American grid.
Marc, the CEO, then posted a link to the details of exactly what happened:
Incident: Beauharnois, in progress
We have a fiber cut between Newark < > Beauharnois.
We contacted LEVEL3.
http://status.ovh.net/?do=details&id=6629
http://status.ovh.net/?do=details&id=6632
And that's one of my favourite ways to spend the evening. I talked to a dozen different people in half a dozen different countries and have some new tools to use in the future. Was great fun :D
**update
Tuesday, 01 April 2014, 09:43AM
The provider has reported that the fault has been located, and was determined to have been caused by road construction. It appears that the 120-count fiber was damaged by a new post being installed during the repairs to a highway guardrail. Splice crews remain en route to the area; an estimated arrival time has not yet been provided.
*note
I am a self-taught enthusiast in this subject; I know only as much about OC-192 and 40GigE as I have read on my own. I do hold a CompTIA A+ and work in the IT industry, but I'm not a professional expert on fibre transit lines by any means. Please feel free to submit constructive criticism, I'm always looking to learn more.
and it's affecting multiple providers and services
- Level3 NWK / BHS (10x10G)
- Level3 NWK / MTL (2x10G)
- Hibernia NWK / MTL (2x10G)
- Telia NWK / CHI (6x10G)
- Telia NWK / PAL (2x10G)
- Level3 Paris / New York / BHS (4x10G)
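
To get a feel for the scale, the link list above can be tallied. Each entry is written as a count times 10G, so the sketch below parses that notation and sums the nominal capacity. The table in the code simply restates the list, and the helper name `capacity_gbps` is hypothetical. The total is only the nominal capacity of the listed links, not a measurement of how much traffic was actually lost.

```python
import re

# Restated from the list above: (link name, "<count>x<speed>G").
LINKS = [
    ("Level3 NWK / BHS", "10x10G"),
    ("Level3 NWK / MTL", "2x10G"),
    ("Hibernia NWK / MTL", "2x10G"),
    ("Telia NWK / CHI", "6x10G"),
    ("Telia NWK / PAL", "2x10G"),
    ("Level3 Paris / New York / BHS", "4x10G"),
]


def capacity_gbps(spec):
    """Turn a spec such as '10x10G' into its nominal capacity in Gbit/s."""
    match = re.fullmatch(r"(\d+)x(\d+)G", spec)
    if match is None:
        raise ValueError(f"unrecognised link spec: {spec!r}")
    count, per_link = map(int, match.groups())
    return count * per_link


total = sum(capacity_gbps(spec) for _, spec in LINKS)
for name, spec in LINKS:
    print(f"{name}: {capacity_gbps(spec)} Gbit/s")
print(f"Total nominal capacity: {total} Gbit/s")
```

The first entry works out to 100 Gbit/s, which lines up with the 100 gigabits mentioned in the forum thread.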