Freaky Network Outages

Author
Terry Slattery
Principal Architect

One of our customers recently had a network outage for a very unusual reason: someone burned a mattress near a fiber run, and the heat melted the fiber.

That’s a new one!  This incident started me wondering about strange cases of network outages.  A common case is cutting the wrong cable, as can happen during a renovation.  I’m aware of a 144-fiber bundle that went through the ceiling of a room that was being renovated.  Of course, the overzealous “deconstruction” people cut out every cable in the room, including the big orange one.  Oops!

Everyone is familiar with the infamous backhoe.  There is also the impact driver that drives the roadside guard-rail supports into the ground.  In the Maryland area many years ago, a major fiber run from Annapolis to Baltimore was cut by a road crew.  The utility people had checked that the guard-rail wouldn’t hit the fiber, but in a moment of thoughtlessness, the road crew decided to run the guard-rail a bit further.  Maybe they had an extra length of guard-rail on the truck and they didn’t want to take it back to the shop.

Of course, there’s the cable that’s snagged while working on equipment.  A recent one involved a fiber termination tray that’s mounted using an all-thread (a rod that’s threaded along its entire length).  In the fiber termination tray, the fibers have very little to protect them, typically a small buffer tube.  As one of the workers was reassembling the tray, a fiber snagged on the all-thread rod, which neatly cut it.  I suggested that the trays be outfitted with plastic tubing over the all-thread rods, except where the nuts screw on, in order to reduce the possibility of a recurrence.

Another incident involved firearms.  A fiber cut was detected (loss of signal), and upon investigation, the work crew discovered that someone must have been refining their sharpshooting skills.  The cable had a bullet embedded in it.  I’m glad that I don’t have to work in that neighborhood!

I heard about another interesting incident just last week.  A SONET ring was partitioned.  The ring monitoring system, which tracks link status, showed a very interesting signature.  The link was running fine when it suddenly experienced a burst of errors, then recovered.  About a second later the signal was lost.  We can only speculate about the exact cause, but that signature led us to ponder the work scenario that might have unfolded.  “Joe, this cable is stuck!  Ugh! There! It’s free now.  It must have been stuck on something.”  The first tug caused the burst of errors.  The signal recovered as the pressure on the cable relaxed.  Then the cable snapped when the workman pulled with a lot more force.  We don’t know if this is really what happened, but it is plausible, based on the reported signal variance.
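For anyone who wants to look for that kind of signature in their own monitoring data, here is a minimal sketch in Python.  It is not the ring monitoring system’s logic.  The LinkSample record, the find_tug_signature function, and the burst_threshold and max_gap values are all hypothetical names and numbers that I made up for illustration.  It simply looks for a burst of errors, a clean recovery, and then loss of signal a short time later.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LinkSample:
    """One polling interval from a (hypothetical) link monitor."""
    t: float       # seconds since the start of the capture
    errors: int    # errors counted during this interval
    signal: bool   # False once loss of signal is reported


def find_tug_signature(
    samples: List[LinkSample],
    burst_threshold: int = 100,   # assumed: errors per interval that count as a burst
    max_gap: float = 5.0,         # assumed: max seconds between burst and signal loss
) -> Optional[Tuple[float, float]]:
    """Return (burst_time, loss_time) for burst -> recovery -> loss of signal.

    Returns None if the first loss of signal is not preceded by that pattern,
    or if the signal is never lost.
    """
    burst_t: Optional[float] = None
    recovered = False
    for s in samples:
        if not s.signal:
            # First loss of signal: decide whether the pattern came before it.
            if burst_t is not None and recovered and s.t - burst_t <= max_gap:
                return (burst_t, s.t)
            return None
        if s.errors >= burst_threshold:
            burst_t = s.t          # remember the most recent burst
            recovered = False
        elif burst_t is not None and s.errors == 0:
            recovered = True       # link ran clean again after the burst
    return None


if __name__ == "__main__":
    capture = [
        LinkSample(0.0, 0, True),
        LinkSample(0.5, 0, True),
        LinkSample(1.0, 2500, True),   # the first tug
        LinkSample(1.5, 0, True),      # pressure relaxed
        LinkSample(2.0, 0, False),     # the cable snapped
    ]
    print(find_tug_signature(capture))
```

A clean cut, such as the backhoe case, would go straight from error-free to loss of signal, and this sketch would return None for it.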

Back in the days of the old multiport transceiver and thin Ethernet, I had a customer with an interesting network failure.  The network would experience regular failures, most often on sunny summer afternoons.  After a few days of investigating, asking questions, and partitioning the network to isolate cable segments, I was able to track it down.  A number of the multiport transceivers had cheap terminators with tinned center conductors.  The building was near a large body of salt water, and in the summer the warm, humid conditions added a little corrosion to the tinned connectors.  As the rooms that contained the transceivers with these terminators warmed up in the afternoon, the small amount of corrosion caused the resistance to increase slightly.  It didn’t take much of a resistance increase to cause reflections that made the multiport transceiver think that a collision had occurred.  The default operation is to propagate the collision to all other ports, so the network essentially died when a terminator’s resistance increased beyond a small threshold above the nominal value.  I was able to verify it by making my own terminator with a variable resistor and determining the value where the network would fail.  Replacing the cheap terminators with gold-plated terminators solved the problem.  This is a case where the gold-plated solution was the right one.
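To put rough numbers on the terminator story, here is a small Python sketch of the standard reflection coefficient for a resistive termination, (R - Z0) / (R + Z0), using the 50 ohm nominal impedance of thin Ethernet coax.  The first_failing_resistance helper mimics the variable-resistor experiment, but the max_reflection limit it takes is purely an assumption for illustration.  I did not record the actual resistance at which the transceivers failed, and the function names are my own.

```python
from typing import Iterable, Optional

Z0_OHMS = 50.0  # nominal impedance of thin Ethernet (10BASE2) coax


def reflection_coefficient(r_term: float, z0: float = Z0_OHMS) -> float:
    """Fraction of the signal voltage reflected by a resistive terminator."""
    return (r_term - z0) / (r_term + z0)


def first_failing_resistance(
    candidates: Iterable[float],
    max_reflection: float,  # assumed tolerance; not a measured or specified value
) -> Optional[float]:
    """Step through resistances in ascending order, like turning a variable
    resistor, and return the first whose reflection exceeds max_reflection."""
    for r in sorted(candidates):
        if abs(reflection_coefficient(r)) > max_reflection:
            return r
    return None


if __name__ == "__main__":
    for r in (50.0, 52.0, 55.0, 60.0, 70.0):
        print(f"{r:5.1f} ohms -> reflection {reflection_coefficient(r):+.3f}")
    # 0.05 is a made-up limit used only to exercise the function.
    print(first_failing_resistance([50.0, 52.0, 55.0, 60.0, 70.0], 0.05))
```

A perfectly matched 50 ohm terminator reflects nothing, and the reflected fraction grows steadily as corrosion pushes the resistance upward, which fits a network that failed only after the rooms warmed up.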

Once the fiber is cut, it has to be repaired.  An interesting story I heard a long time back was about a fiber repair crew working in the Northeast in the winter to fix an outdoor cable.  It was a very, very cold night, so they retreated to the warmth of the truck to prepare the ends of the fiber.  The armored underground fiber is very difficult to bend, so the trick is to run one end in through one truck window and the other end in through the other window, roll the windows up, turn up the heater, and prep the fiber.  Just remember to go back outside to finish the job.  Well, wouldn’t you know it, this crew forgot to go back outside when they were ready to splice.  They spliced the fiber, the link came back up, and then they looked up and realized that the fiber was running through their truck.  Talk about people who are intent on their job!  I heard that the link was valuable enough that they cut open the truck to free the fiber instead of cutting and re-splicing the fiber.  I sure would hate to have to make that call to my boss.  “Boss, the fiber’s fixed, but the truck needs some body work.”

Do you have any interesting causes of network failures?  Please share!

-Terry

_____________________________________________________________________________________________

Re-posted with Permission 

NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html
