Just as good personal hygiene is a prime contributor to personal health, good network hygiene is a major contributor to overall network health. Pete Welcher introduced me to the concept, and I’ve found it referenced in a RIPE draft document titled Network Hygiene Pays Off, which talks about source address filtering. (I also talked about source address filtering a short time ago in Anti-spoofing filters.) Pete’s view of network hygiene encompasses much more than just source address filtering.
The principle is simple. Keep the network clean and it will have fewer problems. When problems do occur, they will be easier to spot and simpler to fix because there will be fewer negative interactions between network subsystems. If you don’t keep things clean, interactions between otherwise minor problems can create a larger problem.
What’s an example? Let’s say that you have a network that contains some duplex mismatches between network gear. The links are all lightly utilized, and since no one is complaining, you decide that they don’t need to be fixed. Some time later, a network link or device dies and traffic shifts, so that one of the duplex-mismatched links is now carrying traffic at 30% of its rated capacity. The error counts on that interface begin going up. You now have two problems to troubleshoot and correct: the duplex mismatch on the link carrying the current load, and the original failure that caused the traffic to shift.
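A check like this is easy to automate once you have the duplex setting from both ends of each link. Here is a minimal sketch in Python, assuming you have already collected the operational duplex for each link end (for example from SNMP or CLI output). The `LinkEnd` record, the device names, and the sample data are all hypothetical, made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinkEnd:
    device: str      # hypothetical device name
    interface: str   # interface name on that device
    duplex: str      # operational duplex reported by the device: "full" or "half"


def find_duplex_mismatches(links):
    """Return the links whose two ends report different duplex settings."""
    return [(a, b) for a, b in links if a.duplex != b.duplex]


# Sample inventory (made-up data): each entry is the two ends of one link.
links = [
    (LinkEnd("sw1", "Gi0/1", "full"), LinkEnd("rtr1", "Gi0/0", "full")),
    (LinkEnd("sw1", "Gi0/2", "full"), LinkEnd("rtr2", "Gi0/0", "half")),
]

for a, b in find_duplex_mismatches(links):
    print(f"{a.device} {a.interface} ({a.duplex}) <-> "
          f"{b.device} {b.interface} ({b.duplex})")
```

The point is that the mismatch shows up in the data long before the link carries enough traffic for anyone to complain.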
Most network managers don’t have time to go looking for network problems that aren’t causing an outage. But it is important to run a clean network so that you don’t have multiple problems to solve when the big outage strikes. It is also useful to have a baseline of how the network was performing prior to the big outage, so that you know what it used to look like and can use that information to help identify where the problem occurred and how to correct it.
I like to start with simple things, like brushing your teeth. Oh, that’s right, we were talking about networking… Simple things like finding and fixing duplex mismatches, correcting root bridge placement in spanning trees, and setting unused router ports and switch trunk ports to an administratively down state (so that when I see a down router port, I know it is a problem, even in a redundant network). I look for HSRP/VRRP/GLBP groups where only one device is participating and so there is no real redundancy, native VLAN mismatches, weak usernames and passwords, lack of network device security, and missing source address filtering. These are just a few of the things that, when correctly configured, constitute good network hygiene. A list of 25 of them appears in the poster that I developed at Netcordia, The Top 25 Network Problems and Their Business Impact.
If you look at the list, you’ll see that some of these items require configuration checks (think configuration policy) and others require comparing settings between multiple devices. How do I go about checking all of these things? Do I have the luxury of a lot of free time? No. I use automated tools.
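To give a feel for the "comparing settings between multiple devices" kind of check, here is a rough Python sketch that finds first-hop redundancy groups (HSRP, VRRP, or GLBP) with only one participating device. It assumes you have already extracted one record per device, VLAN, and group number from the configs. The record format and the sample data are my own invention for illustration, not the output of any particular tool.

```python
from collections import defaultdict


def find_single_member_groups(records):
    """Return (vlan, group) pairs that have fewer than two member devices.

    Each record is a (device, vlan, group) tuple, a hypothetical format.
    """
    members = defaultdict(set)
    for device, vlan, group in records:
        members[(vlan, group)].add(device)
    return sorted(key for key, devices in members.items() if len(devices) < 2)


# Made-up sample: VLAN 10 has two routers in group 1, VLAN 20 has only one.
records = [
    ("rtr1", 10, 1),
    ("rtr2", 10, 1),
    ("rtr1", 20, 2),
]

print(find_single_member_groups(records))  # prints [(20, 2)]
```

No single device config can tell you this. The problem only becomes visible when you line up the settings from every device on the VLAN.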
I’ve been looking at a variety of tools recently and still favor NetMRI because of its automated network discovery and its automatic checking of a lot of basic network settings. With its Network Configuration Policy capability, I can write simple rules that identify configs that don’t match the network’s intended configuration. Even better, I get a list of exceptions that need to be addressed, kind of a “no news is good news” view of the network. When it reports something, that is something to investigate. At one site, it found a routing loop that no one had identified: it reported high numbers of TTL Exceeded messages being sourced by a router.
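The idea of a configuration policy that reports only exceptions can be sketched in a few lines of Python. To be clear, this is not NetMRI’s rule syntax. It is a generic illustration that assumes each device config is available as plain text, and the rule names, rule format, and sample configs are hypothetical. The two config lines being checked are common Cisco IOS-style commands chosen as examples.

```python
import re

# Each rule: (description, "require" or "forbid", regex matched per config line).
RULES = [
    ("password encryption enabled", "require", r"^service password-encryption$"),
    ("HTTP server disabled", "forbid", r"^ip http server$"),
]


def check_config(device, config_text, rules=RULES):
    """Return a list of policy exceptions for one device (empty if compliant)."""
    exceptions = []
    for name, mode, pattern in rules:
        found = re.search(pattern, config_text, flags=re.MULTILINE) is not None
        if (mode == "require" and not found) or (mode == "forbid" and found):
            exceptions.append(f"{device}: {name}")
    return exceptions


# Made-up sample configs for two devices.
configs = {
    "rtr1": "hostname rtr1\nservice password-encryption\nno ip http server\n",
    "rtr2": "hostname rtr2\nip http server\n",
}

for device, text in configs.items():
    for line in check_config(device, text):
        print(line)
```

A compliant device produces no output at all, which is the “no news is good news” view: anything that prints is something to investigate.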
I’m interested in your list of Network Hygiene items. Please leave them in a comment so we can all benefit. Thanks!
Re-posted with Permission
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html