Most network management tools provide a way to group devices and interfaces based on a variety of information, including descriptions, addresses, and names. I’ve been thinking about how to use device and interface descriptions to drive these automatic grouping mechanisms and make networks easier to manage.
Most networks have a core whose key links need more frequent monitoring, along with real-time reporting when they experience problems. My initial thought was to add a short descriptive tag to each interface description, like the word “core”. But I found that I also wanted to label the uplinks from distribution devices to the core devices. What tag should they have? And what about tags for uplinks from access switches to distribution switches? I found that one word wasn’t sufficient for the level of tagging that I needed.
What I wound up using was “TAG:core-core”, “TAG:core-dist”, “TAG:core-gw”, and “TAG:dist-acc”. The prefix “TAG:” allows various management systems to filter on the tagging I added instead of triggering on some other text in the description. I didn’t differentiate between the ends of a link, preferring to keep the tags consistent on both ends. The tags are created with the highest-ranking device first, thus “core-dist” and “dist-acc”. The network management systems were then configured to perform more frequent data collection on any interface whose description contained “TAG:core”. Interface down events, performance threshold exceptions, and error threshold exceptions can then be reported quickly when they are detected on any interface in this group. Even better, the tag allows classification of the alert that is generated when a problem is detected. An interface down event where the tag is “TAG:dist-acc” may generate an Error severity alert, while one on an interface with “TAG:core-core” would generate a Critical severity alert.
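As a rough illustration of the filtering and alert classification described above, here is a minimal Python sketch. The function names, the regular expression, and the "Warning" default severity are my own assumptions for illustration; only the two tag-to-severity pairs come from the scheme described here, and a real NMS would express this in its own grouping and alerting rules.

```python
import re

# Matches the "TAG:" prefix followed by a tag such as core-core or dist-acc.
TAG_PATTERN = re.compile(r"TAG:([A-Za-z0-9-]+)")

# Only these two pairs are taken from the article; other tags fall back
# to a default severity, which is an assumption of this sketch.
SEVERITY_BY_TAG = {
    "core-core": "Critical",
    "dist-acc": "Error",
}


def extract_tag(description):
    """Return the tag text after 'TAG:' in a description, or None."""
    match = TAG_PATTERN.search(description)
    return match.group(1) if match else None


def is_core_link(description):
    """True when the tag starts with 'core' (core-core, core-dist, core-gw)."""
    tag = extract_tag(description)
    return tag is not None and tag.startswith("core")


def down_event_severity(description, default="Warning"):
    """Classify an interface down event by the tag in its description."""
    return SEVERITY_BY_TAG.get(extract_tag(description), default)
```

For example, `is_core_link("Uplink to dist-sw1 TAG:core-dist")` selects the interface for more frequent polling, while a description containing the bare word "core" without the “TAG:” prefix is ignored.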
Other tags would identify other important ports that may need special monitoring, reporting, or treatment, such as server ports (TAG:srvr). I don’t yet see any need to tag general edge ports, although it might be useful to add functional tags such as “TAG:dhcp” or “TAG:staticip” to identify whether a port should be configured for DHCP (i.e., with an IP helper address) or statically configured. Another benefit of interface tagging is that interface configurations can be automated. Uplinks can be configured to trust QoS with appropriate policing to handle oversubscription. Edge ports can be configured to classify and mark traffic, based on the type of port. For example, a hospital may tag life-critical interfaces so that appropriate QoS policies can be applied. A manufacturing operation may tag interfaces that connect process control or CNC machining equipment to the network.
Tags on interfaces that are to be configured for Netflow or for IP SLA endpoints may also be useful for automating the configuration of those network management tools. A tag that identifies IP SLA sources and destinations could be used to automatically create sets of tests from all sources to all destinations, saving a lot of time in maintaining the configurations.
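To make the IP SLA idea concrete, here is a small Python sketch that builds a full set of source-to-destination tests from tagged interfaces. The tag names “TAG:sla-src” and “TAG:sla-dst”, the function name, and the input format (a mapping of device and interface name to description) are all hypothetical; this only generates the list of test pairs and does not configure anything.

```python
from itertools import product


def build_sla_tests(interfaces):
    """Pair every tagged source with every tagged destination.

    interfaces maps (device, interface) tuples to description strings.
    The sla-src and sla-dst tag names are assumptions of this sketch.
    """
    sources = sorted(key for key, desc in interfaces.items()
                     if "TAG:sla-src" in desc)
    destinations = sorted(key for key, desc in interfaces.items()
                          if "TAG:sla-dst" in desc)
    # Skip pairs where the source and destination are the same device.
    return [(src, dst) for src, dst in product(sources, destinations)
            if src[0] != dst[0]]
```

Adding a new destination then only requires tagging its interface; the next run of the script picks it up and creates a test from every source.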
A similar tagging methodology could be applied to network infrastructure devices. Clearly, a device that contains “TAG:core” in its sysLocation string (in Cisco, it is the “snmp location” command) would be important. But what about the importance of an edge device that is providing communications for a critical device, such as a life-critical monitoring system in a hospital or a key process controller in a manufacturing operation? These devices may need a tag like “TAG:life-crit” or “TAG:proc-ctl”. When these devices become unreachable or crash, the network team needs to be alerted immediately.
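The device-level check could look something like the following Python sketch, which reads tags out of a sysLocation string and decides whether an unreachable device deserves an immediate alert. The set of urgent tags uses the three device tags mentioned above; the function names and the idea of allowing several tags in one string are assumptions for illustration.

```python
import re

# Device tags from the article that should trigger an immediate alert.
URGENT_DEVICE_TAGS = {"core", "life-crit", "proc-ctl"}


def device_tags(sys_location):
    """Return the set of tags found in a sysLocation string."""
    return set(re.findall(r"TAG:([A-Za-z0-9-]+)", sys_location))


def needs_immediate_alert(sys_location):
    """True when the device carries at least one urgent tag."""
    return bool(device_tags(sys_location) & URGENT_DEVICE_TAGS)
```

With this, `needs_immediate_alert("Bldg 4, ICU closet 2 TAG:life-crit")` is true, while a location string with no tag falls through to normal alert handling.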
[Side note: I’ve heard that if the manufacturing of candy like Twizzlers® is interrupted for more than 5 minutes, the material hardens in the plumbing. The fix is to replace the plumbing, which is expensive and time-consuming. The point of this side note is that communication problems in manufacturing lines like this can have serious results.]
Since the tags are mostly useful for automated processes, I added them to the end of the existing description fields. So when the descriptions are shown in the network management systems, the descriptive portion that is most useful to people is what is shown; the tag is often truncated.
I’ve been told that the descriptions are of limited length and that the tags need to be very short. So far this has not been a problem in operation, but I’ve just started using this mechanism, so time will tell. Obviously, it is possible to create much shorter tags. Something like “T:c-c” would also suffice and still be human readable. It is even possible to implement a single letter lookup table, but it doesn’t have the advantage of being easily human readable.
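If description length does turn out to be a problem, one option is to generate the descriptions with a script that always keeps the tag intact at the end. The Python sketch below trims the human-readable text, not the tag, when the combined string would be too long. The function name and the max_length default of 200 are placeholders of mine, not a limit from any particular platform; check the actual limit for your devices.

```python
def append_tag(description, tag, max_length=200):
    """Append ' TAG:<tag>' to a description without exceeding max_length.

    max_length is a placeholder value; the real limit depends on the
    platform. The descriptive text is trimmed so the tag always fits.
    """
    suffix = f" TAG:{tag}"
    if len(suffix) > max_length:
        raise ValueError("tag does not fit within max_length")
    room = max_length - len(suffix)
    return description.rstrip()[:room].rstrip() + suffix
```

Trimming the descriptive text is a design choice: it favors the automated processes over the people reading the description, so a shorter tag such as “T:c-c” may be the better answer when space is tight.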
I’m looking forward to using interface and device tags to help automate many configuration tasks and prioritize alerts. If you use similar methodologies now, or in the future, please let me know how they work for you and whether you’ve encountered problems with them.
Re-posted with Permission
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html