I’m a proponent of network management automation and recently found a related blog post, Why End Users Love BPM, by my long-time friend Scott Menter, who works at BP Logix, a company that makes products for automating business processes. Scott’s post discusses what happens when people are forced to run errands for their computers, as described by Arno Penzias, a Nobel Laureate and former Vice President and Chief Scientist of Bell Laboratories, in Technology in People Services: Research, Theory, and Applications, by Macros Leiderman.
I find that most network management systems suffer from a similar failing: they require the NMS administrator to spend a lot of time on tasks that the management system is better equipped to perform. Many NMS setup tasks add very little value to monitoring and managing a network. I should be able to tell the NMS to discover the devices in a part of the network and have it run the discovery, automatically identify the devices, and populate its list of managed devices. I’ve used platforms that require me to enter the specific IP address of each device to be monitored and managed. I once tried the automatic discovery mechanism on one of them, and it found a single 6500 by 30 different interfaces. Finding the device wasn’t bad, but the NMS wasn’t smart enough to figure out that all 30 interfaces were on the same device, or to determine which interface it should use to monitor the system and all of its interfaces.
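The collapsing step described above can be sketched in a few lines of Python. This is only an illustration, not how any particular NMS works: it assumes discovery has already produced (IP address, device identifier) pairs, where the identifier is something unique per chassis such as a serial number. The function names and the "lowest address wins" fallback are hypothetical choices made for the example.

```python
from collections import defaultdict
from ipaddress import ip_address


def group_by_device(discovered):
    """Collapse (ip, device_id) pairs into one entry per device.

    device_id is an assumed unique identifier (for example a chassis
    serial number) that the discovery process collected for each address.
    """
    devices = defaultdict(set)
    for ip, device_id in discovered:
        devices[device_id].add(ip_address(ip))
    # Sort each device's addresses so the result is deterministic.
    return {
        dev: sorted(ips, key=lambda a: (a.version, int(a)))
        for dev, ips in devices.items()
    }


def pick_management_ip(ips, preferred=()):
    """Pick a management address from a device's sorted address list.

    Addresses listed in `preferred` win; otherwise fall back to the
    first (lowest) address. Both rules are placeholder policies.
    """
    for ip in ips:
        if str(ip) in preferred:
            return ip
    return ips[0]
```

With something like this in place, 30 discovered addresses that share one identifier become a single device entry with 30 known interfaces, and the NMS, not the administrator, chooses which one to poll.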
I’ll bet you’re now thinking “But what if the selected interface becomes unreachable?” The NMS has a list of all interfaces and their IP addresses. When it can’t reach the device at the selected address, it should simply try all the other interfaces. The selected interface may be down, the network link to that interface may have failed, or routing to that interface may have failed. In any of these cases, another interface may still be reachable. If no interface is reachable, either the device is down or a network failure has isolated it. That’s the time to report the device as potentially down. My gripe is that many NMS platforms require me to select a specific management interface and then depend on me to perform the above checks by hand. There are many other things that I should be doing to provide more value to the organization.
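The fallback logic above is simple enough to sketch. The example below is a hedged illustration rather than any vendor's implementation: `is_reachable` is a hypothetical probe supplied by the caller (an ICMP ping, an SNMP get, or whatever the NMS already uses), and the status strings are invented for the example.

```python
def check_device(selected_ip, all_ips, is_reachable):
    """Return (status, ip) for one device.

    selected_ip  -- the address normally used to manage the device
    all_ips      -- every known interface address on the device
    is_reachable -- caller-supplied probe: is_reachable(ip) -> bool
    """
    if is_reachable(selected_ip):
        return ("up", selected_ip)
    # The selected address failed; try every other known interface.
    for ip in all_ips:
        if ip != selected_ip and is_reachable(ip):
            return ("reachable-via-alternate", ip)
    # Nothing answered: the device is down or isolated by a network failure.
    return ("potentially-down", None)
```

A device that answers on an alternate address points to an interface, link, or routing problem rather than a dead device, so the two cases deserve different alerts.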
Another example is performing basic analysis of the collected data. Many systems collect a lot of data and make it available for analysis. But the network administrator must perform the analysis. Why can’t the NMS provide a mechanism by which I can write analysis rules that it performs regularly? For example, I want to verify that all Cisco device configurations contain the proper VTY, SNMP, NTP, and TACACS configurations. Or I want to identify all HSRP groups where there is a single router in the group. Why must I do that analysis on the collected data? The NMS should do that analysis and make the results available to me. These results are actionable, meaning that I can take specific actions based on the results.
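To make the idea of analysis rules concrete, here is a minimal sketch of the two checks mentioned above. It assumes the NMS has already collected each device's configuration as text and has already parsed HSRP membership into (segment, group, router) tuples; both inputs, and the function names, are hypothetical. The regular expressions only test that some VTY, SNMP, NTP, and TACACS statement is present; a real rule would check for the specific lines a site's policy requires.

```python
import re

# Placeholder patterns: each only checks that a statement of that kind exists.
REQUIRED = {
    "VTY": re.compile(r"^line vty ", re.M),
    "SNMP": re.compile(r"^snmp-server ", re.M),
    "NTP": re.compile(r"^ntp server ", re.M),
    "TACACS": re.compile(r"^tacacs[- ]server ", re.M),
}


def missing_sections(config_text):
    """Return the names of required sections absent from a configuration."""
    return [name for name, pat in REQUIRED.items() if not pat.search(config_text)]


def single_router_hsrp_groups(members):
    """Return (segment, group) keys that have exactly one participating router.

    members -- iterable of (segment, group, router) tuples, assumed to be
               parsed from collected data elsewhere.
    """
    groups = {}
    for segment, group, router in members:
        groups.setdefault((segment, group), set()).add(router)
    return sorted(key for key, routers in groups.items() if len(routers) == 1)
```

Run on a schedule against every collected configuration, rules like these produce a short list of devices and HSRP groups that need attention, which is exactly the kind of actionable result described above.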
Back to Arno Penzias’ comment about people copying data from one computer to another. I have used NetMRI’s excellent network discovery to export a CSV of known network devices. I then check that list against what other products are monitoring and correct any discrepancies. Done by hand, without using an API, this process can take a few hours. It would be much better to have an automated process that runs daily. All of the above is what I refer to as Network Management Automation. Ask your NMS vendor and their references how much work is required to keep their tools up to date with what’s on the network, and whether you must perform the analysis of the collected data yourself. It will be good when we have tools that no longer make us slaves to their deficiencies.
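The inventory comparison described above is a natural candidate for a daily job. The sketch below assumes each product can export a CSV with a header row and a column of device IP addresses; the column name `ip` and the function names are assumptions for the example, not the actual export format of NetMRI or any other product.

```python
import csv


def load_ips(lines, column="ip"):
    """Read device addresses from CSV lines that include a header row.

    lines  -- an open file or any iterable of CSV text lines
    column -- header of the address column (an assumed name)
    """
    ips = set()
    for row in csv.DictReader(lines):
        ip = (row.get(column) or "").strip()
        if ip:
            ips.add(ip)
    return ips


def compare_inventories(discovered, monitored):
    """Report the discrepancies between two sets of device addresses."""
    return {
        # On the network but missing from the monitoring tool.
        "not_monitored": sorted(discovered - monitored),
        # Still monitored but no longer seen by discovery.
        "stale": sorted(monitored - discovered),
    }
```

Passing two open CSV files through `load_ips` and then `compare_inventories` turns a few hours of manual cross-checking into a report that can be generated every day.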
-Terry
_____________________________________________________________________________________________
Re-posted with Permission
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article, which originally appeared in the Applied Infrastructure blog at http://www.infoblox.com/en/communities/blogs.html