Two things happened recently that reinforced my preference for network management appliances. The first involved database configuration and maintenance. The second involved configuring a system for efficiency.
I’m installing and configuring a popular NMS software product. It relies on an external DB that’s not included with the product. The installation went well and the system ran for a few weeks, then it started reporting DB errors. Upon investigation, I found that the DB was “full.” What’s up with this? It was supposed to be an installation suitable for a reasonably large network (we have over 600 devices to manage in the core and around 4000 devices overall).
It turns out that the DB administrator had installed the “toy” version of the DB. The toy version is normally used for training and prototyping, and it is limited to 4GB of storage space. In addition, he had not set up regular DB maintenance procedures and processes. Simple things like backups and DB compression were not being performed.
You might make the case that I should be checking this stuff. But I’m interested in managing the network, not managing a DB. I’m spending my time getting the system customized to report useful and actionable things about the network’s operation. (There’s another post topic: spending lots of time customizing the system to perform basic tasks.) I should be working to improve the network and make it more stable and efficient, not performing DB administrator functions. An appliance model would have all the DB configurations and processes predefined and enabled, or at least have them as part of the initial configuration wizard. In fact, the wizard should have reported that I had a large system configured to use the “toy” database. For a given installation size, the system can determine what maintenance needs to be done and make recommendations or enable what’s needed.
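The kind of check I’d want from a configuration wizard can be sketched in a few lines. This is only an illustration, not any vendor’s actual logic: the `Installation` fields, the `preflight_warnings` function, and especially the storage-per-device figure are assumptions made up for the example.

```python
from dataclasses import dataclass

# Hypothetical sizing figure, for illustration only. A real product would
# derive this from its own polling intervals and data retention settings.
ASSUMED_GB_PER_DEVICE = 0.01


@dataclass
class Installation:
    managed_devices: int
    db_capacity_gb: float
    backups_scheduled: bool
    compression_scheduled: bool


def preflight_warnings(inst: Installation) -> list[str]:
    """Return human-readable warnings a setup wizard could show."""
    warnings = []
    estimated_gb = inst.managed_devices * ASSUMED_GB_PER_DEVICE
    if estimated_gb > inst.db_capacity_gb:
        warnings.append(
            f"Estimated storage need ({estimated_gb:.0f} GB) exceeds the "
            f"DB capacity ({inst.db_capacity_gb:.0f} GB)."
        )
    if not inst.backups_scheduled:
        warnings.append("No DB backup job is scheduled.")
    if not inst.compression_scheduled:
        warnings.append("No DB compression job is scheduled.")
    return warnings


if __name__ == "__main__":
    # Roughly my situation: about 4000 devices on a DB capped at 4GB.
    mine = Installation(4000, 4.0, backups_scheduled=False,
                        compression_scheduled=False)
    for warning in preflight_warnings(mine):
        print("WARNING:", warning)
```

Under those assumed numbers, my installation would have triggered all three warnings on day one instead of failing weeks later.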
One of my disappointments with traditional NMS deployments is the need for a cadre of support staff to baby-sit them. That just increases the cost of the system and its time to implement, and reduces its value. Appliances win big here because I can just install the hardware and spend a few minutes on the initial configuration, and then it’s learning the network and reporting problems to me.
The second event was a discussion in a different product’s forum with someone whose system wasn’t able to get SNMP data from a specific device. It turned out that the number of SNMP OIDs in each request caused more data to be returned than would fit into a reply packet, so SNMP returned a failure (see RFC1157, section 4.1.2, The GetRequest-PDU). I suggested that the product should automatically determine the optimum number of OIDs to request for the device and remember it. But this person wanted control of the number of OIDs in each request. Wow. Why would anyone want that level of control, unless they had lots of time on their hands? With over 600 network devices on my network, I don’t have time for that sort of busy-work. I want the SNMP data collection process to run as efficiently as possible, so that I get the greatest value from the NMS. I don’t want the SNMP process to break because I needed to lengthen a few interface descriptions and suddenly the return data doesn’t fit into one reply packet and the system is no longer able to collect important data on one or more interfaces.
A good NMS would adapt to changes in the network operation that may cause failures in SNMP. By using the largest supported packet, it reduces network overhead on the network device, the network links, and the NMS. I want my NMS to do the right thing and let me know when the network isn’t working correctly. I don’t want the NMS breaking because something insignificant changed.
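Here is a minimal sketch of the kind of adaptation I have in mind. It assumes a transport function, `snmp_get(device, oids)`, that returns a dict of results and raises a `TooBigError` when the agent reports that the reply won’t fit. Both names, and the `AdaptiveCollector` class, are hypothetical and are not taken from any real SNMP library or product.

```python
class TooBigError(Exception):
    """Raised by the (hypothetical) transport when the reply won't fit."""


class AdaptiveCollector:
    """Learns, per device, how many OIDs fit in one request."""

    def __init__(self, snmp_get, initial_batch=32):
        self._snmp_get = snmp_get        # callable(device, oids) -> dict
        self._initial_batch = initial_batch
        self._batch_size = {}            # device -> learned batch size

    def collect(self, device, oids):
        results = {}
        size = self._batch_size.get(device, self._initial_batch)
        i = 0
        while i < len(oids):
            chunk = oids[i:i + size]
            try:
                results.update(self._snmp_get(device, chunk))
            except TooBigError:
                if len(chunk) == 1:
                    raise                # even a single OID does not fit
                # Halve the request and retry the same position.
                size = max(1, len(chunk) // 2)
                continue
            i += len(chunk)
        # Remember what worked so the next poll starts from there.
        self._batch_size[device] = size
        return results
```

This version only shrinks the batch size and never probes upward again, so it won’t always settle on the largest request a device supports. A real implementation would want to retry larger requests occasionally. The point is that the NMS does the tuning and remembers the result, not me.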
I typically find that network management appliances have a higher level of automation, because that’s what is expected from them. A software-only product could implement these features just as well, but I’ve found that few vendors make their systems that resilient. Perhaps it is because the vendors have big professional services groups that make a lot of money deploying their products. Or maybe they are skimping on development in that area so that they can focus on other areas. But when they make their products difficult to install and use, they are hurting their customers and themselves.
-Terry
_____________________________________________________________________________________________
Re-posted with Permission
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html