I’ve been doing a number of device configuration changes as a result of the things that NetMRI and syslog are reporting in a customer network. Both NetMRI and syslog have rather long reporting cycles before I know that something was fixed, so I’ve been thinking about a device real-time display as a nice function to have.
NetMRI is not unique in its data collection. Network management systems tend to use a polling mechanism to collect much of the data they display. Collection is periodic, and the interval often depends on how many devices need to be polled and on the performance of the NMS. The end result is that it can take a long time to update a system’s status after a configuration change is made.
Syslog also has a long time delay, but for a different reason. I’ll see messages in syslog, make a change, then I have to watch syslog and wait long enough to satisfy myself that the cause of a given message has been fixed.
In both cases, the NMS doesn’t know that changes are being made to the network and that it should update a given device’s data more quickly than normal. In fact, I’ve never heard of an NMS that adapts its polling interval based on configuration changes that it detects in the network.
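To make the idea of an adaptive polling interval concrete, here is a minimal sketch. It is not taken from NetMRI or any other product; the function name, the 5-second and 300-second intervals, and the 10-minute window are all assumptions chosen for illustration.

```python
def next_poll_interval(seconds_since_change, normal_s=300.0, fast_s=5.0,
                       fast_window_s=600.0):
    """Pick a polling interval for one device (hypothetical policy).

    seconds_since_change: age of the most recent detected configuration
    change, or None if no change has been seen.
    """
    if seconds_since_change is None:
        # No known change: stay on the normal, slow cycle.
        return normal_s
    if seconds_since_change < fast_window_s:
        # A change was detected recently: poll quickly so the effect shows up.
        return fast_s
    # The change is old enough that the device can return to the normal cycle.
    return normal_s
```

With these assumed defaults, a device whose configuration changed 30 seconds ago would be polled every 5 seconds, and it would drop back to the 300-second cycle once 10 minutes had passed.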
The result is that I’ve been thinking that network management systems should incorporate a real-time device viewer that allows me to see the changes I’m making and their impact on the device’s operation. Most network management systems only provide a real-time viewer for interface traffic performance. There are a lot of other data points that would be useful to display, either as graphs or as the current state of some operational variable.
However, network devices contain a lot of data, especially if you include large tables like routing, ARP, neighbor discovery, and switch forwarding tables. Polling all this data every few seconds could create a large load on both the NMS and the device. One way to reduce the load is to poll only a subset of the data. This is a viable approach because I am typically only interested in a few data points. A user interface that allows me to select the data to be displayed could also tell the NMS which data to poll, reducing the load on the NMS and the device. Ideally, the views of most or all of the selected data would be collected together on one display, providing me with a device dashboard. For example, I may want to have interface configuration and performance combined with HSRP configuration on one page.
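A minimal sketch of how the selected views could drive the poll list follows. The view names and metric labels are hypothetical placeholders, not real MIB object names, and the mapping itself is an assumption about how an NMS might organize this.

```python
# Hypothetical mapping from dashboard views to the metrics each one needs.
VIEW_METRICS = {
    "interface": ["if_oper_status", "if_in_octets", "if_out_octets", "if_errors"],
    "qos": ["qos_class_drops", "if_out_octets"],
    "hsrp": ["hsrp_state", "hsrp_active_router"],
}


def metrics_to_poll(selected_views):
    """Return the de-duplicated metrics needed for the selected views.

    Order of first appearance is preserved; an unknown view name raises
    KeyError so that a typo in a saved view is noticed early.
    """
    seen = set()
    result = []
    for view in selected_views:
        for metric in VIEW_METRICS[view]:
            if metric not in seen:
                seen.add(metric)
                result.append(metric)
    return result
```

Selecting the "interface" and "qos" views together polls five metrics rather than six, because the shared outbound octet counter is only fetched once. Nothing outside the selected views is polled at all.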
Here are some examples of things that I would want to track…
- Interface performance and status. Watch interface status (up/down), duplex, utilization, and error counts. When the duplex setting is changed, how do the counter rates change? Some of these values would be best displayed as graphs, perhaps in the style of sparklines.
- HSRP and STP root bridge. Show which device is active and which is standby, if a standby router is known. For STP, show the root bridge and number of switches in the spanning tree. Is the root bridge the same as the HSRP active router for each VLAN that has both configured?
- QoS traffic class drops. Show a plot of the drops in each traffic class within a QoS policy. This would be great if done using a graphing tool, perhaps with sparklines, as mentioned above.
- Environmental parameters. Watch the temperature in a room go up or down as the air conditioning is turned off and on. Or watch power supply fluctuations as PoE devices turn on at the start of the day or in response to an EnergyWise automation system.
- Device operational parameters. Variables like CPU utilization, free memory, buffer misses, and backplane utilization may be good factors to display for a new device as part of a network benchmarking program.
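Several of the items above, such as utilization and error counts, depend on turning two successive counter samples into a rate. The sketch below shows that arithmetic under stated assumptions: the function names are hypothetical, the counter is assumed to be 32 bits wide by default, and it is assumed to wrap at most once between samples.

```python
def counter_rate(prev, curr, interval_s, counter_bits=32):
    """Per-second rate between two samples of a wrapping counter.

    Modular subtraction handles a single wrap between samples; more than
    one wrap cannot be detected from two samples alone.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    delta = (curr - prev) % (2 ** counter_bits)
    return delta / interval_s


def utilization_percent(prev_octets, curr_octets, interval_s, speed_bps,
                        counter_bits=32):
    """Link utilization as a percentage of the interface speed."""
    bits_per_s = counter_rate(prev_octets, curr_octets, interval_s,
                              counter_bits) * 8
    return 100.0 * bits_per_s / speed_bps
```

A short polling interval matters here: on a fast link a 32-bit octet counter can wrap quickly, which is one more reason a real-time viewer would want to sample every few seconds.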
Selecting a subset of all available device data sources would give me a neat way to look only at those variables that are of interest to me at a given time. If I can save the view and use it later, I can then create views that augment my other network troubleshooting tools.
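Saving a view could be as simple as serializing its name and panel list. The following is a sketch only; the `SavedView` class, its fields, and the JSON layout are assumptions made for illustration, not a format used by any particular NMS.

```python
import json
from dataclasses import dataclass, asdict, field


@dataclass
class SavedView:
    """A named selection of dashboard panels (hypothetical structure)."""
    name: str
    panels: list = field(default_factory=list)


def views_to_json(views):
    """Serialize a list of SavedView objects to a JSON string."""
    return json.dumps([asdict(v) for v in views], indent=2)


def views_from_json(text):
    """Rebuild SavedView objects from a string made by views_to_json."""
    return [SavedView(**item) for item in json.loads(text)]
```

The resulting string can be written to a file or stored per user, so a view built during one troubleshooting session is available the next time the same problem shows up.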
Imagine being able to select the interface performance and QoS views to be displayed simultaneously for a router. I could quickly determine whether QoS is working as desired for a given traffic class, knowing that the interface traffic load is typical for the interface in question. Not seeing QoS drops? Is any traffic transiting the interface? Is the interface congested (QoS is only engaged when the interface is congested)?
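The questions above can be written down as a small decision function. This is a sketch of the reasoning, not a diagnostic tool: the function name, the returned labels, and the 90 percent congestion threshold are all assumptions.

```python
def qos_triage(class_drop_rate, interface_bps, interface_speed_bps,
               congestion_threshold=0.9):
    """Suggest why a QoS class is or is not showing drops.

    class_drop_rate: drops per second in the traffic class of interest.
    interface_bps: measured traffic rate on the interface.
    congestion_threshold: assumed fraction of link speed treated as congested.
    """
    if interface_bps <= 0:
        # Nothing is transiting the interface, so QoS has nothing to act on.
        return "no-traffic"
    if class_drop_rate > 0:
        # The policy is engaged and this class is losing packets.
        return "class-dropping"
    if interface_bps / interface_speed_bps < congestion_threshold:
        # No drops is expected here: the link is not congested.
        return "not-congested"
    # The link is busy, yet this class is being protected.
    return "congested-no-drops"
```

For example, a class with no drops on an interface running at 20 percent of link speed returns "not-congested", which matches the point that QoS only comes into play when the interface is congested.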
With some careful UI work on top of a good polling engine, a real-time device viewer could be a pretty interesting and useful tool. It could also wind up being a pretty cool display in the NOC, with changing displays of device metrics.
Re-posted with Permission
NetCraftsmen would like to acknowledge Infoblox for their permission to re-post this article which originally appeared in the Applied Infrastructure blog under http://www.infoblox.com/en/communities/blogs.html