The Modern Tower of Babel in Network Coding and Automation

This blog describes some of the snags in modern coding / automation, and what might be done about them.

Automation is great. I happen to love coding and have been working with various APIs, getting some small but real problems solved. One of the factors I’m acutely aware of is right-sizing what I tackle. If you can solve something useful and later incrementally add functionality, you can then gate the time you sink into coding against the value delivered (to yourself, your organization, versus having a life, etc.). Python is great for that: slap something together, extract repeated operations or clean tasks into functions, etc.

Another personal preference is to not re-invent the wheel. I tend to use tools (NetMRI, SolarWinds, etc.) that can run a bunch of show commands across devices and output a zip of the results, one file per device, with each show command echoed and predictable separators between blocks to make it easy to locate or extract show blocks.
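As a rough illustration of why predictable separators matter, here is a minimal sketch of splitting one per-device file into its show blocks. The separator format (a line such as "===== show version =====") and the sample text are assumptions made up for this example; the real format depends on the tool and how the job is set up.

```python
import re

# Assumed (hypothetical) separator: a line like "===== show version ====="
# echoes the command that produced the block of output that follows it.
SEPARATOR = re.compile(r"^=+\s*(show .+?)\s*=+\s*$")


def split_show_blocks(text):
    """Return a dict mapping each echoed show command to its output lines."""
    blocks = {}
    current = None
    for line in text.splitlines():
        match = SEPARATOR.match(line)
        if match:
            # A new separator starts a new block.
            current = match.group(1)
            blocks[current] = []
        elif current is not None:
            blocks[current].append(line)
    return blocks


# Made-up sample content, standing in for one file from the zip.
sample = """===== show version =====
example version line
===== show ip interface brief =====
Interface    IP-Address    Status
Vlan1        10.1.1.1      up
"""

blocks = split_show_blocks(sample)
for command, lines in blocks.items():
    print(command, len(lines))
```

With the blocks keyed by command, the analysis code can pull out just the show output it cares about and ignore the rest.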

Those tools already have the device inventory and credentials in them. By using them, my scripting can focus on analysis of the data. I also don’t have to leave my laptop at the customer site overnight to gather the raw data. Where appropriate, my Python (previously Perl) script output sometimes ends up being tab-separated, because that imports easily into Excel for sorting on different columns, etc. The point here is to expend my limited coding time getting the job done, leveraging other tools where possible.
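For what it's worth, tab-separated output takes only a few lines with Python's standard csv module. This is a minimal sketch; the column names and rows are invented for illustration.

```python
import csv
import io


def write_tsv(rows, fieldnames, handle):
    """Write a header plus one tab-separated line per row dict."""
    writer = csv.DictWriter(
        handle, fieldnames=fieldnames, delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)


# Invented sample rows; in practice these come from the analysis step.
rows = [
    {"device": "sw1", "interface": "Gi1/0/1", "status": "up"},
    {"device": "sw2", "interface": "Gi1/0/2", "status": "down"},
]

buffer = io.StringIO()  # swap in open("report.tsv", "w", newline="") for a file
write_tsv(rows, ["device", "interface", "status"], buffer)
tsv_text = buffer.getvalue()
print(tsv_text)
```

Using the csv module rather than joining strings by hand means a stray tab or quote inside a field gets quoted instead of silently shifting columns.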

There are definitely some speedbumps in network automation that have me bemused. I don’t know that you or I can resolve these issues. It may be more of a matter of being aware of possible downsides in automation, so you can factor them into your planning. The intent of the following is not to discourage you from coding and automation, but to provide you with more data from experience.

API Documentation

I have yet to encounter what I’d call good API documentation, i.e. documentation that doesn’t require reading the mind of the person who wrote it. Maybe that says something about me, or about the vendor APIs I’ve been exploring.

Tentative conclusion: The API docs usually give someone a rough idea of the purpose of an API call, what it returns (vaguely), and the arguments, but little info about how to go about actually using it to accomplish a task, especially in cases where multiple API calls are needed to do that. Examples help me more than YANG syntax.

That may be a function of economics (it takes scarce and costly coder time to write documentation). Worse, writing English isn’t anywhere near as much fun as Python or whatever! Also, the potential audience is small, so not worth a lot of effort.

An example of what not to do: Document the relevant URL as https://product-prefix/api/<id>, tell me that for <id> I should substitute “the ID”, and leave it at that. It helps to tell the user which ID. That is, what the ID I’m supposed to supply actually identifies, and which of the several IDs I’ve seen I should use with the URL.

Context and clarity matter. In general, technical people often seem to be poor at providing both. They assume you know the context around what they’re discussing, or they fail to specify other necessary details. (I have the same problem with Cisco VUE exams: isolated statements that might be true or false depending on the context that is missing.)

I’ve run into an API where resultslist gives me the ID for retrieving “results”, which is the actual list. Huh? How about “getResultslistID” then “resultslist”? Umm, basic language skills 101!

What seems to be missing in my modest API experience are good examples, or even just workflows. As in “do this to get this, then use that to get this other thingy” (that’s the tech term for it), and so on. What sequence of calls and arguments do I use to get the info I want or create the object or trigger the action I want? In other words, please document workflows and examples of how to do common things. Or the relationship between the various API items when you need to make a sequence of calls.
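To make that concrete, here is the sort of workflow example I have in mind, sketched against a fake, in-memory API. Everything in it (the FakeApi class, the paths, the field names such as job_id and results_id) is invented for illustration and does not describe any real product; the point is only that the example shows which ID feeds which call.

```python
class FakeApi:
    """Stand-in for a real REST client; paths and payloads are invented."""

    RESPONSES = {
        "/api/jobs": [{"job_id": 17, "name": "nightly-show-commands"}],
        "/api/jobs/17/resultslist": {"results_id": 204},
        "/api/results/204": [{"device": "sw1", "lines": 120}],
    }

    def get(self, path):
        return self.RESPONSES[path]


def fetch_job_results(api, job_name):
    """Workflow: job name -> job ID -> results ID -> the actual results."""
    # Step 1: list the jobs and find the ID of the one we care about.
    jobs = api.get("/api/jobs")
    job_id = next(job["job_id"] for job in jobs if job["name"] == job_name)
    # Step 2: the job ID (not the results ID) goes into this URL.
    results_id = api.get(f"/api/jobs/{job_id}/resultslist")["results_id"]
    # Step 3: the results ID retrieves the list itself.
    return api.get(f"/api/results/{results_id}")


results = fetch_job_results(FakeApi(), "nightly-show-commands")
print(results)
```

Twenty lines like these in the documentation would answer the "which ID?" question without any mind reading.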

Sample input or parameters and output also help. Providing a schema sort of helps, but is verbose and a bit obscure in comparison to just providing a concrete example. At least for those of us who don’t work with schema models that often.

Note to coders: This ought to be easy. You have test cases for your API, don’t you? Just include the inputs and outputs in your documentation. And make sure that the test cases not only test atomic (single function) API calls, but sequences of them to accomplish common user / admin tasks.

Credit should go to Cisco and DevNet: there are some fairly solid-looking examples of how to use the ACI API to create things. Well done! There are also lots of tools in ACI and NX-OS that provide “breadcrumbs”, as in they give you an example of how to configure or do things via the API.

Also, credit goes to the Postman developers. Postman really helps with poking around to find out what works. Although I’d rather not have to poke around, especially if doing so might hit the “destruct” button (so to speak).

Cross-Vendor / Cross-Platform Variations

I get the appeal of structured information, I really do. Parsing Cisco show command output variations gets old, fast. E.g. “show cdp neighbor” output truncates interface names, etc. on some devices. I’ve done enough kludging around show output variations for a lifetime.
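As a small taste of that kludging, here is a minimal sketch of expanding abbreviated interface names back to full ones. The list of full names is deliberately partial and is an assumption about which interface types matter; extend it for the platforms you actually see.

```python
import re

# Partial list, chosen for illustration only.
FULL_NAMES = [
    "FastEthernet",
    "GigabitEthernet",
    "TenGigabitEthernet",
    "FortyGigabitEthernet",
    "Ethernet",
    "Port-channel",
    "Loopback",
    "Vlan",
]


def expand_interface(name):
    """Expand e.g. 'Gig 1/0/1' to 'GigabitEthernet1/0/1' when unambiguous."""
    match = re.match(r"^([A-Za-z-]+)\s*(\d.*)$", name.strip())
    if not match:
        return name.strip()
    prefix, number = match.groups()
    candidates = [
        full for full in FULL_NAMES if full.lower().startswith(prefix.lower())
    ]
    if len(candidates) == 1:
        return candidates[0] + number
    # Unknown or ambiguous prefix: keep it rather than guess.
    return prefix + number


for raw in ["Gig 1/0/1", "Fas 0/1", "Eth1/1", "F0/1", "mgmt0"]:
    print(raw, "->", expand_interface(raw))
```

Note the deliberate refusal to guess on an ambiguous prefix such as "F": a wrong expansion in a report is worse than an unexpanded name.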

What I’d like is commonality — across Cisco or any vendor’s platforms, even across vendors.

With Cisco, IOS-XE is or will soon be common across switches and routers, more or less. All we have to do is wait 5-7 years until the installed base catches up. That leaves IOS-XR and NX-OS, ACI, and DNA or whatever the management product of the year is.

Sure, you can write front-ends to deal with each different platform and (maybe) homogenize the results. That just takes time. Oops!
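For a sense of what such front-ends look like, here is a minimal sketch: one small parser per platform, each returning the same record shape. The platform names, input formats, and field names are all hypothetical, invented just to show the pattern.

```python
def parse_platform_a(raw):
    """Hypothetical platform returning a 'name, status' text line."""
    name, status = raw.split(",")
    return {"interface": name.strip(), "oper_up": status.strip() == "up"}


def parse_platform_b(raw):
    """Hypothetical platform returning a dict with its own key names."""
    return {"interface": raw["ifName"], "oper_up": raw["state"] == "UP"}


# One entry per platform; every new platform means another parser to write.
PARSERS = {"platform_a": parse_platform_a, "platform_b": parse_platform_b}


def normalize(platform, raw):
    """Convert platform-specific data into the common record shape."""
    try:
        parser = PARSERS[platform]
    except KeyError:
        raise ValueError(f"no front-end written for {platform!r}") from None
    return parser(raw)


print(normalize("platform_a", "Gi1/0/1, up"))
print(normalize("platform_b", {"ifName": "Eth1/1", "state": "DOWN"}))
```

The dispatch table is trivial. The time goes into writing, testing, and maintaining each parser as software versions change.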

I’ve previously made the point that the much-maligned MIB-II SNMP is at least a multi-vendor standard. Telemetry sounds great. Where are the standard(s)?

The positive point I’d like to make is that structured information that varies between devices is only a partial solution. What’s needed is structure to the structure. Cisco IOS has global settings, per-interface settings, etc. That’s all pretty logical. It seems like carrying that over to the API and attaining consistency shouldn’t be hard. Across vendors, OK, that might be harder. Good thing the IETF is trying to create models.
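One way to picture “structure to the structure” is a single data shape that every platform's data gets mapped into. This is only a sketch: the class and field names below are my own invention, not taken from any vendor or IETF model.

```python
from dataclasses import dataclass, field


@dataclass
class InterfaceSettings:
    """Per-interface settings (hypothetical field names)."""
    description: str = ""
    enabled: bool = True


@dataclass
class DeviceConfig:
    """Same layout for every device: global settings plus per-interface ones."""
    hostname: str
    global_settings: dict = field(default_factory=dict)
    interfaces: dict = field(default_factory=dict)  # name -> InterfaceSettings


config = DeviceConfig(hostname="sw1")
config.global_settings["ntp_server"] = "192.0.2.10"
config.interfaces["GigabitEthernet1/0/1"] = InterfaceSettings(description="uplink")
print(config)
```

If every platform's API exposed something shaped like this, reporting code could be written once instead of once per platform.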

Code Verbosity

I recently worked through a pair of Cisco coding courses to fill in knowledge gaps and for continuing education credits (PRNE, NPDESI). Some good material, some good labs providing hands-on time I would not otherwise have found time for.

What particularly struck me was that if you are seriously writing a Python module or several with test cases, you might end up with pages of code to emit one line of configuration code. And that’s not even addressing the logic to check that you’re applying the configuration in an appropriate context, etc.

That led to the reaction, “that’s a win”? OK, my cynicism is showing…

I get the need for test cases. I’ve written enough code that needed ongoing troubleshooting when it met the real world. When it is just post-processing show commands, no big deal. When it is changing the network, very much a bigger deal.

One conclusion is that code which only reads current state info, e.g. to report on something, generally has few potentially harmful side effects. At least as long as the human consumers of the info are aware that there may be bugs, as well as quirks on certain platforms that you have not encountered yet.

If your code is configuring boxes, you need to be more sensitive to “how could this go wrong” and “how do I scope this, so I don’t configure the wrong <interface, or whatever>”. Or better, how do I verify that this configured what I intended?
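Here is a minimal sketch of both ideas: refuse to build a change for anything outside an explicit scope, then check the resulting configuration text afterwards. The allowed-interface list and the sample running config are made up, and pushing the lines to a device is deliberately left out.

```python
# Hypothetical change scope: only these interfaces may be touched.
ALLOWED_INTERFACES = {"GigabitEthernet1/0/10", "GigabitEthernet1/0/11"}


def build_description_change(interface, description):
    """Return config lines, refusing anything outside the approved scope."""
    if interface not in ALLOWED_INTERFACES:
        raise ValueError(f"{interface} is outside the approved change scope")
    return [f"interface {interface}", f" description {description}"]


def verify_change(running_config, interface, description):
    """Check that the description appears under the intended interface."""
    in_block = False
    for line in running_config.splitlines():
        if line.startswith("interface "):
            # Track whether we are inside the interface we changed.
            in_block = line.strip() == f"interface {interface}"
        elif in_block and line.strip() == f"description {description}":
            return True
    return False


# Made-up config text, standing in for what the device reports afterwards.
sample_config = """interface GigabitEthernet1/0/10
 description printer
!
interface GigabitEthernet1/0/11
 description spare
"""

print(build_description_change("GigabitEthernet1/0/10", "printer"))
print(verify_change(sample_config, "GigabitEthernet1/0/10", "printer"))
```

Even this toy version is longer than the two lines of configuration it produces, which is rather the point of this section.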

Ditto for upgrading devices. How does your code detect and resolve all the possible errors?

That leads to more coding. That’s where the Tower of Babel comes in: tons of lines of code, possibly several different open source packages (think languages)…

Trade Offs

For those of us with the need to get network (or other) tasks completed, maybe that suggests a focus on reporting.

Also, there are already tools (Cisco Prime / APIC-EM / DNA) that collect configs, do upgrades, etc. Comparing the purchase / licensing price to the hours it might take you to code up any of those tasks (let alone several) sounds tedious, but realistically, the ROI is likely worth it. Worth thinking about, anyway. Don’t re-invent the wheel without a darn good reason!

Comments

Comments are welcome, whether in agreement or constructive disagreement about the above. I enjoy hearing from readers and carrying on deeper discussion via comments. Thanks in advance!

—————-

Hashtags: #CiscoChampion

Twitter: @pjwelcher
