Troubleshooting a Large-Scale CUCM Deployment

Joe LaRosa
Engineer II

I’m honored to announce that I’ve been invited to the upcoming Enterprise Connect conference as a panelist for the session “Ask the Expert: An Interactive Session on Network & Systems Management.” In this interactive Q&A session, attendees can present their most impossible problems to a panel of experts in the hope of getting some answers. The panel will include experts on IP networking, network management, network automation, and unified communications interoperation/migration/automation. Regardless of where your issue lies within the stack, you should leave our session with a list of next steps and some new ideas for how to approach the problem.

Let’s look at an example: At NetCraftsmen, we take pride in helping our customers tackle seemingly impossible tasks. One client had a large Cisco Unified Communications Manager (CUCM) deployment supporting 15,000 endpoints scattered across a sprawling campus. After the TFTP service was deactivated on an overloaded CUCM subscriber, almost 1,200 phones were unable to receive any configuration updates. Further investigation showed that these phones were configured with static IP and TFTP addresses. A Cisco phone normally learns its TFTP server address from DHCP Option 150, but the static configuration bypassed DHCP entirely.

How do you update the TFTP address on 1,200 phones without having to visit each one? Easy: turn on each phone’s web server and use a utility like Uplinx’s Phone Control, which uses the phone’s API/SDK, to update them in bulk.

How do you identify which phones have the wrong TFTP address? This information must be stored somewhere in the CUCM Informix database, or maybe it can be found in Cisco’s Unified Real-Time Monitoring Tool (RTMT)? Nope. It exists solely on the IP phones, and can only be accessed by enabling the web server or through SSH/CLI (on some models, and your mileage may vary). OK, still easy: Use Uplinx’s Phone Control to run a phone inventory report and filter in Excel on the TFTP ADDRESS fields. Oh wait, we have to enable an insecure web server on 15,000 endpoints in a highly security-conscious environment? OK, it’s time to think outside the box.

How can we find out which phones are still communicating with the old TFTP server? Set the TFTP service of the affected CUCM subscriber to log its operations at DEBUG trace level. Luckily, CUCM records the identity (MAC address) of a phone each time it requests a config file. The resulting list of IP phones can then be loaded into Phone Control, which can enable the web server on just those phones and either update the TFTP address to the new servers or fix the problem permanently by switching the phones to DHCP. This was much more acceptable to security, as the web server was enabled for only a short time on a much smaller subset of phones.

This is great, but phones sometimes won’t check in to TFTP for weeks at a time. Do we have to constantly babysit the TFTP traces and RTMT? Of course not. There’s always a better way. You can use RTMT to schedule a recurring job that “archives” log files and ships them off to an SFTP server. Because it is an archive job, the traces are deleted from the CUCM subscriber (the one running the TFTP service) to save space. Remember, debug traces are chatty and roll over quickly. We then wrote a PowerShell script to comb the trace files for any reference to MAC addresses and output those MAC addresses to a CSV file. That file was then used to scope a batch update job in Phone Control. Wash, rinse, repeat. It took almost three months to complete, but visiting 1,200 phones in that environment was truly an impossible task.
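
The trace-combing step can be sketched roughly as follows. This is not the PowerShell script we used; it is a minimal Python illustration under stated assumptions: the archived traces are plain (uncompressed) text files in a local directory, and each phone shows up in the trace by its device name, which for Cisco phones is "SEP" followed by the 12-digit hexadecimal MAC address (for example, in the name of a requested SEP<MAC>.cnf.xml file). The function names, the directory layout, and the CSV column headers are hypothetical.

```python
import csv
import re
from pathlib import Path

# Cisco phone device names are "SEP" plus the 12-hex-digit MAC address.
# Assumption: the TFTP trace lines contain that device name somewhere.
SEP_PATTERN = re.compile(r"SEP([0-9A-Fa-f]{12})(?![0-9A-Fa-f])")


def extract_macs(text):
    """Return the set of upper-cased MAC addresses found in SEP device names."""
    return {mac.upper() for mac in SEP_PATTERN.findall(text)}


def collect_macs(trace_dir):
    """Scan every file under trace_dir and return the unique MACs seen."""
    macs = set()
    for path in Path(trace_dir).rglob("*"):
        if path.is_file():
            # errors="ignore" skips any bytes that are not valid text.
            macs |= extract_macs(path.read_text(errors="ignore"))
    return macs


def write_csv(macs, out_path):
    """Write one row per phone, sorted, with a (hypothetical) header row."""
    with open(out_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["DeviceName", "MAC"])
        for mac in sorted(macs):
            writer.writerow([f"SEP{mac}", mac])
```

Calling write_csv(collect_macs("traces"), "phones.csv") after each archive run would produce a de-duplicated list of the phones that contacted the old TFTP server. The column layout expected by a bulk tool such as Phone Control is not shown here and would need to be adjusted to match.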

Does this sound like the caliber of issue you are currently experiencing in your environment? If so, please visit our panel and pick our brains!


