QoS for IP Telephony, IP VideoConferencing, and Microsoft Unified Communications

Author
Peter Welcher
Architect, Operations Technical Advisor

Over the years, I’ve provided a number of customers with QoS configurations in support of IP Telephony (IPT) and IP VideoConferencing (IPVC), generally non-Cisco. I’ve been working recently with a couple of customers with some Microsoft Unified Communications (MS UC) at their site (Microsoft OCS, Office Communications Server), sometimes brought in by Nortel with Microsoft sales people now positioning to take over UC at the site, eventually.

(That raises the question of whether MS OCS is really enterprise-class. I’ll leave questioning that to the Cisco and Avaya sales persons and their competitive analysis sales documents. MS seems to be moving in that direction.)

What I’m seeing now is increasing use of MS UC, either softphone or video messaging services. This is also happening with sites buying Tandberg Movi, basically fancy cameras and PC software to video-conference out of the PC. For that matter, Cisco UC sites run softphones or Cisco cams too.

The challenge is providing QoS for such “wild” IPT / IPVC (IP Telephony / IP Video Conferencing) endpoints. (Some people like the term “IP-VTC” = IP Video Teleconferencing — I lost my “tele’s” a while back.) One answer is “they’re wild and uncontrolled, so no QoS”. MS UC seems to sell with the pitch being roughly “you don’t need all the heavy lifting of QoS”, but then I’m told the documentation turns right around and says, in effect “but it would be a good idea to use QoS to provide the best quality voice”. I don’t see “no QoS for wild endpoints” as being a particularly viable answer in most organizations.

This article discusses the QoS-related issues I see. I’m hoping to start a discussion, and would love it if you read this, feel you have an insight or solution, and use the Comments capability to pass it on!

Standard QoS

Sites in need of QoS will have QoS Classification and Marking configured at a Trust Boundary by the time I’m done there. Away from the LAN edge, we use the DSCP markings with default or Cisco QoS SRND-based queuing in switches, and with bandwidth-aware QoS configs in the WAN routers. Generally, for links over T1 speed, we try for a percentage-based approach, one size fits all, to keep maintenance simple. Since we’re not policing or shaping, any “higher priority” bandwidth left over can be used for the “lower priority” traffic classes. When the contracted WAN speed doesn’t match the link speed, we use a hierarchical QoS approach, shaping to the contracted speed, with the child policy feeding that and controlling how the bandwidth is used. All pretty standard stuff, although I like to think I / we provide good value in helping sites get quickly through the maze of configuring QoS properly. It is a bit confusing if you haven’t done it before.
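
As a rough illustration of the percentage-based approach combined with shaping to a contracted rate, the arithmetic can be sketched in Python. The class names and percentages here are hypothetical placeholders, not values from the Cisco QoS SRND or from any particular site.

```python
# Hypothetical per-class percentages, for illustration only. Real figures
# depend on the site's traffic mix.
CLASS_PERCENT = {
    "voice": 20,
    "video": 15,
    "signaling": 5,
    "critical-data": 25,
    "default": 35,
}


def class_bandwidth_kbps(link_kbps, contracted_kbps=None, percents=CLASS_PERCENT):
    """Return per-class bandwidth guarantees in kbps.

    When a contracted WAN rate is given and is lower than the physical link
    rate, the percentages apply to the contracted rate. This mirrors a parent
    shaper with a child queuing policy underneath it.
    """
    if sum(percents.values()) > 100:
        raise ValueError("class percentages exceed 100%")
    base = link_kbps if contracted_kbps is None else min(link_kbps, contracted_kbps)
    return {name: base * pct / 100 for name, pct in percents.items()}


if __name__ == "__main__":
    # Example: 100 Mbps physical handoff with a 20 Mbps contracted rate.
    for name, kbps in class_bandwidth_kbps(100_000, 20_000).items():
        print(f"{name:14s} {kbps:8.0f} kbps")
```

Because the same percentages are reused at every site, only the link and contracted rates change from one router to the next, which is what keeps maintenance simple.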

Vendors, including Cisco, seem to expect the network to trust DSCP markings coming from the endpoint. I’m fine with that when it is a controlled endpoint, e.g. a Cisco phone or an IP PBX Call Server, where I know the markings will be right. I appreciate things like a voice (or IPVC) VLAN, where only IPT/IPVC devices should be on it (perhaps with some DSCP MAC-based controls). They give me better control over (and simpler) ACLs for QoS and voice security.

I have serious problems trusting PCs, however. Who knows what DSCP settings application programmers might have used, perhaps to try to make their applications look good (perform better under congestion conditions)? And if traffic is coming off a PC, how the heck are we supposed to provide it with voice security?

QoS for IPVC

For quite a while, when consulting on QoS at a site, I have had to decide whether to trust markings from the Tandberg / Polycom IPVC endpoints or Nortel and Avaya PBXs. I usually do not trust them. Do you?

With Nortel and Avaya IP PBX (Call Server) TLAN and ELAN or CLAN, that’s no problem. I usually just remark traffic coming off the PBX VLANs based on port matching (media or signaling).

With fixed IPVC endpoints, be they Tandberg or Polycom, I do some RTFM (“Read the Fine Manual”), do some packet capture as a sanity check, and similarly just do IP and port matching and mark accordingly. In a couple of cases, we’ve put the IPVC endpoints into the last 16 or 32 addresses, say, in the voice VLAN. With well-thought-out addressing, matching the high end of all voice VLANs takes one ACL statement (times however many TCP or UDP ports the product in question uses). It takes a day or three to check the docs, get and check the packet capture, and build the QoS ACLs. And longer to deploy the QoS, which is usually some mix of me/us and the consulting customer doing it.
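
To make the “one ACL statement” idea concrete, here is a minimal Python sketch using the standard ipaddress module. It assumes a hypothetical addressing plan in which every voice VLAN is 10.x.20.0/24 and the IPVC endpoints sit in the last 32 addresses; the subnets and helper names are illustrative, not taken from a real site.

```python
import ipaddress


def high_end_block(vlan_subnet, count):
    """Return the sub-block covering the last `count` addresses of a subnet.

    `count` must be a power of two so the block can be written as a single
    address/wildcard pair.
    """
    net = ipaddress.ip_network(vlan_subnet)
    if count <= 0 or count & (count - 1):
        raise ValueError("count must be a power of two")
    if count > net.num_addresses:
        raise ValueError("count exceeds subnet size")
    prefix = net.max_prefixlen - (count.bit_length() - 1)
    return list(net.subnets(new_prefix=prefix))[-1]


def acl_wildcard(block):
    """Format a block as 'address wildcard', the style used in IOS ACLs."""
    return f"{block.network_address} {block.hostmask}"


def matches(addr, base, wildcard):
    """True if addr matches base under an ACL-style wildcard mask.

    Wildcard bits set to 1 are "don't care" bits.
    """
    a, b, w = (int(ipaddress.IPv4Address(x)) for x in (addr, base, wildcard))
    return (a | w) == (b | w)


if __name__ == "__main__":
    print(acl_wildcard(high_end_block("10.1.20.0/24", 32)))
    # With the assumed 10.x.20.0/24 plan, one discontiguous wildcard covers
    # the high end of every voice VLAN at once.
    print(matches("10.7.20.230", "10.0.20.224", "0.255.0.31"))
```

The second call shows why the addressing plan matters: if the voice VLANs did not share a pattern, each one would need its own ACL line.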

Special case: with Cisco UC, especially if our UC guys have set it up, I know it’s right. And I haven’t really run into sites where someone else set up their Cisco UC.

By the way, it does appear MS handsets (Microsoft Office Communicator Phone Edition) can be set to learn a voice VLAN via DHCP and use it. See the References below.

Keeping Control

The reason I have taken the above approach, re-marking at the edge, is that I know it is right and I know that it will remain right, barring major VLAN / addressing shifts at the site, which are rather unlikely. Endpoint and PBX settings are another matter: I can easily imagine them changing when upgrades occur. The same can happen with DSCP values, Avaya and Nortel having historically preferred CS6 where Cisco uses the correct DSCP value, EF. I’ve also seen cases where the phones’ DSCP wasn’t being set, apparently due to a contractor missing a field in the PBX setup GUI.

I’ve also seen cases where the N-th Avaya PBX got installed, the contractor didn’t get told the UDP port range, it used ports outside the site’s preferred range, and so some voice was not correctly marked. That is one reason I use packet capture to check what is actually configured. When I build a QoS edge remarking policy, I document things like media port ranges (ports like those for H.323 or SIP are highly unlikely to change), and explain to the network staff the necessity of thinking in terms of a “contract” between them and the IPT staff. By remarking at the edge, you know that inbound traffic is remarked, except if the port contract is violated.
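
The port “contract” can be pictured as a small lookup table that the edge policy applies regardless of what the endpoint marked. The following Python sketch is only a model of that logic: the media port ranges are hypothetical, and the choice of EF, AF41 and CS3 follows a common convention rather than anything a given site must use. Only the H.323 (TCP 1720) and SIP (5060) ports are the standard well-known values.

```python
# Standard DSCP code point values.
EF, AF41, CS3, BEST_EFFORT = 46, 34, 24, 0

# Hypothetical contract. Real ranges must come from the PBX / IPVC
# documentation and be confirmed by packet capture.
CONTRACT = [
    # (protocol, first port, last port, DSCP to mark)
    ("udp", 16384, 32767, EF),    # assumed voice media range
    ("udp", 49152, 49407, AF41),  # assumed video media range
    ("tcp", 1720, 1720, CS3),     # H.323 call signaling
    ("udp", 5060, 5060, CS3),     # SIP
    ("tcp", 5060, 5060, CS3),     # SIP
]


def remark(protocol, port, received_dscp=0):
    """Return the DSCP to apply at the edge.

    The received marking is deliberately ignored: only the contract decides.
    """
    for proto, first, last, dscp in CONTRACT:
        if protocol == proto and first <= port <= last:
            return dscp
    return BEST_EFFORT


if __name__ == "__main__":
    print(remark("udp", 20000, received_dscp=48))  # in range: re-marked to EF
    print(remark("udp", 40000, received_dscp=46))  # outside contract: best effort
```

The second call is the N-th PBX scenario: media on ports outside the agreed range falls through to best effort, which is exactly the failure a packet capture would reveal.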

Yes, if there’s a problem, whether you trust the PBX markings or do remarking, you’re going to be doing packet capture, and looking at UDP ports and DSCP markings. But with remarking, I know that an error by the PBX admin is unlikely to flood the network with EF-marked traffic, or to leave the bulk of IPT traffic unmarked or strangely marked. So I try to minimize the window of opportunity for the IPT or IPVC traffic to go unmarked or mis-marked.

By the way, the “contract” should also include the PBX version of media bandwidth between sites (i.e. without the network overhead added in — the number to be configured into H.323 zone control or PBX-PBX trunking forms), and how Call Admission Control is to be done. The point being, if the PBX admin changes that without working with the network staff, all bets are off, the “contract” is violated, and voice quality may degrade. A brief printed “contract” emphasizes the importance of teamwork on providing good IPT / IPVC service.
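
The gap between the PBX view of bandwidth and the network view is easy to work out. This Python sketch assumes RTP over UDP over IPv4 with no IP options (40 bytes of headers per packet) and 20 ms packetization unless stated otherwise; Layer 2 overhead varies by link type and is left as a parameter.

```python
IP_UDP_RTP_BYTES = 20 + 8 + 12  # IPv4 (no options) + UDP + RTP headers


def call_bandwidth_kbps(codec_kbps, packet_ms=20, l2_overhead_bytes=0):
    """Per-call, one-direction bandwidth in kbps including packet headers.

    codec_kbps is the number the PBX admin thinks in (payload only).
    """
    packets_per_sec = 1000 / packet_ms
    payload_bytes = codec_kbps * 1000 / 8 / packets_per_sec
    packet_bytes = payload_bytes + IP_UDP_RTP_BYTES + l2_overhead_bytes
    return packet_bytes * 8 * packets_per_sec / 1000


if __name__ == "__main__":
    # G.711: 64 kbps of payload becomes 80 kbps at Layer 3.
    print(call_bandwidth_kbps(64))
    # G.729: 8 kbps of payload becomes 24 kbps at Layer 3.
    print(call_bandwidth_kbps(8))
```

The lower the codec rate, the larger the share taken by headers, so a change of codec or packetization interval on the PBX side shifts the network numbers more than the payload figure alone suggests.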

For IPVC, one very much needs the IPVC admins to understand that a central MCU / bridge allows predictable bandwidth and QoS, whereas multi-point conferencing from any endpoint is going to not only be lower quality due to the hardware capabilities of the endpoint, but is also going to require a lot more bandwidth to every site, or have poor QoS. (I’ve seen a case where the IPVC Polycom consultant brought in explicitly said that in their report, but the IPVC admin apparently hadn’t read or understood that important point.)

If you get the impression I’m a control freak, well, with regard to QoS that IS what it is about — control. Furthermore, I’ve run into a number of IPT and IPVC administrators who were not very tuned into the network side of things, or were fairly non-technical. When I say “IPVC bridge” and someone says “what’s that”, I get uncomfortable. When I’m uncomfortable, I want edge controls to try to minimize surprises.

How about MS UC and Softphones?

The approach described above just doesn’t work with PC-based IPT and IPVC, or with Microsoft UC softphones. (Which are attractive to sites, in that they can replace a costly handset with a bluetooth or wired headset / ear bud and simplify desktop clutter and cost. Similarly with built-in webcams. I understand the business benefits, I also see the network impact.) For the rest of this article, I’ll use “softphone” to also include PC-based IPVC endpoints.

Basically, the only choice I see is to trust the DSCP markings from all PCs with softphones.

I’d like to have other choices. For instance, if the softphone were to use a different VLAN, that might provide a lot more comfort, although trunking to every PC opens up some other interesting debatable topics. One could match on TCP and UDP ports, but for media traffic most products make such promiscuous use of ports there seems little point to doing so. If the softphone interacts with a proxy or gateway of some kind, ok, then for QoS we can latch onto the proxy IP as one conversation endpoint. I have yet to be at a site doing that, however.

Reading about MS OCS, my conclusion is that one very much wants the PC to be running Vista or Windows 7. They allow for per-application DSCP settings. The network administrator then has to work with the desktop staff to ensure the standard desktop build(s) use MS Group Policy Objects with the appropriate DSCP settings (and 0 for other apps). In other words, you have to make sure the desktop staff is trained on DSCP values, and perhaps do some packet capture to confirm or at least re-assure that the markings are indeed being used.
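
Confirming the markings from a capture comes down to reading one byte of the IPv4 header. A minimal Python sketch follows; how the raw header bytes are pulled out of a capture file is left out, and the helper names are illustrative.

```python
# Names for a few standard DSCP code points.
DSCP_NAMES = {0: "default", 24: "CS3", 34: "AF41", 46: "EF", 48: "CS6"}


def dscp_from_ipv4_header(header):
    """Return the DSCP value from raw IPv4 header bytes."""
    if len(header) < 20 or header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Byte 1 is the former ToS byte: upper 6 bits DSCP, lower 2 bits ECN.
    return header[1] >> 2


def check_marking(header, expected_dscp):
    """Return (ok, message) comparing the captured DSCP to the expected one."""
    actual = dscp_from_ipv4_header(header)
    name = DSCP_NAMES.get(actual, "unnamed")
    ok = actual == expected_dscp
    verdict = "ok" if ok else f"expected {expected_dscp}"
    return ok, f"DSCP {actual} ({name}): {verdict}"


if __name__ == "__main__":
    # Synthetic header: version 4, IHL 5, ToS byte 0xB8, remaining bytes zeroed.
    sample = bytes([0x45, 0xB8]) + bytes(18)
    print(check_marking(sample, 46)[1])
```

A ToS byte of 0xB8 corresponds to EF (46) and 0xC0 to CS6 (48), which are the two values most likely to show up when checking voice media.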

In a sense, that is not much different than the setting I described above. What has changed is that now a bigger pool of people has to Get It Right (PBX admins or contractors, IPVC endpoint admins or contractors, and key desktop admins). Does that make you uncomfortable? It does me. I’ve encountered a lot of variation in the skills of desktop admins over the years.

References

The following are some of the links I found when researching this a month or so ago. My conclusion is that either I was Google-searching with the wrong terms, or that the whole area of QoS for MS OCS is a rather un-explored territory, probably the latter. The word “underwhelmed” comes to mind. If you have better references, please share (email or blog comment)!

I recommend the last item below in particular. The idea of per-application QoS controls seems like the best “locked-down desktop” way to approach QoS on PCs. The author notes that Windows XP doesn’t offer that level of granularity.

MS Office Communicator Phone Edition Release Notes
Setting up a voice VLAN for an MS-OCPE phone (DHCP)
http://www.microsoft.com/downloads/details.aspx?FamilyID=934d503f-6f3f-4253-bb6a-827f21aa61ec&DisplayLang=en
Microsoft Office Communications Server 2007, Appendix A: Implementing in a QoS Environment
(Not all that useful, sub-heading does discuss enabling PC QoS)
http://technet.microsoft.com/en-us/library/bb870366.aspx
Understanding Protocols, Ports, and Services in Unified Messaging
(How to control the RTP port range, which defaults to 1024 through 65535)
http://technet.microsoft.com/en-us/library/aa998265.aspx
Microsoft Office Communications Server 2007 R2 Ports and Protocols
http://technet.microsoft.com/en-us/library/dd425238(office.13).aspx
Blog with some interesting MS Registry Keys
http://blogs.pointbridge.com/Blogs/mcgillen_matt/Pages/Post.aspx?_ID=70
Blog briefly recommending against RSVP on the PC / workstation
http://blog.danovich.com.au/2009/08/03/qos-with-office-communicator-2007-r2/
Blog re QoS per-application in Vista, example for MS OCS
http://voipnorm.blogspot.com/2009/01/quality-of-service-for-moc.html
