Flawless VoIP Business Phone Calling With Proper Echo Cancellation

A Flawless VoIP Call — Peter Sandstrom, CTO, June 2006


How It Started
Back in the mid-1990s, a group of technology enthusiasts created a way to encode voice into IP packets and transport that voice data over IP networks in real time. It soon acquired a fairly large cult following. The lure was that it let an individual bypass the incumbent phone company on a global basis and talk to a friend for free.

This VoIP following was very similar in spirit to the free software movement insofar as it took the large incumbent out of the picture, and enabled the average user to communicate with anyone on the planet for next to nothing.

Bridging Two Worlds
But in the mid-1990s the average user's access to the Internet ran at 28 to 56 kbps, and typically closer to 28 kbps. Trying to push a full 64 kbps phone call (the normal bandwidth for a PSTN call) down a narrow modem pipeline was not going to work. As a result, a lot of work went into building compression software that shrank the 64 kbps voice stream down to something much smaller: 16 or even 8 kbps.
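The mismatch is simple arithmetic. The following is a minimal sketch using only the rates quoted above; it ignores packet header overhead (an assumption, since real VoIP packets carry headers on top of the codec payload), and the function names are illustrative.

```python
# Rates in kbps, taken from the text above. Header overhead is ignored.
PSTN_RATE_KBPS = 64


def fits(codec_kbps, link_kbps):
    """Return True if the codec's payload rate fits within the link rate."""
    return codec_kbps <= link_kbps


def compression_ratio(codec_kbps):
    """How many times smaller the stream is than a 64 kbps PSTN call."""
    return PSTN_RATE_KBPS / codec_kbps


for codec in (64, 16, 8):
    print(codec, "kbps on a 28 kbps modem:", fits(codec, 28),
          "| ratio vs PSTN:", compression_ratio(codec))
```

An uncompressed 64 kbps stream does not fit a 28 kbps modem link, while the 16 and 8 kbps streams do, at the cost of 4x and 8x compression.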

The result was that you could talk, but it certainly did not sound good. The audio quality did not approach "toll quality," as is said in the PSTN world. But the technology enthusiasts did not care; it was a free call.

Then the next phase kicked in: the techies wanted to call their moms and anyone else who had a phone attached to the PSTN. Gateways were the solution. These devices interfaced VoIP calls on the Internet to the PSTN. Media gateways convert the VoIP-encoded call to the PSTN's 64 kbps time division multiplexed (TDM) format and back. But the conversion process added all sorts of new and odd-sounding artifacts that made the poor-quality call even worse.

So at this juncture we had heavily compressed voice running on the Internet (which at that time had its own set of problems, namely packet delays and packet loss) and then going through media gateways that further damaged the audio quality. Thus VoIP got its start, but did so under the curse of a bad first impression: the general public believed it to be an inferior technology.

Fast forward to today and to the 21st century Internet within the continental United States. We have coast-to-coast delays of 100 msecs or less for IP packets. We have little or no packet loss (thanks to fiber transport). We have gateways bridging the Internet to the PSTN that can handle echo cancellation correctly, and a general public that on average has a bidirectional connection to the Internet of 384 kbps or better. These are all the ingredients necessary for a quality VoIP call.

But even with these favorable factors in place most VoIP carriers still offer voice transport products that fail to live up to expectations. The reason? They have not dealt with the basics: end-to-end delays, echo cancellation, and quality of service at the customer premises. As a result the VoIP community is largely to blame for helping advance the myth that VoIP cannot sound as good as a toll quality PSTN call. This is simply not the case.

Let's take a quick look at three design parameters, and delve into why they are necessary to realize a toll grade VoIP call. On today's national Internet, VoIP to VoIP or VoIP to PSTN calls can sound as good as or better than a toll quality PSTN call.
Here's why...

End to End Connections
A long time ago, someone inside the now extinct Bell Labs came to the conclusion that end-to-end delay on a voice call becomes objectionable to the human psyche once it exceeds 120 msecs. Indeed, it has been this user's experience with VoIP and other telecom technologies that when delay exceeds 120 msecs on a VoIP call or a TDM call, it becomes uncomfortable to talk.

In the old days, 100-800 msec delays on coast-to-coast IP routes were common. But today in the US almost any location-to-location call can be achieved at 100 msecs or less.

That being the case, the Internet has evolved within the last few years to the point where one of the biggest problems of VoIP, point-to-point delay, has been solved. In many cases 30-50 msec delays are attainable, which is well within the tolerance limits of the human ear.

But that's for the US. Call outside the borders of the continental US and all bets are off. Many call centers have outsourced to facilities on foreign shores, such as India. These facilities then make VoIP calls back to the US, and do so with delays as high as 400-500 msecs. The result, again, is not toll quality.

So VoIP works quite well when kept within a national IP network with low delays, such as that of the USA. But extend it beyond that practical limit and the model starts to become questionable.
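The delay figures in this section can be checked against the 120 msec threshold with a few lines of code. This is only a sketch: the route labels are hypothetical, and the delay values are the ones quoted in the text rather than measurements.

```python
# Threshold cited in the text: delay above this becomes uncomfortable.
MAX_COMFORTABLE_DELAY_MS = 120


def is_comfortable(end_to_end_delay_ms):
    """True if the delay does not exceed the 120 msec threshold."""
    return end_to_end_delay_ms <= MAX_COMFORTABLE_DELAY_MS


# Hypothetical route labels; delay values are those quoted in the text.
routes = {
    "typical US route": 50,
    "US coast-to-coast": 100,
    "offshore call center to US": 500,
}

for name, delay_ms in routes.items():
    verdict = "within budget" if is_comfortable(delay_ms) else "over budget"
    print(f"{name}: {delay_ms} msecs, {verdict}")
```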

Echo Cancellation and VoIP (What did you say... clip, clip)
Echo cancellation has to be the least understood design parameter associated with VoIP transport today. Let's explore the reason for this, and show how Cyclix has resolved the issue.

First, why do we need echo cancellation on a voice call?

Basically, a caller using a two-wire POTS telephone sends a portion of the incoming voice energy back to the person on the other end; that is, phones reflect some of the audio coming to them back out to the network. This stems from imperfections in the phone's 2-to-4-wire converter (hybrid) and from audio simply coupling from the speaker back to the microphone.

It is this reflected audio that must be cancelled; otherwise the end result is the bizarre experience of multitudes of echoes going back and forth between the two callers. All of us have experienced this at one time or another on a PSTN call. It happens when the echo canceller for your call does not get turned on.

So in summary, for any long distance call (20 msecs of delay or more), VoIP or PSTN, echo cancellation is mandatory. Without it, the call is unworkable for the callers.
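The text does not describe how an echo canceller works internally. One widely used approach is an adaptive filter that learns the echo path from the far-end audio and subtracts its estimate of the echo from the returning signal. The sketch below uses a normalized least mean squares (NLMS) filter as an illustration only; it is not the method of any particular carrier or device. The echo path, tap count, and step size are assumptions chosen for the demonstration, and the sketch ignores the case where both parties talk at once.

```python
import random


def nlms_echo_cancel(far_end, mic, taps=8, mu=0.5, eps=1e-6):
    """Subtract an adaptively estimated echo of far_end from mic.

    Returns the residual signal, i.e. what would be sent back to the far end.
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent far-end samples, newest first
    residual = []
    for x, d in zip(far_end, mic):
        history = [x] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = d - estimate
        energy = sum(h * h for h in history) + eps
        step = mu * error / energy  # normalized step size
        weights = [w + step * h for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(s * s for s in signal) / len(signal)


# Hypothetical echo path: a short, attenuated reflection of the far-end audio.
rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(4000)]
echo_path = [0.0, 0.0, 0.5, 0.3, -0.1]
mic = [sum(c * far_end[n - k] for k, c in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]

residual = nlms_echo_cancel(far_end, mic)
print("echo power (last 500 samples):    ", power(mic[-500:]))
print("residual power (last 500 samples):", power(residual[-500:]))
```

Once the filter has converged, the residual power is far below the echo power, which is the effect the caller perceives as the echo disappearing.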

Bearing these echo cancellation issues in mind, it is possible today for users of the VoIP Session Initiation Protocol (SIP) to strike termination agreements with SIP based class-4 carriers, and have those SIP carriers terminate SIP originated telephone calls to the PSTN, or vice versa.

But a problem exists. Many SIP terminating carriers are using ISDN technology on the PSTN side of their media gateways. Consequently, they cannot handle the echo cancellation issues correctly; the caller using this type of network often hears audio clipping and long audio delays during the conversation.

The reason for these problems is that the ISDN-enabled media gateways deployed by these class-4 VoIP carriers have no way of controlling the echo cancellation hardware deployed inside the PSTN network. Therefore, their media gateways insert their own echo cancellation into the voice stream, but they do so indiscriminately, i.e. in the middle of the path of the call, as opposed to near the called and calling parties. The result is a bad audio experience for both the called and calling parties; clips and voice delays are the norm.

Correct echo cancellation functions must be applied as closely as possible to each caller's origination point: the VoIP caller's IAD (Integrated Access Device) and the PSTN caller's class-5 central office.

The only way to ensure this is to deploy the SS7-ISUP protocol, instead of ISDN, at the VoIP/PSTN interface in the VoIP media gateway. With that done one can control the echo cancellation equipment deployed in the PSTN, close to the caller. SS7-ISUP enabled media gateways simply send out the correct command, via SS7-ISUP, and the PSTN turns on its echo cancellation hardware close to the called party. The result is a telephone call with an audio quality as good as, or better than, regular PSTN to PSTN calls.
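The text does not name the specific SS7-ISUP command involved. One ISUP field that relates to echo control is the echo control device indicator, carried in the Nature of Connection Indicators parameter of the Initial Address Message (ITU-T Q.763). The sketch below packs and reads that single octet. The function names are hypothetical, and a real gateway would set this field through its SS7 stack rather than by hand; this is an illustration of the signalling idea, not a description of any vendor's implementation.

```python
# Bit layout of the Nature of Connection Indicators octet (ITU-T Q.763):
#   bits B-A: satellite indicator
#   bits D-C: continuity check indicator
#   bit  E  : echo control device indicator (1 = outgoing device included)
#   bits H-F: spare
ECHO_DEVICE_BIT = 0x10


def pack_nature_of_connection(satellite=0, continuity=0,
                              echo_device_included=False):
    """Build the one-octet parameter value (hypothetical helper)."""
    if not 0 <= satellite <= 3 or not 0 <= continuity <= 3:
        raise ValueError("satellite and continuity are two-bit fields")
    octet = satellite | (continuity << 2)
    if echo_device_included:
        octet |= ECHO_DEVICE_BIT
    return octet


def has_echo_device(octet):
    """Read the echo control device indicator from the octet."""
    return bool(octet & ECHO_DEVICE_BIT)


octet = pack_nature_of_connection(echo_device_included=False)
print(f"octet=0x{octet:02x}, echo device included: {has_echo_device(octet)}")
```

Because the indicator travels with the call setup message, each exchange along the path can learn whether echo control has already been provided upstream, which is what makes it possible to place cancellation near the caller rather than in the middle of the path.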

Cyclix has resolved the echo cancellation issue by using SS7-ISUP gateways in key locations across its network. This gives Cyclix a call quality that is superior to other termination products on the market.

Addressing the Last Mile
Earlier we mentioned how the Internet bandwidth available to the average user has greatly increased in the last several years. Most users now have access to at least 384 kbps via cable or DSL connections. But bulk bandwidth alone is still no guarantee that a VoIP stream will have the Quality of Service (QoS) it needs for the call.

And before going any further, let's define QoS. For the purpose of this discussion it is enough reserved bandwidth to allow the call to stream uninterrupted.

Today the state-of-the-art Internet Protocol does not offer reserved bandwidth for any specific application, such as VoIP. Therefore a VoIP data stream gets thrown into the fray with everything else (email, http browsing, ftp file transfers, etc.) and all of it is treated equally. This is not a problem for the other services, as they are not real time. VoIP, however, needs to travel in real time.

Many of the consumer VoIP dial-tone services that surfaced over the last few years had no solution to this and simply dumped the VoIP stream onto the end user's LAN. As a result, the VoIP call sounds fine as long as nothing else is being sent over that same broadband connection. But the moment the VoIP user starts using their web browser or a large email arrives, the call degrades to poor or inaudible as a result of voice packets either being delayed or dropped altogether.
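To see why bulk traffic hurts a call on a 384 kbps link, consider how long each packet occupies the wire. The sketch below assumes 1500-byte bulk packets and 200-byte voice packets (both sizes are assumptions, not figures from the text) and compares a voice packet's wait in a first-in, first-out queue with its wait when voice is sent first. It is a simplified model, not a description of any particular QoS device.

```python
LINK_KBPS = 384  # last-mile rate quoted in the text


def serialization_ms(size_bytes, link_kbps=LINK_KBPS):
    """Time to put one packet on the wire: bits / (kbit/s) gives msecs."""
    return size_bytes * 8 / link_kbps


def voice_delay_ms(queue, prioritize_voice):
    """Time until the first voice packet has fully left the link.

    queue is a list of (kind, size_bytes) tuples in arrival order.
    """
    if prioritize_voice:
        # Stable sort: voice packets move ahead, order otherwise preserved.
        queue = sorted(queue, key=lambda packet: packet[0] != "voice")
    elapsed = 0.0
    for kind, size in queue:
        elapsed += serialization_ms(size)
        if kind == "voice":
            return elapsed
    raise ValueError("no voice packet in queue")


# Ten assumed 1500-byte bulk packets arrive just ahead of one voice packet.
queue = [("bulk", 1500)] * 10 + [("voice", 200)]

print("one bulk packet on the wire:", serialization_ms(1500), "msecs")
print("voice delay, FIFO:          ", voice_delay_ms(queue, False), "msecs")
print("voice delay, voice first:   ", voice_delay_ms(queue, True), "msecs")
```

Under these assumptions a single bulk packet holds the link for about 31 msecs, so a short burst of bulk traffic ahead of a voice packet is enough to push its delay past the 120 msec threshold discussed earlier.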

There are solutions to this QoS problem on the horizon, with MPLS being the most likely winner long term. But ubiquitous deployment of any solution is still some time away. Fortunately, this problem is localized. On the US Internet, it is only an issue at the last mile, over the end user's broadband connection. If QoS is solved on that segment of the network, the entire QoS issue is virtually solved.

Cyclix has addressed the last mile, and offers a QoS solution that allows an enterprise to merge VoIP data streams with non real time bulk data (email, ftp, http, etc.), yet at the same time allows Cyclix customers to hear no ill effects in the VoIP quality. This is accomplished by using the Cyclix VoIP QoS certified user-agent device and QoS switch.

These two devices in tandem offer full QoS for VoIP, and guarantee a quality call, regardless of what is being sent over the customer's broadband connection.

In Summary
It takes three main ingredients to get a quality voice call over IP:

  • Low point to point IP network delays (< 120 msecs)
  • Proper echo cancellation techniques deployed at the caller and called party locales
  • QoS on that last mile

Cyclix worked hard to make sure that all three of these design parameters have been addressed for its customers. Cyclix is able to challenge the PSTN incumbents with VoIP quality that is second to none.
