This article is also available as a download that includes an Excel template you can use to run your own calculations.
VoIP bandwidth consumption over a WAN (wide area network) is one of the
most important factors to consider when building a VoIP infrastructure.
Failure to account for VoIP bandwidth requirements will severely limit the
reliability of a VoIP system and place a huge burden on the WAN infrastructure.
This article explains how to account for the various audio compression
algorithms and WAN topologies. Once you learn how to calculate and control
bandwidth utilization, you can allocate the proper amount of
bandwidth for QoS.
VoIP codec types
We’ll start with the most common VoIP codec (compression/decompression)
algorithms and examine their bandwidth and quality characteristics. Note
that MOS stands for mean opinion score, which is a subjective rating of sound
quality.
|Codec||Bit rate||MOS||Description|
|G.711||64 kbps||4.1||This is the most universally supported codec used in IP telephony.
This narrowband codec supports frequencies in the 300 to 3,400 hertz
range and is uncompressed. Although the quality is very good, it
consumes a lot of bandwidth.
|G.729||8 kbps||3.92||This is the second most widely supported codec and offers nearly the same
quality as G.711. Its key advantage is that it needs only one-eighth the
bit rate of G.711 while sounding almost as good.
|G.722||48 to 64 kbps|| ||This is the most common wideband codec available in IP phones, though
wideband support is only recently gaining momentum. The quality is excellent at
twice the sampling rate of standard G.711, but the compression isn’t that
great. Still, because it uses the same bit rate as narrowband G.711 while
delivering much more realistic sound, G.722 is one of the best codecs to use
if your IP telephones support it. Wideband supports frequencies of 50 to
7,000 hertz. There are no longer any patents covering G.722, so it’s free
for anyone to use.
|G.722.1||16 to 32 kbps|| ||This is a wideband codec, also known as Siren7, developed by Polycom.
Its key advantage is that it’s a computationally efficient and compact
codec at 16 to 32 kbps, which is roughly half the bandwidth required by
G.722. The 16 kbps mode isn’t appropriate for noisy audio input or when
music is mixed in, since the compression artifacts are noticeable.
The 32 kbps mode is good for any kind of workload. This codec must be
licensed from Polycom, and it’s currently used only in Polycom’s high-end
video conferencing systems under the marketing name “Ultimate HD.” Current
Polycom IP phones use the marketing term “HD Voice,” which
supports only generic G.722 for its wideband codec, although future IP phone
models may support G.722.1.
|G.722.2||6.6 to 23.85 kbps|| ||Also known as AMR-WB, this is a wideband codec. Modes as low as
6.6 kbps are supported, but 12.65 kbps is the practical bit rate for speech in
a clean environment. The higher 23.85 kbps bit rate is better for noisy
conditions and music. At the time of this writing, I’m not aware
of any IP phones that support this codec. It’s currently used by
T-Mobile in Germany for cell phone applications.
|Speex wideband||10 to 28 kbps|| ||This is an excellent open source codec from
the Speex project that offers very good wideband
quality at relatively low bit rates. VBR (variable bit rate) is
also supported in 12 or 18 kbps mode. The codec is free and can be
used by anyone. It’s supported by the open source Asterisk PBX, but no hard
IP phones and only one soft IP phone support it.
|Siren14 ultra-wideband||24 to 32 kbps|| ||Siren14 is a free-to-license (not to be confused with
license-free) ultra-wideband version of the G.722.1 codec from Polycom. Some of
the high-end features in G.722.1, like echo cancellation and noise reduction,
are omitted. Even though it is royalty free, you
still need a license from Polycom. Siren14 supports a wider
frequency range, up to 14,000 hertz, compared with 7,000 hertz in wideband.
|Siren22 ultra-wideband||32 to 64 kbps|| ||Siren22 is a proprietary ultra-wideband codec from Polycom that
currently can’t be licensed. Siren22 supports an even wider
frequency range, up to 22,000 hertz. The digital sampling rate is 48 kHz.
The bottom line is that G.711 and G.729 are the lowest common
denominators that are universally supported. G.729 makes the most sense in
the narrowband range because it sounds almost as good as uncompressed G.711 at
one-eighth the bit rate. Since
inexpensive G.722 phones are now on the market in the $80 to $400
range, anyone willing to spend 64 kbps per call should avoid G.711 and use wideband
G.722 wherever possible because no extra bandwidth is required. I have
seen too many VoIP implementations that blindly use G.711 without any
consideration of bandwidth.
There are plenty of nice, free wideband codecs that offer excellent sound
quality at half the bit rate of G.722, but getting some, let alone all, of the IP
phone manufacturers on board to use them is difficult. It’s hard
enough just getting the phone makers to support generic G.722. Let’s hope that
they all come around in the next few years and start supporting all the free
wideband codecs.
The packet overhead tax
It isn’t enough to look at just the raw payload bandwidth used by the
various codecs. We must also consider the packet overhead from the various types of
connections. Since VoIP never consumes more than 90 kbps on an Ethernet
line, there are almost never any bandwidth problems on a LAN (local area
network). Even a 10 Mbps switched network can carry 100 simultaneous
channels of the heaviest codecs. The problem arises when VoIP packets
must traverse the WAN or Internet, where bandwidth is
typically limited to somewhere between 64 kbps and the 1,536 kbps of a T1 connection. Although many
DSL or cable broadband connections boast 6 Mbps or greater throughput, their
upstream speed is the big limiting factor, at 128 kbps to 1,000 kbps. For
connections that are limited to 64 kbps (often the case for retail
chains) and that must also support data applications, special measures must be
taken to make VoIP possible.
Packet overhead can be a severe problem if left unchecked. This is
because at least 50 VoIP packets must be sent per second to keep
the packetization delay to a minimum. Since the IP, UDP, and RTP headers
in each VoIP packet total 40 bytes (320 bits), sending 50 packets per second
consumes 16 kbps in headers alone. The physical transport mechanism and the
protocols it uses bump up the overhead even more.
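The header arithmetic can be checked with a short Python sketch. The function names and defaults here are illustrative only; they assume the 40-byte IP/UDP/RTP header and 50 packets per second given in the text, and they treat 1 kbps as 1,000 bits per second.

```python
def header_overhead_kbps(header_bytes: int = 40, packets_per_second: int = 50) -> float:
    """Bit rate consumed by packet headers alone, in kbps (1 kbps = 1,000 bps)."""
    bits_per_packet = header_bytes * 8
    return bits_per_packet * packets_per_second / 1000


def payload_bytes_per_packet(codec_kbps: float, packets_per_second: int = 50) -> float:
    """Codec payload carried in each packet, in bytes."""
    return codec_kbps * 1000 / packets_per_second / 8


if __name__ == "__main__":
    print(header_overhead_kbps())        # 40 bytes x 8 bits x 50 packets per second
    print(payload_bytes_per_packet(8))   # G.729 at 8 kbps
    print(payload_bytes_per_packet(64))  # G.711 at 64 kbps
```

At 50 packets per second, a G.729 packet carries 20 bytes of audio behind a 40-byte header, so the header is twice the size of the payload. A G.711 packet carries 160 bytes behind the same header.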
On a frame relay
or T1 connection, the typical packet overhead is 18.8 kbps per channel. This may
not be a big issue for 64 kbps codecs like G.711 or G.722, since it’s relatively
small compared to the payload. But for 8 to 32 kbps codecs, like G.729 and
G.722.1, 18.8 kbps is huge by comparison. Packet overhead for site-to-site
IPSEC VPN connections is far more severe. The overhead can range from 35 kbps
to 45 kbps per channel, depending on the type of VPN implemented. That
means every single call over a VPN tunnel gets a ~40 kbps tax slapped on top of
the codec bit rate.
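To see how heavily the tax falls on low-bit-rate codecs, here is a minimal sketch that adds the per-channel overhead to each codec and reports what share of the call is overhead. The overhead values are the ones quoted above, with 40 kbps used as a rough midpoint for the VPN case. The dictionary and function names are made up for this illustration.

```python
# Per-channel overhead figures quoted in the text, in kbps. The VPN value is
# the rough ~40 kbps midpoint of the 35 to 45 kbps range (an assumption).
OVERHEAD_KBPS = {"frame relay / T1": 18.8, "IPSEC VPN": 40.0}
CODEC_KBPS = {"G.729": 8, "G.722.1": 32, "G.711": 64}


def per_call_kbps(codec_kbps: float, overhead_kbps: float) -> float:
    """Total bit rate of one call: codec payload plus the overhead tax."""
    return codec_kbps + overhead_kbps


def overhead_share(codec_kbps: float, overhead_kbps: float) -> float:
    """Fraction of the call's total bit rate that is overhead (0.0 to 1.0)."""
    return overhead_kbps / per_call_kbps(codec_kbps, overhead_kbps)


if __name__ == "__main__":
    for link, overhead in OVERHEAD_KBPS.items():
        for codec, rate in CODEC_KBPS.items():
            total = per_call_kbps(rate, overhead)
            share = overhead_share(rate, overhead)
            print(f"{link:17} {codec:8} {total:6.1f} kbps total, {share:.0%} overhead")
```

Under these figures, overhead makes up roughly 70 percent of a G.729 call on frame relay but only about 23 percent of a G.711 call.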
Dealing with packet overhead
There are two main ways of dealing with packet overhead, depending on the
network topology. We can try to reduce the size of the packet headers with
compression or we can trunk multiple voice streams into a single packet. Packet header compression can reduce the IP/UDP/RTP header by a factor of 10 to
20, but it works only with frame relay or MP (multilink point-to-point
protocol) connections, and both sides of the connection must be configured for
it. Packet header compression is the ideal solution whenever it can be
used.
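As a rough illustration of what a 10-to-20-fold reduction means, the sketch below applies that factor to the 40-byte IP/UDP/RTP header at 50 packets per second. The function name is hypothetical, and the calculation covers only the IP/UDP/RTP header. It leaves out any link-layer framing, so it is not meant to match per-channel overhead figures that include framing.

```python
def compressed_header_kbps(
    header_bytes: float = 40,
    compression_factor: float = 10,
    packets_per_second: int = 50,
) -> float:
    """Header bit rate in kbps after shrinking the header by compression_factor."""
    compressed_bytes = header_bytes / compression_factor
    return compressed_bytes * 8 * packets_per_second / 1000


if __name__ == "__main__":
    # A factor of 1 means no compression at all.
    for factor in (1, 10, 20):
        print(f"factor {factor:2}: {compressed_header_kbps(compression_factor=factor):.1f} kbps")
```

With these assumptions the 16 kbps of header traffic per call drops to between 0.8 and 1.6 kbps.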
MPLS connections can’t use header compression because they aren’t point-to-point connections and must go over a large virtual cloud through multiple
carriers. MPLS does have the advantage of a mesh topology, where any site
can talk to any other site directly without relaying through some central hub
site. This can reduce latency in some situations. VoIP calls over the
Internet also can’t use packet header compression. Site-to-site IPSEC VPN
connections not only can’t use packet header compression, they exacerbate the
situation by doubling the overhead. The only thing we can do for MPLS and
VPN WAN topologies is use VoIP packet trunking.
VoIP trunking is a method of taking multiple VoIP streams and merging them
into a single stream sharing a single packet header. So instead of being
multiplied by the number of channels, the packet overhead becomes a one-time
fixed cost for any number of channels with the same source and destination.
This can be done only by using an open source
Asterisk PBX gateway at each remote site.
This method works even
if you’re using a proprietary VoIP PBX and IP phones, so long as they can speak
the SIP or H.323 protocols. Asterisk can translate them to its own
efficient IAX2 protocol, which supports channel trunking, and retranslate to SIP
or H.323 once it emerges from the limited bandwidth connection. VoIP
trunking is the ideal solution whenever header compression can’t be used. In future articles, we’ll look at how to implement these solutions.
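To make the difference concrete before getting to implementation, here is a minimal sketch of the bandwidth arithmetic behind trunking. It follows the description above and treats the shared packet overhead as a single fixed cost, ignoring any small per-call framing inside the trunk, which is a simplifying assumption. The function names are illustrative.

```python
def unicast_kbps(channels: int, codec_kbps: float, overhead_kbps: float) -> float:
    """Each call is its own packet stream, so overhead is paid once per call."""
    return channels * (codec_kbps + overhead_kbps)


def trunked_kbps(channels: int, codec_kbps: float, overhead_kbps: float) -> float:
    """All calls share one packet stream, so overhead is paid once in total."""
    if channels == 0:
        return 0.0
    return overhead_kbps + channels * codec_kbps


if __name__ == "__main__":
    # Eight G.729 calls with 18.8 kbps of per-stream overhead.
    print(f"unicast: {unicast_kbps(8, 8, 18.8):.1f} kbps")
    print(f"trunked: {trunked_kbps(8, 8, 18.8):.1f} kbps")
```

For a single channel the two functions return the same value, since there is nothing to share. The gap widens with every call added to the trunk.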
Case studies and examples
Let’s consider an example in which the WAN topology consists mostly of 64 to 256
kbps frame relay or MP connections. In this scenario, we must primarily
use narrowband G.729 8 kbps codecs, and header compression is essential. We can support a single high-end Polycom HD Voice conference phone using the
G.722.1 Siren7 codec at 32 kbps, which is justifiable when an important meeting
is taking place. Here is a breakdown of how much bandwidth a given
number of channels requires.
Frame relay or MP WAN topology with header compression
|Channels||Overhead||Codec bit rate||Trunked bit rate||Unicast bit rate|
|4||3.6 kbps||G.729 (8 kbps)||35.6 kbps||46.4 kbps|
|8||3.6 kbps||G.729 (8 kbps)||67.6 kbps||92.8 kbps|
|16||3.6 kbps||G.729 (8 kbps)||131.6 kbps||185.6 kbps|
|1||3.6 kbps||G.722.1 (32 kbps)||35.6 kbps||35.6 kbps|
Although the savings from trunking here may not seem like enough to justify
building an Asterisk gateway, we can use that same Linux appliance as a
transparent proxy and HTTP content compressor. This allows us to get a lot more
out of our WAN connection. We can sometimes reduce the data bandwidth
consumption by a factor of 4 to 10.
Now, let’s see what happens in an MPLS WAN or Internet topology.
MPLS WAN or Internet topology
|Channels||Overhead||Codec bit rate||Trunked bit rate||Unicast bit rate|
|8||18.8 kbps||G.729 (8 kbps)||82.8 kbps||214.4 kbps|
|8||18.8 kbps||G.711 (64 kbps)||530.8 kbps||662.4 kbps|
|8||18.8 kbps||G.722.1 (32 kbps)||274.8 kbps||406.4 kbps|
|8||18.8 kbps||G.722 (64 kbps)||530.8 kbps||662.4 kbps|
As you can see, VoIP trunking is the only way to control bandwidth usage
for the 8 and 32 kbps codecs, and an Asterisk PBX gateway is the only way to
solve this problem.
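The size of the benefit is easier to judge as a percentage. The sketch below takes the trunked and unicast figures straight from the MPLS/Internet table above and computes the fraction of bandwidth that trunking saves for each codec. The names are illustrative.

```python
# Rows copied from the MPLS/Internet table: (codec, trunked kbps, unicast kbps)
# for 8 channels with 18.8 kbps of overhead.
MPLS_ROWS = [
    ("G.729", 82.8, 214.4),
    ("G.711", 530.8, 662.4),
    ("G.722.1", 274.8, 406.4),
    ("G.722", 530.8, 662.4),
]


def trunking_savings(trunked: float, unicast: float) -> float:
    """Fraction of unicast bandwidth saved by trunking (0.0 to 1.0)."""
    return 1 - trunked / unicast


if __name__ == "__main__":
    for codec, trunked, unicast in MPLS_ROWS:
        print(f"{codec:8} saves {trunking_savings(trunked, unicast):.0%}")
```

With these figures, trunking cuts G.729 bandwidth by about 61 percent and G.722.1 by about 32 percent, but G.711 and G.722 by only about 20 percent.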
Finally, let’s see what happens in an IPSEC VPN site-to-site WAN topology. Note that we’re using only an IPSEC tunnel, with no GRE on top, so that we don’t
increase the packet overhead any further.
IPSEC VPN site-to-site WAN topology
|Channels||Overhead||Codec bit rate||Trunked bit rate||Unicast bit rate|
|8||34.8 kbps||G.729 (8 kbps)||98.8 kbps||342.4 kbps|
|8||34.8 kbps||G.711 (64 kbps)||546.8 kbps||790.4 kbps|
|8||34.8 kbps||G.722.1 (32 kbps)||290.8 kbps||534.4 kbps|
|8||34.8 kbps||G.722 (64 kbps)||546.8 kbps||790.4 kbps|
IPSEC packet overhead is the most severe of all the topologies, but it can be
brought under control with VoIP trunking. IPSEC connections are the
cheapest of all WAN topologies because inexpensive high-speed broadband
connections (less than $100 per month) can be used. Relatively
cheap ISDN (MP 128 kbps) or MPLS connections can be used to provide reasonably
cheap redundancy, but none of this will be practical if we ignore the packet
overhead tax and use the wrong VoIP codecs.
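The same arithmetic can be turned around for capacity planning: given a voice budget on a link, how many calls fit? This is a minimal sketch under the simple overhead model used in the tables. The 256 kbps voice budget in the example is a hypothetical figure, not one from this article, and the function names are illustrative.

```python
import math


def max_unicast_channels(link_kbps: float, codec_kbps: float, overhead_kbps: float) -> int:
    """Calls that fit when every call pays its own packet overhead."""
    return math.floor(link_kbps / (codec_kbps + overhead_kbps))


def max_trunked_channels(link_kbps: float, codec_kbps: float, overhead_kbps: float) -> int:
    """Calls that fit when one shared overhead is paid for the whole trunk."""
    if link_kbps <= overhead_kbps:
        return 0  # the budget cannot even cover the shared overhead
    return math.floor((link_kbps - overhead_kbps) / codec_kbps)


if __name__ == "__main__":
    budget = 256  # hypothetical voice budget in kbps on an IPSEC VPN link
    print(max_unicast_channels(budget, 8, 34.8))  # G.729, unicast
    print(max_trunked_channels(budget, 8, 34.8))  # G.729, trunked
```

Under these assumptions the budget holds 5 unicast G.729 calls over the VPN but 27 trunked ones.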
If you want to run your own calculations, the download version of this article includes an
Excel template you can use.