You know the basics of how Voice over IP (VoIP) works: Voice signals are converted to digital data, which is broken into packets that can be sent over the Internet or any TCP/IP network. But you may be confused by all of the protocols you hear about in connection with VoIP. What are the differences among them? How do they interact? Why are there so many? Today we'll take a look at common protocols used in VoIP communications.
Call signaling protocols
The most frequently referenced VoIP protocols are the call signaling protocols. VoIP networks use these protocols to locate the device at the other end of the communication, and then negotiate the exchange between the sending and receiving devices.
The two most often used call signaling protocols are the Session Initiation Protocol (SIP) and H.323.
These two protocols basically do the same thing, and most VoIP devices use one or the other. Under the hood, they establish a VoIP connection in different ways: SIP is ASCII-based and H.323 is binary-based. Although H.323 was by far the more popular at first, and many feel it is superior in its ability to work with the Public Switched Telephone Network (PSTN) and to transmit video, SIP has become increasingly popular because many VoIP vendors support it in their devices. Many users also find SIP to be easier to deploy.
SIP is an application layer protocol that provides a means for identification of the calling and called parties, authentication of the caller and recipient, and forwarding of calls. SIP addresses identify the caller and recipient much as phone numbers do on the PSTN, but they look a little like e-mail addresses; the format is sip:userID@gateway.com. Users register their addresses with SIP servers called registrars, and the caller sends a SIP request to the server. SIP messages can be sent over either TCP or User Datagram Protocol (UDP).
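Because SIP is text-based, both the address format and a registration message can be sketched in a few lines. The example below is a simplified illustration, not a complete SIP implementation: the function names and the default Call-ID value are hypothetical, and a real REGISTER request carries additional headers (such as Via and Contact) that are omitted here.

```python
from dataclasses import dataclass


@dataclass
class SipAddress:
    user: str
    host: str


def parse_sip_address(address: str) -> SipAddress:
    """Split a simple 'sip:user@host' address into its user and host parts."""
    scheme, colon, rest = address.partition(":")
    if colon != ":" or scheme.lower() != "sip":
        raise ValueError(f"not a SIP address: {address!r}")
    user, at, host = rest.partition("@")
    if not at or not user or not host:
        raise ValueError(f"expected sip:user@host, got {address!r}")
    return SipAddress(user=user, host=host)


def build_register_request(address: str, call_id: str = "example-call-id") -> str:
    """Build a bare-bones, illustrative REGISTER request as plain text.

    Assumption: only a handful of headers are shown; this is not a
    request that a real registrar would necessarily accept.
    """
    addr = parse_sip_address(address)
    lines = [
        f"REGISTER sip:{addr.host} SIP/2.0",
        f"To: <sip:{addr.user}@{addr.host}>",
        f"From: <sip:{addr.user}@{addr.host}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 REGISTER",
        "Content-Length: 0",
    ]
    # Header lines end with CRLF, and an empty line closes the header section.
    return "\r\n".join(lines) + "\r\n\r\n"


if __name__ == "__main__":
    print(build_register_request("sip:userID@gateway.com"))
```

Printing the result shows why SIP is often described as easy to read and debug: the whole message is human-readable text.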
You can put links to SIP addresses in a Web page or other HTML document so users can click them to place a voice call. For a detailed discussion of how SIP works, see http://www.protocols.com/pbook/VoIPFamily.htm#SIP.
H.323 is a suite of many different protocols that work together, each performing a specific task.
For a complete list of the H.323 protocols and to see what each one does, see http://www.protocols.com/pbook/h323.htm.
A gateway, in its generic sense, is a device that provides an interface between two types of networks. A VoIP gateway connects an IP-based network to the PSTN or to a regular analog phone. VoIP gateways have two parts:
Another set of protocols, called device control protocols, separates the call control logic from the media processing logic in VoIP gateways. Examples of these protocols include the Media Gateway Control Protocol (MGCP) and Megaco/H.248.
Request for Comments (RFC) 3435 defines MGCP. The protocol relies on a call agent, also called the media gateway controller (MGC), that directs and controls the media gateway (MG) and signaling gateway. Using multiple call agents provides fault tolerance. The MGC uses MGCP to find the locations and capabilities of the VoIP endpoints.
The IETF uses the name Megaco, and the ITU uses H.248 to refer to the same protocol. The two organizations developed the protocol through a joint effort. This outgrowth of MGCP is designed to provide remote control of VoIP gateways and other session-aware devices. MGCP and Megaco are similar, but Megaco supports more types of networks, including ATM networks.
VoIP networks built on a centralized architecture typically use Megaco and MGCP; the MGC/call agent is the centralized device that communicates with the media gateways. Networks that rely on a distributed architecture use SIP and H.323.
For more information about MGCP and Megaco and to learn how they work, read this article.
Real-time Transport Protocol (RTP) and related protocols
Once the MG extracts the voice signal from the PSTN circuit, RTP carries it across the TCP/IP network. RTP is a standard for transmitting audio and video over IP networks. It is defined in RFC 3550 and works in conjunction with SIP or H.323. A VoIP call uses two RTP streams, one going in each direction.
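To make the RTP stream a little more concrete, here is a minimal sketch that decodes the 12-byte fixed header that RFC 3550 places at the start of every RTP packet. The class and function names are hypothetical, and the sketch ignores CSRC entries and header extensions that may follow the fixed header.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    padding: bool
    extension: bool
    csrc_count: int
    marker: bool
    payload_type: int
    sequence_number: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Decode the 12-byte fixed RTP header (network byte order)."""
    if len(packet) < 12:
        raise ValueError("RTP packet is shorter than the 12-byte fixed header")
    # Two flag bytes, 16-bit sequence number, 32-bit timestamp, 32-bit SSRC.
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=byte0 >> 6,            # top two bits
        padding=bool(byte0 & 0x20),
        extension=bool(byte0 & 0x10),
        csrc_count=byte0 & 0x0F,       # low four bits
        marker=bool(byte1 & 0x80),
        payload_type=byte1 & 0x7F,     # low seven bits
        sequence_number=seq,
        timestamp=timestamp,
        ssrc=ssrc,
    )


if __name__ == "__main__":
    # A hand-built sample header: version 2, payload type 0, sequence number 1.
    sample = struct.pack("!BBHII", 0x80, 0, 1, 160, 0x1234)
    print(parse_rtp_header(sample))
```

The sequence number and timestamp fields are what allow the receiving end to put packets back in order and play the audio at the right pace.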
RTP typically uses high-numbered ports (16384-32767), but there is no standard port for RTP communications. RTP itself also doesn't provide for Quality of Service (QoS). Instead, it works with the RTP Control Protocol (RTCP): RTP handles the transmission of the data itself, and RTCP provides the control information for the RTP communications. RTCP can collect information (packets sent, packets lost, etc.) to report QoS issues.
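The kind of bookkeeping behind those reports can be sketched with simple arithmetic: a receiver compares how many packets it should have seen, based on RTP sequence numbers, with how many actually arrived. The function below is a simplified illustration with hypothetical names. It assumes the sequence numbers have not wrapped around, which a real implementation would need to handle.

```python
def rtp_loss_report(first_seq: int, highest_seq: int, packets_received: int) -> dict:
    """Estimate packet loss for one RTP stream from sequence numbers.

    Assumption: no 16-bit sequence number wraparound occurred.
    """
    if highest_seq < first_seq:
        raise ValueError("highest_seq must not be smaller than first_seq")
    # Every sequence number in the range should correspond to one packet.
    expected = highest_seq - first_seq + 1
    # Duplicates can push the received count above expected; clamp at zero.
    lost = max(expected - packets_received, 0)
    return {
        "expected": expected,
        "lost": lost,
        "loss_percent": 100.0 * lost / expected,
    }


if __name__ == "__main__":
    print(rtp_loss_report(first_seq=100, highest_seq=199, packets_received=95))
```

A monitoring tool could raise a QoS warning when the loss percentage crosses a threshold of its choosing.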
The Secure Real-time Transport Protocol (SRTP) provides encryption, authentication, and integrity for RTP data. Secure RTCP (SRTCP) provides the same security services to RTCP. SRTP and SRTCP use the Advanced Encryption Standard (formerly known as Rijndael), which has been adopted by the U.S. government to replace the Data Encryption Standard.
Not all VoIP implementations use the standard protocols. Skype and some other VoIP services use proprietary protocols. The Skype protocols operate in a peer-to-peer setup instead of the client-server configuration used by most VoIP clients. Because Skype's code is closed source, it's difficult to get information about its protocols and how they work.
You may also hear about Skinny, or Skinny Client Control Protocol (SCCP), which is a proprietary protocol used by Cisco for communications between its Call Manager (an H.323 proxy) and its VoIP phones. The H.323 proxy uses SCCP to communicate with Skinny clients.
It's easy to get confused when trying to sort out the maze of protocols used for VoIP communications, but understanding the protocols is the first step toward understanding how VoIP works—and what implementation will work best for your organization.
Debra Littlejohn Shinder, MCSE, MVP is a technology consultant, trainer, and writer who has authored a number of books on computer operating systems, networking, and security. Deb is a tech editor, developmental editor, and contributor to over 20 additional books on subjects such as the Windows 2000 and Windows 2003 MCSE exams, CompTIA Security+ exam, and TruSecure's ICSA certification.