SIP stands for Session Initiation Protocol.
The protocol has been designed with easy implementation, good scalability, and flexibility in mind.
The protocol is used for creating, modifying and terminating sessions with one or more participants. By a session, we understand a set of senders and receivers that communicate and the state kept in those senders and receivers during the communication. Examples of sessions include Internet telephone calls, distribution of multimedia, multimedia conferences, distributed computer games, etc.
Two protocols that are most often used along with SIP are RTP and SDP. RTP is used to carry the real-time multimedia data (including audio, video and text). The protocol makes it possible to encode and split the data into packets and transport these packets over the Internet.
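The packetisation step described above can be sketched as follows. This is a minimal illustration, not a complete RTP implementation: the function name `rtp_packets`, the SSRC value and the 160-byte chunk size (20 ms of 8 kHz, 8-bit audio) are assumptions chosen for the example, and a real sender would pick a random initial sequence number and timestamp.

```python
import struct

def rtp_packets(payload, payload_type=0, ssrc=0x1234ABCD, chunk=160, first_seq=0):
    """Yield RTP packets (12-byte fixed header + data) for the given payload bytes.

    Illustrative only: no padding, header extension, CSRC list or marker bit.
    """
    for offset in range(0, len(payload), chunk):
        index = offset // chunk
        seq = (first_seq + index) & 0xFFFF        # 16-bit sequence number
        timestamp = offset & 0xFFFFFFFF           # one tick per byte, assuming 8-bit samples
        header = struct.pack(
            "!BBHII",
            0x80,                  # version 2, no padding, no extension, zero CSRCs
            payload_type & 0x7F,   # marker bit clear, 7-bit payload type
            seq,
            timestamp,
            ssrc,
        )
        yield header + payload[offset:offset + chunk]

# 400 bytes of silence are split into packets of 160, 160 and 80 payload bytes.
packets = list(rtp_packets(bytes(400)))
```

Each packet could then be sent in its own UDP datagram; the receiver uses the sequence number and timestamp to reorder and play out the data.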
Another important protocol is SDP, Session Description Protocol, which is used to describe and encode capabilities of session participants.
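To give a rough idea of what such a description looks like, the sketch below builds a minimal SDP body offering a single audio stream. The helper name `build_sdp`, the user name, address, port and payload type list are all illustrative assumptions; a real user agent would fill them in from its own configuration and codec list.

```python
def build_sdp(username, ip, audio_port, payload_types):
    """Return a minimal SDP body offering one RTP audio stream (illustrative values)."""
    formats = " ".join(str(pt) for pt in payload_types)
    lines = [
        "v=0",                                       # SDP version
        f"o={username} 0 0 IN IP4 {ip}",             # origin: user, session id, version, address
        "s=-",                                       # session name (not used here)
        f"c=IN IP4 {ip}",                            # address where media is expected
        "t=0 0",                                     # session is not time-bounded
        f"m=audio {audio_port} RTP/AVP {formats}",   # media type, port, transport, formats
    ]
    # Every SDP line ends with CRLF.
    return "\r\n".join(lines) + "\r\n"

# Offer audio on port 49170 with static payload types 0 (PCMU) and 8 (PCMA).
sdp = build_sdp("alice", "192.0.2.10", 49170, [0, 8])
```

The `m=` line is where a participant lists the media formats it is able to receive, which is how capabilities are conveyed to the other side.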
SIP has been designed in conformance with the Internet model. It is an end-to-end oriented signalling protocol which means that all the logic is stored in end-devices (except routing of SIP messages).
The end-to-end concept of SIP is a significant divergence from a regular PSTN (Public Switched Telephone Network), where all the state and logic is stored in the network and the end-devices (telephones) are very primitive. The aim of SIP is to provide the same functionality that the traditional PSTNs have, but the end-to-end design makes SIP networks much more powerful and open to the implementation of new services that can hardly be implemented in the traditional PSTNs.
SIP is based on the HTTP protocol.
HTTP is probably the most successful and widely used protocol on the Internet.
SIP tries to combine the best of both. SIP is used to carry the description of session parameters. The description is encoded into a document using SDP. Both protocols (HTTP and SIP) have inherited the encoding of message headers from RFC822. The encoding has proven to be robust and flexible over the years.
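Because of the RFC822-style header encoding, a SIP message can be taken apart with very little code. The following sketch assumes a simplified message: the function name `parse_sip_message` is hypothetical, the sample request is abbreviated (several headers a real request would carry are left out), and folded or repeated headers are not handled.

```python
def parse_sip_message(raw):
    """Split a SIP message into start line, header dict and body.

    Simplified: header names are lower-cased, repeated headers overwrite
    each other, and folded (multi-line) headers are not supported.
    """
    head, _, body = raw.partition("\r\n\r\n")   # an empty line separates headers from body
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")     # "Name: value", as in RFC822
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

# Abbreviated example request; addresses and identifiers are made up.
raw = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Via: SIP/2.0/UDP client.example.org;branch=z9hG4bK74bf9\r\n"
    "From: <sip:alice@example.org>;tag=1928301774\r\n"
    "To: <sip:bob@example.com>\r\n"
    "Call-ID: a84b4c76e66710\r\n"
    "CSeq: 1 INVITE\r\n"
    "Content-Type: application/sdp\r\n"
    "\r\n"
    "v=0\r\n"
)
start_line, headers, body = parse_sip_message(raw)
```

The body of this request is where the SDP document describing the session would be carried.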
SIP entities are identified using SIP URIs (Uniform Resource Identifiers). A SIP URI consists of a username part and a domain name part, delimited by the @ character. SIP URIs are similar to e-mail addresses and it is, for instance, possible to use the same address for e-mail and SIP communication. Such URIs are easy to remember.
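As a small illustration of this structure, the sketch below splits a SIP URI into its username and domain parts. The function name `parse_sip_uri` is an assumption for this example, and the parsing is deliberately simplified: URI parameters, ports and passwords are not handled.

```python
def parse_sip_uri(uri):
    """Return (username, domain) for a simple sip: or sips: URI."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri!r}")
    # Split on the last "@": the part before it is the user, the rest the domain.
    user, _, domain = rest.rpartition("@")
    return (user or None), domain

user, domain = parse_sip_uri("sip:alice@example.org")
# The same user@domain pair can double as an e-mail address.
email_like = f"{user}@{domain}"
```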
A SIP network contains more than one type of SIP element. The basic SIP elements are user agents, proxies, registrars and redirect servers.
Internet endpoints that use SIP to find each other and to negotiate a session’s characteristics are called user agents. User agents usually, but not necessarily, reside on a user's computer in the form of an application. This is currently the most widely used approach, but user agents can also be cellular phones, PSTN gateways, PDAs, automated IVR systems and so on.
SIP allows the creation of an infrastructure of network hosts called proxy servers. User agents can send messages to a proxy server. Proxy servers are very important entities in the SIP infrastructure. They perform routing of session invitations according to the invitee's current location, authentication, accounting and many other important functions.
Proxy servers can be either stateless or stateful. Stateless proxies are simple message forwarders. They forward messages independently of each other.
Stateful proxies are more complex. Upon reception of a request, stateful proxies create a state and keep the state until the transaction finishes.
Most SIP proxies today are stateful because their configuration is usually very complex. They often perform accounting, forking and some sort of NAT traversal aid, and all those features require a stateful proxy.
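The difference between the two kinds of proxy can be sketched as follows. This is a toy model under stated assumptions, not a real proxy: messages are plain dictionaries with hypothetical keys `kind`, `branch` and `status`, the `route` callable stands in for whatever routing logic a proxy uses, and timers, retransmissions and forking are left out.

```python
class StatelessProxy:
    """Forwards every message on its own and remembers nothing."""

    def __init__(self, route):
        self.route = route  # hypothetical callable: message -> next hop

    def handle(self, message):
        return self.route(message)


class StatefulProxy(StatelessProxy):
    """Creates state when a request arrives and drops it when the transaction ends."""

    def __init__(self, route):
        super().__init__(route)
        self.transactions = {}  # transaction key -> request and responses seen so far

    def handle(self, message):
        key = message["branch"]
        if message["kind"] == "request":
            self.transactions[key] = {"request": message, "responses": []}
        elif key in self.transactions:
            self.transactions[key]["responses"].append(message)
            # Status codes of 200 and above are final and finish the transaction.
            if message["status"] >= 200:
                del self.transactions[key]
        return self.route(message)


proxy = StatefulProxy(lambda message: "next-hop.example.com")
proxy.handle({"kind": "request", "branch": "z9hG4bK1", "method": "INVITE"})
proxy.handle({"kind": "response", "branch": "z9hG4bK1", "status": 180})
pending_after_ringing = len(proxy.transactions)   # state is still kept
proxy.handle({"kind": "response", "branch": "z9hG4bK1", "status": 200})
pending_after_final = len(proxy.transactions)     # state has been released
```

Features such as accounting or forking need to correlate a response with the request that caused it, which is exactly the information the stateless variant never stores.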
Communication using SIP (often called signalling) consists of a series of messages. Messages can be transported independently by the network. Usually each of them is transported in a separate UDP datagram.
Although SIP messages are sent independently over the network, they are usually arranged into transactions by user agents and certain types of proxy servers. Therefore SIP is said to be a transactional protocol.
A transaction is a sequence of SIP messages exchanged between SIP network elements.
A transaction consists of one request and all responses to that request.
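A minimal sketch of this grouping is shown below. It assumes messages have already been parsed into dictionaries with the hypothetical keys `branch` (the branch parameter of the topmost Via header), `cseq_method` and, for responses only, `status`; the function name `group_into_transactions` is likewise an assumption made for the example.

```python
from collections import defaultdict

def group_into_transactions(messages):
    """Group messages into transactions: one request plus all of its responses."""
    transactions = defaultdict(lambda: {"request": None, "responses": []})
    for msg in messages:
        # A response belongs to the request with the same branch and CSeq method.
        key = (msg["branch"], msg["cseq_method"])
        if "status" in msg:
            transactions[key]["responses"].append(msg["status"])
        else:
            transactions[key]["request"] = msg["cseq_method"]
    return dict(transactions)

# A simplified call: an INVITE transaction followed later by a BYE transaction.
flow = [
    {"branch": "z9hG4bKa", "cseq_method": "INVITE"},
    {"branch": "z9hG4bKa", "cseq_method": "INVITE", "status": 100},
    {"branch": "z9hG4bKa", "cseq_method": "INVITE", "status": 180},
    {"branch": "z9hG4bKa", "cseq_method": "INVITE", "status": 200},
    {"branch": "z9hG4bKb", "cseq_method": "BYE"},
    {"branch": "z9hG4bKb", "cseq_method": "BYE", "status": 200},
]
transactions = group_into_transactions(flow)
```

Here the six independently transported messages collapse into two transactions, one per request.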
In a traditional telephone network, the infrastructure consists of large telephone switches which interconnect with each other to create the backbone network and which also connect to customer equipment (PBXs, telephones). While the internal network today is based upon digital communication, links to customers may be either analogue (PSTN) or digital (ISDN).
A similar construction is now considered by a number of telecom companies for IP-based backbone networks that may successively replace parts of their overall switched-network infrastructure.
Two types of gateways are used at the edges of the IP network to connect to the conventional telephone network: signalling gateways, which convert SS7 signalling into IP-based call control (which may make use of H.323 or SIP, or simply provide a transport to carry SS7 signalling in IP packets [SIGTRAN]), and media gateways, which perform voice transcoding. Some central entity (or more probably, a number of co-operating entities) forms the intelligent core of the backbone, the Media Gateway Controller.
A number of protocols have been defined for communication between Media Gateway Controllers and media gateways. Initial versions were developed by multiple camps, some of which merged to create the Media Gateway Control Protocol (MGCP).
One particular protocol extension currently discussed in the IETF is the definition of a protocol for communication with an IP telephone at the customer premises that fits seamlessly with the Media Gateway Control architecture. Such a telephone would be a rather simple entity, essentially capable of transmitting and receiving events and reacting to them, while the call services are provided directly by the network infrastructure.