WebRTC Integrator's Guide
Successfully build your very own scalable WebRTC
infrastructure quickly and efciently
Copyright © 2014 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval
system, or transmitted in any form or by any means, without the prior written
permission of the publisher, except in the case of brief quotations embedded in
critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy
of the information presented. However, the information contained in this book is
sold without warranty, either express or implied. Neither the author, nor Packt
Publishing, and its dealers and distributors will be held liable for any damages
caused or alleged to be caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the
companies and products mentioned in this book by the appropriate use of capitals.
However, Packt Publishing cannot guarantee the accuracy of this information.
First published: October 2014
Production reference: 1251014
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham B3 2PB, UK.
ISBN 978-1-78398-126-7
Cover image by Gagandeep Sharma (er.gagansharma@gmail.com)
Alessandro Arrichiello
Pasquale Boemio
Antón Román Portabales
Andrii Sergiienko
Commissioning Editor
Usha Iyer
Acquisition Editor
Llewellyn Rozario
Content Development Editor
Akashdeep Kundu
Technical Editor
Menza Mathew
Copy Editors
Karuna Narayanan
Laxmi Subramanian
Project Coordinator
Neha Thakur
Jenny Blake
Stephen Copestake
Maria Gould
Joel T. Johnson
Hemangini Bari
Mariammal Chettiyar
Rekha Nair
Ronak Dhruv
Valentina D'silva
Disha Haria
Abhinash Sahu
Production Coordinators
Adonia Jones
Nitesh Thakur
Cover Work
Nitesh Thakur
About the Author
Altanai, born into an Indian army family, is a bubbly, vivacious, intelligent
computer geek. She is an avid blogger and writes on Research and Development
of evolving technologies in Telecom (http://altanaitelecom.wordpress.com).
She holds a Bachelor's degree in Information Technology from Anna University,
Chennai. She has worked on many Telecom projects worldwide, specifically in the
development and deployment of IMS services. She firmly believes in contributing to
the Open Source community and is currently working on building a WebRTC-based
JS library with books for more applications.
Her hobbies include photography, martial arts, oil canvas painting, river rafting,
horse riding, and trekking, to name a few.
This is her first book, and it contains useful insight into WebRTC for beginners and
integrators in this field. The book has definitions and explanations that cover
many interesting concepts in a clear manner.
Altanai can be contacted at tara181989@gmail.com.
About the Reviewers
Alessandro Arrichiello is a computer enthusiast. He graduated in Computer
Engineering from the University of Naples Federico II, Italy.
He has a passion for and knowledge of GNU/Linux systems that began at the age
of 14 and continues today. He is an independent Android developer, who develops
apps for Google Play Store, and has strong knowledge of C++, Java, and other
derivatives. He also has experience with many other interpreted languages such
as Perl, PHP, and Python.
Alessandro is a proud open source supporter and has given his contribution to
many collaborative projects developed for academic purposes.
Recently, he enriched his knowledge on Network Monitoring, focusing on
Penetration Testing and Network Security in general.
At the moment, Alessandro is working as a software engineer in the
Communications and Media Solution group of Hewlett Packard in Milan, Italy.
He's involved in many business projects as a developer and technology consultant.
Alessandro has worked as a reviewer and author for Packt Publishing. He has
technically reviewed the book, WebRTC Blueprints, and now, he's working on a
video course on developing an application using the WebRTC technology.
Pasquale Boemio fell in love with Linux and the open source philosophy
at the age of 12. He has a Master's degree in Computer Engineering, and he
works as a researcher at the Computer Engineering department of the University
of Naples Federico II, Italy. At the same time, he collaborates with Meetecho
(www.meetecho.com), experimenting with a large number of innovative technologies
such as WebRTC, Docker, and Node.js.
Even though Pasquale is involved in such activities, he still releases free software on
GitHub (www.github.com/helloIAmPau).
Antón Román Portabales is the CTO of Quobis. After graduating as a
telecommunications engineer, he began working in Motorola as an IMS developer.
In 2008, he left Motorola to join Quobis, a Spanish company focused on SIP
interconnection. It works for major operators and companies in Europe and South
America. In 2010, he nished a Pre-PhD program in Telematics Engineering as the
main author of a paper about the use of IMS networks to transmit real-time data
from the electrical grid; he presented this paper in an IEEE conference in 2011.
He has been actively working on WebRTC since 2012, when Quobis decided to focus
on this technology. He has recently got involved in the activities of IETF, along with
other colleagues from Quobis. He also frequently participates in VoIP-related open
source events.
Andrii Sergiienko is an entrepreneur who's passionate about IT and also
about travelling. He has lived in different places, such as Ukraine, Russia, Belarus,
Mongolia, Buryatia, and Siberia, spending a considerable number of years in every
place. He also likes to travel by an auto rickshaw.
From his early childhood, Andrii was interested in computer programming and
hardware. He took the rst steps in this eld more than 20 years ago. Andrii has
experience in a wide set of languages and technologies, including C, C++, Java,
Assembler, Erlang, JavaScript, PHP, Riak, shell scripting, computer networks,
security, and so on.
During his career, Andrii has worked for both small, local companies, such as a
domestic ISP, and large world corporations, such as Hewlett Packard. He also
started his own companies; some of them were relatively successful, while others
were a total failure.
Today, Andrii is working on growing Oslikas, his company, headquartered
in Estonia. The company is focused on modern IT technologies and solutions.
They also develop a full-stack framework to create rich media WebRTC
applications and services. You can nd them at http://www.oslikas.com.
Support les, eBooks, discount offers, and more
You might want to visit www.PacktPub.com for support les and downloads related
to your book.
Did you know that Packt offers eBook versions of every book published, with PDF and
ePub les available? You can upgrade to the eBook version at www.PacktPub.com and
as a print book customer, you are entitled to a discount on the eBook copy. Get in touch
with us at service@packtpub.com for more details.
At www.PacktPub.com, you can also read a collection of free technical articles, sign up
for a range of free newsletters and receive exclusive discounts and offers on Packt books
and eBooks.
Do you need instant solutions to your IT questions? PacktLib is Packt's online digital
book library. Here, you can access, read and search across Packt's entire library of books.
Why subscribe?
Fully searchable across every book published by Packt
Copy and paste, print and bookmark content
On demand and accessible via web browser
Free access for Packt account holders
If you have an account with Packt at www.PacktPub.com, you can use this to access
PacktLib today and view nine entirely free books. Simply use your login credentials for
immediate access.
Table of Contents
Preface
Chapter 1: Running WebRTC with and without SIP
JavaScript Session Establishment Protocol (JSEP)
Signal and media flows
Running WebRTC without SIP
Sending media over WebSockets
getUserMedia
RTCPeerConnection
RTCDataChannel
Media traversal in WebRTC clients
WebRTC through WebSocket signaling servers
Node.js
Making a peer-to-peer audio call using Node.js for signaling
Running WebRTC with SIP
Session Initiation Protocol (SIP)
JavaScript-based SIP libraries
Summary
Chapter 2: Making a Standalone WebRTC Communication Client
Description of the WebRTC client-server model
The sipML5 WebRTC client
Developing a minified webphone application using Tomcat
Developing our customized version of the sipML5 client
The jsSIP WebRTC client
Developing our version of the jsSIP client
SIP servers
OfficeSIP
SIP WS to SIP and vice-versa
The gateway to convert SIP over WebSocket to native SIP
The WebRTC2SIP gateway
The WebRTC client with Brekeke SIP server
The WebRTC client with the Kamailio SIP server
Limitations of the existing setup
Firewall and NAT issues
Media transcoding
Summary
Chapter 3: WebRTC with SIP and IMS
The Interaction with core IMS nodes
The Call Session Control Function
Home Subscriber System
The IP Multimedia Subsystem core
The OpenIMS Core
The Telecom server
The Mobicents Telecom Application Server
The Media Server
The FreeSWITCH Media Server
Media Services
WebRTC over firewalls and proxies
The final architecture for the WebRTC-to-IMS integration
Summary
Chapter 4: WebRTC Integration with Intelligent Network
From mobiles to WebRTC client through GPRS
IMS connectivity to Gateway GPRS Support Node
From mobiles to WebRTC client through GSM
Call processed with the IN service logic
The WebRTC client's communication with the GSM phone through IMS
The WebRTC client's communication with a GSM phone with IN services
The services broker for endpoints and WebRTC in IMS to GSM phone in Intelligence Networks
The WebRTC client's SIP messages to SMS in a GSM phone (SMSC)
The Kannel gateway
Summary
Chapter 5: WebRTC Integration with PSTN
What is PSTN?
WebRTC connectivity to the PSTN
The PSTN gateway
The PSTN connectivity to IMS via PSTN gateways
The call flow from a WebRTC SIP browser client to a fixed landline phone
The challenges in connecting the WebRTC world to the PSTN landscape
Address mapping
Translation from SIP to ISUP
The call setup
The call termination
The call in progress
The service logic
SIP service logic through application server
IN services via IMSSF
The Service Broker for the orchestration of services
Summary
Chapter 6: Basic Features of WebRTC over SIP
SIP services
Registering a SIP client
Making audio and video calls using SIP
Text Chat using SIP
Obtaining the online/offline status of users using SIP
Services in the Application Server
Back-to-back user agent
Call screening
Basic call screening
Enhanced call screening
Call hold/resume
Call forwarding
Unconditional call forwarding
Call forwarding when the user is unavailable
Call transfer
Attended call transfer
Unattended call transfer
Generation of call log for tracking
Media Server-based features
Announcement
Media relay
Voicemail
Music on Hold
Interactive Voice Response
Conferencing
Multipart communication
Features of a web application
Geolocation
Authenticating users with OAuth
Import contacts from other accounts
Advertisements in the WebRTC call
Delivering an instant message as a mail
The admin console
Summary
Chapter 7: WebRTC with Industry Standard Frameworks
The Multitier architecture
The design of a WebRTC client
The Class diagram
The Entity Relationship model
The environment setup
Java Runtime Environment (JRE)
Integrated Development Environment with Java Enterprise Edition (EE)
Databases
The web application server
The web application infrastructure
JSP- / Servlet-based WebRTC web project
Programming the JSP- / Servlet-based web project structure
The development of modules
Struts- / Hibernate-based WebRTC web project
Programming the Struts- / Hibernate-based web project structure
The development of modules
Spring 3 MVC-based WebRTC web project
Programming the Spring 3 MVC web project structure
The development of modules
Testing
Testing the signal flow
Test cases for WebRTC client validation
Summary
Chapter 8: WebRTC and Rich Communication Services
Rich Communication Services
Position and adoption of RCS
Business impact of RCS
Technology impact
Rich Communication Services enhanced (RCS-e)
Joyn
The RCS configuration process
RCS specifications
Service discovery by an RCS-enabled device
User capability exchange
Chats with multimedia sharing
Group chat in a conference session
User availability through XCAP
REST-based notifications
Interoperability and interworking
The RCS ecosystem and WebRTC
RCS services in WebRTC
User profile
Integration with social networks
The enhanced phonebook
User capabilities and Presence
Unified messaging box
Message history
Rich calls
Call logs
Message history
Multiparty conferencing
WebRTC architecture with RCS modules
Telecom operator's benefit derived from RCS
Voice over LTE
Combination of WebRTC, VOLTE, and RCS
Summary
Chapter 9: Native SIP Application and Interaction with WebRTC Clients
Support for WebRTC in various operating systems
Windows OS
Native browser support for WebRTC clients
SIP softphones capable of interacting with WebRTC clients
WebRTC unsupported browsers interacting with WebRTC clients
Linux OS
Native browser support for WebRTC clients
SIP softphones capable of interacting with WebRTC clients
Mac OS
Native browser support for WebRTC clients
SIP softphones capable of interacting with WebRTC clients
WebRTC unsupported browsers interacting with WebRTC client
Android OS for mobiles
Native browser support for WebRTC clients
Android phone's/tablet's SIP applications capable of interacting with WebRTC clients
Developing a lightweight Android SIP application
Windows OS for mobiles
Apple iPhone
iPhone/iPad IP applications interacting with WebRTC clients
Developing an iPhone SIP application
Summary
Chapter 10: Other WebRTC Use Cases
Unified Communicator
Team Communicator
Customized Communicator for specific enterprise segments
Branches and back office communications
The Customer Relationship Management system
Network Operation Center
The human resource management tool
Communicating with candidates for an open post directly from the job portal
Social networking – targeting consumers
Social networking platforms
Dating sites with anonymous call and chat
Retail services
WebRTC online marketing centers
WebRTC contact centers
Users contacting customer care
Health care
Online medical consultation with the doctor
Financial services
Communication with financial services
Insurance claims
Calling from the ATM
Remote management
Surveillance
Managing the connected device
WebRTC games
Two-player games
Multiplayer games
TV experience with WebRTC
Live broadcasting
IPTV integration and streaming
Streaming movies among peers
Interfacing services
WebRTC for e-learning
WebRTC for e-governance
Summary
Index
WebRTC Integrator's Guide is a deep dive into the world of real-time telecommunication
and its integration with the telecom network. This book covers a wide range of
WebRTC solutions, such as GSM, PSTN, and IMS, designed for specific network
requirements. It also addresses the implementation woes by describing every minute
detail of the WebRTC platform setup from the APIs to the architecture, code-to-server
installations, RCS-to-Codec interoperability, and much more. It also describes various
enterprise-based use cases that can be built around WebRTC.
What this book covers
Chapter 1, Running WebRTC with and without SIP, is a quick brush-up of WebRTC
basics such as Media APIs. It also describes the use of plain WebSocket signaling
to deliver WebRTC-based browser-to-browser communication.
Chapter 2, Making a Standalone WebRTC Communication Client, talks about the use
of the Session Initiation Protocol (SIP) as the signaling mechanism for WebRTC.
It describes the setup of the SIP server for this purpose.
Chapter 3, WebRTC with SIP and IMS, outlines the interaction of a SIP-based WebRTC
client with the IP Multimedia Subsystem (IMS).
Chapter 4, WebRTC Integration with Intelligent Network, describes the ways in which
WebRTC can be made interoperable with mobile phones, as the majority of mobile
communications today are still on GSM under the IN model.
Chapter 5, WebRTC Integration with PSTN, describes the backward compatibility of
the WebRTC technology to the old, fixed-line telephones.
Chapter 6, Basic Features of WebRTC over SIP, describes the basic WebRTC SIP services
such as audio/video call, messaging, call transfer, call hold/resume, and others.
Chapter 7, WebRTC with Industry Standard Frameworks, discusses the
development of the WebRTC client over the industry-adopted framework
(that is, Model-View-Controller).
Chapter 8, WebRTC and Rich Communication Services, discusses how RCS enriches
the communication technology with features such as file transfer, Presence,
phonebook, and others.
Chapter 9, Native SIP Application and Interaction with WebRTC Clients, addresses a very
important concern, that is, the WebRTC interoperability with other SIP endpoints
such as desktop clients, SIP hardphones, and mobile-based SIP applications.
Chapter 10, Other WebRTC Use Cases, presents an interesting array of WebRTC use
cases that are both innovative and practical with the current WebRTC standards.
What you need for this book
A brief understanding of SIP is required to set up the operation environment.
It is recommended that you use Linux, as it supports the installation of many open
source components described in the book. Web development skills are required
to make the WebRTC web-based application using HTML and browser APIs. It is
recommended that you use the Eclipse IDE for client-side development, as depicted
in many screenshots provided in the book. To host the applications, any web server,
such as Apache, will do.
Who this book is for
Web developers, SIP application developers, and IMS experts can use this book to
develop and deploy a customized, readily deployable WebRTC platform. The use
cases described in the book cater to WebRTC integration in any industry segment.
Therefore, anyone with basic knowledge of HTML and JavaScript can develop a
WebRTC client after referring to this book.
In this book, you will nd a number of styles of text that distinguish between
different kinds of information. Here are some examples of these styles, and an
explanation of their meaning.
[ 3 ]
Code words in text, database table names, folder names, lenames, le extensions,
pathnames, dummy URLs, user input, and Twitter handles are shown as follows:
"We saw how to program the three basic APIs of WebRTC media stack namely,
getUserMedia, RTCPeerConnection, and DataChannel."
A block of code is set as follows:
public class loginServlet extends HttpServlet {
public loginServlet() {
Any command-line input or output is written as follows:
Request Method:
Status Code:
101 Switching Protocols
New terms and important words are shown in bold. Words that you see on the
screen, in menus or dialog boxes for example, appear in the text like this: "As peer 1
keys in the message and hits the Send button, the message is passed on to peer 2."
Warnings or important notes appear in a box like this.
Tips and tricks appear like this.
Reader feedback
Feedback from our readers is always welcome. Let us know what you think about
this book—what you liked or may have disliked. Reader feedback is important for
us to develop titles that you really get the most out of.
To send us general feedback, simply send an e-mail to feedback@packtpub.com,
and mention the book title via the subject of your message.
If there is a topic that you have expertise in and you are interested in either writing
or contributing to a book, see our author guide on www.packtpub.com/authors.
Customer support
Now that you are the proud owner of a Packt book, we have a number of things to
help you to get the most from your purchase.
Downloading the example code
You can download the example code les for all Packt books you have purchased
from your account at http://www.packtpub.com. If you purchased this book
elsewhere, you can visit http://www.packtpub.com/support and register to have
the les e-mailed directly to you.
Downloading the color images of this book
We also provide you a PDF le that has color images of the screenshots/diagrams
used in this book. The color images will help you better understand the changes in
the output. You can download this le from: https://www.packtpub.com/sites/
Although we have taken every care to ensure the accuracy of our content, mistakes
do happen. If you nd a mistake in one of our books—maybe a mistake in the text or
the code—we would be grateful if you would report this to us. By doing so, you can
save other readers from frustration and help us improve subsequent versions of this
book. If you nd any errata, please report them by visiting http://www.packtpub.
com/submit-errata, selecting your book, clicking on the errata submission form link,
and entering the details of your errata. Once your errata are veried, your submission
will be accepted and the errata will be uploaded on our website, or added to any list of
existing errata, under the Errata section of that title. Any existing errata can be viewed
by selecting your title from http://www.packtpub.com/support.
Piracy of copyright material on the Internet is an ongoing problem across all media.
At Packt, we take the protection of our copyright and licenses very seriously. If you
come across any illegal copies of our works, in any form, on the Internet, please
provide us with the location address or website name immediately so that we can
pursue a remedy.
Please contact us at copyright@packtpub.com with a link to the suspected
pirated material.
We appreciate your help in protecting our authors, and our ability to bring
you valuable content.
You can contact us at questions@packtpub.com if you are having a problem
with any aspect of the book, and we will do our best to address it.
Running WebRTC with and without SIP
WebRTC lets us make calls right from a web page without any plugin. This is
made possible by the browser's media APIs, which fetch user media; a signaling
channel such as WebSocket, which carries the session setup messages; and HTML5,
which renders the media on the web page. WebRTC thus builds on WebSocket-style
communication between the browser and a server. WebSocket is a protocol layered
over TCP that carries data in both directions. The WebSocket API is an Application
Programming Interface (API) that enables web pages to use the WebSocket protocol
for full-duplex communication with a remote host.
In this chapter, we will study how WebRTC really works. We will also
demonstrate the use of WebRTC media APIs to capture and render input from a
user's microphone and camera onto a web page. In the later part of the chapter, we
will find out how to build a simple standalone WebRTC client using the plain
WebSocket protocol as the signaling mechanism.
JavaScript Session Establishment Protocol (JSEP)
The communication model between a client and remote host is based on the
JSEP architecture, which differentiates the signaling and media transaction
into different layers.
The differentiation is shown in the following figure:
[Figure: JSEP signaling and media (WebRTC: JSEP approach, signaling vs. media, with a SessionDescription at each peer)]
As an example, let's consider two peers, A and B, where A initiates communication
with B. A, being the offerer, calls the createOffer function to begin a session. A
then applies the offer, which carries details such as codecs, as its local
configuration using the setLocalDescription function and sends it to B over the
signaling channel. The remote party, B, reads the offer and stores it using the
setRemoteDescription function. B then calls the createAnswer function to generate
an appropriate answer, applies it using the setLocalDescription function, and sends
the answer back to the initiator over the signaling channel. When A gets the answer,
it stores it using the setRemoteDescription function, and the initial setup is
complete. The same exchange is repeated for every subsequent offer and answer.
The latest on the JSEP specifications can be
read from the Internet Engineering Task Force (IETF) site at http://datatracker.
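The order of calls described above can be traced with a small model. The following is only a sketch: JsepPeerModel and negotiate are hypothetical names, not browser APIs, and the SDP strings are placeholders. The model imitates just the signaling states ("stable", "have-local-offer", "have-remote-offer") that a real RTCPeerConnection moves through, and it ignores provisional answers and rollback.

```javascript
// Hypothetical model of the JSEP offer/answer handshake. JsepPeerModel is
// NOT a browser API; it only mirrors the order of calls and the signaling
// states that a real RTCPeerConnection reports.
class JsepPeerModel {
  constructor(name) {
    this.name = name;
    this.signalingState = "stable";
    this.localDescription = null;
    this.remoteDescription = null;
  }

  createOffer() {
    return { type: "offer", sdp: "placeholder SDP from " + this.name };
  }

  createAnswer() {
    if (this.signalingState !== "have-remote-offer") {
      throw new Error(this.name + " has no remote offer to answer");
    }
    return { type: "answer", sdp: "placeholder SDP from " + this.name };
  }

  setLocalDescription(desc) {
    this.apply(desc, "local");
  }

  setRemoteDescription(desc) {
    this.apply(desc, "remote");
  }

  // An offer is only accepted in "stable"; an answer is only accepted when
  // the matching offer from the other side is pending.
  apply(desc, side) {
    const afterOffer = side === "local" ? "have-local-offer" : "have-remote-offer";
    const beforeAnswer = side === "local" ? "have-remote-offer" : "have-local-offer";
    if (desc.type === "offer" && this.signalingState === "stable") {
      this.signalingState = afterOffer;
    } else if (desc.type === "answer" && this.signalingState === beforeAnswer) {
      this.signalingState = "stable";
    } else {
      throw new Error("cannot apply " + side + " " + desc.type +
        " in state " + this.signalingState);
    }
    this[side + "Description"] = desc;
  }
}

// One full round: A offers, B answers. In a real client the offer and the
// answer would travel over the signaling channel between these calls.
function negotiate(a, b) {
  const offer = a.createOffer();
  a.setLocalDescription(offer);   // A: stable -> have-local-offer
  b.setRemoteDescription(offer);  // B: stable -> have-remote-offer
  const answer = b.createAnswer();
  b.setLocalDescription(answer);  // B: have-remote-offer -> stable
  a.setRemoteDescription(answer); // A: have-local-offer -> stable
}
```

Calling negotiate twice on the same pair models the repeated offer/answer rounds, since both peers return to the "stable" state after each round.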
Signal and media ows
The differentiation between signal and media ows is an important aspect of the
WebRTC call setup.
The signaling mechanism can be any among HTTP/REST, JavaScript Object
Notation (JSON) via XMLHttpRequest (XHR), Session Initiation Protocol (SIP)
over websockets, XMPP, or any custom or proprietary protocol. The media
(audio/video) is dened through the Session Description Protocol (SDP) and
ows from peer to peer.
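Because WebRTC does not prescribe the signaling format, a custom JSON envelope is one possible choice. The following sketch assumes such an envelope; the field names from, to, and payload and the helper names encodeSignal and decodeSignal are illustrative assumptions, not part of any standard. The resulting string could be sent over XHR or a WebSocket.

```javascript
// Hypothetical JSON signaling envelope. The fields "from", "to", and
// "payload" are assumptions made for this sketch.
function encodeSignal(from, to, description) {
  if (description.type !== "offer" && description.type !== "answer") {
    throw new TypeError("unsupported description type: " + description.type);
  }
  return JSON.stringify({
    from: from,
    to: to,
    payload: { type: description.type, sdp: description.sdp }
  });
}

// Parse a received envelope and reject messages without an SDP payload.
function decodeSignal(text) {
  const message = JSON.parse(text);
  if (!message.payload || typeof message.payload.sdp !== "string") {
    throw new TypeError("malformed signaling message");
  }
  return message;
}
```

On the receiving side, the decoded payload has the same type and sdp fields as a session description, so it can be handed to setRemoteDescription.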
A few instances of end-to-end signaling and media flow variants are shown in the
following figures.
[Figure: SessionDescription at each peer; signaling across the network as JSON over XMLHttpRequest or a WebSocket subprotocol]
The preceding figure depicts signaling over the WebRTC API in the JSON format
via XHR.
The following figure depicts signaling over the WebRTC API in eXtensible
Messaging and Presence Protocol (XMPP):
[Figure: SessionDescription at each peer; signaling over eXtensible Messaging and Presence Protocol (XMPP)]
While it's very popular to use the WebRTC API with SIP support through
libraries such as JsSIP, sipML5, and PJSIP (the first two are JavaScript libraries;
PJSIP is a C library), these libraries cater to the SIP/IMS (IP Multimedia
Subsystem) world and are not mandatory for setting up enterprise-level WebRTC
infrastructure. In fact, it is a misconception that WebRTC is coupled with SIP
in itself; it isn't.
IP Multimedia Subsystem (IMS) is part of the Next Generation
Network (NGN) model for IP-based communication.
Running WebRTC without SIP
HTML5 websockets can be dened by ws:// followed by the URL in the server eld
while readying a WebRTC client for registration. This enables bidirectional, duplex
communications with server-side processes, that is, server-side push events to the
client. It also enables the handshake after sharing media metadata such as ports,
codecs, and so on.
It should be noted that WebRTC works in an offer/answer mode and has ways
of traversing the Network Address Translation (NAT) and rewalls by means
of Interactive Connectivity Establishment (ICE). ICE makes use of the Session
Traversal Utilities for NAT (STUN) protocol and its extension, Traversal Using
Relay NAT (TURN). This is covered later in the chapter.
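In the browser, the STUN and TURN servers that ICE should use are passed to the peer connection as a configuration object with an iceServers list. The following sketch builds such an object. The helper name buildIceConfiguration is hypothetical, and the server addresses and credentials are placeholders that have to be replaced with a real deployment's values.

```javascript
// Builds the configuration object accepted by the RTCPeerConnection
// constructor. buildIceConfiguration is a hypothetical helper; the hosts
// and credentials used below are placeholders.
function buildIceConfiguration(stunUrls, turnServer) {
  const iceServers = [];
  if (stunUrls.length > 0) {
    // STUN lets a peer discover its public address as seen from outside the NAT.
    iceServers.push({ urls: stunUrls });
  }
  if (turnServer) {
    // TURN relays the media when no direct path between the peers can be found.
    iceServers.push({
      urls: turnServer.url,
      username: turnServer.username,
      credential: turnServer.credential
    });
  }
  return { iceServers: iceServers };
}

const exampleConfig = buildIceConfiguration(
  ["stun:stun.example.org:3478"],
  { url: "turn:turn.example.org:3478", username: "demo-user", credential: "demo-secret" }
);
// In a browser: var pc = new RTCPeerConnection(exampleConfig);
```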
Sending media over WebSockets
WebRTC mainly comprises three operations: fetching user media from a
camera/microphone, transmitting media over a channel, and sending messages
over the channel. Now, let's take a look at the summarized description of every
operation type.
getUserMedia
The JavaScript getUserMedia function (part of the MediaStream API) allows
the web page to access users' media devices such as the camera and microphone using
the browser's native API, without the need for third-party plugins such as
Adobe Flash and Microsoft Silverlight.
For simple demos of these methods, download the read-only WebRTC source
by executing the following command:
svn checkout http://webrtc.googlecode.com/svn/trunk/
The following is the code to access the camera in the Google Chrome browser
and display the local video in a <video/> element:
<!-- HTML: a video element to show the capture and a button to begin it -->
<video id="vid" autoplay></video>
<button id="btn" onclick="start()">Start</button>
<script>
/* Start the media capture using Chrome's prefixed getUserMedia function */
var video = document.getElementById("vid");
var btn = document.getElementById("btn");
function start() {
  navigator.webkitGetUserMedia({video: true}, gotStream,
    function (error) { console.log("getUserMedia failed:", error); });
  btn.disabled = true;
}
/* Add the media stream to the video element on the page */
function gotStream(stream) {
  video.src = webkitURL.createObjectURL(stream);
}
</script>
When the browser tries to access the user's media devices, such as a camera and
mic, there is always a browser notification that asks for the
user's permission.
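The prefixed, callback-style call shown earlier reflects the API at the time of writing. Current browsers expose a promise-based navigator.mediaDevices.getUserMedia and attach the stream through the video element's srcObject property. The following sketch uses that form and also handles the case where the user declines the permission prompt. The helper name captureToVideo and the shape of its return value are assumptions made for illustration; the media devices object is passed in as a parameter so that the function can be tried outside a browser.

```javascript
// Sketch of the promise-based capture flow. captureToVideo is a
// hypothetical helper; mediaDevices is expected to behave like
// navigator.mediaDevices.
async function captureToVideo(mediaDevices, videoElement, constraints) {
  try {
    // Resolves with a MediaStream once the user grants permission.
    const stream = await mediaDevices.getUserMedia(constraints);
    videoElement.srcObject = stream;
    return { ok: true, stream: stream };
  } catch (err) {
    // A declined prompt is typically reported as "NotAllowedError".
    return { ok: false, reason: err.name };
  }
}

// Browser usage (not executed here):
//   captureToVideo(navigator.mediaDevices,
//                  document.getElementById("vid"), { video: true });
```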
The following screenshot depicts the user notication for granting permission to
access the camera in Google Chrome:
The following screenshot depicts the user notication for granting permission to
access the camera in Mozilla Firefox:
The following screenshot depicts the user notication for granting permission to
access the camera in Opera:
RTCPeerConnection
In WebRTC, media travels in a peer-to-peer fashion, so the peers must exchange
information such as their public IP addresses and open ports before a communication
path can be set up. Each peer also needs to know the other's codecs, their settings,
bandwidth, and media types.
To make the peer connection, we need a function that sets the
RTCPeerConnection, getUserMedia, attachMediaStream, and
reattachMediaStream variables. Because the WebRTC standard is currently
under development, the JavaScript API can change from one implementation
to another. So, a web developer has to configure these variables in
accordance with the browser on which the HTML content is running.
Note that the WebRTC standards are evolving rapidly.
The API that was used for the first version of WebRTC was the
PeerConnection API, which had distinct methods for media
transmission. As of now, the old PeerConnection API has
been deprecated and a new enhanced version is in progress.
The new Media API has replaced the media stream handling in
the old PeerConnection API.
The browser APIs of different browsers have different names. The approach is
to determine the browser on which the web page is opened and then call the
appropriate function for the WebRTC operation. The identity of the browser
can be determined by extracting a friendly name or by checking for a
browser-specific API name. For example, when navigator.webkitGetUserMedia is
defined, then WebRTCDetectedBrowser = "chrome", and when
navigator.mozGetUserMedia is defined, then WebRTCDetectedBrowser =
"firefox". The following table shows the W3C standard elements and their
prefixed names in Google Chrome and Mozilla Firefox:
W3C Standard           Chrome                   Firefox
getUserMedia           webkitGetUserMedia       mozGetUserMedia
RTCPeerConnection      webkitRTCPeerConnection  mozRTCPeerConnection
RTCSessionDescription  RTCSessionDescription    mozRTCSessionDescription
RTCIceCandidate        RTCIceCandidate          mozRTCIceCandidate
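The mapping in the table can be expressed as a small lookup, in the spirit of what adapter.js does. This is a sketch only: resolveWebRTCApi is a hypothetical function, it covers just the two browsers listed, and it takes navigator-like and window-like objects as parameters so that the logic can be exercised outside a browser. The prefixed names are the historical ones from the table.

```javascript
// Maps the historical prefixed names from the table to the W3C names.
// resolveWebRTCApi is a hypothetical helper; nav and win stand in for
// the browser's navigator and window objects.
function resolveWebRTCApi(nav, win) {
  if (nav.mozGetUserMedia) {
    return {
      browser: "firefox",
      getUserMedia: nav.mozGetUserMedia.bind(nav),
      RTCPeerConnection: win.mozRTCPeerConnection,
      RTCSessionDescription: win.mozRTCSessionDescription,
      RTCIceCandidate: win.mozRTCIceCandidate
    };
  }
  if (nav.webkitGetUserMedia) {
    return {
      browser: "chrome",
      getUserMedia: nav.webkitGetUserMedia.bind(nav),
      RTCPeerConnection: win.webkitRTCPeerConnection,
      RTCSessionDescription: win.RTCSessionDescription,
      RTCIceCandidate: win.RTCIceCandidate
    };
  }
  return null; // Neither prefixed API is present.
}

// Browser usage (not executed here):
//   var api = resolveWebRTCApi(navigator, window);
//   if (api) { var pc = new api.RTCPeerConnection(null); }
```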
Such methods also exist for Opera, which is a new addition to the WebRTC suite.
Hopefully, Internet Explorer will have native support for the WebRTC standards
in the future. For other browsers such as Safari that don't support WebRTC as yet,
there are temporary plugins that help capture and display the media elements; these
can be used until the browsers release their own WebRTC-enabled versions.
Creating WebRTC-compatible clients in Internet Explorer and Safari is
discussed in Chapter 9, Native SIP Application and Interaction with WebRTC Clients.
The following code snippet makes an RTC peer connection and renders video
from one HTML video element to another on the same web page. It uses the library
file adapter.js, which provides polyfill functionality across browsers such as
Mozilla Firefox and Google Chrome.
The HTML body content, which includes two video elements for the local and remote
videos, the text status areas, and three buttons to start capturing, start
sending, and stop the stream, is given as follows:
<video id="vid1" autoplay="true" muted="true"></video>
<video id="vid2" autoplay></video>
<button id="btn1" onclick="start()">Start</button>
<button id="btn2" onclick="call()">Call</button>
<button id="btn3" onclick="hangup()">Hang Up</button>
<textarea id="ta1"></textarea>
<textarea id="ta2"></textarea>
The JavaScript program that transmits media from one video element to another at
the click of the Start button, using the WebRTC API, is given as follows:
/* initially, only the Start button is enabled; Call and Hang Up are disabled */
btn1.disabled = false;
btn2.disabled = true;
btn3.disabled = true;
/* declaration of global variables for peer connections 1 and 2, the local
stream, and the SDP constraints */
var pc1, pc2;
var localstream;
var sdpConstraints = {'mandatory': {
  'OfferToReceiveVideo': true }};
The following code snippet is the definition of the function that will get the user
media for the camera and microphone input from the user:
function start() {
  btn1.disabled = true;
  /* get audio and video capture */
  getUserMedia({audio: true, video: true}, gotStream,
    function(error) { console.log('getUserMedia failed: ', error); });
}
The following code snippet is the definition of the function that will attach an input
stream to the local video element and enable the Call button:
function gotStream(stream){
  attachMediaStream(vid1, stream);
  localstream = stream; /* ready to call the peer */
  btn2.disabled = false;
}
The following code snippet is the function that streams the video and audio content
to the peer using RTCPeerConnection:
function call() {
  btn2.disabled = true;
  btn3.disabled = false;
  var videoTracks = localstream.getVideoTracks();
  var audioTracks = localstream.getAudioTracks();
  var servers = null;
  pc1 = new RTCPeerConnection(servers); /* peer1 connection */
  pc1.onicecandidate = iceCallback1;
  pc2 = new RTCPeerConnection(servers); /* peer2 connection */
  pc2.onicecandidate = iceCallback2;
  pc2.onaddstream = gotRemoteStream;
  /* the lines from here to the end of call() are missing in the source and
  restored from context, using the legacy callback-style API */
  pc1.addStream(localstream);
  pc1.createOffer(gotDescription1);
}
function gotDescription1(desc){ /* SDP offer created by peer1 */
  pc1.setLocalDescription(desc);
  pc2.setRemoteDescription(desc);
  pc2.createAnswer(gotDescription2, null, sdpConstraints);
}
function gotDescription2(desc){ /* SDP answer created by peer2 */
  pc2.setLocalDescription(desc);
  pc1.setRemoteDescription(desc);
}
On clicking the Hang Up button, the following function closes both of the
peer connections:
function hangup() {
  pc1.close(); /* peer1 connection closed */
  pc2.close(); /* peer2 connection closed */
  pc1 = null;
  pc2 = null;
  btn3.disabled = true; /* disables the Hang Up button */
  btn2.disabled = false; /* enables the Call button */
}
function gotRemoteStream(e){
  /* render the remote stream with the same adapter.js helper used for vid1 */
  attachMediaStream(vid2, e.stream);
}
function iceCallback1(event){
  if (event.candidate) {
    pc2.addIceCandidate(new RTCIceCandidate(event.candidate));
  }
}
function iceCallback2(event){
  if (event.candidate) {
    pc1.addIceCandidate(new RTCIceCandidate(event.candidate));
  }
}
In the preceding example, both the peers, that is, the sender and receiver, are
present on the same web page, so the session descriptions and ICE candidates are
handed over directly in JavaScript rather than through a signaling mechanism such
as JSON/XHR (XMLHttpRequest). The two peers are represented by the two video
elements shown in the following screenshot. They are currently in the
noncommunicating state.
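The order of the offer/answer steps is easy to get wrong, so it may help to see it in isolation. The following is a minimal sketch, assuming the promise-based RTCPeerConnection methods of current browsers rather than the callback style used in the listing; the helper name negotiate is hypothetical.

```javascript
// Sketch only: `negotiate` is a hypothetical helper. It performs the
// offer/answer exchange between two peer connections on the same page.
async function negotiate(caller, callee) {
  const offer = await caller.createOffer();    // caller describes its media
  await caller.setLocalDescription(offer);
  await callee.setRemoteDescription(offer);    // "signaling" is a direct call
  const answer = await callee.createAnswer();  // callee replies
  await callee.setLocalDescription(answer);
  await caller.setRemoteDescription(answer);
  return { offer: offer, answer: answer };
}

// In a page: negotiate(pc1, pc2).then(function () { /* media can flow */ });
```

When the peers are on different machines, the two setRemoteDescription calls are the points at which the description has to travel through a signaling channel.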
As soon as the Start button is hit, the user's microphone and camera begin to
capture. The first peer is presented with the browser request to use their camera and
microphone. After allowing the browser request, the first peer's media is successfully
captured from their system and displayed on the screen. This is demonstrated in the
following screenshot:
As soon as the user hits the Call button, the captured media stream is shared in
the session with the second peer, who can view it on their own video element.
The following screenshot depicts the two peers sharing a video stream:
The session can be discontinued by clicking on the Hang Up button.
The DataChannel API is used to exchange text messages by creating a
bidirectional data channel between two peers. The following code demonstrates
the working of RTCDataChannel.
The following code snippet is the HTML body for the DataChannel example. It
consists of a text area for each of the two peers to view the messages and three
buttons to start the session, send the message, and stop receiving messages.
<div id="left">
<h2>Send data</h2>
<textarea id="dataChannelSend" rows="5" cols="15"></textarea>
<button id="startButton" onclick="createConnection()">Start</button>
<button id="sendButton" onclick="sendData()">Send Data</button>
<button id="closeButton" onclick="closeDataChannels()">
Stop Send Data</button>
</div>
<div id="right">
<h2>Received Data</h2>
<textarea id="dataChannelReceive" rows="5" cols="15"></textarea>
</div>
The style script for the text area is given as follows; to differentiate between the two
peers, we place one text area aligned to the right and another to the left:
#left { position: absolute; left: 0; top: 0; width: 50%; }
#right { position: absolute; right: 0; top: 0; width: 50%; }
The JavaScript block that contains the functions to make the session and transmit the
data is given as follows:
/*Declaring global parameters for both sides' peerconnection, sender,
and receiver channel*/
var pc1, pc2, sendChannel, receiveChannel;
/*Only enable the Start button, keep the send data and stop send data
button off*/
startButton.disabled = false;
sendButton.disabled = true;
closeButton.disabled = true;
The following code snippet is the script to create PeerConnection in Google
Chrome, that is, webkitRTCPeerConnection, which was seen in the previous table.
Note that a user needs Google Chrome Version 25 or higher to test this code.
Some old Chrome versions also require the --enable-data-channels flag to be set
to the enabled state before the DataChannel functions can be used.
function createConnection() {
  var servers = null;
  pc1 = new webkitRTCPeerConnection(servers, {
    optional: [{RtpDataChannels: true}]});
  try {
    sendChannel = pc1.createDataChannel("sendDataChannel", {
      reliable: false});
  } catch (e) {
    alert('Failed to create data channel. ' +
      'You need Chrome M25 or later with the ' +
      '--enable-data-channels flag');
    return;
  }
  pc1.onicecandidate = iceCallback1;
  sendChannel.onopen = onSendChannelStateChange;
  sendChannel.onclose = onSendChannelStateChange;
  pc2 = new webkitRTCPeerConnection(servers, {
    optional: [{RtpDataChannels: true}]});
  pc2.onicecandidate = iceCallback2;
  pc2.ondatachannel = receiveChannelCallback;
  /* start the offer/answer exchange; this call is missing in the source
  and restored from context */
  pc1.createOffer(gotDescription1);
  startButton.disabled = true; /* since session is up,
  disable start button */
  closeButton.disabled = false; /* enable close button */
}
The following function invokes the sendChannel.send function with the user's
text to send data across the data channel:
function sendData() {
  var data = document.getElementById("dataChannelSend").value;
  sendChannel.send(data);
}
The following function calls the sendChannel.close() and
receiveChannel.close() functions to terminate the data channel connection:
function closeDataChannels() {
  sendChannel.close();
  receiveChannel.close();
  pc1.close(); /* peer1 connection closed */
  pc2.close(); /* peer2 connection closed */
  pc1 = null;
  pc2 = null;
  startButton.disabled = false;
  sendButton.disabled = true;
  closeButton.disabled = true;
  document.getElementById("dataChannelSend").value = "";
  document.getElementById("dataChannelReceive").value = "";
  document.getElementById("dataChannelSend").disabled = true;
}
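Peer connection 1 sets the offer as its local description, peer connection 2 sets
it as its remote description, and the answer is then created and applied in the
opposite direction (the function bodies are truncated in the source and restored
from this description):
function gotDescription1(desc) {
  pc1.setLocalDescription(desc);
  pc2.setRemoteDescription(desc);
  pc2.createAnswer(gotDescription2);
}
function gotDescription2(desc) {
  pc2.setLocalDescription(desc);
  trace('Answer from pc2 \n' + desc.sdp);
  pc1.setRemoteDescription(desc);
}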
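The following is the function for the local ICE callback:
function iceCallback1(event) {
  if (event.candidate) {
    /* hand the local candidate to the other peer (restored from context) */
    pc2.addIceCandidate(event.candidate);
  }
}
The following is the function for the remote ICE callback:
function iceCallback2(event) {
  if (event.candidate) {
    /* hand the remote candidate back to the first peer (restored from context) */
    pc1.addIceCandidate(event.candidate);
  }
}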
The function that receives control when the data channel is delivered to the
receiving peer, together with the message and state handlers, is as follows:
function receiveChannelCallback(event) {
  receiveChannel = event.channel;
  receiveChannel.onmessage = onReceiveMessageCallback;
  receiveChannel.onopen = onReceiveChannelStateChange;
  receiveChannel.onclose = onReceiveChannelStateChange;
}
function onReceiveMessageCallback(event) {
  document.getElementById("dataChannelReceive").value =
    event.data;
}
function onReceiveChannelStateChange() {
  var readyState = receiveChannel.readyState;
  trace('Receive channel state is: ' + readyState);
}
function onSendChannelStateChange() {
  var readyState = sendChannel.readyState;
  if (readyState == "open") {
    document.getElementById("dataChannelSend").disabled = false;
    sendButton.disabled = false;
    closeButton.disabled = false;
  } else {
    document.getElementById("dataChannelSend").disabled = true;
    sendButton.disabled = true;
    closeButton.disabled = true;
  }
}
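Because a data channel only accepts messages while its readyState is "open", a small guard around send() can save some debugging. The helper below is a hypothetical sketch and not part of the original sample; it works with any object that exposes readyState and send().

```javascript
// Hypothetical guard around RTCDataChannel.send(); returns true when sent.
function sendIfOpen(channel, text) {
  if (!channel || channel.readyState !== 'open') {
    return false; // channel missing, still connecting, or already closed
  }
  channel.send(text);
  return true;
}

// In the page: if (!sendIfOpen(sendChannel, data)) { /* show a warning */ }
```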
The following screenshot shows that Peer 1 is prepared to send text to Peer 2 using
the DataChannel API of WebRTC:
Empty text areas before beginning the exchange of text
On clicking on the Start button, as shown in the following screenshot, a session is
established between the peers and the server:
Putting in text from one's peers after hitting the Start button
As Peer 1 keys in the message and hits the Send button, the message is passed on
to Peer 2. The preceding snapshot is taken before sending the message, and the
following picture is taken after sending the message:
Text is exchanged on DataChannel on the click of the Send button
However, right now, you are only sending data from one localhost to another.
This is because the system doesn't know any other peer IP or port. This is where
socket-based servers such as Node.js come into the picture.
Media traversal in WebRTC clients
Real-time Transport Protocol (RTP) is the way for media to flow between
endpoints. Media could be audio and/or video based.
Media streams are secured using the SRTP and DTLS protocols.
RTP in WebRTC is peer-to-peer by default. The peer-to-peer link is established
with the help of the Interactive Connectivity Establishment (ICE) protocol, whose
candidates can be obtained through either STUN or TURN. ICE is required to
establish that firewalls are not blocking any of the UDP or TCP ports. Using the
STUN and TURN protocols, ICE finds out the network architecture and provides some
transport addresses (ICE candidates) on which the peer can be contacted.
An RTCPeerConnection object has an associated ICE agent, together with the
RTCPeerConnection signaling state, the ICE gathering state, and the ICE connection
state. These are initialized when the object is first created. The flow of signals
through these nodes is depicted in the following call flow diagram:
[Call flow diagram: Peer A, STUN, TURN, Signal Channel, and Peer B. Each peer
asks STUN "Who am I?", offer and answer SDP and the ICE candidates of A and B
cross the signal channel, and a peer behind a symmetric NAT asks TURN for a
relay channel ("Channel please").]
ICE, STUN, and TURN are defined as follows:
ICE: This is the framework that allows your web browser to connect with peers.
ICE uses STUN or TURN to achieve this.
STUN: This is the protocol to discover your public address and determine
any restrictions in your router that would prevent a direct connection with a
peer. It gives the WebRTC client the public IP address that the outside world
sees, which peers can then use to communicate with it.
TURN: This is meant to bypass the symmetric NAT restriction by
opening a connection with a TURN server and relaying all information
through this server.
STUN/ICE is built-in and mandatory for WebRTC.
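To see which of these mechanisms produced a given ICE candidate, we can look at the "typ" field of the candidate line. The following sketch is illustrative only: the helper name describeCandidate is hypothetical, and both server addresses in the configuration object are placeholders, not real servers.

```javascript
// Hypothetical helper: classifies an ICE candidate line by its "typ" field.
function describeCandidate(candidateLine) {
  var match = / typ (host|srflx|prflx|relay)\b/.exec(candidateLine);
  if (!match) return 'unknown';
  var labels = {
    host: 'local address',
    srflx: 'public address learned through STUN',
    prflx: 'address discovered during connectivity checks',
    relay: 'address relayed by a TURN server'
  };
  return labels[match[1]];
}

// Example configuration object for RTCPeerConnection; the URLs and the
// credentials are placeholders.
var iceConfig = {
  iceServers: [
    { urls: 'stun:stun.example.org' },
    { urls: 'turn:turn.example.org', username: 'user', credential: 'secret' }
  ]
};
```

In a page, describeCandidate(event.candidate.candidate) could be logged from the onicecandidate handler to check whether a TURN relay is actually being offered.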
WebRTC through WebSocket signaling
Signaling is a crucial activity in establishing any kind of network-based
communication. It lets the endpoints share the session description and media
information before setting up the path to actually exchange media. For a simple
WebRTC client, there are JavaScript-based WebSocket servers that can provide such
signaling in a persistent, full-duplex, real-time manner. Node.js is one platform
for such servers.
Node.js is an asynchronous, server-side JavaScript runtime powered by Chrome's
V8 JS engine. There are many WebSocket libraries, such as Socket.io and SockJS, that
can run over it. Why are they used? They are used because the WebSocket server
carries the signaling between the WebRTC clients without using
other protocols such as XMPP or SIP.
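The job of such a signaling server is small: remember who joined which channel and pass each message on to the other members. The following is a transport-agnostic sketch of that relay logic under stated assumptions: createRelay is a hypothetical name, a "client" is any object with a send(text) method (a WebSocket connection, for example), and the {open, channel} and {data, channel} message shapes follow the client code shown later in this chapter. The WebSocket wiring itself is left out.

```javascript
// Transport-agnostic sketch of a signaling relay. A "client" is any object
// with a send(text) method, for example a WebSocket connection.
function createRelay() {
  var channels = {}; // channel name -> array of clients

  return {
    // Call this for every text message received from a client.
    handle: function (client, text) {
      var msg = JSON.parse(text);
      var members = channels[msg.channel] || (channels[msg.channel] = []);
      if (msg.open) { // first message from a client: join the channel
        if (members.indexOf(client) === -1) members.push(client);
        return;
      }
      members.forEach(function (other) { // forward to everyone else
        if (other !== client) other.send(JSON.stringify(msg.data));
      });
    },
    // Call this when a client disconnects.
    leave: function (client) {
      Object.keys(channels).forEach(function (name) {
        channels[name] = channels[name].filter(function (c) {
          return c !== client;
        });
      });
    }
  };
}
```

The relay never inspects the SDP or ICE payload; it only routes it, which is why the same server can carry signaling for audio, video, and data sessions alike.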
Let's see how we can use Node.js signaling server through the following
simple steps:
1. On a Windows machine, install nodejs.exe from the official download site.
2. To check whether Node.js is properly installed and working, check the
version using the following command:
node -v
The output in my case is v0.10.26.
3. Open the command prompt, and type node <name of the JS file>
in the window. Consider the following command line as an example:
node signaler.js
To write and run a simple server-side program, open Notepad, create a sample JS
file named, say, console.js, and add some content to it, such as
console.log('node.js running fine'). Run this file using the following Node.js
command from the command prompt:
node console.js
The following screenshot shows the output of the preceding command line:
Let's now look at the overview of steps using Node.js to set up the signaling
environment for a WebRTC client.
1. First, we need a JavaScript library to support WebRTC signaling operations.
We can use signaler.js for this. Download signaler.js from
2. Next, we should run the JavaScript library using the Node.js server.
We can do so by executing the following command in the terminal window:
node signaler.js
3. Specify the address of the Node.js server machine in the WebRTC client.
Now, we can make inter-browser WebRTC audio/video calls, where
the signaling is handled by the Node.js WebSocket signaling server.
The following diagram depicts how Node.js is used as a signaling server:
[Diagram: Alice's browser and Bob's browser each exchange signaling with the
Node.js server; encrypted media flows directly between the two browsers.]
The preceding diagram denotes signaling across WebRTC clients over the
Node.js WebSocket-based server. The media flows from peer to peer.
Making a peer-to-peer audio call using
Node.js for signaling
We have seen how a JavaScript program is hosted on a Node.js signaling server.
Now, let's study the process of making an audio/video call using this setup. The
following code references Muaz Khan WebRTC experiments, which is under the
MIT license. The library used is PeerConnection.js. The following are the CSS
descriptions for the audio and video content on a page:
audio, video {
  vertical-align: top;
}
.setup {
  border-bottom-left-radius: 0;
  border-top-left-radius: 0;
  margin-left: -9px;
  margin-top: 8px;
  position: absolute;
}
.highlight { color: rgb(0, 8, 189); }
Next, we will look at the JavaScript functions that define the behavior of the WebRTC
client. This is a modified version of code from one-to-one-peerconnection.html
under the WebRTC experiments master from Muaz Khan. For better clarity and easy
understanding, I have removed the functions of unique ID, rotate video, and scale
video, and have minimal CSS styling.
The following code defines the websocket.onopen and websocket.send operations:
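var channel = location.href.replace(/\/|:|#|%|\.|\[|\]/g, '');
/* SIGNALING_SUFFIX is a placeholder: the rest of the server URL is truncated in
the source. Set it to ':<port>' of your Node.js signaling server. */
var SIGNALING_SUFFIX = '';
var websocket = new WebSocket('ws://' + document.domain + SIGNALING_SUFFIX);
websocket.onopen = function() {
  websocket.push(JSON.stringify({ open: true, channel: channel }));
};
websocket.push = websocket.send;
websocket.send = function(data) {
  websocket.push(JSON.stringify({ data: data, channel: channel }));
};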
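The channel name is simply the page URL with its punctuation stripped, so two browsers that open the same URL end up on the same channel. The following sketch isolates that step and the message envelope; the function names channelFromUrl and envelope are hypothetical, while the regular expression and the field names come from the preceding code.

```javascript
// Derives a channel name from a page URL, using the same regular
// expression as the client code: it strips / : # % . [ and ].
function channelFromUrl(href) {
  return href.replace(/\/|:|#|%|\.|\[|\]/g, '');
}

// Wraps a payload in the {data, channel} envelope sent to the server.
function envelope(channel, data) {
  return JSON.stringify({ data: data, channel: channel });
}

var channel = channelFromUrl('http://localhost:8080/#room1');
console.log(channel);                     // httplocalhost8080room1
console.log(envelope(channel, 'hello'));
```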
The following code creates a new peer connection and adds a Join entry to the
room list for every user who joins a session:
var peer = new PeerConnection(websocket);
peer.onUserFound = function(userid) {
  if (document.getElementById(userid)) return;
  /* adding the name of room to room list */
  var tr = document.createElement('tr');
  var td1 = document.createElement('td');
  var td2 = document.createElement('td');
  td1.innerHTML = userid + ' video call';
  /* creating element button to room list */
  var button = document.createElement('button');
  button.innerHTML = 'Join';
  button.id = userid;
  button.style.float = 'right';
  /* add the user to session on button click */
  button.onclick = function() {
    button = this;
    getUserMedia(function(stream) {
      // get user media
      // add the stream
      /* the library calls that add the stream and join the session
         are truncated in the source */
    });
    button.disabled = true;
  };
  /* assumed: attach the row to the rooms list (truncated in the source) */
  td2.appendChild(button);
  tr.appendChild(td1);
  tr.appendChild(td2);
  roomsList.appendChild(tr);
};
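The following code adds the stream to the HTML video element and sets
its characteristics:
peer.onStreamAdded = function(e) {
  if (e.type == 'local')
    /* the assigned value is truncated in the source; false is assumed */
    document.querySelector('#start-broadcasting').disabled = false;
  var video = e.mediaElement;
  video.setAttribute('width', 400);
  video.setAttribute('height', 400);
  video.setAttribute('controls', true);
  /* assumed: show the element on the page (truncated in the source) */
  videosContainer.insertBefore(video, videosContainer.firstChild);
};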
The following code handles the end of a streaming session and wires up the
page controls:
peer.onStreamEnded = function(e) {
  var video = e.mediaElement;
  if (video) {
    video.style.opacity = 0;
    setTimeout(function() {
      /* assumed: remove the faded-out element (body truncated in the source) */
      if (video.parentNode) video.parentNode.removeChild(video);
    }, 1000);
  }
};
document.querySelector('#start-broadcasting').onclick =
  function() {
    this.disabled = true;
    getUserMedia(function(stream) {
      /* the library calls that add the stream and start the broadcast
         are truncated in the source */
    });
  };
document.querySelector('#your-name').onchange = function() {
  peer.userid = this.value;
};
var videosContainer = document.getElementById(
  'videos-container') || document.body;
var btnSetupNewRoom = document.getElementById('setup-new-room');
var roomsList = document.getElementById('rooms-list');
/* truncated in the source:
   if (btnSetupNewRoom) btnSetupNewRoom.onclick = ... */
The following code is to capture the user media:
function getUserMedia(callback) {
  var hints = {
    audio: true,
    video: {
      optional: [],
      mandatory: {
        minWidth: 200, minHeight: 200, maxWidth: 400,
        maxHeight: 400, minAspectRatio: 1.77
      }
    }
  };
  navigator.getUserMedia(hints, function(stream) {
    var video = document.createElement('video');
    video.src = URL.createObjectURL(stream);
    video.controls = true;
    video.muted = true;
    /* the call wrapping the following object is truncated in the source;
       handing it to peer.onStreamAdded is restored from context */
    peer.onStreamAdded({
      mediaElement: video,
      userid: 'self',
      stream: stream
    });
    callback(stream);
  }, function(error) { console.error('getUserMedia failed: ', error); });
}
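The mandatory/optional hint format used above is the legacy constraint syntax. Current browsers expect range-style constraints instead, passed to navigator.mediaDevices.getUserMedia(). As a rough sketch of how the same limits could be translated, the following hypothetical helper (toModernVideoConstraints is not part of any library) converts the legacy keys into min/max ranges.

```javascript
// Hypothetical converter from the legacy "mandatory" video hints to the
// range-style constraints accepted by navigator.mediaDevices.getUserMedia().
function toModernVideoConstraints(mandatory) {
  var out = {};
  function range(target, minKey, maxKey) {
    var r = {};
    if (mandatory[minKey] !== undefined) r.min = mandatory[minKey];
    if (mandatory[maxKey] !== undefined) r.max = mandatory[maxKey];
    if (Object.keys(r).length) out[target] = r; // skip empty ranges
  }
  range('width', 'minWidth', 'maxWidth');
  range('height', 'minHeight', 'maxHeight');
  range('aspectRatio', 'minAspectRatio', 'maxAspectRatio');
  return out;
}

var video = toModernVideoConstraints({
  minWidth: 200, minHeight: 200, maxWidth: 400,
  maxHeight: 400, minAspectRatio: 1.77
});
// In a page: navigator.mediaDevices.getUserMedia({ audio: true, video: video })
```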
The following is the web page's HTML content to add a button to start transmitting
the media, a video element to display the media, a text field to add a user's name,
and a table to list the existing available sessions:
<input type="text" id="your-name" placeholder="your-name">
<button id="start-broadcasting" class="setup">
Start Transmitting Yourself!</button>
<!-- list of all available conferencing rooms -->
<table id="rooms-list" style="width: 100%;"></table>
<!-- local/remote videos container -->
<div id="videos-container"></div>
The following screenshot depicts a user, Alice, creating a new session named alice.
Here, the user Alice creates a session for broadcasting video, which will be added to
the room list.
Alice's media is streamed on the session space, as shown in the next screenshot:
A new user, Bob, views the list of ongoing sessions from his remote computer,
and clicks on the Join button, as shown in the following screenshot, to join
Alice's session:
The following screenshot displays a two-way audio and video session in progress
between Bob and Alice:
Bob and Alice are in an audio/video sharing session. Using other WebRTC APIs,
we can also add file sharing, text chat, screen sharing capabilities, and so on to this
simple demonstration to turn it into a multifeatured communication tool.
Running WebRTC with SIP
This section introduces the approach of using the SIP signaling mechanism with
WebRTC. Like any other VoIP protocol, SIP provides the signaling framework
before an actual media path is set up. Building on an open-standard,
industry-adopted signaling protocol such as SIP is recommended, as it provides
the first and most crucial step toward a strong, scalable architecture.
Session Initiation Protocol (SIP)
As we already know, SIP is a signaling protocol that is used to establish an RTP
session between two endpoints.
As per the official document, RFC 3261, SIP is an application-layer
control protocol that can establish, modify, and terminate multimedia
sessions (conferences) such as Internet telephony calls.
The SIP stack defines the Request and Response methods. These methods are
used to gather the information about endpoints that wish to participate in a
communication so that the device-specific information such as IP, port, availability,
media understanding, and audio-video device compatibility can be sorted out
before media starts flowing over the connection.
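The difference between a SIP request and a SIP response is visible in the very first line of the message. As a small illustration, the following hypothetical helper (parseStartLine is a name made up for this sketch and is far simpler than a real SIP parser) tells the two apart.

```javascript
// Hypothetical, simplified parser for the first line of a SIP message.
function parseStartLine(line) {
  // Responses look like: SIP/2.0 200 OK
  var response = /^SIP\/2\.0 (\d{3}) (.*)$/.exec(line);
  if (response) {
    return { kind: 'response', status: Number(response[1]), reason: response[2] };
  }
  // Requests look like: INVITE sip:bob@example.com SIP/2.0
  var request = /^([A-Z]+) (\S+) SIP\/2\.0$/.exec(line);
  if (request) {
    return { kind: 'request', method: request[1], uri: request[2] };
  }
  return null; // not a SIP start line
}

console.log(parseStartLine('INVITE sip:bob@example.com SIP/2.0'));
console.log(parseStartLine('SIP/2.0 180 Ringing'));
```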
However, it should be noted that traditional SIP is a bit different from SIP over
WebSocket (SIPWS), which is used in the case of WebRTC with SIP signaling.
Not every SIP server understands SIPWS by default. Only those
SIP servers that have WebSocket support, or state that they are WebRTC compliant,
will be able to proxy or understand the SIP messages sent from a WebRTC client.
Why do we use SIPWS? This protocol allows the development of Convergent
applications, that is, applications that support SIP for communication, HTTP for
web components, and WebRTC for media. SIPWS can be transformed into plain SIP
signal through a gateway, which can then interact with the IMS network. Also, SIP
can be used to integrate application logic such as call screening and call rerouting,
with the help of SIP Servlets or other kinds of SIP programming. More of this is
given in Chapter 3, WebRTC with SIP and IMS.
SIPWS is explained in detail in the IETF draft, The WebSocket Protocol as a Transport for
the Session Initiation Protocol (SIP) draft-ietf-sipcore-sip-websocket-10 and can be found at
The following figure depicts the use of the SIPWS signaling plane with the WebRTC
media plane:
The following figure provides the call flow of the SIPWS signaling mechanism.
Any SIP request is preceded by a one-time WebSocket handshake.
Alice loads a web page using her web browser and retrieves the JavaScript code that
implements the SIP WebSocket subprotocol. The JavaScript code (SIP WebSocket
Client) establishes a secure WebSocket connection with a SIP proxy/registrar
(SIP WebSocket Server) at proxy.example.com.
The following is an example of a WebSocket handshake in which the Client requests
the WebSocket SIP subprotocol support from the Server:
Request Method:
Status Code: 101 Switching Protocols
Request Headers:
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits,
User-Agent: Mozilla/5.0 (X11; Linux i686 (x86_64)) AppleWebKit/537.36
(KHTML, like Gecko) Chrome/32.0.1700.102 Safari/537.36
Response Headers:
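The handshake itself is plain HTTP. As a minimal sketch, the following Node.js code builds an upgrade request that asks for the sip subprotocol and computes the Sec-WebSocket-Accept value a server would return, as defined by the WebSocket protocol (RFC 6455). The function names are hypothetical, and the request line and path are assumptions for illustration; a browser produces all of this for us when we call new WebSocket(url, 'sip').

```javascript
var crypto = require('crypto');

// GUID fixed by RFC 6455 for computing Sec-WebSocket-Accept.
var WS_GUID = '258EAFA5-E914-47DA-95CA-C5AB0DC85B11';

// Value the server must send back for a given Sec-WebSocket-Key.
function acceptKeyFor(secWebSocketKey) {
  return crypto.createHash('sha1')
    .update(secWebSocketKey + WS_GUID)
    .digest('base64');
}

// Text of a client handshake that requests the "sip" subprotocol.
function buildHandshakeRequest(host, key) {
  return [
    'GET / HTTP/1.1',
    'Host: ' + host,
    'Upgrade: websocket',
    'Connection: Upgrade',
    'Sec-WebSocket-Key: ' + key,
    'Sec-WebSocket-Protocol: sip',
    'Sec-WebSocket-Version: 13',
    '', ''
  ].join('\r\n');
}

console.log(buildHandshakeRequest('proxy.example.com', 'dGhlIHNhbXBsZSBub25jZQ=='));
console.log(acceptKeyFor('dGhlIHNhbXBsZSBub25jZQ=='));
```

The server confirms the subprotocol by echoing Sec-WebSocket-Protocol: sip in its 101 Switching Protocols response; only after that do SIP messages start to flow over the socket.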
The following diagram shows the call between Alice and Bob through the SIP proxy
server over WebSocket signaling:
Every SIP endpoint is registered with the SIP Server by a unique callable ID. This is
referred to as the SIP URI and is denoted by the sip:<username>@<domainname>
format. When a user, Alice, calls another user, Bob, through Bob's SIP URI, then the
SIP WebSocket Server at proxy.example.com acts as a SIP proxy node and routes
the INVITE call to Bob's contact. Bob answers the call to start a conversation and then
terminates it with a BYE request when the communication is over.
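Since every endpoint is addressed by its SIP URI, client code often needs to split one into its parts. The following is a simplified sketch: parseSipUri is a hypothetical helper that handles only the sip:user@host[:port] form described above and ignores passwords, URI parameters, and headers.

```javascript
// Hypothetical, simplified SIP URI parser: sip:<username>@<domainname>[:port]
function parseSipUri(uri) {
  var match = /^(sips?):([^@;:]+)@([^;:?]+)(?::(\d+))?/i.exec(uri);
  if (!match) return null;
  return {
    scheme: match[1].toLowerCase(),
    user: match[2],
    host: match[3],
    port: match[4] ? Number(match[4]) : null // null when no port is given
  };
}

console.log(parseSipUri('sip:bob@proxy.example.com:5060;transport=ws'));
```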
JavaScript-based SIP libraries
There are many popular JavaScript libraries that offer easy-to-integrate support
for WebRTC communication using SIP signaling:
SIPJS: This is an SIP stack in JavaScript to implement SIP-based audio
and video user agents in the browser. You can find a running demo at
The demo application has the option to switch between WebRTC capabilities
and Flash for browsers that support and do not support WebRTC.
JSSIP: This is an SIP over WebSocket transport API for audio/video calls
and instant messaging. It works with all SIPWS-compatible SIP servers such
as OverSIP, Kamailio, and Asterisk servers. You can find a running demo at
sipML5: This is an open source JavaScript library with a provision for
RTCWeb Breaker (audio and video transcoding when the endpoints do not
support the same codecs or the remote server is not RTCWeb compliant). For
example, features such as audio/video call, instant messaging, presence, call
hold/resume, explicit call transfer, and Dual-tone multi-frequency (DTMF)
signaling using SIP INFO are present. You can find a running example at
QuoffeSIP: This is another WebRTC SIP library to establish real-time
communication between browsers. This is developed in CoffeeScript
(simple syntax). It features video/audio call capabilities using SIP over the
WebSocket protocol and also uses the SIP Outbound and GRUU protocols.
You can find a running example at http://talksetup.quobis.com/.
The implementation of the sipML5 and JSSIP libraries to constitute a simple
WebRTC browser client that is able to communicate to a similar peer in any
WebRTC-supported browser is covered in the next chapter.
In this chapter, we learned that a WebRTC communication process is divided into
two parts: signaling, where the session setup and teardown are agreed to, and media
transactions, which deal with the actual RTP streams that contain the
voice/video/data that the user has sent. We saw how to program the three basic
APIs of the WebRTC media stack, namely, getUserMedia, RTCPeerConnection, and
DataChannel. The Running WebRTC without SIP section described signaling done over
JSON via XMLHttpRequest using Node.js as the intermediate signaling server to
connect the peers and prepare for the media flow. The next section, Running WebRTC
with SIP, listed the libraries or WebRTC clients that use SIP over WebSocket to
take care of the signaling between WebRTC peers. In the following chapters, we
will see how to use WebRTC media APIs over the SIP WebSocket protocol in detail.
Making a Standalone
WebRTC Communication Client
The objective of this chapter is to make a simple WebRTC client and server module
that bypasses a centralized server and, instead, makes a direct peer-to-peer
connection between browsers through a Session Initiation Protocol (SIP) proxy
server. The aim is to connect a WebRTC client to another WebRTC client using
SIP over WebSocket as the signaling protocol. In this chapter, we will study the
following three prime ways of making SIP WebRTC calls:
WebRTC to WebRTC call through a public cloud-hosted, WebRTC-capable
SIP server, such as SIP2SIP
WebRTC to WebRTC call through a locally hosted, WebRTC-capable SIP
server, such as OfficeSIP
WebRTC call to SIP phone through a WebSocket gateway and SIP server,
such as Kamailio
We will begin the chapter by describing a simple WebRTC client-server model.
Description of the WebRTC
client-server model
The components of a typical WebRTC SIP-based client include the following:
SIP stack, in the form of a JavaScript library, to perform signaling
Cascading Style Sheets (CSS) to style a page
WebRTC media API to render a peer-to-peer connection between the
audio-video components of a page
An HTML5-based graphical interface to provide inputs such as registration
parameters, self-URI (short for Uniform Resource Identifier), URI of the
party to be called, and so on
The following diagram depicts the important components to set up a WebRTC
[Diagram labels: VP8 Codec, Video Jitter, WebRTC media API, Session Management,
jQuery SIP stack, Browser-based Client Environment, Web Server, SIP Server]
The client side must be linked to a server that runs on the network side to complete
the signal flow. The components that must be deployed on the network side are
as follows:
The WebRTC gateway to connect to the native SIP world
The SIP server to embed the SIP application/proxy logic
The web browser is the key component in WebRTC transactions. It is the client-side
environment that fetches the HTML content from a web server, interprets the
HTML tags, and displays the web page to the user. A WebRTC-capable browser has
the additional ability to access the user's input media devices, such as the
microphone and camera, and stream them across the network. In the preceding
diagram, there are some key functions of the WebRTC media API that are embedded
in the browser. These include codecs for audio and video, noise reduction, image
enhancements, jitter buffer, multiplexing, SRTP, ICE with STUN/TURN, and so on.
The gateway is the internetworking node between WebRTC's SIP over WebSocket side
and the traditional SIP/IMS side. The traditional SIP-based network is depicted by
the SIP server in the preceding diagram.
The parts for media handling between WebRTC and non-WebRTC clients, such as
relay and transcoding, will be explained in depth in later chapters. Here, we shall
look at an infrastructure comprising only SIP and WebRTC. For now, consider
a simple WebRTC client trying to communicate with another WebRTC client through
the browser interface. There are many SIP and WebRTC implementations available
today; we will consider sipML5 and jsSIP among these to make a simple WebRTC
client, which will communicate via an SIP server.
The sipML5 WebRTC client
The sipML5 client is a library of SIP and Session Description Protocol (SDP)
stacks written in JavaScript, using WebSocket as the network transport mechanism.
It supports TCP, UDP, and TLS transports. It is provided under the BSD license.
There are three ways of using the sipML5 WebRTC client:
The first option is to use the online demo version available at
The second option is to make use of the minified version of the JavaScript
API and code that can be imported and loaded directly using the web server
The third option (recommended for integrators) is to get the developers'
version of the sipML5 master that can be checked out from GitHub and
used for development and debugging for enhanced operations
Let's begin the exercise using only the primitive and necessary sipML5 functions to
make a call successfully from a web page without the need of backbone components
such as web.xml and the Java source. To simplify things, we will not look into the
enhanced features such as Presence (Subscribe, Notify), DTMF, and speed dialing at
this point. These topics will be covered in Chapter 6, Basic Features of WebRTC over SIP.
Developing a minified webphone application
using Tomcat
The steps to set up a Tomcat web server are described in this section.
1. First is the installation of a web application server to host the web archive (war)
that contains the WebRTC call page. We are using Apache Tomcat Version
7.0.50 here. It can be downloaded from https://tomcat.apache.org/.
2. We must ensure that JAVA_HOME is set as an environmental variable for
Tomcat in Windows (refer to the following screenshot).
3. Start the Tomcat batch script after the preceding two steps. You will see
the following output in the console:
The code for the web page that acts like a web-based phone using WebRTC calls
(along with the explanation of various code snippets) is given as follows:
1. Start the process by making a local copy of the SIP-signaling JavaScript
file. Open an empty text file and import the sipml5-api JS library file from
2. Write the following JavaScript functions to initialize the engine:
var readyCallback = function(e){
  createSipStack(); // see next section
};
var errorCallback = function(e){ // stack failed to initialize
  console.error('Failed to initialize the engine: ' + e.message);
};
SIPml.init(readyCallback, errorCallback);
3. The following function shows how to define event reactions when the client
has started and when a call arrives:
var eventsListener = function(e){
    if(e.type == 'started'){
        // the stack is up; register or place a call from here
    }
    else if(e.type == 'i_new_call'){
        // incoming audio/video call
        acceptCall(e);
    }
};
4. The following is the JavaScript code to start an SIP stack with parameters in
var sipStack;
function createSipStack(){
    sipStack = new SIPml.Stack({
        realm: 'sip2sip.info', // mandatory : domain name
        impi: 'altanai', // mandatory : IMS Private Identity
        impu: 'sip:altanai@sip2sip.info', // mandatory : IMS Public Identity
        password: '/*enter sip2sip.info account password*/',
        display_name: 'altanai',
        websocket_proxy_url: 'wss://sipml5.org:10062',
        outbound_proxy_url: 'udp://example.org:5060',
        enable_rtcweb_breaker: false,
        events_listener: { events: '*', listener: eventsListener } // optional: '*' means all events
    });
    sipStack.start(); // asynchronous; eventsListener receives 'started' on success
}
5. The declaration of the elements to make and receive a call is given as follows:
var callSession;
var makeCall = function(){
    callSession = sipStack.newSession('call-audiovideo', {
        // element ids must match the HTML containers added in step 7
        video_local: document.getElementById('video_local'),
        video_remote: document.getElementById('video_remote'),
        audio_remote: document.getElementById('audio_remote'),
        events_listener: { events: '*', listener: eventsListener }
    });
    callSession.call('testagent'); // callee's SIP username; 'testagent' is only an example
};
6. The function definition to accept an incoming call using the sipML5 library is
given as follows:
var acceptCall = function(e){
    e.newSession.accept(); // e.newSession.reject() to reject the call
};
7. Add HTML containers for local and remote videos as shown in the following
lines of code:
<video width="100%" height="100%" id="video_remote"
autoplay="autoplay"></video>
<video class="video" width="88px" height="72px" id="video_local"
autoplay="autoplay" muted="true"></video>
8. Make a folder under the webapps folder within the Tomcat folder. Name it
miniSipml5phone, place the SIPml5-API.js file in it, and save the file we
created there as index.html.
9. Open http://<ip>:<port>/<foldername> in a browser to load the
web page. To test whether this code works on the local machine,
enter http://localhost:8080/miniSipml5phone in the address bar to
load the call page.
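The stack parameters from step 4 can also be assembled by a small helper before they are handed to SIPml.Stack. The following is only a minimal sketch and not part of sipML5: buildStackConfig and the account field names are hypothetical, and it assumes that the public identity is simply the private identity at the realm, as in the sip2sip.info example.

```javascript
// Hypothetical helper: builds the options object that step 4 passes to
// `new SIPml.Stack(...)`. It does not call sipML5 itself.
function buildStackConfig(account, eventsListener) {
    // realm and impi are the settings marked as mandatory in step 4
    ['realm', 'impi'].forEach(function (key) {
        if (!account[key]) {
            throw new Error('Missing mandatory SIP setting: ' + key);
        }
    });
    return {
        realm: account.realm,
        impi: account.impi,
        // Assumption: public identity = sip:<impi>@<realm>
        impu: 'sip:' + account.impi + '@' + account.realm,
        password: account.password,
        display_name: account.displayName || account.impi,
        websocket_proxy_url: account.websocketProxyUrl,
        enable_rtcweb_breaker: false,
        events_listener: { events: '*', listener: eventsListener }
    };
}

var exampleConfig = buildStackConfig(
    { realm: 'sip2sip.info', impi: 'altanai', password: 'secret' },
    function (e) { /* react to stack events here */ }
);
console.log(exampleConfig.impu);
```

Failing early on a missing realm or private identity is easier to debug than a registration that is rejected later by the server.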
After developing the WebRTC client and deploying it over the Application Server,
it's time to test its functions. The best way to do this is by inspecting the traces.
Use the console screen of Chrome or Firefox to see the traces of SIP requests.
The trace for an SIP stack initialization should be of the following structure.
SIPML5 API version = 1.3.214
User-Agent=Mozilla/5.0 (Windows NT 6.0) AppleWebKit/537.36 (KHTML, like
Gecko) Chrome/34.0.1847.116 Safari/537.36
WebSocket supported = yes
SIP stack start: proxy='ns313841.ovh.net:12060', realm='<SIP:sip2sip.
info>', impi='altanai', impu='<SIP:altanai@sip2sip.info>'
Connecting to 'WS://ns313841.ovh.net:12060'
State machine: c0000_Started_2_Outgoing_X_oINVITE PeerConnectionClass =
function RTCPeerConnection() { [native code] } SessionDescriptionClass
= function RTCSessionDescription() { [native code] } IceCandidateClass =
function RTCIceCandidate() { [native code] }
Video Constraints:{ /* video constraints added to WebRTC client appear
here */}
ICE servers:[/* list of stun servers added to WebRTC client appear here */]
If an exception occurs for a missing resource such as an audio, a JavaScript, or an
image file, the browser console shows a notification for it. The following statement
is a "GIF file not found" notification:
GET 404 (Not Found)
We must make amendments to the HTML content that points to the correct resource
path so as to run the WebRTC client code unobstructed. In case the JavaScript file
for SIP functions is not loaded properly, the web handshake and the subsequent
communication operation will not take place.
The SIP requests and SDP can be viewed here; this can help in solving errors.
The trace for the SIP INVITE request from the WebRTC client is of the following
structure. The ICE candidates come into play first. After this, the SIP INVITE
request for a call is generated and sent to the other user along with SDP.
SEND: INVITE SIP:testagent@sip2sip.info SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;branch=z9hG4bKSM7tGLdSUy3DpGehaPa78H
From: <SIP:altanai@sip2sip.info>;tag=2CEhe0X78NxVm7aRCaBa
To: <SIP:testagent@sip2sip.info>
Contact: "undefined"<SIP:altanai@df7jal23ls0d.invalid;rtcweb-breaker=no;c
Call-ID: 16a15a79-e4a6-78a1-2310-1243ebafe826
CSeq: 59935 INVITE
Content-Type: application/sdp
Content-Length: 3984
Max-Forwards: 70
o=- 6574822970880695000 2 IN IP4
s=Doubango Telecom - chrome
t=0 0
a=group:BUNDLE audio video
m=audio 15856 RTP/SAVPF 111 103 104 0 8 106 105 13 126
c=IN IP4
The SDP and SIP traces shown here are modified to depict only
the important headers. Many other headers have been removed
from traces for clarity.
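When many such traces have to be inspected, it can help to split a SIP request into its request line and headers programmatically. The following is a minimal sketch under simplifying assumptions: parseSipRequest is a hypothetical helper, it expects the raw message without the "SEND:" prefix that the console adds, and it ignores folded headers and repeated header names.

```javascript
// Hypothetical helper: splits a raw SIP request into request line and headers.
function parseSipRequest(raw) {
    var lines = raw.split(/\r?\n/);
    var start = lines[0].split(' '); // e.g. "INVITE sip:user@host SIP/2.0"
    var headers = {};
    // Headers run until the first empty line; the SDP body follows it
    for (var i = 1; i < lines.length && lines[i] !== ''; i++) {
        var colon = lines[i].indexOf(':');
        if (colon === -1) { continue; }
        var name = lines[i].slice(0, colon).trim().toLowerCase();
        headers[name] = lines[i].slice(colon + 1).trim();
    }
    return { method: start[0], uri: start[1], version: start[2], headers: headers };
}

var invite = parseSipRequest([
    'INVITE sip:testagent@sip2sip.info SIP/2.0',
    'Call-ID: 16a15a79-e4a6-78a1-2310-1243ebafe826',
    'CSeq: 59935 INVITE',
    'Content-Type: application/sdp',
    '',
    'v=0'
].join('\r\n'));
console.log(invite.method, invite.headers['cseq']);
```

Header names are lowercased because SIP header names are case-insensitive.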
Now that we have gained a bit of insight into WebRTC code components,
let's use this to develop the complete JavaScript of the sipML5 library and
constitute a dynamic web project, which will be later used to embed the logic
of other features such as phonebook, call logs, voicemail, and user profile.
Developing our customized version of the
sipML5 client
This section describes the process of building our own customized WebRTC client.
It is built over the SIP library, WebRTC API, and the web-based Graphical User
Interface (GUI) to enable the user to make and receive calls. The following steps
outline the process of creating a customized WebRTC client using the Dynamic
Web Project wizard of Eclipse. This differs from the earlier approach, in which
we were using the web-deployed sipML5 WebRTC web page.
Once we set up our own sipML5 web project, it is easy to make changes in the
configurations and user interface.
1. Download sipML5-master from GitHub (refer to https://github.com/
sipml5). Unzip and extract the folder.
2. Make an empty, dynamic, web project in Eclipse. Let's assume that the name
of the project is WebRTCSimpl5.
3. Copy the files under the release folder into the WebContent folder of Eclipse.
Now, the project explorer should look like the following screenshot:
4. Run call.htm on any web server such as Tomcat. Tomcat v7.0 is used on
the localhost.
5. Open the web page in a web browser that supports WebRTC. We must add
our SIP credentials on this page for the server to register the WebRTC SIP
client. An SIP client registration requires the authentication name, SIP URI,
password, and domain name fields to be specified at the time of registration.
The following screenshot shows the call.htm page that runs from the local
Tomcat web server:
6. Open the expert.htm page by clicking on the Expert mode? button. When
using a public server such as iptel or SIP2SIP, the domain name entered
in the registration section is enough to locate the server and connect the
client with it. However, for self-configured servers, the server address
of the WebSocket server must be entered in the WebSocket server URL
field. For example, if the WebSocket SIP server is installed on the machine
with IP and the ws port is 443, then the ws URL will be
ws:// The following screenshot shows the Expert.htm
page, where the server parameters are entered when the page is run from the
local Tomcat web server:
Expert.htm page, where server parameters are entered while the page is run from local Tomcat web server
7. Register the client with the SIP server by clicking on the Register button.
The browser console can be monitored at this stage; the console depicts the
registration SIP request being generated and sent to the SIP server.
Check the browser console traces for server errors, missing components,
and exceptions on the web page.
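The WebSocket server URL entered in expert mode is simply a scheme, a host, and a port joined together. The following is a minimal sketch of how such a URL could be composed and checked before it is typed into the page; buildWebSocketUrl is a hypothetical helper, and the address 192.0.2.10 is a documentation-only placeholder, not a real server.

```javascript
// Hypothetical helper: composes the value for the WebSocket server URL field.
function buildWebSocketUrl(host, port, secure) {
    if (!host) {
        throw new Error('A host name or IP address is required');
    }
    var portNumber = Number(port);
    var isValidPort = portNumber >= 1 && portNumber <= 65535 &&
        Math.floor(portNumber) === portNumber;
    if (!isValidPort) {
        throw new Error('Invalid port: ' + port);
    }
    // wss is WebSocket over TLS; ws is the plain variant
    return (secure ? 'wss' : 'ws') + '://' + host + ':' + portNumber;
}

console.log(buildWebSocketUrl('192.0.2.10', 443, false));
```

Pages served over HTTPS generally need the wss scheme, since browsers tend to block plain ws connections from secure pages.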
Similarly, a jsSIP WebRTC client can also be configured to use a WebRTC-supported
SIP server.
The jsSIP WebRTC client
The jsSIP client is a JavaScript library of the SIP stack and SDP, much like
sipML5. It can also be used in the following three ways:
The first option is to use the online demo of the jsSIP WebRTC client that can
be found at http://tryit.jssip.net/.
The second option is to use the minified version of the jsSIP JavaScript API
and the code.
The third option is to get the developer's version of the jsSIP master from
GitHub and use it for development and debugging.
Developing our version of the jsSIP client
To integrate the SIP WebRTC functionality into an existing web application, it is
required that you develop the WebRTC client from basic components so that it can be
customized later. We can perform the following steps to make a Dynamic Web Project
of the jsSIP WebRTC client using Eclipse Wizard, similar to the sipML5 project:
1. Download jsSIP-demo-master from GitHub (http://jssip.net/
download/) and unzip it.
2. Make an empty dynamic web project in Eclipse. Let's assume that we name it
3. Copy the files into WebContent of the project in Eclipse. The project explorer
should look like the following screenshot:
4. Open the index.html file to add the reference to the latest jssip-0.3.0.js
library file using the following line of code:
<script src="http://jssip.net/download/jssip-0.3.0.js" type="text/javascript"></script>
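5. Instantiate the following config parameters now or add them later:
// Default settings.
var default_SIP_uri = "jmillan@jssip.net";
var default_SIP_password = '';
var outbound_proxy_set = {
    host: "tryit.jssip.net:10080",
    WS_query: 'wwdf'
};
// Assumed mapping of the defaults above onto the JsSIP.UA settings
var configuration = {
    ws_servers: 'ws://' + outbound_proxy_set.host,
    uri: 'sip:' + default_SIP_uri,
    password: default_SIP_password
};
var JSsipPhone = new JsSIP.UA(configuration);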
6. Use the existing event definition or add your own under the existing
function's body. The event definition is as follows:
//WebSocket connection events
JSsipPhone.on('connected', function(e){ });
JSsipPhone.on('disconnected', function(e){ });
//New incoming or outgoing call event
JSsipPhone.on('newRTCSession', function(e){ });
//New incoming or outgoing IM message event
JSsipPhone.on('newMessage', function(e){ });
//SIP registration events
JSsipPhone.on('registered', function(e){ });
JSsipPhone.on('unregistered', function(e){ });
JSsipPhone.on('registrationFailed', function(e){ });
7. The following functions describe how to make an outgoing call or receive an
incoming audio/video call. Use the existing function calls to add your
Graphical User Interface (GUI) response with the help of CSS and jQuery,
such as showing the remote and local video captured in the video div and
printing console info traces for tracing.
8. The following are the HTML5 <video> elements in which local and remote
videos will be shown:
var selfView = document.getElementById('my-video');
var remoteView = document.getElementById('peer-video');
9. Register callbacks to the desired call events using the following lines of code:
var eventHandlers = {
    'progress': function(e){ },
    'failed': function(e){ },
    'started': function(e){
        var rtcSession = e.sender;
10. Attach local stream to selfView using the following lines of code, which
continue inside the started handler:
        if (rtcSession.getLocalStreams().length > 0) {
            selfView.src = window.URL.createObjectURL(
                rtcSession.getLocalStreams()[0]);
        }
11. Attach remote stream to remoteView using the following lines of code:
        if (rtcSession.getRemoteStreams().length > 0) {
            remoteView.src = window.URL.createObjectURL(
                rtcSession.getRemoteStreams()[0]);
        }
    },
    'ended': function(e){ /* Your code here */ }
};
var options = {
    'eventHandlers': eventHandlers,
    'extraHeaders': [ 'X-Foo: foo', 'X-Bar: bar' ],
    'mediaConstraints': {'audio': true, 'video': true}
};
JSsipPhone.call('sip:bob@somedomain.com', options);
12. The event handlers for messages are similar to the event handlers for a call.
To send or receive messages, use the existing function calls to add your
GUI responses, such as opening a new window or showing an alert on successful
sending of messages, using the following lines of code:
var text = 'Hello';
// Register callbacks to desired message events
var eventHandlers = {
    'succeeded': function(e){ },
    'failed': function(e){ }
};
var options = {
    'eventHandlers': eventHandlers
};
JSsipPhone.sendMessage('sip:bob@somedomain.com', text, options);
13. Run index.html that contains the phone elements and uses the jsSIP call
functions on any web server, such as JBoss or Apache.
14. Open the web page in the Google Chrome or Firefox web browser.
15. Register the client with an SIP server that supports WebSockets, such as
Kamailio, or use a WebSocket gateway such as OverSIP.
16. Monitor the WebRTC client traces on Wireshark.
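The connection and registration events listed in step 6 are usually wired to some visible status on the page. The following is a minimal sketch of that wiring. attachStatusHandlers and the status labels are hypothetical, and Node's EventEmitter stands in for a JsSIP.UA instance so that the sketch runs outside a browser; the only thing it relies on is the on(eventName, handler) method used in step 6.

```javascript
var EventEmitter = require('events');

// Hypothetical helper: maps the UA events from step 6 to status labels.
function attachStatusHandlers(ua, onStatus) {
    var labels = {
        connected: 'WebSocket connected',
        disconnected: 'WebSocket disconnected',
        registered: 'Registered',
        unregistered: 'Unregistered',
        registrationFailed: 'Registration failed'
    };
    Object.keys(labels).forEach(function (name) {
        ua.on(name, function () { onStatus(labels[name]); });
    });
}

// Stand-in for a JsSIP.UA instance; in the web page this would be JSsipPhone.
var fakeUa = new EventEmitter();
var statusLog = [];
attachStatusHandlers(fakeUa, function (status) { statusLog.push(status); });

fakeUa.emit('connected');
fakeUa.emit('registered');
console.log(statusLog.join(' -> '));
```

In the web page, the onStatus callback would update a status element instead of pushing to an array.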
In a similar fashion, other SIP stacks can also be integrated with WebRTC media
APIs to make a ready script to make and receive WebRTC calls over SIP.
The SIP stack can also be proprietary C code or adopted from a
freely available version on the Internet. JavaScript will aim to invoke
the functions from an HTML-based web page. WebRTC browser
media APIs will provide a way to capture and route the media.
SIP servers
The WebRTC client with an SIP stack can be registered and can send an invitation
or give answers through an SIP server. The SIP server might or might not have the
support for WebSocket. This categorization can be understood in two parts:
This part consists of a WebRTC-compliant SIP server, and the caller
and receiver are both on SIP over WebSocket (SIP WS to SIP WS).
The WebRTC-compliant SIP server can belong to one of the following
two categories:
°Using open public domains (such as SIP2SIP, JSSIP Tryit Server,
or sipML5.org). This is demonstrated in the following diagram:
WebRTC client on local machine and web-based SIP server such as SIP2SIP.info
°Using locally hosted WebRTC-compatible SIP server (OfficeSIP).
This is demonstrated in the following diagram:
Web-based WebRTC client and locally installed / configured SIP server (OfficeSIP)
This part consists of a simple SIP server that does not respond to SIP over
WebSocket, but only to SIP (SIP WS to SIP). This server can belong to one
of the following two categories:
°Using the WebRTC2sip gateway as an inter-conversion node between
SIPWS and SIP. This enables the WebRTC client to connect with a
legacy SIP server (such as Bea WebLogic, Rhino Telecom Application
Server, and Brekeke), which does not have support for WebSocket
yet. The OverSIP gateway also achieves the same goal. This
architecture is diagrammatically represented as follows:
WebRTC client on one side and native SIP phone on the other; the SIP server
(Bea WebLogic) that does not understand SIP over WebSocket communicates
with the WebRTC client using the webrtc2sip gateway
°If we implement a Telecom Server with both WebSocket and SIP
support, then the traditional SIP clients and WebRTC clients can
connect to each other without the use of any external gateway.
This is due to the fact that the server itself does the conversion
between the SIPWS and SIP protocols as and when a request arrives.
Kamailio, FreeSwitch, and Mobicents are some of the open source
SIP servers of this nature. This architecture is diagrammatically
represented as follows:
WebRTC client communicating with SIP client through SIP server
that also acts as SIP-WebSockets to SIP converter
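The categorization above boils down to one question: does the SIP server speak SIP over WebSocket itself? The following is a minimal sketch of that decision. signalingPath and the supportsWebSocket flag are hypothetical names used only for illustration, and the flag values in the usage lines follow the descriptions given in this section.

```javascript
// Hypothetical helper: lists the signaling hops between a WebRTC client
// and a SIP server, depending on the server's WebSocket support.
function signalingPath(server) {
    if (server.supportsWebSocket) {
        // SIP WS to SIP WS: the client connects to the server directly
        return ['WebRTC client', server.name];
    }
    // SIP WS to SIP: a gateway such as webrtc2sip or OverSIP sits in between
    return ['WebRTC client', 'SIPWS to SIP gateway', server.name];
}

console.log(signalingPath({ name: 'Kamailio', supportsWebSocket: true }).join(' -> '));
console.log(signalingPath({ name: 'Brekeke', supportsWebSocket: false }).join(' -> '));
```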
This section describes a SIP-WS to SIP-WS call, which involves making a call from
the WebRTC client to another WebRTC client using SIP over WebSocket as the
signaling protocol. To begin this task, we can use either the online-hosted demo
WebRTC-enabled projects of sipML5/jsSIP, or the self-compiled source code on the
local machine, as seen in the first part of this chapter. In addition to this, we must
set up a WebRTC server to provide signaling. The signaling can be in any of the
following ways:
Publicly hosted SIP server with WebRTC support, such as SIP2SIP
SIP servers' executables hosted in our servers, such as the OfficeSIP server
SIP servers built from source and hosted in our servers, such as Kamailio
To test the functionality of our customized WebRTC client, let's register it with the
SIP2SIP server.
The steps to register the client with the SIP2SIP server are as follows:
1. To register the client with the SIP2SIP SIP server, make an account at
2. Log in with the credentials. On the home page, click on the Identity
tab to view your public address and outbound proxy, as shown in
the following screenshot:
The SIP2SIP Internet-based account page
3. Go to our WebRTC client, leave the expert.htm page empty, and enter
values into the call.htm page directly. The following screenshot shows
the registration fields of the call.htm page to be registered with SIP2SIP:
4. Click on the Login button to register with the server. Successful registration
will be indicated by the connected status on the web page.
OfficeSIP is an SIP server for Windows. It is free for academic and
personal use. To use the OfficeSIP server to register the clients we made, we will
first have to install and configure it by performing the following steps:
1. Download the OfficeSIP server msi file from http://www.officesip.com.
Click on the Next button on successive windows to proceed with the
installation of the OfficeSIP software.
2. Start the admin console .exe file from the installation directory or the
shortcut icon that gets created during the installation or can be seen on the
Windows start menu. Alternatively, go to http://localhost:5060/admin/.
3. Add Domain Name from the Domain tab. Add users to the domain from the
.csv File tab under USERS, as shown in the following screenshot:
4. Register the WebRTC clients with the OfficeSIP server and make calls.
5. In the earlier example, we made use of a simple browser console debug
log file to see the SIP transaction. The following screenshot shows Wireshark
traces for the OfficeSIP server. This is used to view the incoming and
outgoing data packets:
SIP WS to SIP and vice-versa
The task of connecting a WebRTC client to a native SIP client such as X-Lite, Twinkle,
or an SIP phone is dealt with in two ways:
Through a WebRTC to SIP gateway: Use a gateway that does the SIPWS to
SIP conversion so that the traditional SIP server in the SIP legacy network can
understand the SIP request originating from WebRTC clients. To understand this
better, we can consider any native SIP server such as the Brekeke SIP proxy registrar
server or Bea WebLogic SIP Application server. These do not understand the
WebSocket protocol in their default behavior.
Through a hosted server that supports SIP over WebSocket: In this case, the
WebRTC client does not need a gateway to pass its SIP messages, as the SIP server
itself understands SIP over WebSocket. There are some popular servers that
understand WebSocket, such as Kamailio, Asterisk, and FreeSWITCH. It is, however,
required that you customize the default behavior of these servers, and add the
WebSocket module to the configuration file before usage.
We shall cover both of these approaches in the sections that follow.
The gateway to convert SIP over WebSocket
to native SIP
There can be custom-built or open source SIPWS to SIP gateways. To be able to
communicate with legacy SIP servers, we first need to use a WebRTC to SIP gateway,
such as WebRTC2sip or OverSIP.
The WebRTC2SIP gateway
WebRTC2Sip is a gateway that uses RTCWeb Breaker and SIP. It allows calls from
the SIP legacy network to operate with calls from the SIP-based WebRTC client.
It primarily has the following three modules:
SIP proxy is used to convert the SIP transport from WebSocket protocol to
UDP, TCP, or TLS; these are supported by all legacy networks
RTCWeb Breaker is used to negotiate and convert the media stream to allow
SIP legacy endpoints and WebRTC clients to interoperate
Media coder is for interoperability between different codecs supported by
different endpoints
The following diagram shows the overall functioning of the WebRTC to SIP gateway:
[Diagram: WebRTC to SIP gateway. A WebRTC client (HTML5-based GUI and WebRTC
media, hosted on an Apache Tomcat web server) connects to the gateway. Codecs
named in the diagram: Opus, G.711, G.722, GSM, AMR, G.729, Speex, Theora,
MP4V-ES, VP8, H.264, and H.263]
The call flow of the SIPWS request from the WebRTC client, conversion to a simple
SIP request, and the passage from the SIP legacy network to reach the SIP legacy
endpoint via the WebRTC2sip gateway is shown in the following figure:
[Call flow figure: web browser, webrtc2sip, SIP legacy network, and SIP legacy
endpoint, with RTCWeb media on the browser side and legacy media on the SIP side]
The steps for the installation of the WebRTC2sip gateway are described as follows:
1. The source code for webrtc2sip can be downloaded from
http://WebRTC2sip.org/ or by executing the following svn checkout
statement from the terminal window:
svn checkout http://WebRTC2sip.googlecode.com/svn/trunk/
After this, follow the technical guide in the document folder.
2. The WebRTC2sip gateway depends on Doubango IMS Framework v2.0.
Therefore, to configure the WebRTC2sip gateway, we first need to install
the Doubango IMS framework by running the following command line
in the command prompt:
svn checkout http://doubango.googlecode.com/svn/branches/2.0/doubango doubango
3. Also, we need to install some mandatory and optional libraries such as the
following ones:
°libsrtp for SRTP
°openSSL for WSS
°libspeex and libspeexdsp (these are audio codecs)
°YASM to enable VPX (VP8 video codec) or x264 (H.264 codec)
°libvpx and libyuv for video call support
°libopus for Opus audio codec
°libgsm for GSM based audio codecs
°g729 and iLBC for the G.729 and iLBC audio codecs
°x264, FFmpeg for H.263, H.264, and MP4V-ES video codecs
4. Build and install Doubango using the following command lines:
cd doubango && ./autogen.sh && ./configure --with-ssl --with-srtp \
  --with-vpx --with-yuv --with-amr --with-speex --with-speexdsp \
  --with-gsm --with-ilbc --with-g729 --with-ffmpeg
make && make install
The following screenshot shows how Doubango IMS is installed to support
libraries for the WebRTC2sip gateway:
5. Build and install the WebRTC2sip gateway using the following
command lines:
export PREFIX=/opt/WebRTC2sip
cd WebRTC2sip && ./autogen.sh && ./configure --prefix=$PREFIX
make clean && make && make install
6. The gateway is configured using the following XML file named config.xml,
which is stored in the same folder where the gateway is running:
<?xml version="1.0" encoding="utf-8" ?>
<!--Few more fields omitted for clarity -->
The le species the ports for transport protocols. It also species the
preference for video size and codecs supported, among others.
7. Register the WebRTC client with the WS:// address of the machine that runs the
WebRTC2sip gateway. To let our WebRTC client interact with an SIP server that
has no WebSocket support (in our case, Brekeke), we will use
a WebRTC to SIP gateway.
The WebRTC client with Brekeke SIP server
Brekeke is also a popular SIP server that does not support WebSocket as yet.
The following steps describe the process of configuring WebRTC to run through
this SIP server with the help of the WebSocket gateway:
1. Download and run Brekeke on a Windows machine
(refer to http://www.brekeke.com/downloads/sip-server.php).
2. Configure the Brekeke SIP server through the admin console in the local
network/machine. Register the X-Lite phone through the Brekeke SIP
registrar, as shown in the following screenshot:
3. Register the WebRTC client through the WebRTC2sip server to the Brekeke
SIP server as well.
Enter the address of the WebRTC2sip gateway machine in the WS server
input box of the Expert settings page, for example, ws://115.90.56.4:443.
Enter the address of the SIP server machine that runs Brekeke in the
outbound proxy input box of the Expert settings page, for example,
4. Run and test the X-Lite call to the WebRTC client using the WebRTC2sip
gateway and SIP server.
The WebRTC client with the Kamailio SIP server
Kamailio is an open source SIP server that also supports SIP over WebSocket,
among other features. It can be hosted only on Linux-based machines. Due to
machine dependency of the gateway and scalability issues, I recommend that you
use the Kamailio SIP server as an open source option to set up WebRTC to any
SIPUA infrastructure. Let's try to get a basic configuration of Kamailio started.
Some prerequisites for the installation of the Kamailio SIP server should be installed
on the machine before starting to build Kamailio from source. The following are the
packages you need to install before installing Kamailio 4.1.1:
Git client
GCC compiler
Now, perform the following steps to install the Kamailio SIP server:
1. The first step to configure the Kamailio SIP server is to get the source,
compile it, and install it. We should create a directory on the file
system, where the sources will be stored, using the following command line:
mkdir -p /usr/local/src/kamailio-4.0
2. We can download the sources from Git using the following command lines:
git clone git://git.sip-router.org/sip-router kamailio
cd kamailio
git checkout -b 4.0 origin/4.0
3. Generate the config files for the build system using the following command:
make cfg
4. The next step is to enable the MySQL module. For this, edit the modules.lst
file and add db_mysql to the variable include_modules as follows:
include_modules= db_mysql
5. Once you add the mysql module to the list of enabled modules, you can now
compile and install it using the following commands:
make all
make install
You might get error messages in between the installation if some
prerequisites were not installed. If so, just install these using yum install.
The following screenshot shows the execution of the Kamailio make all and
make install commands after the GIT checkout:
6. The next step is to set the path to the installation directories. So, before we
proceed further, let's have a look at the root directories and other installed
paths. The binaries to execute Kamailio and add or delete users are installed
inside the sbin folder of the Kamailio installation directory. These binaries
are as follows:
°kamailio: This is the Kamailio SIP server
°kamdbctl: This is the script to create and manage the databases
°kamctl: This is the shell script to manage and control the Kamailio
SIP server
°sercmd: This is the command line tool to interface with the
Kamailio SIP server
The configuration files can be found inside the etc folder of the installation
directory. Kamailio modules are installed inside the modules, modules_k, and
modules_s folders. One must ascertain that the installation path for modules
matches the one inside the config files so that Kamailio doesn't yield an error
when it starts or, at worst, at runtime. The following screenshot shows
the content of the sbin folder:
7. To configure the Kamailio SIP server as per our environment needs,
we must edit the config files. We have to add the IP address of the server
in the kamailio.cfg file. Add the following lines to kamailio.cfg,
if not already present:
#!define WITH_DEBUG
#!define WITH_MYSQL
#!define WITH_AUTH
#!define WITH_RLS
#!define WITH_XMLRPC
#!define WITH_TLS
Make sure that the following line regarding the SIP domain is uncommented
in the kamctlrc file:
SIP_DOMAIN=<your domain name, for example, Somedomain.com>
The database type can be MYSQL, PGSQL, ORACLE, DB_BERKELEY,
DBTEXT, or SQLITE; by default, none is loaded. Also, one has to specify
the database host, database name, user, and password.
8. In the next step, we will cover the process of adding more modules to the
existing setup. We must use the make and configure commands for this
purpose. Once the .so file is created, copy it to the folder where modules
are installed.
9. In the next and most crucial step, we must create the MySQL Kamailio
database. To create the MySQL database, we have to use the database
setup script. First, edit the kamctlrc file to set the database server type.
Locate the DBENGINE variable and set it to MYSQL as follows:
10. Once we are done updating the kamctlrc file, run the following script to
create the database used by Kamailio. We can find kamdbctl inside the
sbin folder of the installation directory.
# ./kamdbctl create
The preceding script will add two users in MySQL:
°kamailio (with the default password as kamailiorw): This user has
full access rights to the kamailio database
°kamailioro (with the default password as kamailioro): This user
has read-only access rights to the kamailio database
We can check the database created inside MySQL using the show databases
and show tables commands of MySQL.
11. To start Kamailio, go to sbin inside the Kamailio installation directory and
run the following command:
./kamailio start
12. To add users to the database, use the kamctl command as follows:
./kamctl add <user> <password>
13. Register the WebRTC clients with Kamailio by adding user details and
domain information in the call.htm registration section. Add the address of
the machine as ws://ip:port, for example, ws://, under
the WS server input box on the register.htm page. The status displayed on
the call.htm page should read Connected.
14. Register another user in the same way; calls can then be made between the two.
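Several of the preceding steps depend on values in the kamctlrc file, in particular SIP_DOMAIN (step 7) and DBENGINE (step 9). The following is a minimal sketch of a checker that reports which of the two are still commented out or missing before kamdbctl is run. parseKamctlrc and missingSettings are hypothetical helpers, and the sample text is an invented excerpt in the KEY=value style of that file, not a copy of a real installation.

```javascript
// Hypothetical helper: reads KEY=value lines, skipping blanks and # comments.
function parseKamctlrc(text) {
    var settings = {};
    text.split(/\r?\n/).forEach(function (line) {
        var trimmed = line.trim();
        if (trimmed === '' || trimmed.charAt(0) === '#') { return; }
        var eq = trimmed.indexOf('=');
        if (eq === -1) { return; }
        settings[trimmed.slice(0, eq).trim()] = trimmed.slice(eq + 1).trim();
    });
    return settings;
}

// Hypothetical helper: names the settings this walkthrough relies on
// that are absent or empty.
function missingSettings(settings) {
    return ['SIP_DOMAIN', 'DBENGINE'].filter(function (key) {
        return !settings[key];
    });
}

var sample = [
    '# SIP_DOMAIN=somedomain.com',   // still commented out
    'DBENGINE=MYSQL'
].join('\n');
console.log(missingSettings(parseKamctlrc(sample)));
```

In practice, the text would be read from the kamctlrc file in the etc folder of the installation directory.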
Setting up the admin console for Kamailio is an optional task. However, it's
recommended as it provides a graphical provisioning system to configure and
alter the settings of the SIP server. The following screenshot shows the admin GUI
SIREMIS, which gives a visual interface to server management rather than the
command prompt to monitor user accounts and usage statistics:
We can also monitor the real-time traffic using the Wireshark protocol analyzer.
The following screenshot depicts the flow graph generated from Wireshark, which
captures on all interfaces using the SIP and the WebSocket filters. In the following
screenshot, the flow graph traces the Kamailio SIP server:
The ow depicts a call session that begins with an invite request traversing across
various network nodes. It's not important to trace the path of the signal for now;
however, the sequence of signal ows is a crucial task in determining that the
WebRTC client server model is performing well.
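Checking the sequence of signals can be partly automated once the message names have been copied out of a capture. The following is a minimal sketch: followsSequence is a hypothetical helper that tests whether the expected messages occur in order, allowing other messages (such as provisional responses) in between. The message lists are illustrative and are not taken from the screenshot.

```javascript
// Hypothetical helper: true if every expected message appears in the
// observed list in the same relative order (other messages may intervene).
function followsSequence(observed, expected) {
    var next = 0;
    observed.forEach(function (message) {
        if (next < expected.length && message === expected[next]) {
            next++;
        }
    });
    return next === expected.length;
}

var observedFlow = ['INVITE', '100 Trying', '180 Ringing', '200 OK', 'ACK', 'BYE', '200 OK'];
var basicCall = ['INVITE', '200 OK', 'ACK', 'BYE', '200 OK'];
console.log(followsSequence(observedFlow, basicCall));
```

A failed check points at the first expected message that never arrived, which is usually where the trace should be examined more closely.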
Limitations of the existing setup
We saw how to develop a WebRTC client, install an SIP server, and configure a
WebRTC to SIP environment. A bird's-eye view of our final client-server solution
setup for SIPWS signaling and WebRTC media so far is shown as follows:
[Diagram: two WebRTC clients (sipML5) connected through a WebRTC-compliant
SIP server acting as proxy. Signalling: SIP over WebSockets; media: P2P]
As per our current setup status, only the WebRTC-enabled client and servers can
participate in the communication flow in an offer/answer (O/A) model.
There are, however, numerous limitations of the existing solution, some of which
are mentioned in the following sections. In the upcoming chapters, we shall do away
with most of the limitations.
Firewall and NAT issues
The existing architecture does not provide a Network Address Translation (NAT)
traversal technique to overcome the blockage due to firewalls and enterprise
policies. As a solution, we must look at alternatives for public IP discovery in
the WebRTC client-server setup. NAT traversal is possible in the Kamailio server
through RTP proxy modules and STUN.
Media transcoding
If the codecs on two endpoints do not match for audio and video communication,
then it could lead to a session failure with an abrupt termination of calls when a user
picks up a ringing call. This is where the media transcoder is required to support
communication with non-WebRTC devices such as SIP phones and softphones.
As a solution, we can either use the RTCWeb Breaker, which converts SDP and
media streams for WebRTC and other UAs, or configure a media server such as
FreeSWITCH, which provides the functionality. The following diagram shows the
complete architecture with the STUN server and RTCWeb Breaker:
[Figure: complete architecture with the STUN server and RTCWeb Breaker; label: Protocol RTP/SAVP]
A call ow that depicts the ow of media from the WebRTC client to the non-
WebRTC client (SIP phone) through the RTCWeb Breaker is shown as follows:
Call flow with Media Transcoder to connect WebRTC and non-WebRTC endpoints
Real-time Transport Protocol (RTP), which is the media flow mechanism in most
SIP clients, including SIP-based WebRTC, comprises two parts: the RTP data transfer
protocol and the RTP Control Protocol (RTCP). In addition to this, WebRTC also
mandates the use of the Secure RTP Profile for RTCP-based feedback (RTP/SAVPF).
An RTP profile defines media parameters such as compression and encoding.
The RTP/SAVPF profile, as depicted in the following diagram, is the
combination of the basic RTP/AVP profile, the RTP profile for RTCP-based
feedback (RTP/AVPF), and the secure RTP profile (RTP/SAVP). The RTCP-based
feedback extensions are needed for the improved RTCP timer that enables features
such as more flexible transmission and reporting of congestion.
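To make the relationship between the four profiles concrete, the following is a minimal Python sketch (not part of the setup itself) that reads the protocol field of an SDP media line and reports whether the profile implies SRTP and RTCP-based feedback. The function names and the sample media line are illustrative assumptions.

```python
# Illustrative sketch: split an RTP profile token into the features it implies.
KNOWN_PROFILES = {"AVP", "AVPF", "SAVP", "SAVPF"}

def describe_rtp_profile(proto):
    """Return the features implied by a token such as 'RTP/SAVPF'."""
    # WebRTC offers often carry a transport prefix, e.g. 'UDP/TLS/RTP/SAVPF',
    # so only the last path element is inspected.
    profile = proto.upper().split("/")[-1]
    if profile not in KNOWN_PROFILES:
        raise ValueError("unknown RTP profile: " + proto)
    return {
        "profile": "RTP/" + profile,
        "secure": profile.startswith("S"),        # SRTP (RTP/SAVP family)
        "rtcp_feedback": profile.endswith("F"),   # RTCP-based feedback (RTP/AVPF family)
    }

def profile_from_media_line(m_line):
    """Parse 'm=<media> <port> <proto> <fmt> ...' and describe its profile."""
    fields = m_line.strip().split()
    if len(fields) < 4 or not fields[0].startswith("m="):
        raise ValueError("not an SDP media line")
    return describe_rtp_profile(fields[2])
```

For example, calling profile_from_media_line on a hypothetical line such as "m=audio 9 UDP/TLS/RTP/SAVPF 111 0" reports both the secure and the feedback flags as set, whereas plain RTP/AVP reports neither.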
After fullling the limitations, there are some recommended enhancements in the
existing architecture; they contribute to making a robust, secure communication
platform. The limitations are described as follows:
Media should not be free owing between peer to peer but passing through a
media relay mechanism. A media relay mechanism involves a media server
to be in the path of the media ow. This way, the media server receives the
audio/video from one end and relays it to the other end. This leads to a
better control on the communication by centralized network nodes.
[Figure: architecture with media relay. Labels: WebRTC clients (depicted here as sipML5); Tomcat web server; WebRTC-compliant Kamailio SIP server as proxy; media relay and media (SRTP); RTP media P2P; signaling is SIP over WebSockets up to the Kamailio proxy SIP server; SIP services (call screening, call waiting, call forwarding) at the telecom application server]
IPv4 and IPv6 must be supported.
The Telecom Application server is needed to embed the logic of SIP services
such as call waiting, call forwarding, and call screening.
[Figure: architecture with the Telecom Application Server. Labels: WebRTC clients (depicted here as sipML5); Tomcat web server; Telecom Application Server; WebRTC-compliant Kamailio SIP server as proxy; media P2P; signaling is SIP over WebSockets up to the Kamailio proxy SIP server; SIP services (call screening, call waiting, call forwarding) at the telecom application server]
A database must be implemented to keep track of calls, user
authentication, user profiles, and so on.
The monitoring tools allow for real-time statistics that, in turn, help the
service provider to make predictive judgments and review the status in real
time. This also aids in charging and billing if the service provider opts to bill
the customer.
We will overcome these and a few more limitations when integrating with the IP
Multimedia Subsystem (IMS) environment. They will be discussed in detail in the
next chapter.
In this chapter, we learned how to make a dynamic web application for the WebRTC
client using primitive building blocks such as CSS, JavaScript, SIP library, and
HTML form elements. We also saw the setup of various kinds of SIP servers and
their applicability in establishing an end-to-end call. In this process, we studied the
implementation of WebSocket-supported SIP servers. We also studied the integration
of the SIP WebRTC client with non-WebSocket supported SIP servers, through
WebSocket gateways.
In essence, we learned how client development and essential servers help
to support the WebRTC SIP infrastructure. This includes the Tomcat web server,
which caters to the loading of a web page and the HTTP handshake, and the Kamailio
SIP server, which acts as a registrar and as the SIP proxy node. The WebRTC client
programs used open source libraries such as jsSIP and sipML5. The interaction and
challenges inherent in communication with non-WebRTC SIP endpoints, such as
SIP phones and softphones, were also discussed.
WebRTC with SIP and IMS
IP Multimedia Subsystem (IMS) is an architectural framework for IP multimedia
communications and IP telephony based on convergent applications. It specifies
three layers in a telecom network:
Transport or Access layer: This is the bottom-most segment responsible for
interacting with end systems such as phones.
IMS layer: This is the middleware responsible for authenticating and routing
the traffic and facilitating call control through the Service layer.
Service or Application layer: This is the top-most layer where all of the call
control applications and Value Added Services (VAS) are hosted.
IMS standards are defined by the Third Generation Partnership Project (3GPP),
which adopts and promotes Internet Engineering Task Force (IETF) Request for
Comments (RFCs). Refer to
http://www.3gpp.org/technologies/keywords-acronyms/109-ims to learn more about
3GPP IMS specification releases.
This chapter will walk us through the interaction of the WebRTC client with important
IMS nodes and modules. The WebRTC gateway is the first point of contact for the
SIP requests from the WebRTC client to enter into the IMS network. The WebRTC
gateway converts SIP over WebSocket to legacy/plain SIP; that is, it is
a WebRTC-to-SIP gateway that connects to the IMS world and is able to communicate
with a legacy SIP environment. It can also translate other REST- or JSON-based
signaling protocols into SIP. The gateway also handles the media operations that
involve DTLS, SRTP, RTP, transcoding, demuxing, and so on.
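As a small illustration of the signaling side of this conversion, the following Python sketch inspects the transport token of a Via header value to tell whether a request arrived as SIP over WebSocket (tokens WS and WSS, as defined in RFC 7118) or over a legacy transport. It is a simplified sketch under the assumption that the sent-protocol is written without spaces around the slashes; the function names and sample header values are illustrative and this is not how any particular gateway is implemented.

```python
# Illustrative sketch: classify a SIP request by the transport in its Via header.
WEBSOCKET_TRANSPORTS = {"WS", "WSS"}  # Via transport tokens for SIP over WebSocket

def via_transport(via_value):
    """Return the transport token from a value such as 'SIP/2.0/WS host;branch=...'."""
    sent_protocol = via_value.strip().split()[0]   # e.g. 'SIP/2.0/WS'
    parts = sent_protocol.split("/")
    if len(parts) != 3 or parts[0].upper() != "SIP":
        raise ValueError("not a SIP Via value")
    return parts[2].upper()

def arrives_over_websocket(via_value):
    """True when the topmost Via indicates SIP over WebSocket."""
    return via_transport(via_value) in WEBSOCKET_TRANSPORTS
```

A request whose topmost Via reads "SIP/2.0/WS ..." is the kind of traffic that needs the gateway before it can reach plain SIP nodes, while "SIP/2.0/UDP ..." is already legacy SIP.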
In the previous chapter, we saw how to create the WebRTC environment using
the SIP server that has WebSocket capabilities. In this chapter, we will study a case
where there exists a simple IMS core environment, and the WebRTC clients are
meant to interact after the signals are traversed through core IMS nodes such as
Call Session Control Function (CSCF), Home Subscriber Server (HSS), and
Telecom Application Server (TAS).
The Interaction with core IMS nodes
This section describes the sequence of steps that must be followed for the integration
of the WebRTC client with IMS. Before you go ahead, set up a Session Border
Controller (SBC) / WebRTC gateway / SIP proxy node for the WebRTC client
to interact with the IMS control layer.
1. Direct the control towards the CSCF nodes of IMS, namely, Proxy-CSCF,
Interrogating-CSCF, and Serving-CSCF.
2. The subscriber details and the location are updated in the HSS.
3. Serving-CSCF (S-CSCF) routes the call through the SIP Application Server
to invoke any services before the call is processed. The Application Server,
which is part of the IMS service layer, is the point of adding logic to call
processing in the form of VAS.
4. Additionally, we will uncover the process of integrating a media server for
inter-codec conversion between legacy SIP phones and WebRTC clients.
The setup will allow us to support all SIP nodes and endpoints as part of the IMS
landscape. We will follow the interaction of the WebRTC SIP client with IMS nodes,
assuming that the SIPWS to SIP gateway is configured, as described in Chapter 2,
Making a Standalone WebRTC Communication Client.
The following figure shows the placement of the SIPWS to SIP gateway in the
IMS network:
[Figure: SIPWS to SIP gateway placement; labels: IMS Network, WebRTC Server]
The WebRTC client is a web-based dynamic application that is run over a Web
Application Server. For simplification, we can group the components of the WebRTC
client and the Web Application Server together and address them jointly as the
WebRTC client, as shown in the following diagram:
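[Figure: the WebRTC client and Web Application Server shown jointly as the WebRTC client; labels: HTTP, WebRTC client]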
There are four major components of the OpenIMS core involved in this setup
as described in the following sections. Along with these, two components of the
WebRTC infrastructure (the client and the gateway) are also necessary to connect the
WebRTC endpoints. Three optional entities are also described as part of this setup.
The components of Open IMS are CSCF nodes and HSS. More information on each
component is given in the following sections.
The Call Session Control Function
The three parts of CSCF are described as follows:
Proxy-CSCF (P-CSCF) is the first point of contact for a user agent (UA) to
which all user equipment (UE) is attached. It is responsible for routing
an incoming SIP request to other IMS nodes, such as the registrar and the Policy
and Charging Rules Function (PCRF), among others.
Interrogating-CSCF (I-CSCF) is the inbound SIP proxy server for querying
the HSS as to which S-CSCF should be serving the incoming request.
Serving-CSCF (S-CSCF) is the heart of the IMS core as it enables centralized
IMS service control by defining routing paths, acting as the registrar,
interacting with the Media Server, and much more.
Home Subscriber Server
The IMS core Home Subscriber Server (HSS) is the database component responsible for
maintaining user profiles, subscriptions, and location information. The data is used in
functions such as authentication and authorization of users while using IMS services.
The WebRTC infrastructure primarily comprises WebRTC Web
Application Servers, WebRTC web-based clients, and the SIP gateway.
WebRTC Web Application Server and client: The WebRTC client is
intrinsically a web application that is composed of user interfaces, data access
objects, and controllers to handle HTTP requests. A Web Application Server
is where an application is hosted. As WebRTC is a browser-based technique,
it is meant to be an HTML-based web application. The call functionalities
are rendered through the SIP JavaScript files. The browser's native WebRTC
capabilities are utilized to capture and transmit the data. A WebRTC service
provider must embed the SIP call functions on a web page that has a call
interface. It must provide values for the To and From SIP addresses, a div to
play audio/video content, and access to users' resources such as camera,
mic, and speakers.
WebRTC to IMS gateway: This is the point where the conversion of the
signal from SIP over WebSockets to legacy/plain SIP takes place. It renders
the signaling into a state that the IMS network nodes can understand. For
media, it performs the transcoding from WebRTC standard codecs to others.
It also performs decryption and demux of audio/video/RTCP/RTP.
There are other servers that act as IMS nodes as well, such as the STUN/TURN
Server, Media Server, and Application Server. They are described as follows:
STUN/TURN Server: These are employed for NAT traversals and
overcoming firewall restrictions through ICE candidates. They might not be
needed when the WebRTC client is on the Internet and the WebRTC gateway
is also listening on a publicly accessible IP.
Media Server: The Media Server plays a role when media relay is required
between the UEs instead of a direct peer-to-peer communication. It also
comes into the picture for services such as voicemail, Interactive Voice
Response (IVR), playback, and recording.
Application Server (AS): Application Server is the point where developers
can make customized logic for call control such as VAS in the form of call
redirecting in cases when the receiver is absent and selective call screening.
The IP Multimedia Subsystem core
IMS is an architecture for real-time multimedia (voice, data, video, and messaging)
services using a common IP network. It defines a layered architecture. According
to the 3GPP specification, IMS entities are classified into six categories:
Session management and routing (CSCF, GGSN, and SGSN)
Database (HSS and SLF)
Interworking elements (BGCF, MGCF, IM-MGW, and SGW)
Service (Application Server, MRFC, and MRFP)
Strategy support entities (PDF)
Interoperability with the SIP infrastructure requires a session border controller
to decrypt the WebRTC control and media flows. A media node is also set up for
transcoding between WebRTC codecs and other legacy phones. When a gateway
is involved, the WebRTC voice and video peer connections are between the browser
and the border controller. In our case, we have been using Kamailio in this role
(refer to Chapter 2, Making a Standalone WebRTC Communication Client). Kamailio
is an open source SIP server capable of processing both SIP and SIPWS signaling.
As WebRTC is made to function over SIP-based signaling, it can make use of all
of the services and solutions made for the IMS environment. The telecom operators
can directly mount the services in the Service layer, and subscribers can avail of the
services right from their web browsers through the WebRTC client. This adds a new
dimension to user accessibility and experience. A WebRTC client's true potential will
come into effect only when it is integrated with the IMS framework.
We have some ready-made, open IMS setups that have been tested for
WebRTC-to-IMS integration. The setups are as follows:
3GPP IMS: This is the IMS specification by 3GPP, which is an association
of telecommunications groups
OpenIMS: This is the open source implementation of the IMS CSCFs
and a lightweight HSS for the IMS core
DubangoIMS: This is the cross-platform and open source 3GPP IMS/LTE
KamailioIMS: Kamailio Version 4.0 and above incorporates IMS support
by means of OpenIMS
We can also use any other IMS structure for the integration. In this chapter, we
will demonstrate the use of OpenIMS. For this, it is required that a WebRTC client
and a non-WebRTC client must be interoperable by means of signaling and media
transcoding. Also, the essential components of IMS world, such as HSS, Media
Server, and Application Server, should be integrated with the WebRTC setup.
The OpenIMS Core
The Open IMS Core is an open source implementation for core elements of the IMS
network that includes IMS CSCFs nodes and HSS. The following diagram shows
how a connection is made from WebRTC to CSCF:
[Figure: connection from WebRTC to CSCF; labels: VoIP client, WebRTC IMS gateway, IMS network core]
The following are the prerequisites to install the Open IMS core:
Make sure that you have the following packages installed on your Linux
machine, as their absence can hinder the IMS installation process:
°Git and Subversion
°GCC3/4, Make, JDK1.5, Ant
°MySQL as the database
°Bison and Flex, the Linux utilities
°libxml2 (Version 2.6 and above) and libmysql with
development versions
Install these packages from the Synaptic package manager or using the
command prompt.
For the LoST interface of E-CSCF, use the following command lines:
sudo apt-get install mysql-server libmysqlclient15-dev libxml2 libxml2-dev bind9 ant flex bison curl libcurl4-gnutls-dev
sudo apt-get install curl libcurl4-gnutls-dev
The Domain Name Server (DNS), bind9, should be installed and run.
To do this, we can run the following command line:
sudo apt-get install bind9
We need a web browser to review the status of the connection on the
web console. To download a web browser, go to its download page.
For example, Chrome can be downloaded from https://www.google.com/
We must verify that the Java version installed is above 1.5 so as to not
break the compilation process in between, and set the path of JAVA_HOME
as follows:
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/jre
The output of the command line that checks the Java version is as follows:
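The version check can also be scripted. The following Python sketch parses text in the style printed by java -version and tests it against the bounds mentioned in this chapter (at least 1.5, and Java 7 or lower for FHoSS). The function names are illustrative, and treating 1.5 itself as acceptable is an assumption based on the JDK1.5 entry in the prerequisites.

```python
import re

def java_major_version(version_output):
    """Extract the major Java version from `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first, second = match.group(1), match.group(2)
    # Releases up to Java 8 are reported as 1.x (for example, 1.7.0_65 is Java 7).
    if first == "1" and second is not None:
        return int(second)
    return int(first)

def is_supported_for_openims(version_output):
    """Apply the bounds stated in the text: 1.5 or above, Java 7 or lower."""
    return 5 <= java_major_version(version_output) <= 7
```

Note that java -version writes its report to the standard error stream, so a wrapper that runs the command needs to capture that stream rather than standard output.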
The following are the steps to install OpenIMS. As the source code is preconfigured
to work from a standard file path of /opt, we will use the predefined directory
for installation.
1. Go to the /opt folder and create a directory to store the OpenIMS core,
using the following command lines:
mkdir /opt/OpenIMSCore
cd /opt/OpenIMSCore
2. Create a directory to store FHoSS, check out the HSS, and compile the
source using the following command lines:
mkdir FHoSS
svn checkout http://svn.berlios.de/svnroot/repos/openimscore/FHoSS/trunk FHoSS
cd FHoSS
ant compile deploy
Note that the code requires Java Version 7 or lower to work.
3. Also, create a directory to store ser_ims, check out the CSCFs, and then
install ser_ims using the following command lines:
mkdir ser_ims
svn checkout http://svn.berlios.de/svnroot/repos/openimscore/ser_ims/trunk ser_ims
cd ser_ims
make install-libs all
After downloading and installing the OpenIMS installation directory,
its contents are as follows:
By default, the nodes are configured to work only on the local loopback, and the
default domain configured is open-ims.test. The MySQL access rights are also
set only for local access. However, this can be modified using the following steps:
1. Run the following command line:
2. Replace (the default IP for the localhost) with the new IP address
that is required to configure the IMS Core server.
3. Replace the home domain (open-ims.test) with the required domain name.
4. Change the database passwords.
The following figure depicts the domain change process through
5. To resolve the domain name, we need to add the new IMS domain to the bind
configuration directory. Change to the system's bind folder (cd /etc/bind)
and copy the open-ims.dnszone file there after replacing the domain name.
sudo cp /opt/OpenIMSCore/ser_ims/cfg/open-ims.dnszone /etc/bind/
6. Open the named.conf file and include open-ims.dnszone in the list that
already exists:
include "/etc/bind/named.conf.options";
include "/etc/bind/named.conf.local";
include "/etc/bind/named.conf.default-zones";
include "/etc/bind/open-ims.dnszone";
One can also add a reverse zone file, which, contrary to the
DNS zone file, converts an address to a name.
7. Restart the naming server using the following command:
sudo service bind9 restart
8. In case of any failure or error, the system logs can be
followed using the following command line:
tail -f /var/log/syslog
9. Use the MySQL client to run the SQL scripts for the
creation of the database and tables for HSS operations:
mysql -u root -p -h localhost < ser_ims/cfg/icscf.sql
mysql -u root -p -h localhost < FHoSS/scripts/hss_db.sql
mysql -u root -p -h localhost < FHoSS/scripts/userdata.sql
The following screenshot shows the tables for the HSS database:
Users should be registered with a domain (that is, one needs to make
changes in the userdata.sql file by replacing the default domain
name with the required domain name). Note that while it is not
mandatory to change the domain, it is a good practice to add a new
domain that describes the enterprise or service provider's name.
The following screenshot shows user domains changed from the default to
the personal domain:
10. Copy the pcscf.cfg, pcscf.sh, icscf.cfg, icscf.xml, icscf.sh, scscf.cfg,
scscf.xml, and scscf.sh files to the /opt/OpenIMSCore location.
11. Start the Proxy Call Session Control Function (P-CSCF) by executing the
pcscf.sh script. The default element port assigned for P-CSCF is 4060.
A screenshot of the running P-CSCF is as follows:
12. Start the Interrogating Call Session Control Function (I-CSCF) by executing
the icscf.sh script.
The default element port assigned to I-CSCF is 5060. If the scripts display a
warning about connection, it is just because the FHoSS client still needs to be
started. A screenshot of the running I-CSCF is as follows:
13. Start SCSCF by executing the scscf.sh script. The default element port
assignment for S-CSCF is 6060.
A screenshot of the running SCSCF is as follows:
14. Start the FOKUS Home Subscriber Server (FHoSS) by executing
The HSS interacts using the Diameter protocol. The ports used for this
protocol are 3868, 3869, and 3870.
A screenshot of the running HSS is shown as follows:
15. Go to http://<yourip>:8080 and log in to the web console with
hssAdmin as the username and hss as the password as shown in the
following screenshot.
16. To register the WebRTC client with OpenIMS, we must use an IMS gateway
that performs the function of converting the SIP over WebSocket format to
SIP. In order to achieve this, use the IP address and port, or the domain, of the
P-CSCF node while registering the client.
The flow will be from the WebRTC client to the IMS gateway to the P-CSCF
of the IMS Core. The flow can also be from the sipML5 WebRTC client to the
webrtc2sip gateway to the P-CSCF of the OpenIMS Core.
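The domain and address replacement described in steps 2 and 3 amounts to a search and replace over the configuration text. The following Python sketch shows the idea on an in-memory string; it is not the tool used in the steps above. The function name, the sample lines, and the addresses (taken from the 192.0.2.0/24 documentation range) are hypothetical, and real configuration files may spell the domain in other ways (for example, regex-escaped) that this sketch does not handle.

```python
import re

DEFAULT_DOMAIN = "open-ims.test"  # default home domain named in the text

def retarget_config(text, new_domain, old_ip, new_ip, old_domain=DEFAULT_DOMAIN):
    """Return a copy of `text` with the home domain and IP address replaced."""
    text = text.replace(old_domain, new_domain)
    # Match the address only when it is not part of a longer dotted number,
    # so that replacing 192.0.2.10 leaves 192.0.2.100 untouched.
    ip_pattern = r"(?<![\d.])" + re.escape(old_ip) + r"(?!\d)"
    return re.sub(ip_pattern, lambda _match: new_ip, text)

# Hypothetical sample input, not taken from the real OpenIMS files.
sample = "alias=pcscf.open-ims.test\nlisten=192.0.2.10\nother=192.0.2.100\n"
updated = retarget_config(sample, "example.org", "192.0.2.10", "192.0.2.20")
```

Working on a copy of each file and comparing it with the original before overwriting is a sensible precaution, since a plain textual replace cannot tell a configuration value from a comment.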
The subscribers are visible in the IMS subscription section of the portal of OpenIMS.
The following screenshot shows the user identities and their statuses on a web-based
admin console:
As far as other components are concerned, they can be subsequently added to
the core network over their respective interfaces. We will study the integration
of the Policy and Charging Rules Function, Application Server, Media Server, and
other vital components in Chapter 7, WebRTC with Industry Standard Frameworks.
The Telecom server
The TAS is where the logic for processing a call resides. It can be used to add
applications such as call blocking, call forwarding, and call redirection according to
the predefined values. The inputs can be assigned at runtime or stored in a database
using a suitable provisioning system. The following diagram shows the connection
between WebRTC and the IMS Core Server:
[Figure: connection between WebRTC and the IMS Core Server; labels: VoIP client, WebRTC Client, WebRTC IMS gateway, IMS network core, Application Server]
For demonstration purposes, we can use an Application Server that can host SIP
servlets and integrate them with IMS core.
The Mobicents Telecom Application Server
Mobicents SIP Servlet and Java APIs for Integrated Networks-Service Logic
Execution Environment (JAIN-SLEE) are open platforms to deploy new call
controller logic and other converged applications. The steps to install Mobicents
TAS are as follows:
1. Download the SIP Application Server logic package from
2. Unzip the contents. Make sure that the Java environment variables are in place.
3. Start the JBoss container from mobicents\jboss-5.1.0.GA\bin
In case of MS Windows, run run.bat, and for Linux, run run.sh.
The following figure displays the traces on the console when the server is
started on JBoss:
4. The Mobicents application can also be developed by installing the
Tomcat/Mobicents plugin in Eclipse IDE. The server can also be added
for Mobicents instance, enabling quick deployment of applications.
5. Open the web console to review the settings. The following screenshot
displays the process:
Mobicents SLEE Management console home screen in a web browser
6. In order to deploy resource adapters, enter:
ant -f resources/<name of resource adapter>/build.xml deploy
7. To undeploy the resource adapters, execute ant undeploy with the name of
the resource adapter:
ant -f resources/<name of resource adapter>/build.xml undeploy
Make sure that you have Apache Ant 1.7. The deployed instances should
be visible in a web console as follows:
Services deployed on Mobicents Telecom Application Server
8. To deploy and run SIP Servlet applications, use the following command line:
ant -f examples/<name of application directory>/build.xml deploy-
Resources hosted on Mobicents Telecom Application Server
9. Congure CSCF to include the Application Server in the path of every
incoming SIP request and response.
With the introduction of TAS, it is now possible to provide customized call control
logic to all subscribers or particular subscribers. The SIP solution and services can
range from simple activities, such as call screening and call rerouting, to a complex
call-handling application, such as selective call screening based on the user's
calendar. Some more examples of SIP applications are given as follows:
Speed Dial: This application lets the user make a call using pre-programmed
numbers that map to actual SIP URIs of users.
Click to Dial: This application makes a call using a web-based GUI.
However, it is very different from WebRTC, as it makes/receives the call
through an external SIP phone.
Find Me Follow Me: This application is beneficial if the user is registered
on multiple devices simultaneously, for example, SIP phone, X-Lite, and
WebRTC. In such a case, when there is an incoming call, each of the user's
devices rings for a few seconds in order of their recent use so that the user
can pick up the call from the device that is nearest to them.
These services are often referred to as VAS, which can be innovative and can take the
user experience to new heights.
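The ordering logic behind a Find Me Follow Me service can be sketched in a few lines of Python. This is only an illustration of the idea (ring each registered device in turn, most recently used first); it is not Mobicents or SIP Servlet code, and the Device type, the ring_plan function, the contact URIs, and the five-second ring time are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Device:
    contact: str      # SIP contact URI of a registered device
    last_used: float  # time of most recent use, in seconds

def ring_plan(devices, ring_seconds=5):
    """Return (contact, start_offset, duration) tuples, most recently used first."""
    ordered = sorted(devices, key=lambda d: d.last_used, reverse=True)
    # Each device starts ringing when the previous one stops.
    return [(d.contact, index * ring_seconds, ring_seconds)
            for index, d in enumerate(ordered)]

devices = [
    Device("sip:alice@deskphone.example", 100.0),
    Device("sip:alice@browser.example", 300.0),
    Device("sip:alice@softphone.example", 200.0),
]
plan = ring_plan(devices)
```

With these sample values, the browser client rings first, followed by the softphone and then the desk phone. A real service would drive this plan from the registrar's contact list and cancel the remaining legs once one device answers.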
The Media Server
The Media Server plays a critical role in enabling features such as Interactive
Voice Response (IVR), voicemail recording, and announcement playback. The
Media Server can be used as a standalone entity in the WebRTC infrastructure or
it can be referenced from the SIP server in the IMS environment.
The FreeSWITCH Media Server
FreeSWITCH has powerful Media Server capabilities, including those for functions
such as IVR, conferencing, and voicemail. We will first see how to use FreeSWITCH
as a standalone entity that provides SIP and RTP proxy features.
Let's configure and install a basic setup of the FreeSWITCH Media Server using
the following steps:
1. Download and store the source code for compilation in the /usr/src folder,
and run the following command lines:
cd /usr/src
git clone -b v1.4 https://stash.freeswitch.org/scm/fs/freeswitch.
2. A directory named freeswitch is made using the following command line
and binaries will be stored in this folder. Assign all permissions to it.
sudo chown -R <username> /usr/local/freeswitch
Replace <username> with the name of the user who has the ownership
of the folder.
3. Go to the directory where the source is stored using the following command line:
cd /usr/src/freeswitch
4. Then, run bootstrap using the following command line:
5. One can add additional modules by editing the configuration file using
the vi editor. We can open our file using the following command line:
vi modules.conf
The names of the modules are already listed. Remove the # symbol before
the name to include the module at runtime, and add # to skip the module.
Then, run the configure command:
./configure --enable-core-pgsql-support
6. Use the make command and install the components:
make && make install
7. Go to the Soa prole and uncomment the parameters dened for
WebSocket binding. By doing so, the WebRTC clients can register with
FreeSWITCH on port 443.
Soa is an SIP stack used by FreeSWITCH. By default, it supports only pure
SIP requests. To get WebRTC clients, register with FreeSWITCH's SIP Server.
<!-- uncomment for SIP over WebSocket support -->
<!-- <param name="ws-binding" value=":443"/>
8. Install the sound les using the following command line:
make all cd-sounds-install cd-moh-install
9. Go to the installation directory, and in the vars.xml file under freeswitch/
conf/ make sure that the codec preferences are set as follows:
<X-PRE-PROCESS cmd="set" data="global_codec_
<X-PRE-PROCESS cmd="set" data="outbound_codec_
10. Make sure that the SIP prole is directly using the codec values as follows:
<param name="inbound-codec-prefs" value="$${global_codec_prefs}"/>
<param name="outbound-codec-prefs" value="$${global_codec_
We can later add more codecs such as vp8 for video calling/conferencing.
11. To start FreeSWITCH, go to the /freeswitch/bin installation directory and
run FreeSWITCH.
12. Run the command-line console that will be used to control and monitor the
passing SIP packets by going to the /freeswitch/bin installation directory
and executing fs_cli.
The following is the screenshot of the FreeSWITCH client console:
13. Go to the /freeswitch/conf/SIP_profile installation directory and look
for the existing configuration files.
14. Load and start the SIP profile using the following command line:
sofia profile <name of profile> start load
15. Restart and reload the prole in case of changes using the following
command line:
sofia profile <name of profile>restart reload
16. Check its working by executing the following command line:
Sofia status
17. We can check the status of the individual SIP prole by executing the
following command line:
sofia status profile <name of profile> reg
The preceding gure depicts the status of the users registered with the server at one
point of time.
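Before restarting the profile, it can be handy to confirm that the ws-binding parameter from step 7 is really active and not still inside an XML comment. The following Python sketch does this with the standard library XML parser, which skips comments by default. The function name and the two sample snippets are illustrative assumptions; a real Sofia profile file contains many more elements.

```python
import xml.etree.ElementTree as ET

def ws_binding(profile_xml):
    """Return the active ws-binding value, or None if it is absent or commented out."""
    root = ET.fromstring(profile_xml)
    # Commented-out elements are not part of the parsed tree, so only
    # uncommented <param> entries are visited here.
    for param in root.iter("param"):
        if param.get("name") == "ws-binding":
            return param.get("value")
    return None

# Hypothetical, heavily trimmed profile snippets.
commented = '<profile><settings><!-- <param name="ws-binding" value=":443"/> --></settings></profile>'
active = '<profile><settings><param name="ws-binding" value=":443"/></settings></profile>'
```

Here ws_binding(commented) finds nothing, while ws_binding(active) returns the configured binding, which is the state required for WebRTC clients to register.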
Media Services
The following steps outline the process of using the FreeSWITCH media services:
1. Register the SIP softphone and WebRTC client using FreeSWITCH.
2. Use sample values between 1000 and 1020 initially. Later, we can
configure more users as specified in the /freeswitch/conf/directory
folder of the installation directory.
3. The following are the sample values to register Kapanga:
° Username: 1002
° Display name: any
° Domain/Realm:
° Outbound proxy:
° Authorization user: 1002
° Password: 1234
4. The sample values for WebRTC client registration, if, for example, we decide
to use the sipML5 WebRTC client, will be as follows:
°Display name: any
°Private identity: 1001
°Public identity: SIP:1001@
°Password: 1234
°WebSocket Server URL: ws://
Note that the values used here are arbitrary for the purpose of
IP denotes the public IP of the FreeSWITCH machine and the port is the
WebSocket port configured in the Sofia profile. As seen in the following
screenshot, it is required that we tick the Enable RTCWeb Breaker option
in Expert settings to compensate for the incompatibility between the
WebSocket and SIP standards that might arise:
5. Make a call between the SIP softphone and WebRTC client. In this case,
the signal and media are passing through FreeSWITCH as proxy.
[Figure: User A and User B (each a SIP phone or WebRTC client) with signaling and media passing through FreeSWITCH]
A call from a WebRTC client is depicted in the following screenshot. Its SIP
messages pass through the FreeSWITCH server and are
therefore visible in the FreeSWITCH client console. In this case, the server is
operating in the default mode; other modes are bypass and proxy modes.
6. Make a call between two WebRTC clients, where SIP and RTP are passing
through FreeSWITCH as proxy.
We can use other services of FreeSWITCH as well, such as voicemail, IVR,
and conferencing. We will cover them in later chapters of the book.
We can also congure this setup in such a way that media passes through the
FreeSWITCH Media Server, and the SIP signaling is via the Telecom Kamailio
SIP server.
Use the RTP proxy in the SIP proxy server, in our case, Kamailio, to pass the RTP
media through the Media Server. The RTP proxy module of Kamailio should be
built in a format and congured in the kamailio.cfg le. The RTP proxy forces
the RTP to pass through a node as specied in the settings parameters. It makes the
communication between SIP user agents behind NAT and will also be used to set up
a relaying host for RTP streams. Congure the RTP Engine as the media proxy agent
for RTP. It will be used to force the WebRTC media through it and not in the old
peer-to-peer fashion in which WebRTC is designed to operate. Perform the following
steps to congure the RTP Engine:
1. Go to the Kamailio installation directory and then to the RTPProxy module.
Run the make command and install the proxy engine:
cd rtpproxy
./configure && make
2. Load the module and parameters in the kamailio.cfg file:
loadmodule "rtpproxy.so"
modparam("rtpproxy", "rtpproxy_sock",
3. Add rtpproxy_manage() for all of the requests and responses in the
kamailio.cfg file. The example of rtpproxy_manage for INVITE is:
if (is_method("INVITE")) {
4. Get the source code for the RTP Engine using git as follows:
5. Go to the daemon folder in the installation directory and run the make
command as follows:
sudo make
6. Start rtpengine in the default user space mode on the local machine:
sudo ./rtpengine --ip= --listen-ng=12334
7. Check the status of rtpengine, which is running, using the
following command:
ps -ef | grep rtpengine
Note that rtpengine must be installed on the same
machine as the Kamailio SIP server.
8. In case of the sipml5 client, after configuring the modules described in
the preceding section, the flow for the media of a call made through the
Media Server will be one of the following:
° In case of Voicemail/IVR, the flow is as follows:
WebRTC client to RTP proxy node to Media Server
° In case of a call through media relay, the flow is as follows:
WebRTC client A to RTP proxy node to Media Server to RTP proxy node
to WebRTC client B
The following diagram shows the MediaProxy relay between WebRTC clients:
[Figure: MediaProxy relay. Labels: media proxy dispatcher; Media Proxy; JsSIP client (two instances); Media]
The strength of a media server lies in its ability to transcode media between codecs.
Different phones, call clients, and software that support SIP as the signaling protocol
do not necessarily support the same media codecs. In the situation where the Media
Server is absent and the codecs do not match between a caller and receiver, the
attempt to make a call is abruptly terminated when the media exchange needs to
take place, that is, after the INVITE, the success response, and the acknowledgement are sent.
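The codec mismatch described above can be illustrated with a minimal sketch. It assumes each endpoint simply advertises a list of codec names; the function names and the sample codec lists are hypothetical and are not part of FreeSWITCH or Kamailio. When the two lists share no codec, a Media Server has to transcode for the call to carry media:

```javascript
// Hypothetical helper: codecs supported by both endpoints,
// kept in the caller's order of preference.
function commonCodecs(callerCodecs, calleeCodecs) {
  const calleeSet = new Set(calleeCodecs.map((c) => c.toUpperCase()));
  return callerCodecs.filter((c) => calleeSet.has(c.toUpperCase()));
}

// A call without any shared codec needs a Media Server to transcode.
function needsTranscoding(callerCodecs, calleeCodecs) {
  return commonCodecs(callerCodecs, calleeCodecs).length === 0;
}

// Illustrative codec lists only; real endpoints advertise these in SDP.
const webrtcClient = ['opus', 'G722', 'PCMU'];
const legacyPhone = ['G729', 'GSM'];
const softphone = ['PCMU', 'PCMA'];

console.log(needsTranscoding(webrtcClient, legacyPhone));
console.log(commonCodecs(webrtcClient, softphone));
```

In this sketch, the WebRTC client and the softphone can talk directly over PCMU, while a call to the legacy phone would only work with transcoding in the media path.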
In the following gure, the setup to traverse media through the FreeSWITCH Media
Server and signaling through the Kamailio SIP server is depicted:
Media Path and Signal path from
WebRTC to other end-points
WebRTC client SIP softphone
SIP gateway SIP gateway
phone PSTN phone
Media Server
Kamailio SIP Server
(supported protocols legacy SIP
and SIP WS)
signal (SIP)
media (RTP)
The role of the rtpproxyng engine is to enable media to pass via Media Server;
this is shown in the following diagram:
[Figure: media path and signal path from WebRTC to other endpoints. Labels: WebRTC client; SIP softphone; SIP gateway; phone; PSTN phone; Media Server; Kamailio SIP Server (supported protocols: legacy SIP and SIP WS)]
WebRTC over rewalls and proxies
There are many complicated issues involved with the correct working of WebRTC
across domains, NATS, geographies, and so on. It is important for now that the
rewall of a system, or any kind of port-blocking policy, should be turned off to be
able to make a successful audio-video WebRTC call across any two parties that are
not on the same Local Area Network (LAN).
For the user to not have to switch the rewall off, we need to congure the
Simple Traversal of UDP through NAT (STUN) server or modify the Interactive
Connectivity Establishment (ICE) parameter in the SDP exchanged. STUN helps
in packet routing of devices behind a NAT rewall. STUN only helps in device
discoverability by assigning publicly accessible addresses to devices within a
private local network.
Traversal Using Relays around NAT (TURN) servers also serve to accomplish the task of
interconnecting the endpoints behind NAT. As the name suggests, TURN forces
media to be proxied through the server.
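To make the difference between STUN and TURN concrete, the following is a minimal sketch that reads the candidate type out of an ICE candidate line. The `typ` values (`host`, `srflx`, `prflx`, and `relay`) come from the ICE specification; the function names and the sample candidate lines, which use documentation-only IP addresses, are hypothetical:

```javascript
// Extract the candidate type from an ICE candidate line, or null if absent.
function candidateType(candidateLine) {
  const match = /\btyp\s+(host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? match[1] : null;
}

// Explain, in words, where each kind of candidate address comes from.
function describeCandidate(candidateLine) {
  switch (candidateType(candidateLine)) {
    case 'host':
      return 'local address of the device';
    case 'srflx':
      return 'public address learned through STUN';
    case 'prflx':
      return 'address discovered during connectivity checks';
    case 'relay':
      return 'address allocated on a TURN relay';
    default:
      return 'unknown candidate type';
  }
}

// Sample lines for illustration only.
const hostLine = 'candidate:1 1 udp 2122260223 192.168.1.2 46154 typ host';
const stunLine = 'candidate:2 1 udp 1686052607 203.0.113.5 46154 typ srflx raddr 192.168.1.2 rport 46154';
const turnLine = 'candidate:3 1 udp 41885439 198.51.100.7 50000 typ relay raddr 203.0.113.5 rport 46154';

[hostLine, stunLine, turnLine].forEach((line) => console.log(describeCandidate(line)));
```

A call that only succeeds with `relay` candidates is one where media is being proxied through the TURN server.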
To learn more about ICE as a NAT-traversal mechanism, refer to
the official document named RFC 5245.
The ICE features are defined by sipML5 in the sipml.js file. They are added to the SIP SDP
during the initial phase of setting up the SIP stack. Snippets from the sipml.js file
regarding ICE declaration are given as follows:
var configuration = {
websocket_proxy_url: 'ws://',
outbound_proxy_url: 'udp://',
ice_servers: [{ url: 'stun:stun.l.google.com:19302'}, {
url:'turn:user@numb.viagenie.ca', credential:'myPassword'}],
Under the postInit function in the call.htm page add the following function:
oConfigCall = {
events_listener: { events: '*', listener: onSipEventSession },
sip_caps: [
{ name: '+g.oma.sip-im' },
{ name: '+sip.ice' },
{ name: 'language', value: '\"en,fr\"' }
Therefore, the WebRTC client is able to reach the client behind the firewall itself;
however, the media might behave unpredictably.
If you need to create your own STUN/TURN server, you can refer to RFC
5766, or to open source implementations, such as the project at the
following site:
When setting the parameters for WebRTC, we can add our own STUN/TURN
server. The following screenshot shows the inputs suitable for ICE Servers if
you are using your own TURN/STUN server:
If there are no rewall restrictions, for example, if the users are on the same network
without any corporate proxies and port blocks, we can omit the ICE by entering
empty brackets, [], in the ICE Servers option on the Expert settings page in the
WebRTC client.
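As a minimal sketch of how such an ICE Servers value could be assembled, the following helper mirrors the `url` and `credential` shape used in the sipml.js snippet earlier in this section. The function name, option names, hostnames, and credentials are hypothetical placeholders; an empty options object yields the empty list, [], for networks without firewall restrictions:

```javascript
// Hypothetical helper: build an ICE servers list in the url/credential
// shape shown in the sipml.js snippet. All option names are assumptions.
function buildIceServers(options) {
  const servers = [];
  const opts = options || {};
  if (opts.stunHost) {
    servers.push({ url: 'stun:' + opts.stunHost });
  }
  if (opts.turnHost) {
    servers.push({
      url: 'turn:' + opts.turnUser + '@' + opts.turnHost,
      credential: opts.turnPassword,
    });
  }
  return servers;
}

// Same LAN, no port blocking: no ICE servers needed.
const lanOnly = buildIceServers({});

// Own STUN/TURN server (placeholder hostnames and credentials).
const ownServers = buildIceServers({
  stunHost: 'stun.example.org:3478',
  turnHost: 'turn.example.org',
  turnUser: 'user',
  turnPassword: 'myPassword',
});

console.log(JSON.stringify(lanOnly));
console.log(JSON.stringify(ownServers));
```

The serialized result can be pasted into the ICE Servers field or assigned to `ice_servers` in the configuration object.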
The nal architecture for the
WebRTC-to-IMS integration
At the end of this chapter, we have arrived at an architecture similar to the following
diagram. The diagram depicts a basic WebRTC-to-IMS architecture.
[Figure: basic WebRTC-to-IMS architecture. Labels: Application Layer; Network Control Layer; Transport Layer; Web Client; Media relay]
The diagram depicts the WebRTC client in the Transport Layer as it is the user
endpoint. The IMS entities (CSCF and HSS), WebRTC to IMS gateway, and Media
Server nodes are placed on the Network Control Layer as they help in signal
and media routing. The applications for call control are placed in the top-most
Application Layer that processes the call control logic. This architecture serves to
provide a basic IMS-based setup for SIP-based WebRTC client interaction.
In this chapter, we saw how to interconnect the WebRTC setup with the IMS
infrastructure. It included interaction with CSCF nodes, namely PCSCF, ICSCF,
and SCSCF, after building and installing them from their sources. Also, FreeSWITCH
Media Server was discussed, and the steps to build and integrate it were practiced.
The Application Server to embed call control logic is Kamailio. NAT traversal via
STUN / TURN server was also discussed and its importance was highlighted.
To deploy the WebRTC solution integrated with the IMS network, we must ensure
that all of the required IMS nodes are consulted while making a call, the values are
reflected in the HSS data store, and the incoming SIP request and responses are
routed via call logic of the Application Server before connecting a call.
In the next chapter, we will see the interaction of the WebRTC client and server logic
with Intelligent Networks (IN). The process of establishing communication between
the WebRTC client from the web browser and mobile handset will be discussed
using the GSM and GPRS technologies.
WebRTC Integration with
Intelligent Network
In the previous chapters, we saw the WebRTC client and server in a standalone
environment. We also studied the WebRTC client integration with IP Multimedia
Subsystem (IMS) Core and Media Server. In this chapter, we will discuss the
WebRTC client's interaction with mobile handsets by utilizing the telecom service
provider's GSM-based network, which is also known as Intelligent Network (IN).
The chapter has been constructed from two main viewpoints: making a call to
the WebRTC client through a mobile phone via the IMS network and applications,
and making calls between the WebRTC client and a mobile phone via the service
logic of IN.
There are three ways one can make a call to a WebRTC client through a
mobile phone:
Using the mobile data packet, GPRS, to access the WebRTC client's web
page in WebRTC-enabled mobile browsers (web view)
Using the circuit-switched voice network, GSM, to call an SIP-based
WebRTC client
Using an Android-native SIP app to call a WebRTC client (this will be covered
in Chapter 9, Native SIP Application and Interaction with WebRTC Clients)
We will be covering all of the preceding approaches in this chapter. We will look
into every possible way to enable mobile phones to communicate with WebRTC
endpoints. We will also touch on the process of sending SIP messages to GSM
phones in the form of Short Message Service (SMS).
The process of integrating the IN service logic to IMS/SIP and further to WebRTC
endpoints is also discussed in detail in this chapter. There are two ways in which an
IN application can be used by a WebRTC SIP client; they are as follows:
Use of Reverse IMSSF to use IN service logic in IMS
Use service broker for the orchestration of applications from the WebRTC SIP
IMS and GSM IN worlds
We will begin the discussion on General Packet Radio Services (GPRS) usage to run
WebRTC web pages in a mobile browser.
From mobiles to WebRTC client through GPRS
In this section, we will discuss the use of mobile data packets, GPRS, to access
the WebRTC client in WebRTC-enabled mobile browsers. Through generations
of telecom evolution, the connectivity to IP network has undergone a significant
change. The first generation services, which comprised fixed line phones such as
Public Switched Telephone Network (PSTN) / Integrated Services Digital
Network (ISDN), had no connectivity to the packet-switched world. However, as
the second generation of telecom arrived, there emerged GSM (2G) and GPRS (2.5G),
which enabled a web phone to access the Internet through data packets. The speed
and performance of IP connectivity accelerated with the introduction of 3G and 4G,
which enable high-speed multimedia sharing and real-time streaming.
The GPRS support nodes are responsible for transmitting IP packets to GSM or
Universal Mobile Telecommunications System (UMTS) devices. GPRS services
are mainly provided through GPRS Support Node (GSN). GSN also has two parts,
Gateway GSN and Serving GSN, described as follows:
Gateway GPRS Support Node (GGSN) manages the interworking between
packets from the Radio Access Network (RAN) to the external IP world such
as the Internet
Serving GPRS Support Node (SGSN) is responsible for mobility, routing,
authentication, and so on for the GPRS core
The following diagram shows the structure of the core GPRS Support Nodes (GSNs):
[Figure: IP connectivity via GSN (GPRS Support Nodes). Labels: Internet; Packet-switched domain; Mobile Phone; Node B]
Use the GPRS functionality in GSM phones to access the WebRTC client through the
mobile web browser. The process of making calls through the browser to another
browser or SIP endpoint is already defined in Chapter 2, Making a Standalone WebRTC
Communication Client.
Only the phones that support WebRTC-enabled browsers
will be able to support it, for example, the mobile browsers
of Chrome and Mozilla.
The following is a diagrammatic representation of a WebRTC call that runs in the
mobile browser of GSM phones through the GPRS connectivity:
[Figure: IP connectivity to WebRTC server in IN. Labels: Packet-switched domain; Mobile Phone with WebRTC browser; Node B; Computer system with WebRTC browser; WebRTC signaling; Web Application; IP world]
In this case, there are no alterations required in the telecom operator's landscape,
except for the introduction of the web application server to host the WebRTC
application. However, the operator can only charge for the data packets consumed
by the end user in terms of data upload and download. This will be far less than the
conventional voice call or video call rate.
IMS connectivity to Gateway GPRS
Support Node
The WebRTC platform resides in the IP world of SIP and IMS. A mobile phone can
connect to the IP world via data packets, that is, GPRS. The following figure gives
an architectural representation of interconnecting a WebRTC IMS platform with a
mobile phone through GGSN:
[Figure: WebRTC IMS platform interconnected with a mobile phone through GGSN. Labels: Mobile Station; Node B; IMS Network Core; IN Network Core; WebRTC client; WebRTC IMS Gateway; Dx; Cx; Packet-switched domain]
HSS in the IMS core network holds the profile of subscribers. CSCFs are the IMS
entities responsible for call control. Overall, the preceding diagram gives a clear
picture of GSN integrated with IMS to render GPRS packets to GSM mobile phones.
It also mentions 3G, which uses Node B as the access node, and 4G, which uses
Evolved Node B (eNodeB) as the access node.
Node B is the node responsible for connecting the mobile phones with
the network in UMTS, which is the third generation of telecom (3G).
It is controlled by a Radio Network Controller (RNC) in RAN.
eNodeB is the node responsible for connecting the mobile phones
with the network in Long Term Evolution (LTE), which is the fourth
generation of telecom (4G).
Similar to BTS, in second-generation telecom networks, Node B and
eNodeB have frequency transceivers, that is, transmitter and receivers,
to connect the nearby mobile devices with a network.
The packet-switched domain provides the IP bearer with access to the IMS
through the Packet Data Protocol (PDP) context. The phases of mobile access
to the packet-switched network of IMS are as follows:
In the rst phase, the mobile registers with the packet-switched domain
via GPRS Attach
In the second phase, the mobile activates the PDP context and establishes
Radio Access Bearer (RAB)
The third phase consists of registering successfully with IMS and using
the services
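The three phases listed above can be sketched as a small state machine. This is only an illustration of the ordering; the phase names and the `MobileAccess` class are hypothetical and do not correspond to any real GPRS or IMS API:

```javascript
// Ordered phases of mobile access to the packet-switched network of IMS.
const PHASES = ['DETACHED', 'GPRS_ATTACHED', 'PDP_CONTEXT_ACTIVE', 'IMS_REGISTERED'];

class MobileAccess {
  constructor() {
    this.phaseIndex = 0; // a mobile starts detached
  }

  get phase() {
    return PHASES[this.phaseIndex];
  }

  // Move to the next phase; phases cannot be skipped.
  advance() {
    if (this.phaseIndex < PHASES.length - 1) {
      this.phaseIndex += 1;
    }
    return this.phase;
  }

  // IMS services are usable only after IMS registration succeeds.
  canUseImsServices() {
    return this.phase === 'IMS_REGISTERED';
  }
}

const mobile = new MobileAccess();
mobile.advance(); // GPRS Attach
mobile.advance(); // PDP context activation and RAB establishment
const readyBeforeRegistration = mobile.canUseImsServices();
mobile.advance(); // IMS registration
const readyAfterRegistration = mobile.canUseImsServices();
console.log(readyBeforeRegistration, readyAfterRegistration);
```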
The following diagram shows the ow of the phases of mobile access to the
packet-switched network of IMS:
IMS Registration
IMS Service Access
Bearer level Authentication
IP transport setup
IMS rService access
IMS registration and User Authetication
PDP context Activation
GPRS Attach
The preceding gure depicts the sequence ow between a User Equipment (UE),
which is a mobile phone in this case, and a WebRTC client via Serving GPRS Support
Node (SGSN) and Gateway GPRS Support Node (GGSN). As outlined, the rst
step is GPRS Attach and PDP context activation. At this stage, the authentication at
the bearer level is achieved. In the next stage, the mobile phone will connect itself to
CSCF, which is the core of the IMS network. Once this is achieved, the mobile phone
can make calls, which will traverse through the IMS network. The IMS entities such
as Application Server and Media Server will also apply to such signals. The following
screenshot shows the sipML5 WebRTC client web page opened in a mobile browser:
From mobiles to WebRTC client
through GSM
SIP IMS is a good way to implement a unifying technology between the legacy
circuit-switched and the IP-based packet-switched networks. It follows that the
best approach to integrate WebRTC with the IN telecom network is via SIP IMS;
this is illustrated in the following diagram:
[Figure: SIP as unifying technology. Labels: Telecom Applications]
In the previous section, we saw the process of using the mobile packet-switched
network to interact with the WebRTC client. This section describes the process of
using the circuit-switched voice network to call a WebRTC client. To achieve this,
there must be a point of translation between the voice network of the mobile and
the IP network of IMS. This is often referred to as the GSM Gateway.
The difference between Circuit-Switching and Packet-Switching technologies is
described in the following table:
Circuit-Switched (CS) network:
Circuit-Switching is a connection-oriented communication technology. In this case,
a fixed bandwidth is allocated for every communication line, and this remains
open throughout the session. It cannot be used by other data and phone calls.
The biggest advantage of Circuit-Switching is the quality of service, which is due to
a guaranteed full bandwidth for the duration of call.
Circuit-Switching is widely employed in voice communication in the telecom
landscape. From the old PSTN phone to the current 3G phone, all mobile devices use a
circuit-switched network to make calls.
Packet-Switched (PS) network:
In packet-switched communication network, the message gets broken into
small data packets and is sent out to travel to its destination, seeking the
most efficient route. Every packet might go a different route. The packets
are reassembled in the correct order on reception.
The advantages of Packet-Switching are good use of bandwidth and high
availability as it doesn't wait for a direct connection to be available.
The disadvantage of Packet-Switching is that the quality of service might
be poor as there might be a delay in transmission. Also, there is a high risk
of data packets being lost or corrupted.
Packet-Switching is used for data access such as Internet browsing and e-mails.
Due to low reliability, there are still a lot of reservations to adopt packet switching
in voice communication. However, IP communications such as WebRTC make
use of the Packet-Switching protocol to make and receive audio/video calls.
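The packet-switched behavior described above, where a message is broken into packets that may arrive out of order and are reassembled on reception, can be sketched as follows. The packet shape (`seq` and `payload`) and the function names are hypothetical and do not follow any particular protocol:

```javascript
// Break a message into numbered packets of at most `size` characters.
function packetize(message, size) {
  const packets = [];
  for (let offset = 0, seq = 0; offset < message.length; offset += size, seq += 1) {
    packets.push({ seq, payload: message.slice(offset, offset + size) });
  }
  return packets;
}

// Put packets back in sequence order and join their payloads.
function reassemble(packets) {
  return packets
    .slice() // do not mutate the caller's array
    .sort((a, b) => a.seq - b.seq)
    .map((p) => p.payload)
    .join('');
}

const sent = packetize('hello webrtc', 5);
// Simulate packets that took different routes and arrived out of order.
const arrived = [sent[2], sent[0], sent[1]];
console.log(reassemble(arrived));
```

A circuit-switched line has no equivalent of the sequence number, because the dedicated path delivers everything in order.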
There are various ways in which a WebRTC client can interact with the GSM
endpoints that only understand ISUP. If the requirement is just to connect
the endpoints without centralized server-level logic, it is enough
to provide interconnecting gateways that do media and protocol conversion
between the WebRTC formats in the IMS and the GSM formats. The following figure
depicts the components of this interconnecting gateway:
[Figure: interconnecting gateway. Labels: PSTN gateway; WebRTC client; IMS network; SIP WS to SIP gateway]
IMS provides interoperability with circuit-switched networks. This can be achieved
in the following way:
1. Signal Gateway (SGW) transforms ISDN User Part (ISUP) to SIP, and
Media Gateway (MGW) transforms data from CS to the IP–based data.
2. These are united and connected to the Media Gateway Control Function
(MGCF), which provides a protocol conversion between SIP and
ISUP / Bearer Independent Call Control (BICC). It also controls
resources in the media gateway. Bearer Independent Call Control (BICC) is
a signaling protocol based on ISUP that is used for supporting narrowband
Integrated Services Digital Network (ISDN) service.
3. This in turn is connected to the Breakout Gateway Control Function
(BGCF), which is an SIP proxy responsible for processing requests
that are addressed to telephone numbers and cannot be routed using DNS/ENUM.
4. Finally, it's connected to Serving Call Session Control Function
(SCSCF), which is the node responsible for central session
control, activation/cancellation of bearer service, and so on.
Thus, it's said that IMS is a multi-access architecture and holds the door to seamless
interconnectivity between current phones and futuristic SIP/WebRTC phones.
A call ow between a mobile phone and the WebRTC endpoint using the voice
network of the mobile operator, as well as the IMS network of the service provider,
is depicted in the following gure:
IMS network IN network
GSM phone GSM phone
The SIP request is initiated from the WebRTC client in the SIP-over-WebSocket
(SIP-WS) format. A WebRTC-to-SIP gateway is required to convert this request from
the SIP-WS format to the SIP format. Thereafter, the request is sent to the MGCF/MGW
node, which is the ISUP to SIP interworking node described
in the previous section. Here, the SIP requests are converted to ISUP requests via SGW,
and then proceed towards the ISUP switch, which interconnects to mobile phones. A
request generated from the mobile phone towards the WebRTC client flows in the
exact opposite direction. The process is repeated for every other request and response.
The media flow also traverses the MGCF/MGW node, which is responsible for
codec conversion between WebRTC-supported formats and traditional media formats.
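The signaling conversion performed at the interworking node can be pictured as a lookup in both directions. The following sketch uses a deliberately simplified, assumed mapping between four SIP messages and their usual ISUP counterparts (IAM, ACM, ANM, and REL); a real gateway handles many more messages, parameters, and cause codes, and the `translate` function is hypothetical:

```javascript
// Simplified, assumed SIP-to-ISUP message mapping for illustration only.
const SIP_TO_ISUP = new Map([
  ['INVITE', 'IAM'],      // Initial Address Message
  ['180 Ringing', 'ACM'], // Address Complete Message
  ['200 OK', 'ANM'],      // Answer Message
  ['BYE', 'REL'],         // Release
]);

// The reverse table is derived by swapping each pair.
const ISUP_TO_SIP = new Map(Array.from(SIP_TO_ISUP, ([sip, isup]) => [isup, sip]));

// Translate one message; direction is 'sip-to-isup' or 'isup-to-sip'.
function translate(message, direction) {
  const table = direction === 'sip-to-isup' ? SIP_TO_ISUP : ISUP_TO_SIP;
  if (!table.has(message)) {
    throw new Error('No mapping for ' + message);
  }
  return table.get(message);
}

// WebRTC client calls a GSM phone: the INVITE becomes an IAM.
console.log(translate('INVITE', 'sip-to-isup'));
// GSM phone answers: the ANM becomes a 200 OK towards the WebRTC client.
console.log(translate('ANM', 'isup-to-sip'));
```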
Call processed with the IN service logic
This part deals with WebRTC client communication through call control logic in
Service Control Point (INSCP) of an IN. The use of this setup is that the operator
does not need to introduce services such as call screening, forwarding, or VPN
separately for WebRTC clients, as they can utilize the existing logic. If, however,
there is a requirement for introducing logic in the form of an application program
that resides on the telecom core, then there arise the following two cases:
Logic resides in the IMS Network Application Server (seen in the
previous chapter)
Logic resides on the IN network in Service Control Point (SCP)
As we have embedded SIP as our signaling protocol in our instance of the
WebRTC client, it is mandatory to either convert from SIP to the GSM protocol
or provide the IMS core for any further processing of the call. Once a steadfast
WebRTC-to-IMS system is set up and working, it is not a tough job to establish
backward compatibility with GSM-based handsets using the IP Multimedia
Service Switching Function (IM SSF) and reverse IM SSF.
What is IM SSF? It is a gateway through which operators can
transparently provide IMS users the access to existing IN services
using INAP and CAMEL signaling protocols. In essence, it connects
the IMS network to IN services.
The following gure depicts the role of IM SSF in interconnecting the call ow
from IMS network to IN network's SCP for logic processing. After executing the
call control logic programmed within the SCP, the call is routed back to IMS
network nodes.
Initial address message
ISUP Switch
180 ringing
audio (ringtone)
200 OK
audio (RTP)
200 OK
180 ringing
200 OK
200 OK
audio (ringtone)
Address complete message (ACM)
Call progress(CPG)
ANM(Answer message)
media media
REL (Release)
complete) release complete
As shown in the preceding gure, the IM SSF node lies in the Application/Service
layer and comes into picture when an SIP user from the IMS network wants to
use an IN service, for example, old corporate Virtual Private Networks (VPNs).
Technically, IM SSF converts between ISC (the protocol used in the IMS network
between the S-CSCF and SIP-AS) and CAP3 (the protocol used in the GSM network
between the MSC and the Service layer).
Let's discuss the rst case where logic resides within the IMS network's
Application Server.
The WebRTC client's communication with the
GSM phone through IMS
In this section, we will observe the call between a mobile phone and a WebRTC
client; this call is processed by the application program hosted in IMS Telecom
Application Server (TAS). New age services such as Find-Me-Follow-Me, RingBack
Tone Advertisement, and many innovative services can be mounted on the
Application Server.
In the Find-Me-Follow-Me Service, when the user receives an
incoming call, their subscribed call agents such as phones, WebRTC
clients, and desktop phones ring sequentially until any one of them
is answered.
In the RingBack Tone Advertisement services, while the call is in
the ringing mode, that is, until the receiver picks up
the phone, the caller hears the music played for them instead
of the ringing tone.
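The Find-Me-Follow-Me behavior described above amounts to ringing a subscriber's call agents one after another until one answers. The following is a minimal sketch of that loop; the function name, the agent names, and the `tryAgent` callback (which stands in for ringing a device and waiting for an answer or a timeout) are all hypothetical:

```javascript
// Ring each subscribed agent in order; return the first one that answers,
// or null when nobody picks up.
function findMeFollowMe(agents, tryAgent) {
  for (const agent of agents) {
    if (tryAgent(agent)) {
      return agent;
    }
  }
  return null;
}

// Illustrative run: only the WebRTC client answers.
const ringOrder = [];
const answeredBy = findMeFollowMe(
  ['mobile phone', 'WebRTC client', 'desktop phone'],
  (agent) => {
    ringOrder.push(agent); // record which devices were rung
    return agent === 'WebRTC client';
  }
);
console.log(answeredBy, ringOrder);
```

In an Application Server, the same ordering would typically be driven by SIP responses and timers rather than a synchronous callback.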
The following diagram shows the WebRTC-to-GSM phone using the IMS
Application server:
[Figure: WebRTC-to-GSM phone call using the IMS Application Server. Labels: SIP WS to SIP gateway; WebRTC Client; IMS network; GSM phone; Application server]
This instance depicts a setup where the call is routed from the CS domain to the PS
domain, but it is set up utilizing the call control logic and services installed in the
IMS application server.
The major four kinds of SIP programming for the Application Layer are as follows:
SIP Servlets: These are Java extension APIs for SIP servers (SIP Servlet
Request / Sip Servlet Response) and are similar to the HTTP Servlets (HTTP
Servlet Request / HTTP Servlet Response)
JAIN SIP: This is also a Java-based API for SIP signaling, however, it's more
generic and low level
SIP CGI: The Common Gateway Interface (CGI) for SIP is very similar to
the CGI used for HTTP
CPL: This is the Call Processing Language, which is an XML-based
script for call control logic and needs to be validated and parsed by a
CPL interpreter
Using high-end service orchestration tools, it is also possible to create various
combinations and permutations of the services instead of putting fresh time and
effort to create new ones from scratch.
This is the case where logic resides in the SCP of the IN, and the WebRTC client
wishes to use this while placing a call to any endpoint, be it SIP WebRTC or GSM.
The WebRTC client's communication with
a GSM phone with IN services
In the previous section, we observed the scenario where the service logic for call
processing is derived from the application hosted on the IMS TAS. In this section,
we will observe the case where the call control logic is embedded with the SCP in
IN. The call originates from either a WebRTC client or a mobile phone and passes
on to the IMS network for routing. The SCSCF node of IMS, which is responsible for
negotiation with Application/Service Layer, sends across the signal to IN SCP via IP
Multimedia Service Switching Function (IMSSF). The following diagram shows the
WebRTC-to-GSM phone call control logic fetched from the SCP in IN via IM SSF:
[Figure: WebRTC-to-GSM phone call with logic fetched from the SCP. Labels: SIP WS to SIP gateway; WebRTC Client; IMS network; IN network; GSM phone]
This instance of WebRTC to GSM phone connectivity signifies that while the call
traverses from the PS to the CS network, the application logic is also used from the
SCP of the IN by means of IM SSF. This scenario is useful for making use of
the existing IN service logic while routing calls, whether the call arrives from a GSM
phone, SIP phone, or WebRTC client.
There could also be a third instance where both the networks, IN and IMS,
intercommunicate with each other. To enable this, one needs a component called
Service Broker.
A Service Broker is used to interoperate between multiple networks such as IN
and IMS. It understands the protocols that span CS and PS calls. It is able
to utilize the IMS services as well as the IN services in a transparent way. The
end user doesn't come to know whether the service hit is an IN service or an IMS
application. The following diagram shows a typical Service Broker, which processes
calls as the Service Orchestrator of IN and IMS applications:
[Figure: Service Broker. Labels: SCP/IN Application Server; SIP AS; Services; Content Server]
Existing service brokers that can achieve this goal include the following:
Oracle's Service Broker
OpenCloud's Service Broker
The service broker for endpoints
and WebRTC in IMS to GSM phone in
Intelligent Networks
The convergence of the Internet and Telecommunication Architectures is a key issue
in today's telecommunication world. In present times, Intelligent Networks are used
by telecom operators for creating and managing VAS in telecom networks for
circuit-switched access networks based on 2G/3G. Originally, IN was applied in telephone
and voice services, but today its role is also growing in the service integration of
mobile and fixed telephone networks and as a gateway to Internet-based networks.
This section deals with communication between the WebRTC setup and Intelligent
Networks, that is, between WebRTC clients and GSM phones through the Service
Broker. Service broker internetworking between GSM and SIP works seamlessly;
this is illustrated in the following diagram:
[Figure: Service Broker between GSM and SIP. Labels: Service Broker; SCP/IN Application Server; SIP AS; SIP WS to SIP gateway; WebRTC client; GSM phone]
So far, we have discussed the interworking between SIP-based WebRTC client's call
services and application logic with a mobile phone that is on the SS7 IN network.
The next section describes the process of linking message services for SIP-based
Instant Message to SMS in IN.
The WebRTC client's SIP messages to
SMS in a GSM phone (SMSC)
A typical SMS service in Intelligent Networks can be achieved in the IMS environment
via Short Message Service Center (SMSC). It sends an SMS message to a GSM
phone and retries if the message is undelivered (usually, it stores the message in a
buffer and retries for a period of 2 days).
As we can extract content out of an SIP message and use the SMS gateway to deliver
the message to a GSM phone, it is thus also practical to extract the WebRTC-based
messages and send them over to a GSM phone as SMS. Another setup would be to
store the message in a database and let Kannel send them out in succession.
The Kannel gateway
The Kannel gateway is a Wireless Application Protocol (WAP) and SMS
(Short Message Service) gateway. It connects the HTTP Web Services to SMS
centers. We will only make use of the SMS functionality.
To congure and install the Kannel gateway, follow the next steps:
1. Download the source code from http://www.kannel.org/download.shtml
2. Congure the downloaded content using the following command line:
The following screenshot shows the Kannel configure script running:
3. Build the Kannel executables using the following command lines:
make bindir=/directory path for installation
4. Dene an SMS box group into the conguration le.
group = sms-service
keyword = complex
get-url = "http://host/service?sender=%p&text=%r"
accept-x-kannel-headers = true
max-messages = 3
concatenation = true
Start kannel
5. This setup requires a physical GSM handset. We must connect the phone
to the machine that runs the Kannel gateway and modify the config file for the
phone specification. The phone must bear a valid SIM. Also, the charge per
message is deducted from the balance of the SIM holder's account.
6. Send the content from the WebRTC application in the HTTP format and pass
it on to the Kannel gateway. The content will be delivered to the phone in the
form of an SMS. For example, consider the following content:
[Figure: text to SMS. Labels: SMSC from GSM network; Text to SMS; Text over]
For the service to look seamless, we can encapsulate the logic in a telecom application
and install it in the Application Server in the form of JAIN-SLEE with HTTP Resource
Adapter (RA) or just use SIP Servlet to pass an HTTP request. This way, when an
SIP message is sent from the WebRTC client/SIP phone, on the server end, we can
extract the message content, append it to the SMS box URL, and send it across to the
Kannel gateway to deliver it as an SMS to the destination phone.
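The extract-and-forward step described above can be sketched as two small functions. The sketch assumes that the Kannel smsbox exposes its sendsms HTTP interface at /cgi-bin/sendsms with the `username`, `password`, `to`, and `text` parameters; the base URL, port, credentials, function names, and the trimmed-down SIP MESSAGE are hypothetical placeholders and should be adapted to the actual Kannel configuration:

```javascript
// Return the body of a SIP message: everything after the first blank line.
function extractBody(sipMessage) {
  const separator = sipMessage.indexOf('\r\n\r\n');
  return separator === -1 ? '' : sipMessage.slice(separator + 4);
}

// Build the HTTP URL that hands the text over to the Kannel gateway.
function buildSendSmsUrl(gateway, sms) {
  const url = new URL('/cgi-bin/sendsms', gateway.baseUrl);
  url.searchParams.set('username', gateway.username);
  url.searchParams.set('password', gateway.password);
  url.searchParams.set('to', sms.to);
  url.searchParams.set('text', sms.text); // URL-encoded automatically
  return url.toString();
}

// A trimmed-down SIP MESSAGE for illustration; real ones carry more headers.
const sipMessage = [
  'MESSAGE sip:988768238@example.com SIP/2.0',
  'Content-Type: text/plain',
  '',
  'Hello from WebRTC',
].join('\r\n');

const smsUrl = buildSendSmsUrl(
  { baseUrl: 'http://localhost:13013', username: 'tester', password: 'secret' },
  { to: '988768238', text: extractBody(sipMessage) }
);
console.log(smsUrl);
```

The resulting URL would then be requested over HTTP by the SIP Servlet or the HTTP Resource Adapter.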
[Figure: SIP message from the WebRTC application/SIP phone to SMS in a GSM phone. Labels: SIP Message over SIP WS; SIP Message; Extract the Message Body and pass to Kannel gateway; SMSC from GSM network]
There are other options to build an SMS gateway; for instance, consider
the following:
The OpenSIPS SMS module also supports sending and receiving SMS
directly from a GSM network.
The Kamailio SMS module also delivers SIP messages as SMS over GSM.
It requires a GSM telephone to act as the modem. The format of the SIP
address header should be as follows:
sip:<number>@domain, for example sip:988768238@tcs.com.
One can also opt for an enhanced SMSC gateway such as Mobicents SMSC
built over Mobicents SS7 and Mobicents JSLEE. They support Short Message
Peer-to-Peer Protocol (SMPP), SIP IM, and legacy SS7 MAP interfaces as well.
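For the sip:<number>@domain address format mentioned for the Kamailio SMS module, the following is a minimal sketch of how an application could pull the destination number out of such an address before handing the text to an SMS gateway. The function name and the accepted pattern (digits with an optional leading plus sign) are assumptions made for illustration:

```javascript
// Parse an address of the form sip:<number>@domain.
// Returns { number, domain }, or null when the user part is not a number.
function smsNumberFromSipUri(uri) {
  const match = /^sip:(\+?\d+)@([^;>\s]+)$/.exec(uri);
  return match ? { number: match[1], domain: match[2] } : null;
}

const destination = smsNumberFromSipUri('sip:988768238@tcs.com');
const notANumber = smsNumberFromSipUri('sip:alice@tcs.com');
console.log(destination, notANumber);
```

Addresses whose user part is not numeric return null here, so they can be routed as ordinary SIP instant messages instead.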
At the end of this chapter, we have arrived at an architecture similar to the one
shown in the following diagram:
[Figure: WebRTC to IN world. Labels: Media relay; Application layer; Service Delivery Platform; Web Client; Network Control Layer; Transport Layer]
The figure shows that the WebRTC client can call a mobile phone in
several ways: either through IDP-to-INVITE conversion or via GPRS nodes.
The application logic of the IN SCP can also be integrated with the SIP application layer
through RIMSSF.
So far, we have discovered various ways of integrating the IN application logic into the
WebRTC call flow (IMSSF), making a call to a GSM client from the WebRTC client
(SGW, MGW), and delivering a SIP message from the WebRTC client to a GSM
phone (SMS gateway).
These implementations are meant to prove the feasibility of the proposal;
they do not necessarily show how a production environment must be built. In order to
construct a stable, scalable, and resilient communication system, one must ensure
that signaling reaches all nodes via their interfaces and that the media
flows smoothly. Interoperability issues usually arise at the borders of these
networks, where protocol conversion and media codec conversion
take place. Placing appropriate gateways there helps prevent these errors and
smooths out the differences.
The next chapter deals with WebRTC interconnectivity with old telephone systems
such as PSTN. This shall be carried out through interoperable hooks provided in the
IMS environment itself.
WebRTC Integration
with PSTN
Voice over Internet Protocol (VoIP) telephony is one of the coolest things
ever invented. It gives us the ability to use the power of the Internet in the context
of communication, such as user discovery; user presence; virtual conferences;
file sharing; notifications based on web feeds such as news updates, parental
control, and IPTV / Video On Demand; extensive call control options; and, most
importantly, the ability to take our calls from any place where there is IP connectivity.
However, what if there is no IP connectivity? What if there is just a fixed cable
connection that supplies analog outputs?
Even today, there are still many such analog connections in the world (called Public
Switched Telephone Network (PSTN) endpoints) that have only a dialer as the
user interface and a handset with an embedded speaker and microphone.
As WebRTC is intended not just as a web-only communication tool but also as a way to
connect to all other devices capable of communication, there will be occasions
when a WebRTC user has to call a PSTN endpoint. In such a scenario, the
approach described in this chapter (from WebRTC to the PSTN via the IMS network)
is the ideal way to achieve this goal.
So far, we have seen how WebRTC users can connect with other WebRTC users,
SIP phone users, and mobile phone users. This chapter will take us through the
detailed course of connecting the WebRTC signal and media to the PSTN signal and
media via IMS. We know that IMS systems have hooks for PSTN terminals through
the PSTN gateway. The first part of the chapter will discuss the direct approach to
connect WebRTC clients to PSTN terminals through the IMS setup. The latter part
of this chapter continues from the previous chapter, where we discussed
WebRTC connectivity to mobile stations in GSM/UMTS access technologies.
The IN setup, which is based on SS7 signaling, not only includes GSM/UMTS access
networks, but also includes legacy networks such as the Integrated Services Digital
Network (ISDN), the PSTN, and the Public Land Mobile Network (PLMN).
Here, we consider everything from the most primitive analog PSTN phones to slightly
more sophisticated digital ISDN phones in order to discuss the complete interconnectivity
between the latest version of WebRTC and legacy telephones. The advanced
tunneling and PBX system can be further derived from this setup.
In this chapter, we will cover the following topics:
The PSTN system
The WebRTC connectivity to the PSTN
Challenges in connecting the WebRTC world to the PSTN landscape
The service logic
What is PSTN?
The PSTN is the interconnection of many wired communication endpoints. The communication
is circuit-switched in nature. Originally, PSTN endpoints were fixed-line analog
telephone systems, also referred to as Plain Old Telephone Systems (POTS).
However, most have since been converted to digital systems such as ISDN, and
some of them are digital towards the core side but remain wired analog
on the last mile, from the exchange center to the user location.
A circuit-switched system has a dedicated path for communication.
It offers a high quality of service and constant delay, as all the
data traverses the same path. On the other hand, packet-switched
systems move data in packets, where each packet is transmitted
independently and may take a different, dynamically decided path. At
the destination, the original message is reassembled from the received
packets in the proper sequence.
Chapter 5
The following diagram shows a typical PSTN setup for home subscribers and
enterprise network:
A Private Branch Exchange (PBX) is a telephone-switching system that comprises
cables and microcontrollers. A modern IP PBX is also capable of switching
between VoIP and the traditional telephone system; however, we are only
considering the traditional PBX and PSTN setup in this chapter.
Just as GSM and UMTS are network access technologies, the PSTN
system is also an access network technology. The existing core
networks, such as IN, and the evolving networks, such as NGN IMS,
have specified gateways and switches to provide interconnectivity
with the legacy communication endpoints.
WebRTC connectivity to the PSTN
For WebRTC connectivity to a PSTN phone, we can adopt one of two approaches:
the first approach is suited to an evolving next-generation IMS landscape, while the
second approach depicts an IN setup. The IN approach is discussed because not all
service providers have completely migrated to the IMS landscape, and because the
IN service flow and call execution still hold good for many phones. The existing INs
have hooks for interconnectivity between cellular phones and old analog phones. This
was established using legacy PSTN gateways that took care of the digital-to-analog
conversion. The ISUP switch provides the conversion to ISUP, which is responsible
for setting up telephone calls in the IN network.
The next-generation network that revolves around the IMS setup also provides
interoperability with the PSTN system. This will be discussed in detail here.
The methodology is adopted from RFC 3398, Integrated Services Digital Network
(ISDN) User Part (ISUP) to Session Initiation Protocol (SIP) Mapping, and the call
flows depicted here are derived from RFC 3666, SIP PSTN Call Flows.
Now that we have the basic translation from SIP-WS to SIP via the WebRTC-to-IMS
gateway, such as WebRTC2SIP / OverSIP / the Kamailio proxy server, we can extend
this setup further and connect to the PSTN phone via the PSTN gateway. The following
diagram shows the WebRTC-to-PSTN connectivity via the PSTN gateway:
[Figure: WebRTC client, SIP WS to SIP gateway, IMS network, and PSTN gateway]
IMS is the standard platform for IP/SIP communication; SIP is
integrated as the signaling protocol in the WebRTC client. For a sound telecom
infrastructure, we have to set up a WebRTC-to-IMS link first before extending the
connection to the PSTN domain (refer to Chapter 3, WebRTC with SIP and IMS).
Once this is in place, we use the PSTN gateway to provide
the necessary signaling and media interoperability.
The PSTN gateway
PSTN gateways are the key players in this setup. They are the entry point to
the PSTN world, translating signaling and media between the IP infrastructure and the
circuit-switched network of the PSTN.
As described in the previous chapter, a gateway primarily consists of three
components:
Signaling Gateway (SGW)
Media Gateway (MGW)
Media Gateway Controller (MGC)
The following diagram shows the parts of the PSTN gateway interconnecting
the IMS and PSTN worlds:
In the preceding diagram, the SGW converts signaling protocols
between the new-age VoIP network and the legacy network, the MGW takes care of
media transcoding between the different codec standards supported on the two
ends, and the MGC controls the call.
The PSTN connectivity to IMS via PSTN
A PSTN/CS gateway interfaces with PSTN circuit-switched (CS) networks.
A Media Gateway Control Function (MGCF) is a SIP endpoint that performs call control
protocol conversion between SIP and ISUP/BICC and interfaces with the SGW over
SCTP. It also controls the resources in a Media Gateway (MGW) across an H.248
interface. Let's take a look at the signaling and media flows from IMS to PSTN separately.
IMS signaling to PSTN signaling: For signaling, CS networks use ISDN
User Part (ISUP) (or BICC) over Message Transfer Part (MTP), while IMS
uses SIP over IP.
A Signaling Gateway (SGW) interfaces with the signaling plane of the CS network.
It converts between lower-layer protocols: Stream Control Transmission
Protocol (SCTP), a transport layer protocol that runs over IP, and Message Transfer
Part (MTP), a Signaling System 7 (SS7) protocol. This is done in order to pass
ISUP messages from the MGCF to the CS network.
IMS media to PSTN media: For media, CS networks use pulse-code
modulation (PCM), while IMS uses Real-time Transport Protocol (RTP).
The codecs typically involved are G.711 and G.729, which can be configured
on the Media Server.
A Media Gateway (MGW) interfaces with the media plane of the CS network
by converting between RTP and PCM. It can also transcode when the codecs
don't match (for example, IMS might use AMR, while the PSTN might use G.711).
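The transcoding decision made by the MGW can be illustrated with a small sketch. The function name and the "first preference wins" rule are assumptions made for illustration; a real gateway derives the codec lists from SDP negotiation and its own configuration.

```python
def plan_media_path(ims_codecs, pstn_codecs):
    """Decide whether the MGW can relay media as-is or must transcode.

    Both arguments are codec names in order of preference (assumed input).
    """
    common = [codec for codec in ims_codecs if codec in pstn_codecs]
    if common:
        # A shared codec exists, so no transcoding is needed.
        return {"ims": common[0], "pstn": common[0], "transcode": False}
    # No shared codec: transcode between each side's first preference.
    return {"ims": ims_codecs[0], "pstn": pstn_codecs[0], "transcode": True}


print(plan_media_path(["AMR"], ["G.711"]))
print(plan_media_path(["G.729", "G.711"], ["G.711"]))
```

In the first call the two sides share no codec, so the sketch reports that transcoding is required; in the second, G.711 is common to both and media can be relayed unchanged.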
The call flow from a WebRTC SIP browser client
to a fixed landline phone
The SIP signals that originate from the WebRTC clients are proxied through the
IMS nodes. After the service logic is executed by the Application Server, the signal
arrives at the PSTN gateway, which is the entry point to the PSTN world. The
interconversion from IP standard protocols and codecs to PSTN-accepted values takes
place here. The modified version is sent across the PSTN network to the addressed
telephone device. The flow among nodes is as follows:
WebRTC browser | WebRTC to SIP gateway | MGC / PSTN Gateway | ISUP
Switch | Fixed Landline phone
The following diagram shows the call flow between the WebRTC and PSTN endpoints:
[Figure: call flow across the MGC / MG, ISUP switch, and fixed line. Messages shown: INVITE (SIP URI with phone number), Initial address message (IAM), Address complete message (ACM), Call Progress (CPG), ringing voltage, audio (ring tone), 180 Ringing, answer (off-hook), ANM (Answer Message), 200 OK, audio (PCM speech / analog speech), hang up (on-hook), REL (Release), RLC (Release complete), and 200 OK]
We have seen the role of the WebRTC-to-SIP gateway in Chapter 2, Making a Standalone
WebRTC Communication Client. In brief, it converts the SIP-over-WebSocket signals to
legacy plain SIP signals that IMS nodes can understand.
The MGC node handles all call signaling (SIP and ISUP), while MG handles media
under the control of MGC. For ease of understanding, they are depicted together.
They can be considered as the components of the PSTN gateway. The ISUP switch
further converts the ISUP signals from MGC into the analog format expected by the
PSTN endpoint. The ISUP switch is responsible for converting the incoming as well
as the outgoing call format from the analog to the digital ISUP format as understood
by the back network.
A step-by-step description of the call flow is given as follows:
1. The WebRTC client initiates a call through a click of a button on the call page.
On doing so, a SIP-over-WebSocket INVITE request message is sent to the
WebRTC gateway.
2. The WebRTC gateway converts it into a true SIP message and forwards it to
the IMS core nodes.
3. The IMS network checks the address in the "To" header of the SIP message.
The request URI in INVITE contains a telephone number. The IMS network
understands that the signal is for a PSTN endpoint and forwards it to the
PSTN gateway (refer to the Address Mapping section).
4. The PSTN Gateway maps INVITE to an SS7 ISUP Initial Address Message
(IAM) along with other essential headers (refer to the Translation from SIP to
ISUP section).
5. The ISUP switch converts the signal from the digital to the
analog format. At this point, a ringing voltage is sent to the fixed-line
phone to make it ring.
6. Until the user answers the call, the PSTN gateway receives call progress
messages from the PSTN network and sends 180 Ringing SIP responses to
the IMS network. These pass through the WebRTC gateway, and the status
is displayed on the WebRTC client's user interface.
7. In case of a successful call answer, that is, when the telephone is picked off
the hook, an answer message (ANM) is generated and sent to the PSTN
gateway from the PSTN network.
8. The PSTN gateway further sends out 200 OK SIP responses for the
WebRTC client.
9. The two-way speech path is established right after that. The Media Server
performs codec conversion between the codecs used on the PSTN side, such as
G.711/G.729, and the codec negotiated by the WebRTC client, such as Opus. WebRTC
clients also support G.711 (PCMA/PCMU), in which case no transcoding is needed. We will
speak only of audio codecs here, as PSTN terminals do not support video
calls unless they are customized to do so.
Note that the media flow is not depicted in the preceding diagram
for the sake of simplicity and ease of understanding.
10. To terminate the call, the user hangs up the PSTN telephone. As they do so,
a release (REL) message is sent to the PSTN gateway. The PSTN gateway
translates the REL ISUP message into a BYE SIP message and forwards it to
the IMS network.
11. The IMS network routes it to the WebRTC gateway, where the SIP message is
converted into a SIP-over-WebSockets message. The BYE is forwarded to the
WebRTC client.
12. The call-ended status is updated on the WebRTC client user interface. The
WebRTC SIP session is ready to make or receive another call.
This way, a call that originates from a WebRTC client can reach a PSTN endpoint,
and both parties can communicate.
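The message translations performed by the PSTN gateway in the preceding steps can be summarized in a small lookup sketch. The table contains only the pairs mentioned in this call flow; the function name and the "sip"/"isup" side labels are illustrative assumptions, and a real gateway handles many more messages and parameters.

```python
from typing import Optional

# Pairs taken from the call flow described above (not a complete mapping).
SIP_TO_ISUP = {"INVITE": "IAM"}
ISUP_TO_SIP = {
    "ACM": "180 Ringing",
    "CPG": "180 Ringing",
    "ANM": "200 OK",
    "REL": "BYE",
}


def translate(side: str, message: str) -> Optional[str]:
    """Return the message the gateway emits on the opposite side, if any."""
    table = SIP_TO_ISUP if side == "sip" else ISUP_TO_SIP
    return table.get(message)


# Replay the call from setup to teardown.
call = [("sip", "INVITE"), ("isup", "CPG"), ("isup", "ANM"), ("isup", "REL")]
for side, message in call:
    print(f"{side.upper():4} {message:6} -> {translate(side, message)}")
```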
The challenges in connecting the
WebRTC world to the PSTN landscape
Problems large and small can arise when interconnecting the
different networks of WebSocket, SIP, and PSTN. The most basic aspects, address
mapping and protocol conversion from SIP to ISUP, are covered here. Other
challenges, such as policy control, billing, and charging, are left to the reader
to implement as they wish.
Address mapping
The PSTN scenario is quite different from VoIP: in the PSTN, the user ID is
numeric, while in VoIP networks, the username is followed by @domain name.
To map addresses between the two worlds, DNS and ENUM servers are
used; this setup is depicted in the following diagram:
[Figure: address mapping between the PSTN (+01 405638...) and the SIP world (INVITE sip:bob@domain.com); a lookup resolves 76354_xx to sip:bob@domain.com]
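As a rough illustration of the ENUM side of this mapping, the sketch below builds the DNS name that an ENUM client would query for a telephone number: the digits are reversed, separated by dots, and suffixed with e164.arpa. The actual NAPTR query that returns the SIP URI is left out, and the sample number is a made-up placeholder.

```python
def enum_domain(number: str, suffix: str = "e164.arpa") -> str:
    """Convert an E.164 number into the domain name used for an ENUM lookup."""
    digits = [ch for ch in number if ch.isdigit()]  # drop '+', spaces, dashes
    if not digits:
        raise ValueError("no digits found in number")
    return ".".join(reversed(digits)) + "." + suffix


print(enum_domain("+1 555 0100"))
```

A DNS resolver would then be asked for the NAPTR records of this name, and the matching record supplies the SIP URI to place in the INVITE.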
Translation from SIP to ISUP
This section describes the various ISUP call status responses and requests. It also
describes the translated SIP messages. We need to have a basic understanding of
ISUP messages before proceeding further. As the first call setup request is INVITE,
we will begin describing the process of forming IAM from the INVITE message
through the gateway.
The call setup
This section describes the mapping of the SIP headers in an INVITE message
to the ISUP parameters in an IAM. An IAM ISUP message bears these five
essential parameters:
Nature of Connection Indicators (NCI)
Forward Call Indicators (FCI)
This can further be classified into the following types:
° End-to-end method indicator
° Interworking indicator
° End-to-end information indicator
° ISDN user part indicator
° ISDN user part preference indicator
° ISDN access indicator
° SCCP method indicator
Calling Party's Category (CPC)
Transmission Medium Requirement (TMR)
Called Party's Number (CPN)
A major task involved in translating the fields of an INVITE message to the
parameters of an IAM is the inspection of the Request-URI and the construction of
the telephony URL. When a SIP INVITE arrives at a PSTN gateway, the gateway
should attempt to make use of the encapsulated ISUP, if any, within the INVITE to assist
in the formulation of outbound PSTN signaling.
If a suitable ISUP message encapsulated within the SIP request body is not found,
or the gateway is not able to decipher it, then the MGC formulates one
on its own.
The MGC gateway sets the Interworking Indicator bit of the FCI to "No
Interworking" and the ISDN User Part Indicator to "ISUP used all the way";
the gateway might also set the Originating Access Indicator to "Originating
access non-ISDN".
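The following sketch illustrates this step under simplifying assumptions: it pulls the called party number out of the Request-URI and fills in the indicator values named above. The dictionary layout and function names are hypothetical and only stand in for a real ISUP encoder; the remaining IAM parameters are omitted, and the sample number is a placeholder.

```python
import re


def called_party_number(request_uri: str) -> str:
    """Extract the digits from a tel: URI or the user part of a sip: URI."""
    match = re.match(r"^(?:tel:|sips?:)([^@;]+)", request_uri)
    if not match:
        raise ValueError(f"unsupported Request-URI: {request_uri}")
    digits = re.sub(r"\D", "", match.group(1))  # strip '+', dashes, and so on
    if not digits:
        raise ValueError("Request-URI does not carry a telephone number")
    return digits


def build_iam(request_uri: str) -> dict:
    """Assemble a simplified, illustrative view of the IAM parameters."""
    return {
        "called_party_number": called_party_number(request_uri),
        "forward_call_indicators": {
            "interworking_indicator": "no interworking",
            "isdn_user_part_indicator": "ISUP used all the way",
        },
        "originating_access_indicator": "originating access non-ISDN",
    }


print(build_iam("sip:+14055550100@domain.com;user=phone"))
```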
The call termination
In case of no response, abrupt termination, or purposeful call termination, the SIP
network is sent an appropriate response, and all the resources in the MG are released.
For example, when no answer is received from the PSTN terminal or an ISUP timer expires,
the SIP response is 504 Server Time-out. The Release (REL) and Release Complete (RLC)
messages are then used by ISUP to free the circuit so that the network is ready for reuse.
The REL message contains a cause value. The SIP response is chosen based on this cause
value. If a cause value other than those listed in the following tables is received, the
default response of "500 Server internal error" is used.
For normal events, the following cause values might be generated; the
column alongside shows the SIP response they are translated into:
ISUP cause value                                SIP response
1: unallocated number                           410 Gone
2: no route to network                          404 Not found
3: no route to destination                      404 Not found
4: send special information tone                ---
16: normal call clearing                        ---
17: user busy                                   486 Busy here
18: no user responding                          480 Temporarily unavailable
19: no answer from the user                     480 Temporarily unavailable
21: call rejected                               603 Decline
22: number changed                              301 Moved permanently
27: destination out of order                    404 Not found
28: address incomplete                          484 Address incomplete
29: facility rejected                           501 Not implemented
31: normal unspecified                          480 Temporarily unavailable
Cause values in the resource unavailable class indicate a temporary situation,
so a Retry-After header has to be added to the response.
ISUP cause value                                SIP response
34: no circuit available                        503 Service unavailable
38: network out of order                        503 Service unavailable
41: temporary failure                           503 Service unavailable
42: switching equipment congestion              503 Service unavailable
44: requested channel not available             503 Service unavailable
47: resource unavailable                        503 Service unavailable
Cause values in the service or option not available class indicate a permanent
situation, so the response does not carry a Retry-After header, unlike the
previous case:
ISUP cause value                                SIP response
55: incoming calls barred within CUG            603 Decline
57: bearer capability not authorized            503 Service unavailable
58: bearer capability not presently available   503 Service unavailable
63: service/option not available                503 Service unavailable
The instances where service or option not implemented status is generated are
discussed here. For enhanced SIP services such as SUBSCRIBE, NOTIFY, PUBLISH,
and MESSAGE, there is no equivalent service on the PSTN front. Such requests are
generally replied to with a "Not Implemented" cause value. The same is translated
into the SIP response message and shared with the WebRTC client.
ISUP cause value                                SIP response
65: bearer capability not implemented           501 Not implemented
79: service or option not implemented           501 Not implemented
When ISUP receives an invalid message, the cause
values are as follows. These are also translated into the SIP response
503 Service unavailable.
ISUP cause value                                SIP response
87: user not member of CUG                      503 Service unavailable
88: incompatible destination                    503 Service unavailable
95: invalid message                             503 Service unavailable
In the event of a protocol error or timer expiry, the ISUP cause values and the equivalent
SIP responses are as follows:
ISUP cause value                                SIP response
102: recovery on timer expiry                   408 Request timeout
111: protocol error                             500 Server internal error
For instances where the interworking rules are not specified clearly, the cause of
the generated ISUP error is "Interworking, unspecified". It is translated to "Server
internal error" in the SIP message format.
ISUP cause value                                SIP response
127: interworking unspecified                   500 Server internal error
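The cause-value tables in this section translate naturally into a lookup with a default. The sketch below copies the values from those tables as they appear in this chapter; the function names are illustrative and not taken from any particular gateway implementation.

```python
# (SIP response, ISUP cause values) pairs copied from the tables in this section.
_RESPONSE_GROUPS = [
    ("410 Gone", [1]),
    ("404 Not found", [2, 3, 27]),
    ("486 Busy here", [17]),
    ("480 Temporarily unavailable", [18, 19, 31]),
    ("603 Decline", [21, 55]),
    ("301 Moved permanently", [22]),
    ("484 Address incomplete", [28]),
    ("501 Not implemented", [29, 65, 79]),
    ("503 Service unavailable", [34, 38, 41, 42, 44, 47, 57, 58, 63, 87, 88, 95]),
    ("408 Request timeout", [102]),
    ("500 Server internal error", [111, 127]),
]
CAUSE_TO_SIP = {cause: resp for resp, causes in _RESPONSE_GROUPS for cause in causes}

NO_SIP_RESPONSE = {4, 16}                     # listed with "---" in the table
RESOURCE_UNAVAILABLE = {34, 38, 41, 42, 44, 47}  # temporary, needs Retry-After


def sip_response_for_cause(cause: int):
    """Return the SIP response for an ISUP cause value, or None if none applies."""
    if cause in NO_SIP_RESPONSE:
        return None
    return CAUSE_TO_SIP.get(cause, "500 Server internal error")


def needs_retry_after(cause: int) -> bool:
    """Resource unavailable causes are temporary, so Retry-After is added."""
    return cause in RESOURCE_UNAVAILABLE


print(sip_response_for_cause(17), needs_retry_after(17))
print(sip_response_for_cause(34), needs_retry_after(34))
```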
The call in progress
This section describes the responses generated when the call is
successfully routed across the network nodes from IMS to PSTN. Once the call signal
reaches the PSTN phone, the phone might return a busy tone if it is engaged in another
call or start ringing if it is free. In some cases, telephones are configured just to record
voicemail; in such cases, the status sent back to the SIP world is "Call is being
forwarded". Thus, for a Call Progress (CPG) message, the following
responses are sent back to the SIP network:
ISUP event code                                 SIP response
1: Alerting                                     180 Ringing
2: Progress                                     183 Session progress
3: In-band information                          183 Session progress
4: Call forward; line busy                      181 Call is being forwarded
5: Call forward; no reply                       181 Call is being forwarded
6: Call forward; unconditional                  181 Call is being forwarded
-: (no event code present)                      183 Session progress
When a Connect (CON) message is received, a 200 OK success response is sent
back to the SIP network. After the receipt of the acknowledgement (ACK), the media session
is established between both endpoints.
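The CPG event-code table can be expressed as a lookup on SIP status codes. This is a small sketch; treating an unknown event code the same as a missing one (183) is an assumption made here for illustration.

```python
# ISUP CPG event code -> SIP status code, as listed in the table above.
CPG_EVENT_TO_SIP_STATUS = {1: 180, 2: 183, 3: 183, 4: 181, 5: 181, 6: 181}


def status_for_cpg(event_code=None) -> int:
    """Return the SIP provisional status for a CPG event code.

    A missing event code maps to 183; unknown codes are also mapped
    to 183 here, which is an assumption of this sketch.
    """
    return CPG_EVENT_TO_SIP_STATUS.get(event_code, 183)


for code in (1, 4, None):
    print(code, "->", status_for_cpg(code))
```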
The service logic
Just as explained in the previous chapter, the service logic execution for various
services such as conferencing, Virtual Private Network (VPN), announcements,
and other services applicable to PSTN systems, can be scripted in IN Service
Control Point (SCP) or IMS SIP Application Server, or both. We will discuss
these three approaches in brief here.
SIP service logic through application server
When the SIP call control logic is defined in the form of a SIP Servlet, a
JAIN-SLEE program, or something similar, it is loaded and deployed onto the SIP
application server; the SCSCF then consults the SIP application server for every call.
The following diagram shows the IMS-to-PSTN connectivity using the Application
Server call control:
[Figure: WebRTC client, SIP WS to SIP gateway, IMS network, Application server, and PSTN gateway]
Many SIP applications such as call screening, Music on Hold, and the ring back tone
advertisement are applicable for use with PSTN systems and can be integrated with
the approach mentioned in the previous section.
IN services via IMSSF
As the majority of service providers have not adopted the IMS platform, or have adopted
it only partially, and want to keep using the service logic defined in the SCP for call
control, we can use a Reverse IMSSF (IP Multimedia Service Switching
Function) node to convert between SIP and INAP and obtain the logic from the SCP. This
happens in a transparent manner, and the end users on both the PSTN and the WebRTC
sides do not notice which node provides the call control services. The
following diagram shows the IMS-to-PSTN connectivity using SCP call control:
[Figure: WebRTC client, SIP WS to SIP gateway, IMS network, and PSTN gateway]
The Service Broker for the orchestration
of services
A Service Broker, as discussed in the previous chapter, can interlink the services of
IN's Service Control Point (SCP) and IMS Application Server. It can call the services
in any permutation or combination. It can execute a service individually too. The
following diagram depicts the role of a Service Broker in orchestrating the services of
the SIP IMS and PSTN IN worlds:
[Figure: Service Broker between the SCP/IN and the Application Server, together with the WebRTC client, SIP WS to SIP gateway, and PSTN gateway]
At the end of this chapter, we have arrived at an architecture similar to the one
shown in the following diagram. The solution diagram for PSTN-to-WebRTC
connectivity consists of SS7 and IMS network layer nodes. It might also be
integrated with service logic embedded in the application layer components
and Service Delivery Platform (SDP). The following diagram shows the
IMS-to-PSTN connectivity using a Service Broker:
[Figure: solution architecture. Layers shown: Web Client, Media relay, Transport Layer, Application layer, and Service Delivery Platform]
A Session Border Controller (SBC) is used for a variety of reasons that include
security through network hiding, transcoding, offloading VoIP traffic, and so on.
It is recommended that you use a standard grade SBC for PSTN network integration.
The process described in the book establishes an ideal way to interconnect a legacy
telephone system with WebRTC. We discussed interworking based on the SS7
IN setup as well as the IMS-based environment. We saw how PSTN gateways
function with the help of the MGW and SGW, which take care of media and signaling
conversions, respectively. We listed a few challenges that can occur when a SIP
network is made to function with the PSTN system; these challenges included
address mapping and SIP-to-ISUP translation. We saw the approaches to integrating
call control logic with the WebRTC PSTN setup built so far.
After establishing the interconnectivity of a typical WebRTC ecosystem with all the
other useful networks, we can now shift our focus to deliver a robust, feature-rich
WebRTC application architecture, which will be discussed in the upcoming chapters.
Basic Features of WebRTC
over SIP
In the previous chapters, we encountered the WebRTC integration architecture
with and without the solution design. We will now turn our attention to the
development of a full-fledged WebRTC client; this includes the basic features and
rich communication services that must be implemented to put our WebRTC client on
a par with standard communication clients.
This chapter only describes the basic features expected from a SIP-based
WebRTC client. I have grouped the services into categories that best describe the
implementation. The basic SIP services such as registration, call, instant message, call
transfer, as well as enhanced SIP services, such as Presence and user capability, are
also explained. The Media Server plays a crucial role in the development of some of
the other SIP services, such as voicemail, IVR, Conferencing, Music on Hold, among
others. This chapter explains the introduction of Media Services in the WebRTC
client as well.
As the WebRTC client is a web-based communication agent, it would be remiss not to
integrate interactive web application features such as user-friendly GUIs and
OAuth for token-based authentication with social networking platforms such
as Facebook or Google+. The integration of such web-based services will also be
covered as part of this chapter.
SIP services
SIP services are communication features that are intrinsic to SIP requests themselves.
These services include the following:
Audio/video call
Instant message
We will discuss all of them in detail, accompanied by call-flow diagrams.
Registering a SIP client
A SIP client, such as a SIP phone, SIP soft client, or SIP WebRTC web page, ought
to register itself with the Registrar. The Registrar informs the VoIP system
how the client can be reached for an incoming event such as a call, instant message, or
Presence update.
A Registrar records the contact address of each client through the
SIP REGISTER request and confirms the registration with a 200 OK response.
The values obtained are copied to the HSS/Location Server as well.
The ideal scenario for the SIP registration call flow without any authentication
challenge is shown in the following diagram:
[Figure: SIP registration call flow without an authentication challenge. Labels shown: SIP Server, (SIP over ..., 200 OK]
Chapter 6
However, with the introduction of secure registration with authentication, the
server issues an authentication challenge in the form of a 407 Proxy Authentication
Required response. It is then up to the request initiator to prove its identity.
For this purpose, the client computes a digest response from its credentials and the
nonce supplied in the challenge (as in HTTP Digest authentication) and resends the
request with it, so that the server can validate its identity. The following diagram
shows the SIP registration call flow with the authentication challenge:
[Figure: SIP registration call flow with an authentication challenge. Labels shown: SIP Server, (SIP over ..., 407 Authentication Required, 200 OK]
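The digest computation behind the challenge can be sketched as follows. This is a minimal illustration of the basic HTTP Digest scheme without the optional qop parameter; the password and nonce values are placeholders, and a real client would take the realm and nonce from the Proxy-Authenticate header of the 407 response.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Hex-encoded MD5 of a string, as used by Digest authentication."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username: str, realm: str, password: str,
                    method: str, uri: str, nonce: str) -> str:
    """Compute the Digest 'response' value (basic variant, no qop)."""
    ha1 = md5_hex(f"{username}:{realm}:{password}")  # credentials hash
    ha2 = md5_hex(f"{method}:{uri}")                 # request hash
    return md5_hex(f"{ha1}:{nonce}:{ha2}")


response = digest_response(
    username="userA",
    realm="domain.com",
    password="secret",          # placeholder password
    method="REGISTER",
    uri="sip:domain.com",
    nonce="example-nonce",      # placeholder; supplied by the server challenge
)
print(response)
```

The client places this value in the response parameter of the Authorization (or Proxy-Authorization) header when it resends the request.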
During the processing of a call, the Location Server is queried to find out
which Serving Call Session Control Function (SCSCF) should be used in order to
forward the request. We discussed the role of CSCF nodes in Chapter 3, WebRTC with
SIP and IMS.
The SIP trace for the registration request sent by the WebRTC client is as follows:
SEND: REGISTER sip:domain.com SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;branch=z9hG4bKkKS7wNb4eUtrGC4eqAcsLk
From: "userA"<sip:userA@domain.com>;tag=QprEnFLGbLoZ1JzOAZro
To: "userA"<sip:userA@domain.com>
Contact: "userA"<sip:userA@df7jal23ls0d.invalid;rtcweb-breaker=yes;transp
Call-ID: 45771a0b-a074-a703-137e-95732a99422c
CSeq: 40303 REGISTER
Content-Length: 0
Max-Forwards: 70
Authorization: Digest username="userA",realm="domain.com",nonce="U3mcb1N5
User-Agent: IM-client/OMA1.0 sipML5-v1.2013.08.10B
Supported: path
The first line tells us that this is a REGISTER message. It also specifies the
Request-URI, which, in this case, is the domain for which the registration is meant.
This line also identifies the version of the protocol, which is SIP/2.0. The To and
From headers carry the address-of-record; in the case of REGISTER, these two are the
same unless it is a third-party registration. The Call-ID header field is mainly used for
dialog identification. For registration, a SIP or WebRTC client will use the same
Call-ID value for all registrations with a particular registrar. The Contact header holds
the address bindings. The Expires header indicates how long the registration should be
valid, with the value given in seconds. Other headers are not mandatory.
The authentication challenge with the 407 Proxy Authentication Required response
is thrown by the server. This is depicted in the following SIP RESPONSE trace:
recv=SIP/2.0 407 Proxy Authentication Required
Via: SIP/2.0/WS df7jal23ls0d.invalid;report=52131;
From: "userA"<sip:userA@domain.com>;tag=nBAgXU4kKmnC7Xx2JEpr
To: <sip:userB@domain.com>;tag=38aee63c0935fb6b672c4ad12db0cc71.0f98
Call-ID: ac8d4e18-3c8b-5a3f-45c9-f236112e8d69
CSeq: 60293 INVITE
Content-Length: 0
Proxy-Authenticate: Digest realm="domain.com",
Server: kamailio (4.1.1 (x86_64/linux))
SEND: ACK sip:userB@domain.com SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;branch=
From: "userA"<sip:userA@domain.com>;tag=nBAgXU4kKmnC7Xx2JEpr
To: <sip:userB@domain.com>;tag=38aee63c0935fb6b672c4ad12db0cc71.0f98
Call-ID: ac8d4e18-3c8b-5a3f-45c9-f236112e8d69
CSeq: 60293 ACK
Content-Length: 0
Max-Forwards: 70
Please note that 401 Unauthorized or 407 Proxy Authentication
Required is present before almost all the SIP requests such as
Register, Subscribe, and Invite. We have henceforth excluded
them from traces to provide simplicity and clarity.
Making audio and video calls using SIP
This is primarily the most crucial and important part of any communication client.
We have already made sure of this feature in Chapter 2, Making a Standalone WebRTC
Communication Client. To refresh the concept of a call on SIP clients, we can recall
the SIP requests that are sent out when a party wants to invite the other side for a
communication session.
The working principle of a SIP call revolves around the following four requests:
The INVITE SIP request asks for a session
The ACK SIP request confirms the final response to an INVITE
The CANCEL SIP request ends a pending request
The BYE SIP request ends a session
A 200 OK SIP response to the INVITE SIP request is followed by the transmission of
the ACK SIP request. The Session Description Protocol (SDP) offer for media
negotiation is normally carried in the INVITE body and the answer in the 200 OK
response; the ACK carries SDP only when the INVITE was sent without an offer. This
leads to successful call establishment between the caller and receiver parties. A call
request that has not yet received a final response can be withdrawn by the caller
with a CANCEL SIP request. An ongoing call is ended with a BYE SIP request.
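The life cycle built from these four requests can be sketched as a small state machine. This is an illustrative model only, seen from the caller's side; the state names and the advance and run helpers are assumptions made for this example and are not part of any SIP library.

```python
from enum import Enum


class CallState(Enum):
    IDLE = "idle"
    PENDING = "pending"          # INVITE sent, no final response yet
    ANSWERED = "answered"        # 200 OK received, ACK not yet sent
    ESTABLISHED = "established"  # ACK sent, media can flow
    ENDED = "ended"


# (current state, event) -> next state; any other pair is rejected.
TRANSITIONS = {
    (CallState.IDLE, "INVITE"): CallState.PENDING,
    (CallState.PENDING, "200 OK"): CallState.ANSWERED,
    (CallState.PENDING, "CANCEL"): CallState.ENDED,
    (CallState.ANSWERED, "ACK"): CallState.ESTABLISHED,
    (CallState.ESTABLISHED, "BYE"): CallState.ENDED,
}


def advance(state, event):
    """Return the next state, or raise ValueError for an invalid event."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event} is not valid in state {state.name}") from None


def run(events):
    """Feed a sequence of events into a fresh call and return the final state."""
    state = CallState.IDLE
    for event in events:
        state = advance(state, event)
    return state


if __name__ == "__main__":
    print(run(["INVITE", "200 OK", "ACK"]).name)
    print(run(["INVITE", "CANCEL"]).name)
```

Provisional responses such as 100 Trying and 180 Ringing are left out because they do not change which of the four requests is allowed next.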
SIP employs the Request-Response Model in a fashion similar to HTTP. Transactions
keep an account of the internal state and maintain timers for every
request/response. A dialog is the complete set of signaling messages exchanged between
both parties, and every request together with its final response inside this set is
considered a single transaction. This implies that a dialog can consist of many transactions.
To brush up our understanding of SIP communication sessions, let's go through
the difference between a SIP Transaction and a SIP Dialog.
SIP Transaction: This occurs between a client and a server and comprises all
messages from the first request sent from the client to the server up to the final
response sent from the server to the client.
SIP Dialog: This is a peer-to-peer SIP relationship between two UAs that persists
for some time. A dialog is identified by a Call-ID header, a local tag, and a remote tag.
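To make the distinction concrete, the following sketch groups messages into dialogs and transactions. It assumes each message has already been reduced to a dictionary with hypothetical keys (call_id, from_tag, to_tag, branch, method), that all messages are seen from the caller's side so that the From tag is the local tag, and that a transaction is matched by the Via branch plus the CSeq method.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class DialogId(NamedTuple):
    call_id: str
    local_tag: str
    remote_tag: str


class TransactionKey(NamedTuple):
    branch: str  # branch parameter of the topmost Via header
    method: str  # method taken from the CSeq header


def group_by_dialog(messages: List[dict]) -> Dict[DialogId, Set[TransactionKey]]:
    """Map every dialog to the set of transactions observed inside it."""
    dialogs = defaultdict(set)
    for msg in messages:
        dialog = DialogId(msg["call_id"], msg["from_tag"], msg["to_tag"])
        dialogs[dialog].add(TransactionKey(msg["branch"], msg["method"]))
    return dict(dialogs)


if __name__ == "__main__":
    # Hypothetical call: one INVITE transaction and one BYE transaction.
    base = {"call_id": "call-1", "from_tag": "tagA", "to_tag": "tagB"}
    seen = [
        dict(base, branch="z9hG4bK-1", method="INVITE"),  # 200 OK to INVITE
        dict(base, branch="z9hG4bK-2", method="BYE"),     # BYE request
        dict(base, branch="z9hG4bK-2", method="BYE"),     # 200 OK to BYE
    ]
    for dialog, transactions in group_by_dialog(seen).items():
        print(dialog.call_id, len(transactions))
```

The request and its response share a branch and a method, so they collapse into one transaction, while both transactions share the same Call-ID and tags and therefore belong to one dialog.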
Basic Features of WebRTC over SIP
The following diagram gives a description of the SIP Dialog and SIP Transaction:
The call setup process, shown as follows, depicts a call between two WebRTC
clients through a WebSocket-capable SIP Server. These kinds of calls can operate
solely on SIP over WebSockets (SIP-WS) protocol and do not require a SIP-WS to
SIP gateway. The following diagram shows a call between WebRTC clients using
the SIP-over-WebSockets server:
[Diagram: SIP call flow between two WebRTC clients through a SIP Server (SIP over WebSockets), showing the 180 Ringing and 200 OK responses followed by audio (RTP)]
For interoperability of WebRTC clients with SIP agents such as hard phones, other
desktop SIP phones, and mobile SIP phones, we require the WebRTC to SIP gateway
(SIP-WS to SIP converter). A call flow depicting such cases is shown as follows; here,
the call is between the SIP agent and the WebRTC client using the SIP Server with
SIP-over-WebSockets support and a gateway:
[Diagram: SIP call flow between a SIP agent and a WebRTC client through a SIP Server (SIP over WebSockets) with a WebRTC to SIP gateway, showing the 180 Ringing and 200 OK responses followed by audio (RTP)]
As visible from the preceding two diagrams, to call a WebRTC-only client,
a SIP-over-WebSocket server is sufficient. However, to be able to call any true SIP
agent such as a hard SIP phone or a desktop-based soft SIP phone (Kapanga, X-Lite,
Twinkle, and so on), we need to have the SIP flow passed through a SIP Server
that is capable of converting between SIP over WebSockets and
plain SIP.
Assuming that we have such a server deployed, as discussed in Chapter 2, Making
a Standalone WebRTC Communication Client, we will focus on other services such as
Presence, message, and information for WebRTC clients.
Before establishing a call, the browser's WebRTC stack requests permission to
access the webcam and microphone to enable an audio/video call. The following
screenshot shows a browser requesting permission to allow the usage of the camera
and microphone for WebRTC calls:
The browser's WebRTC stack might also set the ICE parameters with the STUN
and TURN servers for network discovery, as shown in the following trace:
State machine: c0000_Started_2_Outgoing_X_oINVITE SIPml-api.js:1
PeerConnectionClass = function RTCPeerConnection() { [native code] }
SessionDescriptionClass = function RTCSessionDescription() { [native
code] } IceCandidateClass = function RTCIceCandidate() { [native code] }
Video Constraints:{"mandatory":{},"optional":[]} SIPml-api.js:1
ICE servers:[{"url":"stun:stun.l.google.com:19302"},{"url":"stun:stun.
counterpath.net:3478"},{"url":"stun:numb.viagenie.ca:3478"}] SIPml-api.
==stack event = m_permission_requested
onIceGatheringCompleted SIPml-api.js:1
Let's look at the SIP traces for a single side of a call between WebRTC clients in
sequential order:
1. The INVITE request sent from user A to user B's WebRTC client is as follows:
SEND: INVITE sip:userB@domain.com SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;branch=z9hG4bKNK4lzTbVKKnbk7c
From: "userA"<sip:userA@domain.com>;tag=nBAgXU4kKmnC7Xx2JEpr
To: <sip:userB@domain.com>
Contact: "userA"<sip:userA@df7jal23ls0d.invalid;rtcweb-breaker=yes
Call-ID: ac8d4e18-3c8b-5a3f-45c9-f236112e8d69
CSeq: 60294 INVITE
Content-Type: application/sdp
Content-Length: 3022
Max-Forwards: 70
Proxy-Authorization: Digest username="userA",realm="domain.com",no
User-Agent: IM-client/OMA1.0 sipML5-v1.2013.08.10B
Organization: DOMAIN
/* SDP truncated */
Like the REGISTER SIP request message, INVITE also bears the same header
fields. The topmost line tells us that this SIP message is an INVITE message
that is used to establish call sessions. It contains the Request-URI, which
is the same as the To header in the case of INVITE. The From and To header fields
identify the caller and receiver, respectively. All the messages inside a
dialog, in our case, a call, will bear the same unique Call-ID.
The CSeq field maintains the order of transactions. The Contact header
is the address on which the sender is awaiting the next request/response.
The Content-Type header field holds information about the message body.
In the case of INVITE, it's SDP.
2. The 100 trying response for the invitation sent from user A to user B
is as follows:
recv=SIP/2.0 100 trying -- your call is important to us
Via: SIP/2.0/WS df7jal23ls0d.invalid;rport=52131;
From: "userA"<sip:userA@domain.com>;tag=nBAgXU4kKmnC7Xx2JEpr
To: <sip:userB@domain.com>
Call-ID: ac8d4e18-3c8b-5a3f-45c9-f236112e8d69
CSeq: 60294 INVITE
Content-Length: 0
Server: kamailio (4.1.1 (x86_64/linux))
3. The 180 Ringing response for the invitation sent from user A to user B
is as follows:
recv=SIP/2.0 180 Ringing
Via: SIP/2.0/WS df7jal23ls0d.invalid;rport=52131;
From: "userA"<sip:userA@domain.com>;tag=nBAgXU4kKmnC7Xx2JEpr
To: <sip:userB@domain.com>;tag=RwaGGC5SWWopJkgqeBRU
Contact: <sip:userB@df7jal23ls0d.invalid;alias=;transport=ws>
Call-ID: ac8d4e18-3c8b-5a3f-45c9-f236112e8d69
CSeq: 60294 INVITE
Content-Length: 0
Record-Route: <sip:;transport=ws;lr=on>
4. The 200 OK success response sent to user A by user B is as follows:
recv=SIP/2.0 200 OK
Via: SIP/2.0/WS df7jal23ls0d.invalid;
From: "userA"<sip:userA@domain.com>;tag=NXEl7RIgwL9Xq7RnOJVw
To: <sip:userB@domain.com>;tag=MHeTUftnAEUCVdzfd908
Contact: <sip:userB@df7jal23ls0d.invalid;alias=;transport=ws>
Call-ID: fe8e54e7-c04b-1064-4b4a-b3394fd06653
CSeq: 18645 INVITE
Content-Type: application/sdp
Content-Length: 2244
Record-Route: <sip:;transport=ws;lr=on>
/* SDP truncated */
5. The ACK request sent by user A to user B is as follows:
SEND: ACK sip:userB@df7jal23ls0d.invalid;alias=
9~5;transport=ws SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;branch=z9hG4bKERC7ZDrc2OEGAB3
From: "userA"<sip:userA@domain.com>;tag=NXEl7RIgwL9Xq7RnOJVw
To: <sip:userB@domain.com>;tag=MHeTUftnAEUCVdzfd908
Contact: "userA"<sip:userA@df7jal23ls0d.invalid;
Call-ID: fe8e54e7-c04b-1064-4b4a-b3394fd06653
CSeq: 18645 ACK
Content-Length: 0
Max-Forwards: 70
Proxy-Authorization: Digest username="userA",realm="domain.co
Route: <sip:;transport=ws;lr=on>
User-Agent: IM-client/OMA1.0 sipML5-v1.2013.08.10B
Organization: DOMAIN
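The header fields discussed for the INVITE request (Request-URI, From, To, Call-ID, CSeq, and Content-Type) can be read out of a raw message with a few lines of code. The following is a minimal sketch, not a complete SIP parser: it assumes CRLF line endings and ignores header folding and compact header names. The function names are hypothetical.

```python
def parse_sip_request(raw):
    """Split a raw SIP request into start line, headers, and body."""
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    method, request_uri, version = start_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive; repeated headers are kept in order.
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return {
        "method": method,
        "request_uri": request_uri,
        "version": version,
        "headers": headers,
        "body": body,
    }


def cseq(parsed):
    """Return the CSeq header as a (number, method) pair."""
    number, method = parsed["headers"]["cseq"][0].split()
    return int(number), method


if __name__ == "__main__":
    raw = (
        "INVITE sip:userB@domain.com SIP/2.0\r\n"
        "To: <sip:userB@domain.com>\r\n"
        "Call-ID: ac8d4e18-3c8b-5a3f-45c9-f236112e8d69\r\n"
        "CSeq: 60294 INVITE\r\n"
        "Content-Type: application/sdp\r\n"
        "\r\n"
        "v=0\r\n"
    )
    parsed = parse_sip_request(raw)
    print(parsed["method"], parsed["request_uri"], cseq(parsed))
```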
Similarly, a call between a WebRTC client and SIP phone is established via
the WebRTC-to-SIP gateway. Interoperability with various native mobile and
desktop-based SIP phones is described in Chapter 9, Native SIP Application and
Interaction with WebRTC Clients.
Note that the Media Gateway is required for demultiplexing, and the
Media Server is required for inter-codec conversion (transcoding)
between legacy audio/video codecs supported by SIP phones and
WebRTC-standard codecs. The process of integrating a Media Server
with WebRTC is detailed in Chapter 3, WebRTC with SIP and IMS.
Text Chat using SIP
The MESSAGE SIP request is the way through which messages are delivered to the
SIP Server. The frontend of the WebRTC client issues a MESSAGE SIP request from the
sender to the receiver. The request is sent through the SIP signaling server
infrastructure, and the text of the instant message is transported in the
body of the SIP request. A call-flow diagram depicting the traversal of SIP MESSAGE
requests and subsequent responses is shown as follows:
[Diagram: SIP MESSAGE call flow between a SIP agent and a WebRTC client through a SIP Server (SIP over WebSockets) with a WebRTC to SIP gateway, showing 200 OK responses]
The SIP traces for instant messages in the form of text chats are shown through the
SIP request and response messages. The sent and received messages between two
parties using WebRTC clients are shown in a sequential order as follows:
1. The SIP message from user A to user B is as follows:
SEND: MESSAGE sip:userB@domain.com SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;branch=z9hG4bKBd24tswZZnBxamA
From: "userA"<sip:userA@domain.com>;tag=kkDy7bqurNd3XV10dlAO
To: <sip:userB@domain.com>
Call-ID: a0db8770-9489-4fca-78c1-0ce6d839f4e8
CSeq: 17363 MESSAGE
Content-Type: text/plain;charset=utf8
Content-Length: 16
Max-Forwards: 70
Accept-Contact: *;+g.oma.sip-im
Accept-Contact: *;+sip.ice
Accept-Contact: *;language="en,fr"
Proxy-Authorization: Digest username="userA",realm="domain.com",
User-Agent: IM-client/OMA1.0 sipML5-v1.2013.08.10B
Organization: DOMAIN
2. The RESPONSE SIP message for the message sent from user A to user B
is as follows:
recv=SIP/2.0 200 OK
Via: SIP/2.0/WS df7jal23ls0d.invalid;rport=51627;
From: "userA"<sip:userA@domain.com>;tag=kkDy7bqurNd3XV10dlAO
To: <sip:userB@domain.com>
Call-ID: a0db8770-9489-4fca-78c1-0ce6d839f4e8
CSeq: 17363 MESSAGE
Content-Length: 0
The MESSAGE requests do not establish a dialog and will always traverse the same set
of proxies. There are several ways to implement text chat. We discussed the SIP
MESSAGE request in this section. Text chat can also be implemented via the Message
Session Relay Protocol (MSRP). We will discuss this in greater detail in Chapter 8,
WebRTC and Rich Communication Services.
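Because the text travels in the body of the request, the Content-Length header must count the bytes of the encoded body, not its characters. The following sketch builds a MESSAGE request along the lines of the trace above. The function name and its parameters are hypothetical, and authentication and the Accept-Contact headers are left out for brevity.

```python
def build_message(sender, recipient, text, call_id, cseq, branch, tag):
    """Return a SIP MESSAGE request as bytes, with a correct Content-Length."""
    body = text.encode("utf-8")
    headers = [
        f"MESSAGE sip:{recipient} SIP/2.0",
        f"Via: SIP/2.0/WS df7jal23ls0d.invalid;branch={branch}",
        f"From: <sip:{sender}>;tag={tag}",
        f"To: <sip:{recipient}>",
        f"Call-ID: {call_id}",
        f"CSeq: {cseq} MESSAGE",
        "Content-Type: text/plain;charset=utf-8",
        f"Content-Length: {len(body)}",  # length in bytes, not characters
        "Max-Forwards: 70",
    ]
    # A blank line separates the headers from the body.
    return ("\r\n".join(headers) + "\r\n\r\n").encode("utf-8") + body


if __name__ == "__main__":
    request = build_message(
        "userA@domain.com", "userB@domain.com", "Hello from userA",
        call_id="example-call-id", cseq=1, branch="z9hG4bK-example", tag="tag-1")
    print(request.decode("utf-8"))
```

No dialog state is stored anywhere in this sketch, which mirrors the point that each MESSAGE request stands on its own.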
Obtaining the online/offline status of users using SIP
The availability of users is determined by a SIP process called Presence. Depending
on their registration validation and reachability, a SIP or WebRTC client might send
online or offline status notifications. These notify others whether a particular
user is available to take messages or calls right now.
The PUBLISH SIP request publishes the status of a user, which is either online or
offline, to the SIP Server. Let's assume that user X has published their
status to the Server. The SIP Server also receives the SUBSCRIBE SIP request, which
indicates that other users would like to know about the status updates of this
particular user (user X). The Server then sends out the NOTIFY SIP request to update
the subscribed users about the current status of user X. This process is
repeated for all users. A person might like to subscribe for status updates of all the
contacts in their phonebook so that they can view the online or offline status as a
green or red indicator alongside each address entry.
The call flow depicted in the following diagram is derived from SIP Extension for
Event State Publication (RFC 3903). The following diagram shows the call flow for
the Presence service, using the PUBLISH, SUBSCRIBE, and NOTIFY requests:
[Diagram: Presence call flow through a SIP Server (SIP over WebSockets) with a WebRTC to SIP gateway, showing 200 OK responses]
The working principle for the Presence service is described in this section.
The participating entities in this scenario are shown in the following figure:
[Diagram: Presence User Agent, Presence Agent, and Presence Server exchanging Subscribe and Notify]
The Presence Agent publishes its current state to the Presence Server via the
Watcher, which continuously monitors the state change for users. The other user
agent listens for notifications from the Presence Server. The three major SIP requests
for the Presence service are given as follows:
PUBLISH for event state update
SUBSCRIBE to enable the receipt of notifications
NOTIFY to send updated information
The SUBSCRIBE message establishes a dialog and is immediately answered by
the server with a 200 OK response. At this point, the dialog is established.
The server sends a NOTIFY request to the user every time the event to which the
user is subscribed changes. The NOTIFY messages are sent within the dialog
established by the SUBSCRIBE message.
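The interplay of PUBLISH, SUBSCRIBE, and NOTIFY can be modeled with a small in-memory sketch. The PresenceServer class below is a hypothetical illustration: it keeps no SIP dialogs or timers and simply records the NOTIFY requests that a real server would send. It assumes that a new subscriber is told the current state right away and that unknown users count as offline.

```python
class PresenceServer:
    """Toy presence server: stores state and queues NOTIFY requests."""

    def __init__(self):
        self.state = {}     # user -> "online" or "offline"
        self.watchers = {}  # user -> set of subscribed watchers
        self.outbox = []    # NOTIFY requests that would be sent

    def publish(self, user, status):
        """Handle PUBLISH: store the state and notify every watcher."""
        self.state[user] = status
        for watcher in sorted(self.watchers.get(user, set())):
            self.outbox.append(("NOTIFY", watcher, user, status))
        return "200 OK"

    def subscribe(self, watcher, user):
        """Handle SUBSCRIBE: register the watcher and send the current state."""
        self.watchers.setdefault(user, set()).add(watcher)
        current = self.state.get(user, "offline")
        self.outbox.append(("NOTIFY", watcher, user, current))
        return "200 OK"


if __name__ == "__main__":
    server = PresenceServer()
    server.publish("userA", "online")
    server.subscribe("userB", "userA")
    server.publish("userA", "offline")
    for notify in server.outbox:
        print(notify)
```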
Presence conveys a user's reachability and willingness to communicate as current
status information. Once subscribed, the user receives notifications for every state
change of the agent. This is a way to have sustained stateful communication.
The traces for the WebRTC SIP Presence requests and responses are given as follows
in sequential order:
1. The PUBLISH request sent by user A to publish its current state is
given as follows. The state varies between offline (user is not available) and
online (user is available):
SEND: PUBLISH sip:userA@domain.com SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;
From: "userA"<sip:userA@domain.com>;tag=gr4hcygvktExiuzDinch
To: "userA"<sip:userA@domain.com>
Call-ID: ce1e656e-fbb4-12a5-95dc-392ed1fe9519
CSeq: 23378 PUBLISH
Expires: 100
Content-Type: application/pidf+xml
Content-Length: 216
Max-Forwards: 70
Accept-Contact: *;+g.oma.sip-im
Accept-Contact: *;+sip.ice
Accept-Contact: *;language="en,fr"
Proxy-Authorization: Digest username="userA",
Event: presence
User-Agent: IM-client/OMA1.0 sipML5-v1.2013.08.10B
<?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf">
<tuple id="a">
2. The response received by user A is given as follows:
recv=SIP/2.0 200 OK
Via: SIP/2.0/WS df7jal23ls0d.invalid;rport=44802;
From: "userA"<sip:userA@domain.com>;tag=gr4hcygvktExiuzDinch
To: "userA"<sip:userA@domain.com>;tag=0b8837ea7699295b1c5e8783be06
Call-ID: ce1e656e-fbb4-12a5-95dc-392ed1fe9519
CSeq: 23378 PUBLISH
Expires: 90
Content-Length: 0
SIP-ETag: a.1400155961.23214.7102.0
Server: kamailio (4.1.1 (x86_64/linux))
3. The SUBSCRIBE request sent by user B to user A is as follows:
SEND: SUBSCRIBE sip:userA@domain.com SIP/2.0
Via: SIP/2.0/WS df7jal23ls0d.invalid;
From: "userB"<sip:userB@domain.com>;tag=L6ITng8FhPuHaTGLK7r9
To: <sip:userA@domain.com>
Contact: "userB"<sip:userB@df7jal23ls0d.invalid;
Call-ID: 407695ab-b463-22e6-ed93-3a360dd30a51
Expires: 100
Content-Length: 0
Max-Forwards: 70
Event: presence
Accept: application/pidf+xml
User-Agent: IM-client/OMA1.0 sipML5-v1.2013.08.10B
Organization: DOMAIN
4. The response to the SUBSCRIBE request, received by user B, is
as follows:
recv=SIP/2.0 202 OK
Via: SIP/2.0/WS df7jal23ls0d.invalid;rport=49920;
From: "userB"<sip:userB@domain.com>;tag=L6ITng8FhPuHaTGLK7r9
To: <sip:userA@domain.com>;
Contact: <sip:;transport=ws>
Call-ID: 407695ab-b463-22e6-ed93-3a360dd30a51
Expires: 100
Content-Length: 0
Server: kamailio (4.1.1 (x86_64/linux))
5. The NOTIFY request that reports user A's state to user B is as follows:
recv=NOTIFY sip:userB@;rtcweb-breaker=yes;clic
k2call=no;transport=ws SIP/2.0
Via: SIP/2.0/WS;branch=z9hG4bKbe5c.d63f742200000
From: <sip:userA@domain.com>;tag=0b8837ea7699295b1c5e8783be0664de-
To: <sip:userB@domain.com>;tag=L6ITng8FhPuHaTGLK7r9
Contact: <sip:;transport=tcp>
Call-ID: 407695ab-b463-22e6-ed93-3a360dd30a51
Content-Type: application/pidf+xml
Content-Length: 217
User-Agent: kamailio (4.1.1 (x86_64/linux))
Max-Forwards: 70
Event: presence
Subscription-State: active;expires=100
<?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf">
<tuple id="a">
6. The response to the NOTIFY request, sent by user B, is as follows:
SEND: SIP/2.0 200 OK
Via: SIP/2.0/WS;branch=z9hG4bKbe5c.d63f742200000
From: <sip:userA@domain.com>;tag=0b8837ea7699295b1c5e8783be0664de-
To: <sip:userB@domain.com>;tag=L6ITng8FhPuHaTGLK7r9
Contact: <sip:userB@df7jal23ls0d.invalid;transport=ws>
Call-ID: 407695ab-b463-22e6-ed93-3a360dd30a51
Content-Length: 0
NOTIFY content = <?xml version="1.0" encoding="UTF-8"?>
<presence xmlns="urn:ietf:params:xml:ns:pidf">
<tuple id="a">
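The PUBLISH and NOTIFY bodies in the traces above are PIDF documents (Content-Type: application/pidf+xml), but the traces are cut off after the tuple element. As a hedged sketch, the following code builds and reads a PIDF body using the status and basic elements with the values open and closed from the PIDF format (RFC 3863). The entity attribute, the mapping of online/offline to open/closed, and both function names are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

PIDF_NS = "urn:ietf:params:xml:ns:pidf"


def build_pidf(entity, is_online, tuple_id="a"):
    """Return a PIDF document as a string; online maps to 'open'."""
    ET.register_namespace("", PIDF_NS)  # serialize PIDF as the default namespace
    presence = ET.Element(f"{{{PIDF_NS}}}presence", {"entity": entity})
    tup = ET.SubElement(presence, f"{{{PIDF_NS}}}tuple", {"id": tuple_id})
    status = ET.SubElement(tup, f"{{{PIDF_NS}}}status")
    basic = ET.SubElement(status, f"{{{PIDF_NS}}}basic")
    basic.text = "open" if is_online else "closed"
    return ET.tostring(presence, encoding="unicode")


def read_basic_status(pidf_xml):
    """Return 'open', 'closed', or None if no basic status is present."""
    root = ET.fromstring(pidf_xml)
    path = f"{{{PIDF_NS}}}tuple/{{{PIDF_NS}}}status/{{{PIDF_NS}}}basic"
    basic = root.find(path)
    return basic.text if basic is not None else None


if __name__ == "__main__":
    document = build_pidf("sip:userA@domain.com", is_online=True)
    print(document)
    print(read_basic_status(document))
```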
The SIP services discussed so far cover audio/video calls, registration, Presence,
and instant messaging. We have seen the sequence of SIP request and response
messages in the preceding call-flow diagrams and also analyzed the traces. These
SIP services are very basic in nature and provide simple communication scenarios
between two SIP or WebRTC endpoints. We are now ready to focus on more detailed
services within the developer-defined call-control logic described in the next section.
Services in the Application Server
With every communication client, there is a set of basic features that need to be
supported to make the client user friendly. Normally, it is taken for granted that these
features will come along with any kind of communication software or hardware that
the end users purchase. These include call hold/resume, call transfer/forward, call
screening, call ignore, mute, redial, and so on. We can integrate these either in the
frontend logic, which is written in JSP/HTML, or towards the SIP Application Layer,
which is written in the form of JAIN-SLEE or SIP Servlets.
An Application Server is used to introduce call-control logic in a normal call-flow
scenario. The following diagram depicts the role of the Application Server, managing
the call-control logic as well as components such as Registrar and Proxy Servers:
[Diagram: Application Server with Registrar and Proxy Server. Recoverable steps: 1. Register from different phones at different locations; 3 and 4. 200 OK; 5. Call User-C using his public URI; 8 to 10. 200 OK; 11 to 13. ACK]
A SIP request that is sent from a SIP or WebRTC agent makes its way to the
SCSCF after passing through PCSCF. At SCSCF, it is sent to the Application Server
to process the call-control logic and proceed with the call accordingly. An application
defines the actions that need to be taken upon particular events. For example, if the
caller is found to be blocked by the user, the SIP Server rejects the INVITE SIP
request with an error response such as 403 Forbidden. The developers might also redirect
the blocked call to an audio file, stating that the user has been blocked. Many
other call-control services such as call forwarding and call transfer can also be
programmed as SIP applications to be deployed on the Application Server.
In this particular section, we will throw light on the following services:
Simple back-to-back user agent
Call screening
Call hold/resume
Call logs
Back-to-back user agent
In the most basic setup of the WebRTC SIP IMS architecture, two scenarios arise.
The first is when a server acts like a proxy agent. In this case, the call request
is just forwarded to the destination end by the server without any modifications.
The server acts like a mute tunnel to link the two sides in a communication channel.
This can be seen in the first section of this chapter, which makes audio/video calls
via a server with SIP WebSockets support.
The second case is when the server acts like a Back-to-Back User Agent (B2BUA).
Unlike a proxy agent, which maintains only transactions, the B2BUA agent maintains
the call state of all the SIP events.
B2BUA is the way in which many VoIP elements such as SBCs and
PBXs (Asterisk or FreeSWITCH) work. Though it is not a service
itself, it has been introduced to depict the default behavior of call
control by the Application Server.
The following diagram shows the SIP call flow mediated by the B2BUA deployed on
the Telecom Application Server:
[Diagram: B2BUA call flow across two SIP Servers (SIP over WebSockets), showing 180 Ringing and 200 OK on each hop, followed by audio (RTP)]
The preceding figure depicts how the Server participates in call setup, processing,
and termination by maintaining a separate call leg on each side, that is, between the
caller and the SIP Server and between the SIP Server and the receiver. As a result
of this, B2BUA agents are generally used to perform special operations such
as failover control, topology hiding, protocol interworking, and database
lookups to fetch the screened users list. B2BUAs are also relevant to services that
entail media processing.
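The idea of keeping one call leg per side can be sketched as follows. The B2BUA class here is a hypothetical illustration and not the API of any SIP stack: it only tracks which Call-ID on the caller side is paired with which Call-ID on the receiver side, which is the bookkeeping that makes topology hiding possible.

```python
import uuid


class B2BUA:
    """Toy back-to-back user agent that pairs an inbound and an outbound leg."""

    def __init__(self):
        self.peer_leg = {}  # Call-ID of one leg -> Call-ID of the other leg

    def on_invite(self, inbound_call_id, caller, callee):
        """Accept the inbound leg and describe a fresh outbound leg."""
        outbound_call_id = uuid.uuid4().hex  # the receiver never sees the inbound Call-ID
        self.peer_leg[inbound_call_id] = outbound_call_id
        self.peer_leg[outbound_call_id] = inbound_call_id
        return {"call_id": outbound_call_id, "from": caller, "to": callee}

    def on_bye(self, call_id):
        """End both legs; return the Call-ID of the leg that must also get a BYE."""
        other = self.peer_leg.pop(call_id)
        self.peer_leg.pop(other)
        return other


if __name__ == "__main__":
    b2bua = B2BUA()
    outbound = b2bua.on_invite("leg-from-userA", "sip:userA@domain.com",
                               "sip:userB@domain.com")
    print(outbound["call_id"] != "leg-from-userA")
    print(b2bua.on_bye("leg-from-userA") == outbound["call_id"])
```

A proxy, by contrast, would forward the same Call-ID end to end and would need no such table.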
Call screening
To block unwanted callers, the user might activate the call-screening service. As part
of this service, the user defines a list of SIP URIs that are barred from making calls to
them. This is the basic call-screening use case; however, it might also be applied with
special provisions and filters. We will look into the basic call-screening process first.
Basic call screening
Let's assume that there are two users, user A and user B. A typical call screening
application will block the SIP URI of user A, while they are trying to contact
user B, based on preset values. In real time, when user A calls user B, the SIP
request reaches the core network logic and then matches the SIP URI of user
A with the blocked SIP URI list. If a match is found, the call is cancelled; otherwise,
the call is continued as normal to user B. The following diagram shows the SIP call
flow for a call-screening application:
[Diagram: call-screening flow through the SIP Server (SIP over WebSockets): the call-screening application fetches the list of blocked users and the caller receives 403 Forbidden]
The preceding diagram denotes the case when a caller's SIP URI matches an
entry in the receiver's list of screened users. In such a case, the caller
receives an error response.
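The matching step of basic call screening can be sketched in a few lines. The function name, the shape of the blocked-list dictionary, and the choice of 403 Forbidden as the error response (taken from the diagram above) are assumptions made for this illustration; a deployed application would fetch the list from a database.

```python
def screen_call(caller_uri, callee_uri, blocked_lists):
    """Return a (status, reason) rejection, or None to let the call proceed."""
    blocked = blocked_lists.get(callee_uri.lower(), set())
    if caller_uri.lower() in blocked:
        return (403, "Forbidden")
    return None  # no match: continue routing the INVITE to the callee


if __name__ == "__main__":
    # Hypothetical provisioning data: user B has blocked user A.
    blocked_lists = {"sip:userb@domain.com": {"sip:usera@domain.com"}}
    print(screen_call("sip:userA@domain.com", "sip:userB@domain.com", blocked_lists))
    print(screen_call("sip:userC@domain.com", "sip:userB@domain.com", blocked_lists))
```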
Enhanced call screening
The requirement for a more accurate call-control logic in screening applications is a
primary concern for telecom service providers due to the following reasons:
When a call is screened midway, the air interface of the Telecom Service
Provider is still made use of, and then the call is dropped. It is a direct
revenue loss after the usage of precious air interface. In order to generate
some revenue inputs even from blocked/screened calls, a failed call should
still be connected to supplementary services such as voicemails and IVR
announcements and asked to leave a message so that the services that can
be billed are invoked.
To provide for a better user experience even with failed/screened calls and
boost user engagement, a media playback of interactive responding services
is required, instead of error messages and abrupt termination of calls.
To enable the user to exercise more control over their incoming calls,
we should give them a detailed provisioning system to enter day-time
preferences and gray/white/black listed users.
To provide the subscriber with the option to link their calendar with their
call-control service so that all calls that are made during an important activity
scheduled through their calendar, such as a business meeting, get screened.
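The requirements listed above can be combined into one decision function. This is a sketch under several assumptions: the profile dictionary, its keys, the order in which the rules are checked, and the three outcomes (connect, announcement, voicemail) are all hypothetical and would be defined by the provisioning system of the service provider.

```python
from datetime import datetime, time


def screening_action(caller, now, profile):
    """Decide what to do with an incoming call for one subscriber."""
    if caller in profile["whitelist"]:
        return "connect"          # white-listed callers always get through
    if caller in profile["blacklist"]:
        return "announcement"     # play a message instead of dropping the call
    for start, end in profile["meetings"]:
        if start <= now < end:
            return "voicemail"    # subscriber is busy according to the calendar
    if caller in profile["graylist"]:
        day_start, day_end = profile["day_hours"]
        if not (day_start <= now.time() < day_end):
            return "voicemail"    # gray-listed callers only connect in day hours
    return "connect"


if __name__ == "__main__":
    profile = {
        "whitelist": {"sip:boss@domain.com"},
        "blacklist": {"sip:spam@domain.com"},
        "graylist": {"sip:vendor@domain.com"},
        "day_hours": (time(9, 0), time(18, 0)),
        "meetings": [(datetime(2014, 10, 25, 14, 0), datetime(2014, 10, 25, 15, 0))],
    }
    print(screening_action("sip:userA@domain.com",
                           datetime(2014, 10, 25, 14, 30), profile))
```

Routing screened calls to an announcement or to voicemail, rather than rejecting them, is what keeps the billable supplementary services in the call path.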
Call hold/resume
It is a simple requirement to be able to put an ongoing call on hold and resume it
after a period of time. During an ongoing call, that is, during the course of SRTP or
RTP, the media is continuously flowing between the endpoints; however, during a
call hold session, the media flow is temporarily put on hold. This prevents media
exchange for the period until the user resumes it.
The following call flow depicts the call-hold and call-resume operations between two
WebRTC clients:
[Diagram: call hold and resume across two SIP Servers (SIP over WebSockets): call setup with 180 Ringing and 200 OK, media session, hold (no media), INVITE resume over SIPWS and SIP answered with 200 OK, media session reestablished]
The working principle of call hold and resume lies in the correct usage and
understanding of the INVITE SIP message. We send a re-INVITE SIP message within
the existing dialog to put the call on hold and another re-INVITE message to
resume the media flow.
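What changes between the two re-INVITE messages is the SDP they carry. In the SDP offer/answer model, hold is commonly signaled by changing the direction attribute of a media stream from a=sendrecv to a=sendonly (or a=inactive), and resume by changing it back. The following sketch rewrites that attribute; the function name is hypothetical, and a real client would also increment the version in the o= line of the SDP.

```python
DIRECTIONS = ("sendrecv", "sendonly", "recvonly", "inactive")


def set_direction(sdp, direction):
    """Return the SDP with every media section set to the given direction."""
    if direction not in DIRECTIONS:
        raise ValueError(f"unknown direction: {direction}")
    out = []
    in_media = False
    for line in sdp.splitlines():
        if line.startswith("a=") and line[2:] in DIRECTIONS:
            continue  # drop the old direction attribute
        if line.startswith("m="):
            if in_media:
                out.append(f"a={direction}")  # close the previous media section
            in_media = True
        out.append(line)
    if in_media:
        out.append(f"a={direction}")  # close the last media section
    return "\r\n".join(out) + "\r\n"


if __name__ == "__main__":
    sdp = (
        "v=0\r\n"
        "o=- 1 1 IN IP4 192.0.2.1\r\n"
        "s=-\r\n"
        "t=0 0\r\n"
        "m=audio 49170 RTP/AVP 0\r\n"
        "c=IN IP4 192.0.2.1\r\n"
        "a=rtpmap:0 PCMU/8000\r\n"
        "a=sendrecv\r\n"
    )
    print(set_direction(sdp, "sendonly"))  # body of the hold re-INVITE
```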
Call forwarding
Call forwarding is a useful service to connect the call to a second party when
the concerned party is not available to answer the call. A real-time use case is
that of a boss forwarding all the incoming calls to his assistant. Call forwarding
can be of two variants:
Unconditional
On unavailable (busy/no answer)
Unconditional call forwarding
In this scenario, all the incoming calls are transferred to a third party on
an unconditional basis. The following diagram shows the SIP call flow for
unconditional call forwarding:
[Diagram: unconditional call forwarding among User A, the SIP Server (SIP over WebSockets), User B, and User C: 181 Call is being forwarded, 180 Ringing, media session]
The enhanced logic can also be specified in the application that will be deployed
on the application server. A user might switch on the call-forwarding service for a
period of time or for specific people. These values are fed into the system using a
provisioning system, which can be IVR based or web based.
Call forwarding when the user is unavailable
For instances where the call is to be forwarded on specific error responses such as
486 Busy Here and 408 Request Timeout (that is, the call is not answered), the SIP
Server connects the call to a second party. The logic for this application, along with
identities of the primary and secondary party who want to use the call-forwarding
service, must be deployed on the application server.
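Both forwarding variants can be expressed as one lookup against the provisioned rules. In this sketch, the rule dictionary, the mode names, and the function name are hypothetical; the status codes that trigger forwarding are assumed to be 486 Busy Here and 408 Request Timeout.

```python
FORWARD_ON_STATUS = {408, 486}  # Request Timeout, Busy Here


def forwarding_target(callee, rules, final_status=None):
    """Return the URI to forward to, or None if the call is not forwarded.

    final_status is None when the INVITE has just arrived, or the failure
    code received from the callee after the call was attempted.
    """
    rule = rules.get(callee)
    if rule is None:
        return None
    if rule["mode"] == "unconditional":
        # Forward immediately, before the callee is ever alerted.
        return rule["target"] if final_status is None else None
    if rule["mode"] == "unavailable" and final_status in FORWARD_ON_STATUS:
        return rule["target"]
    return None


if __name__ == "__main__":
    rules = {
        "sip:boss@domain.com": {"mode": "unconditional",
                                "target": "sip:assistant@domain.com"},
        "sip:userB@domain.com": {"mode": "unavailable",
                                 "target": "sip:userC@domain.com"},
    }
    print(forwarding_target("sip:boss@domain.com", rules))
    print(forwarding_target("sip:userB@domain.com", rules, final_status=486))
```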
The following diagram shows the SIP call flow for call forwarding in case of
user unavailability:
[Diagram: call forwarding on unavailability among User A, the SIP Server (SIP over WebSockets), User B, and User C: 100 Trying, 486 Busy Here, 181 Call is being forwarded, 180 Ringing, media session]
Call transfer
An ongoing call can be transferred from one user to another while a session is in
progress or even before a call is received. There are two variations of call
transfer, both of which use the REFER SIP request and the Refer-To header;
they are attended transfer and unattended transfer.
Attended call transfer
In this scenario, the user, who we assume is the transfer originator, puts the called
party on hold and establishes a call with the transfer target to alert them to the
impending transfer. The user then places the target on hold and proceeds with
the transfer using a Refer-To header whose URI carries an escaped Replaces header field.
The following figure depicts the attended call transfer from user B to user A:
[Diagram: attended call transfer among WebRTC users A, B, and C: media sessions and holds (no media session), 202 Accepted, Notify, and 200 OK responses]
In this scenario, the first handshake between A and B creates a new RTP session
and then puts it in a hold state. The second handshake between B and C creates a
new RTP session and puts it in a hold state too. Then, user B passes the credentials
of C to A using the REFER message, and A establishes a new RTP session with C.
Thereafter, C closes the session with B, and A notifies B about the new session
with C. Then, B closes the session with A.
Now, user A and user C are in a session successfully. Despite the BYE message sent
by user C, the dialog still exists until the subscription created by the REFER message
has terminated.
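In an attended transfer, the REFER message tells A which existing dialog the new call should replace. A common way to do this is to embed the Replaces value (Call-ID, to-tag, and from-tag of the dialog between B and C) as an escaped header in the Refer-To URI. The sketch below builds such a header; the function name and the sample identifiers are hypothetical.

```python
from urllib.parse import quote


def refer_to_with_replaces(target_uri, call_id, to_tag, from_tag):
    """Build a Refer-To header whose URI carries an escaped Replaces header."""
    replaces = f"{call_id};to-tag={to_tag};from-tag={from_tag}"
    # ';' and '=' must be percent-encoded inside the URI header component.
    return f"Refer-To: <{target_uri}?Replaces={quote(replaces, safe='')}>"


if __name__ == "__main__":
    print(refer_to_with_replaces(
        "sip:userC@domain.com", "abc-123", to_tag="t1", from_tag="f1"))
```

When A sends the resulting INVITE to C, the unescaped Replaces header lets C match it to the held dialog with B and swap the two.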
Unattended call transfer
In case of an unattended call-transfer scenario, the user provides the contact URI
of the target (the SIP URI or PSTN number) to the receiver. The receiver attempts to
establish a session using that contact and reports the results of this attempt.
The following diagram depicts a call flow for call transfer in an unattended manner:
[Diagram: unattended call transfer among WebRTC users A, B, and C: media session, Refer (c), 202 Accepted, 200 OK responses, new media session]
This scenario differs substantially from the attended call transfer, where the
parties were informed in advance about the call transfer. That is not the case here.
In this case, the call transfer takes place without putting any party on hold. Just like
in the previous case, here too, user A calls user B initially. Thereafter, B refers C to
A, and users B and A disconnect. Now, A is connected to C, which also replaces the
session. In this example, the Replaces header field is inserted into the Refer-To URI
to achieve unattended call transfer from B to C.
Generation of call log for tracking
Recording the call logs is a necessity for user overview and auditing purposes alike.
Call logs are stored with a unique transaction ID, timestamp, caller, receiver, and
duration of call.
We will look into the working principle behind generating call logs from the
Application Server program. To facilitate storing details about calls made and received,
the Application Server must refer to an external database entity. For every incoming
call, the Application Server either initiates a temporary counter till the end of a call
and then writes the values to the database or writes the values to the database in the
beginning itself and updates them once the call is terminated. To adopt the latter
approach, refer to the call-flow logic depicted in the following diagram:
[Diagram: call flow across two SIP Servers (SIP over WebSockets) with 180 Ringing and 200 OK; a timestamp is recorded between caller and recipient at call begin and again at call end, around the audio (RTP) session]
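The second approach, writing a row when the call begins and updating it when the call ends, can be sketched with an in-memory SQLite database standing in for the external database. The table layout and the function names are assumptions made for this example; timestamps are passed in explicitly so that the behavior is easy to follow.

```python
import sqlite3
import uuid


def open_call_log():
    """Create an in-memory database with a call_log table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE call_log ("
        "transaction_id TEXT PRIMARY KEY, caller TEXT, receiver TEXT, "
        "started REAL, ended REAL)"
    )
    return conn


def call_started(conn, caller, receiver, timestamp):
    """Insert a row at call begin and return its unique transaction ID."""
    transaction_id = uuid.uuid4().hex
    conn.execute(
        "INSERT INTO call_log VALUES (?, ?, ?, ?, NULL)",
        (transaction_id, caller, receiver, timestamp),
    )
    return transaction_id


def call_ended(conn, transaction_id, timestamp):
    """Update the row once the call is terminated."""
    conn.execute(
        "UPDATE call_log SET ended = ? WHERE transaction_id = ?",
        (timestamp, transaction_id),
    )


def call_duration(conn, transaction_id):
    """Return the duration in seconds, or None while the call is ongoing."""
    row = conn.execute(
        "SELECT ended - started FROM call_log WHERE transaction_id = ?",
        (transaction_id,),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    conn = open_call_log()
    tid = call_started(conn, "sip:userA@domain.com", "sip:userB@domain.com", 100.0)
    call_ended(conn, tid, 142.5)
    print(call_duration(conn, tid))
```

Writing the row at call begin means that a call interrupted by a server failure still leaves a trace, at the cost of one extra database write per call.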
Media Server-based features
Media Servers are used for the purpose of transcoding media streams; for example,
in the case of WebRTC, a Media Server converts audio/video from the WebRTC
standard codecs to other codecs understandable by legacy agents.
Media-related operations such as audio/video recording, playback, announcement,
and conferencing are handled via the Media Server. A few of the advanced use cases
such as Music on Hold, IVR, Video on Demand, and others are also configured
through the Media Server logic. Importantly, the Media Server also provides the
Media Resource Function (MRF), which processes real-time audio and video media
streams and forms a crucial part of the IMS architecture.
To play a simple announcement through the Media Server, the establishment of
a SIP session is requested. Similarly, record and playback operations, on part of the
Media Server, require the establishment of a SIP session, just as in case of a call.
The call flow for media announcement is depicted in the following diagram.
It is derived from Basic Network Media Services with SIP (RFC 4240).
[Diagram: media announcement call flow through the SIP Server (SIP over WebSockets) to the media server: 100 Trying, 200 OK, media announcement]
Media relay
Once the SIP signals have been transmitted successfully, a communication session
is established between the two WebRTC endpoints. By default, the media flows in
a peer-to-peer fashion. However, to enable media services such as transcoding,
recording, and text to speech, the media must not flow peer to peer. Instead, the
transmission should take place through a relay agent. We can use an RTP proxy
linked to a Media Server for this purpose.
An RTP Proxy can function in one of the following two modes based on configuration:
Basic proxying mode: In this mode, it does not alter the media stream
Functional mode: In this mode, the media parameters are altered to get the
best configuration
The following diagram shows the SIP call flow for media relay:
[Diagram: media relay call flow across two SIP Servers (SIP over WebSockets) with 180 Ringing and 200 OK; audio/video media passes through the media server]
The preceding diagram depicts a media relay operation through an RTP engine inside
a Media Server. The other signaling operations such as SIP Requests for INVITE, ACK,
BYE, and SIP Responses for 200 OK and 180 ringing messages are unaffected.
Voicemail enables a user to deliver a recorded voice message to another user.
Usually, this service is invoked when the receiver does not answer a call and
the caller is automatically redirected to voicemail. The voicemail application
records an audio message that is delivered to the receiver through their mailbox.
A diagrammatic description of voicemail components in the WebRTC setup is shown
in the following diagram:
The summarized working principle of voicemail is based on the call recording
and playback component of Media Server, which is sometimes also referred to
as the Voicemail Server.
The Application Server hosts a SIP Servlet (RecorderServlet) that functions as a
terminal. It handles an incoming call from Bob; the Application Server connects
it to a Voicemail Server, which hosts a VXML or MSML script. The Media Server
establishes an RTP session and records voice data as an audio file in the AVI format.
Alice receives the recorded voicemail file, which can be played by a SIP phone,
WebRTC client, or PSTN telephone.
Music on Hold
When a call is put on hold, the sender and receiver hear music playback instead of
a silent line; this service is called Music on Hold.
Music on Hold is established with the help of the Media Server. In more detail,
when an existing SIP session is put on hold, the SIP Application Server
connects the parties to the Media Server, which establishes a media session with the
user and plays back an audio file. The working principle behind Music on Hold is the
transfer of the call from one party to the Media Server. This allows for music playback
until the call is transferred back to the original communicating party.
Interactive Voice Response
An IVR is a prompt-like audio message that is played to convey information to the
user. IVR is a prerecorded message that is played on the occurrence of some event.
Consider the following table for some events and the IVR associated with them:
Event | IVR samples
When the recipient of a call is on another line | User is busy. Please try after some time.
On occasion of joining a conference | You are the first member in the conference
 | Member X has joined the conference
 | Member Y has left the conference
For announcements | Your balance is too low for the call
 | You are not allowed to call this number
For the main menu | Press 1 to enable call screening service
 | Press 2 to enable call forwarding service
 | Press 3 to go back to the main menu
As an IVR is a prerecorded message or a text-to-speech file, it is stored and
configured with the Media Server. When a user event that requires an IVR to be
played occurs, the SIP Application Server sends an INVITE to the Media Server, which
establishes a media session with the user and plays back the audio.
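The lookup from an event to its configured prompt can be sketched in Java as follows. This is a minimal illustration only: the IvrEvent and IvrPromptCatalog names and the audio file names are hypothetical, and the SIP INVITE towards the Media Server is not modeled here.

```java
import java.util.EnumMap;
import java.util.Map;

/** Events that trigger a prompt; the names are illustrative, not taken from the project. */
enum IvrEvent { CALLEE_BUSY, CONFERENCE_FIRST_MEMBER, LOW_BALANCE, MAIN_MENU }

/** Maps each event to the prerecorded audio file configured on the Media Server. */
class IvrPromptCatalog {
    private final Map<IvrEvent, String> audioFileByEvent = new EnumMap<>(IvrEvent.class);

    /** Registers (or replaces) the audio file to play for an event. */
    void configure(IvrEvent event, String audioFile) {
        audioFileByEvent.put(event, audioFile);
    }

    /** Returns the audio file for the event, or null when no prompt is configured. */
    String promptFor(IvrEvent event) {
        return audioFileByEvent.get(event);
    }
}
```

An Application Server could consult such a catalog when an event occurs and pass the resulting file name to the Media Server in whatever form its script (for example, VXML or MSML) expects.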
Conferencing connects multiple parties in an audio and/or video call such that everyone
has access to the ongoing media. All the members of the conference call can
simultaneously participate in the call.
Multipart communication
The working principle of multiparty audio/video call conferencing is based on the
playback ability of the Media Server. The following diagram depicts conferencing
between multiple parties:
For a dedicated conferencing app, services such as group chat, file sharing, private
chat, and desktop sharing are of prime importance. WebRTC and a Media Server alone
cannot provide all the capabilities required to build a team conference application.
The various ways of implementing a WebRTC-based conference will be discussed in
Chapter 10, Other WebRTC Use Cases.
Features of a web application
As WebRTC is most commonly delivered as a web-hosted service, it makes sense to take
advantage of its web nature and utilize its full potential as a web application. The
features discussed here are specific to WebRTC clients and are not applicable to
usual SIP agents such as hard phones and desktop-based soft clients. These
features include OAuth, Geolocation, RESTful web services to fetch information such
as news and weather updates, integration with social networking accounts,
importing contacts, sending web mails, and so on.
In this section, we will discuss the following features:
OAuth for authentication against third-party servers
Importing contacts from other accounts
Message to mail
The Geolocation API by W3C provides a method to locate the user's position. This
is useful in a number of ways, ranging from providing a user with location-specific
information/advertisement/search results to providing route navigation.
There are primarily four ways in which Geolocation is fetched from a computer:
IP Geolocation: Using this, each IP block corresponds roughly to a
geographical area, which often results in false positives.
GPS: Using GPS satellites, maximum accuracy of user location is obtained.
Wi-Fi Positioning: Using Wi-Fi networks and routers, especially in urban
areas (using Skyhook Wireless), user location is obtained.
Cell Tower Triangulation: This is based on the cellular signals that the user
gets from towers. This is mostly useful for mobile devices that have built-in
cellular radios.
We make use of IP Geolocation to track down the user's current location and share
the same whenever required. For example, consider the following code to fetch the
current position of the browser:
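if (navigator.geolocation)
    navigator.geolocation.getCurrentPosition(showPosition); // showPosition is a hypothetical callback; the rest of the snippet is not shown in the source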
The parameters that can be fetched from the Geolocation API embedded in a
browser are as follows:
Latitude: 28.459497
Longitude: 77.02663799999999
Accuracy: 25000
Altitude: Null
Altitude accuracy: Null
Heading: Null
Speed: Null
Note that these are arbitrary values.
Geolocation is a powerful feature for any application. In the context of
WebRTC clients, developers can use Geolocation to place the user's phonebook
contacts on a map according to their longitudes and latitudes.
The Geolocation database maintains user groups divided into different geographies
as per their current location. This can be used for many interesting use cases such
as targeted advertisements, currency conversion, language-based localization,
and policymaking.
With an outdated browser or a very slow uplink, the Geolocation service
might be unavailable. Also, if the user has not granted explicit permission in
the pop-up bar that asks to share location, the Geolocation service does not work.
The following screenshot shows the browser requesting permission to allow the
usage of the computer's location:
After fetching the Geolocation coordinates, it is up to the programmer to either
display them in a tabular format, store them for backend processing only, or display them
on a map on a web-based GUI.
The process to obtain the Geolocation coordinates through a WebRTC client and
display them on a map is outlined in the following diagram:
Let's briefly study Permission, Latency, and Error Handling in the
Geolocation service. The W3C Geolocation API specification explicitly requires the user to grant
permission to any web page that requests Geolocation information. Geolocation
is not instantaneous; it usually takes between 1 and 20 seconds. For this reason, the request for
Geolocation information is an asynchronous call.
Authenticating users with OAuth
The OAuth mechanism provides a safe and easy way to authenticate users. It does
away with application-specific password logins and introduces token-based authentication.
With social networking sites, OAuth works through a token that
remains valid as long as the other account's session is valid. Some
of the social networking accounts whose APIs can be used for OAuth-based logins
are as follows:
A diagram depicting the integration of OAuth API for two popular social
networking platforms with the WebRTC client login is shown as follows:
Programmers can set up a link from SIP URI to social networking ID for each
account, thus maintaining an account profile for every external account linked
with the WebRTC account.
Import contacts from other accounts
The phonebook holds added contact numbers and SIP URIs that the user wants
for easy access. Developers can also program the WebRTC application to have the
ability to merge the SIP contacts with other contacts imported from the user's social
networking accounts, such as Facebook and Google+.
This service works by linking the SIP profile
ID or SIP URI with the account information imported from other platforms such
as social networking or business networking sites. For example, as a user signs in
with their Facebook credentials through the OAuth app, their validated username
can be mapped to the SIP URI and stored in the backend database. The SIP URI
is mapped to the Facebook account ID or Google account ID; when these IDs are
referenced again, the SIP URI of the selected person can be fetched. This way, when
a user imports a friend from, say, Facebook or Google+, the IDs from these social
networking accounts are pulled in and matched against the data store for any existing
mapping to SIP account holders. When an entry is found, it is added to the user's
phonebook and synced with the contacts that already exist.
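The matching step described above can be sketched in Java as follows. This is a simplified illustration under stated assumptions: the AccountLinkStore and ContactImporter classes are hypothetical, the backend database is replaced by an in-memory map, and the OAuth calls that fetch the friend IDs from the social network are left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** In-memory stand-in for the backend table that links external account IDs to SIP URIs. */
class AccountLinkStore {
    // Key: "<provider>:<externalId>", for example "facebook:1001". Value: the linked SIP URI.
    private final Map<String, String> sipUriByExternalId = new HashMap<>();

    /** Records the link created when a user signs in through OAuth. */
    void link(String provider, String externalId, String sipUri) {
        sipUriByExternalId.put(provider + ":" + externalId, sipUri);
    }

    /** Returns the SIP URI linked to an external account, or null if none exists. */
    String findSipUri(String provider, String externalId) {
        return sipUriByExternalId.get(provider + ":" + externalId);
    }
}

/** Merges imported social contacts into a phonebook of SIP URIs. */
class ContactImporter {
    private final AccountLinkStore links;

    ContactImporter(AccountLinkStore links) {
        this.links = links;
    }

    /**
     * Looks up each imported friend ID and adds the matching SIP URI to a copy
     * of the phonebook. Friends without a linked SIP account are skipped, and
     * the set keeps contacts that already exist from being duplicated.
     */
    Set<String> importContacts(Set<String> phonebook, String provider, List<String> friendIds) {
        Set<String> merged = new TreeSet<>(phonebook);
        for (String friendId : friendIds) {
            String sipUri = links.findSipUri(provider, friendId);
            if (sipUri != null) {
                merged.add(sipUri);
            }
        }
        return merged;
    }
}
```

Keying the store by provider plus external ID keeps a Facebook ID and a Google ID with the same value from colliding; a relational table would typically express the same thing as a composite key.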
Advertisements in the WebRTC call
Advertisements are a good source of revenue for any service provider. They are
usually shown on a section of a web page. However, in the case of SIP and WebRTC,
we can show or play an advertisement in the interval between the caller
placing a call and the receiver answering it. This service is similar to the RingBack
Tone Advertisement.
The logic of the application lies in the process of playing an advertisement in the call
section of the WebRTC client web page, while there is a 180 Ringing status from
any party. This can be achieved in two ways:
The web application itself recognizes that there is a 100 trying or 180
ringing SIP response and displays the advertisement using the video src
element of HTML5. Consider the following code:
<video width="500" height="450" controls>
<source src="advertismentawalmart.mp4" type="video/mp4">
</video>
The SIP Server recognizes a 100 trying or 180 ringing SIP response and
connects the call to the Media Server, which plays the advertisement over RTP.
Delivering an instant message as a mail
The purpose of this feature is to send out additional messages, invites, and/or
reminders to users who are either unavailable at that point of time or not yet registered with
WebRTC.
The working principle of the mail service relies primarily on the Mail Server.
The user sends a text with a target mail ID over HTTP; the web application passes
this text over to the embedded SMTP client. The SMTP client authenticates itself with
the SMTP server and sends the text in the message body, substituting the target mail
ID in the To header. It can be configured to add a preset subject or a customizable
subject as well. For example, if user A wishes to send a WebRTC invite to user
B, who is not currently registered with the WebRTC client base, they can send a
preset invitation to user B's mail inbox from the WebRTC client itself. The following
diagram shows the mail service integrated with the WebRTC client:
The preceding diagram shows a general SMTP message delivery sequence.
By importing specific e-mail libraries in the WebRTC client application,
this feature can be easily developed for the WebRTC communication tool.
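As a rough sketch of the message-building step only, the following Java class composes the text of a mail with the target mail ID in the To header and a preset subject as the fallback. The MailMessageBuilder name, the default subject, and the addresses are hypothetical; authentication and delivery would be handled by an SMTP library, which is not shown here.

```java
/** Builds the text of a simple mail message; sending it is left to an SMTP library. */
class MailMessageBuilder {
    // Hypothetical preset subject used when the caller does not supply one.
    static final String DEFAULT_SUBJECT = "Invitation to join a WebRTC call";

    /**
     * Returns the header block, a blank line, and the body, separated by CRLF
     * as mail messages require. A null or empty subject falls back to the preset.
     */
    static String compose(String from, String to, String subject, String text) {
        String effectiveSubject =
                (subject == null || subject.trim().isEmpty()) ? DEFAULT_SUBJECT : subject;
        rejectLineBreaks(from);
        rejectLineBreaks(to);
        rejectLineBreaks(effectiveSubject);
        return "From: " + from + "\r\n"
                + "To: " + to + "\r\n"
                + "Subject: " + effectiveSubject + "\r\n"
                + "\r\n"
                + text + "\r\n";
    }

    /** Guards against header injection: user-supplied header values must stay on one line. */
    private static void rejectLineBreaks(String headerValue) {
        if (headerValue == null || headerValue.contains("\r") || headerValue.contains("\n")) {
            throw new IllegalArgumentException("Header values must be single-line and non-null");
        }
    }
}
```

Because the target mail ID arrives from the browser over HTTP, the line-break check matters: without it, a crafted address could add extra headers to the outgoing mail.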
The admin console
An administrator must have the privilege to view WebRTC usage statistics
in addition to framing policies and guidelines for correct use. To enable this,
a developer must add admin logic and an admin console page for the administrator's
accounts. The logs accumulated from calls, messages, voicemails, and so on should
be shown in tabular and graphical formats. The graphical formats might include
pie charts to depict call share status and bar charts to monitor the activity levels per
hour. Refer to the following image that shows the charts for the admin console:
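The data behind such charts can be derived from the call logs with a simple aggregation. The following Java sketch assumes a hypothetical CallLogEntry class and status values; it only produces the counts, and drawing the charts is left to whichever charting library the console uses.

```java
import java.time.LocalDateTime;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** One row of the call log; the field names are illustrative. */
class CallLogEntry {
    final String callerUri;
    final String calledUri;
    final LocalDateTime startedAt;
    final String status; // for example "answered", "missed", or "failed"

    CallLogEntry(String callerUri, String calledUri, LocalDateTime startedAt, String status) {
        this.callerUri = callerUri;
        this.calledUri = calledUri;
        this.startedAt = startedAt;
        this.status = status;
    }
}

/** Aggregations that feed the admin console charts. */
class UsageStatistics {
    /** Counts calls per hour of day (0 to 23); suitable for an activity bar chart. */
    static Map<Integer, Integer> callsPerHour(List<CallLogEntry> logs) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (CallLogEntry entry : logs) {
            counts.merge(entry.startedAt.getHour(), 1, Integer::sum);
        }
        return counts;
    }

    /** Counts calls per status; suitable for a call share pie chart. */
    static Map<String, Integer> callsByStatus(List<CallLogEntry> logs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (CallLogEntry entry : logs) {
            counts.merge(entry.status, 1, Integer::sum);
        }
        return counts;
    }
}
```

Hours with no calls are simply absent from the first map, so a chart that needs all 24 bars would fill in zeros for the missing keys.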
These were some suggested features that might be developed and integrated
with the WebRTC project. Developers are encouraged to
create further features and integrate them with the WebRTC platform to
enhance interactivity or increase productivity.
In this chapter, we discussed some features that a communication client is expected
to have. They range from the default SIP features such as registration, call, Presence,
and message to enhanced applications such as call screening, call forwarding, and
call transfer. Media-related services that are basic in nature, such as media relay,
announcement, voicemail, conferencing, and Music on Hold, were also described.
In addition, the services offered by a typical web application, such as
OAuth, Geolocation, admin console, and advertisements, were also touched upon.
In the next chapter, we will study the process of developing a WebRTC client in
the best industry-adopted frameworks, which include Struts and Spring MVC in
addition to a simple JSP/Servlet web project.
WebRTC with Industry Standard Frameworks
In Chapter 2, Making a Standalone WebRTC Communication Client, we saw how to
build a web page, purely based on HTML, JavaScript, and CSS, that is capable
of WebRTC-based SIP communication. However, to integrate WebRTC with an
enterprise- or consumer-based application, it is essential that we envelop the
WebRTC technology in a web-based application project. This chapter takes us
through the process of actually developing the WebRTC web client application.
A service provider or network operator hopes to benefit from WebRTC by
extending it as another communication endpoint. Not only this, WebRTC also gives
a new dimension to IP telephony by enabling any service provider to integrate the
click-to-call service directly into their website. However, the WebRTC solution will
be unprofitable for a Telecom Service Provider if it isn't resilient, scalable, and able
to integrate with the operator's existing infrastructure. As we know, WebRTC
standards only describe media capture and streaming mechanisms. To provide for
signaling, we will use SIP APIs from sipML5 (refer to https://code.google.com/p/sipml5/).
In this chapter, we will learn how to develop a web communicator
project with WebRTC support.
The Multitier architecture
An efficient application is composed of multiple tiers. Tiers are used to isolate the
functionality of the application between different sections. Broadly, the structure
comprises three tiers:
Presentation Tier: This tier deals with the GUI through which the client
interacts with the application. It is usually a set of HTML elements with
other frontend technologies such as JavaScript for scripting logic and CSS for
design format. It will be loaded into the browser as a web page, for example,
login.html, home.html, and so on.
Logic Tier: The middle tier, also known as the Application Tier, is
responsible for processing logic, obtaining values from the Data Tier,
and delivering the results to the web engine.
Data Tier: The database and repositories that hold data values and files are
referred to as the Data Tier.
The following diagram shows the different layers of a Multitier architecture:
[Diagram: Multitier architecture showing the Presentation Tier (scripts, CSS), the Logic Tier, and the Data Tier (database, file repo)]
Now, we shall study the Software Development Life Cycle (SDLC) of a WebRTC
client application in detail.
The design of a WebRTC client
Design is usually the activity planned in the first stages of the SDLC; later, it's followed
by development and testing (this is described later in the chapter).
Unified Modeling Language (UML) diagrams aid in the design process by creating an
abstract view of the system. UML diagrams can be used to do the following:
Visually represent a system/project
Communicate one idea or model to other parties
Generate code directly, using specific tools
There are various kinds of modeling diagrams, such as the Class diagram,
ER diagram, Use case diagram, and others. They give a graphical overview of the
project that is about to begin. This section presents some critical design diagrams
for the WebRTC client web project.
The Class diagram
A Class diagram is used to visualize the overall structure of data organization in the
Data Tier. The main classes used in the WebRTC client project are:
UserDetails: This class contains the main fields for user identification and
registration, such as SIP URI, private identity, domain, display name, and
password. These values are required to register a user with the SIP Registrar so
that they can make use of SIP services such as call, message, subscribe, and notify.
CallLogs: This class records the information of every incoming,
outgoing, missed, or failed call for every user in their history of transactions. The
information includes caller and called SIP URI, date/time stamp, and call ID.
MessageLogs: This class holds the records for all messages sent and received
by the user. Similar to call logs, this class also contains member variables
such as the sender and receiver SIP URI, date/time stamp, and message ID.
OtherAccount: Since the WebRTC client also interacts with third-party social
networking platforms and services, this table takes care of mapping between
SIP URI and other account IDs. For now, it includes fields for Google,
Facebook, Yahoo, and Twitter.
Geolocation: The Geolocation coordinates of a user obtained through their
browser's Geolocation API, mobile phone's GPS, or by any other means are
stored in this table. The values are mapped to a user's unique SIP URI for
later referencing. The fields include SIP URI, latitude, longitude, date/time
stamp, and so on. In this project, a user's new Geolocation values
overwrite the existing ones.
Conferencing: Since WebRTC supports a multiparty conferencing feature,
this table keeps a record of all conferences. The values include
conference name, conference ID, host URI, members URI, and sequence.
The host URI denotes the SIP URI of the user who is the host of the
conference, and the members URI contains a list of SIP URIs of users who
are guests in the conference.
Notification: Notifications such as missed calls or conference
call invitations are stored in this table along with a date/time stamp.
Voicemail: Voicemails are audio message files that are sent when a user
is unavailable to receive messages or calls. This table contains links to
voicemails and their associated sender's SIP URI with a date/time stamp.
OfflineMessages: Some messages are not delivered through the SIP
Instant Message service on account of the user getting disconnected or being
unavailable. Such messages can be sent directly to the user's mailbox using
the SMTP gateway. The record of such messages or mails is kept in this table.
Phonebook: The quick contacts or friends of users are stored in this table for
easy reference. The SIP URI is used to link some values of the user details
table to appear here.
Presence: The online or offline status of the user is stored in this table.
The value is used to inform others about the availability or unavailability
of this user.
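To illustrate how one of these classes behaves, the following Java sketch models the Geolocation entry keyed by SIP URI, with new values overwriting the existing ones as described in the list. The GeolocationRecord and GeolocationStore names are hypothetical, and an in-memory map stands in for the database table.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;

/** One Geolocation row: the coordinates of a user at a point in time. */
class GeolocationRecord {
    final String sipUri;
    final double latitude;
    final double longitude;
    final Instant recordedAt;

    GeolocationRecord(String sipUri, double latitude, double longitude, Instant recordedAt) {
        this.sipUri = sipUri;
        this.latitude = latitude;
        this.longitude = longitude;
        this.recordedAt = recordedAt;
    }
}

/** Keeps only the latest record per SIP URI; a new value replaces the old one. */
class GeolocationStore {
    private final Map<String, GeolocationRecord> latestBySipUri = new HashMap<>();

    /** Saves the record, overwriting any earlier record for the same SIP URI. */
    void save(GeolocationRecord record) {
        latestBySipUri.put(record.sipUri, record);
    }

    /** Returns the latest record for the SIP URI, or null if none was saved. */
    GeolocationRecord find(String sipUri) {
        return latestBySipUri.get(sipUri);
    }

    /** Number of users with a stored location. */
    int size() {
        return latestBySipUri.size();
    }
}
```

In a relational table, the same overwrite behavior usually follows from making the SIP URI the primary key and updating the row when it already exists.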
The following screenshot shows the Hibernate mapping Class diagrams for the
WebRTC web application:
The preceding screenshot shows the Hibernate mapping implemented between the classes
in the WebRTC client web application project and the database tables. For example, the
MNotification database table and its entities point to the web project's
com.webrtc.model.MNotifcations class and its member variables. Please note that it does not
depict all the tables and their respective Hibernate mappings. Many classes and tables
such as voicemails, user details, and offline messages are not visible.
The Entity Relationship model
The Entity Relationship (ER) model describes the relationship between various
logical blocks of programs. It is a conceptual data model that views the real world as
entities and relationships. For example, in a small-scale WebRTC project, the SIP URI
is adopted as the primary key for uniquely identifying all other values associated
with a user. A user is allowed only one unique SIP URI, which they use to log in to
the WebRTC client application and register themselves with the Telecom SIP Registrar.
The following screenshot shows the ER diagram for the WebRTC web application:
In the preceding screenshot, the field SIP URI is linked to other tables as a foreign
key. An ER diagram is often used in database design. It lets the developers
quickly get started on defining the database structure in a DBMS. A few shortcomings
of ER modeling include the lack of proper standards and a high level of depiction.
The environment setup
The environment setup for building a WebRTC web project is the first step in the
development stage. Since our development is going to be in Java, it is essential to
have the Java application development tools installed, which include:
Eclipse IDE
Web application server such as JBoss/Apache Tomcat
The database setup also requires a server and client installation. Let's study all these
setups one by one.
Java Runtime Environment (JRE)
If one does not want to use a standard IDE but would rather build
and test Java programs with basic tools, they need to have the JRE on the system. The JRE
contains the Java Virtual Machine (JVM). The entire Java Development Kit
(JDK) containing the JRE can be downloaded from http://www.oracle.com/
Note that, though Java is platform-independent, the JRE is not. Therefore, one must
be careful to download the specific JDK or JRE that is supported on the machine's
operating system and bit version. Furthermore, after installation, the JAVA_HOME
environment variable must point to the installation directory.
Integrated Development Environment with
Java Enterprise Edition (EE)
An Integrated Development Environment (IDE) is used by programmers to
develop Java applications. We can integrate any programming language model
with WebRTC. Here, we shall be using Eclipse IDE for Java EE Developers.
The following screenshot shows the Eclipse Kepler IDE for Java EE projects:
The preceding screenshot depicts a new workspace in the Eclipse Kepler IDE.
Note that a standard Eclipse installation does not support advanced
or web-based capabilities by default. The Web Tools Platform
(WTP) must be installed explicitly to develop web applications. More information on Eclipse WTP
can be found at https://www.eclipse.org/webtools/.
We know that a database contains tables holding the records of user details as well
as associated records from other tables using key mapping. Some simple options to
choose from are PostgreSQL, Oracle, and MySQL Database Management System
(DBMS). We have used MySQL 5.5 Server and MySQL Workbench to access the
database. The following screenshot shows the MySQL Workbench connected to the
MySQL Server instance for the WebRTC web application:
The preceding screenshot depicts a MySQL Workbench opened with a connection
instance to the WebRTC database on a MySQL server. The MySQL components
can be downloaded from http://dev.mysql.com/downloads/.
The web application server
The web application server is a container for WebRTC client application projects.
We can use JBoss developed by Red Hat, or Apache Tomcat, or any other server
capable of deploying a WAR file. We shall make use of Tomcat v7.0 Server for
this purpose.
The web application infrastructure
In this section, we are going to make the WebRTC client suited to programming
frameworks. We shall see the three most popular alternative approaches to develop
a WebRTC client application. They are as follows:
JSP- / Servlet-based WebRTC web project: These are small-scale
applications usually employed for Proof of Concept (POC) building.
Struts- / Hibernate-based WebRTC web project: This framework is
used for more scalable and organized applications. Hibernate employs
data abstraction.
Spring 3 MVC-based WebRTC web project: This framework is the most
preferred one for the development of rich and enterprise-grade WebRTC
client application projects.
Here, we shall discuss the best framework suited to our use and applicability of
WebRTC clients.
JSP- / Servlet-based WebRTC web project
In the early days of POC building, one can decide to use the simplest of
approaches for testing and verifying whether WebRTC really meets the expected
performance requirements. With tight deadlines in mind, it is natural
to proceed with whatever looks like the shortest way to a demonstrable, workable
WebRTC client. A typical JSP- / Servlet-based dynamic web project is the most
viable option to test the preliminary functioning of the WebRTC functions.
Some of the advantages of JSP- / Servlet-based MVC architecture for WebRTC
developments are as follows:
Quick development: This does not require thorough design and is ideal for
small applications with light processing
Easy to deploy and debug: The modules are divided into only three source
folders: DAO, Model, and Controllers
Programming the JSP- / Servlet-based web project
The JSP- / Servlet-based application architecture has quick buildup time and
doesn't require detailed design structure. However, it must be noted that it leads to
complexity and becomes hard to alter once the size and number of modules begin to
increase beyond a point. The components of a JSP Servlet web project are as follows:
Deployment descriptor: This describes the classes, resources, and the
configuration of the application and how the web server uses them to serve
web requests.
Controller: The Servlet acts as a controller that is responsible for processing
requests and creating any beans needed by the JSP page. It also decides
which requests need to be passed to which JSP page.
Model: The classes here are composed of the declaration of variables and
their general getters, setters, and constructors.
DAO: The DAO classes are responsible for invoking the database connection
object and performing Create Read Update Delete (CRUD) operations on
the records stored in the database. The Java Database Connectivity (JDBC)
technology is employed in this simple example.
View: This comprises only the visual elements on a page. They consist of
HTML and JSP pages.
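The split between the Model and DAO components can be sketched as follows. This is an illustration under stated assumptions: the UserAccount and UserAccountDao names are hypothetical, and an in-memory implementation stands in for the JDBC-backed one so that the sketch runs without a database. A JDBC implementation would provide the same four operations using SQL statements.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Model: plain fields with a constructor, getters, and setters. */
class UserAccount {
    private String sipUri;
    private String displayName;

    UserAccount(String sipUri, String displayName) {
        this.sipUri = sipUri;
        this.displayName = displayName;
    }

    String getSipUri() { return sipUri; }
    void setSipUri(String sipUri) { this.sipUri = sipUri; }
    String getDisplayName() { return displayName; }
    void setDisplayName(String displayName) { this.displayName = displayName; }
}

/** DAO contract: the four CRUD operations, independent of the storage technology. */
interface UserAccountDao {
    boolean create(UserAccount account);
    UserAccount read(String sipUri);
    boolean update(UserAccount account);
    boolean delete(String sipUri);
}

/** In-memory implementation, used here only so that the sketch runs without MySQL. */
class InMemoryUserAccountDao implements UserAccountDao {
    private final Map<String, UserAccount> rows = new LinkedHashMap<>();

    /** Inserts the account; returns false if the SIP URI is already taken. */
    public boolean create(UserAccount account) {
        return rows.putIfAbsent(account.getSipUri(), account) == null;
    }

    /** Returns the account for the SIP URI, or null if it does not exist. */
    public UserAccount read(String sipUri) {
        return rows.get(sipUri);
    }

    /** Replaces an existing account; returns false if there is nothing to update. */
    public boolean update(UserAccount account) {
        return rows.replace(account.getSipUri(), account) != null;
    }

    /** Removes the account; returns false if it did not exist. */
    public boolean delete(String sipUri) {
        return rows.remove(sipUri) != null;
    }
}
```

Keeping the controller dependent on the interface rather than on a concrete class is one way to swap the storage later, for example when moving from plain JDBC to Hibernate.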
The overall architecture of WebRTC web project POC, based on JSP- / Servlet-based
design, is depicted in the following diagram:
[Diagram: JSP- / Servlet-based architecture showing the Controller, the View (JSP pages with JS and CSS), the bean classes (getters and setters), and JDBC access to MySQL]
The development of modules
There are only a handful of project components in JSP- / Servlet-based web
project development. All of them are defined with their individual roles with
no or minimal dependency on the other modules. We shall cover three prominent
modules in this section:
The User Account module
The Communication module
The Phonebook module
The Call module is an HTML/JSP page-driven mechanism that does not depend on
Java programming. It has been provided in Chapter 2, Making a Standalone WebRTC
Communication Client. The flow between the View and Controller classes from the
process of logging in to the process of displaying the home page with user-specific
data is depicted in the following diagram:
[Diagram: Login flow on the Web Server between the Login Controller (SIPURI authentication, Match Yes / Match No) and the Call Controller (pass values to call page)]
Initially, the user lands on the Login.jsp page where they are required to either
register for a new account or log in with their credentials. A new user must create
an account so that the database is populated with their SIP entities, such as domain,
display name, and private identity, and any subsequent login with the SIP URI can
fetch their existing details. Once the account is created and the user has entered their
credentials to log in, the Login Controller Servlet consults the database to ascertain
whether the entered username and password are correct. If they are
incorrect, the user is returned to the login page. If the user has entered the correct
username and password, they are redirected to the Call.jsp page where they can
make or receive calls.
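The decision made by the login controller can be reduced to a small piece of logic that is independent of the Servlet API. The following Java sketch is an assumption-laden illustration: the LoginFlow class is hypothetical, an in-memory map stands in for the database, and passwords are compared as plain strings only to keep the example short (a real deployment would store salted hashes).

```java
import java.util.HashMap;
import java.util.Map;

/** Decides which page the user sees next, based on the submitted credentials. */
class LoginFlow {
    static final String LOGIN_PAGE = "Login.jsp";
    static final String CALL_PAGE = "Call.jsp";

    // Stand-in for the user table; plain-text passwords are for illustration only.
    private final Map<String, String> passwordBySipUri = new HashMap<>();

    /** Creates an account; returns false if the SIP URI is already registered. */
    boolean register(String sipUri, String password) {
        return passwordBySipUri.putIfAbsent(sipUri, password) == null;
    }

    /** Returns the call page on a credential match, otherwise the login page. */
    String nextPage(String sipUri, String password) {
        String stored = passwordBySipUri.get(sipUri);
        boolean match = stored != null && stored.equals(password);
        return match ? CALL_PAGE : LOGIN_PAGE;
    }
}
```

In a Servlet, the returned page name would typically be passed to a redirect or a request dispatcher; keeping the decision in a plain class makes it testable without a running container.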
The steps to build a JSP- / Servlet-based WebRTC application are as follows:
1. Create a dynamic web project in Eclipse; let's assume we name it
2. Create the Controller Servlet inside the src folder. Make sure the Servlets
are mapped in web.xml.
3. Create JSP pages named call.jsp and login.jsp inside the WebContent
folder. These are the frontend bodies that bear the login and call functionality.
The User Account module
The User Account module holds the SIP entities, such as SIP private identity, public
identity, domain, password, and display name, that the user is registered with initially.
Later, the user uses these values to log in and make calls to other WebRTC users.
The logic for the login and registration processes must be programmed to enable
new user registration after the user has filled in the registration form. Users who already
have an account should be able to log in with their username and password. The
following code snippet is the Servlet implementation class named loginServlet:
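import java.io.IOException;
import java.io.PrintWriter;
import javax.servlet.ServletException;
import javax.servlet.http.*;
public class loginServlet extends HttpServlet {
    public loginServlet() { super(); }
    protected void doPost(HttpServletRequest request,
            HttpServletResponse response) throws ServletException, IOException {
        PrintWriter out = response.getWriter();
        String privateIdentity = null, sipuri = null;
        String userName = request.getParameter("userName");
        String password = request.getParameter("password");
        // The case labels and most branch bodies are not shown in the source.
        switch (request.getParameter("processtag")) {
        case "login": // hypothetical tag value
            webrtclogin wl = new webrtclogin();
            // ... login handling is not shown in the source
            break;
        case "register": // hypothetical tag value
            registration reg = new registration();
            LoginDao dao = new LoginDao();
            if (dao.register(reg)) {
                // ... registration handling is not shown in the source
            }
            break;
        }
    }
}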
The following screenshot is the GUI representation of the Registration/Login page:
In the preceding screenshot, the top section has a registration form with input
boxes for Display Name, Authorization SIPURI, Private Identity, Password,
and Domain. The lower section has a login form that is only for registered users
to log in to the WebRTC client using their Authorization SIPURI and Password.
The Communication module
In a WebRTC client application program, the majority of the media-related tasks are
handled by the browser's WebRTC media stack, and signaling is handled by the SIP stack
(in our case, the sipML5 JavaScript library). Our responsibility is to record the logs
and render the AJAX calls to SIP invite methods. The module is also concerned with the display
of an appropriate status message if a call is not connected. When a call is successfully
established between two parties, the video window with the local and remote party's
captured video as well as the captured audio must be presented on the frontend
JSP page.
The following screenshot is the GUI representation of the Call and Message page:
In the preceding screenshot, the upper section is for messaging functions. It is
made up of three elements: a textbox for the SIP URI of the other party in the message
conversation, a textbox for the current message being sent from our side, and a text area
to display the most recent messages sent and received between the two parties.
The lower section is for making and receiving calls. When a user is trying to make a
call or receive a call, a drop-down window appears in this section that has the local
and remote party's media elements such as audio and video. Furthermore, the call
can be either of two types for which two buttons are present: Audio Call (for audio)
and Video Call (for video).
The Phonebook module
The Phonebook module can be used to store the SIP URIs of other users for quick
reference. The current presence status of each user is indicated by a green (user
online) or red (user unavailable) symbol adjacent to their SIP URI.
The following code snippet is the Servlet implementation of the
FriendListController class for the Phonebook module:
public class FriendListController extends HttpServlet {
    private static final long serialVersionUID = 1L;

    public FriendListController() {
        // ...
    }

    protected void doGet(HttpServletRequest request,
            HttpServletResponse response) throws ServletException,
            IOException {
        String action = request.getParameter("action");
        HttpSession session = request.getSession();
        String username = (String) session.getAttribute("name");
        String friendUri = request.getParameter("friendName");
        // ... (the first branch of the action check is elided in this listing)
        else if (action.equalsIgnoreCase("addFriendURI")) {
            String sipURI = request.getParameter("friendName");
            // ...
        }
    }
}
The following screenshot is the GUI representation of the Phonebook page:
The preceding screenshot depicts a Phonebook page that holds the contacts that the
user has added for future reference. There are three important elements in
a phonebook interface, as follows:
• The SIP URIs of the users that are added to the phonebook
• A textbox to add a new SIP URI to the phonebook
• The offline/online status of each user, depicted by a red/green symbol
  alongside their SIP URI
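The three phonebook elements listed above can be modeled with a small in-memory structure. This is a sketch under stated assumptions, not the project's FriendListController logic: the Phonebook class, its method names, and the simple "sip:" prefix check are hypothetical, and a real implementation would persist contacts in the database and obtain presence from the SIP server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical in-memory phonebook: SIP URIs with a presence flag.
// Not thread-safe; intended only to illustrate the data model.
final class Phonebook {
    enum Presence { ONLINE, OFFLINE }

    // LinkedHashMap keeps contacts in the order they were added.
    private final Map<String, Presence> contacts = new LinkedHashMap<>();

    // Adds a contact; returns false for a malformed or duplicate URI.
    boolean addFriend(String sipUri) {
        if (sipUri == null || !sipUri.startsWith("sip:")
                || contacts.containsKey(sipUri)) {
            return false;
        }
        contacts.put(sipUri, Presence.OFFLINE); // unknown until updated
        return true;
    }

    // Updates presence for a known contact; unknown URIs are ignored.
    void setPresence(String sipUri, Presence presence) {
        if (contacts.containsKey(sipUri)) {
            contacts.put(sipUri, presence);
        }
    }

    // Color of the symbol shown next to the URI; unknown URIs map to red.
    String indicatorFor(String sipUri) {
        return contacts.get(sipUri) == Presence.ONLINE ? "green" : "red";
    }
}
```

Keeping the color decision in one method means the JSP page only has to render whatever indicatorFor returns for each contact.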
The primary JAR files used are servlet-api.jar for Servlet support and
mysql-connector-java-5.0.8-bin.jar for MySQL connectivity. The scalability of this
WebRTC client POC depends on factors such as server CPU, server memory, bandwidth,
and so on.
The following screenshot shows the Project Explorer window after completion of the
WebRTC client POC project:
The project can be run on any Servlet-capable server such as Tomcat, JBoss, and
others. Some substantial limitations of using only the JSP/Servlet pattern for a
WebRTC project are:
• The properties file and file I/O are slow and outdated. However, to avoid slow
  disk access, properties can be defined using environment variables.
• JavaScript-based validation can be easily bypassed, so input that relies on it
  alone is easily subject to SQL injection.
• Multithreading issues are not handled in the existing JSP/Servlet project.
  This might lead to memory leaks before the lifetime of the connection objects
  ends. Also, buffers might overflow with garbage values.
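The environment-variable approach mentioned in the first limitation can be sketched as follows. This is a minimal illustration, not the project's code: the EnvConfig class and the variable names used in the usage note are hypothetical. The environment map is injected so that the lookup can be exercised without changing the real process environment.

```java
import java.util.Map;

// Hypothetical configuration reader: looks up settings in environment
// variables and falls back to a default when a variable is not set.
final class EnvConfig {
    private final Map<String, String> env;

    // The map is injected so that tests can supply their own values.
    EnvConfig(Map<String, String> env) {
        this.env = env;
    }

    // Convenience factory that reads the real process environment.
    static EnvConfig fromSystem() {
        return new EnvConfig(System.getenv());
    }

    // Returns the variable's value, or the fallback if unset or empty.
    String get(String name, String fallback) {
        String value = env.get(name);
        return (value == null || value.isEmpty()) ? fallback : value;
    }
}
```

For example, EnvConfig.fromSystem().get("WEBRTC_DB_USER", "root") would return the value of the (assumed) WEBRTC_DB_USER variable, or "root" when it is not defined, without touching the disk.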
Struts- / Hibernate-based WebRTC web project
The Struts framework is well suited to WebRTC client development when an agile,
dependable communication client is required. It does not offer as many plugin
options as Spring, but it reaches the intended goal quickly and in an organized
manner.
Moving to Struts 2.0 with Hibernate support is usually the next phase in the
evolution of a JSP- / Servlet-based MVC architecture. It is a major step ahead of
the previous architecture, which begins to look like a ball of entangled threads as
more classes and functions are added. The easy modularization approach of the
Struts framework removes much of that confusion and complexity.
Hibernate tools help developers by acting as code generation tools. They are used
to generate Hibernate applications very quickly, with the ORM mapping expressed
through XML files, dialects, annotations, and so on. Most importantly, they are
database-independent. Therefore, switching the database from MySQL to Oracle or any
other DBMS at any time is an easy job, as Hibernate acts as a database abstraction
layer between the project and the database. It also has its own table modification
tools and query language (Hibernate Query Language (HQL)).
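To make the database-independence point concrete, the following sketch shows that switching from MySQL to Oracle mainly means changing the driver class, the JDBC URL, and the dialect. It uses only java.util.Properties and does not call Hibernate itself. The HibernateSettings class is hypothetical, and the driver and dialect class names are the ones commonly used with MySQL Connector/J 5.x and older Hibernate releases; they vary between versions, so treat them as assumptions to check against the libraries actually in use.

```java
import java.util.Properties;

// Hypothetical helper: builds the connection-related Hibernate settings
// for a chosen database so that only this choice changes on a migration.
final class HibernateSettings {
    enum Database {
        // Driver and dialect names are version-dependent assumptions.
        MYSQL("com.mysql.jdbc.Driver",
              "org.hibernate.dialect.MySQLDialect"),
        ORACLE("oracle.jdbc.OracleDriver",
               "org.hibernate.dialect.Oracle10gDialect");

        final String driverClass;
        final String dialect;

        Database(String driverClass, String dialect) {
            this.driverClass = driverClass;
            this.dialect = dialect;
        }
    }

    private HibernateSettings() {
    }

    // Returns the properties that differ between database vendors.
    static Properties forDatabase(Database db, String jdbcUrl) {
        Properties props = new Properties();
        props.setProperty("hibernate.connection.driver_class", db.driverClass);
        props.setProperty("hibernate.connection.url", jdbcUrl);
        props.setProperty("hibernate.dialect", db.dialect);
        return props;
    }
}
```

The mapped entity classes and HQL queries stay the same in both cases; only the values returned by forDatabase differ.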
Apache Struts is a free, open source MVC framework for creating modern Java
web applications. It is extensible, uses a plugin architecture, and ships with plugins
to support REST, AJAX, and JSON. It provides web request validation, UI tags,
and action forms, which allow extensive validation without JavaScript. Because
application code and internal URLs are not visible on the page, it is also harder
to attack, although this alone does not make an application secure.
Programming the Struts- / Hibernate-based web project structure
Struts uses the Model 2 architecture. The Struts Action Servlet controls the
navigation flow. Struts classes, such as Action, are used to access the business
logic classes named Service. When the Action Servlet receives a request from the
container, it uses the request URI or path to determine what action will be used to
handle the request. The Action Servlet can verify the input, retrieve information from
a database, or perform data processing in the business layer. For more information,
refer to https://struts.apache.org.
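The dispatch step described above, where the request path selects the action that handles the request, can be illustrated without the framework. The following is a plain-Java sketch of the idea and not the Struts API: the MiniFrontController class and the paths in the usage note are hypothetical. The returned strings imitate the result names ("success", "input", "error") that Struts 2 actions conventionally return.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical front controller: maps a request path to an action and
// returns the action's result name, which would select the view.
final class MiniFrontController {
    // Each action receives the request parameters and returns a result name.
    private final Map<String, Function<Map<String, String>, String>> actions =
            new HashMap<>();

    // Registers the action that handles the given path.
    void register(String path, Function<Map<String, String>, String> action) {
        actions.put(path, action);
    }

    // Looks up the action by path; unknown paths yield "error".
    String handle(String path, Map<String, String> params) {
        Function<Map<String, String>, String> action = actions.get(path);
        if (action == null) {
            return "error";
        }
        return action.apply(params);
    }
}
```

A login action could be registered under "/login" and return "input" when the userName parameter is missing, which mirrors how input verification and navigation are kept in the controller layer rather than in the JSP pages.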
The overall architecture of the WebRTC application with the Struts framework is
shown in the following diagram:
[Diagram: architecture of the Struts-based WebRTC application. Labels recoverable
from the figure: Presentation layer, Business Logic layer, Data Access, Data Layer,
JSP pages, Struts tags, bean and action forms, Struts action classes, dispatch filter,
Action, ServletActionContext, Apache Struts2, configuration files, plugins, log4j audit,
gdata and OAuth APIs, triggers and procedures, Kamailio, Node.js.]
The server machine depicted in the preceding diagram runs Red Hat Enterprise
Linux 6 (RHEL 6); however, any comparable or newer system can be used in its place.
The components of the Business Logic Layer, Presentation Layer, and Data Access
Layer are shown separately. Add-on features such as SMTP for e-mail, log4j for
logging, and sipML5 for SIP signaling are also depicted. The database (MySQL server),
SIP signaling server (Kamailio), and WebSocket conferencing servers (Node.js) are
shown in the lower part of the diagram.
The development of modules
Since the modules for Login, Account, and Call Control have already been discussed
in the previous section, JSP- / Servlet-based WebRTC web project, we shall cover the
OtherAccount module in this part. The OtherAccount module, as the name suggests,
links a user's third-party accounts, such as Google, Facebook, and Yahoo accounts,
with their WebRTC SIP account. This is used for extended functionality of the
WebRTC application, such as third-party login using OAuth, sending e-mail from the
WebRTC account to offline users, importing contacts from other accounts, and so on.
The steps to build a Struts2-based WebRTC application are listed as follows:
1. Create a dynamic web project in Eclipse. Let's assume we name it
WebRTCframework. In the deployment descriptor (web.xml), define an entry
for the FilterDispatcher class as follows:
The mapping of the Struts dispatcher to the /* pattern allows it to handle
all the incoming requests. The welcome-file page is the first page that is
displayed when the web project is accessed.
2. Create a hibernate.cfg file to access and manage the database connection
as follows:
<property name="hibernate.connection.driver_class">
<property name="hibernate.connection.url">
<property name="hibernate.connection.username">
<property name="connection.password"> </property>
<property name="connection.pool_size">5</property>
<property name="hibernate.dialect">