
Virtualization of Video Streaming Functions



Saarland University
Faculty of Natural Sciences and Technology I

Department of Computer Science

Master Thesis

Virtualization of Video Streaming Functions

Submitted by: Birhan Tadele Teklehaimanot

Advisor

Göran Appelquist

Supervisor

Prof. Dr.-Ing. Thorsten Herfet

Reviewers

Prof. Dr.-Ing. Thorsten Herfet

Prof. Dr. Dietrich Klakow

April 25, 2016


Eidesstattliche Erklärung

Ich erkläre hiermit an Eides Statt, dass ich die vorliegende Arbeit selbstständig verfasst und keine anderen als die angegebenen Quellen und Hilfsmittel verwendet habe. Ich erkläre hiermit an Eides Statt, dass die vorliegende Arbeit mit der elektronischen Version übereinstimmt.

Statement in Lieu of an Oath

I hereby confirm that I have written this thesis on my own and that I have not used any other

media or materials than the ones referred to in this thesis. I hereby confirm the congruence of

the contents of the printed data and the electronic version of the thesis.

Saarbrücken, on April 25, 2016

Birhan Tadele Teklehaimanot

Einverständniserklärung

Ich bin damit einverstanden, dass meine (bestandene) Arbeit in beiden Versionen in die Bibliothek der Informatik aufgenommen und damit veröffentlicht wird.

Declaration of Consent

I agree to make both versions of my thesis (with a passing grade) accessible to the public by

having them added to the library of the Computer Science Department.

Saarbrücken, on April 25, 2016

Birhan Tadele Teklehaimanot


Abstract

Edgeware is a leading provider of video streaming solutions for network and service operators. The Edgeware Video Consolidation Platform (VCP) is a complete video streaming solution consisting of the Convoy management system and Orbit streaming servers. The Orbit streaming servers are purpose-designed hardware platforms composed of a dedicated hardware streaming engine and a purpose-designed flash storage system. The Orbit streaming server is an accelerated HTTP streaming cache server that provides up to 80 Gbps of bandwidth and can stream to 128,000 clients from a single rack unit. In line with the trend of moving more and more functionality into virtualized or software environments, the main goal of this thesis is a performance comparison between Edgeware's Orbit streaming server and one of the best generic HTTP accelerators (reverse proxy servers), after implementing the Orbit's logging functionality on top of it. This is achieved by implementing test cases for the use cases that are relevant for evaluating these servers. After this evaluation, Varnish is selected, and the modified Varnish and the Orbit are compared to investigate the performance difference.



Acknowledgements

First and foremost, I would like to express my heartfelt gratitude to my supervisor Prof. Dr.-Ing. Thorsten Herfet for giving me the opportunity to write my thesis with him. My sincere thanks go to Göran Appelquist for his patience and invaluable guidance. His constructive suggestions during our periodic discussions made working on this thesis a wonderful experience, and I have learned a lot while working with him. Furthermore, I would like to thank my immediate family, especially my father and my mother, for helping me get this far even though, in our village, it is not customary to send girls to school. Last but not least, my wholehearted gratitude goes to my brothers, sisters and my husband for their love, encouragement, endless motivation and support.


Contents

Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ix

1 Introduction 1

1.1 Multimedia Streaming . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Streaming Technologies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2

1.2.1 Traditional HTTP Download Technologies . . . . . . . . . . . . . . . . . . 2

1.2.2 True Streaming Technologies . . . . . . . . . . . . . . . . . . . . . . . . . 3

1.2.3 HTTP Adaptive Bitrate Streaming Technologies . . . . . . . . . . . . . . 4

2 Background and Related Works 6

2.1 Content Delivery Networks(CDN) . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2.1.1 Components of CDN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2 Proxy servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

2.2.1 Forward proxy servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.2.2 Transparent proxy servers . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

2.2.3 Reverse proxy servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8

3 The Orbit streaming server 12

3.1 Edgeware solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12

3.2 Edgeware Orbit server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

3.3 Proposed solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

4 General comparisons of HTTP reverse proxy servers 15

4.1 Squid . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

4.1.1 Pros and cons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

4.2 Apache traffic server . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16



4.2.1 Pros and cons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

4.3 Nginx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

4.3.1 Pros and Cons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

4.4 Varnish . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

4.4.1 Pros and cons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

4.5 Aiscaler . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4.5.1 Pros and cons . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

5 Test methodology 21

5.1 Definition of test cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

5.1.1 Live test case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

5.1.2 100% cache hit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

5.1.3 90% cache hit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

5.2 Implementation of test cases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

5.3 Configuration of proxy servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

5.4 Test Environment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

5.4.1 Test setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

5.4.2 Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

6 Performance comparison of Orbit and proxy servers 29

6.1 Live test case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

6.2 100% cache hit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

6.3 90% cached . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

6.3.1 Nginx . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

6.3.2 Varnish . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

6.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.5 Implementation of logger . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

6.6 Orbit and modified Varnish performance comparison . . . . . . . . . . . . . . . . 43

6.6.1 Varnish results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

6.6.2 Orbit results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47


7 Conclusion 49

List of Figures 52

List of Tables 55

Bibliography 57


List of Abbreviations

CDN Content Delivery Network

HLS HTTP Live Streaming

HDS HTTP Dynamic Streaming

SSL Secure Sockets Layer

ATS Apache Traffic Server

QoS Quality of Service

ASF Apache Software Foundation

FMS Flash Media Server

VoD Video on demand

RTMP Real Time Messaging Protocol

WMS Windows Media Services

TCP Transmission Control Protocol

IIS Internet Information Services


CAPEX Capital Expenditure

OPEX Operational Expenditure

VCP Video Consolidation Platform

QoE Quality of Experience

GPL General Public License

VCL Varnish Configuration Language

MSE Massive Storage Engine



Chapter 1

Introduction

1.1 Multimedia Streaming

The Internet was originally designed to support data traffic. Later, in the early 1990s, the need for multimedia transmission emerged as the Internet grew in terms of users, applications and nodes. Nowadays, multimedia content makes up a large portion of Internet traffic, as more and more users access multimedia content online. For instance, the share of mobile video traffic is expected to reach 67% in 2017 [7], and it is estimated that multimedia transmission will account for up to 90% of Internet traffic within the next few years [6]. Multimedia includes text, still images, audio, animation and video in an integrated manner.

Multimedia streaming refers to the transmission of multimedia content from a streaming sender to a streaming receiver in compressed form, without downloading the whole content to the receiver device. The basic difference between multimedia streaming and textual data transfer is that multimedia streaming requires real-time delivery but can tolerate a certain amount of data loss. The main components of a multimedia streaming system are the encoder, the streaming server, the streaming client, the media transfer protocol and the underlying physical network.

The minimal set of actions performed by a streaming system is as follows. First, a camera captures and produces either still images or video. The camera output may be raw media without any compression, which requires very large bandwidth since it is much larger than a compressed version. The media is therefore compressed with appropriate compression techniques at the sender end before transmission over the Internet. The compressed media is then stored on the server together with its metadata, i.e., a description of the media such as location and timing information. When the receiver requests a certain media item, the sender transmits the media and its metadata. The media is received as packets and reassembled into the original compressed stream at the receiver end. The decoder then takes this compressed stream and decodes the media. Finally, the decoded media is passed to the renderer for display.

In general, streaming can be classified into on-demand and live streaming. In on-demand streaming, media is pre-recorded, compressed and stored on the streaming server, and delivered to clients when requested. In live streaming, media is captured, compressed and transmitted on the fly.

1.2 Streaming Technologies

In this section, the most widely used streaming technologies are briefly described.

1.2.1 Traditional HTTP Download Technologies

Traditional HTTP download is the basic technology for transmitting content over the Internet. It uses the HTTP protocol to download the content to the receiver device, which then plays it out locally. The most commonly used traditional HTTP download methods are the following:

1.2.1.1 HTTP Download

HTTP download is a widely used technology for data transfer. The content is downloaded and stored on the receiver's device, and playback does not begin until the media has been downloaded completely, which may cause long delays for large media files.

1.2.1.2 HTTP Progressive Download

HTTP progressive download is a widely used media streaming technology based on HTTP/TCP. In progressive download, content is downloaded partially and progressively stored on the user's device [14], which improves on plain HTTP download by reducing the delay before playback begins. First, the metadata, which tells the player how to play the media, is downloaded. Playback then begins once the metadata and sufficient data have been buffered on the receiver's device, and the rest of the content continues to be downloaded and saved while the player plays the data that is already available. However, there is no bandwidth adaptation, since the varying network conditions between client and server are not taken into account. In addition, progressive download cannot be used to stream live media, as it requires offline preparation, and it can be inefficient in terms of bandwidth control from the ISP's point of view [9].
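The interplay between download rate and startup delay can be made concrete with a simple back-of-the-envelope model (an illustration added here, not a result from the thesis): for a clip of duration T encoded at playback rate p and fetched at a constant rate d < p, playback started after a delay w never stalls iff d(t + w) >= p*t for every playback instant t <= T, and the bound is tightest at t = T.

```python
def startup_delay(duration_s: float, playback_rate: float, download_rate: float) -> float:
    """Minimum pre-buffering time (seconds) so playback never stalls.

    Simple model: a file of `duration_s` seconds encoded at `playback_rate`
    bytes/s is fetched at a constant `download_rate` bytes/s.  Playback
    starting after a delay w never stalls iff d*(t + w) >= p*t for all
    t <= duration_s, which at t = duration_s gives
    w >= duration_s * (p - d) / d.
    """
    if download_rate >= playback_rate:
        return 0.0  # the link is fast enough; play as soon as metadata arrives
    return duration_s * (playback_rate - download_rate) / download_rate

# A 600 s clip at 500 kB/s over a 400 kB/s link needs
# 600 * (500 - 400) / 400 = 150 s of pre-buffering.
```

This also shows why progressive download degrades gracefully on fast links (zero extra delay) but poorly on slow ones, where the required pre-buffer grows with the clip length.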


1.2.1.3 HTTP Pseudo Streaming

HTTP Pseudo Streaming is very similar to HTTP Progressive Download, except that the player

can seek forward or backward even the content is not yet downloaded. The player uses byte

offset or number of seconds from the start of the video to find the desired part of video. The

player can buffer the content without saving it in the receiver’s device and it is not mandatory

to download the video from start to finish which means the player can stop the stream and

jump to different point.
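Seeking by byte offset is commonly realized with HTTP range requests; a minimal sketch (the helper name and the bounded-chunk variant are illustrative assumptions, not part of any specific player) of how a player could translate a seek position into a request header:

```python
from typing import Optional

def range_header(byte_offset: int, chunk_size: Optional[int] = None) -> dict:
    """Build the HTTP Range header a player would send to seek to byte_offset.

    With no chunk_size, request everything from the offset to the end of the
    file; otherwise request a bounded chunk (Range end values are inclusive).
    """
    if chunk_size is None:
        return {"Range": f"bytes={byte_offset}-"}
    end = byte_offset + chunk_size - 1
    return {"Range": f"bytes={byte_offset}-{end}"}

# Seeking to the 1 MiB mark and fetching the next 64 KiB:
# range_header(1048576, 65536) -> {"Range": "bytes=1048576-1114111"}
```

A server that supports pseudo streaming answers such a request with status 206 (Partial Content) and only the requested bytes, so the player never has to download the skipped portion.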

1.2.2 True Streaming Technologies

True streaming technologies are the most popular streaming protocols for Flash and Windows Media streaming. These technologies create a connection to a dedicated media server, and the content is sent to the end-user device as a series of small packets. True streaming technologies use a stateful protocol: from the first time a client connects to the streaming server until it disconnects, the server keeps track of the client's state. Commands like PLAY, PAUSE and STOP can be issued by the end user for playback control, and multi-bitrate delivery is supported; however, it is not common to switch between bitrates once streaming has started. The two most common products are Adobe Flash Media Server and Microsoft Windows Media Services.

1.2.2.1 Adobe Flash Media Server

Adobe Flash Media Server (FMS) uses proprietary technology from Adobe Systems (formerly Macromedia) and is a hugely popular streaming platform. FMS is commonly installed on a media or origin server running the Linux operating system, but it is also supported on the Windows Server operating system. FMS supports both stored Video on Demand (VoD) and live media delivery [14].

1.2.2.2 Microsoft Windows Media Services

Microsoft Windows Media Services (WMS) also supports both VoD and live media delivery. WMS is normally installed on a media or origin server running the Windows Server 2003 or 2008 operating system; however, there are proprietary variants that run on non-Windows servers. WMS can enforce authentication and impose connection limits. The preferred protocol for WMS is the Real Time Streaming Protocol (RTSP) [14].


1.2.3 HTTP Adaptive Bitrate Streaming Technologies

HTTP adaptive bitrate streaming is currently the most sophisticated method for streaming media delivery [13]. The content is encoded at multiple bitrates, allowing different encoded versions to be selected during streaming based on the available network bandwidth and client resources. For each encoded version, the content is divided into a series of smaller segments or chunks, each 2-10 seconds in length, which are reassembled and played back as a single continuous stream. This makes it very easy for the player to jump forward or backward in the video. Finally, a manifest file is created that acts as a table of contents for the segments.

The quality of the segments can be adapted during streaming based on network conditions: the player can switch seamlessly between the different fragments at any time during playback, so it can select the desired quality level and adjust automatically to real-time network conditions. During streaming, only the relevant segments are requested by the player and sent for reassembly and playback on the receiver device. An additional benefit of HTTP adaptive bitrate streaming is the ability to use CDNs to cache video content closer to end viewers.
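The rendition-selection step can be sketched as follows (a simplified illustration; the function name, the 0.8 safety margin and the kbps figures are assumptions, and real players use considerably more elaborate heuristics with buffer occupancy and throughput smoothing):

```python
def pick_rendition(measured_kbps: float, renditions_kbps: list, safety: float = 0.8):
    """Return the highest encoded bitrate the player can sustain.

    `renditions_kbps` lists the bitrates advertised in the manifest.  The
    player keeps a running throughput estimate and picks the best rendition
    that fits under the measured throughput times a safety margin, falling
    back to the lowest rendition when nothing fits.
    """
    budget = measured_kbps * safety
    fitting = [r for r in renditions_kbps if r <= budget]
    return max(fitting) if fitting else min(renditions_kbps)

# With renditions of 400/800/1600/3000 kbps and 2100 kbps measured throughput,
# the budget is 1680 kbps, so the player picks the 1600 kbps rendition.
```

Because each segment is an independent HTTP object, this decision can be revisited at every segment boundary without interrupting playback.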

Apple HTTP Live Streaming, Adobe HTTP Dynamic Streaming, Microsoft IIS Smooth Streaming and MPEG Dynamic Adaptive Streaming over HTTP are the most widely used HTTP adaptive bitrate streaming technologies. Each of them is described briefly below.

1.2.3.1 Apple HTTP Live Streaming

Apple HTTP Live Streaming (HLS) is an HTTP-based media streaming technology introduced by Apple in 2009 [15]. In HLS, video and/or audio inputs are typically encoded using H.264/Advanced Audio Coding (AAC), and a stream segmenter then breaks the stream into a series of short segments saved as transport stream files (TS files), along with an index file (.m3u8) that indicates the order of the TS files. The TS files are stored on a standard HTTP web server for distribution, together with the URL of the index file. The player begins by fetching the index file via its URL and then reads it to request the appropriate segments, displaying the content as a continuous stream without pauses or gaps. HLS is optimized for delivery to iOS devices and the Safari browser, but there are solutions of varying quality on all other platforms as well.
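As an illustration of the index-file mechanism, the following sketch parses a minimal media playlist into (duration, segment URI) pairs; real .m3u8 files carry many more tags than the two handled here, and the sample playlist is invented for the example:

```python
def parse_media_playlist(m3u8_text: str):
    """Extract (duration, uri) pairs from a minimal HLS media playlist.

    Only #EXTINF lines (segment durations) and the segment URIs that follow
    them are handled; every other tag line is skipped.
    """
    segments, duration = [], None
    for line in m3u8_text.splitlines():
        line = line.strip()
        if line.startswith("#EXTINF:"):
            # "#EXTINF:9.0,<title>" -> duration of the next segment
            duration = float(line[len("#EXTINF:"):].split(",")[0])
        elif line and not line.startswith("#"):
            segments.append((duration, line))  # a segment URI line
    return segments

playlist = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:9.0,
seg0.ts
#EXTINF:9.0,
seg1.ts
#EXT-X-ENDLIST"""
# parse_media_playlist(playlist) -> [(9.0, 'seg0.ts'), (9.0, 'seg1.ts')]
```

The player walks this list in order, fetching each URI over plain HTTP, which is precisely what makes HLS friendly to ordinary web caches.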

1.2.3.2 Adobe HTTP Dynamic Streaming

HTTP Dynamic Streaming (HDS) is Adobe's method for adaptive bitrate streaming; it supports live and on-demand delivery of MP4 media over regular HTTP connections. HDS allows adaptive streaming over HTTP to any device compatible with Adobe Flash or Adobe Integrated Runtime (AIR). It converts the content into a fragmented MP4 file format and delivers high-definition video/audio using any Flash-compatible codec (H.264/AAC, VP6/MP3). On-demand content is converted with a file packager utility as a post-processing step, whereas live streams are converted in real time using Adobe's live packager utility. An open file specification is used for fragmentation, with .F4M for the manifest (index) file and .F4F for the media segments, which are stored in a single file. Adobe's HTTP origin module is installed on a standard HTTP web server to handle fragment requests, and the streams are played with Adobe Flash Player 10.1 or AIR.

1.2.3.3 Microsoft IIS Smooth Streaming

IIS (Internet Information Services) Smooth Streaming is Microsoft's HTTP adaptive streaming technology, which uses Microsoft Silverlight as its application framework, similar to Adobe Flash. It supports multiple audio and video codecs. IIS is installed on an origin server running Windows Server 2008, and Smooth Streaming is a plug-in module for IIS. In IIS, media content is stored in a single file format, with the segments stored as fragmented MP4 within the file [14].

1.2.3.4 MPEG-Dynamic Adaptive Streaming over HTTP

MPEG Dynamic Adaptive Streaming over HTTP (DASH) is a standard defined by MPEG to enable interoperability between servers and clients of different vendors. It is a generic solution based on HLS, HDS and Microsoft IIS Smooth Streaming, introduced to standardize these proprietary solutions and make them interoperable. As a result, client applications can use the streaming formats of the different proprietary solutions [10]. Like HLS, HDS and IIS Smooth Streaming, DASH uses the concept of segments and the equivalent of a playlist or manifest file, known as a Media Presentation Description (MPD) file. Moreover, DASH can treat the video stream as a single file (without creating segment files), in which case the MPD file points to offsets within the origin file rather than to segment files [13].


Chapter 2

Background and Related Works

In this chapter, the concept of content delivery networks (CDNs) is explained along with their components. Furthermore, proxy servers, and how caching works with reverse proxy servers, are briefly introduced to ensure a better understanding of this work.

2.1 Content Delivery Networks(CDN)

The high bandwidth requirements and rate variation of video in compressed format introduce challenging issues for end-to-end delivery over wide area networks. Content delivery networks evolved to overcome these challenges and improve the accessibility of the Internet. A CDN is a large, geographically distributed network of specialized servers that accelerate the delivery of web content and rich media to Internet-connected devices [8].

The main concept behind this technology is delivery at edge points of the network, in proximity to the request areas, to improve the user's perceived Quality of Service (QoS) when accessing web content. CDNs use edge caching, which entails storing replicas of static text, images, audio and video, including various forms of interactive media streaming content, on multiple servers around the "edges" of the Internet, so that user requests can be served by a nearby edge server rather than by a far-off origin server. The purpose of caching is to reduce network traffic to a minimum; this is achieved by delivering content from caches as close to the requesting user as possible, but also by ensuring that the delivery device has effectively cached the content from previous requests [14]. CDNs typically host static content, including images, video, media clips, advertisements and other embedded objects for dynamic web content. Typical customers of a CDN are media and Internet advertisement companies, data centers, Internet Service Providers (ISPs), online music retailers, mobile operators, consumer electronics manufacturers and other carrier companies.


Figure 2.1: CDN Architecture [8].

2.1.1 Components of CDN

Most CDN architectures are constructed from the following key components:

• Content delivery component: contains the origin server and a set of edge servers (cache servers) that replicate the content and are deployed as close as possible to the users. The origin servers are the master sources of the content and can be deployed within the operator's network or, more commonly, within a content owner's infrastructure. The primary purpose of the content delivery component is to deliver data to end users.

• Content distribution component: moves content from the origin server to the cache servers and ensures consistency. These can be deployed in a hierarchical model to allow tiered caching and to protect the origin servers.

• Request-routing component: directs user requests to cache servers and interacts with the distribution component to keep the content fresh.

• Accounting component: maintains logs of client accesses, records usage of the servers and assists in traffic reporting and usage-based billing.
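The request-routing component can be illustrated with a deliberately simplified sketch (the hostnames and the region-lookup strategy are invented for the example; production CDNs combine DNS, anycast or HTTP redirection with load and liveness data rather than a plain table lookup):

```python
def route_request(client_region: str, edges: dict, origin: str = "origin.example.com"):
    """Pick the cache server serving the client's region, else fall back to origin.

    `edges` maps a region name to its edge-server hostname.  Falling back to
    the origin models a client for whom no nearby edge server is deployed.
    """
    return edges.get(client_region, origin)

edges = {"eu": "edge-eu.example.com", "us": "edge-us.example.com"}
# route_request("eu", edges)   -> "edge-eu.example.com"
# route_request("asia", edges) -> "origin.example.com"  (no nearby edge)
```

However it is implemented, the effect is the same: the user's request lands on a cache close by, and only cache misses travel the long path to the origin.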

2.2 Proxy servers

A proxy server is an intermediary server that intercepts requests from clients seeking resources from different servers across the Internet. Those resources can be images, files, web pages, video, audio, etc. A proxy server facilitates communication between clients and servers and can filter requests based on various rules; it can allow or reject communications by validating requests against the configured rules. There are different kinds of proxy servers; here we look at three basic types, although the focus of this work is on reverse proxy servers.

2.2.1 Forward proxy servers

A forward proxy server intermediates traffic between client and the destination chosen by the

client. It enables a client to connect to a remote network to which it normally does not have

access. Moreover, it can also be used to cache data, reducing load on the networks between the

forward proxy and the remote web server. A forward proxy cache needs explicit configuration

of the browser to direct all requests to the proxy cache rather than the target web server.

2.2.2 Transparent proxy servers

A transparent proxy cache achieves the same goal as a forward proxy cache, but operates transparently to the browser: the browser does not need to be explicitly configured to use the cache. Transparent caches are especially useful to ISPs because they require no browser setup, and they are the simplest way to use a cache inside a network, since they require no explicit coordination with other caches. However, many companies, such as YouTube, currently try to prevent the use of transparent proxies, since they want full control of the communication between the service providers and their clients.

2.2.3 Reverse proxy servers

A reverse proxy, also known as a web server accelerator, is an intermediary server that stores responses from the origin server (the server that holds the content) in its cache and serves subsequent requests for the same content from this cache. It proxies on behalf of servers and appears to end users as the origin server. The origin servers are never accessed directly from outside, since every request for the origin server passes through the reverse proxies. When a client requests some content, DNS routes the request to the reverse proxy server instead of the origin server. The reverse proxy checks for the content in its cache; if it is not there, it connects to the origin server, fetches the requested content into its cache and serves the user. Requested content can be fetched from one or more origin servers, but to the user it looks like the content of a single server. Reverse proxy servers check the validity of the stored data using additional HTTP headers received from the origin server, and the origin server controls, via HTTP headers, whether a given piece of content may be cached by the proxy.

When it receives a request on behalf of a server, the reverse proxy checks whether the requested data is in the cache and still valid. If the content is not in the cache, it forwards the request to the origin server. If the data is in the cache but no longer valid, it deletes the content from the cache and forwards the request to the origin server. On the other hand, if the data is in the cache and still valid, the reverse proxy serves the requested data to the client from its cache. When it receives a response from the origin servers, it also checks whether the response is cacheable before storing the content in its cache.
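The decision procedure described above can be summarized in a short sketch (an illustration, not the code of any of the servers compared later: freshness is reduced to a single expiry timestamp, and validators such as Last-Modified/ETag revalidation are ignored for brevity):

```python
import time

def serve(path, cache, fetch_from_origin):
    """Core reverse-proxy decision: serve a fresh hit, else refetch.

    `cache` maps path -> (response, expires_at).  A fresh entry is served
    directly; a stale entry is dropped and the request is forwarded to the
    origin, whose response is stored with its time-to-live.
    """
    entry = cache.get(path)
    if entry is not None:
        response, expires_at = entry
        if time.time() < expires_at:
            return response, "hit"             # fresh: serve from cache
        del cache[path]                         # stale: drop, then refetch
    response, ttl = fetch_from_origin(path)     # miss (or stale): go to origin
    cache[path] = (response, time.time() + ttl)
    return response, "miss"
```

Two consecutive requests for the same path thus produce one origin fetch followed by a cache hit, which is exactly the offloading effect the following paragraphs describe.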

A reverse proxy reduces the load on the origin server, rather than reducing upstream network bandwidth on the client side as forward and transparent proxy servers do. Sitting between the client and the origin server, the reverse proxy handles all traffic before it reaches the origin server. Reverse proxy servers reduce bandwidth usage and improve performance by caching static content such as images, video and audio and then serving users without going to the origin server. This also helps to offload a very busy server, reduce response times and improve the customer's browsing experience. Moreover, because they intercept requests to the origin servers, reverse proxies protect the origin servers and act as an additional line of defence against security attacks.

2.2.3.1 How caching works on reverse proxy

Clients always use HTTP when talking to a caching proxy (reverse proxy), even if the application is an FTP transfer.

• Is it cacheable?

A response is called cacheable if it can be used to answer a future request. A cache decides whether a particular response is cacheable by checking parts of the request and response; in particular, it checks the response status code, the request method, the response cache-control directives, any response validator and request authentication. Moreover, some caches prefer to store valuable (frequently requested) responses over those requested only once. The most important HTTP headers used by a reverse proxy to check the validity of cached content and to decide whether a response from the origin is cacheable are:

– Last-Modified: tells the proxy when the file was last modified.

– Expires: tells the proxy when to drop the file from the cache.

– Cache-Control: tells the proxy whether the file should be cached.

– Pragma: also tells the proxy whether the file should be cached.
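A coarse sketch of such a cacheability check, using only the signals mentioned above (the status-code list and directive names follow common HTTP caching practice, but this rule set is a simplification of RFC 7234, which covers many more cases such as Vary and Authorization):

```python
def is_cacheable(status: int, method: str, headers: dict) -> bool:
    """Decide whether a response may be stored, from method, status and directives."""
    cache_control = headers.get("Cache-Control", "").lower()
    pragma = headers.get("Pragma", "").lower()
    if method != "GET":
        return False                       # only idempotent GET responses here
    if status not in (200, 203, 300, 301, 410):
        return False                       # a typical "cacheable by default" set
    if "no-store" in cache_control or "private" in cache_control:
        return False                       # origin forbids shared caching
    if "no-cache" in pragma:
        return False                       # legacy HTTP/1.0 opt-out
    return True

# is_cacheable(200, "GET", {"Cache-Control": "max-age=3600"}) -> True
# is_cacheable(200, "GET", {"Cache-Control": "no-store"})     -> False
```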

• Definition of cache hit and miss

When a cache receives a request, it checks whether the response has been cached. If it is found in the cache, we call it a cache hit; otherwise, we call it a cache miss. When the object is found, the cache has to decide whether the stored response is fresh or stale. A cached response is called fresh if its expiration time has not been reached; otherwise, it is stale. A fresh response is sent to the client immediately, whereas a stale response requires validation from the origin server.

The hit ratio is used to measure the effectiveness of a cache. It refers to the percentage of requests that are satisfied as cache hits, and it usually includes both validated and non-validated hits.
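The hit-ratio bookkeeping amounts to a simple counter (an illustrative helper, not part of any of the servers under test):

```python
class HitCounter:
    """Track cache effectiveness: hit ratio = hits / total requests."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, was_hit: bool) -> None:
        if was_hit:
            self.hits += 1
        else:
            self.misses += 1

    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0

# After 9 hits and 1 miss, hit_ratio() returns 0.9 -- i.e. a 90% hit ratio,
# the working point of the "90% cache hit" test case used later in the thesis.
```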

• Cache replacement policies

Cache replacement refers to the process of removing the old responses when the cache is

full and there is need of space for the new ones. Usually cache assigns some kind of value

to the object cached and the least valuable are removed first. Although, the definition

of “valuable“ differs from cache to cache. Typically, an object’s value is related to the

probability that it will be requested again, thus maximizing the hit ratio. The most known

cache replacement algorithms according to caching researchers are listed below.

1 Least Recently Used (LRU) This algorithm is the most popular replacement algorithm and provides high performance in almost all situations. It removes the objects that have not been used for the longest time. It can be implemented with a simple list: every time an object is accessed, it is moved to the top of the list, so the least recently accessed object automatically ends up at the bottom.

2 First In First Out (FIFO) The FIFO replacement algorithm is even simpler to implement than LRU: objects are removed in the order they were added to the cache.

3 Least Frequently Used (LFU) LFU is similar to LRU. However, instead of selecting objects only by the time since their last access, it also considers the number of accesses as a significant parameter. LFU replaces objects with a small access count and keeps objects with a high access count.

4 Size A size-based algorithm uses the object size as the primary removal criterion, meaning the largest object is removed from the cache first. However, the algorithm also needs an aging mechanism that measures how long an object has stayed in the cache, so that old objects are removed first; otherwise, the cache would end up holding only small objects.

5 GreedyDual-Size (GDS) GDS assigns a value to every object based on the cost of a cache miss and the size of the object. Since GDS does not specify what “cost” means, it offers a lot of flexibility to optimize for what you want. For example, cost can be defined as the latency, i.e. the time it takes to receive the response. It can also be defined as the number of packets transmitted over the network, or the number of hops between the origin server and the cache.

6 Greedy-Dual Size Frequency (GDSF) The Greedy-Dual Size Frequency policy was proposed to maximize the hit and byte hit rates of WWW proxies. This caching strategy incorporates the main characteristics of a file, such as file size, file access frequency and recentness of the last access. It is an improvement of the Greedy-Dual-Size algorithm, the current champion among the replacement strategies proposed for Web proxy caches. In general, GDSF-like replacement policies that emphasize frequency achieve a better byte hit ratio, but at the cost of a worse file hit ratio.
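As an illustration of the first policy in the list above, LRU maps naturally onto an ordered dictionary. The sketch below is a toy illustration, not any proxy server's actual implementation: accessed objects move to the "top" (most recently used end), and eviction happens from the other end.

```python
from collections import OrderedDict

class LRUCache:
    """Toy LRU cache: on access an object moves to the most recently
    used end; when the cache is full, the least recently used object
    is evicted."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None                     # cache miss
        self.store.move_to_end(key)         # mark as most recently used
        return self.store[key]

    def put(self, key, value):
        if key in self.store:
            self.store.move_to_end(key)
        self.store[key] = value
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict least recently used
```

The same structure extends to LFU or GDSF by replacing the ordering criterion with an access counter or a cost/size value.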


Chapter 3

The Orbit streaming server

This chapter starts with an introduction to Edgeware's solutions; then the Orbit server's features and functionalities are explained, mainly based on Edgeware's marketing material. Finally, we describe the solution proposed in this thesis work.

3.1 Edgeware solutions

Edgeware is a leading provider of video streaming solutions for network and service operators. It provides these solutions on a dedicated platform called the Edgeware Video Consolidation Platform (VCP). VCP is a highly accelerated and consolidated platform that significantly reduces the high infrastructure costs required for the delivery of TV and video services [11]. VCP is highly scalable and delivers high-quality video services to any screen, across any network topology. It supports all major adaptive streaming frameworks, such as those from Microsoft, Apple and Adobe. It also provides CDN (described in 2.1) video delivery management functions.

VCP reduces a company's capital expenditure by at least 50% [11]. As shown in Figure 3.1 below, VCP contains a highly accelerated video origin, called the VCP Origin, and a widely deployed operator Content Delivery Network (CDN) solution, called the VCP Edge.

Figure 3.1: Components of Edgeware Video Consolidation Platform [11]

The VCP Origin solution


reduces the complexity, performance requirements and cost of the origin servers by offloading all recording, ingest, re-packaging and play-out capacity [11]. Therefore, load balancers or complex file systems are not required. The VCP Edge is based on Edgeware's widely deployed Distributed Video Delivery Network (D-VDN) solution, incorporated into the new Video Consolidation Platform [11]. VCP Edge is the main part of the Edgeware Video Consolidation Platform; it addresses the network infrastructure costs of network service providers and operators, which are growing due to the rapid increase of TV and video demand over the Internet. Moreover,

VCP Edge is a highly optimized CDN caching and distribution solution for a wide range of

service applications. It is designed to deliver next generation video services with the highest

Quality of Experience (QoE) and scalability to any screen [11]. In addition, VCP Edge is

easily integrated with any content management, middleware, conditional access and resource

management systems. The VCP Edge has two fully integrated components: the Orbit hardware platform and the Convoy management software. The Convoy management software allows an operator to set up and manage a complete video delivery network across any network topology. It provides efficient and effective configuration, content management, license control, session management, monitoring, and an open integration framework, in close integration with the optimized Orbit Delivery Servers.

3.2 Edgeware Orbit server

The Orbit servers are fully integrated with Convoy Management Software, providing highly

scalable asset propagation, session management and fault tolerance. The Orbit servers offer

advanced capabilities for operators and content providers to offer a full range of Cloud TV and

video services, irrespective of network topology and core bandwidth. These servers use a combination of a dedicated hardware streaming engine and a purpose-designed flash-based storage system, coupled with a Linux-based control plane, to deliver up to 80 Gbps or 128,000 streams from a single unit.

The main functionalities of the Orbit server are ingest, repackaging, encryption, caching and streaming; repackaging and encryption are done just in time. The Orbit platform also has functionalities such as session handling, logging and a backend selector.

• Backend selector: in Edgeware, the origin servers that hold the content are organized into server groups, each containing a set of nodes that model physical computers. Each node contains a set of IP addresses modelling network interfaces, and a server group models a data center. The main functionality of the backend selector is load balancing and fail-over: the load can be spread randomly over the servers in a group, or be based on the content requested to optimize cache utilization.


• Session handling: this module is used to limit access to content, to set a maximum number of TCP connections per session (client), and to group related requests, i.e. during fragmented streaming, when a user sends HTTP requests a few seconds apart.

• Session Logger: This logging is enabled when the session handling module is enabled and

it logs:

– TIMESTAMP: Time, relative to epoch, when gathering of data for the sample

started.

– DURATION: Duration from first request to the last transfer.

– SENDTIME: The time spent streaming from the video server.

– IP: IP address of the client (remote host) that initiated the session.

– SESSIONID: The identifier of a session.

– CONTENT: URI of the initial request.

– BYTES: Number of bytes transferred from the video server.

– REFERRER: The Referer HTTP request header, if provided by the client.

– USERAGENT: The User-Agent HTTP request header, if provided by the client.
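The two load-spreading strategies of the backend selector described above can be sketched as follows. This is a hypothetical illustration of the idea, not Edgeware's implementation: spreading requests randomly balances load across the group, while hashing the requested content URL pins each asset to one backend, which improves cache utilization there.

```python
import hashlib
import random

def pick_backend(servers, url=None, strategy="random"):
    """Select an origin server from a group (hypothetical interface).

    'random'       -- spread the load randomly over the group.
    'content-hash' -- hash the requested URL so that the same asset
                      always lands on the same backend, improving that
                      backend's cache hit rate.
    """
    if strategy == "content-hash" and url is not None:
        digest = int(hashlib.sha1(url.encode()).hexdigest(), 16)
        return servers[digest % len(servers)]
    return random.choice(servers)
```

Fail-over fits the same interface: a backend that stops responding is dropped from `servers` before selection, so requests flow to the remaining nodes.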

3.3 Proposed solution

The main aim of this thesis is to replace the Orbit server hardware with a software implementation and then assess the performance impact on the Edgeware solution. To do this, we select three proxy servers based on their general behaviour and implement test cases for the use cases that help to evaluate them. Finally, after evaluating the three proxy servers, we select the one that will be used as a cache server, and implement some of the Orbit server functionality explained in 3.2 on top of this selected web server (reverse proxy server).


Chapter 4

General comparisons of HTTP

reverse proxy servers

HTTP reverse proxy servers are proxy servers that intercept HTTP requests coming from clients, as described in 2.2.3. In this chapter, the most commonly used HTTP reverse proxy servers, i.e. Squid, Apache Traffic Server, Nginx, Varnish and aiCache, are compared in general terms based on performance, flexibility and license. However, it is not easy to find up-to-date research papers on reverse proxy servers, and every proxy server adds new features and functionalities over time; therefore, comparing these reverse proxy servers is not an easy task. Moreover, their performance depends on the implementation, the architecture and the room left for optimization. In this chapter the pros and cons of each reverse proxy server are described one by one. In addition, we tried to find benchmarks on the performance of these servers from companies currently using them, where available. Finally, we have selected three reverse proxy servers for further evaluation and testing.

4.1 Squid

Squid originated from the Harvest project in the 1990s and is the oldest and best-known of the popular HTTP reverse proxy servers [2]. It is open source software licensed under the GNU GPL and it supports HTTP, HTTPS and FTP. Squid offers a rich access control, authorization and logging environment. It runs on many platforms, including Linux, FreeBSD and Microsoft Windows, and it typically runs as a single-process, single-threaded, asynchronous event processor. Squid stores content in RAM until the RAM is full, and then on disk; therefore, the RAM size and disk speed are important factors for its performance. Thousands of web sites around the Internet use Squid to considerably accelerate their content delivery [2].


The cache replacement algorithms used in Squid are LRU (described in 2.2.3.1); Greedy-Dual Size Frequency (GDSF), which keeps smaller objects in the cache; and Least Frequently Used with Dynamic Aging (LFUDA), which keeps popular objects in the cache regardless of their size and thus optimizes the byte hit rate at the expense of the hit rate, since one large, popular object will prevent many smaller, slightly less popular objects from being cached.

4.1.1 Pros and cons

Squid has the following advantages:

• Caching of static objects. These are served much faster, assuming that your cache size is

big enough to keep the most frequently requested objects in the cache.

• Buffering of dynamic content

• Nonlinear URL space/server setup. Squid can be used to do some tricks with the URL

space and/or domain-based virtual server support.

• Features: Squid is richer in features than any other available reverse proxy server.

The disadvantages are:

• Buffering limit for log records: Squid cannot keep more than 64 KB in its log buffer.

• Speed: Squid is not very fast compared with the other reverse proxy servers available today. There is only a reason to use Squid if you rely on many of its dynamic features, and then only if the application and the server are designed with caching in mind.

• Memory usage: Squid uses quite a bit of memory; its memory consumption can grow to three times the limit provided in the configuration file.

• Stability: Compared to the other reverse proxy servers, Squid is not the most stable.

• Scalability: Squid's scalability on modern multi-core systems is limited, since it runs as a single-process, single-threaded, asynchronous event processor.

4.2 Apache traffic server

Apache Traffic Server was originally developed by Inktomi and later donated to the Apache Software Foundation (ASF) by Yahoo. Apache Traffic Server is a fast, scalable and feature-rich proxy server [1] with rich plugin APIs for developing extensions. Since it is a multi-threaded, event-driven server, it combines asynchronous event processing and multi-threading to deal with concurrency. Apache Traffic Server draws benefits from both technologies, but this also makes the code complex and sometimes difficult to


understand. Apache Traffic Server is free and open source software with robust plugin APIs to extend and modify its behaviour and functionality. It scales very well on modern multi-core systems because it is a multi-threaded, event-driven proxy server.

Apache Traffic Server uses a small number of "worker threads"; each worker thread runs its own asynchronous event processor. In a typical setup, this means Traffic Server runs with only around 20-40 threads. This is configurable, but increasing the number of threads above the default (3 threads per CPU core) will yield worse performance due to the overhead caused by the additional threads [18]. In ATS, the cache eviction algorithms supported by the RAM cache are LRU, LFU and Clocked Least Frequently Used by Size (CLFUS), which balances recency, frequency and size to maximize the hit rate. The default algorithm is CLFUS, but the user can select another in the ATS configuration. Besides, Apache Traffic Server uses a FIFO algorithm to update its disk cache.

In the Yahoo CDN, Apache Traffic Server delivers 350,000 requests per second and 30 Gbps (95th percentile) across around 100 servers distributed all over the world. Additionally, in their lab they achieved 105,000 requests per second out of one cache for small content, and 3.6 Gbps out of one server for large content. Comcast also uses ATS in their CDN.

4.2.1 Pros and cons

The advantages of using Apache Traffic Server:

• It is very scalable, needs little configuration and can work in many modes.

• It easily adapts to your network.

• It uses an efficient storage subsystem.

The disadvantages are:

• It has many configuration files.

• It is not stable compared to the others.

• It needs a restart in some cases.

4.3 Nginx

Nginx is an HTTP web server that can also function as an HTTP reverse proxy server. It is free and open source software with a lot of plug-ins, released under a BSD-like license.


Nginx uses multiple event-driven worker processes to handle concurrency with little CPU overhead [3]. In addition to HTTP, it can proxy several other TCP protocols, and it also has a flexible plugin interface for extending and adding to its behaviour and functionality. It is also well documented and widely deployed compared to the other reverse proxy servers. Nginx uses a persistent disk-based cache, while the OS page cache keeps objects in RAM. Moreover, Nginx uses an LRU cache replacement policy to evict content from its cache when the configured cache size is exceeded.

4.3.1 Pros and Cons

• It has high performance and is stable, with simple configuration.

• It consumes little CPU power and memory.

Disadvantages:

• Nginx requires recompiling the entire application to add new plugin APIs.

• It has some latency in accepting new connections.

• Storage time is unlimited.

4.4 Varnish

Varnish is free and open source software licensed under a two-clause BSD license. It was initiated in 2005 and the first version was released in 2006 [18]. It focuses mainly on performance and flexibility. Varnish was originally designed as a reverse proxy with the principles of solving real problems, optimizing for modern hardware (64-bit, multi-core, etc.) and modern workloads, working with the kernel rather than against it, and innovation rather than regurgitation [19]. It takes advantage of modern kernel features to simplify its code. Moreover, Varnish does not keep track of whether the cache is on disk or in memory. Instead, Varnish requests a large chunk of memory and leaves it to the operating system to figure out where that memory really is, since the operating system can generally do a better job than a user-space program. In general, this gives a simpler design and reduces the amount of work Varnish needs to do, but it sacrifices portability. For example, on a 32-bit system the virtual memory address space is limited to 3 GB, which limits the size of the cache and the number of concurrent users [4].

Varnish is developed and tested on GNU/Linux and FreeBSD, and its development is governed by the Varnish Governance Board (VGB). Varnish moves a lot of complexity into the kernel by using advanced operating system features such as accept filters, epoll and kqueue. All caching is done using virtual memory provided by the operating system, and each active connection uses up a thread. Besides, Varnish uses the LRU cache replacement algorithm, both in RAM and on disk, to remove content from its caches.

Varnish uses its own domain-specific configuration language, called the Varnish Configuration Language (VCL), which is translated to C code, compiled with a normal C compiler and then dynamically linked directly into Varnish at run-time. VCL is lightning fast and gives freedom to system administrators, allowing developers to define their own policies rather than being constrained by the Varnish developers. Varnish also supports modules called vmods, which make it easy to extend Varnish, add new functionality, or integrate Varnish with other software such as databases or other network services. Examples are integration with GeoIP databases, or device detection for mobile users.

Varnish has two processes, called the parent and the child process. The parent process starts the child process when the varnishd daemon starts, and restarts it if it dies for any reason.

Varnish contains different subroutines such as vcl_recv, vcl_fetch, vcl_pipe, vcl_pass, vcl_hit, vcl_miss and vcl_error, but most VCL tasks can be performed in:

• vcl_recv: receives the requests, parses them, and decides whether to serve them from the cache or from a backend; it is also able to alter the request headers.

• vcl_fetch: called when an object has been retrieved from a backend. The basic operations here are to change the response headers, change the backend if the previous one was unhealthy, and so on.
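To make the two subroutines concrete, the following is a minimal VCL sketch in the Varnish 3-style syntax that matches the subroutine names above; the backend address and TTL are placeholders, not the configuration used in this thesis:

```vcl
backend origin {
    .host = "192.0.2.10";    # hypothetical origin server
    .port = "80";
}

sub vcl_recv {
    # Only GET and HEAD requests are candidates for caching;
    # everything else is passed straight to the backend.
    if (req.request != "GET" && req.request != "HEAD") {
        return (pass);
    }
    return (lookup);
}

sub vcl_fetch {
    # Called when an object has been fetched from the backend:
    # adjust headers and set how long the object may be cached.
    set beresp.ttl = 10m;    # placeholder TTL
    return (deliver);
}
```

Because this file is compiled to C and linked in at run-time, such policies execute at native speed rather than being interpreted per request.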

Varnish Plus is the commercial version of Varnish, which contains all the features of Varnish plus some additional ones. Varnish Plus has a measured performance of up to 20 Gbit per second per single server for video and audio streaming, and it can stream to as many as 6,500 users from one single server [20].

4.4.1 Pros and cons

The main advantages of Varnish are:

• It is very flexible compared to other reverse proxy servers due to its own language VCL.

• Varnish gives you access to very detailed logs that are useful when debugging problems

without any cost.

• Developers can implement their own policies using VCL.

• It has modules called Vmods which are helpful to extend its functionalities.

Disadvantages of Varnish:


• It opens a new thread for every connection, relying on the operating system to handle large numbers of threads.

• It is not portable to all systems, because it is designed for modern hardware (64-bit, multi-core, etc.).

4.5 aiScaler

aiScaler is not a pure caching solution; instead, it is an all-in-one application delivery controller (ADC) solution, normally installed as a reverse proxy on a dedicated machine. Some features of aiScaler are caching, SSL offloading, DDoS protection, multiplexed session management, mobile device detection and IP-based geo-content delivery. aiScaler is easy to configure and creates a better user experience by increasing the speed and availability of a web site: it offloads request processing from the web servers, reduces code complexity, and reduces the cost of servers, space, power and cooling. aiCache is a Linux application custom-written in C. It is a "right-threaded" application, which means it uses a limited number of threads (processes).

4.5.1 Pros and cons

The advantages of aiCache, the caching component of aiScaler, are:

• High performance

• Has good configuration flexibility

• Support for real time alerting and responding

• Responses are cached in RAM, not on disk.

• Low resource usage

Disadvantages of Aiscaler:

• aiScaler performs especially well with dynamic web sites, but in our case we need a reverse proxy for static content.

Pages load fast, with over 250,000 requests per second served directly from aiScaler. However, we are not interested in aiScaler because it is an all-in-one solution, not a pure caching solution.

4.6 Conclusion

Based on the advantages and disadvantages of the above HTTP reverse proxy servers, we selected Apache Traffic Server, Nginx and Varnish for further evaluation and testing of their performance and behaviour.


Chapter 5

Test methodology

We need to define test cases to evaluate the performance of the generic HTTP acceleration servers (cache servers) selected in chapter 4, as well as the Orbit server described in chapter 3. We can measure the performance of cache servers using cache hit, cache miss and live test cases. In the cache miss case, however, the cache server must fetch the content from the origin server (the server that holds the content), so the performance depends on both the cache server and the origin server from which the content is fetched. Therefore, due to this limitation, only the live, 100% cache hit and 90% cache hit test cases are implemented and evaluated as part of this thesis. Moreover, to compare the cache servers we need some parameters (characteristics) as performance measures. In this thesis, the characteristics used to evaluate the cache servers and the Orbit server are response time, CPU usage and network traffic (bandwidth). We describe them further in chapter 6.

The video assets used in our test cases are stored on an origin server, from which the proxy servers and the Orbit server fetch them. All assets are chunked into small fragments of different lengths, and the fragments are grouped by content bitrate. In other words, an asset consists of many fragments of different lengths and sizes. Therefore, each test case takes as input the number of assets, the number and length of the fragments, and the quality (content bitrate) of the fragments.

This chapter starts with the definition of the test cases used for our evaluation. Following the definitions, the implementation of the test cases is explained, and then the configuration of the generic proxy servers selected in chapter 4 is demonstrated. Finally, the setup of the test environment is summarized briefly.


5.1 Definition of test cases

As mentioned above, we used three test cases, namely live, 100% cache hit and 90% cache hit, to evaluate the performance of the three reverse proxy servers selected in chapter 4 and of the Orbit server. For Apache Traffic Server, however, the 90% cache hit test case was not implemented, due to the poor results collected from the live and 100% cache hit test cases. In the 100% and 90% cache hit test cases, the video transmission was not real video on demand (VoD), since all clients were synchronized to spread over the assets and request the fragments of the assets sequentially, i.e. all clients were requesting the same portion of the assets at the same time. In real VoD, by contrast, different clients can request different portions of the assets at any time. The size of each fragment used in our test cases was 132 KB.

5.1.1 Live test case

In this test case, all clients requested a single asset with 5-second fragments at 300 kbps. The asset was not in the cache before the test, so every server fetched and served the first request for a fragment from the origin server and then served subsequent requests for the same fragment from its cache. The proxy servers keep data in memory to increase performance by reducing hard disk access. Every client requested the fragments sequentially and continuously, as represented in figure 5.1. The fragment length is the time that a client waits for a fragment before requesting the next one; if the client does not receive the fragment within 5 seconds, its timeout is reached and the request is counted as a late request. In this test case, we tested the maximum number of clients (streams) on each server. In addition, the response time, CPU usage and egress bandwidth were measured with the same number of clients on all servers.

Figure 5.1: Live fragments
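The per-client behaviour in this test case can be sketched as follows. This is an illustrative model of what the load generator does, not the rq tool itself; `fetch(i)` stands for downloading fragment `i` and is supplied by the caller.

```python
import time

def stream_asset(fetch, n_fragments, fragment_len=5.0,
                 clock=time.monotonic, sleep=time.sleep):
    """Request fragments sequentially and pace them in real time.
    A fragment that takes longer than fragment_len to arrive counts
    as a late request (the client's timeout has been reached)."""
    late = 0
    for i in range(n_fragments):
        start = clock()
        fetch(i)                                 # download fragment i
        elapsed = clock() - start
        if elapsed > fragment_len:
            late += 1                            # timeout reached
        sleep(max(0.0, fragment_len - elapsed))  # wait out the slot
    return late
```

Running many such loops concurrently, one per emulated client, yields the load pattern of the live test; the count of late requests is the per-client failure measure.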


5.1.2 100% cache hit

In this test case, all assets were saved in the caches of the proxy servers and the Orbit server before the test, and all clients requested those assets to simulate video-on-demand streaming. For the proxy servers, the clients were spread over the assets and requested the fragments of each asset sequentially. Proxy servers keep as much data as possible in the RAM cache to reduce disk access and keep content available for fast access. In ATS the user can specify the size of this RAM cache in its configuration, whereas in Varnish and Nginx content is kept in RAM as long as there is free space. The assets are chunked into 5-second fragments with a 300 kbps content bitrate; figure 5.2 represents the fragments in this test case. The fragment length is the time that a client waits to receive a fragment, as stated in 5.1.1. The focus of this test case was to determine the number of assets each proxy server can serve without hitting any limitation, which depends on its resource usage. Then, with the maximum number of assets that fit in the RAM cache of each proxy server, we tested and compared the response time, CPU usage and bandwidth of all servers.

Figure 5.2: 100% cached fragments.


5.1.3 90% cache hit

The 90% cache hit test case was implemented to examine the ingest performance of the proxy servers. In this test case, all clients spread over the assets and requested the fragments sequentially, the same as in the 100% cache hit test case. The reverse proxy servers served 90% of the requests from their cache, and fetched and served the rest from the origin. Figure 5.3 represents the 90% cache hit fragments: the fragments saved in the cache before the test are shown in green, while the others represent fragments that are not in the cache server.

Figure 5.3: 90% cached fragments.


5.2 Implementation of test cases

The test cases are implemented using Python and Bash scripts. The implementations are based on an in-house tool called rq, which is used to generate TCP load. In these implementations, rq is configured with different parameters, such as the destination hosts and ports, the number of clients, the number of video assets, the number of fragments in each asset, the quality and the fragment length. A reporter module is used to output the test results in PDF format. Besides, vmstat and network statistics are collected from the proxy server machine to produce the CPU usage and network traffic figures, respectively. The number of assets and fragments varies per test case, as can be seen in table 6.2. However, for the 100% and 90% cache hit test cases, the video-on-demand property was not simulated exactly, as stated in section 5.1.

5.3 Configuration of proxy servers

In this work, each proxy server was configured as a cache server. The configuration is different for each proxy server, and the default configuration did not give us reliable results for the test cases discussed above. Therefore, many optimizations were made for each proxy server to obtain results comparable with the Orbit server.

In the case of ATS, we configured it as a reverse proxy server that can be used as a cache server; it was only configured for the live and 100% cache hit test cases. The main configuration files are called records, remap, storage and cache. The records configuration of ATS is used to configure the server to act as a reverse proxy and to set how much RAM should be used to store the most frequently accessed assets. It also sets the IP address and port on which ATS should be accessed, the IP address used to reach the origin server, and some further tunings. In the remap configuration file, we defined both map and reverse-map rules. A map rule translates the URL in a client request into the URL of the origin server where the content is located: ATS constructs a complete request URL from the client URL and its headers, and then looks for a match in the list of target URLs in the remap rules. A reverse-map rule translates the URLs of redirect responses from the origin server into the address of ATS, so that clients are redirected to ATS instead of accessing the origin server directly; therefore, clients cannot bypass Apache Traffic Server to reach the origin server. Furthermore, the storage configuration sets how much hard disk space Apache Traffic Server will use, and some caching rules are set in the cache configuration file.

Varnish is configured using its own configuration language, the Varnish Configuration Language (VCL). This language was used to tell Varnish which origin server to use, together with all required caching rules. Unlike the Apache Traffic Server configuration, VCL is a very flexible configuration language. Moreover, another configuration file was used to tweak Varnish and to set the IP address and port on which Varnish should listen.

Nginx has its own configuration format, in which caching rules are defined to configure it as a reverse proxy server. In addition, we defined the address of the origin server, the storage to be used, the address and port on which Nginx listens, and many optimizations in the configuration file.
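As an illustration, a minimal Nginx reverse-proxy cache configuration has the following shape. This is a generic sketch with placeholder addresses, paths and sizes, not the actual configuration used in this thesis:

```nginx
# Placeholder values throughout; cache size, TTLs and listening
# address must be tuned for the actual workload.
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=vod:100m
                 max_size=50g inactive=60m;

server {
    listen 80;                             # address/port Nginx listens on

    location / {
        proxy_pass        http://192.0.2.10;   # origin server
        proxy_cache       vod;
        proxy_cache_valid 200 10m;         # how long to cache 200 responses
    }
}
```

The `proxy_cache_path` directive defines the on-disk cache and its size limit, while the OS page cache transparently keeps hot objects in RAM, matching the caching behaviour described in 4.3.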

5.4 Test Environment

The test environment used in our test cases contains the following components: a client machine, the Orbit server, an origin server and a switch. Only the Orbit machine is replaced by the proxy server machine when the proxy servers are tested.

5.4.1 Test setup

The proxy servers were installed and configured one by one on the same machine, so that the test environment is the same for all test cases. The components of the test environment are briefly described as follows.

• Client machine with Ubuntu 12.04

An application called rq is installed on this machine. Rq is an in-house, purpose-built performance streaming application used to generate TCP load. It can do progressive streaming and adaptive bitrate streaming. Rq can be configured with the number of clients, the number of assets, the number of fragments, the fragment length, ramp-up, timeout, duration, etc.; it creates TCP sockets and sends GET requests based on the implementation of the test cases, and it can emulate thousands of concurrent HTTP clients. On this machine, two network interfaces, each with 10 Gbps, were used to communicate with the proxy server machine.

• Proxy server with Ubuntu 12.04

This is the machine on which we installed the reverse proxy servers to be tested. They are used as cache servers: when a request arrives from a client, the proxy server checks whether the requested data is in its cache. On a cache hit it serves the client from the cache; otherwise it fetches the data from the origin server and serves the client at the same time. This machine has 11 GB of RAM, and we allocated 50 GB of disk space for the tests. Two 10 Gbps interfaces were used: one shared between the client machine and the origin server, the other connected to the client machine only. In addition to the configured cache storage, all proxy servers use a RAM cache to serve objects as quickly as possible and reduce the load on the disks. Memory and CPU are therefore the basic constraints for the proxy servers. This machine is replaced by the Orbit server during the Orbit test runs.

• Origin server with Ubuntu 12.04

To serve HTTP requests, the lighttpd web server is installed and configured on this machine. Video segments of different bitrates are stored on this server, and it is the server from which the reverse proxy servers fetch content on a cache miss.

• Switch

The above components are connected through the switch as shown in figures 5.4 and 5.5, in both the Orbit and the proxy server test environments.

Figure 5.4: Orbit test. Figure 5.5: Proxy servers test.


5.4.2 Parameters

The parameters used in all test cases are: number of clients, number of assets, number of fragments per asset, ramp-up in milliseconds, timeout in seconds, quality, fragment length, duration in minutes, and content type (video in our case). The value of each parameter varies with the specific test case, as can be seen in table 5.1 below. We used 10000 fragments in the live test case so that the proxy servers fetch the first request for each fragment from the origin server and serve the subsequent requests from cache, simulating live streaming. In the 100% cache hit and 90% cache hit test cases, however, each asset contains 180 fragments (the test duration of 900 seconds divided by the fragment length of 5 seconds). The number of assets differs between the proxy servers in the 100% and 90% cache hit test cases, due to the slow disk access and the memory limitation of the proxy server machine. In other words, the number of assets a proxy server can serve depends on its memory and CPU usage.

Parameters                   Live test case   100% cache hit   90% cache hit
Clients                      25000            25000            25000
Fragments                    10000            180              180
Ramp up                      2 ms             4 ms             4 ms
Quality                      300 kbps         300 kbps         300 kbps
Fragment length in seconds   5 s              5 s              5 s
Duration in minutes          15 m             15 m             15 m

Table 5.1: Parameters of test cases.
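The per-asset fragment count for the cache-hit test cases follows directly from the test duration and the fragment length; a quick sketch with the values from table 5.1:

```python
# Derive the per-asset fragment count used in the cache-hit test cases
# from the test duration and fragment length (values from table 5.1).
test_duration_s = 15 * 60   # 15-minute test duration
fragment_length_s = 5       # each fragment carries 5 seconds of video

fragments_per_asset = test_duration_s // fragment_length_s
print(fragments_per_asset)  # 180, as used in the 100% and 90% cache hit cases
```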

As part of the test setup, the different test cases were implemented on the client machine, and the proxy server machine was configured and optimized for each proxy server as described in sections 5.2 and 5.3.


Chapter 6

Performance comparison of Orbit

and proxy servers

The results of the above test cases for the generic proxy servers and the Orbit server are presented in this chapter to compare the response time, CPU usage and bandwidth of all servers. The comparison is accompanied by an analysis of each result. The Orbit, client and proxy server machines each have two 10 Gbps interfaces, as shown in figures 5.4 and 5.5. In the response time figures below, a heat map is shown with a blue scale on its right side that represents the share of requests in percent: 0-20% of the requests are shown in light blue, and the blue becomes darker in proportion to the percentage of requests. The y-axis shows the response time in milliseconds and the x-axis the elapsed test duration in percent.

In the CPU usage figures, the red graph represents the time during which the CPU is idle, while the blue and green graphs show the CPU time spent in system and user mode respectively. The cyan graph shows the CPU time spent waiting for I/O operations (disk access).

In the figures of the network streaming ports, red and blue are used in the proxy server tests for the egress bandwidth of the two interfaces, and green and cyan for the ingest bandwidth. In the Orbit test, blue and red are used for the egress and ingest bandwidths.


6.1 Live test case

In this test case, 25000 clients requested a single asset containing 10000 fragments, each 5 seconds long at 300 kbps, with a ramp-up of 2 milliseconds between clients to simulate live streaming. Orbit, Varnish and Nginx create only one connection to the origin server per cache miss, even when many clients request the same fragment at the same time. ATS, however, might send more than one request per fragment to the origin server if many client requests arrive simultaneously, although it tries to reduce the number of connections to the origin server for the same fragment.

The proxy servers keep data in the RAM cache to increase performance by avoiding accesses to the disk cache. Since the fragment size was 132 KB and there was a single asset in this test case, the overall asset size was 10000 times 132 KB, i.e. 1.32 GB. The RAM cache was therefore large enough to hold all fragments, and the proxy servers served them from RAM throughout this test case.
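The working-set calculation above can be written out explicitly (decimal units, as in the text):

```python
# Working-set size of the live test case: one asset of 10000 fragments,
# each 132 KB (decimal units, as in the text).
fragments = 10000
fragment_size_kb = 132

asset_size_gb = fragments * fragment_size_kb / 1e6  # KB -> GB
print(asset_size_gb)  # 1.32 GB, small enough to fit entirely in the RAM cache
```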

In the live test case we encountered a limitation in the number of concurrent requests each proxy server can support. The maximum number of concurrent requests that Orbit, ATS and Varnish can support is 32000. With more than 32000 clients, Varnish hits its memory limit and restarts in the middle of the test. Apache Traffic Server becomes very slow when the number of concurrent clients exceeds 32000, producing many late requests. Nginx, on the other hand, can support up to 45000 concurrent requests without any late requests, which implies that Nginx can handle considerably more concurrent requests than Varnish and Apache Traffic Server with the same resources. The different concurrency limits of the proxy servers stem from the way they handle incoming requests: Nginx uses an asynchronous, event-driven connection-handling model in which it does not create a new thread per request, Varnish is a multi-threaded program that uses one thread per connection, and ATS uses a hybrid event-driven engine with a multi-threaded processing model.

In addition to the number of concurrent requests, we used response time, CPU usage and network traffic as the main criteria to compare the proxy servers and the Orbit server. A test run with 25000 clients is used for all proxy servers and the Orbit server in this section to evaluate these criteria. The live test case results for all servers are presented below.

• Response time

Response time is the amount of time between a client's request and the receipt of the response. Given enough memory, Varnish is the best reverse proxy server in terms of response time; it is comparable to the Orbit server, as can be seen in figures 6.3 and 6.4. In Nginx (figure 6.2) and ATS (figure 6.1), however, the response time of 0-20% of the total requests reaches up to 1.1 and 2.4 seconds respectively. Varnish is thus very fast compared to Nginx and Apache Traffic Server in the live test case.

Figure 6.1: ATS response time. Figure 6.2: Nginx response time.

Figure 6.3: Varnish response time. Figure 6.4: Orbit response time.

• CPU usage

As can be seen in the figures below, with Apache Traffic Server (6.5) and Varnish (6.7) the CPU is 75% idle, while with Nginx (6.6) it is around 80% idle. Nginx therefore uses less CPU than Varnish and Apache Traffic Server in the live test case. With the Orbit server (6.8), 95% of the CPU is idle, since it does not use the CPU and main memory for its streaming activities: it uses an FPGA instead of the CPU, and RAM and flash memory as storage.


Figure 6.5: ATS CPU usage. Figure 6.6: Nginx CPU usage.

Figure 6.7: Varnish CPU usage. Figure 6.8: Orbit CPU usage.

• Network traffic

As can be seen below, the ingest bandwidth on each interface is close to zero, since every server sends only a single request per fragment to fetch it from the origin server, while the egress bandwidth reaches 3.5 Gbps per interface. The egress bandwidth starts at zero, grows gradually until all clients have joined, and then stays constant. The total egress bandwidth is around 7 Gbps for every server, since each server has two interfaces, as shown in figures 5.4 and 5.5 above.
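The measured egress matches a back-of-the-envelope estimate from the test parameters; the small shortfall against the nominal rate is plausible since clients join gradually and requests are paced:

```python
# Rough aggregate egress estimate for the live test case.
clients = 25000
bitrate_kbps = 300          # content bitrate per client

total_gbps = clients * bitrate_kbps / 1e6  # kbps -> Gbps
print(total_gbps)  # 7.5 Gbps nominal, close to the ~7 Gbps measured
```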


Figure 6.9: ATS network traffic. Figure 6.10: Nginx network traffic.

Figure 6.11: Varnish network traffic. Figure 6.12: Orbit network traffic

Results of all servers for the live test case are summarized in the table below.

Servers   Response time   CPU usage   Egress bandwidth
Orbit     < 1 ms          10%         2*3.5 Gbps
Varnish   < 1 ms          20%         2*3.5 Gbps
Nginx     <= 10 ms        20%         2*3.5 Gbps
ATS       <= 20 ms        30%         2*3.5 Gbps

Table 6.1: Live test case results of all servers.

6.2 100% cache hit

In this test case all assets were stored in the cache before the test, and the number of assets each proxy server can support differs due to their different memory and CPU usage. 25000 clients request 5-second fragments at a content bitrate of 300 kbps. There were 180 fragments in each asset, and every fragment was 132 KB. Moreover, the ramp-up between clients was 4 milliseconds, and the test ran for 15 minutes on the proxy servers and 10 minutes on the Orbit server.

As stated above, the proxy server machine has 11 GB of RAM, and the proxy servers keep content in the RAM cache to serve objects as quickly as possible and reduce the load on the disk cache. However, since the RAM is not large enough to hold all the content, the proxy servers cache as much as possible in RAM and access the rest from the disk cache. As a result, performance drops sharply once the RAM cache is full and more assets are read from the disk cache, even though sufficient disk cache storage was configured in all proxy servers. In other words, once more requests are served from the disk cache, the CPU spends much of its time waiting for disk I/O operations (I/O wait).

Therefore, one of the criteria we used to compare the proxy servers was how many assets each can support without late requests or hitting any limitation, which depends on the memory usage of the proxy server. The Orbit server uses flash memory as its storage system, so there is no limitation on the number of assets up to the size of the storage system.

ATS can only support 32 assets without late requests in the given environment, which amounts to less than 0.76 GB (32 assets times 180 fragments per asset times 132 KB per fragment). When the number of assets is increased, ATS accesses the disk cache frequently and most of the CPU time is consumed by disk I/O wait. The main reason ATS supports so few assets in our test cases is that it only places extremely popular objects in the RAM cache, i.e. it caches a requested object in RAM only if it believes the object is accessed many times. With more assets, the number of requests per fragment decreases, since the clients spread over the assets and request fragments sequentially, so a fragment may never be considered popular enough for the RAM cache. Varnish and Nginx, in contrast, place an object in the RAM cache as soon as it is accessed once. Varnish has two storage options, malloc (memory) and file (disk) storage; we used the disk option in our tests to have the same conditions as the other proxy servers. With file storage, Varnish relies on the OS page cache to keep content in RAM for fast access. The maximum number of assets Varnish could serve without hitting the memory limitation was 330, i.e. 7.8 GB. This is because Varnish needs some memory to perform its own activities and has an overhead of about 1 kB per object regardless of object size, so as the number of stored objects grows, the overhead becomes significant and Varnish's memory usage increases accordingly. In our case, Varnish used around 3.2 GB of RAM for its own activity and for the per-fragment overhead. If the number of assets is increased further, Varnish needs more RAM to store the assets, hits the memory limit, and restarts automatically in the middle of the test. Moreover, Varnish's file storage is not persistent: it does not retain cached objects when Varnish stops or restarts. Nginx, on the other hand, can support 500 assets (11.88 GB), which implies that Nginx uses very little memory for its own activity; it was also serving some fragments from the disk cache. When the number of assets is increased further in Nginx, the limitation we hit was CPU time (the CPU time spent waiting for I/O increased accordingly). The asset limit in Nginx was therefore entirely due to the disk access bottleneck.
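The working-set sizes quoted above follow directly from the per-fragment size; a quick check (decimal units, as in the text):

```python
# Working-set size per proxy server: assets x fragments/asset x fragment size.
fragment_size_kb = 132
fragments_per_asset = 180

def working_set_gb(assets):
    """Total content size in GB (decimal) for a given number of assets."""
    return assets * fragments_per_asset * fragment_size_kb / 1e6

print(round(working_set_gb(32), 2))   # ATS limit:     ~0.76 GB
print(round(working_set_gb(330), 2))  # Varnish limit: ~7.84 GB
print(round(working_set_gb(500), 2))  # Nginx limit:   ~11.88 GB
```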

We also compared the response time, CPU usage and bandwidth, given the per-server asset limits stated in the paragraphs above. To compare the proxy servers and Orbit on these characteristics, we used 350 assets (8.3 GB) for Nginx to have the same environment as the others. In other words, all proxy servers served all assets from the RAM cache, and there was no impact from disk I/O. With all assets in the RAM cache, the response time, CPU usage and egress bandwidth of the proxy servers and Orbit are shown below.

• Response time

Varnish is the best reverse proxy server in terms of response time. However, the response time of 0-20% of the total requests in Nginx (6.14) and ATS (6.13) reaches 2 and 2.5 seconds respectively. Varnish (6.15) is thus very fast compared to Nginx and Apache Traffic Server, as in the live test case, given enough RAM cache in the proxy servers. As figure 6.16 shows, the Orbit server has a very short response time, and its performance is unaffected even when the number of assets increases, as described in the paragraphs above.


Figure 6.13: ATS response time Figure 6.14: Nginx response time

Figure 6.15: Varnish response time Figure 6.16: Orbit response time


• Network traffic

As shown in the figures below, there is no ingest at all, since all content was stored in the cache before the test; the egress bandwidth starts at zero and gradually grows to 3.5 Gbps on each interface. The total egress bandwidth is therefore 7 Gbps for all servers.

Figure 6.17: ATS network traffic Figure 6.18: Nginx network traffic

Figure 6.19: Varnish network traffic Figure 6.20: Orbit network traffic


• CPU usage

The figures below show the CPU usage of all servers: with Apache Traffic Server (6.21) and Varnish (6.23) the CPU is 70% idle, while with Nginx (6.22) it is 75% idle. Nginx therefore uses less CPU than both Apache Traffic Server and Varnish, as in the live test case. The Orbit server barely uses the CPU, as stated for the live test case above and as can be seen in figure 6.24 below.

Figure 6.21: ATS CPU usage. Figure 6.22: Nginx CPU usage.

Figure 6.23: Varnish CPU usage. Figure 6.24: Orbit CPU usage

The overall results for the 100% cache hit test case are summarized in the table below.

Servers   Response time   CPU usage   Bandwidth
Orbit     < 1 ms          15%         2*3.5 Gbps
Varnish   <= 5 ms         30%         2*3.5 Gbps
Nginx     <= 15 ms        25%         2*3.5 Gbps
ATS       <= 27 ms        30%         2*3.5 Gbps

Table 6.2: 100% cache hit results of all servers.


6.3 90% cached

In this test case we tested only Nginx and Varnish, due to the poor ATS results in the test cases above. Varnish used virtual memory as its cache storage while Nginx served many assets from disk, so the main focus of this test case is the ingest bandwidth of the proxy servers. This test case is not implemented for Orbit, and we therefore did not test the Orbit server here.

6.3.1 Nginx

For Nginx, 32000 clients requested 5-second fragments at a content bitrate of 300 kbps. The test ran for 15 minutes with a ramp-up of 4 milliseconds between clients. Nginx served 90% of the fragments from its cache and 10% from the origin server. The main focus was the number of assets and the number of clients Nginx can serve without hitting any limitation or producing late requests. Nginx can serve 550 assets, each with 180 fragments of 5 seconds, and can sustain up to 60000 concurrent requests, but at that level there were some late requests because the CPU was fully utilized and many operations were waiting for it. We therefore used the test run with 32000 clients in this case.

6.3.2 Varnish

As stated above, Varnish can use virtual memory as cache storage in addition to file (disk) storage. Using this storage option, we obtained the following results for the 90% cache hit case in Varnish. The test was run with 32000 clients requesting 3000 assets, each with 180 fragments of 5-second length. The ramp-up between clients was 2 milliseconds and the data rate 300 kbps. Varnish served 90% of the fragments from its cache and 10% from the origin server. The cache was 5 GB of virtual memory, and the least recently used (LRU) fragments were evicted when the cache was full.
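The eviction policy described above can be sketched with an ordered map; this is a simplified illustration of LRU eviction, not Varnish's actual implementation:

```python
from collections import OrderedDict

class LRUCache:
    """Minimal LRU cache: evicts the least recently used entry when full."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key not in self.entries:
            return None                       # cache miss: fetch from origin
        self.entries.move_to_end(key)         # mark as most recently used
        return self.entries[key]

    def put(self, key, value):
        if key in self.entries:
            self.entries.move_to_end(key)
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

cache = LRUCache(capacity=2)
cache.put("frag1", b"...")
cache.put("frag2", b"...")
cache.get("frag1")            # touch frag1 so frag2 becomes least recently used
cache.put("frag3", b"...")    # evicts frag2
print(cache.get("frag2"))     # None: frag2 was evicted
```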

• Response time

As shown in figures 6.25 and 6.26, the response time of Nginx reaches up to 2.7 seconds for some requests, whereas in Varnish only a few requests during the first few seconds of the test have a response time of up to 9 seconds. Varnish, using virtual memory as storage, therefore has a better response time than Nginx in the 90% cache hit test case.


Figure 6.25: Nginx response time. Figure 6.26: Varnish response time.

• CPU usage

Figures 6.27 and 6.28 show the CPU usage of Nginx and Varnish respectively. In Nginx, the CPU time spent waiting for the disk (I/O wait) reaches 50%, since Nginx accessed the disk for many requests, and the CPU time spent in system mode is around 25%; the CPU is therefore only 25% idle in the worst case. In Varnish, by contrast, 55% of the CPU is idle, since it used virtual memory and did not access the disk at all.

Figure 6.27: Nginx CPU usage. Figure 6.28: Varnish CPU usage


• Network load

As shown in the figures below, the ingest bandwidth is around 0.96 Gbps and the egress bandwidth 4.5 Gbps per interface for both Nginx (6.29) and Varnish (6.30). The total ingest bandwidth is therefore 0.96 Gbps, since only one interface is connected to the origin, and the total egress bandwidth is 9 Gbps (two interfaces are connected to the client) for both servers. Both ingest and egress bandwidth start at zero, increase gradually, and stay constant once all clients have joined.
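The measured ingest is consistent with the cache-hit ratio: roughly 10% of the delivered traffic has to be fetched from the origin. A rough estimate that ignores protocol overhead:

```python
# Expected origin (ingest) traffic at a 90% cache hit ratio.
egress_gbps = 9.0       # total delivered to clients (2 x 4.5 Gbps)
miss_ratio = 0.10       # 10% of the requests go to the origin

ingest_gbps = egress_gbps * miss_ratio
print(ingest_gbps)  # 0.9 Gbps, close to the ~0.96 Gbps measured
```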

Figure 6.29: Nginx network traffic. Figure 6.30: Varnish network traffic


6.4 Summary

We selected Varnish based on the results of the above comparisons and on the flexibility of extending its functionality through Vmods.

6.5 Implementation of logger

In this section, we describe the implementation of one of the Orbit features (modules), the logger (session logger), on top of Varnish. The Orbit server also provides other functionality, such as the session management and backend selector described in 3.2, but these are not expected to have any major impact on the performance of Varnish, and they are already implemented, in the same or a different way, in Varnish as well.

The logger module calculates the duration from the start of the first request to the end of the last request, and sums the bytes sent and the send time in each minute of the test for all sessions (clients). Varnish's logging can give us the bytes sent and the send time (time to finish the request) for a single request, but we need those values per minute for each session. A session can generate hundreds of requests per minute, and we have to capture and add up the bytes sent and send time of all requests in that interval. The duration is the period between the start of the first request and the end (start time plus send time) of the last request in the interval. In other words, the duration is not exactly one minute but as close to one minute as possible, including only those segments that were completely delivered during the sample interval; on average, the sample duration is one minute.

The logger is implemented as a Varnish module (Vmod) in C++ to see whether it has any impact on the performance of Varnish. Since Varnish writes all its logs to shared memory, we wrote some configuration to log the session ID, URL, start time of the request, time taken to finish that request, and the total bytes sent to the client in that request into a file instead of shared memory. The current implementation takes this log file as input and reads it line by line to extract the required columns.

To implement the logger functionality, a session ID is added to the URL of each request in the test case implementation on the client machine, and rewrite rules in the VCL configuration of Varnish remove this session ID and send the correct URL to the origin server, since the origin URL does not include the session ID. The session ID is assigned uniquely to each client in the test case implementations and is used to identify the clients in our logger implementation. For every session, the logger logs:

• Session ID: an ID used to identify the session (the client) uniquely.


• IP: IP address of the client (remote host) that initiated the session.

• Bytes sent: the total number of bytes sent to the session during the interval.

• Send time: the total time spent actually streaming to the client, i.e. the sum of the time spent fulfilling each HTTP request in the interval.

• Duration: the time from the beginning of the first request to the end of the last request in the interval.
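The session-ID rewrite described above can be expressed in VCL roughly as follows. This is a sketch: the URL scheme with a `sid` query parameter and the exact regular expressions are assumptions, not the thesis's actual rules:

```vcl
sub vcl_recv {
    # Remember the session ID carried in the URL, e.g. /asset/frag42.ts?sid=1234
    if (req.url ~ "[?&]sid=") {
        set req.http.X-Session-ID = regsub(req.url, ".*[?&]sid=([^&]+).*", "\1");
        # Strip the sid parameter so the origin sees the canonical URL
        set req.url = regsuball(req.url, "[?&]sid=[^&]+", "");
    }
}
```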

From those parameters we can approximate the download bitrate that the client achieved, by dividing the bytes sent by the send time, and the content encoding bitrate, by dividing the bytes sent by the duration of the interval, as shown in figure 6.31 below.

Figure 6.31

This was a first, simplified implementation to obtain initial results. To implement the logger as it is implemented in the Orbit server, the session handling feature would have to be implemented first; the logger would then take its input from the session handling feature rather than from a file.

6.6 Orbit and modified Varnish performance comparison

In this section we describe the results of Varnish before and after the implementation of logging, and then the results of the Orbit server with the same parameters. In the Orbit server, the logger is tied to the session handling functionality: a session object containing the session ID, bytes sent, send time and duration is created by the session management module, and the logger module then uses those values to generate the log file described in 3.2. Due to time constraints, however, we implemented the logger without implementing Orbit's session handling functionality. The result of the implementation is therefore not fully optimized, and we can only show that there is an impact on the performance of Varnish when the logger functionality is enabled. This is because Varnish uses the CPU and memory, unlike the Orbit server, which uses an FPGA instead. The parameters we used in this case are stated in the table below.

Parameters                   Live test case   100% cache hit
Clients                      9800             9800
Fragments                    10000            100
Ramp up                      0.5 ms           4 ms
Quality                      2 MB             2 MB
Fragment length in seconds   5 s              2 s
Duration in minutes          10 m             10 m
Number of assets             1                100

Table 6.3: Parameters of test cases.

6.6.1 Varnish results

The implemented logger plugin uses a lot of resources, since it reads and processes a file to obtain the expected result. The implementation is not fully optimized, since it takes a log file as input. As a result, the figures below only show that additional processing has an impact on the streaming performance of Varnish. The limitation here was the CPU: the logger consumed a lot of CPU to read and process the file, which affected the response time of Varnish, since most of the CPU was used by the logger module. All assets were served from the RAM cache; with more than 100 assets, Varnish hits the memory limitation stated in chapter 6.

• Response time

As can be seen in the figures below, there is an impact on the response time of Varnish after the logger implementation, in both the live and 100% cache hit test cases.


Figure 6.32: live without logger Figure 6.33: live with logger

Figure 6.34: 100% cached without logger Figure 6.35: 100% cached with logger


• CPU usage

The CPU usage of Varnish before and after the logger implementation differs: with the logger functionality, Varnish uses more CPU in both the live and 100% cache hit test cases, as can be seen in the figures below.

Figure 6.36: live without logger Figure 6.37: live with logger

Figure 6.38: 100% cached without logger Figure 6.39: 100% cached with logger


• Network traffic

There is no difference in the bandwidth of Varnish with and without the logger functionality.

Figure 6.40: live without logger Figure 6.41: live with logger

Figure 6.42: 100% cached without logger Figure 6.43: 100% cached with logger

6.6.2 Orbit results

Orbit uses flash memory storage, and since it does not use any RAM or CPU for streaming, there is no limitation on the number of assets. The test results of both the live and 100% cached test cases are shown in the figures below.

• Response time

The Orbit response time is not affected by the number of assets, unlike that of Varnish.


Figure 6.44: Orbit live. Figure 6.45: Orbit 100% cached.

• CPU usage

Since Orbit does not use the CPU and RAM to perform its streaming activity, more than 90% of the CPU is idle in both the live and 100% cache hit test cases.

Figure 6.46: Orbit live. Figure 6.47: Orbit 100% cached.

• Network traffic

Figure 6.48: Orbit live. Figure 6.49: Orbit 100% cached.


Chapter 7

Conclusion

Nowadays, multimedia transmission is growing rapidly due to high user demand. As a result, different multimedia streaming technologies have been developed to meet the demand for high quality. Moreover, the high bandwidth requirements and rate variation of video in compressed formats introduce challenging issues for end-to-end delivery over wide area networks. As a consequence, Content Delivery Networks (CDNs) have evolved to overcome these challenges and improve the accessibility of the Internet. A CDN contains a cache server as a component: an intermediary server which stores responses from the origin server (the server that holds the content) in its cache and serves subsequent requests for the same content from that cache, improving response time. In the Edgeware VCP solution, this cache server is a purpose-designed hardware implementation called the Orbit server.

The main focus of this thesis was to virtualize the Orbit server by implementing its backend selector, session handling and logger features on top of one of the HTTP acceleration servers (reverse proxy servers), selected through a performance comparison. However, the default configuration of the proxy servers did not give results that could be compared with the Orbit server, even before implementing the above Orbit functionality on top of them. We therefore optimized the proxy servers until they gave useful results and revealed their limitations. Moreover, we implemented only the logger feature of Orbit, since the other functionalities are not expected to have any major impact on the performance of Varnish.

In this work, the performance of the Orbit server is compared with software-based reverse proxy servers configured as video streaming cache servers. We implemented live, 100% cache hit and 90% cache hit test cases to evaluate and compare the performance of the proxy servers with each other as well as with Orbit. We configured three reverse proxy servers, i.e. ATS, Nginx and Varnish, and optimized them until they gave results that could be compared with the Orbit performance test results. Response time, resource usage, network traffic and flexibility of extension were used as the fundamental criteria for the performance comparison.

Nginx is the best server in terms of resource usage, i.e. it uses less memory and CPU than Varnish and Apache Traffic Server. However, Varnish is better in terms of response time when it has enough memory. Varnish has malloc (virtual memory) and file (disk) storage options, but Varnish's disk storage is not persistent: if Varnish stops or restarts, the information that tells Varnish whether the data is in the cache (the cache key) is lost, and the data in the cache becomes random data.

In the 100% cache hit cases, there was a limitation in the number of assets that each proxy server could serve. This is due to the poor performance of disk access, i.e. the CPU spends much of its time waiting for I/O operations to finish. In other words, the number of assets that the proxy servers can serve depends on their CPU and memory usage. Nginx supports more assets than Varnish and ATS, since it uses less memory and CPU than both. The RAM size of the proxy server machine was 11 GB, and the proxy servers put content in the RAM cache to serve requests as quickly as possible and to reduce the load on the disks. However, since the RAM is not large enough to store all the content, the proxy servers keep as much content as possible in the RAM cache and access the rest from the disk cache. As a result, the proxy servers perform very poorly once the RAM cache is full and they start to access more assets from the disk cache, even though enough disk cache storage was specified in all proxy servers.
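The RAM/disk interplay described above can be illustrated with a small two-tier sketch. The least-recently-used (LRU) demotion policy here is an assumption for illustration; the eviction policies of the actual proxy servers differ in detail:

```python
from collections import OrderedDict

# Two-tier cache sketch: a small RAM cache in front of a disk cache.
# When the RAM tier is full, the least-recently-used asset is demoted
# to the disk tier; serving from disk stands in for the slow I/O path.
class TwoTierCache:
    def __init__(self, ram_capacity):
        self.ram = OrderedDict()   # asset -> data, kept in LRU order
        self.disk = {}             # overflow tier (slow)
        self.ram_capacity = ram_capacity

    def put(self, asset, data):
        self.ram[asset] = data
        self.ram.move_to_end(asset)
        if len(self.ram) > self.ram_capacity:
            victim, vdata = self.ram.popitem(last=False)  # evict LRU asset
            self.disk[victim] = vdata                     # demote to disk

    def get(self, asset):
        if asset in self.ram:
            self.ram.move_to_end(asset)     # fast path: serve from RAM
            return self.ram[asset], "ram"
        return self.disk[asset], "disk"     # slow path: serve from disk

cache = TwoTierCache(ram_capacity=2)
for name in ("a", "b", "c"):                 # RAM holds 2, so "a" is demoted
    cache.put(name, name.upper())
print(cache.get("a")[1], cache.get("c")[1])  # → disk ram
```

Once the working set exceeds `ram_capacity`, every additional asset forces some requests onto the disk path, which is exactly the point at which the measured proxy servers degraded.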

A raw disk is recommended as cache storage in the documentation of ATS, but we used a normal disk cache so that all servers used the same resources. As a result, ATS supports only 32 assets, due to the way it places content in the RAM cache, and it was the worst in all criteria in both the live and 100% cache hit test cases.

Varnish needs more memory to perform its activities, and it has an overhead of about 1 KB per object regardless of object size, i.e. as the number of assets stored in Varnish increases, the overhead may become significant and Varnish's memory usage grows accordingly. Moreover, Varnish restarts when it runs out of memory in the middle of a test, and since Varnish's disk cache is not persistent, all the content in the cache then becomes random data. As a result, Varnish serves fewer assets than Nginx, but it has the best response time of the three servers in all test cases. Besides, Varnish is more flexible than Nginx and ATS, since it has its own configuration language and is easy to extend through Varnish modules called VMODs. Therefore, we selected Varnish based on the results of the test cases and its flexibility for extension.
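To get a feel for the per-object overhead, a rough back-of-the-envelope calculation can help. It assumes exactly 1 KB of bookkeeping per object, which is only an approximation of Varnish's actual behaviour:

```python
# Rough estimate of Varnish's per-object memory overhead, assuming
# about 1 KB of bookkeeping per cached object regardless of its size.
OVERHEAD_PER_OBJECT_KB = 1

def overhead_gb(num_objects):
    # Convert KB of total overhead to GB (1 GB = 1024 * 1024 KB).
    return num_objects * OVERHEAD_PER_OBJECT_KB / (1024 * 1024)

# One million small objects already cost roughly 1 GB before any payload:
print(round(overhead_gb(1_000_000), 2))  # → 0.95
```

For the large video fragments used in our tests the payload dominates, but for caches holding many small objects this fixed overhead becomes a noticeable share of the available RAM.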


Finally, the logger functionality of the Orbit server was implemented in C++ as a Varnish plugin (VMOD) to see whether it affects Varnish's performance. However, since we implemented the logger functionality without implementing the session management feature as in the Orbit server, our implementation is neither optimized nor complete. Therefore, either implementing the logger functionality in the same way as it is implemented in the Orbit, or using shared memory for per-request logging as Varnish itself does, may reduce the impact of the logger on Varnish's performance.

The main limitation of the software-based reverse proxy servers was the number of assets they could serve, as described above. This is due to the disk load and the memory limitation of the proxy server machine. Since the Orbit server uses flash memory as storage, there is no limitation on the number of assets it can serve, and its performance is not affected by the number of assets. Therefore, since we cannot have unlimited memory, the Orbit server is better than the proxy servers for large content such as video. For small content such as HTML and text, the proxy servers perform very well and can be used easily.

Varnish Plus is a commercial version of Varnish that solves the disk storage problem by introducing a new storage option called the Massive Storage Engine (MSE). Varnish Plus can store almost unlimited content (100+ terabytes) [30] in the MSE, which is also persistent, unlike the file storage of Varnish. Hence, in future work it will be important to compare the performance of Varnish Plus and the Orbit, since Varnish has a response time comparable to the Orbit's when given enough memory for its activities and for keeping the assets in the RAM cache.


List of Figures

2.1 CDN Architecture [8]. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
3.1 Components of Edgeware Video Consolidation Platform [11]. . . . . . . . . . . . 12
5.1 Live fragments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
5.2 100% cached fragments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
5.3 90% cached fragments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
5.4 Orbit test. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
5.5 Proxy servers test. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
6.1 ATS response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.2 Nginx response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.3 Varnish response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.4 Orbit response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
6.5 ATS CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6.6 Nginx CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6.7 Varnish CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6.8 Orbit CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
6.9 ATS network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.10 Nginx network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.11 Varnish network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.12 Orbit network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
6.13 ATS response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.14 Nginx response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.15 Varnish response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.16 Orbit response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
6.17 ATS network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.18 Nginx network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.19 Varnish network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.20 Orbit network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
6.21 ATS CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.22 Nginx CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.23 Varnish CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.24 Orbit CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
6.25 Nginx response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.26 Varnish response time. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.27 Nginx CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.28 Varnish CPU usage. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
6.29 Nginx network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.30 Varnish network traffic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
6.31 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
6.32 Live without logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.33 Live with logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.34 100% cached without logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.35 100% cached with logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
6.36 Live without logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.37 Live with logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.38 100% cached without logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.39 100% cached with logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.40 Live without logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.41 Live with logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.42 100% cached without logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.43 100% cached with logger. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.44 Orbit live. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.45 Orbit 100% cached. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.46 Orbit live. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.47 Orbit 100% cached. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.48 Orbit live. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.49 Orbit 100% cached. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48


List of Tables

5.1 Parameters of test cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

6.1 live test case results of all servers. . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

6.2 100% cached results of all servers. . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

6.3 Parameters of test cases. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44


Bibliography

[1] Apache Traffic Server, http://trafficserver.apache.org/, last accessed June 15, 2015.

[2] Squid, http://www.squid-cache.org/, last accessed June 15, 2015.

[3] NGINX, http://nginx.org/, last accessed June 20, 2015.

[4] Varnish, https://www.varnish-cache.org/, last accessed June 20, 2015.

[5] Aicache, https://aiscaler.com/, last accessed June 25, 2015.

[6] Ali Begen, Tankut Akgul, and Mark Baugher. Watching video over the web: Part 1: Streaming protocols. IEEE Internet Computing, 15(2):54–63, March 2011.

[7] Cisco. Cisco Visual Networking Index: Global Mobile Data Traffic Forecast Update, 2012–2017, 2013.

[8] A. Pathan and R. Buyya. A Taxonomy and Survey of Content Delivery Networks.

[9] R. Dubin, O. Hadar, R. Ohayon, and N. Amram. Progressive download video rate traffic shaping using TCP window and deep packet inspection, May 2012.

[10] I. Sodagar. The MPEG-DASH Standard for Multimedia Streaming Over the Internet. IEEE Multimedia, pages 62–67, 2011.

[11] Edgeware, http://www.edgeware.tv/products/, last accessed April 20, 2016.

[12] L. Larson-Kelley. Adobe Media Server 5 White Paper, August 2012.

[13] NGINX, Inc. Serving Media with NGINX Plus, April 11, 2016.

[14] Digital Media Delivery Platform: Introduction to Content Delivery Networks, September 2011.

[15] HLS protocol, https://tools.ietf.org/html/draft-pantos-http-live-streaming-16, last accessed May 20, 2015.

[16] HTTP Live Streaming Overview, https://developer.apple.com/library/ios/documentation/NetworkingInternet/Conceptual/StreamingMediaGuide/Introduction/Introduction.html, last accessed May 20, 2015.

[17] EdgeCast, http://blog.edgecast.com/post/55198896476/hds-hls-hss-adaptive-http-streaming, last accessed May 20, 2015.

[18] L. Hedstrom, Apache Traffic Server Development Team. Apache Traffic Server: HTTP Proxy Server on the Edge.

[19] White paper: Varnish Cache Whitepaper, November 2013.

[20] White paper: HTTP Streaming with Varnish Cache.

[21] Tobias Logren Dely. Caching HTTP: A comparative study of caching reverse proxies Varnish and Nginx, 2014.

[22] Shahab Bakhtiyari. Performance Evaluation of the Apache Traffic Server and Varnish Reverse Proxies, 2012.

[23] Clement Nedelcu. Nginx HTTP Server: Adopt Nginx for Your Web Applications to Make the Most of Your Infrastructure and Serve Pages Faster Than Ever. Packt Publishing Ltd, 2010.

[24] Liz Gannes (10 June 2009). "The Next Big Thing in Video: Adaptive Bitrate Streaming". Archived from the original on 19 June 2010. Retrieved 1 June 2010.

[25] C. Muller, S. Lederer, and C. Timmerer. "An Evaluation of Dynamic Adaptive Streaming over HTTP in Vehicular Environments". In Proceedings of the ACM Multimedia Systems Conference 2012 and the 4th ACM Workshop on Mobile Video, Chapel Hill, North Carolina, February 24, 2012.

[26] C. Mueller, S. Lederer, and C. Timmerer. DASH at ITEC: VLC Plugin, DASHEncoder and Dataset.

[27] Duane Wessels. Web Caching. O'Reilly, June 2001.

[28] Charu Aggarwal, Joel Wolf, and Philip Yu. Caching on the World Wide Web. IBM T.J. Watson Research Center, Yorktown Heights, New York.

[29] Guy Provost. How Caching Works, last accessed April 15.

[30] https://www.varnish-software.com/plus/massive-storage-engine, last accessed April 13, 2016.