Web Servers & Load Balancing Techniques

3/20/2001
Network Computing Laboratory, EE, KAIST
Junehwa Song (송준화), Young Ho Kim (김영호)


Part I: Web Servers

Overview

What is a web server?
Market share
How does a web server work?
How does a web server serve content?
Architectures of web servers
  – Examples: Apache, AOLServer, Jigsaw
Issues on web servers
Load balancing techniques (Part II)
References

What is a Web Server?

An application that runs on a server and does the following:
  – Provides connections to remote computers
  – Sends web pages to remote computers via the Internet or an intranet
Examples of web servers:
  – Apache
  – MS Internet Information Server for Windows NT
  – AOLServer

Market Share

[Figure: web server market share]

How does a Web Server Work?

Static content

1. The web server receives a request for a web page such as http://www.kaist.ac.kr/index.html.

2. The server maps the URL to a local file on the host server.

3. The server then loads this file from disk and serves it across the network to the user's web browser.
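The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than how any particular server implements it: the function names, the `index.html` default document, and the choice to answer every failure with 404 are assumptions made for the example.

```python
from pathlib import Path
from urllib.parse import unquote, urlsplit


def map_url_to_file(url: str, doc_root: Path) -> Path:
    """Step 2: map the URL path to a file under the document root."""
    url_path = unquote(urlsplit(url).path) or "/"
    if url_path.endswith("/"):
        url_path += "index.html"  # assumed default document
    root = doc_root.resolve()
    candidate = (root / url_path.lstrip("/")).resolve()
    # Reject paths such as "/../secret" that would escape the document root.
    if root not in candidate.parents:
        raise PermissionError(url_path)
    return candidate


def serve_static(url: str, doc_root: Path) -> tuple[int, bytes]:
    """Steps 1-3: take a requested URL and return (status, body)."""
    try:
        # Step 3: load the file from disk; the caller sends it to the browser.
        return 200, map_url_to_file(url, doc_root).read_bytes()
    except OSError:
        # Missing file, directory, or path outside the root (all OSError subclasses).
        return 404, b"Not Found"
```

Calling `serve_static("http://www.kaist.ac.kr/index.html", Path("/some/doc/root"))` returns the file's bytes with status 200 if the file exists under that (hypothetical) root.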

Dynamic content
  – "Dynamic" means web pages created in response to a user's input (e.g., via CGI).
  – The web server runs programs locally and transmits their output to the web browser that requested the dynamic content.
  – The user's web browser never has to know that the content is dynamic, because CGI is essentially a web server extension protocol: the browser still receives an ordinary HTTP response.
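A rough sketch of the CGI idea follows: the server runs a local program, passes the user's input through environment variables, and relays the program's output. The inline `SCRIPT` is a hypothetical CGI program and `run_cgi` is a name invented for this example; a real server would also set many more CGI variables and handle errors.

```python
import os
import subprocess
import sys


def run_cgi(command: list[str], query_string: str) -> bytes:
    """Run a local program CGI-style and wrap its output in an HTTP response."""
    env = dict(os.environ)
    env.update({
        "GATEWAY_INTERFACE": "CGI/1.1",
        "REQUEST_METHOD": "GET",
        "QUERY_STRING": query_string,  # the user's input reaches the program here
    })
    result = subprocess.run(command, env=env, capture_output=True, check=True)
    # A CGI program writes its own headers (e.g. Content-Type), a blank line,
    # and then the body; the server only prepends the status line.
    return b"HTTP/1.0 200 OK\r\n" + result.stdout


# Hypothetical CGI program, written inline so the sketch is self-contained.
SCRIPT = (
    "import os\n"
    "print('Content-Type: text/plain')\n"
    "print()\n"
    "print('You asked for: ' + os.environ['QUERY_STRING'])\n"
)

if __name__ == "__main__":
    print(run_cgi([sys.executable, "-c", SCRIPT], "name=kaist").decode())
```

From the browser's side the result looks like any other response, which is the point made in the last bullet above.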

How does a web server serve content?

The primary mechanism for deciding how to display content is the MIME type header (the Content-Type header in HTTP).

Multipurpose Internet Mail Extensions (MIME) types tell a web browser what sort of document is being sent.

More than 370 MIME types are distributed with the Apache web server by default in the mime.types configuration file.

Example entries from the Apache mime.types file (MIME type followed by its file extensions):
  – text/xml xml
  – video/mpeg mpeg mpg mpe
  – video/quicktime qt mov
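To make the file format concrete, here is a small sketch that parses entries in the mime.types layout shown above and looks up the type for a file name. The parser, the function names, and the `application/octet-stream` fallback are assumptions for illustration; this is not Apache's own code.

```python
MIME_TYPES_SAMPLE = """\
# MIME type      extensions
text/xml         xml
video/mpeg       mpeg mpg mpe
video/quicktime  qt mov
"""


def parse_mime_types(text: str) -> dict[str, str]:
    """Build an extension -> MIME type table from mime.types-style lines."""
    table: dict[str, str] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table


def content_type_for(filename: str, table: dict[str, str],
                     default: str = "application/octet-stream") -> str:
    """Pick the Content-Type value for a file based on its extension."""
    _, dot, ext = filename.rpartition(".")
    return table.get(ext.lower(), default) if dot else default


if __name__ == "__main__":
    table = parse_mime_types(MIME_TYPES_SAMPLE)
    print(content_type_for("lecture.mpg", table))
```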

[Figure: reference architecture of a web server. Labels: Browser; Web Server (Reception, Request Analysis, Access Control, Resource Handler, Record Transaction, Utility, Operating System Abstraction Layer); Operating System.]

Architecture of Web Server

Reception
  – Interprets the resource request protocol
  – Parses the requests and builds an internal representation of each request
  – Determines the capabilities of the browser (e.g., simple text browser or graphics-capable browser)
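As a sketch of what the Reception subsystem does, the following parses a raw HTTP request into an internal representation. The `Request` class and the capability check based on the `Accept` header are assumptions made for this example; real servers use richer structures and heuristics.

```python
from dataclasses import dataclass, field


@dataclass
class Request:
    """Internal representation of a parsed request (hypothetical layout)."""
    method: str
    target: str
    version: str
    headers: dict[str, str] = field(default_factory=dict)

    @property
    def graphics_capable(self) -> bool:
        # Crude capability check: does the browser say it accepts images?
        return "image/" in self.headers.get("accept", "")


def parse_request(raw: bytes) -> Request:
    """Parse the request line and headers of an HTTP request."""
    head = raw.split(b"\r\n\r\n", 1)[0].decode("iso-8859-1")
    request_line, *header_lines = head.split("\r\n")
    method, target, version = request_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return Request(method, target, version, headers)
```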

Request Analyzer
  – Translates the location of the resource from a network location to a local file name
    • e.g., ~/index.html could be transformed to the local file /usr/httpd/pub/index.html

Access Control
  – Enforces the access rules employed by the server
  – Authenticates browsers and authorizes their access to the requested resources
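The two subsystems above can be illustrated together. In this sketch the document root comes from the slide's example, while the user table, the rule list, and all function names are hypothetical; the translation step deliberately omits the path-escape checks a real server needs.

```python
import hmac
import posixpath

DOC_ROOT = "/usr/httpd/pub"  # document root from the slide's example
USERS = {"alice": "s3cret", "bob": "hunter2"}  # hypothetical credential store
# Hypothetical access rules: first matching prefix wins; None means public.
RULES = [("/private/", {"alice"}), ("/", None)]


def translate(resource: str) -> str:
    """Request Analyzer: map a network location to a local file name."""
    relative = resource.removeprefix("~").lstrip("/")
    return posixpath.join(DOC_ROOT, relative)


def authenticate(user: str, password: str) -> bool:
    """Check the supplied password against the stored one."""
    expected = USERS.get(user)
    return expected is not None and hmac.compare_digest(
        expected.encode(), password.encode())


def authorize(path: str, user) -> bool:
    """Apply the first rule whose prefix matches the requested path."""
    for prefix, allowed in RULES:
        if path.startswith(prefix):
            return allowed is None or user in allowed
    return False


def check_access(path: str, user=None, password=None) -> int:
    """Return 200 (allowed), 401 (credentials needed) or 403 (forbidden)."""
    if authorize(path, None):
        return 200  # public resource
    if user is None or not authenticate(user, password or ""):
        return 401
    return 200 if authorize(path, user) else 403
```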

February 19, 2001 PC Data Online

Resource Handler
  – Determines the type of the resource requested by the browser, executes it, and generates the response

Record Transaction
  – Records all requests and their results

Support Layer
  – Consists of the Utility subsystem and the Operating System Abstraction Layer
  – Provides functions used by the subsystems above

Utility subsystem
  – Contains functions that are used by all other subsystems
  – Includes functions for manipulating strings and URLs, along with other commonly used functions

Operating System Abstraction Layer
  – Encapsulates OS-specific functionality to make it easier to port the server to different platforms

Example (1): Apache

Freely available
  – Source code
  – Binaries for many platforms (version 1.3.x also includes Windows NT)
Web server originally based on the NCSA server (in 1995)
Over 60% of Internet web servers run Apache or an Apache derivative (December 2000 survey)

Process based
  – 2.0 will support multiple threads
Very configurable, with many directives
Optional modules provide extra functionality
Apache is "A PAtCHy server"
  – Patches on NCSA httpd 1.3
Strong performance and continual upgrades

[Figure: Apache architecture mapped onto the reference architecture. Apache labels: core, Translation, Authentication, Authorization, Mime type, Response, Logging, Util, OS Layer. Reference subsystems: Recep., Req. analysis, Access Ctrl, Res. Handler, Record Trans.]

Apache

Core: maintains multiple processes
request_rec: internal representation of a request

Example (2): AOLServer

Commercial web server
  – Developed by AOL
  – Source opened in 1999
First released in 1995
Powerful database support
Provides extensibility
  – By using a maintainable and safe extension language
  – Uses TCL (Tool Command Language) as the extension language

[Figure: AOLServer architecture mapped onto the reference architecture. AOLServer labels: Communication Driver, Daemon Core, NSLog, NSPerm, URL Handle, Timer, Util, Database Interface, TCL Interpreter, NSthread (platform-independent thread library). Reference subsystems: Recep., Req. analysis, Access Ctrl, Res. Handler, Record Trans., Util, OSAL. Note: NS stands for NaviSoft.]

AOLServer

Richer OSAL and Utility subsystems (than Apache)
  – Portable thread library implementation
  – Database interface
  – Timer
    • Event scheduling, connection time-outs, etc.
  – TCL interpreter
Support for multiple network protocols
Internal structure: Conn

Jigsaw

Experimental server developed by W3C
  – For analyzing Internet protocols and standards
Open source, first released in 1996
Written in Java
  – Platform independent
  – OSAL does not exist
  – Extensibility
  – Object-oriented design

[Figure: Jigsaw architecture mapped onto the reference architecture. Jigsaw labels: Daemon, Protocol Frame In Filter, Resource In Filter, Protocol Frame Out Filter, Resource Out Filter, Resource, Util. Reference subsystems: Recep., Access Ctrl, Res. Handler, Record Trans.]

Jigsaw

Daemon: maintains a thread pool for concurrency

Filters: for different experiments???

Issues on Web Servers

Connection explosion
  – Due to the rapid growth of WWW applications on the Internet, a web server may receive a huge number of connection requests in a very short time
Research trends on web servers
  – Load balancing
  – Distributed scalable web servers

Part II: Load Balancing Techniques

Junehwa Song, Young Ho Kim

Load Balancing Techniques

Mirror
Client-based approach
DNS-based approach
Dispatcher-based approach
  – Packet single rewriting
  – Packet double rewriting
  – Network Dispatcher
Server-based approach
  – HTTP redirection
  – Packet redirection

Mirror

Replicates information across a mirrored-server architecture
Users manually select an alternative URL
Not user transparent
Does not allow the web-server system to control request distribution

Client-Based Approach

Web client
  – The web client selects a node of the cluster and submits the request to the selected node
  – The Netscape home page (http://www.netscape.com) uses this technique
    • When a user accesses this site, Navigator selects a random number i between 1 and the number of servers and directs the request to the node wwwi.netscape.com
  – Has limited practical applicability and is not scalable
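The Navigator behaviour described above amounts to a one-line random choice made on the client. The sketch below assumes the client already knows the number of servers; the function name is invented for the example.

```python
import random


def pick_server(num_servers: int, rng=random) -> str:
    """Pick a random i in 1..num_servers and build the node name wwwi.netscape.com."""
    i = rng.randint(1, num_servers)  # randint includes both endpoints
    return f"www{i}.netscape.com"


if __name__ == "__main__":
    print(pick_server(4))
```

Because the number of servers is baked into the client, adding or removing nodes requires updating every client, which is one reason the approach does not scale.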

Smart client
  – Migrates server functionality to the client through a Java applet
  – Increases network traffic and network delay
Client-side proxies
  – From the web cluster's standpoint, proxy servers are similar to clients

DNS-Based Approach

The DNS server maps the domain name to multiple IP addresses
It returns more than one IP address for the hostname, or returns a different IP address for each DNS request it receives (round robin)
User transparent
Simple and easy to implement
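A round-robin DNS can be sketched as a table that hands out the next address on every lookup. The class name, the host name, and the addresses (taken from the 192.0.2.0/24 documentation range) are assumptions for illustration.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy name server that returns a different address for each request."""

    def __init__(self, records: dict[str, list[str]]):
        # One endless iterator per host name, cycling through its addresses.
        self._cycles = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name: str) -> str:
        return next(self._cycles[name])


if __name__ == "__main__":
    dns = RoundRobinDNS({"www.example.com": ["192.0.2.1", "192.0.2.2", "192.0.2.3"]})
    for _ in range(4):
        print(dns.resolve("www.example.com"))
```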


Drawbacks
  – The DNS cannot know the state of the whole system
  – Not really fair, because the DNS uses simple round robin
  – The DNS may encounter the TTL problem in IP-address caches
    • Between the client and the web server's DNS, many intermediate name servers can cache the logical-name-to-IP-address mapping to reduce network traffic, and every web browser typically caches some address resolutions
    • Because of address caching, each address mapping can cause a burst of future requests to the selected server and quickly make the current load information obsolete
  – Many DNS-based solutions to this problem exist
    • System-stateless algorithms
    • Server-state-based algorithms
    • Client-state-based algorithms
    • Adaptive TTL algorithms
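The TTL problem described above can be shown with a toy model: an intermediate name server caches one answer and keeps returning it until the TTL expires, so every client behind it lands on the same server. Both classes, the 60-second TTL, and the addresses are assumptions for the sketch; time is passed in explicitly to keep it deterministic.

```python
from itertools import cycle


class AuthoritativeDNS:
    """Cluster DNS that answers in round-robin order."""

    def __init__(self, addresses: list[str]):
        self._next = cycle(addresses)

    def resolve(self, name: str) -> str:
        return next(self._next)


class CachingResolver:
    """Intermediate name server that caches each answer for `ttl` seconds."""

    def __init__(self, upstream: AuthoritativeDNS, ttl: float):
        self.upstream = upstream
        self.ttl = ttl
        self._cache: dict[str, tuple[str, float]] = {}

    def resolve(self, name: str, now: float) -> str:
        cached = self._cache.get(name)
        if cached is not None and now < cached[1]:
            return cached[0]  # served from cache; the cluster DNS never sees it
        address = self.upstream.resolve(name)
        self._cache[name] = (address, now + self.ttl)
        return address


if __name__ == "__main__":
    cache = CachingResolver(AuthoritativeDNS(["192.0.2.1", "192.0.2.2"]), ttl=60)
    # One request per second for a minute: all of them hit the same server.
    print({cache.resolve("www.example.com", now=t) for t in range(60)})
```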

Dispatcher-Based Approach

Centralizes request scheduling and completely controls client-request routing
Request routing among servers is transparent, unlike in the DNS-based approach
  – Whereas the DNS deals with addresses at the URL level, the dispatcher has a single virtual IP address (IP-SVA)
The dispatcher uniquely identifies each server in the system through a private address
The dispatcher typically uses simple algorithms to select the web server
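The slides do not say which selection algorithms are meant, so the sketch below uses least-connections purely as one example of a simple policy. The class, its methods, and the addresses are hypothetical.

```python
class Dispatcher:
    """Toy dispatcher: one virtual IP in front of privately addressed servers."""

    def __init__(self, virtual_ip: str, servers: list[str]):
        self.virtual_ip = virtual_ip  # the only address clients ever see
        self.active = {address: 0 for address in servers}  # open connections

    def assign(self) -> str:
        """Pick the server with the fewest active connections (assumed policy)."""
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server: str) -> None:
        """Record that a connection to `server` has closed."""
        self.active[server] -= 1


if __name__ == "__main__":
    d = Dispatcher("192.0.2.10", ["10.0.0.1", "10.0.0.2"])
    print(d.assign(), d.assign())
```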

Packet Single Rewriting

The TCP router acts as an IP address dispatcher
  – The router tracks the source IP address of every established TCP connection so that packets belonging to the same connection are routed to the same web server node
High system availability
  – When one of the servers fails, its address can be removed from the router's table
  – Can be combined with a DNS-based solution
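A minimal model of the router's table follows. It rewrites only the destination address of inbound packets, keyed by source IP as the slide describes; the round-robin choice for new connections, the class names, and the addresses are assumptions for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class TCPRouter:
    """Toy TCP router that pins each client address to one server."""

    def __init__(self, virtual_ip: str, servers: list[str]):
        self.virtual_ip = virtual_ip
        self.servers = list(servers)
        self.connections: dict[str, str] = {}  # source IP -> chosen server
        self._next = 0

    def route(self, packet: Packet) -> Packet:
        server = self.connections.get(packet.src)
        if server is None or server not in self.servers:
            # New client, or its server was removed: choose one round robin.
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            self.connections[packet.src] = server
        # Single rewriting: only the destination address is changed here.
        return replace(packet, dst=server)

    def remove_server(self, server: str) -> None:
        """Drop a failed server so that no further packets are sent to it."""
        self.servers.remove(server)
```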

Packet Double Rewriting

Two solutions use this approach
  – Magicrouter
  – Cisco Systems' LocalDirector
Because outgoing packets typically outnumber incoming request packets, the dispatcher becomes a bottleneck
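In double rewriting the dispatcher touches traffic in both directions, which is why it can become the bottleneck. The sketch below is an assumed model, not code from Magicrouter or LocalDirector; the class, the counter, and the addresses are illustrative.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    src: str
    dst: str


class DoubleRewritingDispatcher:
    """Toy dispatcher that rewrites both request and response packets."""

    def __init__(self, virtual_ip: str):
        self.virtual_ip = virtual_ip
        self.rewritten = 0  # every packet in either direction is counted here

    def inbound(self, packet: Packet, server: str) -> Packet:
        """Client to cluster: replace the virtual IP with the chosen server."""
        self.rewritten += 1
        return replace(packet, dst=server)

    def outbound(self, packet: Packet) -> Packet:
        """Server to client: replace the server address with the virtual IP."""
        self.rewritten += 1
        return replace(packet, src=self.virtual_ip)
```

Since a response usually consists of many more packets than the request that triggered it, most of the `rewritten` count in practice would come from `outbound`.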

Network Dispatcher

Extends the basic TCP router mechanism to work with both LANs and WANs
The dispatcher forwards packets to the selected server using its physical address, without modifying the IP header

Core and Sore Lab NRL project
  – http://core.kaist.ac.kr/nrlintro2.htm

Server-Based Approach

Uses a two-level dispatching mechanism
  – Integrates the DNS-based approach with redirection techniques executed by the web servers
  – Solves most DNS scheduling problems
Two solutions
  – HTTP redirection
  – Packet redirection

HTTP Redirection

In the slide's figure, server 1 redirects the request to server 2
Not client transparent!
Overhead of intra-cluster communication
  – Every server must periodically transmit status information to the cluster DNS
Increases response time on the client side, because of the redirection
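The redirection step itself can be sketched as a server deciding between answering and returning a 302 response that points at a less loaded peer. The load values, the 0.8 threshold, the host names, and the function name are all assumptions; the slides do not specify how a server chooses the target.

```python
def handle(path: str, own_load: float, peer_loads: dict[str, float],
           threshold: float = 0.8) -> tuple[int, dict[str, str]]:
    """Return (status, headers): serve locally or redirect to a lighter peer."""
    if own_load <= threshold or not peer_loads:
        return 200, {}
    target = min(peer_loads, key=peer_loads.get)
    if peer_loads[target] >= own_load:
        return 200, {}  # no peer is better off, so serve the request here
    # The client sees the redirect and must issue a second request.
    return 302, {"Location": f"http://{target}{path}"}


if __name__ == "__main__":
    print(handle("/index.html", 0.95, {"server2.example.com": 0.20}))
```

The extra request that follows the 302 is what makes this scheme visible to the client and adds to its response time.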

Packet Redirection

Uses a round-robin DNS mechanism to schedule requests among the web servers
The server reached by a request reroutes the connection to another server through packet rewriting
  – Transparent to the client!
Packet rewriting overhead

References

[1] A reference architecture for Web Server. Reverse Engineering, 2000. Proceedings. Seventh Working Conference on, 2000, pp. 150-159.

[2] Cardellini, V.; Colajanni, M.; Yu, P.S. Dynamic load balancing on Web-server systems. IEEE Internet Computing, Volume 3, Issue 3, May-June 1999, pp. 28-39.

[3] Hong, H.C.; Chen, Y.C. Design and practice of a dispatch server architecture. Distributed Computing Systems, 1999. Proceedings. 7th IEEE Workshop on Future Trends of, 1999, pp. 246-251.

[4] Mourad, A.; Huiqun Liu. Scalable Web server architectures. Computers and Communications, 1997. Proceedings. Second IEEE Symposium on, 1997, pp. 12-16.

[5] W. Richard Stevens. TCP/IP Illustrated, Volume 1. Addison Wesley.

[6] W. Richard Stevens. TCP/IP Illustrated, Volume 3. Addison Wesley.

[7] Netcraft. The Netcraft WWW server survey. Available at http://www.netcraft.co.uk/Survey
