DB2 Active-Active Clustering - Washington Area … Active-Active Clustering Dwaine R. Snow and Linda Snow IBM

DB2

Active-Active Clustering

Dwaine R. Snow and Linda Snow

IBM

IBM Software Group

What Do You Want From Your Solution?

1. Highly Available
2. Low Cost
3. Easy to Set Up/Administer
4. All Servers Active

Continuously Available


Traditional HA Clusters

• 2 Servers
  - Active - Passive
• If / when the primary server fails
  - Transactions move to the standby
  - Failover can be very fast

[Diagram: Application; failed primary server marked X]


Active / Active Clustering

• Fast, transparent failover, plus…
• Ability to add database and server capacity as the “workload” grows
  - Workload can be:
    - More data
    - More users
    - Both

Application


DB2 Active / Active Clustering – HA

Application(s)

Load Balancer

DB2a DB2b DB2n…

Local Servers - HA and Scalability


And Beyond

Application(s)

Load Balancer

DB2a DB2b DB2n… | DB2dr1 … DB2drn

Local Servers - HA and Scalability | Remote - For DR

HADR and Scale Out in one solution


GRIDSCALE at a High Level - Writes

Application(s)

Load Balancer


Local - For HA and Scalability | Remote - For DR (DB2drn …)

Write: Send Writes (i.e. insert/update/delete) to all DB2 servers


GRIDSCALE at a High Level - Writes

Application(s)

Load Balancer



DB2drn …

Complete: Return to the application as soon as the FIRST DB2 server completes

• A DB2 server completes processing the statement and returns to the load balancer
• Remaining servers complete and return to the load balancer
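
The write path on the two GRIDSCALE slides can be sketched roughly as follows. This is only an illustration of the "send to all, answer on the first completion" idea, not GRIDSCALE's actual implementation; `FakeServer`, `fan_out_write`, the latencies, and the SQL statement are all hypothetical names and values made up for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED, ALL_COMPLETED


class FakeServer:
    """Hypothetical stand-in for one DB2 server; latency simulates processing time."""

    def __init__(self, name, latency):
        self.name = name
        self.latency = latency
        self.applied = []  # statements this server has processed

    def execute(self, statement):
        time.sleep(self.latency)
        self.applied.append(statement)
        return self.name


def fan_out_write(pool, servers, statement):
    """Send one write to every server and report the first server to finish.

    Returns (first_server_name, pending_futures) so the caller can answer the
    application right away while the remaining servers finish in the background.
    """
    futures = [pool.submit(s.execute, statement) for s in servers]
    done, pending = wait(futures, return_when=FIRST_COMPLETED)
    return next(iter(done)).result(), pending


servers = [FakeServer("DB2a", 0.05), FakeServer("DB2b", 0.15), FakeServer("DB2n", 0.25)]
with ThreadPoolExecutor(max_workers=len(servers)) as pool:
    first, pending = fan_out_write(pool, servers, "INSERT INTO t VALUES (1)")
    print("application unblocked by", first)
    wait(pending, return_when=ALL_COMPLETED)  # remaining servers complete

print(all(s.applied == ["INSERT INTO t VALUES (1)"] for s in servers))
```

The application's response time is set by the fastest server, while every server still ends up applying the same write.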


apLive at a High Level - Writes

Application(s)

Load Balancer

DB2M DB2s1 DB2s2…DB2sr11


DB2srn …

Write: Send all Writes to the Master, as well as all transactions, even the selects within a transaction


apLive at a High Level - Writes

Application(s)

Load Balancer



DB2drn …

Complete: Return to the application as soon as the Master DB2 server completes

• The Master DB2 server completes processing the statement and returns to the load balancer
• Writes are sent to the slaves asynchronously
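
A minimal sketch of the master-then-asynchronous-slaves flow described above, assuming a simple in-memory queue as the replication channel. The `Master` class, its `log` queue, and the list-based slaves are hypothetical stand-ins chosen for the example; they do not represent how apLive actually ships writes.

```python
import queue
import threading


class Master:
    """Hypothetical master: applies writes locally, then queues them for slaves."""

    def __init__(self, slaves):
        self.rows = []
        self.slaves = slaves
        self.log = queue.Queue()  # writes waiting to be shipped to the slaves
        threading.Thread(target=self._replicate, daemon=True).start()

    def write(self, row):
        self.rows.append(row)  # master completes the statement
        self.log.put(row)      # replication happens later, off the request path
        return "ok"            # the application is answered here

    def _replicate(self):
        # Background loop: apply each queued write to every slave.
        while True:
            row = self.log.get()
            for slave in self.slaves:
                slave.append(row)
            self.log.task_done()


slave1, slave2 = [], []
master = Master([slave1, slave2])
status = master.write({"id": 1})
master.log.join()  # demo only: wait until the slaves have caught up
print(status, slave1 == master.rows, slave2 == master.rows)
```

Between `write` returning and the background thread draining the queue, a slave can be behind the master, which is the window the read slide warns about.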


The Solutions at a High Level - Reads

Application(s)

Load Balancer



DB2drn …

[Diagram: Read 1, Read 2 and Read 3 distributed across the DB2 servers]

Reads are sent to the DB2 server with the shortest queue.

GRIDSCALE ensures data read consistency by sending the query to a server where the data is up to date.

apLive does NOT guarantee data read consistency: data can be read from a slave where a write has not yet been applied.
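
The difference between the two read policies can be illustrated with a small sketch. The `Server` fields, the write sequence numbers, and the `require_current` flag are assumptions introduced for this example; neither product is known to expose them in this form.

```python
from dataclasses import dataclass


@dataclass
class Server:
    name: str
    queue_depth: int    # statements currently waiting on this server
    applied_write: int  # sequence number of the last write applied here


def route_read(servers, latest_write, require_current):
    """Pick the server with the shortest queue.

    With require_current=True only servers that have applied the latest write
    are candidates (the consistent behaviour described for GRIDSCALE); with
    False any server may be chosen (the behaviour described for apLive).
    """
    candidates = [s for s in servers
                  if not require_current or s.applied_write >= latest_write]
    return min(candidates, key=lambda s: s.queue_depth)


servers = [
    Server("DB2a", queue_depth=4, applied_write=10),
    Server("DB2b", queue_depth=1, applied_write=9),  # idle, but one write behind
    Server("DB2n", queue_depth=2, applied_write=10),
]
print(route_read(servers, latest_write=10, require_current=True).name)   # DB2n
print(route_read(servers, latest_write=10, require_current=False).name)  # DB2b
```

In the second call the least busy server wins even though it has not yet applied write 10, so the read can return stale data.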


Failure Handling

Application(s)

Load Balancer


No block re-mastering, crash recovery, etc.

DB2drn

[Diagram: Reads routed to the DB2 servers; one server marked failed (X); results Return to the application]

If the server was unavailable/unreachable, or it failed while processing the statement, that statement is moved to a surviving server transparently and without any intervention.
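
The failure handling described above amounts to retrying the statement on another server without involving the application. A rough sketch, assuming a hypothetical `ServerDown` exception and callable servers built by `make_server` (both invented for this example):

```python
class ServerDown(Exception):
    """Raised by the hypothetical servers below when they cannot answer."""


def make_server(name, healthy):
    """Build a callable that stands in for one DB2 server."""
    def execute(statement):
        if not healthy:
            raise ServerDown(name)
        return f"{name}: result of {statement}"
    return execute


def execute_with_failover(servers, statement):
    """Try each server in turn; the caller never sees an individual failure."""
    for execute in servers:
        try:
            return execute(statement)
        except ServerDown:
            continue  # move the statement to the next surviving server
    raise RuntimeError("no surviving DB2 server")


servers = [make_server("DB2a", healthy=False), make_server("DB2b", healthy=True)]
result = execute_with_failover(servers, "SELECT 1")
print(result)
```

Only when every server is down does the application see an error.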


Online, Rolling Upgrades – xkoto only

Application(s)

Load Balancer

DB2a DB2b DB2n… DB2dr1 DB2drn

1. Stop sending new work
2. Apply fixpack, or new version (e.g. V8 -> V9, FP3 -> FP6)
3. Add back into the cluster
4. Apply “missed” writes. Time to apply will depend on:
   - Time system was out of the cluster
   - Transaction rate
5. Once back in sync, start sending incoming SQL
6. Repeat for other servers as necessary

Mixed FPs / Versions supported
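
The rolling upgrade sequence can be simulated in a few lines. This is a toy model only: the `Node` class, the `receiving_sql` flag, and the fixed number of writes arriving during each upgrade are assumptions made for illustration, not product behaviour.

```python
class Node:
    """Hypothetical cluster member tracked by version and writes applied."""

    def __init__(self, name, version):
        self.name, self.version = name, version
        self.receiving_sql = True
        self.applied = 0  # number of writes applied on this node


def rolling_upgrade(nodes, new_version, writes_during_upgrade):
    """Upgrade one node at a time while the others keep taking writes."""
    total_writes = max(n.applied for n in nodes)
    for node in nodes:
        node.receiving_sql = False             # stop sending new work
        node.version = new_version             # apply fixpack or new version
        total_writes += writes_during_upgrade  # cluster keeps running meanwhile
        for other in nodes:
            if other.receiving_sql:
                other.applied = total_writes
        missed = total_writes - node.applied
        node.applied += missed                 # apply "missed" writes
        node.receiving_sql = True              # back in sync: resume incoming SQL
        print(f"{node.name}: now {node.version}, replayed {missed} writes")
    return nodes


nodes = rolling_upgrade(
    [Node("DB2a", "FP3"), Node("DB2b", "FP3"), Node("DB2c", "FP3")],
    new_version="FP6",
    writes_during_upgrade=100,
)
```

While one node is out, the cluster runs with mixed versions, and the replay backlog grows with both the time out of the cluster and the transaction rate.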


A Use Case

Application

DB2a DB2b DB2c

Load Balancer

Application

DB2x DB2y DB2z

Load Balancer

Philadelphia Phoenix

xkoto can have 2 load balancers

apLive does not support more than one master


DB2 Active / Active Clustering vs. Shared Disk

Application(s)

Load Balancer

DB2a DB2b DB2n

Application(s)

nd1 nd2 nd3

SPOF

Load Balancer

SPOF?

Can add a 2nd inexpensive load balancer

Can add a 2nd switch, but costly


DB2 Active / Active Clustering vs. Shared Disk for HA and DR

Application(s)

Load Balancer

DB2a DB2b DB2n

Application(s)

nd1 nd2 nd3

Load Balancer

Local Remote

Application(s)

nd1 nd2 nd3

Local Remote

Data Guard

X2 X2 ?


DB2 Active / Active Clustering vs. Shared Disk

Application(s)

DB2a DB2b DB2n

Application(s)

nd1 nd2 nd3

Expensive Switch and Storage

Load Balancer

Low Cost Storage

SAME


Active-Active Clustering Provides

• Access to all DB servers
  - Scalability, instantaneous failover for HA
  - Ability to add predictable capacity on demand by adding servers
• DB2’s Active-Active Clustering adds
  - Very Good Scalability
  - Distance Clustering = Continuous Availability
  - Transparent failover between local and remote servers


Protection Against

• Server Failure AND Disk Failure
• With:
  - Low cost, commodity servers and storage
• Without:
  - The need for high cost shared storage and/or switches


Recommended Starting Configuration

Application(s)

DB2a DB2b DB2c

Load Balancer

• 3 DB2 Servers
• In a worst case scenario,
  - If a server needs to be rebuilt
  - Can clone the database from one of the surviving servers
  - And still have the database up and running
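
The reasoning behind starting with three servers can be checked with simple arithmetic. The sketch below assumes, as a worst case that the slide does not state explicitly, that the server used as the clone source is fully occupied while it is being copied; `servers_still_serving` is a hypothetical helper written for this example.

```python
def servers_still_serving(total, being_rebuilt=1, clone_sources=1):
    """Servers left for application work while one server is rebuilt from a clone.

    Assumes (not stated on the slide) that a clone source is fully occupied
    while it is being copied.
    """
    return max(total - being_rebuilt - clone_sources, 0)


for total in (2, 3):
    print(total, "servers ->", servers_still_serving(total), "still serving")
```

Under that assumption, two servers leave nothing to serve the application during a rebuild, while three leave one.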

Load Balancer


Summary

• DB Server failure is completely transparent
• All patches can be online, and rolling!
  - Even version to version upgrades
  - Best to have at least 3 DB2 servers to support this
• Can build an HA, Scale out, and DR solution in one
• No shared disk and/or switch as single points of failure!
