8 Distributed Systems (page 514)
Uwe R. Zimmer - The Australian National University
Systems, Networks & Concurrency 2020

References for this chapter (page 515)
[Bacon1998] Bacon, J. Concurrent Systems. Addison Wesley Longman Ltd, 2nd edition, 1998.
[Ben2006] Ben-Ari, M. Principles of Concurrent and Distributed Programming. Prentice-Hall, second edition, 2006.
[Schneider1990] Schneider, Fred. Implementing fault-tolerant services using the state machine approach: a tutorial. ACM Computing Surveys, 1990, vol. 22 (4), pp. 299-319.
[Tanenbaum2001] Tanenbaum, Andrew. Distributed Systems: Principles and Paradigms. Prentice Hall, 2001.
[Tanenbaum2003] Tanenbaum, Andrew. Computer Networks. Prentice Hall, 2003.
Distributed Systems © 2020 Uwe R. Zimmer, The Australian National University page 515 of 758 (chapter 8: “Distributed Systems” up to page 641)

Network protocols & standards

OSI network reference model (page 516)
Standardized as the Open Systems Interconnection (OSI) reference model by the International Organization for Standardization (ISO) in 1977
• 7 layer architecture
• Connection oriented
Hardly implemented anywhere in full …
… but its concepts and terminology are widely used when describing existing and designing new protocols …

OSI Network Layers (page 517)
[Figure: two seven-layer stacks (Application, Presentation, Session, Transport, Network, Data link, Physical) with a three-layer stack (Network, Data link, Physical) between them; "User data" is marked at both seven-layer stacks. The same figure accompanies each of the layer slides that follow.]

1: Physical Layer (page 518)
• Service: Transmission of a raw bit stream over a communication channel
• Functions: Conversion of bits into electrical or optical signals
• Examples: X.21, Ethernet (cable, detectors & amplifiers)

2: Data Link Layer (page 519)
• Service: Reliable transfer of frames over a link
• Functions: Synchronization, error correction, flow control
• Examples: HDLC (high level data link control protocol), LAP-B (link access procedure, balanced), LAP-D (link access procedure, D-channel), LLC (link level control), …

3: Network Layer (page 520)
• Service: Transfer of packets inside the network
• Functions: Routing, addressing, switching, congestion control
• Examples: IP, X.25

4: Transport Layer (page 521)
• Service: Transfer of data between hosts
• Functions: Connection establishment, management, termination, flow-control, multiplexing, error detection
• Examples: TCP, UDP, ISO TP0-TP4

5: Session Layer (page 522)
• Service: Coordination of the dialogue between application programs
• Functions: Session establishment, management, termination
• Examples: RPC

6: Presentation Layer (page 523)
• Service: Provision of platform independent coding and encryption
• Functions: Code conversion, encryption, virtual devices
• Examples: ISO code conversion, PGP encryption

7: Application Layer (page 524)
• Service: Network access for application programs
• Functions: Application/OS specific
• Examples: APIs for mail, ftp, ssh, scp, discovery protocols …

Serial Peripheral Interface (SPI) (page 525)
Used by gazillions of devices … and it’s not even a formal standard!
Speed only limited by what both sides can survive.
Usually push-pull drivers, i.e. fast and reliable, yet not friendly to wrong wiring/programming.
[Images: 1.8” color TFT LCD display from Adafruit; SanDisk marketing photo]

Serial Peripheral Interface (SPI) (page 526)
Full Duplex, 4-wire, flexible clock rate
[Figure: a Master (receive shift register, transmit shift register, clock generator, slave selector) connected to a Slave (receive shift register, transmit shift register) by four lines: MISO, MOSI, SCK, and NSS on the master side to CS on the slave side]

Serial Peripheral Interface (SPI) (page 527)
[Figure: timing diagram of MISO, MOSI, SCK and CS over time, with eight alternating "Set" and "Sample" clock edges, shown above the master/slave wiring diagram]
Clock phase and polarity need to be agreed upon

Serial Peripheral Interface (SPI) (page 528)
[Figure: SPI block diagram from the STM32L4x6 advanced ARM®-based 32-bit MCUs reference manual: Figure 420 on page 1291]
1 shift register? FIFOs? Data connected to an internal bus? CRC? DMA? Speed?

Serial Peripheral Interface (SPI) (page 529)
[Figure: one Master whose slave selector drives the select lines S1, S2 and S3 to the CS inputs of Slave 1, Slave 2 and Slave 3; MISO, MOSI and SCK are shared by all slaves]
Full duplex with 1 out of x slaves
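The SPI slides show a transmit and a receive shift register on each side, clocked by the master. A minimal Python sketch of one full-duplex transfer follows. The function name `spi_exchange`, the MSB-first bit order and the 8-bit word size are assumptions made for illustration; real devices make bit order, word size and clock mode configurable.

```python
def spi_exchange(master_tx: int, slave_tx: int, bits: int = 8):
    """Simulate one full-duplex SPI transfer, MSB first (assumed bit order).

    Returns (master_rx, slave_rx): what each side has shifted in
    after `bits` clock cycles.
    """
    mask = (1 << bits) - 1
    master_rx = 0
    slave_rx = 0
    for _ in range(bits):
        # "Set" edge: each side puts the top bit of its transmit register on its line.
        mosi = (master_tx >> (bits - 1)) & 1
        miso = (slave_tx >> (bits - 1)) & 1
        # "Sample" edge: each side shifts the other side's bit into its receive register.
        slave_rx = ((slave_rx << 1) | mosi) & mask
        master_rx = ((master_rx << 1) | miso) & mask
        # Advance both transmit registers to the next bit.
        master_tx = (master_tx << 1) & mask
        slave_tx = (slave_tx << 1) & mask
    return master_rx, slave_rx


if __name__ == "__main__":
    m_rx, s_rx = spi_exchange(0xA5, 0x3C)
    print(hex(m_rx), hex(s_rx))  # 0x3c 0xa5
```

After eight clock cycles the two sides have swapped their bytes, which is why an SPI master always receives something, even when it only intends to send.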

4: Transport Layer



Fibre Channel (page 542)
Mapping of Fibre Channel to OSI layers:
• FC-4: Protocol mapping
• FC-3: Common service
• FC-2: Network
• FC-1: Data link
• FC-0: Physical
[Figure: the OSI stack (Application, Presentation, Session, Transport, Network, Data link, Physical) drawn next to the TCP/IP, FC/IP and Fibre Channel stacks]

Ethernet / IEEE 802.11 (page 538)
Wireless local area network (WLAN) developed in the 90’s
• First standard as IEEE 802.11 in 1997 (1-2 Mbps over 2.4 GHz).
• Typical usage at 54 Mbps over 2.4 GHz carrier at 20 MHz bandwidth.
• Current standards up to 780 Mbps (802.11ac) over 5 GHz carrier at 160 MHz bandwidth.
• Future standards are designed for up to 100 Gbps over 60 GHz carrier.
• Direct relation to IEEE 802.3 and similar OSI layer association.
Carrier Sense Multiple Access with Collision Avoidance (CSMA/CA)
Direct-Sequence Spread Spectrum (DSSS)

AppleTalk (page 534)
[Figure: the AppleTalk protocol suite drawn against the OSI stack (Application, Presentation, Session, Transport, Network, Data link, Physical), next to an "AppleTalk over IP" column (IP, Network, Physical). Protocols shown in the figure:
• AppleTalk Filing Protocol (AFP)
• AT Data Stream Protocol, AT Session Protocol, Zone Info Protocol, Printer Access Protocol
• Routing Table Maintenance Prot., AT Update Based Routing Protocol, AT Transaction Protocol, Name Binding Protocol, AT Echo Protocol
• Datagram Delivery Protocol (DDP)
• AppleTalk Address Resolution Protocol (AARP)
• EtherTalk Link Access Protocol, LocalTalk Link Access Protocol, TokenTalk Link Access Protocol, FDDITalk Link Access Protocol
• IEEE 802.3, LocalTalk, Token Ring IEEE 802.5, FDDI]

Serial Peripheral Interface (SPI) (page 530)
[Figure: one Master (receive shift register, transmit shift register, clock generator, slave selector with select lines S1, S2, S3) connected via MISO, MOSI and SCK to Slave 1, Slave 2 and Slave 3, each with its own CS input]
Concurrent simplex with y out of x slaves

InfiniBand (page 543)
• Developed in the late 90’s
• Defined by the InfiniBand Trade Association (IBTA) since 1999.
• Current standards allow for 25 Gbps per link.
• Switched fabric topologies.
• Concurrent data links possible (commonly up to 12 → 300 Gbps).
• Defines only the data-link layer and parts of the network layer.
• Existing devices use copper cables (instead of optical fibres).
Mostly used in super-computers and clusters but applicable to storage arrays as well.
Cheaper than Ethernet or FibreChannel at high data-rates.
Small packets (only up to 4 kB) and no session control.

Bluetooth (page 539)
Wireless local area network (WLAN) developed in the 90’s with different features than 802.11:
• Lower power consumption.
• Shorter ranges.
• Lower data rates (typically < 1 Mbps).
• Ad-hoc networking (no infrastructure required).
Combinations of 802.11 and Bluetooth OSI layers are possible to achieve the required features set.

Ethernet / IEEE 802.3 (page 535)
Local area network (LAN) developed by Xerox in the 70’s
• 10 Mbps specification 1.0 by DEC, Intel, & Xerox in 1980.
• First standard as IEEE 802.3 in 1983 (10 Mbps over thick co-ax cables).
• Currently 1 Gbps (802.3ab) copper cable ports used in most desktops and laptops.
• Currently standards up to 100 Gbps (IEEE 802.3ba 2010).
• More than 85 % of current LAN lines worldwide (according to the International Data Corporation (IDC)).
Carrier Sense Multiple Access with Collision Detection (CSMA/CD)
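The slide names CSMA/CD without going into the retransmission rule. As a hedged illustration, classic half-duplex Ethernet resolves collisions with truncated binary exponential backoff, which can be sketched as below. The function name `backoff_slots` is made up for this example, and the constants are the commonly quoted values for 10/100 Mbps half-duplex Ethernet rather than anything stated on the slide.

```python
import random

# Commonly quoted parameters of classic half-duplex Ethernet (assumed here).
SLOT_TIME_BITS = 512   # one slot time, measured in bit times
BACKOFF_LIMIT = 10     # the exponent stops growing after 10 collisions
ATTEMPT_LIMIT = 16     # the frame is given up after 16 failed attempts


def backoff_slots(collisions: int, rng: random.Random) -> int:
    """Number of slot times to wait after the given number of collisions.

    Truncated binary exponential backoff: pick a uniform random integer
    in [0, 2**k - 1] with k = min(collisions, BACKOFF_LIMIT).
    """
    if collisions < 1:
        raise ValueError("backoff only applies after at least one collision")
    if collisions >= ATTEMPT_LIMIT:
        raise RuntimeError("excessive collisions: frame is dropped")
    k = min(collisions, BACKOFF_LIMIT)
    return rng.randrange(2 ** k)


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for n in (1, 2, 3, 10, 15):
        slots = backoff_slots(n, rng)
        print(f"collision {n}: wait {slots} slots = {slots * SLOT_TIME_BITS} bit times")
```

Because the waiting time is random, the delay before a frame gets through has no hard upper bound short of giving up, so plain CSMA/CD is not deterministic in its timing.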

Serial Peripheral Interface (SPI) (page 531)
[Figure: one Master (receive shift register, transmit shift register, clock generator, slave selector with a single NSS line) and Slave 1, Slave 2 and Slave 3 connected in a chain through their MISO and MOSI pins; SCK and CS are shared by all slaves]
Concurrent daisy chaining with all slaves
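The daisy-chain arrangement can be pictured as one long shift register spread over all devices. The sketch below makes simplifying assumptions that are not on the slide: each slave is modelled as a single 8-bit shift register whose outgoing bit feeds the next device, bits travel MSB first, and the name `daisy_chain_transfer` is invented for this example.

```python
def daisy_chain_transfer(master_bytes, slave_regs, bits=8):
    """Clock `master_bytes` through a chain of slave shift registers.

    Returns (received, regs): the words the master shifted in on MISO,
    and the final content of each slave register (first slave first).
    """
    regs = list(slave_regs)
    mask = (1 << bits) - 1
    received = []
    for word in master_bytes:
        rx = 0
        for i in range(bits - 1, -1, -1):
            carry = (word >> i) & 1                    # bit the master drives on MOSI
            for k in range(len(regs)):
                out = (regs[k] >> (bits - 1)) & 1      # top bit leaves this slave
                regs[k] = ((regs[k] << 1) | carry) & mask
                carry = out                            # and enters the next device
            rx = (rx << 1) | carry                     # last slave's output is MISO
        received.append(rx)
    return received, regs


if __name__ == "__main__":
    rx, regs = daisy_chain_transfer([0x11, 0x22, 0x33], [0xAA, 0xBB, 0xCC])
    print([hex(b) for b in rx])    # ['0xcc', '0xbb', '0xaa']
    print([hex(r) for r in regs])  # ['0x33', '0x22', '0x11']
```

With three slaves the master needs three word times to reach all of them: the first word it sends ends up in the last slave, and the slaves' previous contents arrive at the master in reverse chain order.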

Distributed Systems

Distribution! (page 544)
Motivation
Possibly …
… fits an existing physical distribution (e-mail system, devices in a large craft, …).
… high performance due to potentially high degree of parallel processing.
… high reliability/integrity due to redundancy of hardware and software.
… scalable.
… integration of heterogeneous devices.
Different specifications will lead to substantially different distributed designs.

Token Ring / IEEE 802.5 / Fibre Distributed Data Interface (FDDI) (page 540)
• “Token Ring” developed by IBM in the 70’s
• IEEE 802.5 standard is modelled after the IBM Token Ring architecture (specifications are slightly different, but basically compatible)
• IBM Token Ring requires a star topology as well as twisted pair cables, while IEEE 802.5 is unspecified in topology and medium
• Fibre Distributed Data Interface combines a token ring architecture with a dual-ring, fibre-optical, physical network.
Unlike CSMA/CD, Token Ring is deterministic (with respect to its timing behaviour).
FDDI is deterministic and failure resistant.
None of the above is currently used in performance oriented applications.
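To make the "deterministic" claim concrete, here is a small sketch of why token passing gives a bounded waiting time. It rests on assumptions that go beyond the slide: every station may hold the token for at most `max_hold` time units, passing the token to the next station takes `hop_delay`, and both function names are hypothetical.

```python
def worst_case_token_wait(stations: int, max_hold: float, hop_delay: float) -> float:
    """Upper bound on the wait of a station that has just released the token.

    The token must visit the other (stations - 1) stations, each holding it
    for at most max_hold, and travel over all `stations` hops of the ring.
    """
    return (stations - 1) * max_hold + stations * hop_delay


def token_return_time(hold_times, hop_delay: float) -> float:
    """Time from station 0 releasing the token until it gets the token back.

    hold_times[i] is how long station i actually keeps the token this round;
    hold_times[0] is ignored because station 0 has just released it.
    """
    t = 0.0
    for hold in hold_times[1:]:
        t += hop_delay + hold   # token travels one hop, then is held
    return t + hop_delay        # final hop back to station 0


if __name__ == "__main__":
    bound = worst_case_token_wait(stations=4, max_hold=2.0, hop_delay=0.5)
    busy = token_return_time([2.0, 2.0, 2.0, 2.0], hop_delay=0.5)
    light = token_return_time([0.0, 1.0, 0.0, 2.0], hop_delay=0.5)
    print(bound, busy, light)  # 8.0 8.0 5.0
```

However the load is distributed, the observed return time never exceeds the bound, whereas a CSMA/CD station can in principle keep colliding.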

Ethernet / IEEE 802.3 (page 536)
OSI relation: PHY, MAC, MAC-client
[Figure: the OSI reference model (Application, Presentation, Session, Transport, Network, Data link, Physical) next to the IEEE 802.3 reference model (Upper-layer protocols, MAC-client, Media Access (MAC), Physical (PHY)), annotated "IEEE 802-specific", "IEEE 802.3-specific" and "Media-specific"]

Network protocols & standards

[Figure: OSI layer stack (Application, Presentation, Session, Transport, Network, Data link, Physical) compared with the TCP/IP stack (Application, Transport, IP, Network, Physical).]

Distributed Systems

What can be distributed?

• State: common operations on distributed data

• Function: distributed operations on central data

• State & Function: client/server clusters

• none of those: pure replication, redundancy

Network protocols & standards

Fibre Channel

• Developed in the late 80’s.

• ANSI standard since 1994.

• Current standards allow for 16 Gbps per link.

• Allows for three different topologies:

  Point-to-point: 2 addresses

  Arbitrated loop (similar to token ring): 127 addresses; deterministic, real-time capable

  Switched fabric: $2^{24}$ addresses, many topologies and concurrent data links possible

• Defines OSI equivalent layers up to the session level.

Mostly used in storage arrays, but applicable to super-computers and high integrity systems as well.

Network protocols & standards

Ethernet / IEEE 802.3

OSI relation: PHY, MAC, MAC-client

[Figure: OSI network layers next to two 802.3 stacks connected by a link. Each stack: MAC Client; 802.3 MAC; physical medium-independent layer; MII; physical medium-dependent layers (PHY); MDI. The PHY determines link media, signal encoding, and transmission rate.]

MII = Medium-independent interface

MDI = Medium-dependent interface - the link connector

Network protocols & standards

[Figure: OSI layers (Application, Presentation, Session, Transport, Network, Data link, Physical) compared with the TCP/IP stack and the AppleTalk protocol suite. AppleTalk protocols shown: AppleTalk Filing Protocol (AFP), Routing Table Maintenance Prot., AT Update Based Routing Protocol, AT Transaction Protocol, Name Binding Prot., AT Echo Protocol, AT Data Stream Protocol, AT Session Protocol, Zone Info Protocol, Printer Access Protocol, Datagram Delivery Protocol (DDP), AppleTalk Address Resolution Protocol (AARP), EtherTalk / LocalTalk / TokenTalk / FDDITalk Link Access Protocols, on top of IEEE 802.3, LocalTalk, Token Ring IEEE 802.5 and FDDI.]

Page 3: 4: Transport Layer

Distributed Systems

Virtual (logical) time

$a \rightarrow b \Rightarrow C(a) < C(b)$

Implications:

$C(a) < C(b) \Rightarrow \neg(b \rightarrow a) = (a \rightarrow b) \lor (a \parallel b)$

$C(a) = C(b) \Rightarrow a \parallel b = \neg(a \rightarrow b) \land \neg(b \rightarrow a)$

$C(a) = C(b) < C(c) \Rightarrow \neg(c \rightarrow a)$

$C(a) < C(b) < C(c) \Rightarrow \neg(c \rightarrow a)$

Distributed Systems

Virtual (logical) time [Lamport 1978]

$a \rightarrow b \Rightarrow C(a) < C(b)$

with $a \rightarrow b$ being a causal relation between $a$ and $b$, and $C(a)$, $C(b)$ being the (virtual) times associated with $a$ and $b$.

$a \rightarrow b$ iff:

• $a$ happens earlier than $b$ in the same sequential control-flow, or

• $a$ denotes the sending event of message $m$, while $b$ denotes the receiving event of the same message $m$, or

• there is a transitive causal relation between $a$ and $b$: $a \rightarrow e_1 \rightarrow \ldots \rightarrow e_n \rightarrow b$

Notion of concurrency:

$a \parallel b \Rightarrow \neg(a \rightarrow b) \land \neg(b \rightarrow a)$
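The three rules defining the causal relation can be sketched as a reachability test over directly related events. This is a minimal illustration only: the event names (`a1`, `b2`, ...) and the helper functions `happened_before` and `concurrent` are assumptions made for this example and do not appear in the slides.

```python
def happened_before(edges, a, b):
    """True if a -> b follows from the direct causal edges by transitivity.

    edges: list of (x, y) pairs meaning "x directly precedes y", either in
    the same control flow or as the send/receive events of one message.
    """
    stack, seen = [a], set()
    while stack:
        event = stack.pop()
        for x, y in edges:
            if x == event and y not in seen:
                if y == b:
                    return True
                seen.add(y)
                stack.append(y)
    return False


def concurrent(edges, a, b):
    """a || b: neither a -> b nor b -> a."""
    return (a != b
            and not happened_before(edges, a, b)
            and not happened_before(edges, b, a))


# Hypothetical scenario: process A runs a1, a2; process B runs b1, b2, b3;
# a1 is the sending event and b2 the receiving event of one message.
edges = [("a1", "a2"), ("b1", "b2"), ("b2", "b3"), ("a1", "b2")]

a1_before_b3 = happened_before(edges, "a1", "b3")   # transitive: a1 -> b2 -> b3
a2_conc_b1 = concurrent(edges, "a2", "b1")          # no causal path either way
```

Storing only the direct edges keeps the example small; the transitive rule is then simply graph reachability.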

Distributed Systems

Synchronize a ‘real-time’ clock (bi-directional)

Resetting the clock drift by regular reference time re-synchronization:

Maximal clock drift $\delta$ defined as:

$(1+\delta)^{-1} \le \frac{C(t_2) - C(t_1)}{t_2 - t_1} \le (1+\delta)$

The ‘real-time’ clock is adjusted forwards & backwards: calendar time.

[Figure: measured time C over real time t, showing the ideal clock, the real clock, and three synchronization points (sync.) with their reference times (ref. time).]

Distributed Systems

Common design criteria

• Achieve de-coupling / high degree of local autonomy

• Cooperation rather than central control

• Consider reliability

• Consider scalability

• Consider performance

Distributed Systems

Virtual (logical) time

$a \rightarrow b \Rightarrow C(a) < C(b)$

Implications:

$C(a) < C(b) \Rightarrow \neg(b \rightarrow a) = (a \rightarrow b) \lor (a \parallel b)$

$C(a) = C(b) \Rightarrow a \parallel b = \neg(a \rightarrow b) \land \neg(b \rightarrow a)$

$C(a) = C(b) < C(c) \Rightarrow \neg(c \rightarrow a) = (a \rightarrow c) \lor (a \parallel c)$

$C(a) < C(b) < C(c) \Rightarrow \neg(c \rightarrow a) = (a \rightarrow c) \lor (a \parallel c)$

Distributed Systems

Virtual (logical) time

$a \rightarrow b \Rightarrow C(a) < C(b)$

Implications:

$C(a) < C(b) \Rightarrow\ ?$

$C(a) = C(b) \Rightarrow\ ?$

$C(a) = C(b) < C(c) \Rightarrow\ ?$

$C(a) < C(b) < C(c) \Rightarrow\ ?$

Distributed Systems

Synchronize a ‘real-time’ clock (forward only)

Resetting the clock drift by regular reference time re-synchronization:

Maximal clock drift $\delta$ defined as:

$(1+\delta)^{-1} \le \frac{C(t_2) - C(t_1)}{t_2 - t_1} \le 1$

The ‘real-time’ clock is adjusted forwards only: monotonic time.

[Figure: measured time C over real time t, showing the ideal clock and three synchronization points (sync.) with their reference times (ref. time).]

Distributed Systems

Some common phenomena in distributed systems

1. Unpredictable delays (communication): Are we done yet?

2. Missing or imprecise time-base: Causal relation or temporal relation?

3. Partial failures: Likelihood of individual failures increases. Likelihood of complete failure decreases (in case of a good design).

Distributed Systems

Virtual (logical) time

Time as derived from causal relations:

[Figure: time lines of processes P1, P2 and P3 (time axis 0 to 45) with a logical time stamp at each event and messages exchanged between the processes.]

Events in concurrent control flows are not ordered.

No global order of time.

Distributed Systems

Virtual (logical) time

$a \rightarrow b \Rightarrow C(a) < C(b)$

Implications:

$C(a) < C(b) \Rightarrow \neg(b \rightarrow a)$

$C(a) = C(b) \Rightarrow a \parallel b$

$C(a) = C(b) < C(c) \Rightarrow\ ?$

$C(a) < C(b) < C(c) \Rightarrow\ ?$

Distributed Systems

Distributed critical regions with synchronized clocks

• ∀ times:

  ∀ received Requests: Add to local RequestQueue (ordered by time).

  ∀ received Release messages: Delete corresponding Requests in local RequestQueue.

1. Create OwnRequest and attach current time-stamp. Add OwnRequest to local RequestQueue (ordered by time). Send OwnRequest to all processes.

2. Delay by $2L$ ($L$ being the time it takes for a message to reach all network nodes).

3. While Top (RequestQueue) ≠ OwnRequest: delay until new message.

4. Enter and leave critical region.

5. Send Release-message to all processes.

Distributed Systems

Time in distributed systems

Two alternative strategies:

• Based on a shared time: Synchronize clocks!

• Based on sequence of events: Create a virtual time!

Distributed Systems

Implementing a virtual (logical) time

1. $\forall P_i$: $C_i = 0$

2. $\forall P_i$:

  ∀ local events: $C_i = C_i + 1$;

  ∀ send events: $C_i = C_i + 1$; Send (message, $C_i$);

  ∀ receive events: Receive (message, $C_m$); $C_i = \max(C_i, C_m) + 1$;
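The two rules above can be sketched in a few lines of Python. The class name `LamportClock` and its method names are assumptions made for this illustration; only the update rules themselves are taken from the slide.

```python
class LamportClock:
    """Logical clock C_i of a single process P_i (illustrative sketch)."""

    def __init__(self):
        self.time = 0                      # rule 1: C_i = 0

    def local_event(self):
        self.time += 1                     # local event: C_i = C_i + 1
        return self.time

    def send(self, message):
        self.time += 1                     # send event: C_i = C_i + 1
        return (message, self.time)        # the message carries C_i

    def receive(self, stamped_message):
        message, c_m = stamped_message
        self.time = max(self.time, c_m) + 1   # C_i = max(C_i, C_m) + 1
        return message


p1, p2 = LamportClock(), LamportClock()
p1.local_event()                 # C_1 = 1
stamped = p1.send("hello")       # C_1 = 2, the message carries time stamp 2
p2.receive(stamped)              # C_2 = max(0, 2) + 1 = 3
```

The receive rule guarantees that the receiving event always carries a larger time stamp than the sending event, which is exactly the condition $a \rightarrow b \Rightarrow C(a) < C(b)$ for messages.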

Distributed Systems

Virtual (logical) time

$a \rightarrow b \Rightarrow C(a) < C(b)$

Implications:

$C(a) < C(b) \Rightarrow \neg(b \rightarrow a) = (a \rightarrow b) \lor (a \parallel b)$

$C(a) = C(b) \Rightarrow a \parallel b = \neg(a \rightarrow b) \land \neg(b \rightarrow a)$

$C(a) = C(b) < C(c) \Rightarrow\ ?$

$C(a) < C(b) < C(c) \Rightarrow\ ?$

Distributed Systems

Distributed critical regions with synchronized clocks

Analysis

• No deadlock, no individual starvation, no livelock.

• Minimal request delay: $2L$.

• Minimal release delay: $L$.

• Communications requirements per request: $2(N-1)$ messages (can be significantly improved by employing broadcast mechanisms).

• Clock drifts affect fairness, but not integrity of the critical region.

Assumptions:

• $L$ is known and constant; a violation leads to loss of mutual exclusion.

• No messages are lost; a violation leads to loss of mutual exclusion.

Distributed Systems

‘Real-time’ clocks are:

• discrete – i.e. time is not dense and there is a minimal granularity

• drift affected:

Maximal clock drift $\delta$ defined as:

$(1+\delta)^{-1} \le \frac{C(t_2) - C(t_1)}{t_2 - t_1} \le (1+\delta)$

often specified as PPM (Parts-Per-Million) (typically around 20 PPM in computer applications)

[Figure: measured time C over real time t, showing the ideal clock (slope 1) and a real clock whose slope deviates by up to $\delta$ above and $1-(1+\delta)^{-1}$ below.]
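As a worked example of the drift bound: over an interval $t_2 - t_1$ the measured time can deviate from real time by at most $\delta \cdot (t_2 - t_1)$ in either direction. The sketch below turns this into numbers; the function names and the 10 ms error budget are assumptions chosen for illustration, and the 20 PPM value is the typical figure quoted above.

```python
def max_offset_seconds(drift_ppm, interval_seconds):
    """Upper bound on |C - t| accumulated over an interval without re-sync."""
    delta = drift_ppm * 1e-6               # PPM -> dimensionless drift
    return delta * interval_seconds


def resync_interval_seconds(drift_ppm, max_error_seconds):
    """Longest interval between re-synchronizations that keeps the error bounded."""
    delta = drift_ppm * 1e-6
    return max_error_seconds / delta


SECONDS_PER_DAY = 24 * 60 * 60
offset_per_day = max_offset_seconds(20, SECONDS_PER_DAY)   # about 1.7 s per day
interval = resync_interval_seconds(20, 0.010)              # about 500 s for 10 ms
```

So a 20 PPM clock may gain or lose roughly 1.7 seconds per day, and keeping it within 10 ms of the reference would need a re-synchronization about every 500 seconds.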

Page 4: 4: Transport Layer

Distributed Systems

Distributed states

Running the snapshot algorithm:

[Figure: time lines of the observer P0 and processes P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; P0 starts the snapshot.]

• Observer-process P0 (any process) creates a snapshot token ts and saves its local state s0.

• P0 sends ts to all other processes.


Distributed Systems

Distributed states

A consistent global state (snapshot) is defined by a unique division into:

• “The Past” $P$ (events before the snapshot): $(e_2 \in P) \land (e_1 \rightarrow e_2) \Rightarrow e_1 \in P$

• “The Future” $F$ (events after the snapshot): $(e_1 \in F) \land (e_1 \rightarrow e_2) \Rightarrow e_2 \in F$
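The condition on “The Past” can be checked mechanically: for every causal edge $e_1 \rightarrow e_2$ whose target lies in the past, the source must lie in the past as well. The following is a small sketch; the function name `is_consistent_cut` and the event names are assumptions made for this example.

```python
def is_consistent_cut(past, causal_edges):
    """Check (e2 in P) and (e1 -> e2)  =>  e1 in P for all direct causal edges.

    Checking the direct edges is sufficient: the transitive cases follow
    step by step along each causal chain.
    """
    return all(e1 in past for e1, e2 in causal_edges if e2 in past)


# Hypothetical events: process A runs a1, a2; process B runs b1, b2;
# a2 sends a message which is received at b2.
edges = [("a1", "a2"), ("b1", "b2"), ("a2", "b2")]

good_cut = is_consistent_cut({"a1", "a2", "b1"}, edges)
bad_cut = is_consistent_cut({"a1", "b1", "b2"}, edges)   # b2 received from the future
```

In the second cut, the receiving event `b2` is in the past while its sending event `a2` is not, so no division into past and future exists.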


Distributed Systems

Electing a central coordinator (the Bully algorithm)

Any process P which notices that the central coordinator is gone, performs:

1. P sends an Election-message to all processes with higher process numbers.

2. P waits for response messages. If no one responds after a pre-defined amount of time: P declares itself the new coordinator and sends out a Coordinator-message to all.

If any process responds, then the election activity for P is over and P waits for a Coordinator-message.

All processes Pi perform at all times:

• If Pi receives an Election-message from a process with a lower process number, it responds to the originating process and starts an election process itself (if not running already).
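The outcome of an election can be sketched as follows. This is a strongly simplified model and not a network implementation: the set `alive` stands in for "responds before the timeout", and the function name `bully_election` is an assumption made for this example.

```python
def bully_election(initiator, alive):
    """Return the process number of the coordinator elected from `alive`.

    initiator: process number of the process which noticed the failure.
    alive:     set of process numbers which still respond to messages.
    """
    # Step 1: Election-messages go to all processes with higher numbers.
    responders = [p for p in alive if p > initiator]
    if not responders:
        # Step 2: nobody responded in time, so the initiator declares
        # itself the new coordinator.
        return initiator
    # Every responder starts its own election; following any one of them
    # (here: the lowest) leads to the same final coordinator.
    return bully_election(min(responders), alive)


# Hypothetical system with processes 1..6 in which coordinator 6 and
# process 4 have failed; process 2 notices and starts the election.
alive = {1, 2, 3, 5}
new_coordinator = bully_election(2, alive)
```

The highest-numbered process which is still alive always wins, which is where the algorithm gets its name.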


Distributed Systems

Distributed critical regions with logical clocks

• ∀ times: ∀ received Requests: Add to local RequestQueue (ordered by time). Reply with Acknowledge or OwnRequest.

• ∀ times: ∀ received Release messages: Delete corresponding Requests in local RequestQueue.

1. Create OwnRequest and attach current time-stamp. Add OwnRequest to local RequestQueue (ordered by time). Send OwnRequest to all processes.

2. Wait for Top (RequestQueue) = OwnRequest & no outstanding replies.

3. Enter and leave critical region.

4. Send Release-message to all processes.

Distributed Systems

Distributed states

Running the snapshot algorithm:

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; the snapshot token reaches the other processes.]

• ∀ Pi which receive ts (as an individual token-message, or as part of another message):

  • Save local state si and send si to P0.

  • Attach ts to all further messages, which are to be sent to other processes.

  • Save ts and ignore all further incoming ts’s.

Distributed Systems

Distributed states

How to read the current state of a distributed system?

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; the collected local states divide the events into past and future.]

Instead: some entity probes and collects local states. What state of the global system has been accumulated?

Sorting the events into past and future events.

Distributed Systems

Distributed states

How to read the current state of a distributed system?

[Figure: time lines of P1, P2, P3 (time axis 0 to 45) with logical time stamps and the messages exchanged between the processes.]

This “god’s eye view” does in fact not exist.


Distributed Systems

Distributed critical regions with logical clocks

Analysis

• No deadlock, no individual starvation, no livelock.

• Minimal request delay: $N-1$ requests (1 broadcast) + $N-1$ replies.

• Minimal release delay: $N-1$ release messages (or 1 broadcast).

• Communications requirements per request: $3(N-1)$ messages (or $N-1$ messages + 2 broadcasts).

• Clocks are kept recent by the exchanged messages themselves.

Assumptions:

• No messages are lost; a violation leads to a stall.
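The message count can be broken down per phase. The function below is a small arithmetic sketch (its name and the dictionary keys are assumptions made for this example): one entry into the critical region costs $N-1$ requests, $N-1$ replies and $N-1$ release messages when no broadcast mechanism is available.

```python
def messages_per_critical_region(n_processes):
    """Point-to-point messages needed for one entry with logical clocks."""
    others = n_processes - 1
    return {
        "requests": others,     # OwnRequest sent to all other processes
        "replies": others,      # one Acknowledge (or OwnRequest) from each
        "releases": others,     # Release-message sent to all other processes
        "total": 3 * others,
    }


counts = messages_per_critical_region(10)   # a hypothetical 10-process system
```

For ten processes this gives 27 messages per critical region, compared with 18 ($2(N-1)$) for the variant based on synchronized clocks, which needs no replies but relies on a known, constant message delay.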

Distributed Systems

Distributed states

Running the snapshot algorithm:

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; messages without the token which arrive after the token are forwarded to P0.]

• ∀ Pi which previously received ts and receive a message m without ts:

  • Forward m to P0 (this message belongs to the snapshot).

Distributed Systems

Distributed states

How to read the current state of a distributed system?

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; one collected state lies before the receipt of a message which was sent after another collected state.]

Instead: some entity probes and collects local states. What state of the global system has been accumulated?

Event in the past receives a message from the future! Division not possible: snapshot inconsistent!

Distributed Systems

Distributed states

How to read the current state of a distributed system?

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; P0 probes the other processes and collects their local states.]

Instead: some entity probes and collects local states. What state of the global system has been accumulated?


Distributed Systems

Distributed critical regions with a token ring structure

1. Organize all processes in a logical or physical ring topology

2. Send one token message to one process

3. ∀ times, ∀ processes: On receiving the token message:

   1. If required, the process enters and leaves a critical section (while holding the token).

   2. The token is passed along to the next process in the ring.

Assumptions:

• Token is not lost; a violation leads to a stall.

(a lost token can be recovered by a number of means – e.g. the ‘election’ scheme following)
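The token ring scheme can be sketched as a single loop which moves the token around the ring. This is a sequential simulation under simplifying assumptions (no token loss, no real message passing); the function name `run_token_ring` and its parameters are chosen for this example only.

```python
def run_token_ring(wants_critical, rounds=1):
    """Return the order in which processes enter their critical sections.

    wants_critical: one boolean per process, in ring order.
    rounds:         how many times the token travels around the ring.
    """
    pending = list(wants_critical)
    n = len(pending)
    entered = []
    holder = 0                               # step 2: the token starts at process 0
    for _ in range(rounds * n):
        if pending[holder]:
            entered.append(holder)           # enter and leave while holding the token
            pending[holder] = False
        holder = (holder + 1) % n            # pass the token to the next process
    return entered


order = run_token_ring([False, True, False, True])
```

Only the current token holder can be inside its critical section, so mutual exclusion holds by construction, and every process gets its turn within one round.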

Distributed Systems

Distributed states

Running the snapshot algorithm:

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; the token also reaches processes as part of other messages.]

• ∀ Pi which receive ts (as an individual token-message, or as part of another message):

  • Save local state si and send si to P0.

  • Attach ts to all further messages, which are to be sent to other processes.

  • Save ts and ignore all further incoming ts’s.


Distributed Systems

Snapshot algorithm

• Observer-process P0 (any process) creates a snapshot token ts and saves its local state s0.

• P0 sends ts to all other processes.

• ∀ Pi which receive ts (as an individual token-message, or as part of another message):

  • Save local state si and send si to P0.

  • Attach ts to all further messages, which are to be sent to other processes.

  • Save ts and ignore all further incoming ts’s.

• ∀ Pi which previously received ts and receive a message m without ts:

  • Forward m to P0 (this message belongs to the snapshot).
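The algorithm can be traced with a small sequential simulation. Everything concrete here is an assumption made for illustration: the local state is an account balance, messages transfer amounts between processes, and the network is a plain list from which messages are delivered in a chosen order. The handling of ts follows the rules above.

```python
class Process:
    def __init__(self, pid, balance):
        self.pid = pid
        self.balance = balance      # local state s_i
        self.has_token = False      # True once ts has been seen


def run_snapshot_demo():
    procs = [Process(pid, 100) for pid in range(3)]    # P0 is the observer
    network = []                                       # messages in flight
    snapshot = {"states": {}, "in_transit": []}        # what P0 collects

    def save_state(p):
        p.has_token = True
        snapshot["states"][p.pid] = p.balance          # s_i is sent to P0

    def send(src, dst, amount):
        procs[src].balance -= amount
        # ts is attached to every message sent after the sender has seen ts
        network.append({"dst": dst, "amount": amount, "ts": procs[src].has_token})

    def send_token(dst):
        network.append({"dst": dst, "amount": 0, "ts": True})

    def deliver(index):
        msg = network.pop(index)
        p = procs[msg["dst"]]
        if msg["ts"] and not p.has_token:
            save_state(p)                              # save s_i before applying m
        elif not msg["ts"] and p.has_token:
            snapshot["in_transit"].append(msg["amount"])   # forward m to P0
        p.balance += msg["amount"]

    send(1, 2, 30)          # message A, sent before the snapshot starts
    save_state(procs[0])    # P0 creates ts and saves s_0
    send_token(1)
    send_token(2)
    deliver(2)              # ts reaches P2 first: s_2 = 100
    deliver(1)              # ts reaches P1: s_1 = 70
    send(1, 2, 10)          # message B, sent after ts, so it carries ts
    deliver(0)              # A arrives without ts: it belongs to the snapshot
    deliver(0)              # B arrives with ts: not part of the snapshot
    return snapshot, [p.balance for p in procs]


snapshot, final_balances = run_snapshot_demo()
total = sum(snapshot["states"].values()) + sum(snapshot["in_transit"])
```

The saved local states plus the forwarded message add up to the 300 units the system started with, although no process was stopped and the live balances have already moved on. This is the consistency property the snapshot is meant to provide.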

Distributed Systems

Distributed states

How to read the current state of a distributed system?

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; the collected local states are connected to form one global state.]

Instead: some entity probes and collects local states. What state of the global system has been accumulated?

Connecting all the states to a global state.


Distributed Systems

Distributed critical regions with a central coordinator

A global, static, central coordinator:

• Invalidates the idea of a distributed system.

• Enables a very simple mutual exclusion scheme.

Therefore:

• A global, central coordinator is employed in some systems … yet …

• … if it fails, a system to come up with a new coordinator is provided.

Page 5: 4: Transport Layer


         or
            accept Contention (Print_Job : in Job_Type; Server_Id : in Task_Id) do
               if Print_Job in AppliedForJobs then
                  if Server_Id = Current_Task then
                     Internal_Print_Server.Start_Print (Print_Job);
                  elsif Server_Id > Current_Task then
                     Internal_Print_Server.Cancel_Print (Print_Job);
                     Next_Server_On_Ring.Contention (Print_Job, Server_Id);
                  else
                     null; -- removing the contention message from ring
                  end if;
               else
                  Turned_Down_Jobs := Turned_Down_Jobs + Print_Job;
                  Next_Server_On_Ring.Contention (Print_Job, Server_Id);
               end if;
            end Contention;
         or
            terminate;
         end select;
      end loop;
   end Print_Server;

Distributed Systems

A distributed server (load balancing)

[Figure: clients and a group of servers.]


Distributed Systems

Consistent distributed states

Why would we need that?

• Find deadlocks.

• Find termination / completion conditions.

• … any other global safety or liveness property.

• Collect a consistent system state for system backup/restore.

• Collect a consistent system state for further processing (e.g. distributed databases).

• …

Distributed Systems

Distributed states

Running the snapshot algorithm:

[Figure: time lines of P0, P1, P2, P3 (time axis 0 to 45) with logical time stamps and messages; further incoming tokens are ignored.]

• Save ts and ignore all further incoming ts’s.


Distributed Systems

Transactions

Concurrency and distribution in systems with multiple, interdependent interactions?

Concurrent and distributed client/server interactions beyond single remote procedure calls?

Distributed Systems

A distributed server (load balancing)

[Figure: a client and a group of servers; one server reports Job_Completed (Results) to the client.]

A distributed server (load balancing)

[Figure: a client and a server]

Distributed states

Running the snapshot algorithm:

[Figure: event timelines of processes P0 to P3 over a time axis from 0 to 45]

• Finalize snapshot

Transactions

Definition (ACID properties):

• Atomicity: All or none of the sub-operations are performed. Atomicity helps achieve crash resilience. If a crash occurs, then it is possible to roll back the system to the state before the transaction was invoked.

• Consistency: Transforms the system from one consistent state to another consistent state.

• Isolation: Results (including partial results) are not revealed unless and until the transaction commits. If the operation accesses a shared data object, invocation does not interfere with other operations on the same object.

• Durability: After a commit, results are guaranteed to persist, even after a subsequent system failure.

A distributed server (load balancing)

with Ada.Task_Identification; use Ada.Task_Identification;

task type Print_Server is
   entry Send_To_Server (Print_Job : in Job_Type; Job_Done  : out Boolean);
   entry Contention     (Print_Job : in Job_Type; Server_Id : in Task_Id);
end Print_Server;

A distributed server (load balancing)

[Figure: client and ring of servers]

Distributed states

Running the snapshot algorithm:

[Figure: event timelines of processes P0 to P3 over a time axis from 0 to 45]

Sorting the events into past and future events.

Past and future events uniquely separated ☞ Consistent state

Transactions

Questions raised by the ACID properties:

• Atomic operations spanning multiple processes?

• What hardware do we need to assume?

• How to ensure consistency in a distributed system?

• Actual isolation and efficient concurrency?

• Actual isolation or the appearance of isolation?

• Shadow copies?

A distributed server (load balancing)

task body Print_Server is
begin
   loop
      select
         accept Send_To_Server (Print_Job : in Job_Type; Job_Done : out Boolean) do
            -- Ignore jobs which this server has already turned down.
            if Print_Job not in Turned_Down_Jobs then
               if Not_Too_Busy then
                  -- Apply for the job and tell the next server on the ring.
                  Applied_For_Jobs := Applied_For_Jobs + Print_Job;
                  Next_Server_On_Ring.Contention (Print_Job, Current_Task);
                  requeue Internal_Print_Server.Print_Job_Queue;
               else
                  Turned_Down_Jobs := Turned_Down_Jobs + Print_Job;
               end if;
            end if;
         end Send_To_Server;

      (...)

A distributed server (load balancing)

[Figure: client and ring of servers; message: Send_To_Group (Job)]

Snapshot algorithm

Termination condition?

Either

• Make assumptions about the communication delays in the system.

or

• Count the sent and received messages for each process (include this in the local state) and keep track of outstanding messages in the observer process.
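The second option can be illustrated with a minimal Python sketch. It assumes that every process reports its sent and received message counters together with its local state; the class `ObserverLedger` and its method names are hypothetical and not part of the slides.

```python
from dataclasses import dataclass, field


@dataclass
class ObserverLedger:
    """Hypothetical observer-side bookkeeping for snapshot termination."""
    sent: dict = field(default_factory=dict)      # process id -> messages sent
    received: dict = field(default_factory=dict)  # process id -> messages received

    def report(self, pid, sent, received):
        # Each process includes its message counters in its reported local state.
        self.sent[pid] = sent
        self.received[pid] = received

    def outstanding(self):
        # Messages which were sent but not yet received are still in transit.
        return sum(self.sent.values()) - sum(self.received.values())

    def snapshot_complete(self, expected_pids):
        # Complete once every process has reported and nothing is in transit.
        return set(self.sent) == set(expected_pids) and self.outstanding() == 0


ledger = ObserverLedger()
ledger.report("P1", sent=3, received=2)
ledger.report("P2", sent=2, received=2)
print(ledger.snapshot_complete(["P1", "P2", "P3"]))  # P3 has not reported yet
ledger.report("P3", sent=1, received=2)
print(ledger.outstanding(), ledger.snapshot_complete(["P1", "P2", "P3"]))
```

The observer can finalize the snapshot as soon as `snapshot_complete` holds, without any assumption about communication delays.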

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between P1, P2 and P3]

• Three conflicting pairs of operations with the same order of execution (pair-wise between processes).

Serialization graph is cyclic.

Not serializable

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between the processes]

• Three conflicting pairs of operations with the same order of execution (pair-wise between processes).

• The order between processes also leads to a global order of processes.

Serializable

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between P1 and P2]

Serializable

Transactions

A closer look inside transactions:

• Transactions consist of a sequence of operations.

• If two operations out of two transactions can be performed in any order with the same final effect, they are commutative and not critical for our purposes.

• Idempotent and side-effect free operations are by definition commutative.

• All non-commutative operations are considered critical operations.

• Two critical operations as part of two different transactions while affecting the same object are called a conflicting pair of operations.
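As an illustration of the last definition, the following Python sketch lists the conflicting pairs in a small schedule. It assumes a simple read/write model in which two reads commute and any pair involving a write does not; the tuple layout and the function names are hypothetical.

```python
from itertools import combinations

# Each operation is (transaction, kind, object); kind is "read" or "write".
# Assumption of this sketch: reads commute, anything involving a write does not.


def is_conflicting_pair(op1, op2):
    (t1, kind1, obj1), (t2, kind2, obj2) = op1, op2
    return t1 != t2 and obj1 == obj2 and "write" in (kind1, kind2)


def conflicting_pairs(schedule):
    # combinations() keeps the schedule order, so each pair is (earlier, later).
    return [(a, b) for a, b in combinations(schedule, 2) if is_conflicting_pair(a, b)]


schedule = [
    ("T1", "write", "A"),
    ("T2", "read", "A"),
    ("T2", "write", "B"),
    ("T3", "write", "B"),
    ("T1", "read", "C"),
    ("T3", "read", "C"),
]
for earlier, later in conflicting_pairs(schedule):
    print(earlier, "before", later)
```

The two reads of C by different transactions are not reported, since they can be performed in either order with the same final effect.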

Transaction schedulers

Three major designs:

• Locking methods: Impose strict mutual exclusion on all critical sections.

• Time-stamp ordering: Note relative starting times and keep order dependencies consistent.

• “Optimistic” methods: Go ahead until a conflict is observed – then roll back.

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between P1, P2 and P3]

• Three conflicting pairs of operations with the same order of execution (pair-wise between processes).

• The order between processes no longer leads to a global order of processes.

Not serializable

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between P1 and P2]

• Two conflicting pairs of operations with different orders of execution.

Not serializable.

Transactions

A closer look at multiple transactions:

• Any sequential execution of multiple transactions will fulfil the ACID-properties, by definition of a single transaction.

• A concurrent execution (or ‘interleaving’) of multiple transactions might fulfil the ACID-properties.

If a specific concurrent execution can be shown to be equivalent to a specific sequential execution of the involved transactions then this specific interleaving is called ‘serializable’.

If a concurrent execution (‘interleaving’) ensures that no transaction ever encounters an inconsistent state then it is said to ensure the appearance of isolation.

Transaction schedulers – Locking methods

Locking methods include the possibility of deadlocks ☞ careful from here on out …

• Complete resource allocation before the start and release at the end of every transaction:

This will impose a strict sequential execution of all critical transactions.

• (Strict) two-phase locking: Each transaction follows a two-phase pattern during its operation:

• Growing phase: locks can be acquired, but not released.

• Shrinking phase: locks can be released anytime, but not acquired (two-phase locking), or locks are released on commit only (strict two-phase locking).

Possible deadlocks

Serializable interleavings

Strict isolation (in case of strict two-phase locking)

• Semantic locking: Allow for separate read-only and write-locks

Higher level of concurrency (see also: use of functions in protected objects)
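The strict two-phase pattern can be sketched in a few lines of Python. This is a single-threaded illustration only: the class `StrictTwoPhaseTransaction` and the shared `lock_table` dictionary are hypothetical, and a refused lock simply returns `False` instead of blocking, so deadlock handling is left out.

```python
class StrictTwoPhaseTransaction:
    """Hypothetical helper: locks are acquired while growing, released on commit."""

    def __init__(self, name, lock_table):
        self.name = name
        self.lock_table = lock_table  # shared dict: object name -> owning transaction
        self.held = set()
        self.shrinking = False

    def acquire(self, obj):
        if self.shrinking:
            raise RuntimeError("two-phase rule violated: acquire after release")
        owner = self.lock_table.get(obj)
        if owner not in (None, self.name):
            return False  # the caller has to wait (this is where deadlocks can arise)
        self.lock_table[obj] = self.name
        self.held.add(obj)
        return True

    def commit(self):
        # Strict variant: all locks are released on commit only.
        self.shrinking = True
        for obj in self.held:
            del self.lock_table[obj]
        self.held.clear()


locks = {}
t1 = StrictTwoPhaseTransaction("T1", locks)
t2 = StrictTwoPhaseTransaction("T2", locks)
print(t1.acquire("A"), t2.acquire("A"))  # T2 is refused while T1 holds A
t1.commit()
print(t2.acquire("A"))                   # after T1's commit the lock is free
```

Because no lock is handed back before the commit, no other transaction can observe partial results, which is the strict isolation mentioned above.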

Achieving serializability

For the serializability of two transactions it is necessary and sufficient for the order of their invocations of all conflicting pairs of operations to be the same for all the objects which are invoked by both transactions.

• Define: Serialization graph: A directed graph; vertices $i$ represent transactions $T_i$; edges $T_i \to T_j$ represent an established global order dependency between all conflicting pairs of operations of those two transactions.

For the serializability of multiple transactions it is necessary and sufficient that the serialization graph is acyclic.
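The acyclicity criterion can be checked mechanically. The Python sketch below builds a serialization graph from a schedule and tests it for cycles with a depth-first search. It assumes a read/write model where every pair on the same object involving at least one write is conflicting; the schedule format and function names are hypothetical.

```python
def serialization_graph(schedule):
    """Return edges (Ti, Tj) for every conflicting pair executed in that order."""
    edges = set()
    for i, (t1, kind1, obj1) in enumerate(schedule):
        for t2, kind2, obj2 in schedule[i + 1:]:
            if t1 != t2 and obj1 == obj2 and "write" in (kind1, kind2):
                edges.add((t1, t2))
    return edges


def is_acyclic(edges):
    graph = {}
    for src, dst in edges:
        graph.setdefault(src, set()).add(dst)
        graph.setdefault(dst, set())
    state = {}  # node -> "visiting" or "done"

    def visit(node):
        if state.get(node) == "done":
            return True
        if state.get(node) == "visiting":
            return False  # back edge: the graph contains a cycle
        state[node] = "visiting"
        ok = all(visit(successor) for successor in graph[node])
        state[node] = "done"
        return ok

    return all(visit(node) for node in graph)


chain = [
    ("T1", "write", "A"), ("T2", "read", "A"),
    ("T2", "write", "B"), ("T3", "write", "B"),
]
cycle = chain + [("T3", "write", "C"), ("T1", "read", "C")]

print(is_acyclic(serialization_graph(chain)))  # T1 -> T2 -> T3
print(is_acyclic(serialization_graph(cycle)))  # adds T3 -> T1
```

In the second schedule every pair of transactions agrees on its own order, yet the three edges form a cycle, so no equivalent sequential execution exists.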

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between the processes]

• Three conflicting pairs of operations with the same order of execution (pair-wise between processes).

• The order between processes also leads to a global order of processes.

Achieving serializability

For the serializability of two transactions it is necessary and sufficient for the order of their invocations of all conflicting pairs of operations to be the same for all the objects which are invoked by both transactions.

(Determining order in distributed systems requires logical clocks.)

Transaction schedulers – Time stamp ordering

Add a unique time-stamp (any global order criterion) on every transaction upon start. Each involved object can inspect the time-stamps of all requesting transactions.

• Case 1: A transaction with a time-stamp later than all currently active transactions applies: the request is accepted and the transaction can go ahead.

• Alternative case 1 (strict time-stamp ordering): the request is delayed until the currently active earlier transaction has committed.

• Case 2: A transaction with a time-stamp earlier than all currently active transactions applies: the request is not accepted and the applying transaction is to be aborted.

Collision detection rather than collision avoidance ☞ No isolation ☞ Cascading aborts possible.

Simple implementation, high degree of concurrency – also in a distributed environment, as long as a global event order (time) can be supplied.
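The accept/abort decision taken locally by each object can be sketched as follows. This Python fragment covers cases 1 and 2 only (not the strict variant) and assumes that a request which is not later than all currently active transactions is rejected; the class `TimestampedObject` and its methods are hypothetical.

```python
class TimestampedObject:
    """Hypothetical shared object applying basic time-stamp ordering."""

    def __init__(self):
        self.active = set()  # time-stamps of transactions currently using the object

    def request(self, timestamp):
        # Case 1: later than all currently active transactions -> go ahead.
        if all(timestamp > other for other in self.active):
            self.active.add(timestamp)
            return "accepted"
        # Case 2 (and, by assumption, anything in between): abort the applicant.
        return "abort"

    def finish(self, timestamp):
        # Called when the transaction commits or aborts.
        self.active.discard(timestamp)


obj = TimestampedObject()
print(obj.request(10))  # no active transactions yet
print(obj.request(12))  # later than 10
print(obj.request(11))  # earlier than the active transaction 12
```

The decision needs only the time-stamps known at the object itself, which is why this scheme involves little communication in a distributed setting.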

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between P1, P2 and P3]

• Three conflicting pairs of operations with the same order of execution (pair-wise between processes).

Serialization graph is acyclic.

Serializable

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between P1, P2 and P3]

• Three conflicting pairs of operations with the same order of execution (pair-wise between processes).

• The order between processes also leads to a global order of processes.

Serializable

Serializability

[Figure: timelines of P1, P2 and P3 over a time axis from 0 to 45, with Read and Write operations on objects A, B and C, and the resulting order between the processes]

• Two conflicting pairs of operations with the same order of execution.

Two phase commit protocol

Phase 2: Implement results

[Figure: client, coordinator and ring of servers]

Everybody destroys shadows

Two phase commit protocol

Start up (initialization) phase

[Figure: client, coordinator and ring of servers]

Setup & Start operations

Shadow copy

Two phase commit protocol

Start up (initialization) phase

[Figure: client and ring of servers]

Distributed Transaction

Transaction schedulers – Optimistic control

Three sequential phases:

1. Read & execute: Create a shadow copy of all involved objects and perform all required operations on the shadow copy and locally (i.e. in isolation).

2. Validate: After local commit, check all occurred interleavings for serializability.

3. Update or abort:

3a. If serializability could be ensured in step 2 then all results of involved transactions are written to all involved objects – in dependency order of the transactions.

3b. Otherwise: destroy shadow copies and start over with the failed transactions.
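The three phases can be sketched in Python as follows. Note that the validation step here is a simplification: instead of checking all interleavings for serializability, it uses per-object version counters to detect that another transaction has updated an object in the meantime. The `Store` class, the version counters and `run_optimistically` are assumptions made for this sketch.

```python
import copy


class Store:
    """Hypothetical object store with one version counter per object."""

    def __init__(self, data):
        self.data = dict(data)
        self.version = {key: 0 for key in data}


def run_optimistically(store, keys, operation):
    # 1. Read & execute: work on a shadow copy, i.e. in isolation.
    seen = {key: store.version[key] for key in keys}
    shadow = {key: copy.deepcopy(store.data[key]) for key in keys}
    operation(shadow)
    # 2. Validate (assumed check): has anybody updated the objects since?
    if any(store.version[key] != seen[key] for key in keys):
        return False  # 3b. the shadow copy is dropped; the caller may start over
    # 3a. Update: write the results back and advance the versions.
    for key in keys:
        store.data[key] = shadow[key]
        store.version[key] += 1
    return True


store = Store({"x": 1, "y": 2})


def swap(shadow):
    shadow["x"], shadow["y"] = shadow["y"], shadow["x"]


def interrupted(shadow):
    shadow["x"] += 1
    # Simulate a competing transaction committing while this one is executing.
    run_optimistically(store, ["x"], lambda other: other.update(x=100))


print(run_optimistically(store, ["x", "y"], swap), store.data)
print(run_optimistically(store, ["x"], interrupted), store.data)
```

The second call fails validation because the competing transaction committed first, so its shadow copy is discarded and the stored value stays at the competitor's result.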

Two phase commit protocol

Phase 2: Implement results

[Figure: client, coordinator and ring of servers]

Everybody reports "Committed"

Two phase commit protocol

Phase 1: Determine result state

[Figure: client, coordinator and ring of servers]

Coordinator requests and assembles votes: "Commit" or "Abort"

Two phase commit protocol

Start up (initialization) phase

[Figure: client and ring of servers]

Determine coordinator

Transaction schedulers – Optimistic control

• How to create a consistent copy?

• How to update all objects consistently?

Full isolation and maximal concurrency!

Aborts happen after everything has been committed locally.

Two phase commit protocol

or Phase 2: Global roll back

[Figure: client, coordinator and ring of servers]

Coordinator instructs everybody to "Abort"

Two phase commit protocol

Phase 2: Implement results

[Figure: client, coordinator and ring of servers]

Coordinator instructs everybody to "Commit"

Two phase commit protocol

Start up (initialization) phase

[Figure: client, coordinator and ring of servers]

Determine coordinator

Distributed transaction schedulers

Three major designs:

• Locking methods: no aborts. Impose strict mutual exclusion on all critical sections.

• Time-stamp ordering: potential aborts along the way. Note relative starting times and keep order dependencies consistent.

• “Optimistic” methods: aborts or commits at the very end. Go ahead until a conflict is observed – then roll back.

How to implement “commit” and “abort” operations in a distributed environment?

Two phase commit protocol

or Phase 2: Global roll back

[Figure: client, coordinator and ring of servers]

Everybody destroys shadows

Two phase commit protocol

Phase 2: Implement results

[Figure: client, coordinator and ring of servers]

Everybody commits

Two phase commit protocol

Start up (initialization) phase

[Figure: client, coordinator and ring of servers]

Setup & Start operations

Two phase commit protocol

Start up (initialization) phase

[Figure: client and ring of servers]

Ring of servers

Data

Redundancy (replicated servers)

Coordinator processes

[Figure: client, coordinator and ring of servers]

Coordinator also received two messages and processes job

Redundancy (replicated servers)

Distribute job

[Figure: client, coordinator and ring of servers]

Coordinator sends job both ways

Redundancy (replicated servers)

Start-up (initialization) phase

[Figure: client and ring of servers]

Ring of identical servers

Two phase commit protocol

Phase 2: Report result of distributed transaction

[Figure: client, coordinator and ring of servers]

Coordinator reports to client: "Committed" or "Aborted"
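The phases of the protocol (shadow copies at start-up, voting in phase 1, global commit or roll back in phase 2, report to the client) can be sketched in Python. This is a failure-free, single-process illustration: message passing, time-outs and crash recovery are left out, and the `Participant` class and `two_phase_commit` function are hypothetical names.

```python
class Participant:
    """Hypothetical server taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.data = {}
        self.shadow = None

    def start(self, updates):
        # Start-up: all operations are applied to a shadow copy only.
        self.shadow = {**self.data, **updates}

    def vote(self):
        return "Commit" if self.can_commit else "Abort"

    def commit(self):
        # The shadow copy becomes the real state, then the shadow is destroyed.
        self.data, self.shadow = self.shadow, None
        return "Committed"

    def abort(self):
        self.shadow = None  # destroy the shadow copy, real state untouched


def two_phase_commit(participants, updates):
    for participant in participants:
        participant.start(updates)
    # Phase 1: the coordinator requests and assembles the votes.
    votes = [participant.vote() for participant in participants]
    # Phase 2: implement the results, or roll back globally.
    if all(vote == "Commit" for vote in votes):
        reports = [participant.commit() for participant in participants]
        return "Committed" if all(r == "Committed" for r in reports) else "Aborted"
    for participant in participants:
        participant.abort()
    return "Aborted"


servers = [Participant("S1"), Participant("S2"), Participant("S3")]
print(two_phase_commit(servers, {"job": 42}))
servers[1].can_commit = False
print(two_phase_commit(servers, {"job": 43}), servers[0].data)
```

A single "Abort" vote is enough to roll back every participant, so the first server still holds the value from the earlier, committed transaction.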

Redundancy (replicated servers)

Result delivery

[Figure: client, coordinator and ring of servers]

Coordinator delivers his local result

Redundancy (replicated servers)

Distribute job

[Figure: client, coordinator and ring of servers]

Everybody received job (but nobody knows that)

Redundancy (replicated servers)

Start-up (initialization) phase

[Figure: client and ring of servers]

Determine coordinator

Distributed transaction schedulers

Evaluating the three major design methods in a distributed environment:

• Locking methods: No aborts. Large overheads; deadlock detection/prevention required.

• Time-stamp ordering: Potential aborts along the way. Recommends itself for distributed applications, since decisions are taken locally and communication overhead is relatively small.

• “Optimistic” methods: Aborts or commits at the very end. Maximizes concurrency, but also data replication.

Side-aspect “data replication”: large body of literature on this topic (see: distributed databases / operating systems / shared memory / cache management, …)

Redundancy (replicated servers)

Event: Server crash, new servers joining, or current servers leaving.

Server re-configuration is triggered by a message to all (this is assumed to be supported by the distributed operating system).

Each server on reception of a re-configuration message:

1. Wait for local job to complete or time-out.

2. Store local consistent state $S_i$.

3. Re-organize server ring, send local state around the ring.

4. If a state $S_j$ with $j > i$ is received then $S_i \Leftarrow S_j$

5. Elect coordinator

5. Elect coordinator

6. Enter ‘Coordinator-’ or ‘Replicate-mode’
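Steps 3 and 4 can be simulated in a few lines of Python. The sketch assumes that the index of a state expresses how recent it is, that the ring passes states in synchronous rounds, and that a server forwards whatever state it currently holds; the function `reconfigure` and the `(index, state)` tuples are hypothetical.

```python
def reconfigure(states):
    """states: one (index, state) tuple per server, listed in ring order."""
    current = list(states)
    n = len(current)
    # After n - 1 hops the state with the highest index has reached every server.
    for _ in range(n - 1):
        incoming = [current[(i - 1) % n] for i in range(n)]
        # Step 4: adopt a received state only if its index is higher than our own.
        current = [
            max(own, received, key=lambda entry: entry[0])
            for own, received in zip(current, incoming)
        ]
    return current


ring = [(3, "state after job 3"), (5, "state after job 5"), (4, "state after job 4")]
for index, state in reconfigure(ring):
    print(index, state)
```

Once all servers hold the same state, the coordinator election in step 5 can start from a consistent basis.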

Redundancy (replicated servers)

Processing starts

[Figure: client, coordinator and ring of servers]

First server detects two job-messages ☞ processes job

Redundancy (replicated servers)

Start-up (initialization) phase

[Figure: servers, client and coordinator (Coord.)]

Coordinator determined

Redundancy (replicated servers)

Premise:

A crashing server computer should not compromise the functionality of the system (full fault tolerance).

Assumptions & Means:

• k computers inside the server cluster might crash without losing functionality.

Replication: at least k + 1 servers.

• The server cluster can reorganize any time (and specifically after the loss of a computer).

Hot stand-by components, dynamic server group management.

• The server is described fully by the current state and the sequence of messages received.

State machines: we have to implement consistent state adjustments (re-organization) and consistent message passing (order needs to be preserved).

[Schneider1990]
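The state-machine assumption can be made concrete with a toy example: if the transition function is deterministic, the current state follows entirely from the initial state and the sequence of messages, so replicas that see the same messages in the same order agree. The transition function `step` and the message format are hypothetical and only serve as an illustration.

```python
from functools import reduce


def step(state, message):
    """Deterministic transition function of a toy account server."""
    op, amount = message
    if op == "deposit":
        return state + amount
    if op == "double":
        return state * 2
    raise ValueError(f"unknown operation: {op}")


def replay(initial_state, messages):
    """The state is fully determined by the initial state and the message sequence."""
    return reduce(step, messages, initial_state)


messages = [("deposit", 5), ("double", 0)]
replica_a = replay(1, messages)                  # (1 + 5) * 2
replica_b = replay(1, messages)                  # same order, same state
replica_c = replay(1, list(reversed(messages)))  # 1 * 2 + 5, diverges
```

Replicas `replica_a` and `replica_b` agree, while `replica_c` processes the same messages in a different order and ends up in a different state, which is why the message order needs to be preserved.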

Summary

Distributed Systems

• Networks

  • OSI, topologies

  • Practical network standards

• Time

  • Synchronized clocks, virtual (logical) times

  • Distributed critical regions (synchronized, logical, token ring)

• Distributed systems

  • Elections

  • Distributed states, consistent snapshots

  • Distributed servers (replicates, distributed processing, distributed commits)

  • Transactions (ACID properties, serializable interleavings, transaction schedulers)

Redundancy (replicated servers)

Everybody (besides coordinator) processes

[Figure: servers, client and coordinator (Coord.)]

All servers detect two job-messages ☞ everybody processes job

Redundancy (replicated servers)

Coordinator receives job message

[Figure: servers, client and coordinator (Coord.); label: “Send Job”]

Redundancy (replicated servers)

Stages of each server:

• Received: Job message received locally

• Deliverable: Job message received by all active servers

• Processed: Job processed locally
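The three stages can be sketched as a small per-server state machine. The sketch assumes, following the figures in this section, that a server treats a job as deliverable once it has detected two job-messages for it; the names `Stage` and `JobTracker` and the `handler` parameter are hypothetical.

```python
from enum import Enum


class Stage(Enum):
    RECEIVED = 1      # job message received locally
    DELIVERABLE = 2   # job message received by all active servers
    PROCESSED = 3     # job processed locally


class JobTracker:
    """Tracks the stage of each job on one server (illustrative names)."""

    def __init__(self):
        self.copies = {}  # job id -> number of job-messages seen so far
        self.stages = {}  # job id -> current Stage

    def on_job_message(self, job_id):
        self.copies[job_id] = self.copies.get(job_id, 0) + 1
        if self.copies[job_id] == 1:
            self.stages[job_id] = Stage.RECEIVED
        elif self.stages[job_id] is Stage.RECEIVED:
            # Assumption: a second job-message tells this server that
            # all active servers have received the job.
            self.stages[job_id] = Stage.DELIVERABLE
        return self.stages[job_id]

    def process(self, job_id, handler):
        # A job may only be processed once it is deliverable.
        if self.stages.get(job_id) is not Stage.DELIVERABLE:
            raise RuntimeError("job is not deliverable yet")
        result = handler(job_id)
        self.stages[job_id] = Stage.PROCESSED
        return result
```

Separating "received" from "deliverable" keeps a server from processing a job that some other active server may never have seen.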

Page 9: 4: Transport Layer