26
Unit 3 : Database System Architectures Unit 3 : Database System Architectures

Rdbms (Unit 3)

Embed Size (px)

Citation preview

Page 1: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 1/26

Unit 3 : Database System ArchitecturesUnit 3 : Database System Architectures

Page 2: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 2/26

Centralized SystemsCentralized Systems

s Run on a single computer system and do not interact with other

computer systems.s General-purpose computer system: one to a few CPUs and a number

of device controllers that are connected through a common bus thatprovides access to shared memory.

s Single-user system (e.g., personal computer or workstation): desk-top

unit, single user, usually has only one CPU and one or two hard disks;the OS may support only one user.

s Multi-user system: more disks, more memory, multiple CPUs, and amulti-user OS. Serve a large number of users who are connected tothe system via terminals.

Page 3: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 3/26

A Centralized Computer SystemA Centralized Computer System

Page 4: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 4/26

Page 5: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 5/26

Client-Server SystemsClient-Server Systems

s Server systems satisfy requests generated at m client systems, whose general

structure is shown below:

Page 6: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 6/26

Client-Server Systems (Cont.)Client-Server Systems (Cont.)

s Database functionality can be divided into:

q Back-end: manages access structures, query evaluation andoptimization, concurrency control and recovery.

q Front-end: consists of tools such as forms , report-writers , andgraphical user interface facilities.

s The interface between the front-end and the back-end is through SQL or

through an application program interface.

Page 7: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 7/26

Server System ArchitectureServer System Architecture

s Server systems can be broadly categorized into two kinds:

q Transaction servers, which are widely used in relationaldatabase systems.

q Data servers, used in object-oriented database systems.

Page 8: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 8/26

Transaction ServersTransaction Servers

s Also called query server systems or SQL server systems

q Clients send requests to the serverq Transactions are executed at the server

q Results are shipped back to the client.

s Open Database Connectivity (ODBC) is a C language applicationprogram interface standard from Microsoft for connecting to a server,

sending SQL requests, and receiving results.

s JDBC standard is similar to ODBC, for Java

Page 9: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 9/26

Transaction Server Process StructureTransaction Server Process Structure

Monitors other processes,and takes recovery actionsif any of the other

processes fail.

Lock manager functionalitylike lock grant, lockrelease, and deadlockdetection.

Output modified bufferBlocks back to disk on acontinuous basis.

Output log records fromthe log record buffer tostable storage.

Performs periodicCheckpoints.

Receive user queries(transactions), execute

them and send resultsback.

Send user queries(transactions) to server.

Page 10: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 10/26

Data ServersData Servers

s Used in high-speed LANs, in cases where

q High speed connection between the clients and the server.q Client machines are comparable in processing power to the server

machine.

s Data are shipped to clients where processing is performed, and thenshipped data back to the server.

s This architecture requires full back-end functionality at the clients.

s Used in many object-oriented database systems

s Issues:

q Page-Shipping versus Item-Shipping

q Lockingq Data Caching

q Lock Caching

Page 11: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 11/26

Data Servers (Cont.)Data Servers (Cont.)

s Page-shipping versus item-shipping

q Unit of communication for data may be coarse granularity (page)or fine granularity (tuple or item)

q Item shipping⇒ overhead of message passing is high.

q Page shipping⇒ prefetching

Fetching items even before they are requested is called

prefetching. All the items in the page are shipped when a process desires

to access a single item in the page.

Page 12: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 12/26

Data Servers (Cont.)Data Servers (Cont.)

s Locking

q Locks granted by the server for the data items that it ships to theclient machines.

q Disadvantage of page shipping :

Lock on a page implicitly locks all items contained in the pageeven client is not accessing some items in the page.

q

Other client machines that require locks on those items may beblocked unnecessarily.

q Techniques for lock deescalated have been proposed where theserver can request its clients to transfer back locks on prefetcheditems.

Page 13: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 13/26

Data Servers (Cont.)Data Servers (Cont.)

s Data Caching

q Data can be cached at client even in between transactions

q Even if a transaction finds cached data, it must make sure that thosedata are up to date (cache coherency)

s Lock Caching

q Locks can be retained by client system even in between transactions

q Server calls back locks from clients when it receives conflicting lockrequest. Client returns lock once no local transaction is using it.

Page 14: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 14/26

Parallel SystemsParallel Systems

s Parallel systems improve processing and I/O speeds by using multiple

CPUs and disks in parallel.s Useful for applications that have to query extremely large databases or

that have to process an extremely large number of transactions persecond.

s A coarse-grain parallel machine consists of a small number of

powerful processorss A massively parallel or fine grain parallel machine utilizes

thousands of smaller processors.

s Two main measures of performance of database system:

q throughput --- the number of tasks that can be completed in a

given time intervalq response time --- the amount of time it takes to complete a single

task from the time it is submitted

Page 15: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 15/26

SpeedupSpeedup

s Running a given task in less time by increasing the degree of

parallelism is called speedup.s Suppose,

q execution time of task on the larger machine is TL

q execution time of same task on the smaller machine is TS

s A fixed-sized problem executing on a small system is given to asystem which is N -times larger.

q Measured by:

speedup = small system elapsed time =  TS

large system elapsed time  TL

q Speedup is linear if equation equals N.

q Speedup is sublinear if equation less than N.

Page 16: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 16/26

SpeedupSpeedup

Speedup with increasing resources

Page 17: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 17/26

ScaleupScaleup

s Handling larger tasks by increasing degree of parallelism is called

scaleup.s Relates to the ability to process larger tasks in the same amount of

time by providing more resources.

s Increase the size of both the problem and the system

q N -times larger system used to perform N -times larger task

q Measured by:

scaleup = small system small problem elapsed time =  TS

big system big problem elapsed time  TL

q Scaleup is linear if equation equals 1 (TS=TL).

q

Scaleup is sublinear if TL

>TS

.

Page 18: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 18/26

ScaleupScaleup

Scaleup with increasing problem size and resources

Page 19: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 19/26

Factors Limiting Speedup and ScaleupFactors Limiting Speedup and Scaleup

Speedup and scaleup are often sublinear due to:

s Startup costs: Cost of starting up multiple processes may dominatecomputation time, if the degree of parallelism is high.

s Interference: Processes accessing shared resources (e.g.,systembus, disks, or locks) compete with each other, thus spending timewaiting on other processes, rather than performing useful work.

s Skew: By breaking down a single task into a number of parallel steps,we reduce the size of the average steps.

q It is often difficult to divide a task into exactly equal-sized parts,which decreased the speedup.

Page 20: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 20/26

Interconnection ArchitecturesInterconnection Architectures

Page 21: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 21/26

Parallel Database ArchitecturesParallel Database Architectures

Page 22: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 22/26

Distributed SystemsDistributed Systemss Data spread over multiple machines (also referred to as sites or

nodes).

s Computers in a distributed system communicate with one anotherthrough various communication media, such as high-speed networks ortelephone lines.

Page 23: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 23/26

Distributed DatabasesDistributed Databases

s Homogeneous distributed databases

q

Same software/schema on all sites, data may be partitionedamong sites

s Heterogeneous distributed databases

q Different software/schema on different sites

s Differentiate between local and global transactions

q A local transaction accesses data in the single site at which thetransaction was initiated.

q A global transaction either accesses data in a site different fromthe one at which the transaction was initiated or accesses data inseveral different sites.

Page 24: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 24/26

Trade-offs in Distributed SystemsTrade-offs in Distributed Systems

s Sharing data – users at one site able to access the data residing at

some other sites.s Autonomy – each site is able to retain a degree of control over data

stored locally.

s Higher system availability through redundancy — data can bereplicated at remote sites, and system can function even if a site fails.

s Disadvantage: added complexity required to ensure propercoordination among sites.

q Software development cost.

q Greater potential for bugs.

q Increased processing overhead.

Page 25: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 25/26

Network TypesNetwork Types

s Local-area networks (LANs) – composed of processors that are

distributed over small geographical areas, such as a single building ora few adjacent buildings.

s Wide-area networks (WANs) – composed of processors distributedover a large geographical area.

s WANs with continuous connection (e.g. the Internet) are needed forimplementing distributed database systems

Page 26: Rdbms (Unit 3)

8/14/2019 Rdbms (Unit 3)

http://slidepdf.com/reader/full/rdbms-unit-3 26/26

End of UnitEnd of Unit