7/24/2019 Marakas-Ch10
Marakas: Decision Support Systems, 2nd Edition © 2003, Prentice-Hall Chapter 10 - 1
Chapter 10:
The Data Warehouse
Decision Support Systems in the 21st Century, 2nd Edition
by George M. Marakas
10-1: Stores, Warehouses and Marts
A data warehouse is a collection of integrated databases designed to support a DSS.
An operational data store (ODS) stores data for a specific application. It feeds the data warehouse a stream of desired raw data.
A data mart is a lower-cost, scaled-down version of a data warehouse, usually designed to support a small group of users (rather than the entire firm).
The metadata is information that is kept about the warehouse.
The Data Warehouse Environment
The organization's legacy systems and data stores provide data to the data warehouse or mart.
During the transfer of data from the various sources, cleansing or transformation may occur, so the data in the DW is more uniform.
Simultaneously, metadata is recorded.
Finally, the DW or mart may be used to create one or more "personal" warehouses.
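The transfer step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the textbook's method: the row layout, the `GENDER_CODES` table, the `clean_record` helper and the metadata fields are all hypothetical names chosen for the example.

```python
from datetime import date

# Hypothetical raw rows from two legacy sources that encode the same
# attribute (gender) differently; the layout is assumed for illustration.
legacy_rows = [
    {"source": "payroll", "id": 1, "gender": "M"},
    {"source": "crm", "id": 2, "gender": "female"},
    {"source": "crm", "id": 3, "gender": "male"},
]

# One uniform encoding for the warehouse (an assumed convention).
GENDER_CODES = {"m": "M", "male": "M", "f": "F", "female": "F"}


def clean_record(row):
    """Return a cleansed copy of the row plus a metadata entry about the change."""
    cleaned = dict(row)
    cleaned["gender"] = GENDER_CODES[row["gender"].lower()]
    meta = {
        "id": row["id"],
        "source": row["source"],
        "transformed": cleaned["gender"] != row["gender"],
        "loaded_on": date.today().isoformat(),
    }
    return cleaned, meta


warehouse, metadata = [], []
for row in legacy_rows:
    cleaned, meta = clean_record(row)
    warehouse.append(cleaned)  # uniform data goes to the DW
    metadata.append(meta)      # metadata is recorded during the same pass
```

Recording the metadata entry in the same pass as the cleansing mirrors the "simultaneously" in the slide: the lineage of each row is captured at the moment it is changed rather than reconstructed afterwards.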
Organizational Data Flow and Data Storage Components
Characteristics of a Data Warehouse
Subject oriented: organized based on use
Integrated: inconsistencies removed
Nonvolatile: stored in read-only format
Time variant: data are normally time series
Summarized: in decision-usable format
Large volume: data sets are quite large
Non normalized: often redundant
Metadata: data about data are stored
Data sources: come from nonintegrated sources
A Data Warehouse is Subject Oriented
Data in a Data Warehouse are Integrated
10-2: The Data Warehouse Architecture
The architecture consists of various interconnected elements:
Operational and external database layer: the source data for the DW
Information access layer: the tools the end user accesses to e…
The Data Warehouse Architecture (cont.)
Additional layers are:
Process management layer: the scheduler or job controller
Application messaging layer: the "middleware" that transports information around the firm
Physical data warehouse layer: where the actual data used in the DSS are located
Data staging layer: all of the processes necessary to select, edit, summarize and load warehouse data from the operational and e…
Components of the Data Warehouse Architecture
Data Warehousing Typology
The virtual data warehouse: the end users have direct access to the data stores, using tools enabled at the data access layer
The central data warehouse: a single physical database contains all of the data for a specific functional area
The distributed data warehouse: the components are distributed across several physical databases
10-3: Data Have Data -- The Metadata
The name suggests some high-level technological concept, but it really is fairly simple. Metadata is "data about data".
With the emergence of the data warehouse as a decision support structure, the metadata are considered as much a resource as the business data they describe.
Metadata are abstractions: they are high-level data that provide concise descriptions of lower-level data.
The Metadata in Action
The metadata are essential ingredients in the transformation of raw data into knowledge. They are the "keys" that allow us to handle the raw data.
For e…
The Need for Consistency in the Metadata
The data warehouse is set up for the benefit of business analysts and e…
10-4: Interviewing the Data -- Metadata Extraction
Regardless of the nature of a query, certain aspects of the metadata are important to all decision-makers. Some of these are:
What tables, attributes and keys does the DW contain?
Where did each set of data come from?
What transformations were applied with cleansing?
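The three questions above can be asked of a live database. The following is a rough sketch using Python's built-in `sqlite3` module as a stand-in for a warehouse; the `sales` and `lineage` tables, their columns and the `describe_tables` helper are hypothetical and exist only for this example. A real warehouse would expose its own catalog.

```python
import sqlite3

# In-memory stand-in for a warehouse; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
# A hand-rolled lineage table recording origin and transformations (assumed design).
conn.execute(
    "CREATE TABLE lineage (table_name TEXT, source_system TEXT, transformation TEXT)"
)
conn.execute(
    "INSERT INTO lineage VALUES ('sales', 'order_entry', 'region codes standardized')"
)


def describe_tables(conn):
    """Map each table name to a list of (column, is_primary_key) pairs."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    catalog = {}
    for name in names:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f"PRAGMA table_info({name})").fetchall()
        catalog[name] = [(col[1], col[5] > 0) for col in info]
    return catalog


# Question 1: what tables, attributes and keys does the DW contain?
catalog = describe_tables(conn)

# Questions 2 and 3: where did the data come from, and what was done to it?
origin = conn.execute(
    "SELECT source_system, transformation FROM lineage WHERE table_name = ?",
    ("sales",),
).fetchone()
```

The structural question is answered from the database's own catalog, while origin and transformation history have to be stored explicitly, which is why the sketch keeps them in a separate table.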
Components of the Metadata
Transformation maps: records that show what transformations were applied
Extraction history: records that show what data was analyzed
Algorithms for summarization: methods available for aggregating and summarizing
Data ownership: records that show origin
Access patterns: records that show what data are accessed and how often
Typical Mapping Metadata
Transformation mapping records include:
Identification of original source
Attribute conversions
Physical characteristic conversions
Encoding/reference table conversions
Naming changes
Key changes
Values of default attributes
Logic to choose from multiple sources
Algorithmic changes
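A transformation mapping record of this kind could be modeled as a small data structure. The sketch below covers only a few of the listed items (original source, naming change, encoding/reference table conversion, default value); the `TransformationMap` class, its field names and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TransformationMap:
    """One mapping record; field names are illustrative, not from the text."""

    source: str            # identification of original source
    source_attribute: str  # attribute name in the source system
    target_attribute: str  # name after the naming change in the DW
    encoding: dict = field(default_factory=dict)  # reference table conversion
    default: object = None  # value of the default attribute

    def apply(self, row):
        """Return (target name, converted value) for one source row."""
        raw = row.get(self.source_attribute)
        if raw is None:
            return self.target_attribute, self.default
        # Values without an entry in the reference table pass through unchanged.
        return self.target_attribute, self.encoding.get(raw, raw)


status_map = TransformationMap(
    source="order_entry",
    source_attribute="stat",
    target_attribute="order_status",
    encoding={"O": "open", "C": "closed"},
    default="unknown",
)

converted = [status_map.apply(r) for r in ({"stat": "O"}, {"stat": "C"}, {})]
```

Keeping the mapping as data rather than burying it in load scripts means the same record can both drive the conversion and answer later questions about what was done to an attribute.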
10-5: Implementing the Data Warehouse
Kozar assembled a list of "seven deadly sins" of data warehouse implementation:
1. "If you build it, they will come": the DW needs to be designed to meet people's needs
2. Omission of an architectural framework: you need to consider the number of users, volume of data, update cycle, etc.
3. Underestimating the importance of documenting assumptions: the assumptions and potential conflicts must be included in the framework
"Seven Deadly Sins", continued
4. Failure to use the right tool: a DW project needs different tools than those used to develop an application
5. Life cycle abuse: in a DW, the life cycle really never ends
6. Ignorance about data conflicts: resolving these takes a lot more effort than most people realize
7. Failure to learn from mistakes: since one DW project tends to beget another, learning from the early mistakes will yield higher quality later
10-6: Data Warehouse Technologies
No one currently offers an end-to-end DW solution. Organizations buy bits and pieces from a number of vendors and hopefully make them work together.
SAS, IBM, Software AG, Information Builders and Platinum offer solutions that are at least fairly comprehensive.
The market is very competitive. Table 10-6 in the te…
10-7: The Future of Data Warehousing
As the DW becomes a standard part of an organization, there will be efforts to find new ways to use the data. This will likely bring with it several new challenges:
Regulatory constraints may limit the ability to combine sources of disparate data.
These disparate sources are likely to contain unstructured data, which is hard to store.
The Internet makes it possible to access data from virtually "anywhere". Of course, this just increases the disparity.