DW Boot Camp DW Overview
©Copyright 2010, Breckenridge Academy. All rights reserved.
Breckenridge Academy
Inmon vs. Kimball: Let’s get to the bottom of this!
TDWI LA Chapter
Los Angeles, CA
March 9, 2010
Speaker: Bob [email protected]
Anthony’s Pyramid
Business Perspective
•Strategic Planning
•Management Monitoring & Control
•Business Operations
Systems Perspective
•Decision Support
•Analysis & Reporting
•Transaction Processing
Why Data Warehousing?
Transaction Processing
•Real-time perspective
•Detailed data
•Update intensive
•Online updates
•Batch reporting
•Performance sensitive
•Structured processes
•Stable data structures
•Functional organization
•Clerical community
Analytical/Reporting
•Historic perspective
•Summarized/derived info.
•Read-only
•Batch updates
•Online reporting
•Flexibility priority
•Ad-hoc reporting
•Evolving data structures
•Cross-functional
•Mgmt/analyst community
OLTP versus Reporting Design
Transaction Processing
•Highly normalized
•Minimal indexing
•Transaction logging on
•Record locking on
•Individual records
•Calculate derived data
Analytical Processing
•Denormalized
•Liberal indexing
•Transaction logging off
•Record locking off
•Sets of data
•Store derived data
Traditional System Development
[Diagram: three applications, each with its own data]
Purchasing Application: PARTS, SUPPLIERS, POs
Accounts Payable: INVOICES, VENDORS, PAYMENTS
Inventory Management: PARTS, ASSYs, REQUISITIONS
•Data belongs to an application
•Point-to-point interfaces
Typical Legacy ‘Architecture’
• Redundant data ($)
• Inconsistent, unreliable data ($$$)
Data Warehouse Concept
Operational Layer (source of record)
Data Transformation Layer: Extract, Translate, Load
Data Warehouse
•Reporting and Analysis
•Read-Only Snapshots
•Historical, Summarized
•Validated, Integrated
Anthony’s Pyramid
Business Perspective
•Strategic Planning
•Management Monitoring & Control
•Business Operations
Systems Perspective
•Decision Support
•Analysis & Reporting
•Transaction Processing
Diagram callouts: "IE only addresses these"; "DW addresses these"
•DW complements Tx Proc
•DW is a pathway to integration
•DW provides immediate value
Inmon Architecture circa 1992
Oper DB
•op detail
•current
•App-oriented
•Unintegrated
Atomic DW
•op grain
•latency
•subject oriented
•time variant
•enterprise integrated
Dept DW
•parochial
•summary
•derived data
Individual
•PC-based
•temporary
•ad-hoc
•heuristic
Relational (3NF) Design
Entities (identifier columns; then other columns)
MARKET: market_id
CUSTOMER: customer_id; market_id, postal_cd, cntry_id
ORDER: order_id; customer_id, employee_id, day_id
ORDER_ITEM: order_id, order_line_id; product_id
PRODUCT_CLASS: product_class_id
PRODUCT_CATEGORY: product_category_id; product_class_id
PRODUCT: product_id; product_category_id
EMPLOYEE: employee_id; territory_id, department_id
DEPARTMENT: department_id; division_id
TERRITORY: territory_id; region_id
REGION: region_id
DIVISION: division_id
ACCOUNT: account_id
INVOICE: invoice_id; account_id
INVOICE_ITEM: invoice_line_id, invoice_id; order_id, order_line_id, day_id
YEAR: year_id
QUARTER: quarter_id; year_id
MONTH: month_id; quarter_id
WEEK: week_id; month_id
DAY: day_id; month_id, week_id, pay_period_id
PAY_PERIOD: pay_period_id; month_id
COUNTRY: cntry_id
STATE: cntry_id, state_id
COUNTY: county_id; state_id, cntry_id
CITY: city_id; county_id
POSTAL_ZONE: postal_cd, cntry_id; city_id
Atomic History Table Design
CSTMR_HST
cstmr_dwid  hst_sqnc_id  strt_ext_ts  end_ext_ts**  hst_crf  cstmr_nm
101         1            Mon night    Wed night     N        Bob
102         1            Mon night    12/31/2999    Y        Joe
103         1            Tues night   12/31/2999    Y        Mary
101         2            Wed night    12/31/2999    Y        Robert
•No gaps or overlaps in time spans for a given DWID value
** Use high date (12/31/2999) instead of NULL for current row
[Timeline for DWID 101: Bob from Mon to Wed, then Robert from Wed to 12/31/2999]
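The history-table rules above (a new row starts exactly where the previous one ends, and the current row carries the high date rather than NULL) can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's implementation: the `HistRow` class, the `apply_change` and `spans_are_contiguous` helpers, and the calendar dates standing in for "Mon night", "Tues night", and "Wed night" are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import date

HIGH_DATE = date(2999, 12, 31)  # used instead of NULL for the current row


@dataclass
class HistRow:
    cstmr_dwid: int
    hst_sqnc_id: int
    strt_ext_ts: date
    end_ext_ts: date
    hst_crf: str  # "Y" for the current row, "N" for superseded rows
    cstmr_nm: str


def apply_change(rows, dwid, new_name, extract_ts):
    """Close the current row for dwid (if any) and append a new current row."""
    current = [r for r in rows if r.cstmr_dwid == dwid and r.hst_crf == "Y"]
    next_seq = 1
    if current:
        old = current[0]
        if old.cstmr_nm == new_name:
            return  # nothing changed, so no new history row
        old.end_ext_ts = extract_ts  # new span starts exactly where the old one ends
        old.hst_crf = "N"
        next_seq = old.hst_sqnc_id + 1
    rows.append(HistRow(dwid, next_seq, extract_ts, HIGH_DATE, "Y", new_name))


def spans_are_contiguous(rows, dwid):
    """True when each span for dwid ends exactly where the next one starts."""
    spans = sorted((r for r in rows if r.cstmr_dwid == dwid),
                   key=lambda r: r.hst_sqnc_id)
    return all(p.end_ext_ts == n.strt_ext_ts for p, n in zip(spans, spans[1:]))


# Hypothetical extract dates standing in for Mon, Tues, and Wed night.
mon, tue, wed = date(2010, 3, 8), date(2010, 3, 9), date(2010, 3, 10)
history = []
apply_change(history, 101, "Bob", mon)
apply_change(history, 102, "Joe", mon)
apply_change(history, 103, "Mary", tue)
apply_change(history, 101, "Robert", wed)
```

After these four calls, `history` holds the same four rows as the CSTMR_HST table on the slide, with the Bob row closed at the Wednesday extract and flagged `N`.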
Inmon Architecture circa 1996
Diagram labels: ODS, EDW, Data Marts
Oper DB
•op detail
•current
•App-oriented
•Unintegrated
Oper Data Store
•op grain
•latency
•subject oriented
•volatile
•enterprise integrated
Enterprise DW
•op grain
•latency
•subject oriented
•time variant
•enterprise integrated
Dept DW
•parochial
•summary
•derived data
Kimball Architecture circa 1996
• 3NF is great for tx processing but is inappropriate for DW
• Dimensional model (star schemas)
• Central fact table: aggregate measures
• Grouped by dimensions: denormalized
• EDW: collection of star schemas (by subject area) with shared (conforming) dimensions
Dimensional (Star Schema) Design
Tables (identifier columns; then other columns)
D_PRODUCT: product_id; product_class_id, product_category_id
D_EMPLOYEE: employee_id; territory_id, region_id, department_id, division_id
D_CUSTOMER: customer_id; market_id, postal_cd, city_id, county_id, state_id, cntry_id
D_DAY: day_id; week_id, pay_period_id, month_id, quarter_id, year_id
F_SALES: day_id, employee_id, product_id, customer_id; sales_units, sales_amount, ytd_sales
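To make the star schema concrete, here is a small sketch using Python's built-in `sqlite3` module. It builds a cut-down F_SALES with two of its dimensions and rolls sales up by product class and month. The sample rows, the reduced column lists, and the choice of SQLite are assumptions for illustration only; the point is that each dimension is reached with a single join because the hierarchy columns are already denormalized into it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE d_product (
    product_id INTEGER PRIMARY KEY,
    product_class_id INTEGER,
    product_category_id INTEGER
);
CREATE TABLE d_day (
    day_id INTEGER PRIMARY KEY,
    month_id INTEGER,
    year_id INTEGER
);
CREATE TABLE f_sales (
    day_id INTEGER REFERENCES d_day(day_id),
    product_id INTEGER REFERENCES d_product(product_id),
    sales_units INTEGER,
    sales_amount REAL
);
""")

# Hypothetical sample data.
conn.executemany("INSERT INTO d_product VALUES (?, ?, ?)",
                 [(1, 10, 100), (2, 10, 101), (3, 20, 200)])
conn.executemany("INSERT INTO d_day VALUES (?, ?, ?)",
                 [(20100308, 201003, 2010), (20100409, 201004, 2010)])
conn.executemany("INSERT INTO f_sales VALUES (?, ?, ?, ?)",
                 [(20100308, 1, 2, 20.0),
                  (20100308, 2, 1, 15.0),
                  (20100409, 1, 3, 30.0),
                  (20100409, 3, 4, 80.0)])

# One join per dimension; no walk through PRODUCT, PRODUCT_CATEGORY, PRODUCT_CLASS.
sales_by_class_month = conn.execute("""
    SELECT p.product_class_id, d.month_id,
           SUM(f.sales_units), SUM(f.sales_amount)
    FROM f_sales AS f
    JOIN d_product AS p ON p.product_id = f.product_id
    JOIN d_day AS d ON d.day_id = f.day_id
    GROUP BY p.product_class_id, d.month_id
    ORDER BY p.product_class_id, d.month_id
""").fetchall()
```

The same question against the 3NF model would need joins through the product and calendar hierarchies before any grouping could happen.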
Proliferation of Data Marts
[Diagram: source systems Sales, MRP, Acctg, and HR; ETL tools; separate data marts Sales DM, Mfg DM, and Finance DM, each with its own BI Tools]
DW Challenges in 2010
• Multiple DWs: Redundant/Inconsistent
• Data Integration/Data Quality
• Performance/Scalability (ETL and BI)
• Maintenance/Extensibility
• Evolving/Expanding Reporting Needs
• Changing/Expanding Source Systems
Program Management Office
[Diagram: business requirements (Bus Req 1 to 4), the PMO, strategic projects (Strgc Proj 1, SP2, SP3, SP4), and tactical projects (Tact Proj 11, TP12, TP13, TP14)]
•Reconcile Business Requirements
•Scope DW Projects
•Justify DW Projects
•Prioritize DW Projects
Data Warehouse Concept
Systems of record (SOR1, SOR2): Order Mgmt, Billing; on Oracle, DB2, SQL Server
•Optimized for Tx processing
•Heterogeneous technology
•Redundant data
•Inconsistent semantics
Extraction, Transformation, Load (ETL)
Data Warehouse
•Optimized for Analysis/Reporting
•Integrated data
•Detail, Summarized, Historic
•Cross-Functional
•Enterprise Perspective
•Flexibility, Ad-Hoc Access
Data Warehouse Architecture
Systems of record (SOR1, SOR2): Order Mgmt, Billing; on Oracle, DB2, SQL Server
ETL steps
•Extraction
•Filter
•Conform Datatype
•Conform Domain
•Error Trapping
•Consolidate
•Generate PKs
•Renormalize
•Aggregate
•Track History
•Load
Data Warehouse
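A few of the ETL steps listed on this slide (Conform Datatype, Conform Domain, Error Trapping, Consolidate, Generate PKs) can be sketched as a tiny pipeline. Everything specific here is hypothetical: the two source row layouts, the `STATUS_DOMAIN` mapping, the "first source wins" consolidation rule, and the function names are assumptions chosen to keep the example short, not the method taught in the course.

```python
from itertools import count

# Hypothetical rows from two systems of record with different conventions.
order_mgmt_rows = [
    {"cust_no": "0042", "name": "Bob", "status": "A"},
    {"cust_no": "0043", "name": "Joe", "status": "?"},
]
billing_rows = [
    {"cust_no": 42, "name": "Bob", "status": "ACTIVE"},
    {"cust_no": 44, "name": "Mary", "status": "INACTIVE"},
]

# Conform Domain: map each source's codes onto one warehouse domain.
STATUS_DOMAIN = {"A": "ACTIVE", "I": "INACTIVE",
                 "ACTIVE": "ACTIVE", "INACTIVE": "INACTIVE"}


def conform(row, source):
    """Conform datatype and domain; raise ValueError if the row cannot be conformed."""
    status = STATUS_DOMAIN.get(str(row["status"]).upper())
    if status is None:
        raise ValueError(f"unknown status {row['status']!r}")
    return {"source": source, "cust_no": int(row["cust_no"]),  # Conform Datatype
            "name": row["name"], "status": status}


def run_pipeline(sources):
    next_key = count(1)
    keys = {}     # natural key -> generated warehouse PK
    loaded = {}   # warehouse PK -> consolidated row
    errors = []   # Error Trapping: rejected rows are logged, not silently dropped
    for source, rows in sources.items():
        for row in rows:
            try:
                clean = conform(row, source)
            except ValueError as exc:
                errors.append((source, row, str(exc)))
                continue
            natural = clean["cust_no"]
            if natural not in keys:
                keys[natural] = next(next_key)  # Generate PKs
            dwid = keys[natural]
            # Consolidate: in this sketch the first source to supply a customer wins.
            loaded.setdefault(dwid, {"dwid": dwid, **clean})
    return loaded, errors


warehouse, error_log = run_pipeline(
    {"order_mgmt": order_mgmt_rows, "billing": billing_rows})
```

Here customer 42 arrives from both systems but is loaded once under a single generated key, while the row with the unrecognized status code ends up in `error_log`.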
RAPID Architecture Components
Layers: SOR1, SOR2; ACQ1, ACQ2; CNF1, CNF2; BASE; HIST; DIM
ETL steps: ETL1 to ETL5
SOR
•Heterogeneous
•Redundancy
•Inconsistency
•Difficult Reports
ETL1
•Extract
•Change Capture
•Audit Columns
ACQ
•Persistent
•Historic
•Non-transformed
•Comprehensive*
*SOR tables
*All rows/cols.
ETL2
•Filter Tables/Rows/Cols
•Conform Data Types
•Conform Values
•Default/Error
CNF
•Transient
•Subset
•Source Layout*
•Target Domains
ETL3
•Renormalize
•Match/Merge
•Gen DW PKs
•Enforce FK/RI
BASE
•Persistent
•Integrated
•Detailed
•Normalized
•Non-Historic
Later ETL steps (ETL4, ETL5)
•Denormalize
•Aggregation
•Derivation
DIM
•Dimensional
•Historic
•Summarized
•Derived Data
Diagram labels: Time Resolved, As-Is, As-Was
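The first ETL step on this slide combines extraction, change capture, and audit columns. One simple way to picture change capture is a snapshot comparison, sketched below. This is an assumption about technique, since the slide does not say how changes are detected (log-based or timestamp-based capture would serve the same purpose), and the audit column names `acq_action` and `acq_extract_ts` are hypothetical.

```python
from datetime import datetime


def capture_changes(previous, current, extract_ts):
    """Compare two snapshots keyed by primary key; emit changed rows with audit columns."""
    changes = []
    for key, row in current.items():
        if key not in previous:
            action = "I"  # inserted at the source since the last extract
        elif previous[key] != row:
            action = "U"  # updated at the source
        else:
            continue      # unchanged rows produce no new row
        changes.append({**row, "acq_action": action, "acq_extract_ts": extract_ts})
    for key, row in previous.items():
        if key not in current:
            # Deleted at the source: keep the last known image, flagged as a delete.
            changes.append({**row, "acq_action": "D", "acq_extract_ts": extract_ts})
    return changes


# Hypothetical source snapshots from two nightly extracts.
monday = {101: {"id": 101, "name": "Bob"}, 102: {"id": 102, "name": "Joe"}}
wednesday = {101: {"id": 101, "name": "Robert"}, 103: {"id": 103, "name": "Mary"}}
acq_rows = capture_changes(monday, wednesday, datetime(2010, 3, 10, 23, 0))
```

The source columns pass through untouched, which matches the "Non-transformed" property listed for the ACQ layer; only the audit columns are added.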
Dimensional Layer
Bus.Objects
Cognos
DW Layers - Purpose/Usage
Layers: SOR1, SOR2; ACQ1, ACQ2; CNF1, CNF2; BASE; HIST; DIM (ETL1 to ETL5 between layers)
ACQ
• Drill Back
• Error Reporting
CNF
• Simplify ACQ-BASE
• Reusable CNF tables
• Expedite Development
• Expedite Prod Schedule
BASE/HIST
• Drill Down
• Reusable BASE-DIM
• Ad-Hoc Reporting
• Operational Reports
DIM
• Simplify Reports
• Performance
• Std. Derivations
• Leverage BI Tools
Diagram labels: As-Is, As-Was
Error Tracking/Reporting
[Diagram: ACQ -> ETL2 -> CNF -> ETL3 -> BASE; RULES 1st Logic; ERR Log; Error Reports through a BI Tool for Data Stewards; paths labeled Inline and Offline]
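One way to read the error-tracking diagram is that validation rules are held as data, applied while rows move between layers, and every failure is written to an error log that data stewards can report on later. The sketch below follows that reading. The rule table layout, the two sample rules, the "default" versus "reject" actions, and all names are assumptions for illustration; the slide does not define them.

```python
# Hypothetical rule table: (rule_id, column, check, action, default_value)
RULES = [
    ("R1", "postal_cd", lambda v: bool(v), "default", "UNKNOWN"),
    ("R2", "cntry_id", lambda v: v in {"US", "CA", "MX"}, "reject", None),
]


def apply_rules(row, err_log):
    """Return the cleaned row, or None if a 'reject' rule fails. Log every failure."""
    cleaned = dict(row)
    for rule_id, column, check, action, default_value in RULES:
        value = cleaned.get(column)
        if check(value):
            continue
        err_log.append({"rule_id": rule_id, "column": column, "value": value,
                        "action": action, "customer_id": row.get("customer_id")})
        if action == "reject":
            return None                  # row does not move to the next layer
        cleaned[column] = default_value  # row continues with a default value
    return cleaned


err_log = []
incoming = [
    {"customer_id": 1, "postal_cd": "90001", "cntry_id": "US"},
    {"customer_id": 2, "postal_cd": "", "cntry_id": "US"},
    {"customer_id": 3, "postal_cd": "X1", "cntry_id": "ZZ"},
]
results = [apply_rules(r, err_log) for r in incoming]
accepted = [r for r in results if r is not None]
```

Because the log records the rule, the offending value, and the action taken, an error report can be produced from `err_log` alone without rerunning the load.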
DW Layers - Purpose/Usage
Layers: SOR1, SOR2; ACQ1, ACQ2; CNF1, CNF2; BASE; HIST; DIM (ETL1 to ETL5 between layers)
ACQ
• Drill Back
• Error Reporting
• Retrospective ETL Rules
CNF
• Simplify ACQ-BASE
• Reusable CNF tables
• Expedite Development
• Expedite Prod Schedule
BASE/HIST
• Drill Down
• Reusable BASE-DIM
• Ad-Hoc Reports
• Operational Reports
DIM
• Simplify Reports
• Performance
• Std. Derivations
• Leverage BI Tools
Diagram labels: As-Is, As-Was
DW Layers - Purpose/Usage
Layers: SOR1, SOR2; ACQ1, ACQ2; CNF1, CNF2; BASE; HIST; DIM (ETL1 to ETL5 between layers)
ACQ
• Drill Back
• Error Reporting
• Retrospective ETL Rules
• Retrospective DW Scope
• Stable ETL1
• Interim Reporting
CNF
• Simplify ACQ-BASE
• Reusable CNF tables
• Expedite Development
• Expedite Prod Schedule
BASE/HIST
• Drill Down
• Reusable BASE-DIM
• Ad-Hoc Reports
• Operational Reports
DIM
• Simplify Reports
• Performance
• Std. Derivations
• Leverage BI Tools
Diagram labels: As-Is, As-Was
ACQ Layer for Interim Reporting
[Diagram: sources SOR1, SOR2, SOR3, and FF4 each load through ETL1 into ACQ1 to ACQ4; a View Layer (V1, V2, V3) and an EII Layer over the ACQ tables; 2ndry Indexes; reporting tools Bus.Objects, SAS, and Cognos]
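The view layer and secondary indexes in this diagram can be illustrated with Python's built-in `sqlite3` module. The sketch below assumes an ACQ table that carries the source columns plus audit columns, then exposes only the current rows through a view that a reporting tool could query. The table, column, index, and view names are hypothetical, and SQLite stands in for whatever database actually hosts the ACQ layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- ACQ table: source layout plus hypothetical audit columns.
CREATE TABLE acq_customer (
    customer_id    INTEGER,
    customer_nm    TEXT,
    acq_extract_ts TEXT,
    acq_crf        TEXT   -- 'Y' marks the current row for a customer
);
-- Secondary index to support the reporting access path.
CREATE INDEX ix_acq_customer_crf ON acq_customer (acq_crf, customer_id);
-- View layer: reports see only current rows, in a stable shape.
CREATE VIEW v_customer_current AS
    SELECT customer_id, customer_nm
    FROM acq_customer
    WHERE acq_crf = 'Y';
""")

conn.executemany("INSERT INTO acq_customer VALUES (?, ?, ?, ?)", [
    (101, "Bob", "2010-03-08", "N"),
    (101, "Robert", "2010-03-10", "Y"),
    (102, "Joe", "2010-03-08", "Y"),
])

current_customers = conn.execute(
    "SELECT customer_id, customer_nm FROM v_customer_current ORDER BY customer_id"
).fetchall()
```

Reports written against the view do not need to know about the audit columns, so the view definition can change later without rewriting the reports.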
Interim Reporting in ACQ Layer
• Rapid deployment of ‘tactical’ reports
• Offload OLTP reporting
• Unified tactical/strategic reporting
• Common extraction/delta process/history
• Early insight/experience into the business requirement
EDW Architecture Roadmap
Project 1, EDW Release 1: S1 to S4; A1 to A4; V1 to V3; EII
Project 2, EDW Release 2: S1 to S4; A1 to A4; St1, St2; Base, Hist, Dim
Project 3, EDW Release 3: S1 to S4; A1 to A4; St1, St2, St4; Base, Hist, Dim
Re-Acquire Legacy DW
[Diagram: layers SOR, ACQ, CNF, BAS, HIS, DIM; Legacy DW; Legacy ETL; Legacy Reports]
Re-Conform Legacy DW
[Diagram: layers SOR, ACQ, CNF, BAS, HIS, DIM; Legacy DWs; Legacy ETL; Legacy Reports]
Re-Integrate Legacy DW
[Diagram: layers SOR, ACQ, CNF, BAS, HST, DIM; Legacy DWs; Legacy ETL; Legacy Reports; Re-Arch Reports]
RAPID Architecture Summary
• Multi-layer, modular design
• Supports incremental development
• Durable for expansion
• Flexible for changes
• Leverage reusability
• Strategic and tactical solutions
• Ad-hoc and structured reporting
• Technology neutral
Questions
Bill Inmon on Ralph Kimball
The Data Warehouse Toolkit, 1996
Foreword: “…Kimball’s stark cognizance and revolutionary approaches…have been tested in crucible of reality. DW Toolkit is one of the definitive books of our industry and mandatory reading for IT professionals…to successfully and profitably conduct business.”
W.H. Inmon
Prism Solutions
Aug. 27, 1995
Retrospective Scope Expansion
[Chart: Release 1 to Release 4; axes: Time, Incremental data elements]
ACQ Layer Storage Management
XYZ: 200 tables
•90 SOR
•110 Non-SOR
ACQ_XYZ (loaded by ETL1): 90 SOR
•15 In Scope
•75 Out Scope
CNF_XYZ (loaded by ETL2): 15 In Scope
Monthly Archive to Off Line Storage
•Out Scope, CRF=N
•In Scope >120d
Restore (On-Demand)
•Specify tables and dates
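The monthly archive criteria on this slide can be written as a small predicate. The sketch below takes one possible reading: rows from out-of-scope tables are archived once they are no longer current (CRF=N), and rows from in-scope tables are archived once they are more than 120 days old. That interpretation, the use of an extract date to measure age, and the function and parameter names are all assumptions rather than something the slide states.

```python
from datetime import date, timedelta

IN_SCOPE_RETENTION = timedelta(days=120)


def should_archive(in_scope, crf, extract_date, today):
    """Decide whether an ACQ row is eligible for the monthly archive (assumed rules)."""
    if not in_scope:
        # Out-of-scope tables: only non-current rows leave online storage.
        return crf == "N"
    # In-scope tables: rows older than the retention window are archived.
    return today - extract_date > IN_SCOPE_RETENTION


today = date(2010, 3, 9)
examples = {
    "out_scope_old_version": should_archive(False, "N", date(2010, 3, 1), today),
    "out_scope_current": should_archive(False, "Y", date(2009, 1, 1), today),
    "in_scope_121_days": should_archive(True, "N", today - timedelta(days=121), today),
    "in_scope_120_days": should_archive(True, "N", today - timedelta(days=120), today),
}
```

A matching restore routine would take table names and a date range, as the slide's "Specify tables and dates" note suggests, and copy the selected rows back from offline storage.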
Data Warehouse Critical Success Factors (DW C.S.F.s)
•Incremental/Opportunistic
•Business/IT Partnership
•Architecture/Methodology
•Resources/Organization
•Governance
•Technology/Tools