79
Starring Sakila A Data Warehousing and Business Intelligence Tutorial

A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

  • Upload
    lytruc

  • View
    252

  • Download
    13

Embed Size (px)

Citation preview

Page 1: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

A Data Warehousing and Business Intelligence

Tutorial

Page 2: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Welcome!

Matt CastersChief Data Integration, Pentahohttp://www.ibridge.be/ @mattcasters

Roland BoumanSoftware Engineer, Pentahohttp://rpbouman.blogspot.com/@rolandbouman

Page 3: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Pentaho

● Commercial Open Source Business Intelligence– Full BI suite since 2005

● Projects: Kettle (DI & ETL), Jfree (Reporting), Mondrian (OLAP), Weka (Data Mining)

● Community: CDF (Dashboarding), Saiku (OLAP)

● Recent: Focus on “Big Data”, esp. Hadoop● http://www.pentaho.com● http://sourceforge.net/projects/pentaho/

Page 4: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Agenda

● Business Intelligence● Data Warehousing● Anatomy of a Data Warehouse● Physical Implementation● Sakila – a Star is Born● Filling the Data Warehouse● Presenting the Data - BI Applications

Page 5: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part I:

Business Intelligence

Page 6: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Business Intelligence

● Skills, technologies, applications and practices to acquire a better understanding of the commercial context of your business

● Turning data into information useful for business users

– Management Information– Decision Support

Page 7: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Business Intelligence Scope

Operational

Strategic

Tactical

Customers Partners Employees

Analysts

Executives

Managers

days, weeks:

“Who's available for tomorrow's shift”

weeks, months:

“In what region should we open a new store?”

months, years:

“Should we become an ap-pliance vendor instead of delivering software solu-tions”

Reporting

OLAP/Analysis

Data mining

Page 8: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Functional Parts of a Business Intelligence Solution

● Front end Applications:

– Reports– Charts and Graphs– OLAP Pivot tables– Data Mining– Dashboards

● Back end Infrastructure

– Data Integration– Data Warehouse– Data Mart– Metadata– (ROLAP) Cube

Page 9: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

High Level BI Architecture

ERP

Staging Area

Enterprise Data Warehouse

Meta Data

Extract Transform Load Present

Back-end Front-end

Datamarts

Sources

CRM

External Data

OLAP/Analysis

Reporting

Charts /Graphs

Dashboards

Data Mining

Operational Datastore

Page 10: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part II:

Data Warehousing

Page 11: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

What is a Data Warehouse?

● A database designed to support Business Intelligence Applications

● Different requirements as compared to Operational Applications

● Analytic Database Systems (ADBMS)– MySQL: Infobright, InfiniDB

– LucidDB, MonetDB

Page 12: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

What is a Data Warehouse?

● Ultimately, it's just a Relational Database– Tables, Columns, ...

● Designed for Business Intelligence applications– Ease of use

– Performance

● Data from various source systems– Integration, Standardization, Data cleaning

● Add and maintain history– Corporate memory

Page 13: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

What is a Data Warehouse?

● A database designed to support BI applications● BI applications (OLAP) differ from Operational

applications (OLTP)– OLTP: Online Transaction Processing

– OLAP: Online Analytical Processing

● Differences:– Applications, Data Processing, Data Model

Page 14: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

OLTP vs OLAP:Application Characterization

● OLTP– Operational

– 'Always' on

– All kinds of users

– Many users

– Directly supports business process

– Keep a Record of Current status

● OLAP– Tactical, Strategic

– Periodically Available

– Managers, Directors

– Few(er) users

– Redesign Business Process

– Decision support, long-term planning

– Maintains a history

Page 15: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

OLTP vs OLAP: Data Processing

● OLTP– Transactions

– Subject Oriented

– Add, Modify, Remove single rows

– Human data entry

– Queries for small sets of rows with all their details

– Standard queries

● OLAP– Groups

– Aspect Oriented

– Bulk load, rarely modify, never remove

– Automated ETL jobs

– Scan large sets to return aggregates over arbitrary groups

– Ad-hoc queries

Page 16: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

OLTP vs OLAP:Data Model

● OLTP– Entity-Relationship

model

– Entities, Attributes, Relationships

– Foreign key constraints

– Indexes to increase performance

– Normalized to 3NF or BCNF

● OLAP– Dimensional

model

– Facts, Dimensions, Hierarchies

– Ref. integrity ensured in loading process

– Scans on Fact table obliterates indexes

– Denormalized Dimensions (<= 1NF)

Page 17: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

High Level BI Architecture

ERP

Staging Area

Enterprise Data Warehouse

Meta Data

Extract Transform Load Present

Back-end Front-end

Datamarts

Sources

CRM

External Data

OLAP/Analysis

Reporting

Charts /Graphs

Dashboards

Data Mining

Operational Datastore

Page 18: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part III:

Dimensional Model

Page 19: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

What is the Dimensional Model?

● An aspect-oriented logical data model optimized for querying and data presentation

● Divides data in two kinds:

– Facts– Dimensions

Page 20: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

The Dimensional Model

● Facts

– Measures/Metrics of a Business Process– Examples: Cost, Units Sold, Profit

● Dimensions

– Context of Business Process– Who? What? Where? When? Why?– Navigate Facts: Selection, Rollup, Drilldown– Provide and maintain history

Page 21: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dimensional Data Presentation

Date Dimension 2008 Q4

Location Dimension

All Months

October November December

All locations $ 3850 $ 1000 $ 1350 $ 1500

America All America

$ 2050 $ 500 $ 750 $ 800

North $ 1275 $ 300 $ 500 $ 475

South $ 775 $ 200 $ 250 $ 325

Europe All Europe

$ 1800 $ 500 $ 600 $ 700

East $ 800 $ 250 $ 250 $ 300

West $ 1000 $ 250 $ 350 $ 400

Page 22: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

The Dimensional Model: Facts

● Fact table structure:

– Several measures– Keys to dimension tables

● Measures:

– Usually numeric, Additive, Semi-additive– Sometimes pre-calculated

● Rapidly growing!

– Millions, Billions of rows (Terabytes)

Page 23: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

The Dimensional Model: Dimensions

● Dimension table structure:

– Surrogate key and descriptive text attributes ● Relatively few rows

– Exception: Customer 'Monster' dimension● Relatively static

– Exception: Slowly changing dimensions● Used to navigate through fact data

– Hierarchies

Page 24: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

The Dimensional Model: Navigating data with Dimensions

● Selection (Filter)● Navigation: Attributes organized in Hierarchies

– Date dimension examples:● Year, Quarter, Month, Day● Year, Week, Day

● Groupings for Aggregation

– 'Roll up', 'Drill Down'– 'Slice and Dice'

Page 25: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

The Dimensional Model: Maintaining History

● Fact table usually links to a date dimension● Dimensions maintain their own history

– Slowly changing dimensions● Type I Overwrite (no history)● Type II

– History kept in rows (versioning)● Type III

– History kept in columns

Page 26: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part V:

Physical Implementation

Page 27: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dimensional Model Implementation: Star Schema

● Related metrics stored in a Fact table● Fact table references relevant dimensions● Each Dimension stored in a Dimension Table● Dimension tables shared by multiple fact tables

Page 28: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Rentals

Star Schema example: Sakila Rentals

Store

Date

Time

Film

CustomerStaff

Page 29: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Star Schema example: Sakila Rentals

fact_rentalfact_inventory fact_payment

dim_date

dim_customerdim_store dim_staffdim_store

dim_film

Page 30: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Stars versus Snowflakes

● Star schema is 'just' an implementation– Optimized for simplicity

– Optimized for performance (?)

– Heavily denormalized dimensions

● There is an alternative: Snowflake– Normalized dimensions

Page 31: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Snow Flake example: Sakila Rentals

StoreDate

Minute

Film

Customer

Staff

Month

Hour

Quarter City

Country

City

Country

Language

Rating

Year

Week Rentals

Page 32: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part V:

A Star is Born

Page 33: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dimensional Model example

● MySQL Sample Database– http://dev.mysql.com/doc/sakila/en/sakila.html

● DVD rental business– Overly simplified database schema

● Typical OLTP database

Page 34: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

3NF Source schema: Sakila Rentals

Rental Customer

Film

Store Address

Category Actor

StaffInventory

City

CountryLanguage

Page 35: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Target Star Schema

Fact: Rentals

Store

Date

Time

Film

When?

Where?

What?

CustomerStaff

Who?

Page 36: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dimensional Design

● Select Business Process– Sales, Purchase, Storage, ...

● Define Facts and Key Metrics– Facts: Key Event in Business Process

– Metrics (Fact Attributes): Count or Amount

● Choose Dimensions and Hierarchies– What? When? Where?

– Who? Why?

Page 37: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Example Business Process:Rentals

● Select Business Process– Rentals

● Identify Facts– Count (number of rentals)

– Rental Duration

● Choose Dimensions– What: Films

– When: Rental, Return

– Who: Customer, Staff

– Where: Store

Page 38: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Rentals 3NF

Rental

CustomerStaffInventory

Page 39: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Rentals 3NF

Rental

CustomerStaffInventory

StoreFilm

Category

Film Category

Page 40: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Denormalize

Rental

CustomerStaffInventory

StoreFilm

Category

Film Category

Page 41: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Denormalize

Rental

CustomerStaff

StoreFilm

StoreCategory

Page 42: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born

Rental

CustomerStaff

StoreFilm

Store

Address

Category

Page 43: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Denormalize

Rental

CustomerStaff

StoreFilm

Store

Address

Category

Page 44: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Denormalize

Rental

CustomerStaff

StoreFilm

Store

AddressAddress

Language

Category

Page 45: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Denormalize

Rental

CustomerStaff

StoreFilm

Store

AddressAddress

LanguageCityCity

Category

Page 46: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Rental Snowflake

Rental

CustomerStaff

StoreFilm

Store

AddressAddress

LanguageCityCity

CountryCountry

Category

Page 47: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born: Rental Star Schema

Rental

StoreLanguage

Film

Country

City

What: Film Who: CustomerWhere: Store Who: Staff

Address

Store

Staff

Country

City

Address

Customer

Category

Page 48: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dimensional Design

● Something is missing....– Who ? (Customer, Staff)

– What ? (Film)

– Where ? (Store)

– .... ?

Page 49: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

A star is born:Rental Date and Time

Rental

What: Film Who: CustomerWhere: Store Who: Staff

When: Date When: Time

Page 50: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Role Playing: Date/Timefor both Rentals and Returns

Rental

What: Film Who: CustomerWhere: Store Who: Staff

When:Rental Date

When:Rental Time

When:Return Date

When:Return Time

Page 51: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho
Page 52: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Rental Star Schema

Page 53: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part IV:

Filling the Data Warehouse

Page 54: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

High Level Data Warehouse Architecture

ERP

Staging Area

Data Warehouse

Meta Data

BI Applications

ReportingAnalysisVisualizationsDashboardsData Mining

Extract Transform Load Present

Back-end Front-end

Datamarts

Sources

CRM

External Data

Page 55: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Planning the ETL Process

● Physical Design● Source to Target Mapping

– Define how data in the data warehouse is derived from data in the source system(s)

– Specification for designing the ETL process

● Column-level mapping– Source system, schema, table, column, data type

– Target dimension/fact, column, defaults

– Transformation rules, cleansing, lookup, calculation

Page 56: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Designing the ETL Process

● Staging?● Changed Data Capture / Extraction● Denormalization● Derived data / Enrichment● Cleansing / Conforming● History policy (dimensions)● Granularity● Dimension Lookup (facts)

Page 57: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Designing ETL with Kettle

● Flow ETL Engine● Transformations

– Data flow and processing

● Jobs– Workflow of ETL tasks

● Tools– Spoon

– Kitchen

– Pan

Page 58: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Loading a Fact Table

● Load Dimension Tables● Load Fact table

Page 59: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Loading a Dimension Table

● Get Customers source data● Lookup Address (Denormalize)● Update Dimension

Page 60: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Loading a Fact Table

Page 61: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Starring Sakila

Part V:

Presenting the Data:BI Applications

Page 62: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Business Intelligence Scope

Operational

Strategic

Tactical

Customers Partners Employees

Analysts

Executives

Managers

days, weeks:

Who's available for tomorrow's shift

weeks, months:

In what region should we open a new store?

months, years:

Should we become an ap-pliance vendor instead of delivering software solu-tions

Reporting

OLAP/Analysis

Data mining

Page 63: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Reporting

Page 64: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Reporting● Mostly Operational● Lists and Grouping● Typically standardized● Typically no or limited interactivity

– Subreporting

Page 65: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Scope of Reporting

Operational

Customers Partners Employees

days, weeks:

Who's available for tomorrow's shift

weeks, months:

In what region should we open a new store?

months, years:

Should we become an ap-pliance vendor instead of delivering software solu-tions

Reporting

OLAP/Analysis

Data mining

Strategic

Tactical

Analysts

Executives

Managers

Page 66: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Reporting

Page 67: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Analysis

Page 68: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Analysis● Tactical, Strategic● OLAP

– Online Analytical Processing● Pivot tables● Typically Interactive

– Slice and Dice– Drilldown

● Typically Ad-hoc

Page 69: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Scope of OLAP & Analysis

Operational

Customers Partners Employees

days, weeks:

Who's available for tomorrow's shift

weeks, months:

In what region should we open a new store?

months, years:

Should we become an ap-pliance vendor instead of delivering software solu-tions

Reporting

OLAP/Analysis

Data mining

Tactical

AnalystsManagers

Strategic

Executives

Page 70: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Analysis Interactive Pivot table

Page 71: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Data Mining

Page 72: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Data Mining● Strategic, Tactical● Discover hidden patterns in data● Machine learning● Statistic analysis● Typically not interactive, long running● Expert matter● Not readily consumable by end-users

– Characteristics of back-end processing

Page 73: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Scope of Data Mining

Operational

Customers Partners Employees

days, weeks:

Who's available for tomorrow's shift

weeks, months:

In what region should we open a new store?

months, years:

Should we become an ap-pliance vendor instead of delivering software solu-tions

Reporting

OLAP/Analysis

Data mining

Strategic

Tactical

Analysts

Executives

Managers

Page 74: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Data Mining

Page 75: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Charts and Graphs

Page 76: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Charts and Graphs● Operational, Tactical, Strategic● Summarize large dataset● Not a separate class but a presentation

– Data Visualization● Standardized or ad-hoc● Can be interactive

– Drive a subreport– Drive drilldown

Page 77: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dashboarding

Page 78: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dashboarding● Operational, Tactical, Strategic● Not a separate class but a presentation● Bundle:

– key metrics for a particular role or perspective

– different views on the same metrics● Can contain reports, pivot tables, charts, graphs● Typically interactive

Page 79: A Data Warehousing and Business Intelligence Tutorial · A Data Warehousing and Business Intelligence Tutorial. Starring Sakila Welcome! Matt Casters Chief Data Integration, Pentaho

Dashboard