Chapter 16. Insurance

University of Seoul, Artificial Intelligence Laboratory

G201549028 조찬연

https://github.com/lovebube/

The Data Warehouse Toolkit

The Data Warehouse Toolkit Chapter 16. Insurance · datamining.uos.ac.kr/wp-content/uploads/2015/09/Chapter... · 2015-12-17 · Mini-dimensions for Large or Rapidly Changing Dimensions



Contents

✓ Insurance Case Study
✓ Policy Transactions
✓ Premium Periodic Snapshot
✓ More Insurance Case Study
✓ Claim Transactions/Snapshot
✓ Factless Accident Events
✓ Common Dimensional Modeling Mistakes


Insurance Case Study

✓ An insurer has a complicated relationship with its policyholders.
✓ The industry is growing very fast.
✓ Internal systems and processes already capture the bulk of the data required, but the data is not integrated.
✓ So integrated data is needed.


Value Chain

✓ The insurer issues policies, collects premium payments, and processes claims.
✓ The organization is interested in better understanding the metrics spawned by each of these events.
✓ The value chain begins with a variety of policy transactions.
✓ The business also needs to better understand the premium revenue associated with each policy on a monthly basis. This is a key input into the overall profit picture.


Draft Bus Matrix

✓ The draft matrix has two core business-process rows: policy transactions and the premium snapshot.

Initial draft bus matrix


Policy Transactions

✓ Coverages can be considered the insurance company's products.
✓ Homeowner coverages include fire, flood, theft, and personal liability.
✓ Agents sell policies to policyholders.
✓ There are two dates associated with each policy transaction:
  ✓ The date the transaction was entered into the operational system.
  ✓ The policy transaction effective date (when the transaction legally takes effect).


Policy Transactions

✓ There are three basic techniques for handling slowly changing dimensions (SCDs):
  ✓ Type 1: simply overwrite the dimension attribute's value.
  ✓ Type 2: add a new dimension row with a new surrogate key and use it. Past values are not deleted.
  ✓ Type 3: add an attribute, labeled as historical for differentiation, to retain the old classifications.

Slowly Changing Dimensions
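
The three techniques above can be sketched on a toy in-memory dimension. This is only an illustration under stated assumptions, not the book's implementation: the row layout, the column names (`policyholder_key`, `policyholder_id`, `segment`, `prior_segment`, `is_current`) and the helper functions are all hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List, Optional


@dataclass
class PolicyholderRow:
    # All column names are hypothetical, chosen for illustration only.
    policyholder_key: int                  # surrogate key
    policyholder_id: str                   # natural key from the operational system
    segment: str
    prior_segment: Optional[str] = None    # used only by type 3
    is_current: bool = True                # used only by type 2


def scd_type1(rows: List[PolicyholderRow], natural_id: str, new_segment: str) -> None:
    """Type 1: overwrite the attribute in place; the old value is lost."""
    for row in rows:
        if row.policyholder_id == natural_id:
            row.segment = new_segment


def scd_type2(rows: List[PolicyholderRow], natural_id: str, new_segment: str) -> None:
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    next_key = max(r.policyholder_key for r in rows) + 1
    for row in rows:
        if row.policyholder_id == natural_id and row.is_current:
            row.is_current = False
            rows.append(replace(row, policyholder_key=next_key,
                                segment=new_segment, is_current=True))
            break  # stop iterating right after the list was extended


def scd_type3(rows: List[PolicyholderRow], natural_id: str, new_segment: str) -> None:
    """Type 3: keep the old value in a separate 'prior' column on the same row."""
    for row in rows:
        if row.policyholder_id == natural_id:
            row.prior_segment = row.segment
            row.segment = new_segment


dim = [PolicyholderRow(1, "PH-100", "Standard")]
scd_type2(dim, "PH-100", "Preferred")
```

After the type 2 call, `dim` holds two rows for the same policyholder: the expired "Standard" row with key 1 and the current "Preferred" row with key 2, so facts loaded before and after the change point at different rows.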


Mini-dimensions for Large or Rapidly Changing Dimensions

✓ The policyholder dimension qualifies as a large dimension, with more than 1 million rows.
✓ It is often important to accurately track content values for a subset of attributes.
✓ Split the closely monitored, more rapidly changing attributes into one or more type 4 mini-dimensions, directly linked to the fact table with a separate surrogate key.
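
A minimal sketch of the mini-dimension idea follows. The slide does not say which attributes are split out, so the banded attributes (`AGE_BANDS`, `RISK_BANDS`), the key name `policyholder_profile_key` and the helper functions are assumptions made for illustration.

```python
from itertools import product

# Hypothetical banded attributes; the slide does not list the real ones.
AGE_BANDS = ["18-29", "30-49", "50+"]
RISK_BANDS = ["Low", "Medium", "High"]

# The mini-dimension holds one row per combination of banded values,
# each with its own surrogate key, built once up front.
mini_dim = {
    combo: key
    for key, combo in enumerate(product(AGE_BANDS, RISK_BANDS), start=1)
}


def mini_dim_key(age_band: str, risk_band: str) -> int:
    """Look up the mini-dimension surrogate key for a profile."""
    return mini_dim[(age_band, risk_band)]


def make_fact_row(policyholder_key: int, age_band: str, risk_band: str, amount: float) -> dict:
    """Build a fact row that carries both the policyholder key and the profile key."""
    return {
        "policyholder_key": policyholder_key,
        "policyholder_profile_key": mini_dim_key(age_band, risk_band),
        "amount": amount,
    }


# The same policyholder changes profile between two transactions.
before = make_fact_row(42, "18-29", "High", 500.0)
after = make_fact_row(42, "30-49", "Medium", 450.0)
```

Both fact rows keep `policyholder_key` 42; only the profile key changes, so the large policyholder dimension gains no new rows when the volatile attributes move.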


Policy Transaction Fact Table

Policy transaction schema


Heterogeneous Supertype and Subtype Products

✓ Insurance companies typically are involved in multiple, very different lines of business.
✓ The detailed parameters of homeowners' coverages differ significantly from automobile coverages, and both differ substantially from personal property coverage, general liability coverage, and other types of insurance.
✓ So the initial schema must be generalized: a supertype schema covers all lines of business, and subtype schemas add the detail specific to each line.


Policy transaction schema with subtype automobile dimension tables.


Premium Periodic Snapshot

✓ When designing the premium periodic snapshot table, you should strive to reuse as many dimensions from the policy transaction table as possible.
✓ Business management wants to know how much premium revenue was written each month, as well as how much revenue was earned.
✓ This calls for two premium revenue metrics: written premium and earned premium.
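
The difference between the two metrics can be shown with a small worked example. It assumes that the whole premium is written in the policy's first month and earned evenly over the term; that straight-line rule, the function name and the figures are illustrative assumptions, not taken from the slides.

```python
from decimal import Decimal


def monthly_premium_snapshot(written: Decimal, term_months: int, start_month: int):
    """Yield (month, written_premium, earned_premium) rows for one policy.

    Assumes the whole premium is written in the first month and earned
    evenly over the term (a simplifying assumption for illustration).
    """
    earned_per_month = written / term_months
    for offset in range(term_months):
        month = start_month + offset
        written_this_month = written if offset == 0 else Decimal("0")
        yield month, written_this_month, earned_per_month


rows = list(monthly_premium_snapshot(Decimal("1200"), term_months=12, start_month=1))
total_written = sum(r[1] for r in rows)
total_earned = sum(r[2] for r in rows)
```

For a 1,200 annual premium, month 1 shows 1,200 written and 100 earned, and every later month shows 0 written and 100 earned. Over the full term the two totals agree.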


Periodic premium snapshot schema


Multivalued Dimensions

Bridge table for multiple drivers on a policy
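
A bridge table like the one named in the caption can be sketched as follows. The sketch assumes each bridge row carries a weighting factor and that the factors for one policy sum to 1; the keys, weights and premium amounts are invented for illustration.

```python
from collections import defaultdict

# Bridge rows: (policy_key, driver_key, weighting_factor). All values are made up.
policy_driver_bridge = [
    (101, "driver_a", 0.5),
    (101, "driver_b", 0.5),
    (102, "driver_c", 1.0),
]

# Fact rows: (policy_key, earned_premium).
premium_facts = [(101, 900.0), (102, 600.0)]


def premium_by_driver(facts, bridge):
    """Allocate each policy's premium to its drivers using the weighting factors."""
    drivers_for_policy = defaultdict(list)
    for policy_key, driver_key, weight in bridge:
        drivers_for_policy[policy_key].append((driver_key, weight))

    totals = defaultdict(float)
    for policy_key, premium in facts:
        for driver_key, weight in drivers_for_policy[policy_key]:
            totals[driver_key] += premium * weight
    return dict(totals)


allocated = premium_by_driver(premium_facts, policy_driver_bridge)
```

Because the weights for each policy sum to 1, the allocated amounts add back up to the original premium total, which avoids double counting when a policy has several drivers.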


Policyholder reserve

Policyholder reserve /pɑ:ləsihoʊldə(r) rɪzɜ:rv/ noun.

With respect to an insurance company, an amount representing the estimated payments to policyholders (as determined by actuaries) based on the types and terms of the various insurance policies issued by the company.


More Insurance Case Study Background

✓ Before the insurance company pays any claim, there is usually an investigative phase.
✓ During this phase, the covered item is examined and the claimant, policyholder, or other individuals involved are interviewed.
✓ After the investigative phase, the insurance company issues a number of payments.


Updated Insurance Bus Matrix


Claim Transactions

✓ The operational claim processing system generates a slew of transactions, including the following transaction task types:
  ✓ Open claim, reopen claim, close claim
  ✓ Set reserve, reset reserve, close reserve
  ✓ Adjuster inspection, adjuster interview
  ✓ Open lawsuit, close lawsuit
  ✓ Make payment, receive payment


Detailed implementation bus matrix


Claim transaction schema


Claim Accumulating Snapshot

✓ Even with a robust transaction schema, there is a whole class of urgent business questions that can't be answered using only transaction detail.
✓ The accumulating snapshot can deliver time lags, computed as the raw difference between two dates.
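
A time lag computed as the raw difference between two dates can be sketched like this. The milestone names and dates are hypothetical; the slides do not list the milestones on the snapshot row.

```python
from datetime import date
from typing import Optional


def lag_days(start: Optional[date], end: Optional[date]) -> Optional[int]:
    """Raw day difference between two milestone dates; None while a milestone is open."""
    if start is None or end is None:
        return None
    return (end - start).days


# One accumulating-snapshot row for a claim; milestone names and dates are hypothetical.
claim = {
    "claim_open_date": date(2015, 3, 2),
    "first_payment_date": date(2015, 3, 20),
    "claim_close_date": None,  # not reached yet; filled in when the claim closes
}

open_to_first_payment = lag_days(claim["claim_open_date"], claim["first_payment_date"])
open_to_close = lag_days(claim["claim_open_date"], claim["claim_close_date"])
```

Here the lag from open to first payment is 18 days, while the open-to-close lag stays empty until the row is revisited and the close date is filled in.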


Claim Accumulating Snapshot Schema


Policy/Claim Consolidated Periodic Snapshot

Policy/claim consolidated fact table


Common Dimensional Modeling Mistakes to Avoid


10. Place Text Attributes in a Fact Table

✓ The descriptive textual attributes comprising the context of the measurements go in dimension tables.
✓ You need to get text attributes off the main runway of the data warehouse and into dimension tables.


9. Limit Verbose Descriptors to Save Space

✓ Our job as designers of easy-to-use dimensional models is to supply as much verbose descriptive context in each dimension as possible.
✓ Make sure every code is augmented with readable descriptive text.


8. Split Hierarchies into Multiple Dimensions

✓ A hierarchy is a cascaded series of many-to-one relationships. Business users understand hierarchies.
✓ Our job is to present the hierarchies in the most natural and efficient manner in the eyes of the users, not in the eyes of a data modeler.


7. Ignore the Need to Track Dimension Changes

✓ Business users often want to understand the impact of changes on at least a subset of the dimension tables' attributes.
✓ Likewise, if a group of attributes changes rapidly, you can split a dimension to capture the more volatile attributes in a mini-dimension.


6. Solve All Performance Problems with More Hardware

✓ Aggregates, or derived summary tables, are a cost-effective way to improve query performance.
✓ Other levers besides hardware include choosing query-efficient DBMS software and increasing real memory size.
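
As a rough illustration of an aggregate, the sketch below precomputes a month-by-coverage summary from atomic fact rows. The grain, the names and the values are invented for the example.

```python
from collections import defaultdict

# Atomic fact rows: (month, coverage, premium). Values are invented for illustration.
atomic_facts = [
    ("2015-01", "Fire", 100.0),
    ("2015-01", "Fire", 150.0),
    ("2015-01", "Theft", 50.0),
    ("2015-02", "Fire", 120.0),
]


def build_aggregate(facts):
    """Precompute a month-by-coverage summary table from the atomic rows."""
    summary = defaultdict(float)
    for month, coverage, premium in facts:
        summary[(month, coverage)] += premium
    return dict(summary)


monthly_coverage_agg = build_aggregate(atomic_facts)
```

A query at the month and coverage level can read the three summary rows instead of scanning every atomic row, while the atomic rows remain available for finer-grained questions.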


5. Use Operational Keys to Join Dimensions and Facts

✓ Novice designers are sometimes too literal minded when designing the dimension keys.
✓ An operational or intelligent key should be replaced with a simple integer surrogate key.
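
One simple way to swap operational keys for integer surrogate keys during loading is a lookup map, sketched below. The class name `SurrogateKeyMap` and the operational codes are hypothetical; a real load would typically persist the map in the dimension table itself.

```python
from itertools import count


class SurrogateKeyMap:
    """Hand out simple sequential integer keys for operational (natural) keys."""

    def __init__(self):
        self._next = count(start=1)
        self._keys = {}

    def key_for(self, operational_key: str) -> int:
        # Reuse the existing surrogate key, or assign the next integer.
        if operational_key not in self._keys:
            self._keys[operational_key] = next(self._next)
        return self._keys[operational_key]


coverage_keys = SurrogateKeyMap()
# Operational codes below are invented for illustration.
source_rows = [("HOME-FIRE-2015", 120.0), ("AUTO-COLL-2015", 80.0), ("HOME-FIRE-2015", 45.0)]
fact_rows = [(coverage_keys.key_for(code), amount) for code, amount in source_rows]
```

The fact rows now carry only small integers, and the meaning embedded in the operational code lives in dimension attributes rather than in the join key.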


4. Neglect to Declare and Comply with the Fact Grain

✓ All dimensional designs should begin by articulating the business process that generates the numeric performance measurements.
✓ The exact granularity of that data must be specified.
✓ Staying true to the grain is a crucial step in the design of a dimensional model.


3. Use a Report to Design the Dimensional Model

✓ A dimensional model has nothing to do with an intended report.
✓ The team should focus on the measurement processes. The user's requirements can be handled with a well-designed schema for the atomic data.


2. Expect Users to Query Normalized Atomic Data

✓ Data that has been aggregated in any way has been deprived of some of its dimensions.
✓ Do not build a dimensional model with aggregated data and expect users and their BI tools to seamlessly drill down to third normal form data for the atomic details.


1. Fail to Conform Facts and Dimensions

✓ The single most important design technique in the dimensional modeling arsenal is conforming dimensions.
✓ Conformed dimensions allow teams to be more agile because they're not re-creating the wheel repeatedly.
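
What a conformed dimension makes possible can be sketched with a drill-across query: two fact tables are summarized separately by the same dimension attribute and then merged. The coverage dimension, the fact tables and all values below are invented for illustration.

```python
from collections import defaultdict

# One conformed coverage dimension shared by both fact tables (all data invented).
coverage_dim = {1: "Fire", 2: "Theft"}

premium_facts = [(1, 1000.0), (2, 400.0), (1, 500.0)]  # (coverage_key, earned_premium)
claim_facts = [(1, 300.0), (2, 100.0)]                 # (coverage_key, claim_paid)


def total_by_coverage(facts):
    """Summarize one fact table by the conformed coverage name."""
    totals = defaultdict(float)
    for coverage_key, amount in facts:
        totals[coverage_dim[coverage_key]] += amount
    return totals


def drill_across():
    """Query each fact table separately, then merge on the shared row header."""
    premiums = total_by_coverage(premium_facts)
    claims = total_by_coverage(claim_facts)
    return {
        name: {"earned_premium": premiums.get(name, 0.0), "claim_paid": claims.get(name, 0.0)}
        for name in sorted(set(premiums) | set(claims))
    }


report = drill_across()
```

The merge works only because both fact tables use the same coverage keys and labels. If each process kept its own coverage list, the two result sets would have no reliable row header to join on.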
