Chapter 16. Insurance
University of Seoul, Artificial Intelligence Laboratory
G201549028 Cho Chan-yeon
https://github.com/lovebube/
The Data Warehouse Toolkit
1/35
Contents
✓ Insurance Case Study
✓ Policy Transactions
✓ Premium Periodic Snapshot
✓ More Insurance Case Study
✓ Claim Transactions/Snapshot
✓ Factless Accident Events
✓ Common Dimensional Modeling Mistakes
2/35
Insurance Case Study
✓ Insurance companies have a complicated relationship with their policyholders.
✓ The industry has grown very fast.
✓ Internal systems and processes already capture the bulk of the required data, but the data is not integrated.
✓ So, integrated data is needed.
3/35
Value Chain
✓ The insurer issues policies, collects premium payments, and processes claims.
✓ The organization is interested in better understanding the metrics spawned by each of these events.
✓ The value chain begins with a variety of policy transactions.
✓ The organization also needs to better understand the premium revenue associated with each policy on a monthly basis. This is a key input into the overall profit picture.
4/35
Draft Bus Matrix
✓ The draft covers two core business processes: policy and premium.
Initial draft bus matrix
5/35
Policy Transactions
✓ Coverages can be considered the insurance company’s products.
✓ Homeowner coverages include fire, flood, theft, and personal liability.
✓ Agents sell policies to policyholders.
✓ There are two dates associated with each policy transaction:
  ✓ The date the transaction was entered into the operational system.
  ✓ The policy transaction effective date (when the transaction legally takes effect).
6/35
Policy Transactions
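✓ There are three basic techniques for handling slowly changing dimensions (SCD):
  ✓ Type 1: simply overwrite the dimension attribute’s value.
  ✓ Type 2: add a new dimension row with a new surrogate key and use it going forward. The past value is not deleted.
  ✓ Type 3: add a new attribute labeled as historical, to retain the old classification alongside the current one.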
Slowly Changing Dimensions
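The three techniques can be sketched in a few lines of Python. This is a minimal illustration only: the row layout (`surrogate_key`, `natural_key`, `row_start_date`, `row_end_date`, `is_current`) and the `prior_` prefix are assumptions made for the example, not names taken from the slides.

```python
from datetime import date

def apply_type1(rows, natural_key, attr, new_value):
    """Type 1: overwrite the attribute in place; the old value is lost."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row[attr] = new_value

def apply_type2(rows, natural_key, attr, new_value, change_date):
    """Type 2: expire the current row and append a new row with a new surrogate key."""
    current = next(r for r in rows
                   if r["natural_key"] == natural_key and r["is_current"])
    current["is_current"] = False
    current["row_end_date"] = change_date
    new_row = dict(current)  # copy the unchanged attributes
    new_row.update({
        "surrogate_key": max(r["surrogate_key"] for r in rows) + 1,
        attr: new_value,
        "row_start_date": change_date,
        "row_end_date": None,
        "is_current": True,
    })
    rows.append(new_row)
    return new_row["surrogate_key"]

def apply_type3(rows, natural_key, attr, new_value):
    """Type 3: keep the old value in a separate 'prior_<attr>' attribute."""
    for row in rows:
        if row["natural_key"] == natural_key:
            row["prior_" + attr] = row[attr]
            row[attr] = new_value
```

With type 2, facts loaded before the change keep pointing at the old surrogate key, so history stays correctly partitioned.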
7/35
Mini-dimensions for Large or Rapidly Changing Dimensions
✓ The policyholder dimension qualifies as a large dimension, with more than 1 million rows.
✓ It is often important to accurately track content values for a subset of attributes.
✓ One approach is to split the closely monitored, more rapidly changing attributes into one or more type 4 mini-dimensions, each linked directly to the fact table with a separate surrogate key.
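A rough sketch of the mini-dimension idea follows. It assumes the volatile attributes are an age band and an income band; those attributes, the band boundaries, and all names here are hypothetical choices for illustration.

```python
from itertools import product

AGE_BANDS = ["<30", "30-49", "50+"]
INCOME_BANDS = ["low", "medium", "high"]

# One mini-dimension row per combination of band values, built up front.
# The surrogate key is just the position of the combination (1-based).
MINI_DIM = {combo: key
            for key, combo in enumerate(product(AGE_BANDS, INCOME_BANDS), start=1)}

def age_band(age):
    if age < 30:
        return "<30"
    if age < 50:
        return "30-49"
    return "50+"

def income_band(income):
    # Thresholds are made up for the example.
    if income < 40_000:
        return "low"
    if income < 100_000:
        return "medium"
    return "high"

def demographics_key(age, income):
    """Look up the mini-dimension surrogate key for the current profile."""
    return MINI_DIM[(age_band(age), income_band(income))]

def build_fact_row(policyholder_key, age, income, amount):
    """The fact row carries both the main dimension key and the mini-dimension key."""
    return {
        "policyholder_key": policyholder_key,
        "policyholder_demographics_key": demographics_key(age, income),
        "policy_transaction_amount": amount,
    }
```

When a policyholder moves to another band, only the mini-dimension key on new fact rows changes; the large policyholder dimension itself gains no new rows.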
8/35
Policy Transaction Fact Table
Policy transaction schema
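The schema figure is not reproduced here, so the following is only a sketch of what such a fact table might look like, using Python's built-in `sqlite3` module. The table and column names are assumptions. It shows the two dates of each policy transaction as two foreign keys into one date dimension, joined twice under different aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE date_dim (date_key INTEGER PRIMARY KEY, full_date TEXT);

-- Hypothetical column names; one row per policy transaction.
CREATE TABLE policy_transaction_fact (
    transaction_date_key      INTEGER REFERENCES date_dim(date_key),
    effective_date_key        INTEGER REFERENCES date_dim(date_key),
    policyholder_key          INTEGER,
    agent_key                 INTEGER,
    coverage_key              INTEGER,
    covered_item_key          INTEGER,
    transaction_type_key      INTEGER,
    policy_number             TEXT,   -- degenerate dimension
    policy_transaction_amount REAL
);

INSERT INTO date_dim VALUES (20150105, '2015-01-05'), (20150201, '2015-02-01');
INSERT INTO policy_transaction_fact
VALUES (20150105, 20150201, 1, 1, 1, 1, 1, 'P100', 500.0);
""")

# The same date dimension plays two roles: entry date and effective date.
row = conn.execute("""
    SELECT entered.full_date, effective.full_date, f.policy_transaction_amount
    FROM policy_transaction_fact AS f
    JOIN date_dim AS entered   ON entered.date_key   = f.transaction_date_key
    JOIN date_dim AS effective ON effective.date_key = f.effective_date_key
""").fetchone()
```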
9/35
Heterogeneous Supertype and Subtype Products
✓ Insurance companies are typically involved in multiple, very different lines of business.
✓ The detailed parameters of homeowners’ coverages differ significantly from automobile coverages, and both differ substantially from personal property coverage, general liability coverage, and other types of insurance.
✓ So, generalize the initial schema.
10/35
Policy transaction schema with subtype automobile dimension tables.
11/35
Premium Periodic Snapshot
✓ When designing the premium periodic snapshot table, you should strive to reuse as many dimensions from the policy transaction table as possible.
✓ Business management wants to know how much premium revenue was written each month, as well as how much revenue was earned.
✓ There are two premium revenue metrics: written premium and earned premium.
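The difference between the two metrics can be illustrated with a small calculation. The sketch below assumes the whole premium is written in the first month and earned in equal monthly parts over the policy term; real earning rules can differ, and the function and field names are hypothetical.

```python
def premium_snapshot(written_premium, term_months, start_month=1):
    """Return one snapshot row per month of the policy term.

    Assumption: written premium is booked entirely in the first month,
    and earned premium is spread evenly across the term.
    """
    earned_per_month = written_premium / term_months
    rows = []
    for offset in range(term_months):
        rows.append({
            "month": start_month + offset,
            "written_premium": written_premium if offset == 0 else 0.0,
            "earned_premium": earned_per_month,
        })
    return rows

# A 12-month policy with a premium of 1200 paid up front.
snapshot = premium_snapshot(1200.0, 12)
```

Under these assumptions, the first monthly row shows 1200 written and 100 earned, and every later row shows 0 written and 100 earned.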
12/35
Periodic premium snapshot schema
13/35
Multiple Dimensions
Bridge table for multiple drivers on a policy
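The bridge figure is not reproduced here. As a sketch of the technique, the example below uses `sqlite3` with a bridge table that links each policy to its drivers and carries a weighting factor, so that premium can be allocated across drivers without double counting. All table names, column names, and values are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE driver_dim (driver_key INTEGER PRIMARY KEY, driver_name TEXT);

-- One row per (policy, driver) pair; weights for a policy sum to 1.
CREATE TABLE policy_driver_bridge (
    policy_key       INTEGER,
    driver_key       INTEGER,
    weighting_factor REAL,
    PRIMARY KEY (policy_key, driver_key)
);

CREATE TABLE premium_fact (policy_key INTEGER, month_key INTEGER, written_premium REAL);

INSERT INTO driver_dim VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO policy_driver_bridge VALUES (10, 1, 0.75), (10, 2, 0.25);
INSERT INTO premium_fact VALUES (10, 201501, 1000.0);
""")

# Allocate each policy's premium to its drivers through the bridge.
allocated = conn.execute("""
    SELECT d.driver_name, SUM(f.written_premium * b.weighting_factor)
    FROM premium_fact AS f
    JOIN policy_driver_bridge AS b ON b.policy_key = f.policy_key
    JOIN driver_dim AS d           ON d.driver_key = b.driver_key
    GROUP BY d.driver_name
    ORDER BY d.driver_name
""").fetchall()
```

Because the weights for a policy sum to 1, the allocated amounts add back up to the original premium.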
14/35
Policyholder reserve
Policyholder reserve /pɑ:ləsihoʊldə(r) rɪzɜ:rv/ noun.
With respect to an insurance company, an amount representing the estimated payments to policyholders (as determined by actuaries) based on the types and terms of the various insurance policies issued by the company.
15/35
More Insurance Case Study Background
✓ Before the insurance company pays any claim, there is usually an investigative phase.
✓ The company examines the covered item and interviews the claimant, policyholder, or other individuals involved.
✓ After the investigative phase, the insurance company issues a number of payments.
16/35
Updated Insurance Bus Matrix
Updated insurance bus matrix
17/35
Claim Transactions
✓ The operational claim processing system generates a slew of transactions, including the following transaction task types:
  ✓ Open claim, reopen claim, close claim
  ✓ Set reserve, reset reserve, close reserve
  ✓ Adjuster inspection, adjuster interview
  ✓ Open lawsuit, close lawsuit
  ✓ Make payment, receive payment
18/35
Detailed implementation bus matrix
19/35
Claim transaction schema
20/35
Claim Accumulating Snapshot
✓ Even with a robust transaction schema, there is a whole class of urgent business questions that can’t be answered using only transaction detail.
✓ Time lags can be delivered as the raw difference between two dates.
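A lag of this kind is simple date arithmetic. The sketch below assumes hypothetical milestone names on a claim row and leaves a lag empty while its milestone has not yet been reached.

```python
from datetime import date

def lag_days(start, end):
    """Raw difference in days between two milestone dates, or None if either is missing."""
    if start is None or end is None:
        return None
    return (end - start).days

# One accumulating snapshot row for a claim that is still open.
claim = {
    "claim_open_date": date(2015, 3, 1),
    "first_payment_date": date(2015, 3, 15),
    "claim_close_date": None,  # milestone not reached yet
}

lags = {
    "open_to_first_payment_lag": lag_days(claim["claim_open_date"],
                                          claim["first_payment_date"]),
    "open_to_close_lag": lag_days(claim["claim_open_date"],
                                  claim["claim_close_date"]),
}
```

When the claim later closes, the same row is updated with the close date and the remaining lag is filled in.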
21/35
Claim Accumulating Snapshot Schema
22/35
Policy/Claim Consolidated Periodic Snapshot
Policy/claim consolidated fact table
23/35
Common Dimensional Modeling Mistakes to Avoid
24/35
10. Place Text Attributes in a Fact Table
✓ The descriptive textual attributes comprising the context of the measurements go in dimension tables.
✓ You need to get text attributes off the main runway of the data warehouse and into dimension tables.
25/35
9. Limit Verbose Descriptors to Save Space
✓ Our job as designers of easy-to-use dimensional models is to supply as much verbose descriptive context in each dimension as possible.
✓ Make sure every code is augmented with readable descriptive text.
26/35
8. Split Hierarchies into Multiple Dimensions
✓ A hierarchy is a cascaded series of many-to-one relationships. Business users understand hierarchies.
✓ Our job is to present the hierarchies in the most natural and efficient manner in the eyes of the users, not in the eyes of a data modeler.
27/35
7. Ignore the Need to Track Dimension Changes
✓ Business users often want to understand the impact of changes on at least a subset of the dimension tables’ attributes.
✓ Likewise, if a group of attributes changes rapidly, you can split a dimension to capture the more volatile attributes in a mini-dimension.
28/35
6. Solve All Performance Problems with More Hardware
✓ Aggregates, or derived summary tables, are a cost-effective way to improve query performance.
✓ Adding hardware should be only one part of a balanced program that also includes choosing query-efficient DBMS software and increasing real memory size.
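As a sketch of what a derived summary table looks like, the example below builds one with `sqlite3` by rolling a premium fact table up to month and coverage. The table names, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Atomic fact table: one row per policy, coverage, and month.
CREATE TABLE premium_fact (
    month_key INTEGER, coverage_key INTEGER, policy_key INTEGER, written_premium REAL
);
INSERT INTO premium_fact VALUES
    (201501, 1, 10, 100.0),
    (201501, 1, 11, 150.0),
    (201501, 2, 12,  80.0),
    (201502, 1, 10, 100.0);

-- Aggregate table: the policy dimension is dropped, so there are fewer rows to scan.
CREATE TABLE premium_by_month_coverage_agg AS
SELECT month_key, coverage_key,
       SUM(written_premium) AS written_premium,
       COUNT(*)             AS fact_row_count
FROM premium_fact
GROUP BY month_key, coverage_key;
""")

agg_rows = conn.execute("""
    SELECT month_key, coverage_key, written_premium
    FROM premium_by_month_coverage_agg
    ORDER BY month_key, coverage_key
""").fetchall()
```

Queries that only need month and coverage can read the smaller table and get the same totals as the atomic data.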
29/35
5. Use Operational Keys to Join Dimensions and Facts
✓ Novice designers are sometimes too literal minded when designing dimension keys.
✓ Operational or intelligent keys should be replaced with simple integer surrogate keys.
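A minimal sketch of surrogate key assignment follows. The class name and the sample operational keys are made up for the example; in practice key assignment usually happens in the ETL process.

```python
class SurrogateKeyMap:
    """Assigns a simple sequential integer to each operational key it sees."""

    def __init__(self):
        self._keys = {}

    def key_for(self, operational_key):
        # Reuse the existing surrogate key, or hand out the next integer.
        if operational_key not in self._keys:
            self._keys[operational_key] = len(self._keys) + 1
        return self._keys[operational_key]

agent_keys = SurrogateKeyMap()
first = agent_keys.key_for("AGT-0042-NY")   # hypothetical operational key
second = agent_keys.key_for("AGT-0077-CA")
again = agent_keys.key_for("AGT-0042-NY")   # same operational key, same surrogate key
```

The fact table then stores only the integers, and the operational key stays in the dimension as an ordinary attribute.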
30/35
4. Neglect to Declare and Comply with the Fact Grain
✓ All dimensional designs should begin by articulating the business process that generates the numeric performance measurements.
✓ The exact granularity of that data must be specified.
✓ Staying true to the grain is a crucial step in the design of a dimensional model.
31/35
3. Use a Report to Design the Dimensional Model
✓ A dimensional model has nothing to do with an intended report.
✓ The team should have focused on the measurement processes. The user’s requirements could have been handled with a well-designed schema for the atomic data.
32/35
2. Expect Users to Query Normalized Atomic Data
✓ Data that has been aggregated in any way has been deprived of some of its dimensions.
✓ Do not build a dimensional model with aggregated data and expect users and their BI tools to seamlessly drill down to third normal form data for the atomic details.
33/35
1. Fail to Conform Facts and Dimensions
✓ The single most important design technique in the dimensional modeling arsenal is conforming dimensions.
✓ Conformed dimensions allow teams to be more agile because they’re not re-creating the wheel repeatedly.
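One practical payoff of conformed dimensions is that results from separate business processes can be combined on the shared dimension values. The sketch below assumes two already-summarized result sets keyed by the same coverage name; the function name, measure names, and data are hypothetical.

```python
def drill_across(*named_results):
    """Merge per-process results that are keyed by the same conformed dimension value."""
    merged = {}
    for measure_name, result in named_results:
        for dim_value, measure in result.items():
            merged.setdefault(dim_value, {})[measure_name] = measure
    return merged

# Each result comes from a different fact table but uses the same coverage labels.
premium_by_coverage = {"Fire": 1000.0, "Theft": 400.0}
claims_by_coverage = {"Fire": 300.0}

report = drill_across(
    ("earned_premium", premium_by_coverage),
    ("claim_payments", claims_by_coverage),
)
```

The merge only works because both processes label coverages identically; without conformed dimensions the rows would not line up.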
34/35
35/35