The Warranty Data Lake: How Big Data and Data Science Can Drive Program Improvements
March 2016
Richard Vermillion, CEO of After, Inc. and Fulcrum Analytics, Inc.
Agenda
Who is After, Inc.?
Just What Is a “Data Lake”?
Why a Warranty Data Lake?
Warranty Data Science
• Customer Data
• Claims Data
• Spatial Data
• Telematics
• Parts/Supplier/Lot Data
• Shipment, Engineering, Dealer Data
What does it all mean?
Who is After, Inc.?
Founded in 2005 as a division of Fulcrum Analytics, we leverage our dual expertise in data analytics and consumer marketing to help companies establish world-class warranty businesses. We help manage program risk and optimize sales and marketing programs from start to finish, absorbing almost all of the management and technology burden. As your partner, we are singularly committed to creating the greatest possible value for you and your customers.
What is a Data Lake?
TRADITIONAL REPOSITORIES
Large, well-organized data warehouses
That deliver some data to retail data marts
DATA LAKES
Much larger, more loosely organized lakes
That you can sample at will
What is a Data Lake?
A “Data Lake” is a large storage repository and processing engine
Data Lakes are built on a “Big Data” platform like the Hadoop stack
• Volume
• Velocity
• Variety
Focus more on the variety of data they hold than on just volume or velocity:
• Flat files
• Relational
• Hierarchical
• Unstructured, free-form text
• Graph/network
• Real-time
Also distinguished by how they are built and loaded
• Focus on agility
• Iterative development to meet changing business needs
• Support innovation
• Destroy silos
Not an agile process
Data Warehouse or Mart
INNOVATION KILLER
Generate Business & Technical Requirements
Design Conformed Data Model
Develop ETL for Each Data Source
Build Reports & Connect BI Tools
Too Limited
• Drop fields that don’t fit
• Aggregate away detail
• Only answer questions you anticipated
Too Slow
• Waste time fully understanding each data source
• Business requirements change by the time it’s built
Too Expensive
• Costly commercial hardware/software
• Especially if you aren’t sure each data source is valuable
Supports an agile process
Data Lake
INNOVATION ENABLER
Load Data As-Is
Conform and Model What You Need
Iterate ELT for Each Data Source
Access All Data Types with One Toolset
Not Limited
• Keep all the fields in their original format
• Keep the detail data
• Pose and answer questions you never anticipated
Not Slow
• Easy to get started, just copy raw data in
• Incremental changes as business requirements change
Not Expensive
• Cheap storage, open source tools
• Defer hard work until you have a use for the data
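The load-as-is, model-later flow on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not After, Inc.'s implementation: the record layout, the field names (claim_id, amount, notes, sensor) and the helper names load_as_is and project are hypothetical, and an in-memory list stands in for the lake's raw storage.

```python
import json

# Hypothetical raw claim records, kept exactly as received (one JSON document per line).
# The records do not share a schema, and nothing is dropped at load time.
RAW_LINES = [
    '{"claim_id": "C1", "amount": 120.5, "notes": "belt slipped", "labor_hours": 1.5}',
    '{"claim_id": "C2", "amount": 80.0, "dealer": "D7"}',
    '{"claim_id": "C3", "amount": 310.0, "notes": "no fault found", "sensor": {"temp_c": 41}}',
]


def load_as_is(lines):
    """Load step: parse each record but keep every field in its original form."""
    return [json.loads(line) for line in lines]


def project(records, fields):
    """Transform step, done later and on demand: keep only the fields one analysis needs."""
    return [{field: rec.get(field) for field in fields} for rec in records]


lake = load_as_is(RAW_LINES)

# A question anticipated up front: total claim amount.
total_amount = sum(row["amount"] for row in project(lake, ["amount"]))

# A question nobody anticipated at load time: which claims carry free-text notes?
claims_with_notes = [
    row["claim_id"] for row in project(lake, ["claim_id", "notes"]) if row["notes"]
]
```

Because the nested sensor reading in the third record was never flattened or discarded, a later analysis can still reach it, which is the point of deferring the modeling work.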
Build to optimize warranty, extend to impact other areas
Why a Warranty Data Lake?
Warranty Chain is unusual, if not unique, in how many areas of the business it touches:
• Customer: sales, usage, service, satisfaction
• Engineering: quality, reliability, design
• Service: repairs, claims
• Distribution: dealers, retailers
• Finance: insurance, reserves, credit
Spans front-office and back-office
Many relevant, but siloed, data sets
Large opportunities for analytical improvements
Large (usage) data sets on the horizon
By collecting multiple types of data, we can reduce reliance on lagging indicators (e.g., claims paid) and also identify or develop new leading indicators for program performance.
Why a Warranty Data Lake?
Key analytical questions a warranty data lake can help answer:
• Claim reserve and loss modeling
• Early warning for losses
• Claim & service triage optimization, “no fault found” minimization
• Fraud and/or service anomaly detection
• Customer satisfaction warning signs
• Sales & marketing effectiveness
• Field usage and relation to service events
• Recall cost estimation
Unstructured Data & Text
Claim notes include more details than codes… but less structure
Codes impose structure and simplify analysis • But only for problems you anticipated and created codes for
Text mining and natural language processing can unlock the secrets in claim and repair notes
Visualization techniques such as word trees and clouds enable free text exploration
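As a rough illustration of the first step behind a word cloud, the sketch below counts term frequencies in free-text claim notes. The notes are made up for the example, and the stop-word list and the term_frequencies helper are hypothetical; a real text-mining pipeline would add stemming, phrase detection and a much larger stop-word list.

```python
import re
from collections import Counter

# Made-up claim notes, used only to illustrate the counting step.
CLAIM_NOTES = [
    "Customer reports belt slipping under load. Replaced belt.",
    "Belt frayed; pulley bearing noisy. Replaced belt and pulley.",
    "No fault found. Unit tested OK.",
]

# A small, hypothetical stop-word list.
STOP_WORDS = {"and", "no", "ok", "under", "the", "a"}


def term_frequencies(notes, stop_words=STOP_WORDS):
    """Lower-case each note, split it into alphabetic tokens, and count non-stop words."""
    counts = Counter()
    for note in notes:
        for token in re.findall(r"[a-z]+", note.lower()):
            if token not in stop_words:
                counts[token] += 1
    return counts


freqs = term_frequencies(CLAIM_NOTES)
top_terms = freqs.most_common(3)  # the terms a word cloud would draw largest
```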
Unstructured Data & Text
Tweets, product reviews and other user-generated content can also be collected and analyzed
Sentiment analysis, topic modeling, keyword & semantic analysis
Bayesian classifiers (like your spam filter) can be trained to spot quality complaints and failure reports
Possible early warning of service problems
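To make the spam-filter analogy concrete, here is a minimal naive Bayes text classifier written from scratch. It is a sketch under stated assumptions: the training sentences and the two labels ("failure" and "other") are invented for illustration, and the train and classify helpers are hypothetical names. A production system would use far more training data and an established library.

```python
import math
from collections import Counter

# Made-up training examples with hypothetical labels.
TRAINING = [
    ("motor stopped working after two weeks", "failure"),
    ("screen cracked and unit will not power on", "failure"),
    ("battery died and stopped charging", "failure"),
    ("great product fast shipping", "other"),
    ("love the color and the price", "other"),
    ("works great easy to set up", "other"),
]


def train(examples):
    """Count words per label and documents per label."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in examples:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(text.lower().split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, doc_counts, vocab


def classify(text, model):
    """Return the label with the highest log posterior, using add-one smoothing."""
    word_counts, doc_counts, vocab = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TRAINING)
label_a = classify("unit stopped working", model)
label_b = classify("great price", model)
```

With this toy data the first text is labeled as a failure report and the second is not; tracking the share of incoming reviews flagged this way over time is one possible form of the early-warning signal mentioned above.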
Spatial Data
Spatial data (precipitation, temperature, crop production, land use, etc.) can be a proxy for usage
For example, snow accumulation data is spatially correlated with claims
Can be predictive of frequency
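One simple way to check whether a spatial variable tracks claim frequency is a correlation coefficient computed across regions. The sketch below uses Pearson's r; the regional snow and claim figures are invented purely for illustration, and the variable and function names are hypothetical.

```python
import math

# Made-up illustrative values, one entry per hypothetical region.
snow_inches = [0.0, 5.0, 12.0, 20.0, 35.0]
claims_per_1000_units = [2.0, 3.0, 5.0, 6.0, 9.0]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


r = pearson(snow_inches, claims_per_1000_units)
```

A value of r close to 1 in real data would support using snow accumulation as a usage proxy; correlation alone does not establish that it is predictive, which would need out-of-sample testing.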
Usage Data & Telematics
The real ‘big data’ of warranty (high volume, high velocity)
Usage data, telematics, “Internet of Things”, sensor data
Real-time streaming or collected at service event
Tracking Device (GPS + Altimeter + Accelerometer) with post-processing can identify:
• Mileage and usage time
• Hard driving and hard braking incidents
• Inefficient shifting
• Idle time
• Geography and location of usage
• Altitude changes
More accurately map usage cycles to calendar time (relevant for warranty)
Segment light vs. heavy usage and model differing expected losses
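The post-processing step can be illustrated with a short sketch that derives hard-braking incidents and distance travelled from a speed trace. Everything specific here is an assumption made for the example: the 3.0 m/s² threshold, the one-second sampling, the made-up samples and the helper names. A real device would combine GPS, altimeter and accelerometer signals rather than speed alone.

```python
# Hypothetical threshold: deceleration above 3.0 m/s^2 counts as hard braking.
HARD_BRAKE_MS2 = 3.0

# Made-up samples: (seconds since start, speed in metres per second).
samples = [(0, 20.0), (1, 20.0), (2, 19.0), (3, 15.0), (4, 11.0), (5, 11.0), (6, 10.5)]


def hard_braking_events(samples, threshold=HARD_BRAKE_MS2):
    """Return timestamps where deceleration between consecutive samples exceeds the threshold."""
    events = []
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        decel = (v0 - v1) / (t1 - t0)
        if decel > threshold:
            events.append(t1)
    return events


def distance_travelled(samples):
    """Trapezoidal estimate of distance in metres from the speed trace."""
    return sum(
        (v0 + v1) / 2 * (t1 - t0)
        for (t0, v0), (t1, v1) in zip(samples, samples[1:])
    )


events = hard_braking_events(samples)
distance_m = distance_travelled(samples)
```

Per-unit totals like these (distance, braking incidents per kilometre) are the kind of features that could be used to separate light from heavy usage before modeling expected losses.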
Graph Data
Graph data models the world as nodes connected by edges
Developed for web search (pages connected by hyperlinks) and social network analysis (users connected by friendship)
Can capture the complex relationships between parts, assemblies and sub-assemblies
Store part explosions and proximity natively as directed graphs
Capture failure sets of parts serviced together (causal or not)
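A part explosion stored as a directed graph, and failure sets of parts serviced together, can both be sketched with plain Python dictionaries. The bill of materials, the service events and the helper names explode and co_service_counts are all hypothetical; a dedicated graph database or library would be used at scale.

```python
from collections import Counter
from itertools import combinations

# Hypothetical bill of materials as a directed graph: assembly -> direct components.
PART_GRAPH = {
    "mower": ["engine", "deck"],
    "engine": ["carburetor", "spark_plug"],
    "deck": ["blade", "belt"],
    "carburetor": ["float", "gasket"],
}


def explode(part, graph):
    """Return every part reachable from `part` (its full explosion), in depth-first order."""
    found, seen = [], set()
    stack = list(reversed(graph.get(part, [])))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        found.append(current)
        stack.extend(reversed(graph.get(current, [])))
    return found


# Made-up service events: the set of parts replaced together in each repair.
SERVICE_EVENTS = [
    {"belt", "blade"},
    {"belt", "blade", "spark_plug"},
    {"gasket", "float"},
    {"belt", "spark_plug"},
]


def co_service_counts(events):
    """Count how often each pair of parts is serviced in the same event."""
    counts = Counter()
    for parts in events:
        for pair in combinations(sorted(parts), 2):
            counts[pair] += 1
    return counts


engine_parts = explode("engine", PART_GRAPH)
pair_counts = co_service_counts(SERVICE_EVENTS)
```

The pair counts form a second, undirected graph over the same parts. A high count records only that two parts were serviced together, whether or not one failure caused the other.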
Customer and Marketing Data
Product registration data
Demographic & third-party data
Marketing event data
Promotion participation
Direct mail participation and response
Email sends, opens, click-through
Customer service calls and dispositions
Clickstream data
Mobile app engagement
Consumer research and survey responses
Price and messaging test results
Other Useful Data
Claim Workflow Details
Unit Shipment Data
Inventory Data
Dealer Information (locations, ownership, firmographics, etc.)
Retail Unit Sales, Pricing and Discounting
Damaged Product Images (especially for ADH claims)
ESP Terms & Conditions (often unstructured).
What does it all mean?
A Data Lake provides a repository and processing engine to gather all relevant data for analysis
Handling variety is the key to breaking down data silos
A Warranty Data Lake can bring together front-office and back-office data to optimize program performance and improve loss projections
No one is going to build a warranty data lake with every data source discussed, but at After, we see companies beginning to build – and incrementally add to – starter data lakes
Create the corporate data asset to drive 21st century program improvement.
Richard Vermillion, CEO After, Inc. + Fulcrum Analytics, Inc.
https://www.linkedin.com/in/rvermillion
www.afterinc.com
800.374.4728 212.651.7000
70 West 40th Street, 10th Floor
New York, NY 10018