BigData in Banking
Challenges and Solutions
Arshavsky Andzhey, Director, Big Data dept, SberBank
Avarshavskysbtsberbankruandzheymaccom
2015
3
Innovations as killers: stages in the destruction of the standard banking system
① Internet & social networks
Control and choice
② Screens and smartphones
Any place, any time
③ Mobile wallet
Moving away from cash and plastic cards
④ Accounts without banks
No bank accounts
⑤ BigData
Cross-system personalization and targeting
Brett King, Bank 3.0
4
BIGDATA as the development of approaches to the use of data
Information as a competitive differentiator
Information as innovation enablement
Information as a strategic asset
Information for business analysis
Data for business
"Day-by-day operations"
"Data warehousing"
The value of information for business
BIGDATA
"Information in business context"
"Business innovations based on information"
"Adaptive business strategy"
Flexible information infrastructure
Continuous business innovations
Information in the management context
Change management
Information usage methods maturity
+ INTERNET AND OPEN DATA
BIGDATA in Banking
5
Information challenges in large banks (XL)
Data is the most valuable asset in all XL banks
Few know how to apply data to solve even today's challenges
Few know how to leverage internet, external, or open data sources to understand clients better and attract new customers
6
The key challenge with data analysis
By developing a Big Data infrastructure that solves the challenges of data pre-processing and attribution through an intelligent data processing framework, the company will be able to optimize labor costs, reducing the data preparation work needed to develop business applications by up to 70%.
Gartner estimates that 70% of the time spent on analytical projects goes to collecting, cleaning, and integrating data, mainly due to the following problems:
• Data is hard to locate because it is scattered across disparate business applications and business systems
• To be suitable for analysis, data requires re-engineering and reformatting
• Acquiring data for analysis in a specified format places a heavy burden on the teams that own the source systems
• The same data is often requested or purchased by several departments and business units, which creates additional work and chaos
• Regular data exchange processes need to be set up
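The reformatting problem can be illustrated with a minimal Python sketch. It assumes two hypothetical source systems (called crm and cards here) that hold the same facts under different field names, date layouts, and number formats. All names, fields, and sample values are illustrative assumptions, not taken from the presentation.

```python
from datetime import datetime

# Hypothetical raw records from two source systems with different layouts.
CRM_ROWS = [{"client_id": "17", "opened": "2015-03-01", "amount": "1 250,50"}]
CARD_ROWS = [{"CustomerNo": 17, "OpenDate": "01.03.2015", "Sum": 99.0}]

# Per-source mapping from the common field name to the source field name.
FIELD_MAPS = {
    "crm": {"client_id": "client_id", "date": "opened", "amount": "amount"},
    "cards": {"client_id": "CustomerNo", "date": "OpenDate", "amount": "Sum"},
}


def parse_amount(value):
    """Accept numbers or strings such as '1 250,50' and return a float."""
    if isinstance(value, (int, float)):
        return float(value)
    return float(value.replace(" ", "").replace(",", "."))


def parse_date(value):
    """Try each known date layout and return an ISO date string."""
    for layout in ("%Y-%m-%d", "%d.%m.%Y"):
        try:
            return datetime.strptime(value, layout).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {value!r}")


def normalize(row, source):
    """Convert one source row into the common record layout."""
    fields = FIELD_MAPS[source]
    return {
        "client_id": int(row[fields["client_id"]]),
        "date": parse_date(row[fields["date"]]),
        "amount": parse_amount(row[fields["amount"]]),
        "source": source,
    }


def build_shared_dataset():
    """Normalize every row once so all departments can reuse the result."""
    rows = [normalize(r, "crm") for r in CRM_ROWS]
    rows += [normalize(r, "cards") for r in CARD_ROWS]
    return rows
```

Doing this conversion once, in a shared layer, is what spares each department from requesting and reformatting the same data separately.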
7
Data and analytics tools as a shared resource
Client, Product, Transactions, Location, …, Instruments
RISKS Dept
RETAIL Dept
OPERATIONS Dept
SECURITY Dept
CORPORATE CLIENTS Dept
HR
BIGDATA is less about data size and more about the opportunity to work with many different data types, formats, and applications using powerful analytic capabilities
8
Sources of business growth and execution excellence
Client
ACQUISITION
RETENTION
SALES
PRIMARY
SECONDARY
LOANS
RISKS
DEBTS
ANTI-FRAUD
INTERNAL
EXTERNAL
HR
PROCESS OPTIMIZATION
① ② ③ ④
9
Data Factory concept
The Big Data Factory should enable data processing in a uniform manner for all platforms, functions, and customers. The goal is to build an easily changeable, easy-to-use data processing operating model with the required level of trust for both traditional and not-so-traditional data sources.
Tasks
• Information delivery
• Information integration (Cleaning, Transformation, Mapping, Improvement)
• Information search
• Access to information
• Hypothesis testing
• Model training and information analysis
• Backup, Cleanup, Restore
• Administration
Information trust
• Lifecycle management
• Data quality
• Reference data
• Record linkage and the resolution of contradictions
• Classification
• Reporting
Traditional and not-so-traditional data sources
• Internet data
• Data virtualization
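Record linkage and the resolution of contradictions can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: records are linked on an exact match of normalized name plus birth date, and conflicts are resolved by keeping the most recently updated non-empty value. The field names (name, birth_date, updated) and both rules are hypothetical. The slides do not describe the matching logic actually used.

```python
from collections import defaultdict


def link_key(record):
    """Build a matching key from a normalized name and the birth date."""
    name = " ".join(record["name"].lower().split())
    return (name, record["birth_date"])


def resolve(records):
    """Merge linked records; for conflicting fields the newest value wins."""
    merged = {}
    # ISO date strings sort chronologically, so later records overwrite earlier ones.
    for rec in sorted(records, key=lambda r: r["updated"]):
        for field, value in rec.items():
            if value not in (None, ""):
                merged[field] = value
    return merged


def link_records(records):
    """Group records that share a key and return one merged record per group."""
    groups = defaultdict(list)
    for rec in records:
        groups[link_key(rec)].append(rec)
    return [resolve(group) for group in groups.values()]
```

Exact-key matching is the simplest possible choice. Real client data would usually need fuzzy matching and more careful survivorship rules.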
10
Big Data Competence Center (ЦК Супермассивов данных)
BIGDATA PLATFORM HIGH-LEVEL CONCEPT
11
Data Factory scenarios
Subject-matter experts across the bank's business areas need access to the organization's data for research, sampling, annotation, and modelling
Data scientists work on new models
Marketing is looking for data for new campaigns
Security services are looking for data to drill into a suspicious transaction
The retail unit wants to make the best offer to the client
…
Daily activity
The need for ad hoc access to diverse data
Support for analysis and decision making
Using subject-matter experts' terminology when accessing data
Providing the same easy access to data as in spreadsheets, with the ability to scale to huge volumes and a huge variety of information types, while protecting sensitive information and optimizing IT storage systems
Data 2 profit process
TASK FORMALIZATION
DATA PREPARATION
DATA EXPLORATION
ADDITIONAL INDICATORS
ANALYTICS & MODELING
MODEL VALIDATION
MODEL PRODUCTIZATION
EFFICIENCY MONITORING
12
13
Possible architecture
HDFS raw data
Data exchange
Data preparation, processing, and analytical layer
Analytical views
Ad-hoc analytics
Development factory
Streaming
Big Data applications
Integration
marts
API
14
BI & BIGDATA
Traditional BI
• Based on DWH
• Precision is crucial
• Flat data schema
• Long time to market
• High-end hardware
Big Data
• Based on Hadoop and Spark
• Any precision
• Complex and variable data schemas
• Ad-hoc analytics
• Short time to market
• New data sources
• Low cost
Both approaches are valid
15
BIGDATA is not expensive - OPEN SOURCE does work
• Low cost
• No vendor lock-in
• Community support
APPLICATION LAYER
Spark
Hadoop
SQL
NoSQL
DB
16
Thanks and good luck
9
Data Factory conception
Big Data Factory should enable data processing in a uniform manner for all platforms functions and customers To build easily changeable and easy to use data processing operating model with the required level of trust for both traditional and not so traditional data sources
Tasks
• Information delivery
• Information integration (cleaning, transformation, mapping, improvement)
• Information search
• Access to information
• Hypothesis study
• Model training and information analysis
• Backup, cleanup, restore
• Administration
Information trust
• Lifecycle management
• Data quality
• Reference data
• Record linkage and resolution of contradictions
• Classification
• Reporting
Traditional and not-so-traditional data sources
• Internet data
• Data virtualization
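Record linkage and the resolution of contradictions, listed above under information trust, can be sketched as follows. The matching key (normalized name plus birth date), the "most recently updated record wins" rule, and all field names are assumptions made for illustration; the slides do not specify how the platform performs linkage.

```python
from collections import defaultdict


def link_key(record):
    """Matching key: lower-cased name without whitespace, plus birth date."""
    name = "".join(record["name"].lower().split())
    return (name, record["birth_date"])


def link_and_resolve(records):
    """Group records that share a key, then keep one record per group."""
    groups = defaultdict(list)
    for record in records:
        groups[link_key(record)].append(record)

    resolved = []
    for group in groups.values():
        # Assumed contradiction rule: the most recently updated record wins.
        newest = max(group, key=lambda r: r["updated"])
        sources = sorted(r["source"] for r in group)
        resolved.append({**newest, "sources": sources})
    return resolved


# Illustrative records only; dates are ISO strings, so they sort correctly as text.
RECORDS = [
    {"source": "crm", "name": "Anna Petrova", "birth_date": "1985-03-01",
     "phone": "111", "updated": "2014-05-01"},
    {"source": "cards", "name": "anna  petrova", "birth_date": "1985-03-01",
     "phone": "222", "updated": "2015-01-15"},
    {"source": "crm", "name": "Ivan Ivanov", "birth_date": "1979-11-20",
     "phone": "333", "updated": "2013-09-09"},
]

if __name__ == "__main__":
    for client in link_and_resolve(RECORDS):
        print(client["name"], client["phone"], client["sources"])
```

Exact-key matching is the simplest possible choice; production linkage usually needs fuzzy matching and survivorship rules per attribute rather than per record.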
10
Big Data Competence Center: BIGDATA PLATFORM HIGH-LEVEL CONCEPT