許良謀, Technical Director, Asia-Pacific Storage Business, Dell
Inform (Insight):
The Aircraft Carrier That Carries Big Data
Considerations for Big Data applications
Challenges and Pain Points
• Limited IT budgets
• Competitive pressures
• Long cycle time to insights
• Limited IT resources
• New data sources: social data
• High complexity of integrating data and managing performance
• Overburdened data management systems
• Legacy proprietary systems requiring forklift upgrades
Customer Wants
• Cost-efficient data analytics platforms
• Easy-to-deploy, simple-to-use analytic solutions
• Data-agnostic solutions
• Scalable architectures
Why all the Big Data hype?
• 31.7% CAGR (7x the overall IT market). Source: IDC, Worldwide Big Data Technology and Services 2012-2016 Forecast (Jan 2013)
• 80% of the effort is cleaning up data. Source: O’Reilly Strata, “Big Data Now 2012”
• Big Data Opportunity Heat Map by Industry (Gartner, July 2012), with ratings from Very Hot through Hot, Moderate, and Low to Very Low
• Primary target markets in Financial Services, Communications, Government, and Manufacturing
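To give a feel for what a 31.7% CAGR means, the compounding can be sketched as below. This is only an illustration: it assumes four compounding years across the 2012-2016 forecast window and reads "7x" as seven times the overall IT market's growth rate. Both are interpretations of the slide, not figures stated by IDC.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Size multiple after `years` of growth at a constant compound annual rate."""
    return (1.0 + cagr) ** years


BIG_DATA_CAGR = 0.317  # from the slide (IDC forecast)
YEARS = 4              # assumption: 2012 -> 2016 is four compounding periods

big_data_multiple = growth_multiple(BIG_DATA_CAGR, YEARS)

# Assumption: "7x overall IT market" compares annual growth rates.
implied_it_cagr = BIG_DATA_CAGR / 7
it_multiple = growth_multiple(implied_it_cagr, YEARS)

print(f"Big Data market multiple over {YEARS} years: {big_data_multiple:.2f}x")
print(f"Implied overall IT growth: {implied_it_cagr:.1%} per year, "
      f"{it_multiple:.2f}x over {YEARS} years")
```

Under these assumptions the Big Data market roughly triples over the forecast window, while the overall IT market grows by about a fifth.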
...Generates new sets of questions
Why is our product more popular with teenagers?
How will my social media campaign impact my product launch?
How do I capture, analyze, and manage all of this data?
Will monsoons impact my sales in Indonesia and parts availability from my suppliers next quarter?
How do I make the connections?
How do I turn this data into operational intelligence?
Advanced Analytics, Social and Web Analytics, Live Data Feeds
Big Data Use cases
Use cases, ranging from Operational Data Processing (data pain points) to Predictive Analytics (bigger questions):
• EDW Augmentation
• ETL Offload
• Batch Processing
• Data Reservoir
• Log Processing
• Customer 360 View
• Content Optimization
• Recommendation Engine
• Network Analytics
• Fraud Detection
Big Data Conundrum
Architecture Considerations
Data Management and Storage
• Relational vs. non-relational data stores
• Virtualized on-premise servers vs. external clouds
• Uncompressed vs. compressed storage formats
• Distributed vs. consolidated data management and storage
• Column-oriented vs. row-oriented databases and storage
Data Processing
• DW appliance vs. Hadoop clusters
• Commodity vs. special-purpose hardware
• In-memory vs. disk-based processing
• Extending the traditional data warehouse vs. adopting new solutions
• Cloud-based processing solutions vs. in-house solutions
Other Implementation Considerations
• Hire new people with the required Big Data skill sets vs. retrain existing staff
• Create a sandbox to experiment with and explore Big Data vs. connect users to the production system
• Create a hybrid platform vs. migrate computation from a legacy platform to one from a new vendor
Trade-off Continuum: Scalability, Performance, Cost, Reliability, Integration, Security
The Big Data Stack: What’s at Stake?
Increasing the value of the data
• Hardware Infrastructure: Servers, Storage and Networking
• Information Management / Enterprise Search: Architectures, policies, and processes for data lifecycle management (controlled, optimized, accessed, searched)
• BI Tools: Dashboards, Reporting
• ETL and Data Integration: Bulk, Real-time
• Advanced Analytics: Predictive Modeling, Data Visualization tools
• Database Management System: OLTP, Relational, SQL, NoSQL, NewSQL, Columnar, Pig
• Data Warehouse: OLAP, Hive
Ecosystem Services
• Consulting
• Implementation
• Support
• Maintenance
• Training
• Big Data Hosted Services
Dell Storage for Big Data
Dell Storage POV for Big Data Applications
Modern Storage Architectures
Optimized for large data sets
• Manages exponential data growth through dynamic storage tiering and scale out architecture
• Makes data easily accessible for analysis and review
• Efficiencies in virtualized storage to manage structured data repositories
Efficient Storage infrastructures
Data-driven business results
• Stays ahead of storage growth while protecting investments
• Deploys common technologies across a family of storage products
Innovation-driven storage foundation that is…
• Easy to use
• Virtualized
• Intelligently tiered
• Scalable
• No forklifts
• Innovative licensing
Delivering superior value
Applying Dell storage for Big Data: Whiteboard View
Diagram labels:
• BI Tools (Dashboards, Reporting): SAP HANA
• ETL and Data Integration: Batch, Real-time
• Advanced Analytics and BI Tools
• Data Warehouse (OLAP, ERP): Oracle, SAP
• Information Management / Enterprise Search: Architectures, policies, and processes for data lifecycle management
• Aggregate
• Data Warehouse
• OLTP Data
• Unstructured Data: Social Media, Web logs, Machine Data, etc.
• Secondary Storage for information management policies, disaster recovery, data protection
• NAS for shared advanced analytic reports
• Dell NAS scalable file system
• ETL Integration
• Storage Area Network (SAN)
Optimized Solutions for MS-SQL Decision Support Systems
DSS Workload I/O Profile
I/O Characteristics
• High bandwidth: high MB/s
• Scan centric: data load and query
Storage Requirement
• Sequential reads and sequential writes
• Large block size I/O operations
• Easy-to-deploy data protection (DP) and disaster recovery (DR) options
Dell Storage Value
• 10Gb iSCSI (EqualLogic 10GbE products)
• MS-SQL Host Integration Tools
• Multiple network paths to data
• High performance: an EqualLogic 10Gb iSCSI array can perform at near line rate for sequential read I/O operations, through to disk, and performance scales at this level as more arrays are added
Dell EqualLogic MS-SQL DSS Reference Architecture
Diagram labels: Dell PowerEdge R820, Server LAN, Dell Force10 S4810, EqualLogic PS6110X, SQL DB1, SQL DB, SQL Data, SQL Log
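The DSS profile above (large sequential blocks, bandwidth bound, scaled by adding arrays) lends itself to a back-of-the-envelope sizing sketch. The 90% line-rate efficiency and the 4,000 MB/s scan target below are hypothetical placeholders, not Dell-published figures; only the 10Gb link speed comes from the slide.

```python
import math

# Theoretical payload ceiling of one 10Gb link, in decimal MB/s,
# before any iSCSI/TCP protocol overhead.
LINE_RATE_10GBE_MB_S = 10_000 / 8


def throughput_mb_s(iops: float, block_kb: float) -> float:
    """Bandwidth produced by `iops` operations of `block_kb` kilobytes each."""
    return iops * block_kb / 1000.0


def arrays_needed(target_mb_s: float, per_array_mb_s: float) -> int:
    """Number of arrays required if throughput scales linearly per array."""
    return math.ceil(target_mb_s / per_array_mb_s)


# Assumption: each array sustains 90% of line rate on sequential reads.
ASSUMED_EFFICIENCY = 0.90
per_array = LINE_RATE_10GBE_MB_S * ASSUMED_EFFICIENCY

# Hypothetical requirement: a 4,000 MB/s table-scan target.
count = arrays_needed(4000, per_array)
print(f"Per-array estimate: {per_array:.0f} MB/s, arrays needed: {count}")
```

Linear scaling is itself an assumption taken from the slide's "scale performance at this level by adding more arrays"; real deployments should be validated against measured throughput.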
Optimized Solutions for MS-SQL OLTP Applications
OLTP Workload I/O Profile
I/O Characteristics
• High Random I/O throughput. Low latency
• Seek centric
Storage Requirement
• Large number of IOs – High IOPS
• ~ 70% random reads, ~30% random writes
• Small Block Size IO operations
Dell Storage Value
• Dell EqualLogic - high IOPS, low Latency
• MS-SQL Host Integration Tools for ease of use
• Multiple network paths for high resiliency
• Snapshots and clones for data protection
• Synchronous and asynchronous replication for high availability and disaster recovery
Dell EqualLogic MS-SQL OLTP Reference Architecture
Diagram labels: Virtual Servers (VM1 and VM2, each with a SQL DB), Dell PowerEdge Servers, Server LAN, Dell Force10 S60, Dell EqualLogic Storage, SQL Data, SQL Log
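For the OLTP profile above, the roughly 70/30 random read/write mix determines how many I/Os the disks must actually service once RAID write overhead is included. The sketch below uses the commonly quoted write penalties (2 for RAID 10, 4 for RAID 5, 6 for RAID 6); the 10,000 host IOPS workload is a hypothetical example, and array caching is ignored.

```python
# Commonly quoted back-end I/Os per host write for each RAID level.
RAID_WRITE_PENALTY = {"raid10": 2, "raid5": 4, "raid6": 6}


def backend_iops(host_iops: float, read_fraction: float, raid_level: str) -> float:
    """Disk-level IOPS needed to serve `host_iops` at the given read/write mix."""
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be between 0 and 1")
    reads = host_iops * read_fraction
    writes = host_iops * (1.0 - read_fraction)
    return reads + writes * RAID_WRITE_PENALTY[raid_level]


# Hypothetical workload: 10,000 host IOPS at the slide's ~70% read mix.
for level in RAID_WRITE_PENALTY:
    print(f"{level}: {backend_iops(10_000, 0.70, level):,.0f} back-end IOPS")
```

The gap between RAID levels grows with the write share, which is one reason write-heavy OLTP volumes are often placed on RAID 10.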
Optimized Solutions for Oracle Decision Support Systems
DSS Workload I/O Profile
I/O Characteristics
• High bandwidth: high MB/s
• Scan centric: data load and query
Storage Requirement
• Sequential reads and sequential writes
• Large block size I/O operations
• Easy-to-deploy data protection (DP) and disaster recovery (DR) options
Dell Storage Value
• 10Gb iSCSI (EqualLogic 10GbE products)
• Multiple network paths to data
• High performance: an EqualLogic 10Gb iSCSI array can perform at near line rate for sequential read I/O operations, through to disk, and performance scales at this level as more arrays are added
Dell EqualLogic Oracle DSS Reference Architecture
Diagram labels: Oracle RAC, Dell PowerEdge R710, Server LAN, Dell PC 8024F, EqualLogic PS6110X, ORCL Data, ORCL Log
Optimized Solutions for Oracle OLTP Applications
OLTP Workload I/O Profile
I/O Characteristics
• High Random I/O throughput. Low latency
• Seek centric
Storage Requirement
• Large number of IOs – High IOPS
• ~ 70% random reads, ~30% random writes
• Small Block Size IO operations
Dell Storage Value
• Dell EqualLogic - high IOPS, low Latency
• Multiple network paths for high resiliency
• Snapshots and clones for data protection
• Synchronous and asynchronous replication for high availability and disaster recovery
Dell EqualLogic Oracle OLTP Reference Architecture
Diagram labels: Oracle RAC, Dell PowerEdge R710, Server LAN, Dell PC 8024F, Dell EqualLogic Storage, ORCL Data, ORCL Log
Dell Storage Differentiation for Big Data
Leveraging Dell Compellent Flash to improve search performance
Did you know that… Dell offers 56% lower costs versus traditional spinning disks?
Dell Compellent Flash storage solution improves database queries by 10X
Diagram labels: Servers, Networking, Storage, Server Cache, Flash Drives, 15K/10K Drives, Low Cost Disk
• Move hot data to flash with the industry’s first MLC/SLC intelligent tiering
• Improves OLTP and data-intensive app performance
• 75% lower cost than most all-flash solutions
• 90% latency reduction vs. traditional spinning disks
Performance Optimized
Improve data warehouse performance: Dell Compellent Data Progression
Did you know that… Dell tiering significantly improves performance
• Reduced Tier 1 storage by 70 percent by moving infrequently accessed data to lower storage tiers
• Instantly respond to performance and demand spikes
• Keep mission-critical apps always on
• Keep information always available for faster decisions
• Manage DB growth within a budget
Performance, Scale, Intelligent tiering
High Performance
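The idea of keeping hot data on fast media and moving infrequently accessed data to lower tiers can be illustrated with a toy placement policy. This is not how Compellent Data Progression is implemented; the `Page` type, the access counter, and both thresholds are hypothetical, and the tier names are simply borrowed from labels used in the deck.

```python
from dataclasses import dataclass

# Fastest to cheapest; names borrowed from the deck's tier labels.
TIERS = ("flash", "15k_10k_drives", "low_cost_disk")


@dataclass
class Page:
    page_id: int
    reads_last_cycle: int  # hypothetical access counter per evaluation cycle


def assign_tier(page: Page, hot_threshold: int = 100, warm_threshold: int = 10) -> str:
    """Pick a tier from access frequency alone (illustrative thresholds)."""
    if page.reads_last_cycle >= hot_threshold:
        return "flash"
    if page.reads_last_cycle >= warm_threshold:
        return "15k_10k_drives"
    return "low_cost_disk"


def plan_placement(pages: list[Page]) -> dict[str, list[int]]:
    """Group page IDs by the tier each page should live on."""
    placement: dict[str, list[int]] = {tier: [] for tier in TIERS}
    for page in pages:
        placement[assign_tier(page)].append(page.page_id)
    return placement


pages = [Page(1, 500), Page(2, 50), Page(3, 0), Page(4, 120)]
print(plan_placement(pages))
```

A production tiering engine would also weigh write activity, snapshot age, and RAID level per tier; the sketch only shows the frequency-based core of the idea.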
Reducing Big Data Storage Footprint
Did you know that… Dell offers the densest solution of any major OEM vendor?
Dell storage solutions are designed to minimize power and cooling while still delivering the capacity your applications and users demand
• 336TB in 5U
• Move cold data to dense “cheap and deep” storage
• Optimizes Data Progression
• Reduces OPEX
• 2.8X more dense vs. traditional 2U, 3.5” disk drive enclosures
• 2PB storage capacity in 48RU with 2 Compellent Storage Center systems
Cost Optimized
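The density figures above can be cross-checked with simple arithmetic. The slide gives 336TB in 5U and 2PB in 48RU; the 4TB drive size and the 12-bay 2U baseline enclosure are assumptions introduced here, chosen because they are common configurations, not because the slide states them.

```python
def tb_per_rack_unit(capacity_tb: float, rack_units: int) -> float:
    """Raw capacity density in TB per rack unit."""
    return capacity_tb / rack_units


# From the slide.
dense_tb_per_u = tb_per_rack_unit(336, 5)
full_rack_tb_per_u = tb_per_rack_unit(2000, 48)  # 2PB taken as 2,000 TB

# Assumptions (not on the slide): 4TB drives, 12 drive bays per 2U baseline.
ASSUMED_DRIVE_TB = 4
ASSUMED_BASELINE_BAYS = 12
dense_drive_count = 336 / ASSUMED_DRIVE_TB
density_ratio = (dense_drive_count / 5) / (ASSUMED_BASELINE_BAYS / 2)

print(f"Dense enclosure: {dense_tb_per_u:.1f} TB/U")
print(f"Two-system configuration: {full_rack_tb_per_u:.1f} TB/U including controllers")
print(f"Drives per U vs. 2U baseline: {density_ratio:.1f}x")
```

Under these assumptions the drives-per-rack-unit ratio works out to the same 2.8X quoted on the slide, which suggests the claim compares drive slots per rack unit rather than usable capacity.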
Customer Study: Freeman Exhibitions (US)
“Rather than spending ~25 hours every month to provision storage space, we now only need a few hours with the Dell Compellent SAN. This translated to about $80K in IT overhead reduction.”
Martin Vogt, Infrastructure Engineering Manager, Freeman
Challenge
• Managing an 80TB Oracle Database
• Required a highly scalable, easy-to-manage Linux data storage solution for Oracle Database
Solution: Dell Compellent Storage Center SAN
• 62% storage volume reduction through consolidation
• 40% reduction in storage costs
Dell SecureWorks: Hadoop Use Case
Mission: 24 hours a day, 365 days a year, helping to protect the security of its customers’ assets in real time
Challenge: Collecting, processing and analyzing massive amounts of data from customer environments
Results with Hadoop
• Cost of data storage reduced to ~21 cents per gigabyte
• 80% savings over previous solution
• 6 months faster deployment
• < 1 yr. payback on entire investment
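The two cost figures above imply a rough per-gigabyte cost for the previous solution. The sketch assumes the 80% savings applies to the same per-gigabyte storage cost as the ~21 cents figure, which the slide does not state explicitly; the 100 TB capacity is a hypothetical example.

```python
def implied_previous_cost(current_cost: float, savings_fraction: float) -> float:
    """Cost before a reduction, given the cost after it and the fractional savings."""
    if not 0.0 <= savings_fraction < 1.0:
        raise ValueError("savings_fraction must be in [0, 1)")
    return current_cost / (1.0 - savings_fraction)


def storage_cost(capacity_tb: float, cost_per_gb: float) -> float:
    """Total cost for `capacity_tb` terabytes (decimal: 1 TB = 1,000 GB)."""
    return capacity_tb * 1000 * cost_per_gb


HADOOP_COST_PER_GB = 0.21  # from the slide (~21 cents per gigabyte)
SAVINGS = 0.80             # from the slide; assumed to apply per gigabyte

previous_cost_per_gb = implied_previous_cost(HADOOP_COST_PER_GB, SAVINGS)

# Hypothetical capacity used only to show the scale of the difference.
capacity_tb = 100
print(f"Implied previous cost: ${previous_cost_per_gb:.2f}/GB")
print(f"{capacity_tb} TB on Hadoop: ${storage_cost(capacity_tb, HADOOP_COST_PER_GB):,.0f}")
print(f"{capacity_tb} TB previously: ${storage_cost(capacity_tb, previous_cost_per_gb):,.0f}")
```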
Dell storage portfolio – Ready for Big Data deployment
Lower TCO
• Dell’s intelligent tiering provides best $/IOP and $/GB
• EqualLogic hybrid and Compellent SSD tiering optimize data placement across a mix of drive types
Scalable Portfolio
• Dell Storage PowerVault for simple Big Data DAS; EqualLogic and Compellent for data warehouse shared storage
Flexible Architectures
• Dell offers advanced, modern architectures that allow customers to seamlessly scale to exponential data growth
Compute, Network and Storage
• Dell offers a unique enterprise portfolio of products and services that can help reduce the risk of implementing complex Big Data projects