DESCRIPTION
Companies are increasingly dealing with large data sets and looking for ways to increase the scale and lower the cost of Big Data analysis with AWS. In this interactive session, you’ll learn how to:
* Integrate massive data volumes from any on-premises or cloud data source into AWS with Informatica’s high-performance cloud integration connectors and Vibe Secure Agent technology.
* Transform and load data into RDS, Redshift, and S3 without the need for coding.
* Automate streaming data collection into Kinesis with built-in high availability and failover features.
High Performance Big Data Loading for AWS: Deep Dive and Best Practices from Informatica
Ajay Gandhi, VP Cloud Product Marketing
Nicolas Brisoux, Sr. Cloud Platform Specialist
Roderick Clemente, Product Specialist
July 10th, 2014
Why Are Customers Adopting Cloud and AWS?
1. Cost savings through economies of scale
2. Trade capital expense for variable expense
3. Agility, speed to market & flexibility
4. Global in minutes
5. Don’t have to guess on capacity
6. Security and compliance
So, How Do You Try Redshift – Quickly & Easily?
Amazon Redshift
Use New Cloud & Traditional Data Sources
[Diagram: ERP and CRM apps, files, legacy systems and RDBMS behind the firewall, plus logs, JSON, social and SaaS apps, as data sources for Amazon Redshift]
How To Manage Integration In This New World?
[Diagram: the same sources (ERP and CRM apps, files, legacy systems, RDBMS, firewall, logs, JSON, social, SaaS apps) and Amazon Redshift, captioned "Experiment. Prototype. Repeat."]
AWS RDS Staging, Redshift DW, Infa Cloud
[Diagram: ERP and CRM apps, files, legacy systems, RDBMS, logs, JSON, social and SaaS apps, with Amazon RDS and Amazon Redshift, captioned "Experiment. Prototype. Repeat."]
Map Once. Deploy Anywhere.
[Diagram: deployment targets: on premises, Hadoop, third-party applications, cloud]
AWS EMR (Hadoop) and DynamoDB (NoSQL)
[Diagram: ERP and CRM apps, files, legacy systems, RDBMS, logs, JSON, social and SaaS apps, with Amazon RDS, Amazon Redshift, Amazon EMR and DynamoDB]
Growth Path to Hybrid Data Warehouse
[Diagram: ERP and CRM apps, files, legacy systems, RDBMS, logs, JSON, social and SaaS apps, with Amazon RDS, Amazon Redshift, Amazon EMR and DynamoDB alongside a traditional staging DB and a traditional data warehouse]
Informatica Cloud - Get it right. Go live. Grow flexibly.
• Cloud Data Integration
• Cloud Real-time Integration
• Cloud Test Data Management
• Cloud Data Quality
• Cloud Master Data Management

• Secure Development Data
• Leverage Existing Bulk Data
• Cleanse and De-dupe Data
• Consolidate and Visualize Data
• Instant Access to Actionable Data
“The Informatica Cloud Platform is the only complete solution for cloud integration and data management
that allows SaaS application administrators, architects, and developers to easily power optimal processes
connected with enterprise-ready data across cloud, on-premises, big data, social, and mobile environments.”
Hundreds of Connectors
JDBC
Technical Innovations for AWS Data Loading
• Out-of-the-box integration for S3, DynamoDB, Kinesis, Redshift and RDS available NOW!
• Agile data loading for cloud data warehousing with Redshift
  • Create target using cloud designer and multiple source objects
• High performance parallel data loading architecture
  • E.g. load data in parallel across all 32 nodes in a Redshift cluster
• Push down optimization for increased throughput
  • Push data transformations down to optimal source/target database engine
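To make the push down idea concrete, here is a minimal sketch of the same aggregation done two ways: once by pulling every row into the integration layer, and once by expressing the transformation as SQL so the database engine does the work. It is an illustration only, not Informatica's implementation. The in-memory SQLite database stands in for a source or target engine, and the `orders` table, its data, and all function names are hypothetical.

```python
import sqlite3


def make_source() -> sqlite3.Connection:
    """Create a tiny in-memory stand-in for a source database (hypothetical data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [("east", 10.0), ("west", 5.0), ("east", 2.5), ("west", 7.5)],
    )
    return conn


def totals_without_pushdown(conn: sqlite3.Connection) -> dict:
    """Pull every row out of the database and aggregate in the integration layer."""
    totals = {}
    for region, amount in conn.execute("SELECT region, amount FROM orders"):
        totals[region] = totals.get(region, 0.0) + amount
    return totals


def totals_with_pushdown(conn: sqlite3.Connection) -> dict:
    """Express the same transformation as SQL so the database engine runs it."""
    query = "SELECT region, SUM(amount) FROM orders GROUP BY region"
    return dict(conn.execute(query))
```

Both functions return the same totals. The difference is where the work happens: the second version moves one row per region across the connection instead of one row per order, which is the kind of throughput gain the slide attributes to push down optimization.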
©2013 Informatica. Proprietary and Confidential
Loading data into REDSHIFT, DYNAMODB and RDS
Informatica Cloud Architecture Overview - Redshift
[Diagram: Secure Agent inside your company or VPC, alongside Amazon Redshift, Amazon RDS, Amazon S3 and Amazon DynamoDB; callouts numbered 1 to 4]
Informatica Cloud Amazon Redshift Architecture
[Diagram: firewall, Informatica Cloud Secure Agent, metadata and mappings, Amazon S3 and Amazon Redshift, annotated with the numbered steps below]
1 Build mapping and execute job
2 Retrieve Account Data
3 Put Account Data into Flat File
4 Transfer compressed Flat File to S3
5 Initiate copy from S3
6 Load data into Amazon Redshift
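Steps 3 to 5 of this flow (write a flat file, transfer it compressed to S3, initiate the copy) can be sketched in a few lines. This is a hedged illustration of the general staged-load pattern, not the Secure Agent's actual code. The function names, the `upload` and `run_sql` callables, and the bucket, key and role values are all assumptions made for the example. The `IAM_ROLE` clause is current Redshift `COPY` syntax; a 2014 deployment would more likely have used a `CREDENTIALS` clause.

```python
import csv
import gzip
import io
from typing import Callable, Iterable, Sequence


def to_gzip_csv(rows: Iterable[Sequence[object]]) -> bytes:
    """Steps 3-4: serialize rows to a CSV flat file and gzip-compress it."""
    text = io.StringIO()
    csv.writer(text, lineterminator="\n").writerows(rows)
    return gzip.compress(text.getvalue().encode("utf-8"))


def build_copy_sql(table: str, s3_uri: str, iam_role: str) -> str:
    """Step 5: the COPY command that asks Redshift to pull the staged file.

    The arguments are interpolated directly, so this sketch assumes they
    come from trusted configuration, not from user input.
    """
    return (
        f"COPY {table} FROM '{s3_uri}' "
        f"IAM_ROLE '{iam_role}' FORMAT AS CSV GZIP;"
    )


def stage_and_copy(
    rows: Iterable[Sequence[object]],
    table: str,
    bucket: str,
    key: str,
    iam_role: str,
    upload: Callable[[str, str, bytes], None],  # hypothetical S3 uploader
    run_sql: Callable[[str], None],             # hypothetical Redshift executor
) -> str:
    """Stage the rows in S3 as a compressed flat file, then trigger the COPY."""
    payload = to_gzip_csv(rows)
    upload(bucket, key, payload)
    sql = build_copy_sql(table, f"s3://{bucket}/{key}", iam_role)
    run_sql(sql)
    return sql
```

Passing the uploader and the SQL executor in as callables keeps the sketch runnable without AWS credentials. In practice `upload` would wrap an S3 client and `run_sql` a Redshift connection. Step 6, the load itself, happens inside Redshift once it receives the `COPY` command.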
REDSHIFT and RDS DEMO!
REDSHIFT and DYNAMODB DEMO!
Loading data into KINESIS
IoT: Operational Intelligence
[Diagram: documents and files (PDF, DOC, XLS, EDI), machine, device and cloud data, and social media and web logs flowing into Amazon Kinesis on AWS]
Streaming Collection: Vibe Data Stream
• Central Monitoring Console for Deployment
• Fault Tolerant
• High Availability
• Vertical & Horizontal Scaling
• Ease of Configuration
[Diagram: Vibe Data Stream (VDS) nodes collecting from industrial systems, HVAC, IoT devices, social media and web logs, and delivering to Amazon Kinesis on AWS]
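As a rough illustration of what fault-tolerant delivery into Kinesis involves, the sketch below batches records and re-sends only the entries the service rejected. It does not describe how Vibe Data Stream works internally. The `put_records` callable is assumed to behave like the boto3 Kinesis client method of the same name (keyword arguments `StreamName` and `Records`, and a response whose `Records` list lines up with the request and carries an `ErrorCode` on failed entries). The function name, retry count and backoff values are hypothetical.

```python
import time
from typing import Callable, Dict, List

MAX_BATCH = 500  # PutRecords accepts at most 500 records per request


def send_with_retry(
    records: List[Dict],
    put_records: Callable[..., Dict],  # e.g. a boto3 Kinesis client's put_records
    stream_name: str,
    max_attempts: int = 3,
    backoff_s: float = 0.0,
) -> List[Dict]:
    """Send records in batches and retry only the rejected entries.

    Returns the records that were still unsent after max_attempts.
    """
    pending = list(records)
    for attempt in range(max_attempts):
        if not pending:
            break
        if attempt:
            # Exponential backoff before each retry round.
            time.sleep(backoff_s * 2 ** (attempt - 1))
        failed = []
        for start in range(0, len(pending), MAX_BATCH):
            batch = pending[start:start + MAX_BATCH]
            response = put_records(StreamName=stream_name, Records=batch)
            # Response entries match request entries by position;
            # a rejected entry carries an "ErrorCode" key.
            for record, result in zip(batch, response["Records"]):
                if "ErrorCode" in result:
                    failed.append(record)
        pending = failed
    return pending
```

Each record is a dict with `Data` and `PartitionKey` keys, as the Kinesis API expects. A caller would typically log or spool whatever the function returns, since those records exhausted their retries.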
KINESIS DEMO!
Try it today: community.informatica.com/solutions/vibe_data_stream_for_kinesis
Next Steps
• Visit us at Booth #107 to see more demos
• Try our 60-day free trial for Redshift
• www.informaticacloud.com/cloud-trial-for-redshift