Page 1 © Hortonworks Inc. 2011 – 2015. All Rights Reserved
Apache Ranger
Rommel Garcia
Who Am I
• Solutions Engineer @hortonworks
• Security SME Lead @hortonworks
• Author, “Virtualizing Hadoop: How to Install, Deploy, and Optimize Hadoop in a Virtualized Architecture”
5 Pillars of Security
• Authentication
• Authorization
• Audit
• Encryption
• Centralized Administration
Hadoop Security Tools
• AD/LDAP (authentication)
• Apache Knox (authentication)
• Kerberos (authentication)
• Apache Ranger (authorization, audit, KMS)
• HDFS TDE (data encryption)
• Wire Encryption (data protection)
Data Sources
Apache Ranger
• Provides centralized policy definition for authorizing access to resources
• Supported components as of v0.5
  – HDFS
  – HBase
  – Hive
  – YARN
  – Knox
  – Storm
  – Solr
  – Kafka
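To make "centralized policy definition" concrete, here is a minimal sketch of what a resource policy for HDFS might look like as a JSON document. The field names follow the shape commonly used by Ranger's public REST policy API as I understand it, so treat them as an assumption and check them against your Ranger version. The service name, policy name, path, and user are hypothetical. Nothing is sent over the network; the function only builds the payload.

```python
import json


def build_hdfs_policy(service, name, path, users, access_types, recursive=True):
    """Build a dict shaped like a Ranger resource policy for an HDFS path.

    Field names are an assumption about the policy JSON layout; verify them
    against the REST API documentation of the Ranger version in use.
    """
    return {
        "service": service,          # name of the HDFS service/repository in Ranger
        "name": name,                # policy name
        "isEnabled": True,
        "isAuditEnabled": True,      # record access decisions in the audit store
        "resources": {
            "path": {"values": [path], "isRecursive": recursive},
        },
        "policyItems": [
            {
                "users": list(users),
                "accesses": [{"type": t, "isAllowed": True} for t in access_types],
            }
        ],
    }


# Hypothetical values, for illustration only.
policy = build_hdfs_policy(
    "cluster1_hadoop", "finance-read", "/data/finance", ["alice"], ["read", "execute"]
)
print(json.dumps(policy, indent=2))
```

A payload like this would be submitted once to the policy server, and each component's agent would then pick it up, which is the point of defining policy centrally.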
Apache Ranger authZ Architecture
[Diagram: a Ranger Agent is embedded in each supported component (HDFS, HBase, Hive, YARN, Knox, Storm, Solr, Kafka). The agents talk to the Ranger Policy Server and Audit Server, which sit behind the Administration Portal and REST APIs. Supporting pieces shown: DB, Solr, HDFS, Log4j, KMS, and user/group sync from LDAP/AD.]
Sample Simplified Workflow - HDFS
1. Admin sets policies for HDFS files/folders in the Policy Manager
2. Users reach HDFS in one of three ways:
   – Data scientist runs a MapReduce job
   – Users access HDFS data through a user application
   – IT users access HDFS through the CLI
3. NameNode uses the Agent for authorization
4. NameNode provides resource access to the user/client
5. Audit logs are pushed to the audit database
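The workflow above can be sketched as a toy model: an agent holds the policies the admin defined, answers allow/deny for each request, and records an audit event for every decision. This is an illustrative simplification, not Ranger's actual agent code; the `Policy` and `Agent` classes, their fields, and the sample paths and users are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    path_prefix: str   # directory the policy covers (recursively)
    users: set         # users the policy applies to
    accesses: set      # allowed access types, e.g. {"read", "write"}


@dataclass
class Agent:
    """Toy stand-in for the agent running inside the NameNode."""
    policies: list = field(default_factory=list)    # step 1: set by the admin
    audit_log: list = field(default_factory=list)   # step 5: pushed to the audit store

    def authorize(self, user, path, access):
        # Step 3: allow only if some policy covers this path, user, and access type.
        allowed = any(
            (path == p.path_prefix
             or path.startswith(p.path_prefix.rstrip("/") + "/"))
            and user in p.users
            and access in p.accesses
            for p in self.policies
        )
        # Every decision, allowed or denied, produces an audit event.
        self.audit_log.append(
            {"user": user, "path": path, "access": access, "allowed": allowed}
        )
        return allowed


agent = Agent(policies=[Policy("/data/finance", {"alice"}, {"read"})])

ok = agent.authorize("alice", "/data/finance/q1.csv", "read")        # covered by policy
denied = agent.authorize("bob", "/data/finance/q1.csv", "read")      # user not in policy
sibling = agent.authorize("alice", "/data/finance2/q1.csv", "read")  # different directory

print(ok, denied, sibling)
print(len(agent.audit_log), "audit events recorded")
```

Matching on a directory boundary rather than a raw string prefix keeps `/data/finance2` from being covered by a policy written for `/data/finance`.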
authZ Best Practice – POSIX + Ranger
• HDFS -> POSIX -> owned by hdfs -> Ranger ACLs
• Hive -> POSIX -> owned by hive -> Ranger ACLs
• HBase -> POSIX -> owned by hbase -> Ranger ACLs
• Solr -> native -> owned by solr -> Ranger ACLs
• Kafka -> owned by kafka -> Ranger ACLs
authZ Best Practice - Ranger
• 000 (POSIX permissions on all HDFS files)
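One way to apply this lockdown is with the standard `hdfs dfs -chown` and `hdfs dfs -chmod` commands, so that the service account owns the data and POSIX grants nothing, leaving Ranger policies as the only route to access. The sketch below only builds and prints the command lines; the `/data` path is a hypothetical example, and running the commands against a real cluster is left to the operator.

```python
import shlex


def lockdown_commands(path, owner="hdfs"):
    """Return the HDFS CLI commands that hand a tree to the service account
    and strip all POSIX permission bits from it."""
    return [
        ["hdfs", "dfs", "-chown", "-R", owner, path],
        ["hdfs", "dfs", "-chmod", "-R", "000", path],
    ]


# Hypothetical path, for illustration only.
commands = lockdown_commands("/data")
for cmd in commands:
    print(shlex.join(cmd))
```

Apply a change like this only after the Ranger policies that grant the intended access are in place, otherwise legitimate users are locked out as well.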
Ranger UserSync Best Practice
• Ensure LDAPS is used to integrate with Ranger
• Create an OU ONLY for Hadoop users, for performance
• Only run usersync when necessary
  – How many users are being added, and how often
  – How many users are changing roles
  – Too much syncing can degrade LDAP performance
• Do not sync anonymously
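These practices lend themselves to a simple pre-flight check. The sketch below validates a settings dictionary against the points above: LDAPS, a dedicated OU, and a non-anonymous bind. The dictionary keys (`ldap_url`, `bind_dn`, `bind_password`, `user_search_base`) are hypothetical names chosen for this example, not Ranger usersync property names, and the hostnames and DNs are made up.

```python
from urllib.parse import urlparse


def check_usersync_settings(settings):
    """Return a list of problems found in a usersync settings dict.

    The keys used here are hypothetical; map them to the real properties
    of your Ranger usersync configuration.
    """
    problems = []
    if urlparse(settings.get("ldap_url", "")).scheme != "ldaps":
        problems.append("LDAP URL should use ldaps://")
    if not settings.get("bind_dn") or not settings.get("bind_password"):
        problems.append("bind DN and password are required; do not sync anonymously")
    if "ou=" not in settings.get("user_search_base", "").lower():
        problems.append("user search base should point at a dedicated OU")
    return problems


good = {
    "ldap_url": "ldaps://ldap.example.com:636",
    "bind_dn": "uid=ranger-sync,cn=users,dc=example,dc=com",
    "bind_password": "not-a-real-password",
    "user_search_base": "ou=hadoop,dc=example,dc=com",
}
bad = {"ldap_url": "ldap://ldap.example.com:389", "user_search_base": "dc=example,dc=com"}

print(check_usersync_settings(good))
print(check_usersync_settings(bad))
```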
Ranger Audit Locations
• HDFS
  – Long-term storage that can be used to understand user event trends and predict anomalies
• RDBMS
  – When SQL is preferred by auditors
  – MySQL, Oracle, Postgres, SQL Server
• Solr
  – Quick reporting metrics to understand user event trends
• Log4j Appenders
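As a small illustration of the "user event trends" these audit stores are meant to support, the sketch below counts denied requests per user from a list of audit records. The record layout (`user`, `resource`, `allowed`) is a hypothetical simplification, not Ranger's audit schema, and the sample events are invented for the example.

```python
from collections import Counter


def denied_by_user(events):
    """Count denied access events per user.

    Each event is a dict with hypothetical keys "user" and "allowed".
    """
    return Counter(e["user"] for e in events if not e["allowed"])


# Invented sample records, for illustration only.
events = [
    {"user": "alice", "resource": "/data/finance/q1.csv", "allowed": True},
    {"user": "bob", "resource": "/data/finance/q1.csv", "allowed": False},
    {"user": "bob", "resource": "/data/hr/salaries.csv", "allowed": False},
    {"user": "carol", "resource": "/data/hr/salaries.csv", "allowed": False},
]

counts = denied_by_user(events)
print(counts.most_common(1))
```

The same aggregation could be expressed as a SQL `GROUP BY` for an RDBMS audit store or as a facet query for Solr; which one fits depends on where the audit events are kept.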
Apache Ranger – ACLs & Audit Demo
Environment
• CentOS 6.6
• 2 VMs
• FreeIPA 2.0
• HDP 2.3
• Apache Ranger v0.5
• Kerberized 2 node cluster
Q&A
Ranger KMS + HDFS TDE
[Diagram: shown within the HDP stack (data access, data management, security, YARN, HDFS). An HDFS client reads and writes files in an encryption zone through a crypto stream (r/w with the DEK). The encryption zone carries the attributes EZ key ID and version (HDFS-6134); each encrypted file carries the attributes EDEK and IV. The NameNode and the HDFS client each reach the Key Management System (KMS, HADOOP-10433) through the KeyProvider API (HADOOP-10141). The KMS holds the EZKs and DEKs; EDEKs are what is exchanged with the NameNode.]

Acronym  Description
EZ       Encryption Zone (an HDFS directory)
EZK      Encryption Zone Key; master key associated with all files in an EZ
DEK      Data Encryption Key; unique key associated with each file. The EZ key is used to generate the DEK
EDEK     Encrypted DEK; the NameNode only has access to the encrypted DEK
IV       Initialization Vector
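The key hierarchy above is a form of envelope encryption, and a toy model may help show who holds which key. In the sketch below, `ToyKMS` keeps the EZK private, hands out an EDEK when a file is created, and decrypts that EDEK to a DEK for an authorized client, which then encrypts the file contents itself. The cipher is a deliberately simple SHA-256 keystream XOR standing in for the real cipher; it is not secure and is not what HDFS uses. All class and function names are hypothetical, and reusing one IV for both the EDEK and the file is a simplification made for this example.

```python
import hashlib
import os


def _keystream_xor(key, iv, data):
    """Toy stream cipher: XOR data with SHA-256(key || iv || counter) blocks.
    Stand-in for a real cipher. NOT secure; for illustration only."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + iv + (offset // 32).to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class ToyKMS:
    """Holds EZKs; they never leave this object."""

    def __init__(self):
        self._ezks = {}

    def create_key(self, name):
        self._ezks[name] = os.urandom(16)

    def generate_edek(self, ezk_name):
        # New DEK for a new file, returned only in encrypted form (EDEK).
        dek, iv = os.urandom(16), os.urandom(16)
        return iv, _keystream_xor(self._ezks[ezk_name], iv, dek)

    def decrypt_edek(self, ezk_name, iv, edek):
        # Called on behalf of an authorized client; returns the plain DEK.
        return _keystream_xor(self._ezks[ezk_name], iv, edek)


kms = ToyKMS()
kms.create_key("ez1-key")                       # EZK for the encryption zone

# File creation: the NameNode side stores only the EDEK and IV as file attributes.
iv, edek = kms.generate_edek("ez1-key")

# Client side: obtain the DEK from the KMS, then encrypt and decrypt locally.
dek = kms.decrypt_edek("ez1-key", iv, edek)
plaintext = b"account,balance\n1001,250.00\n"
ciphertext = _keystream_xor(dek, iv, plaintext)
recovered = _keystream_xor(dek, iv, ciphertext)

print(recovered == plaintext)
```

The design point the model preserves is that the party storing file metadata sees only the EDEK, so compromising it alone does not reveal file contents.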
Apache Ranger – KMS + TDE Demo
Exercise
• Create key for encryption zone
• Create an encryption zone
• Create file
• Load to HDFS, encrypted zone
• List encrypted file
• Print encrypted file
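The exercise can be sketched as a sequence of standard Hadoop CLI commands. The key is created first because `hdfs crypto -createZone` expects the named key to exist already, and the zone directory must be empty when the zone is created. The sketch only builds and prints the commands; the key name, zone path, and file name are hypothetical, and reading under `/.reserved/raw` typically requires HDFS superuser privileges.

```python
import shlex


def tde_demo_commands(key_name, zone, local_file):
    """Return the CLI commands for the KMS + TDE exercise, in order."""
    remote = zone.rstrip("/") + "/" + local_file.rsplit("/", 1)[-1]
    return [
        ["hadoop", "key", "create", key_name],                                   # create the EZ key
        ["hdfs", "dfs", "-mkdir", "-p", zone],                                   # empty directory for the zone
        ["hdfs", "crypto", "-createZone", "-keyName", key_name, "-path", zone],  # create the encryption zone
        ["hdfs", "dfs", "-put", local_file, zone],                               # load a file into the zone
        ["hdfs", "dfs", "-ls", zone],                                            # list the encrypted file
        ["hdfs", "dfs", "-cat", remote],                                         # transparently decrypted view
        ["hdfs", "dfs", "-cat", "/.reserved/raw" + remote],                      # raw encrypted bytes
    ]


# Hypothetical names, for illustration only.
steps = tde_demo_commands("demo-key", "/secure", "/tmp/hello.txt")
for cmd in steps:
    print(shlex.join(cmd))
```

Comparing the output of the last two commands is a quick way to confirm that the file is stored encrypted while authorized reads remain transparent.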
Thank you!
Rommel Garcia
@rommelgarcia
/in/rommelgarcia