Apache Spark: Enterprise Security for Production Deployments
蒋 逸峰 (しょう いつほう / Yifeng Jiang), Solutions Engineer, Hortonworks, @uprush
December 21, 2016

Spark Security

Page 1: Spark Security

Apache Spark: Enterprise Security for Production Deployments

蒋 逸峰 (しょう いつほう / Yifeng Jiang), Solutions Engineer, Hortonworks, @uprush, December 21, 2016

Page 2: Spark Security

© Hortonworks Inc. 2011 – 2016. All Rights Reserved

What are the security requirements?

→ Spark users should be authenticated
→ Integrate with corporate LDAP/AD
→ Allow access only to authorized users
→ Audit all access
→ Protect data both in motion and at rest
→ Make all security easy to manage
→ …

Page 3: Spark Security

Interacting with Spark

[Diagram: entry points to Spark on YARN. Zeppelin, Spark-Shell, the Spark Thrift Server, and a REST server each have their own driver; boxes labeled "Ex" run on YARN.]

Page 4: Spark Security

Context: Spark Deployment Modes

• Spark on YARN
  – Spark driver (SparkContext) in YARN AM (yarn-cluster)
  – Spark driver (SparkContext) in local (yarn-client)
• Spark Shell & Spark Thrift Server run in yarn-client only

[Diagram: YARN-Client and YARN-Cluster side by side. Each panel shows a Client, an App Master, an Executor, and the Spark Driver.]

Page 5: Spark Security

Spark on YARN

[Diagram: John Doe runs Spark Submit against a Hadoop cluster. Arrows numbered 1 to 4 connect the YARN RM, Node Manager, Spark AM, Executor, and HDFS.]

Page 6: Spark Security

DEMO

A DATA LAKE WITHOUT SECURITY

Page 7: Spark Security

Spark – Security – Four Pillars

→ Authentication
→ Authorization
→ Audit
→ Encryption

Spark leverages Kerberos on YARN

Page 8: Spark Security

Authenticate users with Kerberos/AD

[Diagram: John Doe, the KDC, and a Hadoop cluster containing the Spark AM and the NameNode (NN). Arrows are numbered 1 to 7. Flow labels:]

• kinit
• Get service ticket for Spark
• Use Spark ST, submit Spark job
• Spark gets NameNode (NN) service ticket
• YARN launches Spark executors using John Doe's identity
• Executor reads from HDFS using John Doe's delegation token

Page 9: Spark Security

Spark – Kerberos – Example

kinit -kt /etc/security/keytabs/[email protected]

spark-submit --class org.apache.spark.examples.SparkPi --master yarn-cluster --num-executors 3 --driver-memory 512m --executor-memory 512m --executor-cores 1 /usr/hdp/current/spark-client/lib/spark-examples*.jar 10

Page 10: Spark Security

Allow only authorized users access to Spark jobs

[Diagram: John Doe, the KDC, Ranger, and a YARN cluster (nodes A, B, C) on top of HDFS. Flow labels:]

• Client gets service ticket for Spark
• Use Spark ST, submit Spark job
• Get NameNode (NN) service ticket
• Executors read from HDFS
• Ranger: Can John launch this job? Can John read this file?

Page 11: Spark Security

SparkSQL: Fine-Grained Security

Page 12: Spark Security

SparkSQL Security: Current Status

→ SparkSQL: only coarse-grained access control today

[Diagram: a JDBC client connects to the Spark Thrift Server (driver), which runs in a YARN container and uses the Hive Metastore. YARN containers (DAG) run as the hive user and read HDFS /apps/hive/warehouse/…]

Page 13: Spark Security

SparkSQL Security

→ Spark Thrift Server (STS) & Spark Executors run as the Hive user to read all data
  – No authorization support in STS
  – No Ranger integration support
  – Anyone who can authenticate to STS can read ALL data
→ No identity propagation on the 2nd hop (STS to Executors): no equivalent of doAs in HS2

Page 14: Spark Security

How Hive Security Works

[Diagram: O/JDBC clients, HiveServer 2, LDAP, Ranger, the KDC, and YARN & HDFS (nodes A, B, C). Numbered flow labels:]

1. Original request w/ user ID/password
2. HS2 authenticates user/pass
3. Ranger AuthZ
4. Hive gets NameNode (NN) service ticket
5. Hive creates MR/Tez using NN ST as proxy user

Unnumbered labels:

• Use Hive ST, submit query
• Client gets query result
• Ranger syncs users/groups from LDAP

Page 15: Spark Security

DEMO

HIVE & SPARKSQL AUTHORIZATION

Page 16: Spark Security

Key Features: Spark Column Security with LLAP

→ Fine-grained column-level access control for SparkSQL.
→ Fully dynamic policies per user. Doesn't require views.
→ Use standard Ranger policies and tools to control access and masking policies.

Flow:
1. SparkSQL gets data locations known as "splits" from HiveServer and plans the query.
2. HiveServer2 authorizes access using Ranger. Per-user policies like row filtering are applied.
3. Spark gets a modified query plan based on the dynamic security policy.
4. Spark reads data from LLAP. Filtering/masking is guaranteed by the LLAP server.

[Diagram: Spark Client; HiveServer2 (Authorization); Hive Metastore (Data Locations, View Definitions); Ranger Server (Dynamic Policies); LLAP (Data Read, Filter Pushdown). Arrows are numbered 1 to 4 to match the flow.]

Page 17: Spark Security

Example: Per-User Row Filtering by Region in SparkSQL

Original Query: SELECT * from CUSTOMERS WHERE total_spend > 10000

Query Rewrites based on Dynamic Ranger Policies

Spark User 1 (West Region)
Dynamic Rewrite: SELECT * from CUSTOMERS WHERE total_spend > 10000 AND region = "west"

Spark User 2 (East Region)
Dynamic Rewrite: SELECT * from CUSTOMERS WHERE total_spend > 10000 AND region = "east"

LLAP Data Access

UserID  Region  TotalSpend
1       East    5,131
2       East    27,828
3       West    55,493
4       West    7,193
5       East    18,193

Fine-grained Security to SparkSQL

http://bit.ly/2bLghGz
http://bit.ly/2bTX7Pm
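The per-user rewrite on this slide can be imitated in a few lines of plain Python. This is only a sketch of the idea, not how Ranger, HiveServer2, or LLAP implement it: the policy dictionary, the user names `spark_user_1` and `spark_user_2`, and the function `run_query` are hypothetical names chosen for illustration. The rows are the five sample customers from the slide.

```python
# Sample rows from the slide: (user_id, region, total_spend).
CUSTOMERS = [
    (1, "East", 5131),
    (2, "East", 27828),
    (3, "West", 55493),
    (4, "West", 7193),
    (5, "East", 18193),
]

# Hypothetical per-user row-filter policy. In the deck, this lives in Ranger.
ROW_FILTER_POLICY = {
    "spark_user_1": "west",
    "spark_user_2": "east",
}


def run_query(user, min_spend=10000):
    """Apply the original predicate plus the region filter for this user.

    Mirrors: SELECT * from CUSTOMERS WHERE total_spend > 10000 AND region = <policy>
    """
    region = ROW_FILTER_POLICY[user]
    return [
        row
        for row in CUSTOMERS
        if row[2] > min_spend and row[1].lower() == region
    ]


if __name__ == "__main__":
    for user in ("spark_user_1", "spark_user_2"):
        print(user, run_query(user))
```

Both users submit the same query text; only the predicate added from the policy differs, which is why the slide notes that no per-user views are needed.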

Page 18: Spark Security

Dynamic Masking and Row Level Filtering

Country  National ID  CC No             Name         DOB        MRN         Policy ID
US       232323233    4539067047629850  John Doe     9/12/1969  8233054331  nj23j424
US       333287465    5391304868205600  Jane Doe     9/13/1969  3736885376  cadsd984
Japan    T30007873    4532488639863821  Ben Jackson  73/1975    876392473A  KK-287365

Ranger Policy Enforcement

Country  National ID  CC No              MRN   Name
US       xxxxx3233    4539 xxxxxxxxxxxx  null  John Doe
US       xxxxx7465    5391 xxxxxxxxxxxx  null  Jane Doe

Users from US customer support groups see row-filtered data for US persons, with CC and SSN as masked values and MRN nullified.

Country  National ID  Name      MRN
Japan    232323233    John Doe  8233054331

Japan Health Policy Admins view relevant columns of data unmasked but are restricted by row filtering policies to see data for Japan persons only.
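The two views on this slide combine a row filter with per-column transforms. The sketch below imitates that combination in plain Python, assuming a simple policy table of my own design: the group names, the `POLICIES` layout, and the helper functions are hypothetical and are not Ranger's API. The sketch writes the masked card number without the space the slide prints after the visible digits.

```python
# Source rows from the slide (only the columns the two views use).
PEOPLE = [
    {"country": "US", "national_id": "232323233",
     "cc_no": "4539067047629850", "name": "John Doe", "mrn": "8233054331"},
    {"country": "US", "national_id": "333287465",
     "cc_no": "5391304868205600", "name": "Jane Doe", "mrn": "3736885376"},
    {"country": "Japan", "national_id": "T30007873",
     "cc_no": "4532488639863821", "name": "Ben Jackson", "mrn": "876392473A"},
]


def show_last4(value):
    """Mask everything except the last four characters."""
    return "x" * (len(value) - 4) + value[-4:]


def show_first4(value):
    """Mask everything except the first four characters."""
    return value[:4] + "x" * (len(value) - 4)


def nullify(value):
    """Hide the value entirely."""
    return None


# Hypothetical policy table: group -> (row-filter country, {column: transform}).
# A transform of None means the column is returned unmasked.
POLICIES = {
    "us_customer_support": ("US", {
        "country": None,
        "national_id": show_last4,
        "cc_no": show_first4,
        "mrn": nullify,
        "name": None,
    }),
    "japan_health_policy_admin": ("Japan", {
        "country": None,
        "national_id": None,
        "name": None,
        "mrn": None,
    }),
}


def view_for(group):
    """Return the rows and columns a group may see, with masking applied."""
    country, columns = POLICIES[group]
    result = []
    for row in PEOPLE:
        if row["country"] != country:
            continue  # row-level filter
        result.append({
            col: (transform(row[col]) if transform else row[col])
            for col, transform in columns.items()
        })
    return result


if __name__ == "__main__":
    for masked_row in view_for("us_customer_support"):
        print(masked_row)
```

Keeping the row filter and the column transforms in one policy entry reflects the slide's point that a single policy decides both which rows a group sees and how each column appears.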

Page 19: Spark Security

THANK YOU

@uprush