BML
Practice of Baidu Large Scale Machine Learning Cloud
20+
100PB 5000
2013 Hadoop 1.3w
2015 Spark
HDFS
Matrix
IDC
Normandy
MapReduce/DAG
Parameter Server
Queue Worker
Continuous Data Stream
RDD
DCE ELF TaskManager DStream Spark
Data Warehouse OLAP BML
Paddle
Essential Learning Framework
ELF
Hadoop Spark MPI Hadoop
Spark MPI
MapReduce
ELF Parameter Server
Parameter Server
Compute Components
FS: local, hdfs, stream
Sync/Async
Multi-Iters
coordinator
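The Parameter Server and Compute Components split, together with the Sync/Async and Multi-Iters options listed above, can be sketched in a few lines. This is only an illustrative toy running in a single process, not ELF's actual API: the names `ParameterServer`, `pull`, `push`, `gradient`, and `train` are hypothetical, the model is an assumed sparse linear regression, and the async mode is simulated by letting workers update one after another.

```python
from collections import defaultdict


class ParameterServer:
    """Toy in-process parameter server backed by a hash table (hypothetical API)."""

    def __init__(self, lr):
        self.lr = lr
        self.table = defaultdict(float)  # feature key -> weight

    def pull(self, keys):
        # Return the current weights for the requested keys.
        return {k: self.table[k] for k in keys}

    def push(self, grads):
        # Apply one SGD step for every pushed key.
        for k, g in grads.items():
            self.table[k] -= self.lr * g


def gradient(weights, shard):
    """Mean squared-error gradient of a sparse linear model on one data shard."""
    grads = defaultdict(float)
    for features, label in shard:
        pred = sum(weights[k] * v for k, v in features.items())
        err = pred - label
        for k, v in features.items():
            grads[k] += err * v / len(shard)
    return grads


def train(shards, lr=0.5, iters=200, sync=True):
    """Run several iterations; each shard stands in for one compute component."""
    server = ParameterServer(lr)
    keys = {k for shard in shards for features, _ in shard for k in features}
    for _ in range(iters):
        if sync:
            # Sync: all workers read the same snapshot, gradients are averaged,
            # and a single update is applied per iteration.
            snapshot = server.pull(keys)
            total = defaultdict(float)
            for shard in shards:
                for k, g in gradient(snapshot, shard).items():
                    total[k] += g / len(shards)
            server.push(total)
        else:
            # Async (simulated sequentially): each worker pulls, computes,
            # and pushes immediately without waiting for the others.
            for shard in shards:
                server.push(gradient(server.pull(keys), shard))
    return dict(server.table)


shards = [
    [({"a": 1.0}, 2.0), ({"b": 1.0}, 3.0)],
    [({"a": 1.0, "b": 1.0}, 5.0)],
]
print(train(shards, sync=True))
print(train(shards, sync=False))
```

In a real deployment the table would be sharded across server processes and reached over RPC; the dict here only stands in for that hash table.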
Essential Learning Framework
Online SGD 200
100+ LR
LDA 500
baidu-rpc
hashtable Parameter Server
Baidu Machine Learning
2009 CTR, 20
CTR
BML
GBDT + FFM
BML:
T /
L-BFGS SGD
CTR
LTR
BML
BML
CTR
BML
Online & Offline
AB Test
AUC, NDCG
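For reference, the two evaluation metrics named above can be computed as follows. This is a minimal sketch using the common textbook definitions (pairwise AUC with ties counted as one half, and NDCG with the `2^rel - 1` gain and `log2` discount); the slides do not say which variants BML uses, so the function names and these choices are assumptions.

```python
import math


def auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def dcg(relevances, k):
    """Discounted cumulative gain over the top-k positions."""
    return sum((2 ** rel - 1) / math.log2(i + 2) for i, rel in enumerate(relevances[:k]))


def ndcg(relevances, k):
    """relevances: graded relevance labels in the order the model ranked the items."""
    ideal = dcg(sorted(relevances, reverse=True), k)
    return dcg(relevances, k) / ideal if ideal > 0 else 0.0


print(auc([1, 0, 1, 0], [0.9, 0.8, 0.3, 0.1]))
print(ndcg([1, 2, 3], k=3))
```

The pairwise AUC is quadratic in the number of examples, which is fine for illustration; a rank-based formula is the usual choice at scale.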
UI Client
User
Predict Server Cluster
Online Training
RDS/KV
Event Q
Offline Data Store
Model Training
Business Intelligence
Computing
Anti
Online Nearline Offline
CRM
Marketing
System
Mapping
Retargeting
DMP
OpenCrawler
TF-IDF
NLP
LDA/PLSA
K-means/DBSCAN
DMP
Word2Vec
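As an illustration of the TF-IDF step in the text-processing list above, here is a minimal sketch. It assumes pre-tokenized documents, term frequency normalized by document length, and an unsmoothed `log(N / df)` inverse document frequency; the function name `tf_idf` is hypothetical and the slides do not specify which weighting the DMP pipeline uses.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in counts.items()
        })
    return weights


docs = [["spark", "hadoop", "spark"], ["hadoop", "mpi"]]
for row in tf_idf(docs):
    print(row)
```

Terms that occur in every document get weight zero under this variant, which is why smoothed IDF formulas are often preferred in practice.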