kenta-oono
Introduction to Chainer
Preferred Networks
2015/9/5 LL Ring Recursive@ 1stRing
(@delta2323_)
2012.3 PFI 2014.10 PFN
Chainer
http://delta2323.github.io
NIPS 2014, ICML 2015
git clone https://github.com/pfnet/chainer.git
Chainer: http://chainer.org
Developed by PFN and PFI
First release: 2015/6/9
Latest version: 1.3.0 (2015/9/2)
Planned releases: 1.3.1 (9/16), 1.4.0 (9/30)
License: MIT (Expat)
Homepage: http://chainer.org
GitHub: https://github.com/pfnet/chainer
Twitter: @ChainerOfficial
Google Group: Chainer User Group
Contribution Guide: http://docs.chainer.org/en/stable/contribution.html
Powerful: CUDA / GPU
Flexible
Intuitive: Python
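[Figure: neural network with inputs x1..xN, hidden units h1..hH and k1..kM, and outputs y1..yM; arrows mark the Forward and Backward passes]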
QSAR
Deep Q Network*
* Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
** Deep Q-Network with Caffe: http://d.hatena.ne.jp/muupan/20141021/1413850461
*** PFI 2014: http://www.ustream.tv/recorded/53153399
Kingma, Diederik P., et al. "Semi-supervised learning with deep generative models." Advances in Neural Information Processing Systems. 2014.
Eye Scream Project http://soumith.ch/eyescream/
A Neural Algorithm of Artistic Style [Gatys+'15]
https://research.preferred.jp/2015/06/distributed-deep-reinforcement-learning/
http://rll.berkeley.edu/deeplearningrobotics/
PubChem
55
MPI
MPI
TSUBAME 824GPU(K40) MPI
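Neural Network
[Figure: network with inputs x1..xN, hidden units h1..hH and k1..kM, outputs y1..yM, and targets t1..tM; the layers are f1, f2, f3 with parameters W1/b1 and W2/b2; arrows mark the Forward and Backward passes]
W1, b1: weights and bias of layer 1; W2, b2: weights and bias of layer 2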
Forward:
$$h = f_1(x) = \mathrm{Sigmoid}(W_1 x + b_1)$$
$$k = f_2(h) = \mathrm{Sigmoid}(W_2 h + b_2)$$
$$y = f_3(k) = \mathrm{SoftMax}(k), \qquad f_{3,i}(k) = \frac{\exp(k_i)}{\sum_{j} \exp(k_j)}$$
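The forward equations above can be traced by hand with a few lines of plain Python. This is only a sketch: the layer sizes and the parameter values W1, b1, W2, b2 below are made up for illustration, and the helper names (affine, forward) are not from Chainer.

```python
import math


def sigmoid(v):
    """Elementwise sigmoid of a list of numbers."""
    return [1.0 / (1.0 + math.exp(-a)) for a in v]


def softmax(v):
    """exp(k_i) / sum_j exp(k_j), shifted by max(v) for numerical stability."""
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def affine(W, x, b):
    """Compute W x + b for a matrix W given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def forward(x, W1, b1, W2, b2):
    h = sigmoid(affine(W1, x, b1))  # h = f1(x)
    k = sigmoid(affine(W2, h, b2))  # k = f2(h)
    return softmax(k)               # y = f3(k)


# Hypothetical parameters: 2 inputs -> 3 hidden units -> 2 outputs.
W1 = [[0.5, -0.5], [0.25, 0.75], [-1.0, 1.0]]
b1 = [0.0, 0.1, -0.1]
W2 = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
b2 = [0.0, 0.0]

y = forward([1.0, 2.0], W1, b1, W2, b2)
```

The output y is a probability vector: its entries are positive and sum to 1.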
Deep Learning framework components
Caffe       Chainer
Blob        Variable
Layer       Function
Net         (FunctionSet)
Solver      Optimizer
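Computational graph (DAG)
Forward Propagation: apply the layers in order, from the input to the output, and compute the Loss.
Forward Propagation through one layer: y = f(x; θ), where f is the Layer, x its input, y its output, and θ its parameters.
Backward Propagation: apply the Chain Rule in the reverse direction to propagate the gradient of the Loss L from the output back to each layer.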
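As a worked instance of the Chain Rule step, the sketch below runs Forward and Backward for a single sigmoid unit and compares the result with a finite-difference estimate. The squared-error loss and all numeric values are assumptions chosen for illustration; they are not taken from the slides.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def loss_and_grads(x, t, w, b):
    # Forward: a = w*x + b, y = sigmoid(a), L = (y - t)^2 / 2
    a = w * x + b
    y = sigmoid(a)
    L = 0.5 * (y - t) ** 2
    # Backward: multiply local derivatives from the loss back to w and b
    dL_dy = y - t
    dy_da = y * (1.0 - y)
    dL_da = dL_dy * dy_da
    dL_dw = dL_da * x  # da/dw = x
    dL_db = dL_da      # da/db = 1
    return L, dL_dw, dL_db


x, t, w, b = 1.5, 1.0, 0.3, -0.2
L, gw, gb = loss_and_grads(x, t, w, b)

# Central finite differences as an independent check of the gradients.
eps = 1e-6
num_gw = (loss_and_grads(x, t, w + eps, b)[0]
          - loss_and_grads(x, t, w - eps, b)[0]) / (2 * eps)
num_gb = (loss_and_grads(x, t, w, b + eps)[0]
          - loss_and_grads(x, t, w, b - eps)[0]) / (2 * eps)
```

The analytic gradients gw and gb agree with num_gw and num_gb up to the finite-difference error.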
SGD / Momentum / AdaGrad / ADADELTA / RMSprop / Adam etc
http://imgur.com/a/Hqolp
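The optimizers listed above differ in how they turn a gradient into a parameter update. A minimal sketch of the two simplest rules, plain SGD and Momentum, is given below. The function names and default hyperparameters are hypothetical, and this is not how Chainer implements its optimizers.

```python
def sgd_step(params, grads, lr=0.01):
    """Plain SGD: p <- p - lr * g."""
    return [p - lr * g for p, g in zip(params, grads)]


def momentum_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """Momentum SGD: v <- momentum * v - lr * g, then p <- p + v."""
    new_v = [momentum * v - lr * g for v, g in zip(velocity, grads)]
    new_p = [p + v for p, v in zip(params, new_v)]
    return new_p, new_v


# Toy problem: minimize f(p) = p0^2 + p1^2, whose gradient is 2p.
params = [1.0, -2.0]
for _ in range(100):
    grads = [2.0 * p for p in params]
    params = sgd_step(params, grads, lr=0.1)
```

With lr=0.1 each step multiplies every parameter by 0.8, so the parameters shrink toward the minimum at the origin.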
OS: Linux (Ubuntu 14.04)
Mac OS, Windows
Python (CPython): 2.7+ / 3.4+
Dependencies: NumPy 1.9+, Six 1.9+
CUDA: CUDA 6.5+
Install: pip install chainer
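[Figure: GitHub Stars (2015/5), with Theano and PyLearn2 among the frameworks shown] https://twitter.com/fchollet/status/635891305084796929
[Figure: GitHub Stars (2015/8), with PyLearn2 and Theano among the frameworks shown] https://twitter.com/fchollet/status/635891305084796929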
CuPy: NumPy for the GPU
[Figure: software stack. Chainer (Python) sits on NumPy for the CPU (backed by BLAS) and on CuPy for the GPU (backed by CUDA Toolkit and cuDNN)]
GoogLeNet, NTM, Recursive Net, LSTM
GoogLeNet definition, lines of code: Chainer 167, Caffe 2058
(2012) AlexNet*, 7 layers
(2014) GoogLeNet**, 22 layers
* ImageNet Classification with Deep Convolutional Neural Networks http://www.image-net.org/challenges/LSVRC/2012/supervision.pdf
** Szegedy, Christian, et al. "Going deeper with convolutions." arXiv preprint arXiv:1409.4842 (2014).
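Chainer: Define-by-Run
Define-and-Run (Caffe/Torch/Theano): the network is defined first (prototxt, yaml, Lua etc.), and data is then fed through the fixed network.
Define-by-Run (Chainer): the network is defined by running the forward computation itself, so ordinary Python constructs such as for loops can be used.
    x = chainer.Variable(...)
    y = f(x)
    z = g(x)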
Chainer
Forward
x = chainer.Variable(np.array(1))
y = chainer.Variable(np.array(1))
z = x**2 + 2*x*y + y
z.backward()
[Figure: computational graph for z = x**2 + 2*x*y + y, built from the nodes Split, _ ** 2, 2 * _, _ * _, and _ + _]
The data in the graph are chainer.Variable objects and the operations are chainer.Function objects; the graph is recorded while the Forward computation runs.
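To make the idea of recording the graph during the Forward computation concrete, here is a minimal reverse-mode sketch in plain Python for the same expression z = x**2 + 2*x*y + y. The Var class is hypothetical and much simpler than chainer.Variable; it only illustrates the define-by-run mechanism.

```python
class Var(object):
    """A value that remembers which inputs produced it (hypothetical class)."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.grad = 0.0
        self.parents = parents  # tuple of (input Var, local derivative)

    @staticmethod
    def lift(v):
        return v if isinstance(v, Var) else Var(v)

    def __add__(self, other):
        other = Var.lift(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = Var.lift(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def __pow__(self, n):
        return Var(self.value ** n, ((self, n * self.value ** (n - 1)),))

    def backward(self):
        # Order the recorded graph so that inputs come before their outputs.
        order, seen = [], set()

        def visit(v):
            if id(v) in seen:
                return
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for v in reversed(order):
            for parent, local in v.parents:
                parent.grad += v.grad * local


x = Var(1)
y = Var(1)
z = x ** 2 + 2 * x * y + y  # the graph is built while this line runs
z.backward()
```

At x = y = 1 the gradients are dz/dx = 2x + 2y = 4 and dz/dy = 2x + 1 = 3, which is what x.grad and y.grad hold after z.backward().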
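MNIST
# Imports used by this example (Chainer v1.3 API)
from chainer import FunctionSet, Variable, optimizers
from chainer.cuda import to_gpu
import chainer.functions as F

# (1) Model definition
model = FunctionSet(
    l1=F.Linear(784, 100),
    l2=F.Linear(100, 100),
    l3=F.Linear(100, 10)).to_gpu()
opt = optimizers.SGD()
opt.setup(model)

# (2) Forward computation
def forward(x, t):
    h1 = F.relu(model.l1(x))
    h2 = F.relu(model.l2(h1))
    y = model.l3(h2)
    return F.softmax_cross_entropy(y, t)

# (3) Training loop (xrange is Python 2; use range on Python 3)
for epoch in xrange(n_epoch):
    for i in xrange(0, N, batchsize):
        x = Variable(to_gpu(...))
        t = Variable(to_gpu(...))
        opt.zero_grads()       # reset the gradients
        loss = forward(x, t)   # forward pass builds the graph
        loss.backward()        # backward pass computes the gradients
        opt.update()           # update the parameters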
[Figure: network with layer sizes 784, 100, 100, 10; example output probabilities 0: 2%, 1: 5%, 2: 90%, ..., 9: 1%]
FunctionSet: a FunctionSet bundles the Functions of the model (here l1, l2, l3) into one object.
Optimizer: the Optimizer (here SGD) is set up on the model and updates its parameters from the gradients.
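Python control flow (if / for / while etc) can be used in the forward computation
For example, an RNN can be written with a for loop
def forward(x, t, train=True):
    h = F.relu(model.l1(x))
    y = model.l2(h)
    if train:
        loss = F.softmax_cross_entropy(y, t)
        return loss
    else:
        prob = F.softmax(y)
        acc = F.accuracy(y, t)
        return acc
[Figure: with train=True the graph is y -> softmax_cross_entropy -> loss; with train=False it is y -> softmax -> prob and y -> accuracy -> acc]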
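The remark about writing an RNN with a for loop can be sketched without any framework. In the hypothetical scalar RNN below, the number of recorded steps equals the length of the input sequence, so the shape of the computation is decided by the data at run time. The function name, the weights, and the tanh nonlinearity are assumptions made for illustration.

```python
import math


def rnn_forward(xs, w_xh, w_hh, h0=0.0):
    """Unroll a scalar tanh RNN over the sequence xs with a plain for loop."""
    h = h0
    trace = []  # one entry per executed step, standing in for the recorded graph
    for x in xs:
        h = math.tanh(w_xh * x + w_hh * h)
        trace.append(h)
    return h, trace


# Two sequences of different length go through the same code.
h_short, trace_short = rnn_forward([1.0, 0.5], w_xh=0.8, w_hh=0.5)
h_long, trace_long = rnn_forward([1.0, 0.5, -0.3, 0.2, 0.9], w_xh=0.8, w_hh=0.5)
```

In a Define-and-Run setting the unrolled length would have to be fixed when the network is declared. Here it simply follows len(xs).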
truncated BPTT
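[Figure: the graph x -> f -> y -> g -> z; after y.unchain_backward() only y -> g -> z remains]
x = Variable(...)
y = f(x)
z = g(y)
y.unchain_backward()
BPTT (Back Propagation Through Time) is the standard way to train an RNN. Truncated BPTT is BPTT in which the backward pass is cut off partway through the sequence.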
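The effect of cutting the graph can be illustrated with a toy chain of nodes. The Node class and the scale helper below are hypothetical. Only the method name unchain_backward is borrowed from the slide, and the sketch makes no claim about how Chainer implements it.

```python
class Node(object):
    """A value with a link back to the node it was computed from."""

    def __init__(self, value, parent=None, local_grad=1.0):
        self.value = value
        self.parent = parent          # the node this one was computed from
        self.local_grad = local_grad  # d(self) / d(parent)
        self.grad = 0.0

    def unchain_backward(self):
        """Forget how this node was computed, cutting the graph behind it."""
        self.parent = None

    def backward(self):
        self.grad = 1.0
        node = self
        while node.parent is not None:
            node.parent.grad += node.grad * node.local_grad
            node = node.parent


def scale(node, a):
    """A stand-in for f or g: multiply by the constant a."""
    return Node(a * node.value, parent=node, local_grad=a)


# Truncated: the gradient stops at y.
x = Node(2.0)
y = scale(x, 3.0)  # y = f(x)
z = scale(y, 5.0)  # z = g(y)
y.unchain_backward()
z.backward()

# Not truncated: the gradient reaches x2.
x2 = Node(2.0)
z2 = scale(scale(x2, 3.0), 5.0)
z2.backward()
```

After the cut, y receives the gradient 5 and x receives nothing, while in the uncut chain x2 receives 3 * 5 = 15.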
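Caffe Reference Model
BVLC Reference Models from the Caffe Model Zoo can be used as Chainer functions
from chainer.functions.caffe import CaffeFunction  # import path as of Chainer v1.3
func = CaffeFunction('path/to/bvlc_reference_caffenet.caffemodel')
x = Variable(...)
y, = func(inputs={'data': x}, outputs=['fc8'])
Caffe: a Deep Learning framework written in C++. Model Zoo: a collection of trained models listed on the Caffe Wiki.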
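CuPy: NumPy for the GPU
cupy.ndarray: the GPU counterpart of numpy.ndarray
Elementwise / Reduction
Code that runs on both CPU and GPU
from chainer.cuda import get_array_module

def softmax(x):
    xp = get_array_module(x)  # xp = numpy or cupy, depending on x
    y = x - x.max(axis=1, keepdims=True)  # subtract the row maximum for stability
    y = xp.exp(y)
    return y / y.sum(axis=1, keepdims=True)
The same function works for both numpy and cupy arrays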
Chainer: a Define-by-Run framework in Python
Networks are written as ordinary Python code
CuPy: the same code for CPU and GPU
Your contributions are welcome!!
Mocha (Julia)
Author: Chiyuan Zhang (MIT)
Latest version: v0.0.9 (2015/7/21)
License: MIT Expat License
Example: train LeNet with MNIST https://github.com/pluskid/Mocha.jl#hello-world
Inspired by Caffe
Backends: Pure Julia / C++ / GPU