September 18, 2015 (GTC Japan 2015): Introduction to the Deep Learning Framework Chainer and Its Application to Compound Activity Prediction

1. Introduction to the Deep Learning Framework Chainer and Its Application to Compound Activity Prediction. Preferred Networks. 2015/9/18, GTC Japan 2015.

2. Speaker profile (@delta2323_). 2012.3: PFI; 2014.10: PFN. Listed items: Chainer, http://delta2323.github.io, NIPS2014, ICML2015.

3. git clone https://github.com/pfnet/chainer.git

4. Chainer (http://chainer.org). Developed by PFN and PFI. First released 2015/6/9; 1.3.0 released 2015/9/2; 1.3.1 (9/16) and 1.4.0 (9/30) are also listed. License: MIT (Expat).
    HP: http://chainer.org
    GitHub: https://github.com/pfnet/chainer
    Twitter: @ChainerOfficial
    Google Group: Chainer User Group
    Contribution Guide: http://docs.chainer.org/en/stable/contribution.html
    Features: Powerful (CUDA, GPU, cuDNN); Flexible; Intuitive (Python).

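5. Chainer's technology stack: Chainer + CuPy. Chainer: deep learning framework for Python. CuPy: GPU counterpart of NumPy*. Diagram labels: CPU, GPU, BLAS, CUDA Toolkit, cuDNN, NumPy, CuPy, Chainer, Python.
    * NumPy: a numerical computing library for Python.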

6. Requirements and installation. OS: Linux (Ubuntu 14.04). Python (CPython): 2.7+/3.4+. Dependencies: NumPy 1.9+, Six 1.9+. CUDA: 6.5+. Install with pip*:
    pip install chainer
    * pip: a package manager for Python.

7. GitHub stars (2015/5). Figure not reproduced; surviving label: PyLearn2. Source: https://twitter.com/fchollet/status/635891305084796929

8. GitHub stars (2015/8). Figure not reproduced; surviving label: PyLearn2. Source: https://twitter.com/fchollet/status/635891305084796929

9. Chainer: Deep Q Network*
    * Mnih, Volodymyr, et al. "Human-level control through deep reinforcement learning." Nature 518.7540 (2015): 529-533.
    ** Deep Q-Network with Caffe: http://d.hatena.ne.jp/muupan/20141021/1413850461
    *** PFI 2014: http://www.ustream.tv/recorded/53153399

10. Chainer: https://research.preferred.jp/2015/06/distributed-deep-reinforcement-learning/
    Figure not reproduced. Diagram labels (shown three times): 273, 600, 400, 200, 100, 50, 5.

11. Chainer: Variational AE. Kingma, Diederik P., et al. "Semi-supervised learning with deep generative models." Advances in Neural Information Processing Systems. 2014.
    A Neural Algorithm of Artistic Style [Gatys+'15], using a VGG model from Caffe (chainer-gogh).

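12. Components of a deep learning framework: Caffe vs. Chainer

    Component              | Caffe  | Chainer
    n-dimensional array    | Blob   | Variable
    Layer                  | Layer  | Function
    Network (DAG)          | Net    | (FunctionSet)
    Optimization algorithm | Solver | Optimizer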

13. Networks: GoogLeNet, NTM, Recursive Net, LSTM. GoogLeNet definition: Chainer 167 vs. Caffe 2058 (unit lost in extraction; presumably lines). (2012) AlexNet*, 7 layers; (2014) GoogLeNet**, 22 layers. Chainer: Define-by-Run.
    * ImageNet Classification with Deep Convolutional Neural Networks. http://www.image-net.org/challenges/LSVRC/2012/supervision.pdf
    ** Szegedy, Christian, et al. "Going deeper with convolutions." arXiv preprint arXiv:1409.4842 (2014).

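14. Define-and-Run vs. Define-by-Run
    Define-and-Run: the computational graph (x, f, g) is defined first, and data is fed through it afterwards.
    Define-by-Run: the graph is defined by running the forward computation itself:
        x = chainer.Variable(...)
        y = f(x)
        z = g(x)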

15. Forward computation
        import numpy as np
        import chainer

        x = chainer.Variable(np.array(1))
        y = chainer.Variable(np.array(1))
        z = x**2 + 2*x*y + y
        z.backward()
    Graph nodes: Split, _ ** 2, 2 * _, _ * _, _ + _, _ + _, with inputs x, y and output z. Legend: chainer.Variable, chainer.Function.

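16. Variable.backward() runs backpropagation over the recorded graph.
        import numpy as np
        import chainer

        x = chainer.Variable(np.array(1))
        y = chainer.Variable(np.array(1))
        z = x**2 + 2*x*y + y
        z.backward()
    Graph nodes: Split, _ ** 2, 2 * _, _ * _, _ + _, _ + _, with inputs x, y and output z.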
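To make the forward/backward slides concrete, here is a minimal sketch of define-by-run reverse-mode differentiation in plain Python. It is not Chainer's implementation: the class `Var` and the helper `_wrap` are hypothetical names introduced only for illustration, and the sketch handles scalars only. It evaluates the same expression as the slides, z = x**2 + 2*x*y + y at x = y = 1.

```python
class Var:
    """Scalar that remembers how it was computed (hypothetical; not chainer.Variable)."""

    def __init__(self, value, parents=()):
        self.value = float(value)
        self.grad = 0.0
        # Each entry is (parent Var, local derivative of self w.r.t. that parent).
        self.parents = parents

    def __add__(self, other):
        other = _wrap(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    __radd__ = __add__

    def __mul__(self, other):
        other = _wrap(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

    __rmul__ = __mul__

    def __pow__(self, k):
        # k is assumed to be a plain Python number.
        return Var(self.value ** k, ((self, k * self.value ** (k - 1)),))

    def backward(self):
        # Order the recorded graph so that every node comes after its parents.
        order, seen = [], set()

        def visit(v):
            if id(v) in seen:
                return
            seen.add(id(v))
            for parent, _ in v.parents:
                visit(parent)
            order.append(v)

        visit(self)
        self.grad = 1.0
        # Walk from the output back to the inputs, applying the chain rule.
        for v in reversed(order):
            for parent, local in v.parents:
                parent.grad += v.grad * local


def _wrap(x):
    """Turn plain numbers into constant Var nodes."""
    return x if isinstance(x, Var) else Var(x)


x = Var(1)
y = Var(1)
z = x ** 2 + 2 * x * y + y   # the graph is built while this line runs
z.backward()
print(z.value, x.grad, y.grad)  # 4.0 4.0 3.0
```

The gradients match the hand derivation: dz/dx = 2x + 2y = 4 and dz/dy = 2x + 1 = 3. Because x feeds two operations, its gradient is accumulated from both paths, which is the role of the Split node in the slide's diagram.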

17. Control flow uses ordinary Python syntax (if / for / while etc.); a for loop can express an RNN.
        def forward(x, t, train=True):
            h = F.relu(model.l1(x))
            y = model.l2(h)
            if train:
                loss = F.softmax_cross_entropy(y, t)
                return loss
            else:
                prob = F.softmax(y)
                acc = F.accuracy(prob, t)
                return acc
    Diagram labels: y, sce, loss (train branch); y, sm, prob, acc, acc (else branch).

18. MNIST example
        # Imports assumed by the slide's code (Chainer 1.x)
        from chainer import FunctionSet, Variable, optimizers
        from chainer.cuda import to_gpu
        import chainer.functions as F

        # (1) Model definition
        model = FunctionSet(
            l1=F.Linear(784, 100),
            l2=F.Linear(100, 100),
            l3=F.Linear(100, 10)).to_gpu()

        # (2) Optimizer Setup
        opt = optimizers.SGD()
        opt.setup(model)

        # (3) Forward computation
        def forward(x, t):
            h1 = F.relu(model.l1(x))
            h2 = F.relu(model.l2(h1))
            y = model.l3(h2)
            return F.softmax_cross_entropy(y, t)

        # (4) Training loop
        for epoch in range(n_epoch):
            for i in range(0, N, batchsize):
                x = Variable(to_gpu(...))
                t = Variable(to_gpu(...))
                opt.zero_grads()
                loss = forward(x, t)
                loss.backward()
                opt.update()
    Diagram: layer sizes 784, 100, 100, 10; example output 0: 2%, 1: 5%, 2: 90%, 9: 1%.

19. (1) Model definition: Functions with parameters are collected in a FunctionSet.
        model = FunctionSet(
            l1=F.Linear(784, 100),
            l2=F.Linear(100, 100),
            l3=F.Linear(100, 10)).to_gpu()
        opt = optimizers.SGD()
        opt.setup(model)

20. (2) Optimizer setup: the model is registered with the Optimizer.
        model = FunctionSet(
            l1=F.Linear(784, 100),
            l2=F.Linear(100, 100),
            l3=F.Linear(100, 10)).to_gpu()
        opt = optimizers.SGD()
        opt.setup(model)

21. (3) Training Loop
        for epoch in range(n_epoch):
            for i in range(0, N, batchsize):
                x = Variable(to_gpu(...))
                t = Variable(to_gpu(...))
                opt.zero_grads()
                loss = forward(x, t)
                loss.backward()
                opt.update()

22. (3) Forward computation: takes the input x and the labels t and returns the loss.
        def forward(x, t):
            h1 = F.relu(model.l1(x))
            h2 = F.relu(model.l2(h1))
            y = model.l3(h2)
            return F.softmax_cross_entropy(y, t)
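The forward function ends in a softmax cross-entropy loss. As a rough illustration of what such a loss computes, here is a plain-Python sketch. It is not Chainer's code: the function names are reused only for readability, `batch_loss` is a hypothetical helper, and averaging over the minibatch is an assumption of this sketch.

```python
import math


def softmax_cross_entropy(logits, label):
    """Negative log-probability of `label` under softmax(logits)."""
    # Subtract the maximum before exponentiating so large logits do not overflow.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_sum_exp - logits[label]


def batch_loss(batch_logits, labels):
    """Mean loss over a minibatch (assumption: the loss is averaged)."""
    losses = [softmax_cross_entropy(l, t) for l, t in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


# With 10 equal logits every class has probability 1/10, so the loss is log(10).
uniform = softmax_cross_entropy([0.0] * 10, 3)
print(uniform, math.log(10))
```

The log-sum-exp form avoids computing the softmax probabilities explicitly, which is why the loss is applied to the raw output y of the last linear layer rather than to F.softmax(y).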

23. (4) Training Loop
        for epoch in range(n_epoch):
            for i in range(0, N, batchsize):
                x = Variable(to_gpu(...))
                t = Variable(to_gpu(...))
                opt.zero_grads()
                loss = forward(x, t)
                loss.backward()
                opt.update()
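The training loop repeats four steps per minibatch: clear the gradients, run the forward computation, backpropagate, and update the parameters. The sketch below mirrors that cycle in plain Python on a toy linear regression, so it runs without Chainer or a GPU. Everything in it is an assumption made for illustration: the toy data, the dictionaries `params` and `grads`, and the helpers `zero_grads`, `forward_backward`, and `update` are hypothetical and are not part of Chainer.

```python
# Toy data (assumption): t = 2x + 1 on 20 evenly spaced points.
N = 20
xs = [i / N for i in range(N)]
ts = [2.0 * x + 1.0 for x in xs]

params = {"w": 0.0, "b": 0.0}
grads = {"w": 0.0, "b": 0.0}
lr, batchsize, n_epoch = 0.2, 5, 500


def zero_grads():
    """Reset accumulated gradients, like opt.zero_grads() in the slide."""
    for name in grads:
        grads[name] = 0.0


def forward_backward(xb, tb):
    """Mean squared error on one minibatch; accumulates its gradients."""
    n = len(xb)
    loss = 0.0
    for x, t in zip(xb, tb):
        err = params["w"] * x + params["b"] - t
        loss += err * err / n
        grads["w"] += 2.0 * err * x / n
        grads["b"] += 2.0 * err / n
    return loss


def update():
    """Plain SGD step, like opt.update() with optimizers.SGD."""
    for name in params:
        params[name] -= lr * grads[name]


for epoch in range(n_epoch):
    for i in range(0, N, batchsize):
        xb, tb = xs[i:i + batchsize], ts[i:i + batchsize]
        zero_grads()
        loss = forward_backward(xb, tb)
        update()

print(params, loss)
```

Clearing the gradients first matters because the backward step accumulates into them; without it, each update would also include gradients left over from earlier minibatches.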

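24. y.unchain_backward() cuts the computational graph upstream of y; this is used for truncated BPTT**.
        x = Variable(...)
        y = f(x)
        z = g(y)
        y.unchain_backward()
    Diagram: x, f, y, g, z before the call; y, g, z after it.
    * BPTT: Back Propagation Through Time, backpropagation for RNNs.
    ** truncated BPTT: BPTT in which the backward pass is cut off after a limited number of steps.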
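To show what truncation does to a gradient, here is a small worked sketch on a scalar linear RNN, h_t = w * h_{t-1} + x_t with loss L = h_T. It does not use Chainer: the functions `forward` and `grad_w` and the `truncate` parameter are hypothetical names for this illustration. Stopping the backward pass after k steps plays the role that cutting the graph with unchain_backward() plays in the slide.

```python
def forward(w, xs, h0=0.0):
    """Run the recurrence and return all states; hs[t] is the state after t inputs."""
    hs = [h0]
    for x in xs:
        hs.append(w * hs[-1] + x)
    return hs


def grad_w(w, xs, truncate=None):
    """dL/dw for L = h_T, backpropagating through at most `truncate` steps."""
    hs = forward(w, xs)
    T = len(xs)
    steps = T if truncate is None else min(truncate, T)
    grad = 0.0
    dh = 1.0  # dL/dh_t, starting at t = T
    for t in range(T, T - steps, -1):
        grad += dh * hs[t - 1]  # direct dependence of h_t on w
        dh *= w                 # pass the gradient on to h_{t-1}
    return grad


xs = [1.0, 0.5, -0.3, 0.8]
w = 0.9
print(grad_w(w, xs))               # full BPTT over all 4 steps
print(grad_w(w, xs, truncate=2))   # only the last 2 steps contribute
```

The truncated gradient treats the state at the cut point as a constant, so it is cheaper to compute but ignores how earlier steps depend on w.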

25. Caffe. * Caffe: C++

26. * Sanguinetti, Michael C., and Martin Tristani-Firouzi. "hERG potassium channels and cardiac arrhythmia." Nature 440 (23 March 2006): 463-469. doi:10.1038/nature04710, Fig. 5.

27. Quantitative Structure-Activity Relationship (QSAR). Merck QSAR competition (2012)* **. Keywords: QSAR, HTS.
    * Dahl, George E., Navdeep Jaitly, and Ruslan Salakhutdinov. "Multi-task Neural Networks for QSAR Predictions." arXiv preprint arXiv:1406.1231 (2014).
    ** http://blog.kaggle.com/2012/10/31/merck-competition-results-deep-nn-and-gpus-come-out-to-play/

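28. Multi-task neural network of [Dahl+14]. Hidden layers: 2-3, with 500-2500 units per layer; Dropout; Minibatch SGD. Input: a bit vector such as 10010011010100010. Output per task: (1, 0) = Active, (0, 1) = Inactive. Data: PubChem. Remaining labels: 19, 20, DNN, B.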

29. Community Learning; TSUBAME (H27). Surviving keywords: PubChem, SoftTarget, Data-Parallel, MPI. Surviving figures: 2; 100; TSUBAME, 3 GPU (K40), 4GB/; >100.

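30. Community Learning: scaling up to 8. Column headers and units were lost in extraction; the labels below are inferred from the values.

    Parallelism | Time    | Speedup | (header lost)
    1           | 10.5719 | x 1     | 0.0318
    2           | 5.2267  | x 2.022 | 0.1377
    3           | 3.9455  | x 2.679 | 0.1284
    4           | 2.5978  | x 4.070 | 0.1367
    8           | 1.5417  | x 6.857 | 0.1281

    NN: Chainer, mshadow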
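The speedup figures on this slide can be reproduced from the timing column. The short sketch below does that arithmetic and adds a parallel efficiency value (speedup divided by the degree of parallelism), which is not on the slide. It assumes that the first column is the number of parallel workers and the second is elapsed time; the slide's headers and units did not survive, so both readings are assumptions.

```python
# Values copied from the slide; the meaning of the columns is an assumption.
timings = {1: 10.5719, 2: 5.2267, 3: 3.9455, 4: 2.5978, 8: 1.5417}

baseline = timings[1]
speedups = {}
efficiencies = {}
for n in sorted(timings):
    speedups[n] = baseline / timings[n]   # how many times faster than n = 1
    efficiencies[n] = speedups[n] / n     # 1.0 would be perfect linear scaling
    print(n, round(speedups[n], 3), round(efficiencies[n], 3))
```

The computed speedups agree with the slide's values to within rounding of the reported times.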

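31. Community Learning: results on 5 assays. The sub-headers of the two [Dahl+ '14] columns and the name of the metric were lost in extraction.

    ID          | [Dahl+ '14]   | Community Learning
    1851 (1a2)  | 0.926   0.938 | 0.9387
    1851 (2c19) | 0.897   0.903 | 0.9413
    1851 (2c9)  | 0.889   0.907 | 0.9274
    1851 (2d6)  | 0.863   0.861 | 0.8913
    1851 (3a4)  | 0.895   0.897 | 0.9214

    NN: Chainer, mshadow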

32. Chainer: a Python-based framework with Define-by-Run.
    HP: http://chainer.org
    GitHub: https://github.com/pfnet/chainer
    Twitter: @ChainerOfficial
    Google Group: Chainer User Group
    Contribution Guide: http://docs.chainer.org/en/stable/contribution.html
    Your Contribution is Welcome!! And You are Welcome!! We are hiring :)
