Deep Learning: Fundamentals and Applications. 14th Forum on Information Technology (FIT 2015), 2015/09/17, Ehime University. Seiya Tokui, Preferred Networks


  • Deep Learning: Fundamentals and Applications

    14th Forum on Information Technology (FIT 2015)
    2015/09/17, Ehime University
    Seiya Tokui, Preferred Networks

  • Self-introduction

    - Seiya Tokui (得居 誠也), @beam2d (Twitter, GitHub)
    - Preferred Infrastructure (2012-2014) -> Preferred Networks (2014-)
    - Machine learning R&D: Jubatus, then Deep Learning since 2012
    - Lead developer of Chainer (development started in April 2015)

  • Breakthrough results behind the Deep Learning boom

    - 2011: speech recognition. Deep neural network acoustic models replaced
      GMMs and cut the error rate by roughly 10 percentage points.
    - 2012: image recognition (ILSVRC). A deep ConvNet cut the error rate by
      roughly 10 percentage points.

    F. Seide, G. Li and D. Yu. Conversational Speech Transcription Using
    Context-Dependent Deep Neural Networks. INTERSPEECH, pp. 437-440 (2011).
    J. Deng, A. Berg, S. Satheesh, H. Su, A. Khosla and F.-F. Li. Large Scale
    Visual Recognition Challenge 2012. ILSVRC2012 Workshop.

  • Machine translation with neural networks

    - An Encoder-Decoder RNN learns to translate sentences end to end
      (sequence-to-sequence learning)
    - Reached accuracy competitive with strong phrase-based systems

    Sutskever, I., Vinyals, O. and Le, Q. V. Sequence to Sequence Learning
    with Neural Networks. NIPS 2014.

  • Outline of this talk

    - Fundamentals of neural networks
    - Writing and training them in Chainer
    - A tour of recent applications

  • Fundamentals of neural networks

  • Feed-forward neural network (multi-layer perceptron)

    - A stack of layers, each applying an affine transformation followed by a
      nonlinearity
    - Parameters: the weight matrices W1, W2, W3 and the bias vectors b1, b2, b3
    - The nonlinearities f1, f2, f3 are applied elementwise
    - [diagram: x -> h1 -> h2 -> y, with W1, W2, W3 labeling the arrows]

    h1 = f1(W1 x + b1),
    h2 = f2(W2 h1 + b2),
    y  = f3(W3 h2 + b3).
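
    To make these equations concrete, here is a minimal NumPy sketch of the
    forward pass (the layer sizes and the choice of tanh / identity for the
    fi are illustrative):

        import numpy as np

        # Illustrative layer sizes.
        n_in, n_hidden, n_out = 4, 8, 3
        rng = np.random.RandomState(0)
        W1, b1 = rng.randn(n_hidden, n_in), np.zeros(n_hidden)
        W2, b2 = rng.randn(n_hidden, n_hidden), np.zeros(n_hidden)
        W3, b3 = rng.randn(n_out, n_hidden), np.zeros(n_out)

        x = rng.randn(n_in)
        h1 = np.tanh(W1 @ x + b1)   # h1 = f1(W1 x + b1)
        h2 = np.tanh(W2 @ h1 + b2)  # h2 = f2(W2 h1 + b2)
        y = W3 @ h2 + b3            # y = f3(W3 h2 + b3), f3 = identity here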

  • The same network as a computational graph

    - Every quantity in the formulas becomes a node: the input x, the
      parameters Wi and bi, the functions fi, and the intermediate results
    - [diagram: x -> (W1 ·) -> + b1 -> f1 -> h1 -> (W2 ·) -> + b2 -> f2 -> h2
      -> (W3 ·) -> + b3 -> f3 -> y]

  • Supervised learning

    - Training data: input-target pairs (x1, t1), (x2, t2), . . .
    - The NN maps an input x to a prediction y
    - A loss function measures how far the prediction y is from the target t
    - Learning means adjusting the parameters so that the loss becomes small
    - [diagram: x -> NN -> y, compared against the target t to produce the loss]

  • Training by gradient descent

    - Minimize the average loss over the training data
    - Move the parameters a small step against the gradient of the loss, and
      repeat
    - Stochastic gradient descent (SGD): it is OK to estimate the gradient from
      a small minibatch of examples at each step
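
    As a minimal sketch of the update rule (the learning rate and shapes are
    illustrative; the gradient would come from backpropagation, next slide):

        import numpy as np

        def sgd_step(params, grads, lr=0.01):
            # One SGD update per parameter: theta <- theta - lr * dloss/dtheta.
            for p, g in zip(params, grads):
                p -= lr * g

        W = np.random.randn(8, 4)    # a weight matrix
        gW = np.random.randn(8, 4)   # its minibatch gradient (stand-in values)
        sgd_step([W], [gW])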

  • Computing the gradients: backpropagation

    - The network is a composition of simple functions, so the chain rule gives
      every gradient
    - Multiplying the local derivatives from the output side backwards through
      the NN is called backpropagation
    - Example: a chain w -f-> x -g-> y -h-> z, that is x = f(w), y = g(x),
      z = h(y). Then

      dz/dw = (dz/dy) (dy/dx) (dx/dw) = Dh(y) Dg(x) Df(w),

      where D denotes the derivative (Jacobian) of each function at its input.
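
    The chain rule in a tiny runnable sketch; the scalar functions f, g, h here
    are illustrative stand-ins:

        import numpy as np

        # Chain: x = f(w), y = g(x), z = h(y), with each derivative alongside.
        f, df = np.sin, np.cos
        g, dg = np.exp, np.exp
        h, dh = np.tanh, lambda y: 1 - np.tanh(y) ** 2

        w = 0.5
        x = f(w); y = g(x); z = h(y)
        dz_dw = dh(y) * dg(x) * df(w)   # Dh(y) Dg(x) Df(w)

        # Sanity check against a numerical derivative.
        eps = 1e-6
        numeric = (h(g(f(w + eps))) - h(g(f(w - eps)))) / (2 * eps)
        assert abs(dz_dw - numeric) < 1e-6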

  • Writing it in Chainer

    layer1 = F.Linear(n_in, n_hidden)
    layer2 = F.Linear(n_hidden, n_out)

    h = F.relu(layer1(x))
    y = layer2(h)
    loss = F.softmax_cross_entropy(y, t)

    - [diagram: x -> Linear (W1, b1) -> ReLU -> h -> Linear (W2, b2) -> y,
      compared with the target t to give the loss]
    - ReLU (rectified linear unit): f(x) = max(0, x)

  • Computing the gradients in Chainer

    - One line does it: loss.backward() fills in the gradient of every variable
      and parameter
    - This is what Deep Learning frameworks do: record the computational graph
      and differentiate it automatically
    - The backward pass starts at the output: the gradient of the loss with
      respect to itself is 1 (∇loss = 1)
    - Writing ∇a for the gradient of the loss with respect to a variable a, the
      backward pass computes ∇y, ∇h, ∇x, ∇W1, ∇b1, ∇W2, ∇b2 in turn
    - [diagram: the forward graph x -> Linear -> ReLU -> Linear -> y -> loss(t)
      mirrored by a backward graph propagating the ∇ values]
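
    Putting the last two slides together, one full training step in the Chainer
    v1 API of the time looked roughly like this (the FunctionSet/optimizer
    setup is a sketch; details varied across early v1 releases):

        import numpy as np
        import chainer.functions as F
        from chainer import FunctionSet, Variable, optimizers

        n_in, n_hidden, n_out = 784, 100, 10   # illustrative sizes

        model = FunctionSet(layer1=F.Linear(n_in, n_hidden),
                            layer2=F.Linear(n_hidden, n_out))
        optimizer = optimizers.SGD(lr=0.01)
        optimizer.setup(model)

        def forward(x_data, t_data):
            x, t = Variable(x_data), Variable(t_data)
            h = F.relu(model.layer1(x))
            y = model.layer2(h)
            return F.softmax_cross_entropy(y, t)

        # One step on a random minibatch (real code loops over the dataset).
        x_batch = np.random.randn(32, n_in).astype(np.float32)
        t_batch = np.random.randint(n_out, size=32).astype(np.int32)

        optimizer.zero_grads()            # clear old gradients
        loss = forward(x_batch, t_batch)
        loss.backward()                   # backpropagation fills in gradients
        optimizer.update()                # SGD parameter update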

  • Convolutional neural networks (2D convolution)

    - A convolution layer slides a small filter over the image and takes a
      local weighted sum at every position
    - The filter weights are shared across positions, so the layer has far
      fewer parameters than a fully connected one
    - Local connectivity matches image structure: nearby pixels are related
    - Convolution layers are typically alternated with pooling (subsampling)
      layers
    - The standard architecture for image recognition

    [figure: a LeNet-style ConvNet, from
    http://deeplearning.net/tutorial/lenet.html]
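
    A naive NumPy sketch of one 2D convolution (strictly a cross-correlation,
    as in most Deep Learning libraries; shapes are illustrative):

        import numpy as np

        def conv2d(image, kernel):
            # Valid-mode 2D cross-correlation: slide the filter over the image
            # and take a local weighted sum at each position.
            H, W = image.shape
            kH, kW = kernel.shape
            out = np.empty((H - kH + 1, W - kW + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
            return out

        image = np.random.randn(28, 28)       # e.g. one grayscale image
        kernel = np.random.randn(5, 5)        # one 5x5 filter, shared everywhere
        feature_map = conv2d(image, kernel)   # shape (24, 24)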

  • Recurrent neural networks (RNN)

    - For sequential data: the hidden state is fed back into the network at the
      next time step
    - One common form: h_t = tanh(W1 x_t + Wr h_{t-1} + b1), y_t = W2 h_t + b2
    - Unrolled over time the network is still a DAG, so backpropagation applies
      unchanged
    - [diagram: x -> Linear (W1, b1) -> + -> tanh -> h -> Linear (W2, b2) -> y
      -> loss(t), with a recurrent Linear (Wr) feeding h back into the sum]

  • Unrolling an RNN through time

    [diagram: left, the recurrent graph x -f-> h -g-> y with a feedback edge r
    on h; right, the same network unrolled over three steps, x1, x2, x3 -f->
    h1, h2, h3 -g-> y1, y2, y3, with r carrying h1 -> h2 -> h3. The unrolled
    graph reuses f, g and r at every step.]
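
    A minimal NumPy sketch of this unrolled recurrence, matching the W1, Wr,
    W2 of the previous slide (sizes and sequence length are illustrative):

        import numpy as np

        n_in, n_hidden, n_out, T = 4, 8, 3, 5
        rng = np.random.RandomState(0)
        W1, Wr = rng.randn(n_hidden, n_in), rng.randn(n_hidden, n_hidden)
        W2 = rng.randn(n_out, n_hidden)
        b1, b2 = np.zeros(n_hidden), np.zeros(n_out)

        xs = rng.randn(T, n_in)   # an input sequence x_1 ... x_T
        h = np.zeros(n_hidden)    # initial hidden state
        ys = []
        for x in xs:              # the same weights are reused at every step
            h = np.tanh(W1 @ x + Wr @ h + b1)  # h_t = tanh(W1 x_t + Wr h_{t-1} + b1)
            ys.append(W2 @ h + b2)             # y_t = W2 h_t + b2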

  • Chainer v1.3
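
    In Chainer, which builds the graph as the code runs (define-by-run), such
    an RNN is written with an ordinary Python loop. A minimal sketch in the v1
    API (layer names and sizes are illustrative):

        import numpy as np
        import chainer.functions as F
        from chainer import FunctionSet, Variable

        n_in, n_hidden, n_out = 4, 8, 3
        model = FunctionSet(x_to_h=F.Linear(n_in, n_hidden),
                            h_to_h=F.Linear(n_hidden, n_hidden),
                            h_to_y=F.Linear(n_hidden, n_out))

        def forward(xs):
            # The loop itself unrolls the graph, one step per input.
            h = Variable(np.zeros((1, n_hidden), dtype=np.float32))
            ys = []
            for x_data in xs:
                x = Variable(x_data.reshape(1, n_in).astype(np.float32))
                h = F.tanh(model.x_to_h(x) + model.h_to_h(h))
                ys.append(model.h_to_y(h))
            return ys

        ys = forward(np.random.randn(5, n_in))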

  • Recent applications

  • Image recognition: ILSVRC results

    - ILSVRC: classify each image into 1,000 categories, scored by top-5 error;
      human error is roughly 5%

    Team         Year  Place  Error (top-5)  Uses external data
    SuperVision  2012  -      16.4%          no
    SuperVision  2012  1st    15.3%          ImageNet 22k
    Clarifai     2013  -      11.7%          no
    Clarifai     2013  1st    11.2%          ImageNet 22k
    MSRA         2014  3rd    7.35%          no
    VGG          2014  2nd    7.32%          no
    GoogLeNet    2014  1st    6.67%          no

    C. Szegedy, et al. Going Deeper with Convolutions. ILSVRC 2014 workshop
    (at ECCV 2014).

  • Three ingredients behind these results

    - Deep ConvNets: deeper and deeper architectures (GoogLeNet is a 22-layer
      ConvNet)
    - Large data: ImageNet provides about 1.28 million labeled images to train
      the ConvNet
    - Data augmentation: the training set is further enlarged with transformed
      copies of the images (crops, flips, color perturbations)

  • Speech recognition: Deep Speech

    - End-to-end speech recognition: the whole pipeline is a single network
    - An RNN (including a bidirectional recurrent layer) maps audio features
      directly to characters
    - Trained on a very large amount of speech
    - Data augmentation: training utterances are synthetically overlaid with
      noise, for robustness in noisy environments

    A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen,
    R. Prenger, S. Satheesh, S. Sengupta, A. Coates, A. Y. Ng. Deep Speech:
    Scaling up end-to-end speech recognition. arXiv:1412.5567.

  • Machine translation: Encoder-Decoder

    - An Encoder RNN reads the source sentence into a vector; a Decoder RNN
      generates the target sentence from it
    - The entire translation system is one RNN trained end to end, reaching
      state-of-the-art-level accuracy
    - Uses LSTM (Long Short-Term Memory) units, stacked 4 layers deep

    I. Sutskever, O. Vinyals, Q. V. Le. Sequence to Sequence Learning with
    Neural Networks. NIPS 2014.
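
    How such a decoder produces a translation, in a schematic sketch (greedy
    decoding; the parameters and vocabulary here are random stand-ins, not the
    paper's trained model):

        import numpy as np

        rng = np.random.RandomState(0)
        vocab_size, n_hidden = 10, 16
        EOS = 0   # end-of-sentence token

        # Stand-ins for trained decoder parameters.
        W_dec = rng.randn(n_hidden, n_hidden)
        W_emb = rng.randn(n_hidden, vocab_size)
        W_out = rng.randn(vocab_size, n_hidden)

        def decode(encoder_state, max_len=20):
            # Greedy decoding: feed each output word back in, stop at EOS.
            h, word, output = encoder_state, EOS, []
            for _ in range(max_len):
                h = np.tanh(W_dec @ h + W_emb[:, word])   # decoder RNN step
                word = int(np.argmax(W_out @ h))          # most probable word
                if word == EOS:
                    break
                output.append(word)
            return output

        translation = decode(rng.randn(n_hidden))  # state from the Encoder RNN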

  • Machine translation with attention

    - At each decoding step, the model looks back at the relevant part of the
      source sentence
    - The Encoder RNN produces an annotation vector for every source position
      (forward and backward RNNs in the cited work)
    - The attention weights over these annotations are computed from the
      decoder RNN's current state
    - Attention lets the RNN translate long sentences without squeezing
      everything into one fixed-length vector

    D. Bahdanau, K. Cho, Y. Bengio. Neural Machine Translation by Jointly
    Learning to Align and Translate. ICLR 2015.
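
    The core attention computation in a short NumPy sketch (a plain dot-product
    score is used for brevity; Bahdanau et al. score with a small network):

        import numpy as np

        def attend(decoder_state, annotations):
            # Score every source annotation against the decoder state,
            # normalize with a softmax, and return the weighted average
            # (the context vector).
            scores = annotations @ decoder_state    # one score per source word
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()                # attention weights, sum to 1
            return weights @ annotations            # context vector

        rng = np.random.RandomState(0)
        annotations = rng.randn(7, 16)              # one vector per source word
        context = attend(rng.randn(16), annotations)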

  • Image captioning: Encoder-Decoder again

    - Generating a natural-language description of an image
    - The Encoder is a ConvNet (GoogLeNet) instead of an RNN; the Decoder RNN
      generates the caption from the image feature

    O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image
    Caption Generator. arXiv:1411.4555v2.

  • Image captioning with an attention Encoder-Decoder

    - The decoder attends to different image regions while generating each word

    K. Xu, J. L. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov,
    R. S. Zemel, Y. Bengio. Show, Attend and Tell: Neural Image Caption
    Generation with Visual Attention. arXiv:1502.03044v2.

  • Deep reinforcement learning (DQN): a ConvNet approximates the Q function

    - The ConvNet takes the raw game screen as input and outputs a value
      (Q-value) for each possible action
    - Learns from the game score alone; no one tells it the correct action
    - Reached human-level play on many Atari games
    - [figure: the DQN ConvNet choosing an action (e.g. left or right) from
      the screen]

    V. Mnih, et al. Human-level control through deep reinforcement learning.
    Nature, vol. 518, Feb. 26, 2015.
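
    The rule DQN learns by, in a sketch (Q is a ConvNet in the paper; the
    replay memory and target network are omitted here):

        import numpy as np

        def q_target(reward, next_q_values, done, gamma=0.99):
            # Bellman target: r + gamma * max_a' Q(s', a'); just r at episode end.
            return reward if done else reward + gamma * float(np.max(next_q_values))

        # The ConvNet is trained to move Q(s, a) toward this target, e.g. by
        # SGD on the squared error (q_target(...) - Q(s, a)) ** 2.
        target = q_target(reward=1.0,
                          next_q_values=np.array([0.2, 0.5, 0.1]),
                          done=False)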

  • DQN in action

    - A reproduction of DQN at Preferred Networks:
      https://research.preferred.jp/2015/06/distributed-deep-reinforcement-learning/

  • AutoEncoder (AE) and its generative, variational form

    - An NN (the Encoder) compresses the input x into a code z
    - A second NN (the Decoder) reconstructs x from z
    - The 2 networks are trained together: the Decoder defines a generative
      model, the Encoder an inference model
    - The variational form is probabilistic and can generate new data
    - [diagram: Encoder (inference): x ~ q(x) -> NN -> z, defining q(z|x);
      Decoder (generation): z ~ p(z) -> NN -> x, defining p(x|z)]
    - Training brings the two distributions p and q close to each other
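
    The variational AE's two key pieces in a sketch (a Gaussian encoder with
    the reparameterization trick; shapes are illustrative):

        import numpy as np

        rng = np.random.RandomState(0)

        def reparameterize(mu, log_var):
            # Draw z ~ q(z|x) = N(mu, sigma^2) in a differentiable way:
            # z = mu + sigma * eps with eps ~ N(0, I).
            return mu + np.exp(0.5 * log_var) * rng.randn(*mu.shape)

        def kl_to_prior(mu, log_var):
            # KL(q(z|x) || p(z)) for a Gaussian code against the N(0, I) prior.
            return -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var))

        mu, log_var = rng.randn(2), rng.randn(2)   # the Encoder NN's outputs
        z = reparameterize(mu, log_var)
        # Training minimizes: the Decoder's reconstruction loss of x given z,
        # plus kl_to_prior(mu, log_var).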

  • AE with attention: DRAW

    - Generates an image step by step rather than in one shot
    - At each step, an attention mechanism chooses where on the canvas to read
      and write
    - A recurrent AE: the Encoder and Decoder are RNNs
    - Produces notably sharp generated images

    K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, D. Wierstra. DRAW: A
    Recurrent Neural Network For Image Generation. ICML 2015.

  • Generative Adversarial Nets: training 2 NNs against each other

    - Generator NN: turns random noise z into a sample x
    - Discriminator NN: tries to tell generated samples from real data
    - The generator is trained to fool the discriminator; the discriminator is
      trained not to be fooled

    I. J. Goodfellow, J. P.-Abadie, M. Mirza, B. Xu, D. W.-Farley, S. Ozair,
    A. Courville, Y. Bengio. Generative Adversarial Nets. NIPS 2014.
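
    The two objectives in a short sketch (d_real and d_fake stand for the
    discriminator's probability outputs on real and generated samples; the
    networks themselves are omitted):

        import numpy as np

        def discriminator_loss(d_real, d_fake):
            # The discriminator wants d_real -> 1 and d_fake -> 0.
            return -np.mean(np.log(d_real) + np.log(1 - d_fake))

        def generator_loss(d_fake):
            # The generator wants to fool the discriminator: d_fake -> 1.
            # (The non-saturating variant recommended in the paper.)
            return -np.mean(np.log(d_fake))

        d_real = np.array([0.9, 0.8])   # illustrative discriminator outputs
        d_fake = np.array([0.3, 0.1])
        print(discriminator_loss(d_real, d_fake), generator_loss(d_fake))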

  • Higher-resolution generation: a Laplacian pyramid of GANs

    - A single GAN struggles to generate large, detailed images
    - Idea: generate coarse to fine, with one adversarial network per level of
      a Laplacian pyramid
    - Each level adds detail to the upsampled output of the previous level

    E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models
    using a Laplacian Pyramid of Adversarial Networks. arXiv:1506.05751v1.

  • [sample images generated by the Laplacian pyramid of adversarial networks]

    E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models
    using a Laplacian Pyramid of Adversarial Networks. arXiv:1506.05751v1.

  • Semi-supervised learning: combining a supervised NN with an AE

    - Only part of the data has labels; the AE lets the model learn from the
      unlabeled part too
    - The latent representation is shared between the classifier and the AE
    - [diagram: NNs mapping between x and latent variables (z, u, v), and back
      to x]

  • Semi-supervised learning with a variational AutoEncoder

    - The class label is treated as a latent variable alongside the code z
    - The AE is trained on labeled and unlabeled examples alike, improving the
      classifier when labels are scarce

    D. P. Kingma, D. J. Rezende, S. Mohamed, M. Welling. Semi-Supervised
    Learning with Deep Generative Models. NIPS 2014.

  • A neural algorithm of artistic style

    - Takes 2 images, one for content and one for style, and synthesizes an
      image with the content of the first in the style of the second
    - Uses a ConvNet pre-trained on ImageNet: content is matched through
      deep-layer activations, style through the correlations (Gram matrices)
      of feature maps
    - The output image itself is optimized by gradient descent on these
      matching losses

    L. A. Gatys, A. S. Ecker, M. Bethge. A Neural Algorithm of Artistic Style.
    arXiv:1508.06576.
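
    The style representation at the heart of the method, sketched in NumPy
    (the Gram matrix of one layer's feature maps; shapes are illustrative):

        import numpy as np

        def gram_matrix(features):
            # features: (channels, height, width) activations of one ConvNet
            # layer. The Gram matrix holds correlations between feature
            # channels, capturing texture/style but not spatial layout.
            C = features.shape[0]
            flat = features.reshape(C, -1)
            return flat @ flat.T            # (channels, channels)

        features = np.random.randn(64, 32, 32)   # e.g. one conv layer's output
        G = gram_matrix(features)
        # The style loss compares G between the style image and the
        # synthesized image.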

  • End-to-end memory networks

    - An NN equipped with an explicit memory, read with (soft) attention
    - Answers questions that require combining several remembered facts
    - Trained end to end with backpropagation, a step toward AI-style
      reasoning tasks

    S. Sukhbaatar, A. Szlam, J. Weston. End-To-End Memory Networks.
    arXiv:1503.08895.

  • Summary

  • Take-home points

    - Deep Learning: deep neural networks trained with SGD and backpropagation
    - Frameworks like Chainer record the computational graph and differentiate
      it automatically
    - ConvNets for images, RNNs for sequences
    - Anything that forms a DAG of differentiable operations can be trained the
      same way
    - Encoder-Decoder models with attention drive translation and captioning
    - Generative models: AE / variational AE, adversarial networks
    - Deep reinforcement learning learns control from raw inputs and rewards
    - Memory and attention are steps toward AI-style reasoning tasks

  • Chainer: a flexible framework for Deep Learning

    - Official site: http://chainer.org
    - Source code on GitHub: https://github.com/pfnet/chainer
    - Documentation: http://docs.chainer.org