Deep Learning: Fundamentals and Applications. 14th Forum on Information Technology (FIT 2015), 2015/09/17, Ehime University. Seiya Tokui, Preferred Networks


  • Deep Learning: Fundamentals and Applications

    14th Forum on Information Technology (FIT 2015)
    2015/09/17, Ehime University
    Seiya Tokui, Preferred Networks

  • Self-introduction

    - Seiya Tokui (得居 誠也), @beam2d (Twitter, GitHub)
    - Preferred Infrastructure (2012-2014) -> Preferred Networks (2014-)
    - Machine learning R&D: Jubatus, then Deep Learning since 2012
    - Lead developer of Chainer (development started in April 2015)

  • Breakthrough results behind the Deep Learning boom

    - 2011: speech recognition. Deep neural network acoustic models replaced
      GMMs and cut the error rate by roughly 10 percentage points.
    - 2012: image recognition (ILSVRC). A deep ConvNet cut the error rate by
      roughly 10 percentage points.

    F. Seide, G. Li and D. Yu. Conversational Speech Transcription Using
    Context-Dependent Deep Neural Networks. INTERSPEECH, pp. 437-440 (2011).
    J. Deng, A. Berg, S. Satheesh, H. Su, A. Khosla and F.-F. Li. Large Scale
    Visual Recognition Challenge 2012. ILSVRC2012 Workshop.

  • Machine translation with neural networks

    - An Encoder-Decoder RNN learns to translate sentences end to end
      (sequence-to-sequence learning)
    - Reached accuracy competitive with strong phrase-based systems

    Sutskever, I., Vinyals, O. and Le, Q. V. Sequence to Sequence Learning
    with Neural Networks. NIPS 2014.

  • Outline of this talk

    - Fundamentals of neural networks
    - Writing and training them in Chainer
    - A tour of recent applications

  • Fundamentals of neural networks

  • Feed-forward neural network (multi-layer perceptron)

    - A stack of layers, each applying an affine transformation followed by a
      nonlinearity
    - Parameters: the weight matrices W1, W2, W3 and the bias vectors b1, b2, b3
    - The nonlinearities f1, f2, f3 are applied elementwise
    - [diagram: x -> h1 -> h2 -> y, with W1, W2, W3 labeling the arrows]

    h1 = f1(W1 x + b1),
    h2 = f2(W2 h1 + b2),
    y  = f3(W3 h2 + b3).
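
    To make these equations concrete, here is a minimal NumPy sketch of the
    forward pass (the layer sizes and the choice of tanh / identity for the
    fi are illustrative):

        import numpy as np

        # Illustrative layer sizes.
        n_in, n_hidden, n_out = 4, 8, 3
        rng = np.random.RandomState(0)
        W1, b1 = rng.randn(n_hidden, n_in), np.zeros(n_hidden)
        W2, b2 = rng.randn(n_hidden, n_hidden), np.zeros(n_hidden)
        W3, b3 = rng.randn(n_out, n_hidden), np.zeros(n_out)

        x = rng.randn(n_in)
        h1 = np.tanh(W1 @ x + b1)   # h1 = f1(W1 x + b1)
        h2 = np.tanh(W2 @ h1 + b2)  # h2 = f2(W2 h1 + b2)
        y = W3 @ h2 + b3            # y = f3(W3 h2 + b3), f3 = identity here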

  • The same network as a computational graph

    - Every quantity in the formulas becomes a node: the input x, the
      parameters Wi and bi, the functions fi, and the intermediate results
    - [diagram: x -> (W1 ·) -> + b1 -> f1 -> h1 -> (W2 ·) -> + b2 -> f2 -> h2
      -> (W3 ·) -> + b3 -> f3 -> y]

  • Supervised learning

    - Training data: input-target pairs (x1, t1), (x2, t2), . . .
    - The NN maps an input x to a prediction y
    - A loss function measures how far the prediction y is from the target t
    - Learning means adjusting the parameters so that the loss becomes small
    - [diagram: x -> NN -> y, compared against the target t to produce the loss]

  • Training by gradient descent

    - Minimize the average loss over the training data
    - Move the parameters a small step against the gradient of the loss, and
      repeat
    - Stochastic gradient descent (SGD): it is OK to estimate the gradient from
      a small minibatch of examples at each step
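
    As a minimal sketch of the update rule (the learning rate and shapes are
    illustrative; the gradient would come from backpropagation, next slide):

        import numpy as np

        def sgd_step(params, grads, lr=0.01):
            # One SGD update per parameter: theta <- theta - lr * dloss/dtheta.
            for p, g in zip(params, grads):
                p -= lr * g

        W = np.random.randn(8, 4)    # a weight matrix
        gW = np.random.randn(8, 4)   # its minibatch gradient (stand-in values)
        sgd_step([W], [gW])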

  • Computing the gradients: backpropagation

    - The network is a composition of simple functions, so the chain rule gives
      every gradient
    - Multiplying the local derivatives from the output side backwards through
      the NN is called backpropagation
    - Example: a chain w -f-> x -g-> y -h-> z, that is x = f(w), y = g(x),
      z = h(y). Then

      dz/dw = (dz/dy) (dy/dx) (dx/dw) = Dh(y) Dg(x) Df(w),

      where D denotes the derivative (Jacobian) of each function at its input.
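
    The chain rule in a tiny runnable sketch; the scalar functions f, g, h here
    are illustrative stand-ins:

        import numpy as np

        # Chain: x = f(w), y = g(x), z = h(y), with each derivative alongside.
        f, df = np.sin, np.cos
        g, dg = np.exp, np.exp
        h, dh = np.tanh, lambda y: 1 - np.tanh(y) ** 2

        w = 0.5
        x = f(w); y = g(x); z = h(y)
        dz_dw = dh(y) * dg(x) * df(w)   # Dh(y) Dg(x) Df(w)

        # Sanity check against a numerical derivative.
        eps = 1e-6
        numeric = (h(g(f(w + eps))) - h(g(f(w - eps)))) / (2 * eps)
        assert abs(dz_dw - numeric) < 1e-6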

  • Writing it in Chainer

    layer1 = F.Linear(n_in, n_hidden)
    layer2 = F.Linear(n_hidden, n_out)

    h = F.relu(layer1(x))
    y = layer2(h)
    loss = F.softmax_cross_entropy(y, t)

    - [diagram: x -> Linear (W1, b1) -> ReLU -> h -> Linear (W2, b2) -> y,
      compared with the target t to give the loss]
    - ReLU (rectified linear unit): f(x) = max(0, x)

  • Computing the gradients in Chainer

    - One line does it: loss.backward() fills in the gradient of every variable
      and parameter
    - This is what Deep Learning frameworks do: record the computational graph
      and differentiate it automatically
    - The backward pass starts at the output: the gradient of the loss with
      respect to itself is 1 (∇loss = 1)
    - Writing ∇a for the gradient of the loss with respect to a variable a, the
      backward pass computes ∇y, ∇h, ∇x, ∇W1, ∇b1, ∇W2, ∇b2 in turn
    - [diagram: the forward graph x -> Linear -> ReLU -> Linear -> y -> loss(t)
      mirrored by a backward graph propagating the ∇ values]
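
    Putting the last two slides together, one full training step in the Chainer
    v1 API of the time looked roughly like this (the FunctionSet/optimizer
    setup is a sketch; details varied across early v1 releases):

        import numpy as np
        import chainer.functions as F
        from chainer import FunctionSet, Variable, optimizers

        n_in, n_hidden, n_out = 784, 100, 10   # illustrative sizes

        model = FunctionSet(layer1=F.Linear(n_in, n_hidden),
                            layer2=F.Linear(n_hidden, n_out))
        optimizer = optimizers.SGD(lr=0.01)
        optimizer.setup(model)

        def forward(x_data, t_data):
            x, t = Variable(x_data), Variable(t_data)
            h = F.relu(model.layer1(x))
            y = model.layer2(h)
            return F.softmax_cross_entropy(y, t)

        # One step on a random minibatch (real code loops over the dataset).
        x_batch = np.random.randn(32, n_in).astype(np.float32)
        t_batch = np.random.randint(n_out, size=32).astype(np.int32)

        optimizer.zero_grads()            # clear old gradients
        loss = forward(x_batch, t_batch)
        loss.backward()                   # backpropagation fills in gradients
        optimizer.update()                # SGD parameter update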

  • Convolutional neural networks (2D convolution)

    - A convolution layer slides a small filter over the image and takes a
      local weighted sum at every position
    - The filter weights are shared across positions, so the layer has far
      fewer parameters than a fully connected one
    - Local connectivity matches image structure: nearby pixels are related
    - Convolution layers are typically alternated with pooling (subsampling)
      layers
    - The standard architecture for image recognition

    [figure: a LeNet-style ConvNet, from
    http://deeplearning.net/tutorial/lenet.html]
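
    A naive NumPy sketch of one 2D convolution (strictly a cross-correlation,
    as in most Deep Learning libraries; shapes are illustrative):

        import numpy as np

        def conv2d(image, kernel):
            # Valid-mode 2D cross-correlation: slide the filter over the image
            # and take a local weighted sum at each position.
            H, W = image.shape
            kH, kW = kernel.shape
            out = np.empty((H - kH + 1, W - kW + 1))
            for i in range(out.shape[0]):
                for j in range(out.shape[1]):
                    out[i, j] = np.sum(image[i:i + kH, j:j + kW] * kernel)
            return out

        image = np.random.randn(28, 28)       # e.g. one grayscale image
        kernel = np.random.randn(5, 5)        # one 5x5 filter, shared everywhere
        feature_map = conv2d(image, kernel)   # shape (24, 24)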

  • Recurrent neural networks (RNN)

    - For sequential data: the hidden state is fed back into the network at the
      next time step
    - One common form: h_t = tanh(W1 x_t + Wr h_{t-1} + b1), y_t = W2 h_t + b2
    - Unrolled over time the network is still a DAG, so backpropagation applies
      unchanged
    - [diagram: x -> Linear (W1, b1) -> + -> tanh -> h -> Linear (W2, b2) -> y
      -> loss(t), with a recurrent Linear (Wr) feeding h back into the sum]

  • Unrolling an RNN through time

    [diagram: left, the recurrent graph x -f-> h -g-> y with a feedback edge r
    on h; right, the same network unrolled over three steps, x1, x2, x3 -f->
    h1, h2, h3 -g-> y1, y2, y3, with r carrying h1 -> h2 -> h3. The unrolled
    graph reuses f, g and r at every step.]
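
    A minimal NumPy sketch of this unrolled recurrence, matching the W1, Wr,
    W2 of the previous slide (sizes and sequence length are illustrative):

        import numpy as np

        n_in, n_hidden, n_out, T = 4, 8, 3, 5
        rng = np.random.RandomState(0)
        W1, Wr = rng.randn(n_hidden, n_in), rng.randn(n_hidden, n_hidden)
        W2 = rng.randn(n_out, n_hidden)
        b1, b2 = np.zeros(n_hidden), np.zeros(n_out)

        xs = rng.randn(T, n_in)   # an input sequence x_1 ... x_T
        h = np.zeros(n_hidden)    # initial hidden state
        ys = []
        for x in xs:              # the same weights are reused at every step
            h = np.tanh(W1 @ x + Wr @ h + b1)  # h_t = tanh(W1 x_t + Wr h_{t-1} + b1)
            ys.append(W2 @ h + b2)             # y_t = W2 h_t + b2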

  • Chainer v1.3
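
    In Chainer, which builds the graph as the code runs (define-by-run), such
    an RNN is written with an ordinary Python loop. A minimal sketch in the v1
    API (layer names and sizes are illustrative):

        import numpy as np
        import chainer.functions as F
        from chainer import FunctionSet, Variable

        n_in, n_hidden, n_out = 4, 8, 3
        model = FunctionSet(x_to_h=F.Linear(n_in, n_hidden),
                            h_to_h=F.Linear(n_hidden, n_hidden),
                            h_to_y=F.Linear(n_hidden, n_out))

        def forward(xs):
            # The loop itself unrolls the graph, one step per input.
            h = Variable(np.zeros((1, n_hidden), dtype=np.float32))
            ys = []
            for x_data in xs:
                x = Variable(x_data.reshape(1, n_in).astype(np.float32))
                h = F.tanh(model.x_to_h(x) + model.h_to_h(h))
                ys.append(model.h_to_y(h))
            return ys

        ys = forward(np.random.randn(5, n_in))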

  • Recent applications

  • Image recognition: ILSVRC results

    - ILSVRC: classify each image into 1,000 categories, scored by top-5 error;
      human error is roughly 5%

    Team         Year  Place  Error (top-5)  Uses external data
    SuperVision  2012  -      16.4%          no
    SuperVision  2012  1st    15.3%          ImageNet 22k
    Clarifai     2013  -      11.7%          no
    Clarifai     2013  1st    11.2%          ImageNet 22k
    MSRA         2014  3rd    7.35%          no
    VGG          2014  2nd    7.32%          no
    GoogLeNet    2014  1st    6.67%          no

    C. Szegedy, et al. Going Deeper with Convolutions. ILSVRC 2014 workshop
    (at ECCV 2014).

  • Three ingredients behind these results

    - Deep ConvNets: deeper and deeper architectures (GoogLeNet is a 22-layer
      ConvNet)
    - Large data: ImageNet provides about 1.28 million labeled images to train
      the ConvNet
    - Data augmentation: the training set is further enlarged with transformed
      copies of the images (crops, flips, color perturbations)

  • Speech recognition: Deep Speech

    - End-to-end speech recognition: the whole pipeline is a single network
    - An RNN (including a bidirectional recurrent layer) maps audio features
      directly to characters
    - Trained on a very large amount of speech
    - Data augmentation: training utterances are synthetically overlaid with
      noise, for robustness in noisy environments

    A. Hannun, C. Case, J. Casper, B. Catanzaro, G. Diamos, E. Elsen,
    R. Prenger, S. Satheesh, S. Sengupta, A. Coates, A. Y. Ng. Deep Speech:
    Scaling up end-to-end speech recognition. arXiv:1412.5567.

  • Machine translation: Encoder-Decoder

    - An Encoder RNN reads the source sentence into a vector; a Decoder RNN
      generates the target sentence from it
    - The entire translation system is one RNN trained end to end, reaching
      state-of-the-art-level accuracy
    - Uses LSTM (Long Short-Term Memory) units, stacked 4 layers deep

    I. Sutskever, O. Vinyals, Q. V. Le. Sequence to Sequence Learning with
    Neural Networks. NIPS 2014.
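
    How such a decoder produces a translation, in a schematic sketch (greedy
    decoding; the parameters and vocabulary here are random stand-ins, not the
    paper's trained model):

        import numpy as np

        rng = np.random.RandomState(0)
        vocab_size, n_hidden = 10, 16
        EOS = 0   # end-of-sentence token

        # Stand-ins for trained decoder parameters.
        W_dec = rng.randn(n_hidden, n_hidden)
        W_emb = rng.randn(n_hidden, vocab_size)
        W_out = rng.randn(vocab_size, n_hidden)

        def decode(encoder_state, max_len=20):
            # Greedy decoding: feed each output word back in, stop at EOS.
            h, word, output = encoder_state, EOS, []
            for _ in range(max_len):
                h = np.tanh(W_dec @ h + W_emb[:, word])   # decoder RNN step
                word = int(np.argmax(W_out @ h))          # most probable word
                if word == EOS:
                    break
                output.append(word)
            return output

        translation = decode(rng.randn(n_hidden))  # state from the Encoder RNN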

  • Machine translation with attention

    - At each decoding step, the model looks back at the relevant part of the
      source sentence
    - The Encoder RNN produces an annotation vector for every source position
      (forward and backward RNNs in the cited work)
    - The attention weights over these annotations are computed from the
      decoder RNN's current state
    - Attention lets the RNN translate long sentences without squeezing
      everything into one fixed-length vector

    D. Bahdanau, K. Cho, Y. Bengio. Neural Machine Translation by Jointly
    Learning to Align and Translate. ICLR 2015.
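
    The core attention computation in a short NumPy sketch (a plain dot-product
    score is used for brevity; Bahdanau et al. score with a small network):

        import numpy as np

        def attend(decoder_state, annotations):
            # Score every source annotation against the decoder state,
            # normalize with a softmax, and return the weighted average
            # (the context vector).
            scores = annotations @ decoder_state    # one score per source word
            weights = np.exp(scores - scores.max())
            weights /= weights.sum()                # attention weights, sum to 1
            return weights @ annotations            # context vector

        rng = np.random.RandomState(0)
        annotations = rng.randn(7, 16)              # one vector per source word
        context = attend(rng.randn(16), annotations)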

  • Image captioning: Encoder-Decoder again

    - Generating a natural-language description of an image
    - The Encoder is a ConvNet (GoogLeNet) instead of an RNN; the Decoder RNN
      generates the caption from the image feature

    O. Vinyals, A. Toshev, S. Bengio, D. Erhan. Show and Tell: A Neural Image
    Caption Generator. arXiv:1411.4555v2.

  • Image captioning with an attention Encoder-Decoder

    - The decoder attends to different image regions while generating each word

    K. Xu, J. L. Ba, R. Kiros, K. Cho, A. Courville, R. Salakhutdinov,
    R. S. Zemel, Y. Bengio. Show, Attend and Tell: Neural Image Caption
    Generation with Visual Attention. arXiv:1502.03044v2.

  • Deep reinforcement learning (DQN): a ConvNet approximates the Q function

    - The ConvNet takes the raw game screen as input and outputs a value
      (Q-value) for each possible action
    - Learns from the game score alone; no one tells it the correct action
    - Reached human-level play on many Atari games
    - [figure: the DQN ConvNet choosing an action (e.g. left or right) from
      the screen]

    V. Mnih, et al. Human-level control through deep reinforcement learning.
    Nature, vol. 518, Feb. 26, 2015.
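
    The rule DQN learns by, in a sketch (Q is a ConvNet in the paper; the
    replay memory and target network are omitted here):

        import numpy as np

        def q_target(reward, next_q_values, done, gamma=0.99):
            # Bellman target: r + gamma * max_a' Q(s', a'); just r at episode end.
            return reward if done else reward + gamma * float(np.max(next_q_values))

        # The ConvNet is trained to move Q(s, a) toward this target, e.g. by
        # SGD on the squared error (q_target(...) - Q(s, a)) ** 2.
        target = q_target(reward=1.0,
                          next_q_values=np.array([0.2, 0.5, 0.1]),
                          done=False)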

  • DQN in action

    - A reproduction of DQN at Preferred Networks:
      https://research.preferred.jp/2015/06/distributed-deep-reinforcement-learning/

  • AutoEncoder (AE) and its generative, variational form

    - An NN (the Encoder) compresses the input x into a code z
    - A second NN (the Decoder) reconstructs x from z
    - The 2 networks are trained together: the Decoder defines a generative
      model, the Encoder an inference model
    - The variational form is probabilistic and can generate new data
    - [diagram: Encoder (inference): x ~ q(x) -> NN -> z, defining q(z|x);
      Decoder (generation): z ~ p(z) -> NN -> x, defining p(x|z)]
    - Training brings the two distributions p and q close to each other
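
    The variational AE's two key pieces in a sketch (a Gaussian encoder with
    the reparameterization trick; shapes are illustrative):

        import numpy as np

        rng = np.random.RandomState(0)

        def reparameterize(mu, log_var):
            # Draw z ~ q(z|x) = N(mu, sigma^2) in a differentiable way:
            # z = mu + sigma * eps with eps ~ N(0, I).
            return mu + np.exp(0.5 * log_var) * rng.randn(*mu.shape)

        def kl_to_prior(mu, log_var):
            # KL(q(z|x) || p(z)) for a Gaussian code against the N(0, I) prior.
            return -0.5 * np.sum(1 + log_var - mu ** 2 - np.exp(log_var))

        mu, log_var = rng.randn(2), rng.randn(2)   # the Encoder NN's outputs
        z = reparameterize(mu, log_var)
        # Training minimizes: the Decoder's reconstruction loss of x given z,
        # plus kl_to_prior(mu, log_var).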

  • AE with attention: DRAW

    - Generates an image step by step rather than in one shot
    - At each step, an attention mechanism chooses where on the canvas to read
      and write
    - A recurrent AE: the Encoder and Decoder are RNNs
    - Produces notably sharp generated images

    K. Gregor, I. Danihelka, A. Graves, D. J. Rezende, D. Wierstra. DRAW: A
    Recurrent Neural Network For Image Generation. ICML 2015.

  • Generative Adversarial Nets: training 2 NNs against each other

    - Generator NN: turns random noise z into a sample x
    - Discriminator NN: tries to tell generated samples from real data
    - The generator is trained to fool the discriminator; the discriminator is
      trained not to be fooled

    I. J. Goodfellow, J. P.-Abadie, M. Mirza, B. Xu, D. W.-Farley, S. Ozair,
    A. Courville, Y. Bengio. Generative Adversarial Nets. NIPS 2014.
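
    The two objectives in a short sketch (d_real and d_fake stand for the
    discriminator's probability outputs on real and generated samples; the
    networks themselves are omitted):

        import numpy as np

        def discriminator_loss(d_real, d_fake):
            # The discriminator wants d_real -> 1 and d_fake -> 0.
            return -np.mean(np.log(d_real) + np.log(1 - d_fake))

        def generator_loss(d_fake):
            # The generator wants to fool the discriminator: d_fake -> 1.
            # (The non-saturating variant recommended in the paper.)
            return -np.mean(np.log(d_fake))

        d_real = np.array([0.9, 0.8])   # illustrative discriminator outputs
        d_fake = np.array([0.3, 0.1])
        print(discriminator_loss(d_real, d_fake), generator_loss(d_fake))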

  • Higher-resolution generation: a Laplacian pyramid of GANs

    - A single GAN struggles to generate large, detailed images
    - Idea: generate coarse to fine, with one adversarial network per level of
      a Laplacian pyramid
    - Each level adds detail to the upsampled output of the previous level

    E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models
    using a Laplacian Pyramid of Adversarial Networks. arXiv:1506.05751v1.

  • [sample images generated by the Laplacian pyramid of adversarial networks]

    E. Denton, S. Chintala, A. Szlam, R. Fergus. Deep Generative Image Models
    using a Laplacian Pyramid of Adversarial Networks. arXiv:1506.05751v1.

  • Semi-supervised learning: combining a supervised NN with an AE

    - Only part of the data has labels; the AE lets the model learn from the
      unlabeled part too
    - The latent representation is shared between the classifier and the AE
    - [diagram: NNs mapping between x and latent variables (z, u, v), and back
      to x]

  • Semi-supervised learning with a variational AutoEncoder

    - The class label is treated as a latent variable alongside the code z
    - The AE is trained on labeled and unlabeled examples alike, improving the
      classifier when labels are scarce

    D. P. Kingma, D. J. Rezende, S. Mohamed, M. Welling. Semi-Supervised
    Learning with Deep Generative Models. NIPS 2014.

  • A neural algorithm of artistic style

    - Takes 2 images, one for content and one for style, and synthesizes an
      image with the content of the first in the style of the second
    - Uses a ConvNet pre-trained on ImageNet: content is matched through
      deep-layer activations, style through the correlations (Gram matrices)
      of feature maps
    - The output image itself is optimized by gradient descent on these
      matching losses

    L. A. Gatys, A. S. Ecker, M. Bethge. A Neural Algorithm of Artistic Style.
    arXiv:1508.06576.
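
    The style representation at the heart of the method, sketched in NumPy
    (the Gram matrix of one layer's feature maps; shapes are illustrative):

        import numpy as np

        def gram_matrix(features):
            # features: (channels, height, width) activations of one ConvNet
            # layer. The Gram matrix holds correlations between feature
            # channels, capturing texture/style but not spatial layout.
            C = features.shape[0]
            flat = features.reshape(C, -1)
            return flat @ flat.T            # (channels, channels)

        features = np.random.randn(64, 32, 32)   # e.g. one conv layer's output
        G = gram_matrix(features)
        # The style loss compares G between the style image and the
        # synthesized image.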

  • End-to-end memory networks

    - An NN equipped with an explicit memory, read with (soft) attention
    - Answers questions that require combining several remembered facts
    - Trained end to end with backpropagation, a step toward AI-style
      reasoning tasks

    S. Sukhbaatar, A. Szlam, J. Weston. End-To-End Memory Networks.
    arXiv:1503.08895.

  • Summary

  • Take-home points

    - Deep Learning: deep neural networks trained with SGD and backpropagation
    - Frameworks like Chainer record the computational graph and differentiate
      it automatically
    - ConvNets for images, RNNs for sequences
    - Anything that forms a DAG of differentiable operations can be trained the
      same way
    - Encoder-Decoder models with attention drive translation and captioning
    - Generative models: AE / variational AE, adversarial networks
    - Deep reinforcement learning learns control from raw inputs and rewards
    - Memory and attention are steps toward AI-style reasoning tasks

  • Chainer: a flexible framework for Deep Learning

    - Official site: http://chainer.org
    - Source code on GitHub: https://github.com/pfnet/chainer
    - Documentation: http://docs.chainer.org