go-go-green-wing-mighty-morphing-materials-in-aircraft …


Page 1: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 2: go-go-green-wing-mighty-morphing-materials-in-aircraft …

2017

2013

Page 3: go-go-green-wing-mighty-morphing-materials-in-aircraft …

https://newatlas.com/bae-smartskin/33458/
https://www.nasa.gov/ames/feature/go-go-green-wing-mighty-morphing-materials-in-aircraft-design

Page 4: go-go-green-wing-mighty-morphing-materials-in-aircraft …

http://www.biologyreference.com/Mo-Nu/Neuron.html

Page 5: go-go-green-wing-mighty-morphing-materials-in-aircraft …

http://www.biologyreference.com/Mo-Nu/Neuron.html

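Perceptron diagram: inputs I1, I2 and bias B; weights w1, w2, w3; output O.

$$f(x_i, w_i) = \Phi\Big(b + \sum_i w_i \cdot x_i\Big)$$

$$\Phi(x) = \begin{cases} 1, & \text{if } x \ge 0.5 \\ 0, & \text{if } x < 0.5 \end{cases}$$

Truth table for P ∧ Q:

  P   Q   P ∧ Q
  T   T   T
  T   F   F
  F   T   F
  F   F   F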
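A minimal sketch of how a single perceptron with the 0.5 threshold can reproduce the AND truth table. The weights (0.3, 0.3) and bias (0.0) are illustrative assumptions chosen by hand, not values taken from the slides.

```python
def step_activation(x, threshold=0.5):
    """Threshold activation: 1 if x >= threshold, else 0."""
    return 1 if x >= threshold else 0


def perceptron(inputs, weights, bias):
    """Weighted sum of the inputs plus bias, passed through the step activation."""
    weighted_sum = bias + sum(w * x for w, x in zip(weights, inputs))
    return step_activation(weighted_sum)


# Hypothetical hand-picked parameters: only (1, 1) reaches the 0.5 threshold.
AND_WEIGHTS = (0.3, 0.3)
AND_BIAS = 0.0

and_outputs = {
    (p, q): perceptron((p, q), AND_WEIGHTS, AND_BIAS)
    for p in (1, 0)
    for q in (1, 0)
}
```

With these parameters only the input (1, 1) gives a weighted sum of 0.6, which clears the threshold; every other row stays below it.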

Page 6: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 7: go-go-green-wing-mighty-morphing-materials-in-aircraft …

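Agent-environment loop (Sutton and Barto): the agent observes state $S_t$ and reward $R_t$ and sends action $A_t$ to the environment, which returns the next state $S_{t+1}$ and reward $R_{t+1}$.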
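The interaction loop can be sketched in a few lines. The `Corridor` environment, its `reset`/`step` interface and the random action choice are hypothetical stand-ins for illustration only; they are not part of the slides.

```python
import random


class Corridor:
    """Hypothetical toy environment: positions 0..length, episode ends at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # action is -1 (left) or +1 (right); the walls clamp the position
        self.state = min(max(self.state + action, 0), self.length)
        done = self.state == self.length
        return self.state, -1.0, done  # next state, reward, episode finished?


def run_episode(env, rng, max_steps=10_000):
    """Run one episode with a random agent and record (S_t, A_t, R_{t+1}, S_{t+1})."""
    state = env.reset()
    trajectory = []
    for _ in range(max_steps):
        action = rng.choice([-1, 1])                 # agent picks A_t
        next_state, reward, done = env.step(action)  # environment answers
        trajectory.append((state, action, reward, next_state))
        state = next_state
        if done:
            break
    return trajectory


trajectory = run_episode(Corridor(), random.Random(0))
```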

Page 8: go-go-green-wing-mighty-morphing-materials-in-aircraft …

Page 9: go-go-green-wing-mighty-morphing-materials-in-aircraft …

A state $S_t$ is Markov if and only if:

$$P[S_{t+1} \mid S_t] = P[S_{t+1} \mid S_1, \dots, S_t]$$

Page 10: go-go-green-wing-mighty-morphing-materials-in-aircraft …

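• Return $G_t$ with discount factor $\gamma \in [0, 1]$:

$$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots = \sum_{k=0}^{\infty} \gamma^k R_{t+k+1}$$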
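A minimal sketch of the discounted return for a finite list of rewards. The function name and the sample reward sequence are assumptions for illustration.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**k * R_{t+k+1} over a finite reward sequence.

    rewards[0] plays the role of R_{t+1}, rewards[1] of R_{t+2}, and so on.
    """
    total = 0.0
    # Accumulate from the last reward backwards: G = R + gamma * G_next
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


# Hypothetical reward sequence: 1 + 0.5 * 1 + 0.25 * 1
example_return = discounted_return([1.0, 1.0, 1.0], gamma=0.5)
```

Setting `gamma` to 0 keeps only the immediate reward, while `gamma` equal to 1 gives the plain sum of rewards.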

Page 11: go-go-green-wing-mighty-morphing-materials-in-aircraft …

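$$v_\pi(s) = \mathbb{E}_\pi[G_t \mid S_t = s] = \mathbb{E}_\pi[R_{t+1} + \gamma v_\pi(S_{t+1}) \mid S_t = s]$$

$$q_\pi(s, a) = \mathbb{E}_\pi[R_{t+1} + \gamma q_\pi(S_{t+1}, A_{t+1}) \mid S_t = s, A_t = a] = \mathcal{R}_s^a + \gamma \sum_{s' \in S} \mathcal{P}_{ss'}^a v_\pi(s')$$

Backup diagram: $s \rightarrow v_\pi(s)$; $(s, a) \rightarrow q_\pi(s, a)$; reward $r$; $s' \rightarrow v_\pi(s')$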

Page 12: go-go-green-wing-mighty-morphing-materials-in-aircraft …

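Example 1: successor values 10, 5 and -3 are reached with probabilities P = .5, P = .25 and P = .25.

$$v(s) = 10 \times .5 + 5 \times .25 + (-3) \times .25 = 5.5$$

Example 2 (diagram labels: R = 5, R = 2; P = .5, P = .5; P = .4, P = .5, P = .1; node values 4.4, 2, 5):

$$v(s) = 5 \times .5 + .5\,[.4 \times 2 + .5 \times 5 + .1 \times 4.4] = 4.37 \approx 4.4$$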
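The two backups above can be checked numerically. This is a sketch under the assumption that each value is a probability-weighted average of its branches; in the second example the state's own value appears on the right-hand side, so the code iterates the equation to its fixed point.

```python
def expected_backup(branches):
    """Probability-weighted average over (probability, value) branches."""
    return sum(probability * value for probability, value in branches)


# First example: 10 * .5 + 5 * .25 + (-3) * .25
v_first = expected_backup([(0.5, 10.0), (0.25, 5.0), (0.25, -3.0)])

# Second example: v = 5 * .5 + .5 * (.4 * 2 + .5 * 5 + .1 * v), iterated from v = 0
v_second = 0.0
for _ in range(50):
    inner = expected_backup([(0.4, 2.0), (0.5, 5.0), (0.1, v_second)])
    v_second = expected_backup([(0.5, 5.0), (0.5, inner)])
```

The iteration settles at about 4.37, which rounds to the 4.4 shown in the diagram.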

Page 13: go-go-green-wing-mighty-morphing-materials-in-aircraft …

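[Backup diagrams for the optimal value functions: nodes s, a, r, s', a', with a max taken over actions]

Example: three actions with rewards R = -1, R = 2 and R = 3 lead to successor values 10, 5 and -3.

$$v_*(s) = \max\{-1 + 10,\; 2 + 5,\; 3 - 3\} = 9$$

A policy $\pi$ is at least as good as a policy $\pi'$ if $v_\pi(s) \ge v_{\pi'}(s) \;\; \forall s \in S$

$$v_*(s) \equiv \max_\pi v_\pi(s) \quad \forall s \in S$$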
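A small sketch of the optimality backup from the example: instead of averaging over actions, take the best reward-plus-successor-value. The pairing of rewards and successor values follows the equation above; the variable names are illustrative assumptions.

```python
def optimal_backup(action_outcomes):
    """Best achievable value: max over actions of reward + successor value."""
    return max(reward + next_value for reward, next_value in action_outcomes)


# (reward, successor value) for each of the three actions in the example
outcomes = [(-1.0, 10.0), (2.0, 5.0), (3.0, -3.0)]
v_star = optimal_backup(outcomes)

# Index of the greedy action that attains the maximum
best_action = max(range(len(outcomes)), key=lambda i: sum(outcomes[i]))
```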

Page 14: go-go-green-wing-mighty-morphing-materials-in-aircraft …

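Gridworld (T marks the terminal corner states):

   T    1    2    3
   4    5    6    7
   8    9   10   11
  12   13   14    T

$r_t = -1$

$\pi(\rightarrow \mid \cdot) = \pi(\uparrow \mid \cdot) = \pi(\downarrow \mid \cdot) = \pi(\leftarrow \mid \cdot) = .25$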

Page 15: go-go-green-wing-mighty-morphing-materials-in-aircraft …

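$k = 0$, $R_t = -1$: states 1 to 14 all start at 0.00; the terminal corners (labelled 0 and 15) have value 0.

   0      0.00   0.00   0.00
   0.00   0.00   0.00   0.00
   0.00   0.00   0.00   0.00
   0.00   0.00   0.00   0

$\pi$: random policy

$\pi(\rightarrow \mid \cdot) = \pi(\uparrow \mid \cdot) = \pi(\downarrow \mid \cdot) = \pi(\leftarrow \mid \cdot) = .25$

$r_t = -1$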

Page 16: go-go-green-wing-mighty-morphing-materials-in-aircraft …

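$\pi(\rightarrow \mid \cdot) = \pi(\uparrow \mid \cdot) = \pi(\downarrow \mid \cdot) = \pi(\leftarrow \mid \cdot) = .25$, $r_t = -1$

$k = 0$:

   0      0.00   0.00   0.00
   0.00   0.00   0.00   0.00
   0.00   0.00   0.00   0.00
   0.00   0.00   0.00   0

$k = 1$:

   0     -1.00  -1.00  -1.00
  -1.00  -1.00  -1.00  -1.00
  -1.00  -1.00  -1.00  -1.00
  -1.00  -1.00  -1.00   0

$$
\begin{aligned}
v_{k=1}(1) ={}& .25 \times (-1 + 0) && \rightarrow,\; v_{k=0}(2) = 0 \\
&+ .25 \times (-1 + 0) && \uparrow,\; v_{k=0}(1) = 0 \\
&+ .25 \times (-1 + 0) && \downarrow,\; v_{k=0}(5) = 0 \\
&+ .25 \times (-1 + 0) && \leftarrow,\; v_{k=0}(T) = 0 \\
={}& -.25 - .25 - .25 - .25 = -1
\end{aligned}
$$

$$
\begin{aligned}
v_{k=1}(7) ={}& .25 \times (-1 + 0) && \rightarrow,\; v_{k=0}(7) = 0 \\
&+ .25 \times (-1 + 0) && \uparrow,\; v_{k=0}(3) = 0 \\
&+ .25 \times (-1 + 0) && \downarrow,\; v_{k=0}(11) = 0 \\
&+ .25 \times (-1 + 0) && \leftarrow,\; v_{k=0}(6) = 0 \\
={}& -.25 - .25 - .25 - .25 = -1
\end{aligned}
$$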

Page 17: go-go-green-wing-mighty-morphing-materials-in-aircraft …

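$\pi(\rightarrow \mid \cdot) = \pi(\uparrow \mid \cdot) = \pi(\downarrow \mid \cdot) = \pi(\leftarrow \mid \cdot) = .25$, $r_t = -1$

$k = 1$:

   0     -1.00  -1.00  -1.00
  -1.00  -1.00  -1.00  -1.00
  -1.00  -1.00  -1.00  -1.00
  -1.00  -1.00  -1.00   0

$k = 2$:

   0     -1.75  -2.00  -2.00
  -1.75  -2.00  -2.00  -2.00
  -2.00  -2.00  -2.00  -1.75
  -2.00  -2.00  -1.75   0

$$
\begin{aligned}
v_{k=2}(1) ={}& .25 \times (-1 + (-1.00)) && \rightarrow,\; v_{k=1}(2) = -1.00 \\
&+ .25 \times (-1 + (-1.00)) && \uparrow,\; v_{k=1}(1) = -1.00 \\
&+ .25 \times (-1 + (-1.00)) && \downarrow,\; v_{k=1}(5) = -1.00 \\
&+ .25 \times (-1 + 0) && \leftarrow,\; v_{k=1}(T) = 0 \\
={}& .25 \times (-2 - 2 - 2 - 1) = -1.75
\end{aligned}
$$

$$
\begin{aligned}
v_{k=2}(7) ={}& .25 \times (-1 + (-1.00)) && \rightarrow,\; v_{k=1}(7) = -1.00 \\
&+ .25 \times (-1 + (-1.00)) && \uparrow,\; v_{k=1}(3) = -1.00 \\
&+ .25 \times (-1 + (-1.00)) && \downarrow,\; v_{k=1}(11) = -1.00 \\
&+ .25 \times (-1 + (-1.00)) && \leftarrow,\; v_{k=1}(6) = -1.00 \\
={}& .25 \times (-2 - 2 - 2 - 2) = -2
\end{aligned}
$$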

Page 18: go-go-green-wing-mighty-morphing-materials-in-aircraft …

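$\pi(\rightarrow \mid \cdot) = \pi(\uparrow \mid \cdot) = \pi(\downarrow \mid \cdot) = \pi(\leftarrow \mid \cdot) = .25$, $r_t = -1$

$k = 2$:

   0     -1.75  -2.00  -2.00
  -1.75  -2.00  -2.00  -2.00
  -2.00  -2.00  -2.00  -1.75
  -2.00  -2.00  -1.75   0

$k = 3$:

   0     -2.43  -2.93  -3.00
  -2.43  -2.93  -3.00  -2.93
  -2.93  -3.00  -2.93  -2.43
  -3.00  -2.93  -2.43   0

$$
\begin{aligned}
v_{k=3}(1) ={}& .25 \times (-1 + (-2.00)) && \rightarrow,\; v_{k=2}(2) = -2.00 \\
&+ .25 \times (-1 + (-1.75)) && \uparrow,\; v_{k=2}(1) = -1.75 \\
&+ .25 \times (-1 + (-2.00)) && \downarrow,\; v_{k=2}(5) = -2.00 \\
&+ .25 \times (-1 + 0) && \leftarrow,\; v_{k=2}(T) = 0 \\
={}& .25 \times (-3 - 2.75 - 3 - 1) = -2.4375 \quad \text{(shown as } -2.43\text{)}
\end{aligned}
$$

$$
\begin{aligned}
v_{k=3}(7) ={}& .25 \times (-1 + (-2.00)) && \rightarrow,\; v_{k=2}(7) = -2.00 \\
&+ .25 \times (-1 + (-2.00)) && \uparrow,\; v_{k=2}(3) = -2.00 \\
&+ .25 \times (-1 + (-1.75)) && \downarrow,\; v_{k=2}(11) = -1.75 \\
&+ .25 \times (-1 + (-2.00)) && \leftarrow,\; v_{k=2}(6) = -2.00 \\
={}& .25 \times (-3 - 3 - 2.75 - 3) = -2.9375 \quad \text{(shown as } -2.93\text{)}
\end{aligned}
$$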
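The sweeps worked through on these slides can be reproduced with a short sketch of iterative policy evaluation. It assumes a 4x4 grid numbered row by row from 0 to 15, terminal states in the two corners (0 and 15), a reward of -1 per move, no discounting, and moves off the grid that leave the state unchanged; the function and variable names are illustrative.

```python
def evaluate_random_policy(num_sweeps, size=4, reward=-1.0, gamma=1.0):
    """Synchronous policy evaluation for the uniform random policy on a grid."""
    num_states = size * size
    terminals = {0, num_states - 1}
    moves = [(0, 1), (-1, 0), (1, 0), (0, -1)]  # right, up, down, left
    values = [0.0] * num_states                  # k = 0: everything is zero

    for _ in range(num_sweeps):
        new_values = [0.0] * num_states
        for state in range(num_states):
            if state in terminals:
                continue  # terminal values stay at 0
            row, col = divmod(state, size)
            total = 0.0
            for d_row, d_col in moves:
                next_row, next_col = row + d_row, col + d_col
                # Bumping into a wall leaves the agent where it was
                if not (0 <= next_row < size and 0 <= next_col < size):
                    next_row, next_col = row, col
                next_state = next_row * size + next_col
                total += 0.25 * (reward + gamma * values[next_state])
            new_values[state] = total
        values = new_values  # use only the previous sweep's values
    return values


values_k1 = evaluate_random_policy(1)
values_k2 = evaluate_random_policy(2)
values_k3 = evaluate_random_policy(3)
```

Each sweep reads only the previous sweep's table, which matches the k to k+1 calculations above.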

Page 19: go-go-green-wing-mighty-morphing-materials-in-aircraft …

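Evaluation and improvement alternate, starting from $\pi$, $V$ and converging to $\pi_*$, $V_*$:

Evaluation: $\pi \rightarrow v_\pi$

Improvement: $\pi \rightarrow greedy(V)$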
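A compact sketch of the evaluation and improvement cycle (policy iteration) on the same 4x4 gridworld. Several choices here are assumptions made for illustration: states are numbered 0 to 15 with terminals at 0 and 15, the starting policy always moves right, evaluation runs a fixed 100 sweeps, and a discount factor of 0.9 keeps the values of looping policies finite.

```python
SIZE = 4
NUM_STATES = SIZE * SIZE
TERMINALS = {0, NUM_STATES - 1}
MOVES = {"right": (0, 1), "up": (-1, 0), "down": (1, 0), "left": (0, -1)}


def step(state, move):
    """Deterministic transition; moving off the grid leaves the state unchanged."""
    row, col = divmod(state, SIZE)
    d_row, d_col = MOVES[move]
    next_row, next_col = row + d_row, col + d_col
    if 0 <= next_row < SIZE and 0 <= next_col < SIZE:
        return next_row * SIZE + next_col
    return state


def evaluate(policy, gamma, sweeps=100):
    """Evaluation: approximate v_pi for a deterministic policy with repeated sweeps."""
    values = [0.0] * NUM_STATES
    for _ in range(sweeps):
        values = [
            0.0 if s in TERMINALS else -1.0 + gamma * values[step(s, policy[s])]
            for s in range(NUM_STATES)
        ]
    return values


def improve(values, gamma):
    """Improvement: act greedily with respect to the current value estimates."""
    return [
        max(MOVES, key=lambda m: -1.0 + gamma * values[step(s, m)])
        for s in range(NUM_STATES)
    ]


def policy_iteration(gamma=0.9, max_iterations=50):
    policy = ["right"] * NUM_STATES
    values = evaluate(policy, gamma)
    for _ in range(max_iterations):
        new_policy = improve(values, gamma)
        if new_policy == policy:
            break  # greedy policy no longer changes
        policy = new_policy
        values = evaluate(policy, gamma)
    return policy, values


best_policy, best_values = policy_iteration()
```

The loop stops once the greedy policy stops changing, at which point every non-terminal state heads for its nearest terminal corner.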

Page 20: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 21: go-go-green-wing-mighty-morphing-materials-in-aircraft …

https://github.com/rlcode/reinforcement-learning

Page 22: go-go-green-wing-mighty-morphing-materials-in-aircraft …

• Suitable for medium-sized problems of up to a few million states.

Page 23: go-go-green-wing-mighty-morphing-materials-in-aircraft …

… …

Page 24: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 25: go-go-green-wing-mighty-morphing-materials-in-aircraft …

TD(1)

TD(2)

Page 26: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 27: go-go-green-wing-mighty-morphing-materials-in-aircraft …

Page 28: go-go-green-wing-mighty-morphing-materials-in-aircraft …

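[Backup diagram: (s, a), then reward r, then s', then a max over next actions a']

$$Q(S, A) \leftarrow Q(S, A) + \alpha \big(R + \gamma \max_{a'} Q(S', a') - Q(S, A)\big)$$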
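The update rule translates almost directly into a tabular implementation. The sketch below assumes a dictionary-backed Q-table keyed by (state, action); the state names, action names and hyperparameter values are hypothetical.

```python
from collections import defaultdict


def q_learning_update(q_table, state, action, reward, next_state, actions,
                      alpha=0.1, gamma=0.99):
    """Apply one Q-learning update in place and return the new Q(state, action)."""
    # Bootstrap from the best action available in the next state
    best_next = max(q_table[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])
    return q_table[(state, action)]


# Hypothetical transition: Q(s1, right) is already 2.0, everything else is 0
q_table = defaultdict(float)
q_table[("s1", "right")] = 2.0
updated = q_learning_update(
    q_table, "s0", "right", reward=1.0, next_state="s1",
    actions=["left", "right"], alpha=0.5, gamma=0.9,
)
```

Here the target is 1.0 + 0.9 × 2.0 = 2.8, and with a step size of 0.5 the estimate moves halfway from 0 towards it.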

Page 29: go-go-green-wing-mighty-morphing-materials-in-aircraft …

Rocket Lander Demo: https://github.com/dbatalov/reinforcement-learning

Grid World Demo: https://github.com/rlcode/reinforcement-learning

Page 30: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 31: go-go-green-wing-mighty-morphing-materials-in-aircraft …

Check this link for proof of the theorem:

https://en.wikipedia.org/wiki/Universal_approximation_theorem

David Silver

Page 32: go-go-green-wing-mighty-morphing-materials-in-aircraft …

https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf

Page 33: go-go-green-wing-mighty-morphing-materials-in-aircraft …

https://web.stanford.edu/class/psych209/Readings/MnihEtAlHassibis15NatureControlDeepRL.pdf

• DQN Agent achieves >75% of the human score in 29 out of 49 games

• DQN Agent beats human score (>100%) in 22 games

$$Score\% = \frac{Agent\ Score - Random\ Play\ Score}{Human\ Score - Random\ Play\ Score} \times 100$$
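The normalization is simple enough to check by hand with a short helper. The function name and the sample scores are made up for illustration and are not results from the paper.

```python
def normalized_score(agent_score, random_score, human_score):
    """Agent score as a percentage of the gap between random play and a human."""
    return 100.0 * (agent_score - random_score) / (human_score - random_score)


# Hypothetical scores: halfway between random play (0) and human (100)
halfway = normalized_score(agent_score=50.0, random_score=0.0, human_score=100.0)

# Hypothetical scores: the agent is twice as far above random play as the human
superhuman = normalized_score(agent_score=30.0, random_score=10.0, human_score=20.0)
```

A value of 0 means the agent does no better than random play, 100 matches the human score, and anything above 100 beats it.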

Page 34: go-go-green-wing-mighty-morphing-materials-in-aircraft …

https://github.com/apache/incubator-mxnet/tree/master/example/reinforcement-learning/dqn

Page 35: go-go-green-wing-mighty-morphing-materials-in-aircraft …

import mxnet as mx


def dqn_sym_nature(action_num, data=None, name='dqn'):
    """Structure of the Deep Q Network in the Nature 2015 paper

    Human-level control through deep reinforcement learning
    (http://www.nature.com/nature/journal/v518/n7540/full/nature14236.html)
    """
    # Use the caller's symbol if one is given, otherwise create the input variable
    if data is None:
        net = mx.symbol.Variable('data')
    else:
        net = data
    # Three convolutional layers, each followed by a ReLU
    net = mx.symbol.Convolution(data=net, name='conv1', kernel=(8, 8), stride=(4, 4), num_filter=32)
    net = mx.symbol.Activation(data=net, name='relu1', act_type="relu")
    net = mx.symbol.Convolution(data=net, name='conv2', kernel=(4, 4), stride=(2, 2), num_filter=64)
    net = mx.symbol.Activation(data=net, name='relu2', act_type="relu")
    net = mx.symbol.Convolution(data=net, name='conv3', kernel=(3, 3), stride=(1, 1), num_filter=64)
    net = mx.symbol.Activation(data=net, name='relu3', act_type="relu")
    # Flatten, then a 512-unit hidden layer and one output per action
    net = mx.symbol.Flatten(data=net)
    net = mx.symbol.FullyConnected(data=net, name='fc4', num_hidden=512)
    net = mx.symbol.Activation(data=net, name='relu4', act_type="relu")
    net = mx.symbol.FullyConnected(data=net, name='fc5', num_hidden=action_num)
    # 'DQNOutput' is a custom operator and must be registered before this symbol is built
    net = mx.symbol.Custom(data=net, name=name, op_type='DQNOutput')
    return net

Page 36: go-go-green-wing-mighty-morphing-materials-in-aircraft …

from mxnet import gluon

# num_action (the number of discrete actions) is assumed to be defined earlier
DQN = gluon.nn.Sequential()
with DQN.name_scope():
    # first layer
    DQN.add(gluon.nn.Conv2D(channels=32, kernel_size=8, strides=4, padding=0))
    DQN.add(gluon.nn.BatchNorm(axis=1, momentum=0.1, center=True))
    DQN.add(gluon.nn.Activation('relu'))
    # second layer
    DQN.add(gluon.nn.Conv2D(channels=64, kernel_size=4, strides=2))
    DQN.add(gluon.nn.BatchNorm(axis=1, momentum=0.1, center=True))
    DQN.add(gluon.nn.Activation('relu'))
    # third layer
    DQN.add(gluon.nn.Conv2D(channels=64, kernel_size=3, strides=1))
    DQN.add(gluon.nn.BatchNorm(axis=1, momentum=0.1, center=True))
    DQN.add(gluon.nn.Activation('relu'))
    DQN.add(gluon.nn.Flatten())
    # fourth layer
    DQN.add(gluon.nn.Dense(512, activation='relu'))
    # fifth layer
    DQN.add(gluon.nn.Dense(num_action, activation='relu'))

Page 37: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 38: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 39: go-go-green-wing-mighty-morphing-materials-in-aircraft …

Page 40: go-go-green-wing-mighty-morphing-materials-in-aircraft …

The fastest, most powerful GPU instances in the cloud

• Up to eight NVIDIA Tesla V100 GPUs

• 1 PetaFLOPs of computational performance, 14x better than P2

• 300 GB/s GPU-to-GPU communication (NVLink), 9X better than P2

• 16GB GPU memory with 900 GB/sec peak GPU memory bandwidth

Page 41: go-go-green-wing-mighty-morphing-materials-in-aircraft …

• Get started quickly with easy-to-launch tutorials

• Hassle-free setup and configuration

• Pay only for what you use: no additional charge for the AMI

• Accelerate your model training and deployment

• Support for popular deep learning frameworks

Page 42: go-go-green-wing-mighty-morphing-materials-in-aircraft …

Build, train, and deploy machine learning models at scale

• End-to-End Machine Learning Platform

• Zero setup

• Flexible Model Training

• Pay by the second

Page 43: go-go-green-wing-mighty-morphing-materials-in-aircraft …

• Lots of companies doing Machine Learning

• Unable to unlock business potential

• Lack ML expertise

Amazon ML Lab provides the missing ML expertise: Brainstorming, Modeling, Teaching

Leverage Amazon experts with decades of ML experience with technologies like Amazon Echo, Amazon Alexa, Prime Air and Amazon Go

Page 44: go-go-green-wing-mighty-morphing-materials-in-aircraft …
Page 45: go-go-green-wing-mighty-morphing-materials-in-aircraft …

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.