© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved.S U M M I T
AWS DeepRacer workshop
Amazon Web Services Japan K.K., Solutions Architect
Self-introduction
志村誠
Solutions Architect
• Covers data analytics and machine learning services
• Favorite services
  • Amazon Athena
  • AWS Glue
  • And Amazon SageMaker
AWS Manga, Episode 10: Taking on the AWS DeepRacer League at AWS Summit!
AWS Manga: https://aws.amazon.com/jp/campaigns/manga
Agenda
• AWS DeepRacer overview
• Reinforcement learning
• Simulator
• AWS DeepRacer architecture details
• DeepRacer League
• Using the AWS DeepRacer console
This material describes the service as of May 30, 2019. For the latest information, see the official AWS website (http://aws.amazon.com).
AWS DeepRacer
A service that puts reinforcement learning in the hands of every developer
What is AWS DeepRacer?
A 1/18 scale autonomous race car
A simulator for training and evaluation
A global racing league
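What it takes to make DeepRacer drive
[Figure: camera views labeled with driving actions such as "straight" and "left"]
• If you could register the right driving action for every possible camera view from the car, the car could drive the course.
• In practice there are countless possible views, so registering them all is itself impractical.
Introducing reinforcement learning
[Diagram labels: agent, environment, action, goal, model, state]
• Train a model that decides the action from the camera image.
• The agent tries various actions (driving) in the environment (the course) and learns to reach the goal.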
Reinforcement learning
Where reinforcement learning fits
Reinforcement learning, supervised learning, unsupervised learning
Machine learning overview
Supervised learning
Every training example needs a corresponding label
Unsupervised learning
Training data needs no labels
Reinforcement learning
Learns from a sequence of actions in a specific environment
Reinforcement learning in the real world
Reward good behavior
Give no reward for bad behavior
Reinforcement learning terms
Agent, environment, state
Action, episode, reward
The reward function
In reinforcement learning, the reward function, which gives incentives for particular behaviors, is central
The reward function for racing
[Grid: the agent starts at S; the goal cell G has a reward of 2]
Incentivizing the car to follow the centerline
Reward per cell:
0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
S     2     2     2     2     2     2     G = 2
0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
Discounted value per cell:
8.6   9.5   8.5   7.5   6.3   5.0   3.5   1.9
S     10.4  9.4   8.2   6.9   5.4   3.8   G = 2
8.6   9.5   8.5   7.5   6.3   5.0   3.5   1.9
Discount per step: 0.9
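The numbers in the second grid are consistent with a discounted sum of rewards. The sketch below is one reading of the slide, not a stated algorithm: it assumes the car moves one cell toward G per step along the centerline, that a cell's value is its own reward plus 0.9 times the value of the next cell, and that an edge cell steps onto the centerline cell in the same column. The function names are hypothetical.

```python
DISCOUNT = 0.9          # discount per step, from the slide
CENTER_REWARD = 2.0     # reward on each centerline cell, including G
EDGE_REWARD = 0.1       # reward on each edge cell


def centerline_values(num_cells, reward=CENTER_REWARD, discount=DISCOUNT):
    """Values of centerline cells, ordered from G back toward S."""
    values = [reward]  # at G only the final reward remains
    for _ in range(num_cells - 1):
        values.append(reward + discount * values[-1])
    return values


def edge_values(center, reward=EDGE_REWARD, discount=DISCOUNT):
    """Value of an edge cell that steps onto the centerline cell beside it."""
    return [reward + discount * v for v in center]


center = centerline_values(7)   # G plus the six cells marked 2
edge = edge_values(center)
print([round(v, 1) for v in center])
print([round(v, 1) for v in edge])
```

Read from G back toward S, the rounded results are 2.0, 3.8, 5.4, 6.9, 8.2, 9.4, 10.4 for the centerline and 1.9, 3.5, 5.0, 6.3, 7.5, 8.5, 9.5 for the edge. These match the slide except for the leftmost edge cell (8.6), whose transition the slide does not spell out.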
How does learning happen?
Value function
Policy function
Reinforcement learning algorithms: vanilla policy gradient
* Image source: landscape image is CC0 1.0 public domain
[Figure: the objective J(θ) drawn as a landscape; each update moves to new weights, e.g. 0.4 ± 𝛿 and 0.3 ± 𝛿]
AWS DeepRacer neural network architecture
Input: state (image)
Output: action
Amazon SageMaker Reinforcement Learning
• Reinforcement learning on SageMaker, integrated with simulation environments for games and robots
• Supports Coach and Ray RLlib as reinforcement learning toolkits
• Simulators such as AWS RoboMaker can be used through the open-source OpenAI Gym interface
• Distributed training and parallel simulation are possible
[Diagram: the agent (container for agent; RL toolkit: Coach, RLlib) learns a policy and acts on it; the environment simulator (container for environment; OpenAI Gym, simulator…, AWS RoboMaker) returns observations and rewards. Redis also appears in the diagram.]
Reinforcement learning vs. other algorithms in DeepRacer
Supervised learning (behavioral cloning)
• An expert driver drives a real car fitted with a camera.
• The camera images and the driver's actions are recorded and used to train a model.
Result: given a state (image), the model decides the driving action.
Reinforcement learning
• A virtual agent acts repeatedly in a simulated environment and accumulates experience (input image, action, next state, reward).
• The experience is used for training, and the trained model is used to gather more experience.
Result: given a state (image), the model decides the driving action.
AWS DeepRacer simulation architecture
[Diagram labels: AWS Cloud, AWS DeepRacer, NAT gateway, VPC, models, simulation video, metrics]
AWS DeepRacer console workflow
Configuring the action space
• Defined as combinations of speed and steering angle
• Granularity can be set for finer control
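A minimal sketch of how such an action space could be enumerated. The parameter names (`max_speed`, `speed_levels`, `max_steering`, `steering_levels`) and the even spacing are assumptions for illustration, not the console's documented behavior.

```python
from itertools import product


def build_action_space(max_speed, speed_levels, max_steering, steering_levels):
    """Enumerate (steering_angle, speed) pairs on an evenly spaced grid.

    Assumption: speeds are evenly spaced in (0, max_speed] and steering
    angles are evenly spaced in [-max_steering, +max_steering].
    """
    if speed_levels < 1 or steering_levels < 1:
        raise ValueError("granularity must be at least 1")
    speeds = [max_speed * (i + 1) / speed_levels for i in range(speed_levels)]
    if steering_levels == 1:
        angles = [0.0]
    else:
        step = 2 * max_steering / (steering_levels - 1)
        angles = [-max_steering + i * step for i in range(steering_levels)]
    return [
        {"index": i, "steering_angle": angle, "speed": speed}
        for i, (angle, speed) in enumerate(product(angles, speeds))
    ]


actions = build_action_space(max_speed=3.0, speed_levels=2,
                             max_steering=30.0, steering_levels=5)
for action in actions:
    print(action)
```

With two speed levels and five steering levels this yields ten discrete actions. A finer granularity gives finer control at the cost of more actions for the model to choose among.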
Implementing the reward function
Track components
Centerline
Track wall
Track surface (also called on-track)
Field (also called off-track)
Track boundaries
Coordinate system and track waypoints
Outer boundary waypoints
Track center waypoints
Inner boundary waypoints
X
Y
Track width
Car direction
Hyperparameters control the training algorithm
AWS DeepRacer architecture details
AWS DeepRacer specifications
CAR: 18th scale 4WD with monster truck chassis
CPU: Intel Atom Processor
MEMORY: 4 GB RAM
STORAGE: 32 GB (expandable)
WI-FI: 802.11ac
CAMERA: 4 MP camera with MJPEG
DRIVE BATTERY: 1000 mAh lithium polymer
COMPUTE BATTERY: 13600 mAh USB-C
SENSORS: Integrated accelerometer and gyroscope
PORTS: 4x USB-A, 1x USB-C, 1x Micro-USB, 1x HDMI
SOFTWARE: Ubuntu OS 16.04.3 LTS, Intel OpenVINO toolkit, ROS Kinetic
AWS DeepRacer software architecture
[Diagram labels: stored file, ROS nodes, camera, media engine, video (M-JPEG), web server (video), model, model optimizer, optimized model, inference engine, inference results, navigation node, autonomous drive, manual drive, web server, publisher, control node, servo & motor]
Simulation-to-real domain transfer
Why moving from simulation to the real world is hard
• The model is trained on simulated images, but the physical car uses real-world images.
• Simulating the real environment perfectly is also difficult.
Strategies
• Environment control: narrow the gap between the simulated and real-world environments
• Add random elements to the environment
• Modularize and abstract the model
AWS DeepRacer League
The world's first global autonomous racing league
www.deepracerleague.com
Virtual Circuit
• Access the AWS DeepRacer service
• Train a model
• Submit the model to a race held in the Virtual Circuit
Summit Circuit
• Cars and tracks are provided at AWS Summit
• Bring your own model or train one in a workshop
• Race, get your name on the leaderboard, and make history
How to join the Virtual Circuit
Want to learn more about reinforcement learning?
• Knowledge of reinforcement learning is essential for ranking high on the leaderboard.
• Training content on reinforcement learning and AWS DeepRacer is available.
• The content is free, takes 90 minutes, and consists of six self-paced parts.
https://www.aws.training/learningobject/wbc?id=32143
AWS DeepRacer: training and certification
https://www.aws.training/learningobject/wbc?id=35393
Using the AWS DeepRacer console
https://github.com/aws-samples/aws-deepracer-workshops/blob/master/Workshops/2019-AWSSummits-AWSDeepRacerService/Lab1/Readme-Japanese.md
http://bit.ly/deepracer-wsjp
AWS DeepRacer workshop labs
Build your AWS DeepRacer reinforcement learning model!
Get hands on with AWS DeepRacer & compete in the AWS DeepRacer League
DeClercq Wentzel
Senior Product Manager
Amazon Web Services
<<YOUR WORKSHOP CODE>>
Agenda
• AWS DeepRacer origin
• RL for the Sunday driver
• Virtual simulator
• Rubber meets the road
• Under the hood
How can we put machine learning in the hands of all developers? Literally.
AWS DeepRacer: An exciting way for developers to get hands-on experience with machine learning
1/18 scale autonomous race car
Virtual simulator, to train and evaluate
Global racing league
AWS DeepRacer League, race for prizes and glory
The world’s first global, autonomous racing league
www.deepracerleague.com
Keen on setting up a race in your company? Please reach out
AWS DeepRacer problem formulation
STATE
Reinforcement learning in the broader AI context
Reinforcement learning
Supervised learning
Unsupervised learning
Machine learning overview
SUPERVISED UNSUPERVISED REINFORCEMENT
Reinforcement learning in the real world
Reward positive behavior
Don’t reward negative behavior
The result!
Reinforcement learning terms
AGENT ENVIRONMENT STATE
ACTION EPISODE REWARD
The reward function
The reward function incentivizes particular behaviors and is at the core of reinforcement learning
The reward function in a race grid
[Grid: the AGENT starts at S; the GOAL cell G has a reward of 2]
Incentivizing centerline behavior
Reward per cell:
0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
S     2     2     2     2     2     2     G = 2
0.1   0.1   0.1   0.1   0.1   0.1   0.1   0.1
Discounted value per cell:
8.6   9.5   8.5   7.5   6.3   5.0   3.5   1.9
S     10.4  9.4   8.2   6.9   5.4   3.8   G = 2
8.6   9.5   8.5   7.5   6.3   5.0   3.5   1.9
Discount per step: 0.9
How does learning happen?
VALUE FUNCTION
POLICY FUNCTION
RL algorithms: Vanilla policy gradient
* Image Source: Landscape image is CC0 1.0 public domain
• Data is only used once
• High variance of rewards
• Magnitude of update could be too large
[Figure: the objective J(θ) drawn as a landscape; each update moves to new weights, e.g. 0.4 ± 𝛿 and 0.3 ± 𝛿]
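To make the update concrete, here is a toy sketch, not the DeepRacer training code: a softmax policy over two actions with fixed rewards, updated by gradient ascent on J(θ), the expected reward. It uses the exact gradient rather than sampled episodes, so it sidesteps the reward-variance issue; the learning rate plays the role of the update magnitude. All names and numbers are illustrative assumptions.

```python
import math


def softmax(theta):
    """Convert action preferences into probabilities."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def expected_reward(theta, rewards):
    """J(theta): expected reward under the softmax policy."""
    return sum(p * r for p, r in zip(softmax(theta), rewards))


def policy_gradient_step(theta, rewards, learning_rate):
    """One gradient-ascent step; dJ/dtheta_a = pi(a) * (r(a) - J)."""
    probs = softmax(theta)
    j = sum(p * r for p, r in zip(probs, rewards))
    grad = [p * (r - j) for p, r in zip(probs, rewards)]
    return [t + learning_rate * g for t, g in zip(theta, grad)]


rewards = [0.1, 2.0]   # hypothetical: off-center vs. centerline reward
theta = [0.0, 0.0]     # start with equal preference for both actions
for _ in range(200):
    theta = policy_gradient_step(theta, rewards, learning_rate=0.5)
print(softmax(theta), expected_reward(theta, rewards))
```

After the updates, the policy puts most of its probability on the higher-reward action. With sampled episodes instead of the exact gradient, each step would be noisier, which is where the variance and step-size concerns come from.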
AWS DeepRacer Neural Network Architecture
Input - state (image)
Output - action
RL vs. other approaches for robotic racing
METHOD Supervised learning
HOW IT WORKS An expert driver controls a real-world car that has a camera. Save the camera images as inputs and the corresponding driving actions (speed and steering angle) as outputs, then train a model.
RESULT Provide a state (image) to the model and receive a driving action
METHOD Reinforcement learning
HOW IT WORKS A virtual agent repeatedly interacts with a simulated environment and logs experience (image, action, new state, reward). The experience is used to train a model, and the new model is used to gather more experience.
RESULT Provide a state (image) to the model and receive a driving action
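The experience loop described for reinforcement learning can be sketched as follows. This is a toy illustration: `ToyTrack`, its `reset`/`step` methods, and the random policy are hypothetical stand-ins for the simulator and the model, loosely following the reset/step convention of Gym-style environments.

```python
import random
from collections import namedtuple

# One logged interaction, mirroring (image, action, new state, reward).
Experience = namedtuple("Experience", "state action next_state reward")


class ToyTrack:
    """Hypothetical 1-D track: the state is the car's position, 0..length."""

    def __init__(self, length=5):
        self.length = length
        self.position = 0

    def reset(self):
        self.position = 0
        return self.position

    def step(self, action):
        """action is 0 (stay) or 1 (advance); returns (state, reward, done)."""
        self.position = min(self.position + action, self.length)
        done = self.position == self.length
        reward = 1.0 if action == 1 else 0.0
        return self.position, reward, done


def collect_episode(env, policy, max_steps=100):
    """Run one episode and log every interaction as an Experience."""
    log = []
    state = env.reset()
    for _ in range(max_steps):
        action = policy(state)
        next_state, reward, done = env.step(action)
        log.append(Experience(state, action, next_state, reward))
        state = next_state
        if done:
            break
    return log


rng = random.Random(0)
episode = collect_episode(ToyTrack(), lambda state: rng.choice([0, 1]))
print(len(episode), sum(e.reward for e in episode))
```

In training, logs like `episode` would feed a model update, and the updated model would replace the policy for the next round of collection.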
Lab 0 – AWS DeepRacer service resource creation
OBJECTIVE Set up your account resources to get you to the races!
TIME 5 min.
1. Find the lab content here:
https://github.com/aws-samples/aws-deepracer-workshops/
2. Navigate to:
Workshops/2019-AWSSummits-AWSDeepRacerService/Lab0_Create_resources
AWS Cloud
AWS DeepRacer
NAT gateway
VPC
AWS DeepRacer
Models
Simulation video
Metrics
AWS DeepRacer simulator architecture
AWS DeepRacer console diagram
Programming your own reward function
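As a starting point, a reward function can favor staying near the centerline. The sketch below follows the `reward_function(params)` entry point used by the AWS DeepRacer console and assumes the `params` dictionary carries `track_width` and `distance_from_center`; check the current console documentation for the exact input keys. The band thresholds and reward values are illustrative choices.

```python
def reward_function(params):
    """Reward the car more the closer it stays to the centerline."""
    # Assumed input keys; both are distances in the simulator's units.
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]

    # Three bands around the centerline, as fractions of the track width.
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track

    return float(reward)


print(reward_function({"track_width": 0.6, "distance_from_center": 0.02}))
```

Stepped bands are easy to reason about; a smooth function of the distance would be an equally valid design.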
Track components
TRACK CENTER
TRACK WALL
TRACK SURFACE aka ON-TRACK
FIELD aka OFF-TRACK
TRACK BOUNDARIES
Coordinate system and track waypoints
OUTER BOUNDARY WAYPOINTS
TRACK CENTER WAYPOINTS
INNER BOUNDARY WAYPOINTS
X
Y
TRACK WIDTH
CAR DIRECTION
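The waypoints and the car's direction can be combined to check whether the car is pointing along the track. The sketch assumes waypoints are (x, y) pairs, that `closest_waypoints` holds the indices of the waypoints behind and ahead of the car, and that `heading` is in degrees measured counterclockwise from the X axis; these names and conventions are assumptions to verify against the console documentation. The 10-degree threshold is an illustrative choice.

```python
import math


def track_direction(waypoints, closest_waypoints):
    """Direction of the track segment around the car, in degrees."""
    prev_x, prev_y = waypoints[closest_waypoints[0]]
    next_x, next_y = waypoints[closest_waypoints[1]]
    return math.degrees(math.atan2(next_y - prev_y, next_x - prev_x))


def heading_error(track_deg, heading_deg):
    """Smallest absolute angle between track direction and car heading."""
    diff = abs(track_deg - heading_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


def reward_function(params):
    """Halve the reward when the car points away from the track direction."""
    direction = track_direction(params["waypoints"], params["closest_waypoints"])
    error = heading_error(direction, params["heading"])
    reward = 1.0
    if error > 10.0:  # illustrative threshold in degrees
        reward *= 0.5
    return float(reward)


example = {
    "waypoints": [(0.0, 0.0), (1.0, 0.0), (2.0, 1.0)],
    "closest_waypoints": [1, 2],
    "heading": 40.0,
}
print(reward_function(example))
```

In the example the track segment points at 45 degrees and the car at 40 degrees, so the error stays under the threshold and the full reward is returned.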
Action space
Hyperparameters control the training algorithm
AWS DeepRacer League, race for prizes and glory
The world’s first global, autonomous racing league
www.deepracerleague.com
Submit your model now to race in the Virtual Circuit!
Lab 1 – AWS DeepRacer service
OBJECTIVE Build your first AWS DeepRacer RL model
TIME 50 min.
1. Find the lab content here:
https://github.com/aws-samples/aws-deepracer-workshops/
2. Navigate to: Workshops/2019-AWSSummits-AWSDeepRacerService/Lab1
AWS DeepRacer: Driven by reinforcement learning
Want to learn more?
Learn how to build a reinforcement learning model and find tips and tricks about how to tune those models to climb the League leaderboard in a digital training course for reinforcement learning and AWS DeepRacer.
This 90-minute course is available at no cost, has 6 self-guided chapters, and will help you prepare to compete in the AWS DeepRacer League.
https://www.aws.training/learningobject/wbc?id=32143
AWS DeepRacer car specifications
CAR 18th scale 4WD with monster truck chassis
CPU Intel Atom Processor
MEMORY 4 GB RAM
STORAGE 32 GB (expandable)
WI-FI 802.11ac
CAMERA 4 MP camera with MJPEG
DRIVE BATTERY 1000 mAh lithium polymer
COMPUTE BATTERY 13600 mAh USB-C
SENSORS Integrated accelerometer and gyroscope
PORTS 4x USB-A, 1x USB-C, 1x Micro-USB, 1x HDMI
SOFTWARE Ubuntu OS 16.04.3 LTS, Intel OpenVINO toolkit, ROS Kinetic
ROS msg node
Stored file
ROS nodes
Web Server
Publisher
Model Optimizer
Video M-JPEG
Web ServerVideo
Inference Results
Autonomous Drive
Control Node
Optimized Model
Media engine
Camera
Model
Inference engine
Manual Drive
Navigation Node
Servo & Motor
AWS DeepRacer software architecture
Simulation-to-real domain transfer
SIM-to-REAL CHALLENGE
The model is trained on simulated images, but the race car must drive using the images it captures in the real world
STRATEGIES
Environment control
Domain randomization
Modularity and abstraction
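Domain randomization can be illustrated with a small sketch: each simulated camera frame gets a randomly drawn brightness and noise level so the model does not overfit to one rendering. This is a generic illustration, assuming a grayscale image stored as rows of 0-255 integers; it is not how the DeepRacer simulator is implemented, and the ranges are arbitrary.

```python
import random


def randomize_image(image, rng, brightness_range=(0.7, 1.3), noise_max=10):
    """Return a copy of the image with random brightness and pixel noise."""
    gain = rng.uniform(*brightness_range)  # one gain per frame
    randomized = []
    for row in image:
        new_row = []
        for pixel in row:
            value = pixel * gain + rng.randint(-noise_max, noise_max)
            new_row.append(max(0, min(255, int(round(value)))))
        randomized.append(new_row)
    return randomized


rng = random.Random(42)
frame = [[0, 64, 128], [192, 255, 32]]
variants = [randomize_image(frame, rng) for _ in range(3)]
for variant in variants:
    print(variant)
```

Each call produces a differently lit, slightly noisy version of the same frame, while the original frame is left untouched.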
Thank you!