101
Reinforcement Learning Opensource to Baselines

Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Reinforcement LearningOpensource to Baselines

Page 2: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Member

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

2

● 한국생산기술연구원○ 로보틱스 및 재활로봇 연구

● KAIST 연구원○ IoT 시스템 개발 & 강화학습

연구

● 플랜아이 연구원○ 추천 시스템 & 강화학습 연구

● 한양대학교 석사과정○ 강화학습 시뮬레이션 구축○ 속도 예측 연구

● 호서대학교 로봇 자동화공학과 학부생○ 임베디드, ROS, 제어, 강화학습○ 강화학습을 로봇에 적용하는

연구

Page 3: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Index

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

3

● Reinforcement Learning

● Policy based Reinforcement Learning

● Value based Reinforcement Learning

● Famous open source for Reinforcement Learning

● Why?

● Requirements

● Supported algorithms

● To do List

● How to use

● License

Page 4: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

4

Page 5: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

5

Page 6: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 7: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Probability to select action(a) at state(s)

Policy base Reinforcement Learning

Page 8: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 9: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

State Distribution

Policy base Reinforcement Learning

Page 10: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

State Distribution

Policy

Policy base Reinforcement Learning

Page 11: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

State Distribution

Policy

Reward

Policy base Reinforcement Learning

Page 12: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 13: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 14: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 15: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 16: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 17: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 18: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 19: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

Page 20: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy base Reinforcement Learning

● Policy Gradient

● Advantage Actor Critic

● Natural Policy Gradient

● Trust Region Policy Gradient

● Proximal Policy Gradient

Page 21: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy Gradient

● Policy Gradient Methods for Reinforcement Learning with Function Approximation

○ Policy can be trained by iterable method

○ Policy can be trained by Policy Gradient

Method

● Policy Gradient

○ Optimal Policy can be obtained by Gradient Ascend

Method

● Iterable Method

○ Optimal Policy Can be obtained by iterable method

like deep learning method

Page 22: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Policy Gradient

Page 23: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

● Why advantage is needed?

○ Reduce variance, without changing expectation

○ a

○ laksdfasdf

Advantage Actor Critic

Page 24: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Natural Policy Gradient

● The parameter update can not guarantee

the performance improvement of the actual function

Page 25: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 26: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 27: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 28: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 29: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 30: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 31: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 32: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 33: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 34: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 35: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Trust Region Policy Optimization● Trust Region

Page 36: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 37: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 38: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 39: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 40: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 41: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 42: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 43: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 44: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 45: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 46: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 47: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 48: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 49: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Proximal Policy Optimization● Trust Region

Page 50: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Value base Reinforcement Learning

Page 51: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Value base Reinforcement Learning

Page 52: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

● Playing Atari with Deep Reinforcement Learning(NIPS 2013)

● Human-level control through deep reinforcement learning(Nature 2015)

● Deep Reinforcement Learning with Double Q-Learning

● Dueling Network Architectures for Deep Reinforcement Learning

● Prioritized Experience Replay

● …..

Value base Reinforcement Learning

Page 53: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Dispatch Time : 5 min

Distance : 2 min

Value base Reinforcement Learning

Page 54: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Dispatch Time : 5 min

Distance : 2 min

Value base Reinforcement Learning

Page 55: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

3 min 8 min

Value base Reinforcement Learning

Page 56: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

● A Distributional perspective on Reinforcement Learning(2017)

● Distributional Reinforcement Learning with Quantile Regression(2017)

● Implicit Quantile Networks for Distributional Reinforcement Learning(2018)

Value base Reinforcement Learning

Page 57: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Value base Reinforcement Learning

Page 58: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Value base Reinforcement Learning

Page 59: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Distributional Reinforcement Learningwith Quantile Regrssion

● Histogram cannot meet condition

● Wasserstein Distance

Page 60: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Distributional Reinforcement Learningwith Quantile Regrssion

Page 61: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Implicit Quantile Networkfor Distributional Reinforcement Learning

Page 62: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Implicit Quantile Networkfor Distributional Reinforcement Learning

RandomSampling

Page 63: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Value base Reinforcement Learning

Page 64: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

64

Page 65: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

65

● Deepminds

Page 66: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

66

● Deepminds

● Open AI

Page 67: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

67

● Deepminds

● Open AI

Page 68: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

68

● Deepminds

● Open AI

Page 69: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

69

● Deepminds

● Open AI

Page 70: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

70

● Deepminds

● Open AI

Page 71: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

71

● Deepminds

● Open AI

Page 72: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

72

● Deepminds

○ Dopamine

■ Value based Reinforcement Learning opensource baseline

■ https://github.com/google/dopamine

● Open AI

Page 73: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Famous open source for Reinforcement Learning

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

73

● Deepminds

○ Dopamine

■ Value based Reinforcement Learning open source baseline

■ https://github.com/google/dopamine

● Open AI

○ Baselines

■ Policy based Reinforcement Learning open source baseline

■ https://github.com/openai/baselines

Page 74: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

74

Page 75: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

75

● Hard to Install

Page 76: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

76

● Hard to Install

○ baselines : mp4py, mujoco…

○ dopamine : easy

Page 77: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

77

● Hard to Install

○ baselines : mp4py, mujoco…

○ dopamine : easy

Page 78: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

78

● Hard to Install

○ baselines : mp4py, mujoco…

○ dopamine : easy

● Hard to use to your environment

Page 79: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

79

● Hard to Install

○ baselines : mp4py, mujoco…

○ dopamine : easy

● Hard to use to your environment

○ manipulate the api

○ https://github.com/google/dopamine

○ https://github.com/openai/baselines

Page 80: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Why?

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

80

● Hard to Install

○ baselines : mp4py, mujoco…

○ dopamine : easy

● Hard to use to your environment

○ manipulate the api

○ https://github.com/google/dopamine

○ https://github.com/openai/baselines

● Korean

Page 81: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

81

Page 82: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

82

● Hard to Install

Page 83: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

83

● Easy to Install

Page 84: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

84

● Easy to Install

Page 85: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

85

● Easy to Install

● Hard to use to your environment

Page 86: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

86

● Easy to Install

● Easy to use to your environment

Page 87: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

87

● Easy to Install

● Easy to use to your environment

○ Provide many tutorial code(discrete, continuous action)

Page 88: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

88

● Easy to Install

● Easy to use to your environment

○ Provide many tutorial code(discrete, continuous action)

○ Keep the flow of other reinforcement learning code

Page 89: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

89

● Easy to Install

● Easy to use to your environment

○ Provide many tutorial code(discrete, continuous action)

○ Keep the flow of other reinforcement learning code

Page 90: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Requirements

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

90

● Easy to Install

● Easy to use to your environment

● Korean

Page 91: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Supported algorithms

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

91

Page 92: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Supported algorithms

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

92

● Vanilla Policy Gradient

● Advantage Actor Critic

● Proximal Policy Optimization

● Deep Deterministic Policy Gradient

Page 93: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Supported algorithms

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

93

● Vanilla Policy Gradient(Discrete Action)

● Advantage Actor Critic(Discrete Action)

● Proximal Policy Optimization(Discrete, Continuous Action)

● Deep Deterministic Policy Gradient(Discrete, Continuous Action)

Page 94: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

To do List

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

94

● Vanilla Policy Gradient(Discrete Action)

● Advantage Actor Critic(Discrete Action)

● Proximal Policy Optimization(Discrete, Continuous Action)

● Deep Deterministic Policy Gradient(Discrete, Continuous Action)

Page 95: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

To do List

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

95

● Vanilla Policy Gradient(Discrete Action)

● Advantage Actor Critic(Discrete Action)

● Proximal Policy Optimization(Discrete, Continuous Action)

● Deep Deterministic Policy Gradient(Discrete, Continuous Action)

● Applicate LSTM to Vanilla Policy Gradient, Advantage Actor Critic, Proximal Policy Optimization

● Actor Critic Experience Replay

● Soft actor critic

● Value based Reinforcement Learning

○ DQN, Double DQN, Deuling DQN to Rainbow

○ Distributional RL

Page 96: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

How to use

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

96

Page 97: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

How to use

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

97

Page 98: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

License

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

98

Page 99: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

License

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

99

Free

Page 100: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

100

We are hiring

Page 101: Reinforcement Learning Opensource to Baselines...Korea Advanced Institute of Science Technology (KAIST) Dept. of Civil and Environmental Engineering 3 Reinforcement Learning Policy

Korea Advanced Institute of Science Technology (KAIST)Dept. of Civil and Environmental Engineering

101

Support us in any formhttps://github.com/RLOpensource/tensorflow_RL/blob/master/model.py