AWS DeepRacer

AWS DeepRacer uses reinforcement learning to train a car to drive itself. Two training algorithms are available:

  • SAC (Soft actor critic)
  • PPO (proximal policy optimization)

SAC > It is data efficient but lacks stability. In AWS DeepRacer it works only with a continuous action space.

PPO > It is data hungry but stable. It works with both discrete and continuous action spaces.

action space : the set of choices (actions) available to the agent, for example combinations of steering angle and speed

src: AWS machine learning foundation course
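
As a rough illustration of the definition above, a discrete action space can be written out as a list of steering-angle and speed pairs. This is only a sketch: `build_discrete_action_space` is a hypothetical helper, and the angles and speeds are made-up values, not DeepRacer defaults.

```python
from itertools import product


def build_discrete_action_space(steering_angles, speeds):
    """Return every (steering angle, speed) combination as one action.

    Hypothetical helper for illustration; not part of any AWS library.
    """
    return [
        {"steering_angle": angle, "speed": speed}
        for angle, speed in product(steering_angles, speeds)
    ]


# Illustrative values only: steering in degrees, speed in m/s.
actions = build_discrete_action_space([-30, 0, 30], [1.0, 2.0])

# 3 steering angles x 2 speeds gives 6 choices for the agent.
print(len(actions))
```

A continuous action space would instead give the agent a range for each value (a minimum and maximum steering angle and speed) rather than a fixed list.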

The reward incentivizes (encourages) the car to perform better.

src: AWS machine learning foundation course
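
To make the reward idea concrete, here is a minimal sketch of a reward function that encourages the car to stay near the center of the track. It assumes the DeepRacer convention of a `reward_function(params)` that receives a dictionary with the keys `track_width`, `distance_from_center`, and `all_wheels_on_track`. The thresholds and reward values are arbitrary choices for illustration, not recommended settings.

```python
def reward_function(params):
    """Give a higher reward the closer the car stays to the center line."""
    track_width = params["track_width"]
    distance_from_center = params["distance_from_center"]
    all_wheels_on_track = params["all_wheels_on_track"]

    # Leaving the track earns almost nothing.
    if not all_wheels_on_track:
        return 1e-3

    # Three bands around the center line (arbitrary illustrative thresholds).
    marker_1 = 0.1 * track_width
    marker_2 = 0.25 * track_width
    marker_3 = 0.5 * track_width

    if distance_from_center <= marker_1:
        reward = 1.0
    elif distance_from_center <= marker_2:
        reward = 0.5
    elif distance_from_center <= marker_3:
        reward = 0.1
    else:
        reward = 1e-3  # likely off track or about to be

    return float(reward)
```

During training, the agent tries actions from its action space and gradually prefers the ones that collect more of this reward.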

The longer the car explores during training, the better the result it tends to find.
