AWS DeepRacer

Adarsha Regmi
Jul 22, 2021


AWS DeepRacer is a reinforcement learning platform in which a car learns to drive through trial and error.

DeepRacer training algorithms

  • SAC (Soft Actor-Critic)
  • PPO (Proximal Policy Optimization)

SAC > It is data efficient but less stable. In DeepRacer it works only with a continuous action space.

PPO > It is data hungry but stable. It works with both discrete and continuous action spaces.
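The comparison above can be written down as a small lookup, which is handy when checking a training setup before starting it. This is only a sketch: the table `SUPPORTED_ACTION_SPACES` and the helper `supports` are hypothetical names made up for illustration, not part of any AWS API.

```python
# Hypothetical lookup table that encodes the two statements above.
SUPPORTED_ACTION_SPACES = {
    "SAC": {"continuous"},
    "PPO": {"discrete", "continuous"},
}


def supports(algorithm: str, action_space_type: str) -> bool:
    """Return True if the algorithm can train on the given action space type."""
    return action_space_type in SUPPORTED_ACTION_SPACES[algorithm.upper()]


print(supports("PPO", "discrete"))  # PPO handles both kinds
print(supports("SAC", "discrete"))  # SAC is continuous only
```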

Vocabulary to know

action space: the set of choices (actions) available to the agent
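To make the term concrete, here is a minimal sketch of the two kinds of action space, assuming an action is a pair of steering angle and speed. The helper names `discrete_action_space` and `sample_continuous_action`, and the example angles and speeds, are assumptions chosen for illustration.

```python
import random
from itertools import product


def discrete_action_space(steering_angles, speeds):
    """Discrete: a fixed list of (steering angle, speed) combinations."""
    return [
        {"steering_angle": angle, "speed": speed}
        for angle, speed in product(steering_angles, speeds)
    ]


def sample_continuous_action(steering_range, speed_range, rng):
    """Continuous: any value inside a (low, high) range may be chosen."""
    return {
        "steering_angle": rng.uniform(*steering_range),
        "speed": rng.uniform(*speed_range),
    }


# Three steering angles times two speeds gives six discrete choices.
actions = discrete_action_space([-30, 0, 30], [1.0, 2.0])

# One action drawn from continuous ranges (values are illustrative).
sample = sample_continuous_action((-30.0, 30.0), (1.0, 2.0), random.Random(0))
```

In the discrete case the agent picks one entry from the list; in the continuous case it picks any value inside the ranges.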

Reward planning

src: AWS machine learning foundation course

The reward incentivizes (encourages) the car to perform better.
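As an illustration, a reward function could pay more the closer the car stays to the center line. The sketch below assumes the function receives a `params` dictionary with the keys `all_wheels_on_track`, `track_width` and `distance_from_center`, which is my understanding of the DeepRacer input parameters. The linear scoring itself is just one possible choice, not the course's method.

```python
def reward_function(params):
    """Reward staying near the center line (illustrative sketch)."""
    # Give a near-zero reward when the car has left the track.
    if not params["all_wheels_on_track"]:
        return 1e-3

    half_width = 0.5 * params["track_width"]
    distance_from_center = params["distance_from_center"]

    # 1.0 on the center line, falling linearly to 0.0 at the track edge.
    closeness = max(0.0, 1.0 - distance_from_center / half_width)

    # Keep the reward strictly positive and return a plain float.
    return float(max(closeness, 1e-3))
```

A car exactly on the center line gets the full reward, and a car halfway to the edge gets half of it.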

src: AWS course foundation

The longer the car explores, the better the result it tends to find.
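The idea can be seen in a toy experiment. The sketch below is plain random search on a made-up one-dimensional reward, not DeepRacer training: it tries random actions and remembers the best reward seen so far. The function `best_after` and the toy reward are assumptions for illustration only.

```python
import random


def best_after(n_steps, seed=0):
    """Best toy reward found after n_steps random tries."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    best = float("-inf")
    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0)  # a randomly explored action
        score = 1.0 - x * x         # toy reward, highest at x = 0
        best = max(best, score)
    return best


for steps in (10, 100, 1000):
    print(steps, best_after(steps))
```

With the same seed, a longer run repeats the shorter run's tries and then keeps going, so its best reward can only stay the same or improve.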