Cleanrl

Latest version: v1.2.0

| Walker2DBulletEnv-v0 | 567.61 ± 15.01 | 2177.57 ± 65.49 | 1377.68 ± 51.96 |
| HalfCheetahBulletEnv-v0 | 2847.63 ± 212.31 | 2537.34 ± 347.20 | 2347.64 ± 51.56 |
| AntBulletEnv-v0 | 2094.62 ± 952.21 | 3253.93 ± 106.96 | 1775.50 ± 50.19 |
| HopperBulletEnv-v0 | 1262.70 ± 424.95 | 2271.89 ± 24.26 | 2311.20 ± 45.28 |
| HumanoidBulletEnv-v0 | -54.45 ± 13.99 | 937.37 ± 161.05 | 204.47 ± 1.00 |
| BipedalWalker-v3 | 66.01 ± 127.82 | 78.91 ± 232.51 | 272.08 ± 10.29 |
| LunarLanderContinuous-v2 | 162.96 ± 65.60 | 281.88 ± 0.91 | 215.27 ± 10.17 |
| Pendulum-v0 | -238.65 ± 14.13 | -345.29 ± 47.40 | -1255.62 ± 28.37 |
| MountainCarContinuous-v0 | -1.01 ± 0.01 | -1.12 ± 0.12 | 93.89 ± 0.06 |

Other Results

| gym_id | ppo | dqn |
|:---------------|:---------------|:----------------|
| CartPole-v1 | 500.00 ± 0.00 | 182.93 ± 47.82 |
| Acrobot-v1 | -80.10 ± 6.77 | -81.50 ± 4.72 |

My personal thanks to everyone who participated in the monthly dev cycle and, in particular, to dosssman, who implemented SAC with discrete action spaces.

Additional improvements include:

- Support gym.wrappers.Monitor to automatically record the agent's performance at certain episodes (default is 1, 2, 9, 28, 65, ... 1000, 2000, 3000) and integrate with wandb (so cool, see screenshot below). 4
- Use the same replay buffer from minimalRL for DQN and SAC. 5
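
The default recording episodes listed above look like gym's capped cubic video schedule: record at perfect cubes during the first 1000 episodes, then once every 1000 episodes. The following is a minimal standalone sketch of that rule, not CleanRL's own code. The function name `capped_cubic_schedule` is hypothetical, and episode ids are assumed to start at 0, so the early entries come out one lower than the 1-based numbers quoted above.

```python
def capped_cubic_schedule(episode_id: int) -> bool:
    """Return True if the episode with this 0-based id should be recorded.

    Illustrative re-implementation of a capped cubic schedule; the name
    and the 0-based indexing are assumptions made for this sketch.
    """
    if episode_id < 1000:
        # Record when episode_id is a perfect cube: 0, 1, 8, 27, 64, ...
        return round(episode_id ** (1.0 / 3)) ** 3 == episode_id
    # After the first 1000 episodes, record every 1000th one.
    return episode_id % 1000 == 0


# List the episodes that would be recorded in the first 3000 episodes.
recorded = [i for i in range(3001) if capped_cubic_schedule(i)]
print(recorded)
```

Recording gets sparser as training goes on, which keeps the number of uploaded videos small even for long runs.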

https://app.wandb.ai/cleanrl/cleanrl.benchmark

![image](https://user-images.githubusercontent.com/5555347/72108416-8f46ff00-3301-11ea-91d7-04c611f28ee7.png)

| PongNoFrameskip-v4 | 19.06 ± 0.83 | 18.00 ± 0.00 | 19.78 ± 0.22 | 20.72 ± 0.28 |
| BreakoutNoFrameskip-v4 | 364.97 ± 58.36 | 386.10 ± 21.77 | 353.39 ± 30.61 | 380.67 ± 35.29 |

MuJoCo Results

| gym_id | ddpg_continuous_action | td3_continuous_action | ppo_continuous_action |
|:--------------------|:-------------------------|:------------------------|:------------------------|
| Reacher-v2 | -6.25 ± 0.54 | -6.65 ± 0.04 | -7.86 ± 1.47 |
| Pusher-v2 | -44.84 ± 5.54 | -59.69 ± 3.84 | -44.10 ± 6.49 |
| Thrower-v2 | -137.18 ± 47.98 | -80.75 ± 12.92 | -58.76 ± 1.42 |
| Striker-v2 | -193.43 ± 27.22 | -269.63 ± 22.14 | -112.03 ± 9.43 |
| HalfCheetah-v2 | 10386.46 ± 265.09 | 9265.25 ± 1290.73 | 1717.42 ± 20.25 |
| Hopper-v2 | 1128.75 ± 9.61 | 3095.89 ± 590.92 | 2276.30 ± 418.94 |
| Swimmer-v2 | 114.93 ± 29.09 | 103.89 ± 30.72 | 111.74 ± 7.06 |
| Walker2d-v2 | 1946.23 ± 223.65 | 3059.69 ± 1014.05 | 3142.06 ± 1041.17 |
| Ant-v2 | 243.25 ± 129.70 | 5586.91 ± 476.27 | 2785.98 ± 1265.03 |
| Humanoid-v2 | 877.90 ± 3.46 | 6342.99 ± 247.26 | 786.83 ± 95.66 |

PyBullet Results

| gym_id | ddpg_continuous_action | td3_continuous_action | ppo_continuous_action |
|:-----------------------------------|:-------------------------|:------------------------|:------------------------|
| MinitaurBulletEnv-v0 | -0.17 ± 0.02 | 7.73 ± 5.13 | 23.20 ± 2.23 |
| MinitaurBulletDuckEnv-v0 | -0.31 ± 0.03 | 0.88 ± 0.34 | 11.09 ± 1.50 |
| InvertedPendulumBulletEnv-v0 | 742.22 ± 47.33 | 1000.00 ± 0.00 | 1000.00 ± 0.00 |
