SheepRL

Latest version: v0.5.7

0.4.5post0

* Fixed MineDojo and Dreamer's player in #148

0.4.5

* Added a new how-to explaining how to add a new custom environment in #128
* Added the possibility to completely disable metric logging and to decide what and how to log in every algorithm in #129
* Fixed the model creation of the Dreamer-V3 agent: the bias is now removed from every linear layer that is followed by a LayerNorm and an activation function (see the sketch after this list)
* Added the possibility for users to specify their own custom configs, possibly inheriting from the already defined SheepRL configs, in #132
* Added support for [Lightning 2.1](https://github.com/Lightning-AI/lightning/releases/tag/2.1.0) in #136
* Added the possibility to evaluate every agent given a checkpoint in #139 and #141
* Various minor fixes in #125, #133, #134, #135, #137, #140, #143, #144, #145, and #146
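
As a minimal sketch of the bias removal above (an illustrative block, not SheepRL's actual Dreamer-V3 code): when a linear layer feeds directly into a `LayerNorm`, the normalization subtracts the mean and then applies its own learned shift, so the linear bias is redundant and can be dropped.

```python
import torch
from torch import nn

# Illustrative dense block in the spirit of the Dreamer-V3 fix above:
# a Linear followed by LayerNorm needs no bias, because LayerNorm first
# removes the mean (cancelling any constant shift) and then adds its own
# learned bias after normalization.
class DenseBlock(nn.Module):
    def __init__(self, in_features: int, out_features: int) -> None:
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias=False)  # bias removed
        self.norm = nn.LayerNorm(out_features)
        self.act = nn.SiLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.act(self.norm(self.linear(x)))
```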

0.4.4

* Fixed the activation function in the recurrent model of Dreamer-V1 in #110
* Updated the Diambra wrapper to support the new Diambra package in #111
* Added `dotdict` to speed up access to the loaded config in #112 (see the first sketch after this list)
* Improved the naming of the output directories created by Hydra in #114
* Added the `validate_args` flag to decide whether `torch.distributions` must validate the arguments passed to `__init__`; disabling it yields a huge speedup (#116, second sketch after this list)
* Updated the Diambra wrapper to support `AsyncVectorEnv` in #119
* Minor fixes in #120
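
For illustration, here is a minimal dot-accessible dict in the spirit of the `dotdict` mentioned above (a hypothetical sketch, not SheepRL's actual implementation): converting the Hydra-loaded config once into plain dicts makes every later attribute-style lookup a cheap dict access.

```python
# Hypothetical sketch of a dot-accessible dict (not SheepRL's actual code):
# plain-dict lookups are much cheaper than repeated DictConfig access,
# so converting the loaded config once pays off in hot training loops.
class dotdict(dict):
    """A dict whose items are also reachable as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError as err:
            raise AttributeError(name) from err

    __setattr__ = dict.__setitem__
    __delattr__ = dict.__delitem__


def to_dotdict(obj):
    """Recursively convert nested dicts into dotdicts."""
    if isinstance(obj, dict):
        return dotdict({key: to_dotdict(value) for key, value in obj.items()})
    return obj


cfg = to_dotdict({"algo": {"name": "ppo", "lr": 3e-4}})
assert cfg.algo.lr == 3e-4
```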
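
And a short sketch of the `validate_args` switch: `torch.distributions` validates constructor arguments by default, which is costly when distributions are rebuilt at every training step, so it can be turned off per instance or globally.

```python
import torch
from torch.distributions import Distribution, Normal

# Argument validation can be disabled for a single distribution...
dist = Normal(torch.zeros(8), torch.ones(8), validate_args=False)

# ...or globally for all torch.distributions classes:
Distribution.set_default_validate_args(False)
```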

0.4.3

In this release we have:

* Fixed the resetting of the actions based on the done flag in the recurrent PPO implementation
* Updated the documentation

0.4.2

In this release we have:

* Refactored the recurrent PPO implementation (see the first sketch after this list). In particular:
  * A single LSTM model is used, taking as input the current observation, the previously played action, and the previous recurrent state, i.e., `LSTM([o_t, a_{t-1}], h_{t-1})`. The LSTM has an optional pre-MLP and post-MLP, which can be configured in the `algo/ppo_recurrent.yaml` config
  * A feature extractor is used to extract features from the observations, whether they are vectors or images
* Every PPO algorithm now computes the bootstrapped value and adds it to the current reward whenever an environment has been truncated (second sketch below)
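
A small sketch of the recurrent core described above (hypothetical dimensions and names, not SheepRL's actual module): the observation and the previous action are concatenated along the feature dimension before entering the LSTM.

```python
import torch
from torch import nn

# Illustrative recurrent core: LSTM([o_t, a_{t-1}], h_{t-1}).
obs_dim, act_dim, hidden_dim = 16, 4, 64  # assumed sizes
lstm = nn.LSTM(input_size=obs_dim + act_dim, hidden_size=hidden_dim)

seq_len, batch = 1, 8
o_t = torch.randn(seq_len, batch, obs_dim)    # current observation
a_tm1 = torch.randn(seq_len, batch, act_dim)  # previously played action
h_tm1 = (
    torch.zeros(1, batch, hidden_dim),  # previous hidden state
    torch.zeros(1, batch, hidden_dim),  # previous cell state
)

out, h_t = lstm(torch.cat([o_t, a_tm1], dim=-1), h_tm1)
```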
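
And a generic sketch of the truncation bootstrap (assumed names, not the exact SheepRL code): when an episode is cut off by a time limit rather than a true terminal state, the critic's value of the next observation is folded back into the reward, so the advantage estimator does not treat the cutoff as the end of the return.

```python
import torch

# Hypothetical helper: bootstrap truncated (not terminated) environments by
# adding the discounted value of the next observation to the reward.
# `critic` is an assumed callable mapping observations to values.
def bootstrap_truncated(rewards, truncated, next_obs, critic, gamma=0.99):
    with torch.no_grad():
        next_values = critic(next_obs).squeeze(-1)
    return rewards + gamma * next_values * truncated.float()
```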

0.4.0

In this release we have:

* Made the whole framework single-entry-point, i.e., one can now run an experiment simply with `python sheeprl.py exp=... env=...`, removing the need to prepend `lightning run model ... sheeprl.py` every time. The Fabric-related configs can be found and changed under the `sheeprl/configs/fabric/` folder (#97)
* Unified the `make_env` and `make_dict_env` methods, so there is no longer a distinction between the two. We now assume that the environment's observation space is a `gymnasium.spaces.Dict`; if it is not, an exception is raised (#96; see the sketch after this list)
* Implemented `resume_from_checkpoint` for every algorithm (#95)
* Added the Crafter environment (#103)
* Fixed some environments, in particular Diambra and DMC:
  * Diambra: renamed the wrapper implementation file; the done flag now checks whether `info["env_done"]` is `True` (#98)
  * DMC: removed `env.frame_skip=0` for MuJoCo envs and removed the action repeat from the DMC wrapper (#99)
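
As a hedged sketch of the unified environment contract above (assumed helper name, not SheepRL's actual code): any environment whose observation space is not a `gymnasium.spaces.Dict` is rejected up front.

```python
import gymnasium as gym

# Hypothetical check mirroring the unified make_env contract described
# above: the observation space must be a gymnasium.spaces.Dict.
def check_dict_obs_space(env: gym.Env) -> gym.Env:
    if not isinstance(env.observation_space, gym.spaces.Dict):
        raise RuntimeError(
            "Expected a gymnasium.spaces.Dict observation space, "
            f"got {type(env.observation_space).__name__}"
        )
    return env
```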
