Over the last few years, the volunteer team behind Gym and Gymnasium has worked to fix bugs, improve the documentation, add new features, and change the API where appropriate such that the benefits outweigh the costs. This is the first alpha release of `v1.0.0`, which aims to be the end of this road of API changes while also containing many new features and improved documentation.
To install v1.0.0a1, you must use `pip install gymnasium==1.0.0a1` or `pip install --pre gymnasium`; otherwise, `v0.29.1` will be installed. Similarly, the website will default to v0.29.1's documentation, which can be changed with the pop-up in the bottom right.
We are very interested in projects testing these v1.0.0 alphas to find any bugs, missing documentation, or issues with the API changes before we release v1.0 in full.
## Removing the plugin system
Within Gym v0.23+ and Gymnasium v0.26 to v0.29, an undocumented feature for registering external environments behind the scenes has been removed. Users of [Atari (ALE)](https://github.com/Farama-Foundation/Arcade-Learning-Environment), [Minigrid](https://github.com/farama-Foundation/minigrid) or [HighwayEnv](https://github.com/Farama-Foundation/HighwayEnv) could previously use the following code:
```python
import gymnasium as gym

env = gym.make("ALE/Pong-v5")
```
such that despite Atari never being imported (i.e., `import ale_py`), users could still load an Atari environment. This feature has been removed in v1.0.0, which requires users to update to the following:
```python
import gymnasium as gym
import ale_py

gym.register_envs(ale_py)  # optional
env = gym.make("ALE/Pong-v5")
```
Alternatively, users can do the following, where the `ale_py` prefix within the environment ID will import the module:
```python
import gymnasium as gym

env = gym.make("ale_py:ALE/Pong-v5")  # `module_name:env_id`
```
For users with IDEs (e.g., VSCode, PyCharm), `import ale_py` can cause the IDE (and pre-commit tools such as isort, black, or flake8) to believe that the import statement does nothing. Therefore, we have introduced `gymnasium.register_envs` as a no-op function (the function literally does nothing) to make the IDE believe that something is happening and that the import statement is required.
Note: `ale-py`, Minigrid, and HighwayEnv must be updated to work with Gymnasium v1.0.0, which we hope to complete for all affected projects by alpha 2.
## Vector environments
To increase the sample speed of an environment, vectorizing is one of the easiest ways to sample multiple instances of the same environment simultaneously. Gym and Gymnasium provide the `VectorEnv` as a base class for this, but one of its issues has been that it inherited `Env`. This can cause particular issues with type checking (the return type of `step` is different for `Env` and `VectorEnv`), testing the environment type (`isinstance(env, Env)` can be true for vector environments despite the two acting differently), and finally wrappers (some Gym and Gymnasium wrappers supported vector environments, but there was no clear or consistent API for determining which did or didn't). Therefore, we have separated `Env` and `VectorEnv` so that they no longer inherit from each other.
In implementing the new separate `VectorEnv` class, we have tried to minimize the difference between code using `Env` and `VectorEnv` along with making it more generic in places. The class contains the same attributes and methods as `Env` along with `num_envs: int`, `single_action_space: gymnasium.Space` and `single_observation_space: gymnasium.Space`. Additionally, we have removed several functions from `VectorEnv` that are not needed for all vector implementations: `step_async`, `step_wait`, `reset_async`, `reset_wait`, `call_async` and `call_wait`. This change now allows users to write their own custom vector environments; v1.0.0a1 includes an example vector CartPole environment that runs thousands of times faster than using Gymnasium's Sync vector environment.
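To make the shape of such a custom vector environment concrete, here is a minimal sketch written directly against the new `VectorEnv` class. The `RandomWalkVectorEnv` name and its random-walk dynamics are invented for illustration; only `reset`, `step`, and the space attributes come from the `VectorEnv` definition described above.

```python
import numpy as np

import gymnasium as gym
from gymnasium.vector import VectorEnv
from gymnasium.vector.utils import batch_space


class RandomWalkVectorEnv(VectorEnv):
    """Toy vectorized random walk: each sub-environment tracks a scalar position."""

    def __init__(self, num_envs: int = 8):
        super().__init__()
        self.num_envs = num_envs
        self.single_observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(1,), dtype=np.float32)
        self.single_action_space = gym.spaces.Discrete(2)
        # Batched spaces describing the stacked observations / actions
        self.observation_space = batch_space(self.single_observation_space, num_envs)
        self.action_space = batch_space(self.single_action_space, num_envs)

    def reset(self, *, seed=None, options=None):
        self.positions = np.zeros((self.num_envs, 1), dtype=np.float32)
        return self.positions.copy(), {}

    def step(self, actions):
        # Action 1 moves right, action 0 moves left, for every sub-environment at once
        self.positions += np.where(np.asarray(actions)[:, None] == 1, 1.0, -1.0).astype(np.float32)
        rewards = -np.abs(self.positions[:, 0])
        terminations = np.abs(self.positions[:, 0]) >= 10.0
        truncations = np.zeros(self.num_envs, dtype=bool)
        return self.positions.copy(), rewards, terminations, truncations, {}
```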
To allow users to create vectorized environments easily, we provide `gymnasium.make_vec` as a vectorized equivalent of `gymnasium.make`. As there are multiple different vectorization options ("sync", "async", and a custom class referred to as "vector_entry_point"), the argument `vectorization_mode` selects how the environment is vectorized. This defaults to `None` such that if the environment has a vector entry point for a custom vector environment implementation, this will be utilized first (currently, CartPole is the only environment with a vector entry point built into Gymnasium). Otherwise, the synchronous vectorizer is used (previously, the Gym and Gymnasium `vector.make` used the asynchronous vectorizer by default). For more information, see the function [docstring](https://gymnasium.farama.org/main/api/registry/#gymnasium.make_vec).
```python
env = gym.make("CartPole-v1")
env = gym.wrappers.ClipReward(env, min_reward=-1, max_reward=3)

envs = gym.make_vec("CartPole-v1", num_envs=3)
envs = gym.wrappers.vector.ClipReward(envs, min_reward=-1, max_reward=3)
```
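For completeness, a short sketch of selecting each mode explicitly; CartPole works for all three here because it is the one environment with a built-in vector entry point, and the mode strings are assumed to match the names above.

```python
envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="async")
envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="vector_entry_point")
```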
Due to this split of `Env` and `VectorEnv`, there are now `Env`-only wrappers and `VectorEnv`-only wrappers in `gymnasium.wrappers` and `gymnasium.wrappers.vector` respectively. Furthermore, we updated the names of the base vector wrappers from `VectorEnvWrapper` to `VectorWrapper` and added `VectorObservationWrapper`, `VectorRewardWrapper` and `VectorActionWrapper` classes. See the [vector wrapper](https://gymnasium.farama.org/main/api/vector/wrappers/) page for more information.
To increase the efficiency of vector environments, autoreset is a common feature that allows sub-environments to reset without requiring all sub-environments to finish before resetting them all. Previously in Gym and Gymnasium, auto-resetting was done on the same step as the environment episode ended, such that the final observation and info would be stored in the step's info, i.e., `info["final_observation"]` and `info["final_info"]`, while the standard obs and info contained the sub-environment's reset observation and info. This required sampling code for vectorized environments like the following:
```python
replay_buffer = []
obs, _ = envs.reset()
for _ in range(total_timesteps):
    next_obs, rewards, terminations, truncations, infos = envs.step(envs.action_space.sample())

    for j in range(envs.num_envs):
        if not (terminations[j] or truncations[j]):
            replay_buffer.append((
                obs[j], rewards[j], terminations[j], truncations[j], next_obs[j]
            ))
        else:
            replay_buffer.append((
                obs[j], rewards[j], terminations[j], truncations[j], infos["final_observation"][j]
            ))

    obs = next_obs
```
However, over time, the development team has recognized the inefficiency of this approach (primarily due to the extensive use of a Python dictionary) and the annoyance of having to extract the final observation to train agents correctly, for [example](https://github.com/vwxyzjn/cleanrl/blob/master/cleanrl/dqn.py#L200). Therefore, in v1.0.0, we are modifying autoreset to align with specialized vector-only projects like [EnvPool](https://github.com/sail-sg/envpool) and [SampleFactory](https://github.com/alex-petrenko/sample-factory) such that a sub-environment doesn't reset until the next step. As a result, this requires the following changes when sampling (the same masking approach extends to environments with more complex observation and action spaces):
```python
replay_buffer = []
obs, _ = envs.reset()
autoreset = np.zeros(envs.num_envs)
for _ in range(total_timesteps):
    next_obs, rewards, terminations, truncations, _ = envs.step(envs.action_space.sample())

    for j in range(envs.num_envs):
        if not autoreset[j]:
            replay_buffer.append((
                obs[j], rewards[j], terminations[j], truncations[j], next_obs[j]
            ))

    obs = next_obs
    autoreset = np.logical_or(terminations, truncations)
```
Finally, we have improved the `AsyncVectorEnv.set_attr` and `SyncVectorEnv.set_attr` functions to use `Wrapper.set_wrapper_attr`, allowing users to set a variable anywhere in the environment stack if it already exists. Previously, this was not possible; users could only modify the variable in the "top" wrapper on the environment stack, and importantly not the actual environment itself.
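As a sketch of what this enables: `gravity` below is an attribute of CartPole's base environment rather than of any wrapper, and `set_attr` can now reach it through each sub-environment's wrapper stack.

```python
import gymnasium as gym

envs = gym.make_vec("CartPole-v1", num_envs=3, vectorization_mode="sync")
# `gravity` is defined on the base CartPole environment, beneath any wrappers;
# set_attr finds the layer where it already exists and sets it there
envs.set_attr("gravity", [9.8, 9.8, 5.0])
print(envs.get_attr("gravity"))  # 9.8, 9.8 and 5.0 for the three sub-environments
```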
## Wrappers
Previously, some wrappers could support both environments and vector environments; however, this was not standardized, and it was unclear which wrappers did and didn't support vector environments. For v1.0.0, with `Env` and `VectorEnv` separated to no longer inherit from each other (read more in the vector section), the wrappers in `gymnasium.wrappers` will only support standard environments, while `gymnasium.wrappers.vector` contains the provided specialized vector wrappers (most but not all wrappers are supported; please raise a feature request if you require one).
In v0.29, we deprecated the `Wrapper.__getattr__` function, to be replaced by `Wrapper.get_wrapper_attr`, providing access to variables anywhere in the environment stack. In v1.0.0, we have added `Wrapper.set_wrapper_attr` as an equivalent function for setting a variable anywhere in the environment stack if it already exists; otherwise, the variable is set in the top wrapper (or the environment).
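For example, a short sketch using CartPole, whose `gravity` attribute lives on the base environment beneath the wrappers:

```python
import gymnasium as gym

env = gym.wrappers.ClipReward(gym.make("CartPole-v1"), min_reward=-1, max_reward=1)
print(env.get_wrapper_attr("gravity"))  # reads through the wrapper stack: 9.8
env.set_wrapper_attr("gravity", 5.0)    # sets it on the layer where it already exists
```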
Most significantly, we have removed, renamed, and added several wrappers listed below.
* Removed wrappers
- `monitoring.VideoRecorder` - The replacement wrapper is `RecordVideo`
- `StepAPICompatibility` - We expect all Gymnasium environments to use the terminated / truncated step API; therefore, users shouldn't need the `StepAPICompatibility` wrapper. [Shimmy](https://shimmy.farama.org/) includes compatibility environments to convert Gym-API environments for Gymnasium.
* Renamed wrappers (we wished to make wrapper naming consistent; therefore, we have removed "Wrapper" from all wrapper names and included "Observation", "Action" and "Reward" within wrapper names where appropriate)
- `AutoResetWrapper` -> `Autoreset`
- `FrameStack` -> `FrameStackObservation`
- `PixelObservationWrapper` -> `AddRenderObservation`
* Moved wrappers (All vector wrappers are in `gymnasium.wrappers.vector`)
- `VectorListInfo` -> `vector.DictInfoToList`
* Added wrappers
- `DelayObservation` - Adds a delay to the next observation
- `DtypeObservation` - Modifies the dtype of an environment’s observation space
- `MaxAndSkipObservation` - Skips `n` observations and maxes over the last two observations, inspired by the Atari environment heuristic, for use with other environments
- `StickyAction` - Randomly repeats actions with a probability for a step, returning the final observation and the sum of rewards over the steps. Inspired by Atari environment heuristics
- `JaxToNumpy` - Converts a Jax-based environment to use Numpy-based input and output data for `reset`, `step`, etc
- `JaxToTorch` - Converts a Jax-based environment to use PyTorch-based input and output data for `reset`, `step`, etc
- `NumpyToTorch` - Converts a NumPy-based environment to use PyTorch-based input and output data for `reset`, `step`, etc (see the sketch after this list)
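To illustrate the conversion wrappers, a minimal sketch with `NumpyToTorch`, assuming PyTorch is installed:

```python
import torch
import gymnasium as gym

env = gym.wrappers.NumpyToTorch(gym.make("CartPole-v1"), device="cpu")
obs, info = env.reset()  # obs is now a torch.Tensor on the requested device
assert isinstance(obs, torch.Tensor)
obs, reward, terminated, truncated, info = env.step(torch.tensor(1))
```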
For all wrappers, we have added example code to the documentation and a changelog to help future researchers understand any changes made. See the following [page](https://gymnasium.farama.org/main/api/wrappers/misc_wrappers/#gymnasium.wrappers.TimeLimit) for an example.
## Functional environments
One of the substantial advantages of Gymnasium's `Env` is that it generally requires minimal information about the underlying environment; however, this can make applying such environments to planning, search algorithms, and theoretical investigations more difficult. We are proposing `FuncEnv` as an alternative definition to `Env`, closer to a Markov decision process definition, exposing more functions to the user, including the observation, reward, and termination functions, along with the environment's raw state as a single object.
```python
from typing import Any

import gymnasium as gym
from gymnasium.functional import StateType, ObsType, ActType, RewardType, TerminalType, Params


class ExampleFuncEnv(gym.functional.FuncEnv):
    def initial(self, rng: Any, params: Params | None = None) -> StateType:
        ...

    def transition(self, state: StateType, action: ActType, rng: Any, params: Params | None = None) -> StateType:
        ...

    def observation(self, state: StateType, params: Params | None = None) -> ObsType:
        ...

    def reward(
        self, state: StateType, action: ActType, next_state: StateType, params: Params | None = None
    ) -> RewardType:
        ...

    def terminal(self, state: StateType, params: Params | None = None) -> TerminalType:
        ...
```
`FuncEnv` requires the `initial` and `transition` functions to return a new state given their inputs, as a partial implementation of `Env.reset` and `Env.step`. As a result, users can sample (and save) the next state for a range of inputs to use with planning, searching, etc. Given a state, `observation`, `reward`, and `terminal` provide users explicit definitions to understand how each can affect the environment's output.
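As a concrete (if contrived) sketch of these definitions, here is a counter MDP implemented as a `FuncEnv`. The `CounterFuncEnv` name and its dynamics are invented for illustration, with signatures following the class above.

```python
from typing import Any

import gymnasium as gym
from gymnasium.functional import Params


class CounterFuncEnv(gym.functional.FuncEnv):
    """A counter that actions increment (1) or decrement (0); terminal at 10."""

    def initial(self, rng: Any, params: Params | None = None) -> int:
        return 0  # every episode starts with the counter at zero

    def transition(self, state: int, action: int, rng: Any, params: Params | None = None) -> int:
        return state + 1 if action == 1 else state - 1

    def observation(self, state: int, params: Params | None = None) -> int:
        return state  # fully observable: the observation is the raw state

    def reward(self, state: int, action: int, next_state: int, params: Params | None = None) -> float:
        return 1.0 if next_state == 10 else 0.0

    def terminal(self, state: int, params: Params | None = None) -> bool:
        return state >= 10


env = CounterFuncEnv()
state = env.initial(rng=None)  # this toy example is deterministic, so no rng is needed
next_state = env.transition(state, action=1, rng=None)
print(env.observation(next_state), env.reward(state, 1, next_state), env.terminal(next_state))
```

Because each function takes the state explicitly, intermediate states can be saved, branched, and replayed, which is what makes this definition amenable to planning and search.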
## Additional bug fixes
* Limit the cython version for `gymnasium[mujoco-py]` due to cython==3 issues by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/616)
* Fix `MuJoCo` environment type issues by Kallinteris-Andreas (https://github.com/Farama-Foundation/Gymnasium/pull/612)
* Fix mujoco rendering with custom width values by logan-dunbar (https://github.com/Farama-Foundation/Gymnasium/pull/634)
* Fix environment checker to correctly report infinite bounds by chrisyeh96 (https://github.com/Farama-Foundation/Gymnasium/pull/708)
* Fix type hint for `register(kwargs)` from `**kwargs` to `kwargs: dict | None = None` by younik (https://github.com/Farama-Foundation/Gymnasium/pull/788)
* Fix `CartPoleVectorEnv` step counter to be set back to zero on `reset` by TimSchneider42 (https://github.com/Farama-Foundation/Gymnasium/pull/886)
* Fix registration for async vector environment for custom environments by RedTachyon (https://github.com/Farama-Foundation/Gymnasium/pull/810)
## Additional new features
* New MuJoCo v5 environments (the changes and performance graphs will be included in a separate blog post) by Kallinteris-Andreas (https://github.com/Farama-Foundation/Gymnasium/pull/572)
* Add support in MuJoCo human rendering for changing the size of the viewing window by logan-dunbar (https://github.com/Farama-Foundation/Gymnasium/pull/635)
* Add more control in MuJoCo rendering over offscreen dimensions and scene geometries by guyazran (https://github.com/Farama-Foundation/Gymnasium/pull/731)
* Add support to handle `NamedTuples` in `JaxToNumpy`, `JaxToTorch` and `NumpyToTorch` by RogerJL (https://github.com/Farama-Foundation/Gymnasium/pull/789) and pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/811)
* Add `padding_type` parameter to `FrameStackObservation` to select the padding observation by jamartinh (https://github.com/Farama-Foundation/Gymnasium/pull/830)
* Add render check to `check_environments_match` by Kallinteris-Andreas (https://github.com/Farama-Foundation/Gymnasium/pull/748)
## Deprecation
* Remove unnecessary error classes in error.py by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/801)
* Stop exporting MuJoCo v2 environment classes from `gymnasium.envs.mujoco` by Kallinteris-Andreas (https://github.com/Farama-Foundation/Gymnasium/pull/827)
* Remove deprecation warning from PlayPlot by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/800)
## Documentation changes
* Updated the custom environment tutorial for v1.0.0 by kir0ul (https://github.com/Farama-Foundation/Gymnasium/pull/709)
* Add swig to installation instructions for Box2D by btjanaka (https://github.com/Farama-Foundation/Gymnasium/pull/683)
* Add a tutorial on loading custom quadruped robot environments using the `Gymnasium/MuJoCo/Ant-v5` framework by Kallinteris-Andreas (https://github.com/Farama-Foundation/Gymnasium/pull/838)
* Add a third-party tutorial page to list tutorials written and hosted on other websites by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/867)
* Add more introductory pages by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/791)
* Add figures for each MuJoCo environment representing their action space by Kallinteris-Andreas (https://github.com/Farama-Foundation/Gymnasium/pull/762)
* Fix the documentation on blackjack's starting state by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/893)
* Fix the documentation on FrozenLake and CliffWalking's position by PierreCounathe (https://github.com/Farama-Foundation/Gymnasium/pull/695)
* Update the classic control environment's `__init__` and `reset` arguments by pseudo-rnd-thoughts (https://github.com/Farama-Foundation/Gymnasium/pull/898)
**Full Changelog**: https://github.com/Farama-Foundation/Gymnasium/compare/v0.29.0...v1.0.0a1