DI-engine



0.7.0

New Repo
1. [GoBigger](https://github.com/opendilab/GoBigger): OpenDILab Multi-Agent Decision Intelligence Environment
2. [GoBigger-Challenge-2021](https://github.com/opendilab/GoBigger-Challenge-2021): Basic code and description for the GoBigger Challenge 2021

**Contributors: PaParaZz1 puyuan1996 Will-Nie YinminZhang Weiyuhong-1998 LikeJulia sailxjx davide97l jayyoung0802 lichuminglcm yifan123 RobinC94 zjowowen**

0.5.0

Env
1. add tabmwp env (667)
2. polish anytrading env issues (731)

Algorithm
1. add PromptPG algorithm (667)
2. add Plan Diffuser algorithm (700) (749)
3. add new pipeline implementation of IMPALA algorithm (713)
4. add dropout layers to DQN-style algorithms (712)

Enhancement
1. add new pipeline agent for SAC/DDPG/A2C/PPO and Hugging Face support (637) (730) (737)
2. add more unittest cases for model (728)
3. add collector logging in new pipeline (735)

Fix
1. fix logger middleware problems (715)
2. fix ppo parallel bug (709)
3. fix typo in optimizer_helper.py (726)
4. fix mlp dropout if condition bug
5. fix drex collecting data unittest bugs

Style
1. polish env manager/wrapper comments and API doc (742)
2. polish model comments and API doc (722) (729) (734) (736) (741)
3. polish policy comments and API doc (732)
4. polish rl_utils comments and API doc (724)
5. polish torch_utils comments and API doc (738)
6. update README.md and Colab demo (733)
7. update metaworld docker image

News
1. NeurIPS 2023 Spotlight: [LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios](https://github.com/opendilab/LightZero)
2. OpenDILab + Hugging Face DRL Model Zoo [link](https://huggingface.co/OpenDILabCommunity)


**Full Changelog**: https://github.com/opendilab/DI-engine/compare/v0.4.9...v0.5.0


**Contributors: PaParaZz1 zjowowen AltmanD puyuan1996 kxzxvbk Super1ce nighood Cloud-Pku zhangpaipai ruoyuGao eltociear**

0.4.9

API Change
1. refactor the implementation of Decision Transformer; DI-engine now supports both discrete and continuous DT outputs with multi-modal observations (example: `ding/example/dt.py`)
2. Update the multi-GPU Distributed Data Parallel (DDP) example ([link](https://github.com/opendilab/DI-engine/blob/main/dizoo/atari/config/serial/spaceinvaders/spaceinvaders_dqn_config_multi_gpu_ddp.py))
3. Change the return value of `InteractionSerialEvaluator`, simplifying redundant results

Env
1. add cliffwalking env (677)
2. add lunarlander ppo config and example

Algorithm
1. add BCQ offline RL algorithm (640)
2. add Dreamerv3 model-based RL algorithm (652)
3. add tensor stream merge network tools (673)
4. add scatter connection model (680)
5. refactor Decision Transformer in new pipeline and support img input and discrete output (693)
6. add three variants of Bilinear classes and a FiLM class (703)

Enhancement
1. polish offpolicy RL multi-gpu DDP training (679)
2. add middleware for Ape-X distributed pipeline (696)
3. add example for evaluating trained DQN (706)

Fix
1. fix to_ndarray fails to assign dtype for scalars (708)
2. fix evaluator return episode_info compatibility bug
3. fix cql example entry wrong config bug
4. fix enable_save_figure env interface
5. fix redundant env info bug in evaluator
6. fix to_item unittest bug

Style
1. polish and simplify requirements (672)
2. add Hugging Face Model Zoo badge (674)
3. add openxlab Model Zoo badge (675)
4. fix py37 macos ci bug and update default pytorch from 1.7.1 to 1.12.1 (678)
5. fix mujoco-py compatibility issue for cython<3 (711)
6. fix type spelling error (704)
7. fix PyPI release actions Ubuntu 18.04 bug
8. update contact information (e.g. WeChat)
9. polish algorithm doc tables

New Repo
1. [DOS](https://github.com/opendilab/DOS): [CVPR 2023] ReasonNet: End-to-End Driving with Temporal and Global Reasoning

**Full Changelog**: https://github.com/opendilab/DI-engine/compare/v0.4.8...v0.4.9


**Contributors: PaParaZz1 zjowowen zhangpaipai AltmanD puyuan1996 Cloud-Pku Super1ce kxzxvbk jayyoung0802 Mossforest lxl2gf Privilger**

0.4.8

API Change
1. `stop_value` is no longer a required field in the config; it defaults to `math.inf`. Users can instead specify `max_env_step` or `max_train_iter` in the training entry to run the program with a fixed termination condition.
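The resulting termination rule can be sketched with a self-contained toy loop; the step counts and the evaluator-reward stand-in below are illustrative only, not DI-engine code:

```python
import math

def run_training(stop_value=math.inf, max_env_step=math.inf, max_train_iter=math.inf):
    """Toy loop mirroring the termination rule: stop on whichever of
    stop_value / max_env_step / max_train_iter is reached first."""
    env_step, train_iter = 0, 0
    while True:
        env_step += 8                     # pretend one collect step yields 8 env frames
        train_iter += 1
        eval_reward = float(train_iter)   # stand-in for the evaluator's reward
        if eval_reward >= stop_value:     # defaults to math.inf, so it never fires unless set
            return "stop_value", env_step, train_iter
        if env_step >= max_env_step:
            return "max_env_step", env_step, train_iter
        if train_iter >= max_train_iter:
            return "max_train_iter", env_step, train_iter
```

With no `stop_value` given, the loop runs until one of the explicit budgets is hit, e.g. `run_training(max_train_iter=10)` stops after exactly 10 iterations.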

Env
1. fix gym hybrid reward dtype bug (664)
2. fix atari env id noframeskip bug (655)
3. fix typo in gym any_trading env (654)
4. update td3bc d4rl config (659)
5. polish bipedalwalker config

Algorithm
1. add EDAC offline RL algorithm (639)
2. add LN and GN norm_type support in ResBlock (660)
3. add normal value norm baseline for PPOF (658)
4. polish last layer init/norm in MLP (650)
5. polish TD3 monitor variable


Enhancement
1. add MAPPO/MASAC task example (661)
2. add PPO example for complex env observation (644)
3. add barrier middleware (570)

Fix
1. fix abnormal collector log and add record_random_collect option (662)
2. fix to_item compatibility bug (646)
3. fix trainer dtype transform compatibility bug
4. fix pettingzoo 1.23.0 compatibility bug
5. fix ensemble head unittest bug

Style
1. fix incompatible gym version bug in Dockerfile.env (653)
2. add more algorithm [docs](https://di-engine-docs.readthedocs.io/en/latest/12_policies/index.html)

New Repo
1. [LightZero](https://github.com/opendilab/LightZero): A lightweight and efficient MCTS/AlphaZero/MuZero algorithm toolkit.

**Full Changelog**: https://github.com/opendilab/DI-engine/compare/v0.4.7...v0.4.8

**Contributors: PaParaZz1 zjowowen puyuan1996 SolenoidWGT Super1ce karroyan zhangpaipai eltociear**

0.4.7

API Change
1. remove the requirement of sub-fields (learn/collect/eval) in the policy config (users can define their own config formats)
2. use `wandb` as the default logger in task pipeline
3. remove `value_network` config field and implementations in SAC and related algorithms

Env
1. add dmc2gym env support and baseline (451)
2. update pettingzoo to the latest version (597)
3. polish icm/rnd+onppo config bugs and add app_door_to_key env (564)
4. add lunarlander continuous TD3/SAC config
5. polish lunarlander discrete C51 config

Algorithm
1. add Procedure Cloning (PC) imitation learning algorithm (514)
2. add Munchausen Reinforcement Learning (MDQN) algorithm (590)
3. add reward/value norm methods: popart & value rescale & symlog (605)
4. polish reward model config and training pipeline (624)
5. add PPOF reward space demo support (608)
6. add PPOF Atari demo support (589)
7. polish dqn default config and env examples (611)
8. polish comment and clean code about SAC
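Of the norm methods added in item 3 above, symlog is the simplest to state: it squashes a target as sign(x) · ln(|x| + 1), with symexp as its exact inverse. A stdlib-only scalar sketch (the DI-engine version operates on tensors):

```python
import math

def symlog(x: float) -> float:
    # sign(x) * ln(|x| + 1): compresses large-magnitude rewards/values
    # while staying close to the identity near zero.
    return math.copysign(math.log(abs(x) + 1.0), x)

def symexp(x: float) -> float:
    # Inverse of symlog: sign(x) * (exp(|x|) - 1).
    return math.copysign(math.exp(abs(x)) - 1.0, x)
```

Because the transform is symmetric and invertible, predictions can be made in symlog space and mapped back with symexp without clipping rewards.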

Enhancement
1. add language model (e.g. GPT) training utils (625)
2. remove policy cfg sub fields requirements (620)
3. add full wandb support (579)

Fix
1. fix confusing shallow copy operation about next_obs (641)
2. fix unsqueeze action_args in PDQN when shape is 1 (599)
3. fix evaluator return_info tensor type bug (592)
4. fix deque buffer wrapper PER bug (586)
5. fix reward model save method compatibility bug
6. fix logger assertion and unittest bug
7. fix bfs test py3.9 compatibility bug
8. fix zergling collector unittest bug

Style
1. add DI-engine torch-rpc p2p communication docker (628)
2. add D4RL docker (591)
3. correct typo in task (617)
4. correct typo in time_helper (602)
5. polish readme and add treetensor example
6. update contributing doc

New Plan
- **Call for contributors about DI-engine** (621)
<div align="center">
<img width="300px" height="auto" src="https://user-images.githubusercontent.com/33195032/226930216-b191c457-85ba-48d5-ae0f-c7ed7b46e9c1.png">
</div>



**Full Changelog**: https://github.com/opendilab/DI-engine/compare/v0.4.6...v0.4.7

**Contributors: PaParaZz1 karroyan zjowowen ruoyuGao kxzxvbk nighood song2181 SolenoidWGT PSHarold jimmydengpeng eltociear**

0.4.6

API Change
1. middleware: `CkptSaver(cfg, policy, train_freq=100)` -> `CkptSaver(policy, cfg.exp_name, train_freq=100)`
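The change can be illustrated with a skeletal stand-in (not the real middleware, which lives in DI-engine's framework package): the experiment name is now passed explicitly instead of being read from the whole `cfg` object, and the experiment name below is a hypothetical example.

```python
class CkptSaver:
    """Skeletal stand-in showing only the new argument order."""

    def __init__(self, policy, exp_name, train_freq=100):
        self.policy = policy
        self.exp_name = exp_name      # previously derived from the cfg object
        self.train_freq = train_freq

# old call style: CkptSaver(cfg, policy, train_freq=100)
# new call style: policy first, then the experiment name
saver = CkptSaver(policy=object(), exp_name="cartpole_dqn", train_freq=100)
```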

Env
1. add metadrive env and related ppo config (574)
2. add acrobot env and related dqn config (577)
3. add carracing in box2d (575)
4. add new gym hybrid viz (563)
5. update cartpole IL config (578)

Algorithm
1. add BDQ algorithm (558)
2. add procedure cloning model (573)

Enhancement
1. add simplified PPOF (PPO × Family) interface (567) (568) (581) (582)

Fix
1. fix to_device and prev_state bug when using ttorch (571)
2. fix py38 and numpy unittest bugs (565)
3. fix typo in contrastive_loss.py (572)
4. fix dizoo envs pkg installation bugs
5. fix multi_trainer middleware unittest bug

Style
1. add evogym docker (580)
2. fix metaworld docker bug
3. fix setuptools high version incompatibility bug
4. extend treetensor lowest version

New Paper
1. [GoBigger](https://openreview.net/forum?id=NnOZT_CR26Z): [ICLR 2023] A Scalable Platform for Cooperative-Competitive Multi-Agent Interactive Simulation


**Contributors: PaParaZz1 puyuan1996 timothijoe Cloud-Pku ruoyuGao Super1ce karroyan kxzxvbk eltociear**
