DI-engine

0.3.1

API Change
1. Replace `gym.wrappers.Monitor` with `gym.wrappers.RecordVideo` for saving video replays (see the sketch after this list)
2. Replace `policy/il.py` with `policy/bc.py` and update the relevant serial_pipeline and unittests
3. Polish all the configurations in dizoo following our new config guideline
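
For reference, a minimal sketch of the new wrapper usage (the env id, folder, and trigger below are illustrative placeholders, not from this release):

```python
import gym
from gym.wrappers import RecordVideo

# Record every episode to ./videos; env id and folder are placeholders.
env = gym.make("CartPole-v1")
env = RecordVideo(env, video_folder="./videos", episode_trigger=lambda episode_id: True)

obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(env.action_space.sample())
env.close()
```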

Env (dizoo)
1. polish and standardize dizoo configs (252) (255) (249) (246) (262) (261) (266) (273) (263) (280) (259) (286) (277) (290) (289) (299)
2. add GRF academic env and config (281)
3. update env interface of GRF (258)
4. update D4RL offline RL env and config (285)
5. polish PomdpAtariEnv (254)

Algorithm
1. DREX Inverse RL algorithm (218)

Feature
1. separate mq and parallel modules, add Redis (247)
2. rename env variables; fix attach_to parameter (244)
3. env implementation check (275)
4. adjust and set the max column number of tabulate in log (296)
5. speed up GTrXL forward method + GRU unittest (253) (292)
6. add drop_extra option for sample collect

Fix
1. add act_scale in DingEnvWrapper; fix envpool env manager (245)
2. auto_reset=False and env_ref bug in env manager (248)
3. data type and deepcopy bug in RND (288)
4. share_memory bug and multi_mujoco env (279)
5. some bugs in GTrXL (276)
6. update gym_vector_env_manager and add more unittest (241)
7. mdpolicy random collect bug (293)
8. gym.wrappers save video replay bug
9. collect abnormal step format bug and add unittest

Test
1. add buffer benchmark & socket test (284)

Style
1. upgrade mpire (251)
2. add GRF (Google Research Football) docker (256)
3. update policy and GAIL comments

**Contributors: PaParaZz1 sailxjx puyuan1996 Will-Nie davide97l hiha3456 zjowowen Weiyuhong-1998 LuciusMos kxzxvbk lixl-st YinminZhang song2181 Hcnaeg norman26625 jayyoung0802 RobinC94 HansBug**

0.3.0

API Change
1. add new `BaseEnv` definition (see the sketch after this list):
- remove `info` method
- add `random_action` method
- add `observation_space`, `action_space`, `reward_space` properties
- [Env English doc](https://di-engine-docs.readthedocs.io/en/latest/best_practice/ding_env.html) | [Env Chinese doc](https://di-engine-docs.readthedocs.io/zh_CN/latest/best_practice/ding_env_zh.html)
2. modify the return value of the `eval` method in the `InteractionSerialEvaluator` class from `Tuple[bool, float]` to `Tuple[bool, dict]`.
3. switch the default logger to the rich logger; set the environment variable `export ENABLE_RICH_LOGGING=False` to disable it.
4. add `train_iter` and `env_step` arguments to the ding CLI.
- you can use them like `ding -m serial -c pendulum_sac_config.py -s 0 --train-iter 1e3`
5. remove default `n_sample/n_episode` value in policy default config.
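
A minimal sketch of an env written against the new `BaseEnv` interface (the class body, spaces, and episode logic are illustrative; see the linked env doc for the authoritative contract):

```python
import gym
import numpy as np
from ding.envs import BaseEnv, BaseEnvTimestep


class MyToyEnv(BaseEnv):
    """Illustrative env: no `info` method; spaces exposed as properties."""

    def __init__(self, cfg=None):
        self._cfg = cfg
        self._observation_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        self._action_space = gym.spaces.Discrete(2)
        self._reward_space = gym.spaces.Box(low=-1.0, high=1.0, shape=(1,), dtype=np.float32)

    @property
    def observation_space(self):
        return self._observation_space

    @property
    def action_space(self):
        return self._action_space

    @property
    def reward_space(self):
        return self._reward_space

    def random_action(self):
        # New helper: draw a legal action from the action space.
        return self._action_space.sample()

    def reset(self):
        self._step_count = 0
        return self._observation_space.sample()

    def step(self, action):
        self._step_count += 1
        obs = self._observation_space.sample()
        done = self._step_count >= 10  # toy 10-step episode
        return BaseEnvTimestep(obs, np.array([0.0], dtype=np.float32), done, {})

    def seed(self, seed, dynamic_seed=True):
        self._seed = seed

    def close(self):
        pass

    def __repr__(self):
        return "Illustrative MyToyEnv"
```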

Env (dizoo)
1. add bitflip HER DQN benchmark (192) (193) (197)
2. add slime volley league training demo (229)

Algorithm
1. Gated TransformXL (GTrXL) algorithm (136)
2. TD3 + VAE (HyAR) latent action algorithm (152)
3. stochastic dueling network (234)
4. use log prob instead of prob in ACER (186)

Feature
1. support envpool env manager (228)
2. add league main and other improvements in new framework (177) (214)
3. add pace controller middleware in new framework (198)
4. add auto recover option in new framework (242)
5. add k8s parser in new framework (243)
6. support async event handler and logger (213)
7. add grad norm calculator (205)
8. add gym vector env manager (147)
9. add train_iter and env_step in serial pipeline (212)
10. add rich logger handler (219) (223) (232)
11. add naive lr_scheduler demo

Refactor
1. new BaseEnv and DingEnvWrapper (171) (231) (240) [Env English doc](https://di-engine-docs.readthedocs.io/en/latest/best_practice/ding_env.html) | [Env Chinese doc](https://di-engine-docs.readthedocs.io/zh_CN/latest/best_practice/ding_env_zh.html)

Polish
Improve configurations in dizoo and add more algorithm benchmarks [English doc example](https://di-engine-docs.readthedocs.io/en/latest/hands_on/dqn.html) | [Chinese doc example](https://di-engine-docs.readthedocs.io/zh_CN/latest/hands_on/dqn_zh.html)
1. MAPPO and MASAC smac config (209) (239)
2. QMIX smac config (175)
3. R2D2 atari config (181)
4. A2C atari config (189)
5. GAIL box2d and mujoco config (188)
6. ACER atari config (180)
7. SQIL atari config (230)
8. TREX atari/mujoco config
9. IMPALA atari config
10. MBPO/D4PG mujoco config

Fix
1. random_collect compatible to episode collector (190)
2. remove default n_sample/n_episode value in policy config (185)
3. PDQN model bug on gpu device (220)
4. TREX algorithm CLI bug (182)
5. DQfD JE computation bug and move to AdamW optimizer (191)
6. pytest problem for parallel middleware (211)
7. mujoco numpy compatibility bug
8. markupsafe 2.1.0 bug
9. framework parallel module network emit bug
10. mpire bug and disable algotest in py3.8
11. lunarlander env import and env_id bug
12. icm unittest repeat name bug
13. buffer throughput close bug

Test
1. resnet unittest (199)
2. SAC/SQN unittest (207)
3. CQL/R2D3/GAIL unittest (201)
4. NGU td unittest (210)
5. model wrapper unittest (215)
6. MAQAC model unittest (226)

Style
1. add doc docker with LaTeX support (221)

**Contributors: PaParaZz1 sailxjx puyuan1996 Will-Nie Weiyuhong-1998 davide97l zjowowen LuciusMos kxzxvbk Hcnaeg jayyoung0802 simonat2011 jiaruonan**

0.2.3

API Change
1. move `actor_head_type` to `action_space` (related to DDPG/TD3/SAC)
2. add multiple seeds in CLI: `ding -m serial -c cartpole_dqn_config.py -s 0 -s 1 -s 2`
3. add new replay buffer (which separates algorithm and storage); users can refer to [buffer](https://github.com/opendilab/DI-engine/tree/main/ding/worker/buffer) and the sketch after this list
4. add new main pipeline for async/parallel framework [tutorial](https://di-engine-docs.readthedocs.io/en/latest/distributed/index.html)
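
A hedged sketch of the storage-only buffer (the `DequeBuffer` class name and import path are assumptions drawn from the linked directory; the point of the refactor is that sampling/replacement strategy lives outside the storage layer):

```python
from ding.worker.buffer import DequeBuffer  # import path assumed from the linked directory

buffer = DequeBuffer(size=1000)  # pure storage; no algorithm-specific logic inside
for i in range(100):
    buffer.push({"obs": i, "reward": 0.0})  # illustrative transitions
batch = buffer.sample(32)  # sampling strategy (priority, clone, ...) is composed outside storage
```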

Env (dizoo)
1. add multi-agent mujoco env (146)
2. add delay reward mujoco env (145)
3. fix port conflict in gym_soccer (139)

Algorithm
1. MASAC algorithm (112)
2. TREX IRL algorithm (119) (144)
3. H-PPO hybrid action space algorithm (140)
4. residual link in R2D2 (150)
5. gumbel softmax (169)
6. move actor_head_type to action_space field

Feature
1. new main pipeline and async/parallel framework (142) (166) (168)
2. refactor buffer, separate algorithm and storage (129)
3. CLI in new pipeline (ditask) (160)
4. add multiprocess tblogger, fix circular reference problem (156)
5. add multiple seed cli
6. polish eps_greedy_multinomial_sample in model_wrapper (154)

Fix
1. R2D3 abs priority problem (158) (161)
2. multi-discrete action space policies random action bug (167)
3. doc generate bug with enum_tools (155)

Style
1. more comments about R2D2 (149)
2. add doc about how to migrate a new env [link](https://di-engine-docs.readthedocs.io/en/latest/best_practice/ding_env.html)
3. add doc about env tutorial in dizoo [link](https://di-engine-docs.readthedocs.io/en/latest/env_tutorial/index.html)
4. add conda auto release (148)
5. update zh doc link
6. update kaggle tutorial link

New Repo
1. [awesome-model-based-RL](https://github.com/opendilab/awesome-model-based-RL): A curated list of awesome Model-Based RL resources
2. [DI-smartcross](https://github.com/opendilab/DI-smartcross): Decision AI in Traffic Light Control

**Contributors: PaParaZz1 sailxjx puyuan1996 Will-Nie Weiyuhong-1998 LikeJulia RobinC94 LuciusMos mingzhang96 shgqmrf15 zjowowen**

0.2.2

Env (dizoo)
1. apple key to door treasure env (128)
2. bsuite memory benchmark (138)
3. polish atari impala config

Algorithm
1. Guided Cost IRL algorithm (57)
2. ICM exploration algorithm (41)
3. MP-DQN hybrid action space algorithm (131)
4. add loss statistics and polish r2d3 pong config (126)

Enhancement
1. add renew env mechanism in env manager and update timeout mechanism (127) (134)

Fix
1. async subprocess env manager reset bug (137)
2. keepdims name bug in model wrapper
3. on-policy ppo value norm bug
4. GAE and RND unittest bug
5. hidden state wrapper h tensor compatibility
6. naive buffer auto config create bug

Style
1. add supporters list

New Repo Feature
1. [treevalue speed benchmark](https://github.com/opendilab/treevalue#speed-performance)

**Contributors: PaParaZz1 puyuan1996 RobinC94 LikeJulia Will-Nie Weiyuhong-1998 timothijoe davide97l lichuminglcm YinminZhang**

0.2.1

API Change
1. remove torch in all envs (numpy arrays are the basic data format in envs; see the sketch after this list)
2. remove `on_policy` field in all the configs
3. change `eval_freq` from 50 to 1000
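
For example, envs now hand out numpy arrays at their boundary; a sketch using `to_ndarray`, the conversion helper commonly imported in dizoo envs (the tensor below is an illustrative stand-in for a torch-based backend):

```python
import numpy as np
import torch
from ding.torch_utils import to_ndarray  # conversion helper used throughout dizoo envs

raw_obs = torch.randn(4)                     # e.g. what a torch-based env returned before
obs = to_ndarray(raw_obs, dtype=np.float32)  # env boundary now exposes numpy only
assert isinstance(obs, np.ndarray)
```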

Tutorial and Doc
1. [env tutorial (English)](https://di-engine-docs.readthedocs.io/en/latest/env_tutorial/index.html) | [env tutorial (Chinese)](https://di-engine-docs.readthedocs.io/en/main-zh/env_tutorial/index_zh.html)


Env (dizoo)
1. gym-hybrid env (86)
2. gym-soccer (HFO) env (94)
3. Go-Bigger env baseline (95)
4. SAC and PPO config for bipedalwalker env (121)

Algorithm
1. DQfD Imitation Learning algorithm (48) (98)
2. TD3BC offline RL algorithm (88)
3. MBPO model-based RL algorithm (113)
4. PADDPG hybrid action space algorithm (109)
5. PDQN hybrid action space algorithm (118)
6. fix R2D2 bugs and produce benchmark, add naive NGU (40)
7. self-play training demo in slime_volley env (23)
8. add example of GAIL entry + config for mujoco (114)

Enhancement
1. enable arbitrary policy num in serial sample collector
2. add torch DataParallel for single-machine multi-GPU training (see the sketch after this list)
3. add registry force_overwrite argument
4. add naive buffer periodic throughput seconds argument
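
The multi-GPU support builds on `torch.nn.DataParallel`; a generic sketch of the underlying mechanism (not DI-engine's exact wiring):

```python
import torch
import torch.nn as nn

model = nn.Linear(8, 2)
if torch.cuda.device_count() > 1:
    # Replicate the module on each local GPU and scatter every batch across them.
    model = nn.DataParallel(model)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")
```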


Fix
1. target model wrapper hard reset bug
2. fix learn state_dict target model bug
3. ppo bugs and update atari ppo offpolicy config (108)
4. pyyaml version bug (99)
5. small fix on bsuite environment (117)
6. discrete cql unittest bug
7. release workflow bug
8. base policy model state_dict overlap bug
9. remove on_policy option in dizoo config and entry
10. remove torch in env


Test
1. add pure docker setting test (103)
2. add unittest for dataset and evaluator (107)
3. add unittest for on-policy algorithm (92)
4. add unittest for ppo and td (MARL case) (89)


0.2.0

API Change
1. `SampleCollector` renamed to `SampleSerialCollector` (see the sketch after this list)
2. `EpisodeCollector` renamed to `EpisodeSerialCollector`
3. `BaseSerialEvaluator` renamed to `InteractionSerialEvaluator`
4. `ZerglingCollector` renamed to `ZerglingParallelCollector`
5. `OneVsOneCollector` renamed to `MarineParallelCollector`
6. `AdvancedBuffer` registry name changed from `priority` to `advanced`
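
Migrating is a pure rename of imports and config keys; a sketch (assuming the classes live under `ding.worker`, where these collectors and evaluators reside):

```python
# Before (pre-0.2.0):
# from ding.worker import SampleCollector, BaseSerialEvaluator
# After (0.2.0):
from ding.worker import SampleSerialCollector, InteractionSerialEvaluator

# The advanced buffer's registry key in configs changed as well:
replay_buffer = dict(type='advanced')  # was type='priority'
```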


Env (dizoo)
1. overcooked env (20)
2. procgen env (26)
3. modified predator env (30)
4. d4rl env (37)
5. imagenet dataset (27)
6. bsuite env (58)
7. move atari_py to ale-py

Algorithm
1. SQIL algorithm (25) (44)
2. CQL algorithm (discrete/continuous) (37) (68)
3. MAPPO algorithm (62)
4. WQMIX algorithm (24)
5. D4PG algorithm (76)
6. update multi-discrete policy (dqn, ppo, rainbow) (51) (72)

Enhancement
1. image classification supervised training pipeline (27)
2. add force_reproducibility option in subprocess env manager
3. add/delete/restart replicas via cli for k8s
4. add league metric (trueskill and elo) (22)
5. add tb in naive buffer and modify tb in advanced buffer (39)
6. add k8s launcher and di-orchestrator launcher, add related unittest (45) (49)
7. add hyper-parameter scheduler module (38)
8. add plot function (59)

Fix
1. acer weight bug and update atari result (21)
2. mappo nan bug and dict obs cannot unsqueeze bug (54)
3. r2d2 hidden state and obs pre-processing bug (36) (52)
4. ppo bug when use dual_clip and adv > 0
5. qmix double_q hidden state bug
6. spawn context problem in interaction unittest (69)
7. formatted config no eval bug (53)
8. the catch statements that will never succeed and system proxy bug (71) (79)
9. lunarlander config polish
10. c51 head dimension mismatch bug
11. mujoco config typo bug
12. ppg atari config multi buffer bug
13. max use and priority update special branch bug in advanced_buffer

Style
1. add docker deploy in github workflow (70) (78) (80)
