RL4CO

Latest version: v0.5.1


0.3.0

Faster Library, Python 3.11 and new TorchRL support, Envs, Models, Multiple Dataloaders, and more 🚀

Faster library, Python 3.11, and new TorchRL
- Update to the latest TorchRL (#72), solving several issues such as #95 and #97 (also see [this](https://github.com/wouterkool/attention-learn-to-route/issues/58))
- Benchmarking:
  - Up to 20% speedup in training epochs thanks to the faster TensorDict and new env updates
  - Almost instant data generation (avoiding list comprehensions, e.g. from ~20 seconds to <1 second per epoch!)
- Python 3.11 now supported (#97)

New SMTWTP environment
- Add a new scheduling problem: the Single Machine Total Weighted Tardiness Problem (SMTWTP) environment, as in [DeepACO](https://arxiv.org/pdf/2309.14032.pdf) (henry-yeh)
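The SMTWTP objective is simple to state; here is a minimal pure-Python sketch of it (illustrative only, not the RL4CO environment API):

```python
# Single Machine Total Weighted Tardiness: jobs run sequentially on one
# machine; job j has processing time p[j], due date d[j], and weight w[j].
# Objective: minimize sum_j w[j] * max(0, C[j] - d[j]), where C[j] is the
# completion time of job j under the chosen order.
def total_weighted_tardiness(order, p, d, w):
    t, cost = 0, 0.0
    for j in order:
        t += p[j]                        # completion time of job j
        cost += w[j] * max(0, t - d[j])  # weighted tardiness of job j
    return cost

# Toy instance with 3 jobs
p = [3, 2, 4]; d = [4, 3, 8]; w = [1.0, 2.0, 1.5]
print(total_weighted_tardiness([1, 0, 2], p, d, w))  # 2.5
```

An RL agent for this environment constructs the job order one step at a time, receiving the negative tardiness as reward.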

New MatNet model
- Add a MatNet version for square matrices (a faster implementation, ideal for routing problems)
- Scheduling problems should be easy to implement from here

Multiple Dataloaders
- Multiple (named!) dataloaders are now supported
- For example, to track generalization during training
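The idea behind named validation dataloaders, sketched in plain Python (names and the `evaluate` helper are illustrative, not RL4CO's actual API): evaluate one model on several datasets and log metrics under each name, e.g. to track generalization to larger instances.

```python
def evaluate(model, dataloaders):
    """dataloaders: dict mapping a name to an iterable of batches."""
    metrics = {}
    for name, loader in dataloaders.items():
        rewards = [model(batch) for batch in loader]
        # Log the mean reward under the dataloader's name
        metrics[f"val/{name}/reward"] = sum(rewards) / len(rewards)
    return metrics

# Toy "model": the reward is simply the batch size here
model = len
loaders = {"tsp50": [[0] * 50, [0] * 50], "tsp100": [[0] * 100]}
print(evaluate(model, loaders))
```

In practice the same mechanism lets a single training run report, say, `val/tsp50/reward` and `val/tsp100/reward` side by side.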

Miscellaneous
- Fix POMO shapes (hyeok9855), modularize PPO, etc.
- Fix precision bug for PPO
- New AI4CO transfer!

0.2.3

Add FlashAttention2 support ⚡

- Add FlashAttention2 support as mentioned [here](https://github.com/kaist-silab/rl4co/issues/85)
- Remove old wrapper for `half()` precision since Lightning already deals with this
- Fix the `scaled_dot_product_attention` implementation for PyTorch < 2.0
- Minor fixes
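For reference, here is a minimal pure-Python sketch of what scaled dot-product attention computes, softmax(QKᵀ/√d)V. This mirrors the formula behind `torch.nn.functional.scaled_dot_product_attention`; it is illustrative only, not the fallback shipped in the library.

```python
import math

def sdpa(Q, K, V):
    """Q, K, V: lists of row vectors; returns softmax(Q K^T / sqrt(d)) V."""
    d = len(Q[0])
    out = []
    for q in Q:
        # Scaled dot-product scores against every key
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]  # numerically stable softmax
        Z = sum(exps)
        weights = [e / Z for e in exps]
        # Convex combination of the value rows
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out

Q = [[1.0, 0.0]]; K = [[1.0, 0.0], [0.0, 1.0]]; V = [[1.0, 2.0], [3.0, 4.0]]
print(sdpa(Q, K, V))
```

Because the softmax weights sum to one, each output row is a convex combination of the value rows, which is a handy sanity check when debugging attention implementations.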

0.2.2

QoL: New Baseline, Testing Search Methods, Downloader, Miscellanea 🚀

Changelog
- Add mean baseline (hyeok9855)
- Add testing for search methods
- Move downloader to external repo, extra URL as backup for DPP
- Small bug fix for duplicate args
- Add more modular data generation
- Suppress extra warning in `automatic_optimization`
- Minor doc cleaning
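The mean baseline above is the simplest REINFORCE variance-reduction trick; a plain-Python illustration (not RL4CO's baseline class):

```python
def mean_baseline_advantage(rewards):
    """Advantage of each sample relative to the batch-mean reward."""
    b = sum(rewards) / len(rewards)  # baseline = batch mean
    return [r - b for r in rewards]

adv = mean_baseline_advantage([10.0, 12.0, 8.0, 14.0])
print(adv)  # [-1.0, 1.0, -3.0, 3.0]
```

Subtracting the batch mean leaves the policy gradient unbiased while shrinking its variance; by construction the advantages sum to zero.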

0.2.1

QoL, Better documentation, Bug Fixes 🚀

- Add `RandomPolicy` class
- Control `max_steps` for debugging purposes during decoding
- Better documentation, new tutorials, and references (#88, bokveizen)
- Set bound to Python < 3.11 for the time being (#90, hyeok9855)
- Log more info by default in PPO
- `precompute_cache` method can now accept `td` as well
- If `Trainer` is supplied with `gradient_clip_val` and `manual_optimization=False`, gradient clipping is removed (e.g. for PPO)
- Fix test data size defaulting to the training size instead of the test size
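The combination of a random policy and a `max_steps` cap is useful for smoke-testing decoding loops; a hedged plain-Python sketch of that idea (names and signatures are illustrative, not the library's `RandomPolicy` API):

```python
import random

def random_rollout(actions, max_steps=None, seed=0):
    """Pick a uniformly random feasible action each step, stopping early
    once max_steps actions have been taken (handy for debugging)."""
    rng = random.Random(seed)
    remaining = list(actions)
    tour = []
    while remaining and (max_steps is None or len(tour) < max_steps):
        tour.append(remaining.pop(rng.randrange(len(remaining))))
    return tour

print(random_rollout(range(5), max_steps=3))
```

A random policy also provides a quick lower-bound sanity check: any trained policy should comfortably beat its average reward.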

0.2.0

Search Methods, Flexible Embeddings, New Graph Encoders and more 🚀

Search methods
- New flexible and extensible abstract class
- Active Search (Bello et al., 2016)
- Efficient Active Search (Hottung et al., 2022)

Flexible embeddings
- Support for changing any environment embedding (`init`, `context` and `dynamic`)
- Add [new notebook](https://github.com/kaist-silab/rl4co/blob/main/notebooks/tutorials/2-solving-new-problem.ipynb) showcasing how to solve new complex problems (example of multi-depot multi-agent pickup and delivery problem - MDPDP)
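The gist of swappable environment embeddings, sketched in plain Python (RL4CO's real embeddings are `nn.Module`s; the registry and names here are hypothetical): a policy looks up its `init`/`context`/`dynamic` embeddings per environment, and any entry can be overridden with a custom one, e.g. for a new problem such as MDPDP.

```python
# Registry mapping environment name -> embedding kind -> embedding function
EMBEDDINGS = {
    "tsp": {"init": lambda x: ("tsp-init", x),
            "context": lambda x: ("tsp-context", x)},
}

def register(env_name, kind, fn):
    """Register (or override) an embedding for an environment."""
    EMBEDDINGS.setdefault(env_name, {})[kind] = fn

# Provide a context embedding for a hypothetical new environment
register("mdpdp", "context", lambda x: ("mdpdp-context", x))
print(EMBEDDINGS["mdpdp"]["context"]("obs"))
```

The point is that solving a new problem only requires supplying the embeddings that differ; the rest of the policy is reused unchanged.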

Support for `torch-geometric`
- Add new template graph neural networks (MPNN, GCN)
- Example notebook [here](https://github.com/kaist-silab/rl4co/blob/main/notebooks/tutorials/3-change-encoder.ipynb)

Miscellaneous
- Separate loggers
- Better imports
- Bugfix compatibility with Mac
- Update configs
- ... and more!

0.1.1

Better training, Bug fixes, and more 🚀

- Better automatic training with DDP (#87)
- Bug Fix `RL4COTrainer`
- Avoid broadcasting error warning in critic baselines
- Fix rollout baseline bug
- New experiment config structure: interpolate with the environment name (we no longer need separate config folders for each environment such as TSP, CVRP, etc.; simply use one config to rule them all!)
