Overview
This is the first joint release between [pytorch-bearer](http://www.pytorchbearer.org) and Lightning. Here we come ...
This release adds support for training models on Tensor Processing Units (TPUs). You can now train models on GPUs or TPUs by changing a single parameter in `Trainer` (see docs). We are also bringing the flexibility of Bearer into Lightning by allowing arbitrary user-defined callbacks (see the [docs](https://pytorch-lightning.readthedocs.io/en/0.7.0/callbacks.html)).
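For example, here is a minimal sketch of a user-defined callback combined with the single-parameter hardware switch. The hook signatures and the `num_tpu_cores` argument follow the 0.7.0 docs, but treat them as assumptions and check the linked pages for the exact interface:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import Callback

class PrintingCallback(Callback):
    # hook signatures as in the 0.7.0 callback docs (assumption)
    def on_train_start(self, trainer, pl_module):
        print('Training is starting')

    def on_train_end(self, trainer, pl_module):
        print('Training is done')

# the same model trains on GPUs or TPUs by switching one Trainer argument:
# Trainer(gpus=8) for GPUs, Trainer(num_tpu_cores=8) for TPUs
trainer = Trainer(callbacks=[PrintingCallback()], num_tpu_cores=8)
```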
We are also including a profiler that allows Lightning users to identify training bottlenecks (see [docs](https://pytorch-lightning.readthedocs.io/en/0.7.0/profiler.html)).
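As a sketch, enabling the profiler should be a single flag, per the linked 0.7.0 profiler docs (the `AdvancedProfiler` mentioned in the comment is an assumption taken from those docs):

```python
from pytorch_lightning import Trainer

# profiler=True enables the built-in profiler; a report of the time spent
# in each training hook is printed when fit() completes. The docs also
# describe an AdvancedProfiler for more detailed, cProfile-style output.
trainer = Trainer(profiler=True)
```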
This release also includes automatic sampler setup: depending on the selected backend, Lightning configures the sampler correctly, so no user input is needed.
Logging has also been extended: multiple concurrent loggers can now be passed to `Trainer` as an iterable ([docs](https://pytorch-lightning.readthedocs.io/en/0.7.0/loggers.html)), and we have added support for step-based [learning rate scheduling](https://pytorch-lightning.readthedocs.io/en/0.7.0/optimizers.html#learning-rate-scheduling).
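A minimal sketch of both features, assuming the iterable-of-loggers form and the scheduler-dict format described in the linked docs (the `'interval': 'step'` key is the assumption to verify):

```python
import torch
from pytorch_lightning import Trainer, LightningModule
from pytorch_lightning.loggers import TensorBoardLogger

# multiple concurrent loggers: pass any iterable of loggers to Trainer
loggers = [
    TensorBoardLogger('tb_logs', name='run_a'),
    TensorBoardLogger('tb_logs', name='run_b'),
]
trainer = Trainer(logger=loggers)

class MyModule(LightningModule):
    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self.parameters(), lr=1e-3)
        scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1000)
        # 'interval': 'step' steps the scheduler every batch
        # instead of the default once per epoch
        return [optimizer], [{'scheduler': scheduler, 'interval': 'step'}]
```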
Finally, this release includes lots of bug fixes (see below).
Detailed changes
Added
- Added automatic sampler setup: depending on DDP or TPU, Lightning configures the sampler correctly, with no user action needed (926)
- Added a `reload_dataloaders_every_epoch=False` flag to `Trainer` for users who need to reload data every epoch (926)
- Added a `progress_bar_refresh_rate=50` flag to `Trainer` to control the progress bar refresh rate in notebooks (926)
- Updated governance docs
- Added a check to ensure that the metric used for early stopping exists before training commences (542)
- Added `optimizer_idx` argument to `backward` hook (733)
- Added `entity` argument to `WandbLogger` to be passed to `wandb.init` (783)
- Added a tool for profiling training runs (782)
- Improved flexibility for naming TensorBoard logs: `version` can now be set to a `str` to save directly to that directory, and `name=''` prevents the experiment-name subdirectory (804)
- Added option to specify `step` key when logging metrics (808)
- Added `train_dataloader`, `val_dataloader` and `test_dataloader` arguments to `Trainer.fit()` for alternative data parsing (see the sketch after this list) (759)
- Added Tensor Processing Unit (TPU) support (868)
- Added semantic segmentation example (751, 876, 881)
- Split callbacks into multiple files (849)
- Added support for user-defined callbacks (889, 950)
- Added support for multiple loggers to be passed to `Trainer` as an iterable (e.g. list, tuple, etc.) (903)
- Added support for step-based learning rate scheduling (941)
- Added support for logging hparams as `dict` (1029)
- Checkpointing and early stopping now work without a validation step (1041)
- Added support for graceful training cleanup after a keyboard interrupt (856, 1019)
- Added type hints for function arguments (912)
- Added default `argparser` for `Trainer` (952, 1023)
- Added TPU gradient clipping (963)
- Added max/min number of steps in Trainer (728)
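For the new `Trainer.fit()` data arguments mentioned above, here is a minimal sketch. The argument names are taken from the changelog entry, and `model` stands in for any `LightningModule` (hypothetical here):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from pytorch_lightning import Trainer

# toy data to keep the sketch self-contained
dataset = TensorDataset(torch.randn(64, 10), torch.randint(0, 2, (64,)))
train_loader = DataLoader(dataset, batch_size=8)
val_loader = DataLoader(dataset, batch_size=8)

# dataloaders can now be passed straight to fit() instead of being
# defined as methods on the LightningModule
trainer = Trainer(max_epochs=1)
trainer.fit(model,  # `model`: any LightningModule (hypothetical here)
            train_dataloader=train_loader,
            val_dataloader=val_loader)
```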
Changed
- Changed default TQDM to use `tqdm.auto` for prettier outputs in IPython notebooks (752)
- Changed `pytorch_lightning.logging` to `pytorch_lightning.loggers` (767)
- Moved the default `tqdm_dict` definition from Trainer to `LightningModule`, so it can be overridden by the user (749)
- Moved functionality of `LightningModule.load_from_metrics` into `LightningModule.load_from_checkpoint` (995)
- Changed Checkpoint path parameter from `filepath` to `dirpath` (1016)
- Froze the model's `hparams` as a `Namespace` property (1029)
- Dropped `logging` config in package init (1015)
- Renamed model steps (see the sketch after this list) (1051)
* `training_end` >> `training_epoch_end`
* `validation_end` >> `validation_epoch_end`
* `test_end` >> `test_epoch_end`
- Refactored data loading to support infinite dataloaders (955)
- Changed `TensorBoardLogger` to create a single log file (777)
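For the renamed model steps above, a minimal sketch of the new hook name (the aggregation body is illustrative):

```python
import torch
from pytorch_lightning import LightningModule

class MyModule(LightningModule):
    # formerly `validation_end`; receives the list of dicts returned by
    # `validation_step` over the whole epoch
    def validation_epoch_end(self, outputs):
        avg_loss = torch.stack([x['val_loss'] for x in outputs]).mean()
        return {'val_loss': avg_loss, 'log': {'val_loss': avg_loss}}
```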
Deprecated
- Deprecated `pytorch_lightning.logging` (767)
- Deprecated `LightningModule.load_from_metrics` in favour of `LightningModule.load_from_checkpoint` (995, 1079)
- Deprecated `data_loader` decorator (926)
- Deprecated model steps `training_end`, `validation_end` and `test_end` (1051, 1056)
Removed
- Removed dependency on `pandas` (736)
- Removed dependency on `torchvision` (797)
- Removed dependency on `scikit-learn` (801)
Fixed
- Fixed a bug where early stopping `on_epoch_end` would be called inconsistently when `check_val_every_n_epoch == 0` (743)
- Fixed a bug where the model checkpoint didn't write to the same directory as the logger (771)
- Fixed a bug where the `TensorBoardLogger` class would create an additional empty log file during fitting (777)
- Fixed a bug where `global_step` was advanced incorrectly when using `accumulate_grad_batches > 1` (832)
- Fixed a bug when calling `self.logger.experiment` with multiple loggers (1009)
- Fixed a bug when calling `logger.append_tags` on a `NeptuneLogger` with a single tag (1009)
- Fixed sending back data from `.spawn` by saving and loading the trained model in/out of the process (1017)
- Fixed port collision on DDP (1010)
- Fixed and tested pass overrides (918)
- Fixed the Comet logger to log after training (892)
- Removed deprecated arguments from the learning rate step function (890)
Contributors
airglow, akshaykvnit, AljoSt, AntixK, awaelchli, baeseongsu, bobkemp, Borda, calclavia, Calysto, djbyrne, ethanwharris, fdelrio89, hadim, hanbyul-kim, jeremyjordan, kuynzereb, luiscape, MattPainter01, neggert, onkyo14taro, peteriz, shoarora, SkafteNicki, smallzzy, srush, theevann, tullie, williamFalcon, xeTaiz, xssChauhan, yukw777
_If we forgot someone due to not matching commit email with GitHub account, let us know :]_