- Added `checkpoint` parameter to callback's `on_save_checkpoint` hook (6072)
- Changed the order of `backward`, `step`, `zero_grad` to `zero_grad`, `backward`, `step` (6147)
- Changed default for DeepSpeed CPU Offload to False, due to prohibitively slow speeds at smaller scale (6262)
- Fixed epoch level schedulers not being called when `val_check_interval < 1.0` (6075)
- Fixed multiple early stopping callbacks (6197)
- Fixed incorrect usage of `detach()`, `cpu()`, `to()` (6216)
- Fixed LBFGS optimizer support, which did not converge in automatic optimization (6147)
- Prevented `WandbLogger` from dropping values (5931)
- Fixed an error thrown when using a valid distributed mode in multi-node training (6297)
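
The reordering to `zero_grad`, `backward`, `step` and the LBFGS fix share PR 6147, which suggests they are related: optimizers such as LBFGS evaluate their closure several times per step, so gradients need to be cleared before each backward pass rather than after the step. A minimal pure-Python sketch of that effect follows. `ToyParam`, `backward`, and `multi_eval_step` are hypothetical stand-ins for illustration, not Lightning or PyTorch APIs, and the toy optimizer only mimics the repeated closure calls, not the LBFGS update itself.

```python
class ToyParam:
    """Scalar parameter whose gradient accumulates, as autograd's .grad does."""

    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def backward(param):
    # Toy loss is (value - 3)**2, so its derivative is 2 * (value - 3).
    # The result is added to the existing gradient, not assigned.
    param.grad += 2.0 * (param.value - 3.0)


def multi_eval_step(param, closure, lr=0.1, evals=3):
    # Hypothetical optimizer step: like LBFGS, it calls the closure
    # more than once before updating the parameter.
    for _ in range(evals):
        closure()
    used_grad = param.grad
    param.value -= lr * used_grad
    return used_grad


# Old order (backward, step, zero_grad): the closure only runs backward,
# and gradients are cleared after the step.
old = ToyParam(0.0)
old_used = multi_eval_step(old, lambda: backward(old))
old.grad = 0.0  # zero_grad happens too late to help


# New order (zero_grad, backward, step): the closure clears gradients
# before every backward pass.
new = ToyParam(0.0)


def new_closure():
    new.grad = 0.0
    backward(new)


new_used = multi_eval_step(new, new_closure)

print(old_used, new_used)  # -18.0 -6.0
```

In this sketch the old order applies a gradient three times larger than the true one (-18.0 instead of -6.0), which is one plausible way a multi-evaluation optimizer can fail to converge.
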
akihironitta, borisdayma, carmocca, dvolgyes, SeanNaren, SkafteNicki
_If we forgot someone because their commit email did not match their GitHub account, let us know :]_