* **fix(Sign)**: self-graft correctly; previously, the grafted update was computed as `update.sign() * update.norm()`, omitting the required normalization of the sign direction. It is now `F.normalize(update.sign()) * update.norm()`. This changes the required learning rates for self-grafted `tg.optim.Sign`.
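
  A minimal standalone sketch of the change; `update` stands in for the raw optimizer update, and the flat tensor plus the explicit `dim=0` are illustrative, not the library's internal code:

  ```python
  import torch
  import torch.nn.functional as F

  # Hypothetical raw update from the base step, flattened for illustration.
  update = torch.randn(16)

  # Old behaviour: sign(update) has norm sqrt(numel), so the grafted update
  # came out roughly sqrt(numel) times larger than intended.
  old = update.sign() * update.norm()

  # New behaviour: normalize the sign direction first, then rescale it to the
  # original update's norm, which is what grafting requires.
  new = F.normalize(update.sign(), dim=0) * update.norm()

  assert torch.isclose(new.norm(), update.norm())
  ```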
4.0.2
* Use `WeightDecayChain` in `OptimizerOptimizer`
4.0.1
* Add missing `params_flat` in `Graft`
4.0.0
* Add configurable weight decay via `WeightDecayChain`
  * L1/L2 Decay
  * Decay to Init/EMA
* Remove `decay_to_init` flag. Use `weight_decay_cls=tg.WeightDecayChain(tg.optim.WeightDecayToInit())` instead (see the migration sketch after this list).
* Remove `default_to_adam` flag. Use `default_to_baseline` instead.
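
  A hedged migration sketch for the removed flags. Only the `weight_decay_cls` keyword and the class names inside it are taken from this entry; the import alias and the surrounding optimizer call are assumptions:

  ```python
  import truegrad as tg  # assumption: the `tg` alias used throughout this changelog

  # Removed in 4.0.0:
  #   some_tg_optimizer(params, decay_to_init=True)   # old flag, no longer available
  #
  # Replacement: compose the decay behaviour explicitly and pass it in.
  weight_decay = tg.WeightDecayChain(tg.optim.WeightDecayToInit())
  # optimizer = <your TG optimizer>(params, weight_decay_cls=weight_decay)
  ```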
2.3.5
* Bug fixes
2.2.0
* Improve TG-Optimizer extensibility by adding a `TrueGrad` base optimizer class
* Add (TG-)[LaProp](https://arxiv.org/abs/2002.04839) (plain update rule sketched below)
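
  For reference, a minimal sketch of the plain LaProp update rule from the linked paper (not this library's TG variant); the function name, state layout, and hyperparameter defaults are illustrative assumptions:

  ```python
  import torch

  def laprop_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
      """One plain LaProp step: the gradient is divided by the adaptive
      denominator *before* it enters the momentum buffer (unlike Adam)."""
      state["step"] += 1
      # Second-moment estimate, as in Adam.
      state["exp_avg_sq"].mul_(beta2).addcmul_(grad, grad, value=1 - beta2)
      denom = (state["exp_avg_sq"] / (1 - beta2 ** state["step"])).sqrt().add_(eps)
      # Momentum over the *normalized* gradient -- LaProp's key difference.
      state["exp_avg"].mul_(beta1).add_(grad / denom, alpha=1 - beta1)
      param.add_(state["exp_avg"], alpha=-lr / (1 - beta1 ** state["step"]))

  # Usage on a toy parameter:
  p = torch.zeros(4)
  state = {"step": 0, "exp_avg": torch.zeros_like(p), "exp_avg_sq": torch.zeros_like(p)}
  laprop_step(p, torch.randn(4), state)
  ```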