Easydel

Latest version: v0.0.65

Page 1 of 3

0.0.65

EasyDeL version 0.0.65 Pallas Fusion: GPU Turbocharged 🚀

- New Features
  - Pallas Flash Attention is available on CPU/GPU/TPU via FJFormer and supports bias.
  - ORPO Trainer is added.
  - WebSocket Serve Engine is added.
  - EasyDeL is now 30% faster on GPUs.
  - JAX-Triton is no longer needed to run GPU kernels.
  - You can now specify the backward kernel implementation for Pallas Attention.
  - You now have to import EasyDeL as `easydel` instead of `EasyDel`.

- New Models
  - The OpenELM model series is now present.
  - The DeepseekV2 model series is now present.

- Fixed Bugs
  - CUDNN FlashAttention bugs are now fixed.
  - 8-bit parameter quantization for the Llama3 model received many improvements.
  - Splash Attention bugs on TPUs are now fixed.
  - Dbrx model bugs are fixed.
  - DPOTrainer bugs (dataset creation) are fixed.

- Known Bugs
  - Splash Attention won't work on TPUv3.
  - Pallas Attention won't work on TPUv3.
  - You need to install flash_attn in order to convert HF DeepseekV2 to EasyDeL (bug in the DeepseekV2 implementation from the original authors).
  - Some examples are outdated.
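
Because of the import rename in this release, scripts written for older versions may need a one-line change. The following is a minimal sketch, using only the standard library, that reports which of the two module names is importable in the current environment. The helper name `find_easydel_module` is hypothetical and not part of EasyDeL.

```python
import importlib.util


def find_easydel_module(candidates=("easydel", "EasyDel")):
    """Return the first importable module name from `candidates`, or None.

    `easydel` is the name required from 0.0.65 on; `EasyDel` is the older
    name mentioned in the release notes. This helper is illustrative only.
    """
    for name in candidates:
        # find_spec returns None for a top-level module that is not installed.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


if __name__ == "__main__":
    print("Importable EasyDeL module:", find_easydel_module())
```

If the helper returns `EasyDel`, the installed release predates the rename and existing imports keep working; if it returns `easydel`, imports need to use the lowercase name.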


**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.63...0.0.65

0.0.63

What's Changed

- Phi3 Model Added.
- Dbrx Model Added.
- Arctic Model Added.
- Lora Fine-Tuning Bugs Fixed.
- Vanilla Attention is Optimized.
- Sharded Vanilla is the default attention mechanism now.

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.61...0.0.63

0.0.61

What's Changed
* Add support for iterable dataset loading by yhavinga in https://github.com/erfanzar/EasyDeL/pull/138
* `SFTTrainer` bugs are fixed.
* `Parameter quantization` is now available for all of the models.
* `AutoEasyDeLModelForCausalLM` now supports `load_in_8bit`.
* Memory Management improved.
* `Gemma` Models Generation Issue is now Fixed.
* Trainers are now 2~8% faster.
* Attention Operation is improved.
* The `Cohere` Model is now present.
* `JAXServer` is improved.
* Due to recent changes, many of the documentation examples have changed or will be changed soon.
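
The release notes do not describe how parameter quantization is implemented, so the following is only a generic illustration of 8-bit quantization, assuming a simple symmetric "absmax" scheme. The function names are hypothetical and this is not EasyDeL's code.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] and return them with the scale."""
    absmax = max((abs(v) for v in values), default=0.0)
    # Avoid dividing by zero when every value is 0.
    scale = absmax / 127.0 if absmax > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize_int8(quantized, scale):
    """Recover approximate floats from the int8 codes and their scale."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.0, 0.25, 0.0]
    codes, scale = quantize_int8(weights)
    print(codes, scale)
    print(dequantize_int8(codes, scale))
```

Each parameter is stored as one byte plus a shared scale, and the round trip loses at most half a quantization step per value.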

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.60...0.0.61

0.0.60

What's Changed

* `SFTTrainer` is now available.
* `VideoCausalLanguageModelTrainer` is now available.
* New models such as Grok-1, Qwen2Moe, Mamba, Rwkv, and Whisper are available.
* MoE models had some speed improvements.
* Training Speed is now 18%~42% faster.
* Normal Attention is now faster by 12%~30% (131).
* DPOTrainer Bugs Fixed.
* CausalLanguageModelTrainer is now more customizable.
* WANDB logging has improved.
* Performance Mode is added to Training Arguments.
* Model configs pass attributes to PretrainedConfig to prevent override… by yhavinga in https://github.com/erfanzar/EasyDeL/pull/122
* Ignore token label smooth z loss by yhavinga in https://github.com/erfanzar/EasyDeL/pull/123
* Time the whole train loop instead of only call to train step function by yhavinga in https://github.com/erfanzar/EasyDeL/pull/124
* Add save_total_limit argument to delete older checkpoints by yhavinga in https://github.com/erfanzar/EasyDeL/pull/127
* Add gradient norm logging, fix metric collection on multi-worker setup by yhavinga in https://github.com/erfanzar/EasyDeL/pull/135
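
The pull request titled "Ignore token label smooth z loss" is not described further here. As a rough sketch of what those three terms usually mean, assuming the common definitions (skip positions whose label equals an ignore index, smooth the one-hot target, and add a penalty on the squared log-partition value), a pure-Python version could look like this. All names and defaults are hypothetical and this is not the code from the pull request.

```python
import math


def token_loss(logits, label, label_smoothing=0.0, z_loss_weight=0.0):
    """Cross-entropy for one token with label smoothing and a z-loss term."""
    m = max(logits)
    # Numerically stable logsumexp: log of the softmax normalizer Z.
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    vocab = len(logits)
    # Smoothed target: (1 - eps) on the true label plus eps / vocab everywhere.
    nll = -(1.0 - label_smoothing) * log_probs[label]
    nll -= label_smoothing * sum(log_probs) / vocab
    # z-loss penalizes large |log Z| to keep logits from drifting.
    return nll + z_loss_weight * log_z ** 2


def mean_loss(batch_logits, labels, ignore_index=-100, **kwargs):
    """Average token_loss over positions whose label is not ignore_index."""
    losses = [
        token_loss(logits, label, **kwargs)
        for logits, label in zip(batch_logits, labels)
        if label != ignore_index
    ]
    return sum(losses) / len(losses) if losses else 0.0
```

Ignored positions (for example padding) contribute neither to the sum nor to the denominator, so they do not dilute the mean.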

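
The `save_total_limit` argument is described as deleting older checkpoints. A minimal sketch of that idea, assuming checkpoints are stored as entries named `step-<N>` inside one directory (a hypothetical layout, not necessarily the one EasyDeL uses), might be:

```python
import os
import shutil
import tempfile


def prune_checkpoints(checkpoint_dir, save_total_limit, prefix="step-"):
    """Delete the oldest checkpoints so at most `save_total_limit` remain.

    Returns the names of the removed entries, oldest first.
    """
    if save_total_limit is None or save_total_limit < 1:
        return []  # no limit configured: keep everything
    names = [
        name for name in os.listdir(checkpoint_dir)
        if name.startswith(prefix) and name[len(prefix):].isdigit()
    ]
    # Order by training step, not alphabetically ("step-1000" > "step-200").
    names.sort(key=lambda name: int(name[len(prefix):]))
    stale = names[:-save_total_limit]
    for name in stale:
        path = os.path.join(checkpoint_dir, name)
        if os.path.isdir(path):
            shutil.rmtree(path)
        else:
            os.remove(path)
    return stale


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        for step in (100, 200, 1000):
            os.mkdir(os.path.join(root, f"step-{step}"))
        print(prune_checkpoints(root, save_total_limit=2))
```

Entries that do not match the naming pattern are left untouched, so unrelated files in the directory are never deleted.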

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.55...0.0.60

0.0.55

- JAX `DPOTrainer` bugs are fixed.
- StableLM models are supported with FlashAttention and RingAttention.
- RingAttention is supported for training and inference with up to 512K or 1M tokens.
- Chunked MLP is supported for training and inference with up to 512K or 1M tokens.
- All models now support sharded key and value caching for high-context-length inference, enabled via `use_sharded_kv_caching=True` in the model config (see examples).
- EasyDeL successfully passed inference with a context length of 1256000 on TPUs (tested with the Llama model).
- Vision Trainer is added; you might expect some bugs from it.
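
The notes do not explain how the chunked MLP works. One common reading, assumed here, is that the position-wise feed-forward layer is applied to slices of the sequence so that hidden activations exist for only one chunk at a time. The pure-Python sketch below illustrates that idea and checks that chunking does not change the result. The function names and the ReLU activation are assumptions, not EasyDeL's implementation.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def mlp_block(tokens, w_in, w_out):
    """Apply ReLU(token @ w_in) @ w_out to every token in the block.

    `w_in` and `w_out` are lists of weight columns. Hidden activations for
    the whole block are materialized at once.
    """
    hidden = [[max(0.0, dot(t, col)) for col in w_in] for t in tokens]
    return [[dot(h, col) for col in w_out] for h in hidden]


def chunked_mlp(tokens, w_in, w_out, chunk_size):
    """Same result as mlp_block, computed one chunk of tokens at a time."""
    outputs = []
    for start in range(0, len(tokens), chunk_size):
        chunk = tokens[start:start + chunk_size]
        # Only this chunk's hidden activations are alive at once.
        outputs.extend(mlp_block(chunk, w_in, w_out))
    return outputs
```

Because the MLP acts on each position independently, the chunked version trades a little loop overhead for a peak activation size that scales with `chunk_size` instead of the full sequence length.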

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.50...0.0.55

0.0.50

What's Changed
* Optimize mean loss and accuracy calculation by yhavinga in https://github.com/erfanzar/EasyDeL/pull/100
* Mixtral Models are fully supported and they are `PJIT-compatible`
* A Wider range of models now support FlashAttention on TPU
* Qwen 1, Qwen 2, PHI 2, and Roberta are newly added models that support FlashAttention on TPU and `EasyBIT`.
* LoRA support for the trainer is now added (`EasyDeLXRapTureConfig`).
* EasyDeL Serve Engine APIs are added.
* Prompter is added (beta; it might be removed in future updates).
* The training process is now 21% faster in `0.0.50` than in `0.0.42`.
* Transform functions are now automated for all the models (except `MosaicMPT`, for which you still have to use static methods).
* The Trainer APIs have changed and are now faster, more dynamic, and more hackable.
* The default JAX version is now 0.4.22, for `FJFormer` custom Pallas kernels usage.

New Contributors
* yhavinga made their first contribution at https://github.com/erfanzar/EasyDeL/pull/100

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.42...0.0.50
