EasyDeL

Latest version: v0.0.67

0.0.66

- New Features
  - `GenerationPipeLine` was added for fast streaming and easy generation with JAX (a usage sketch follows this section).
  - `Int8Params` is now used instead of `LinearBitKernel`.
  - Better GPU support.
  - `EasyDeLState` has been improved and supports more general options.
  - Trainers now support `.save_pretrained(to_torch)` and training logging.
  - `EasyDeLState` now supports `to_8bit`.
  - All models now support `to_8bit` for params.
  - Imports are now 91x faster (as of EasyDeL version 0.0.67).

- Removed API
  - `JAXServe` is no longer available.
  - `PyTorchServe` is no longer available.
  - `EasyServe` is no longer available.
  - `LinearBitKernel` is no longer available.
  - `EasyDeL` partitioners are no longer available.
  - `Llama/Mistral/Falcon/Mpt` static converters and transforms are no longer available.

- Known Issues
  - The LoRA kernel sometimes crashes.
  - `GenerationPipeLine` has a compilation problem when more than 4 devices are available and 8-bit params are used.
  - Most features won't work on _TPU-v3_ or on GPUs with compute capability lower than _7.5_.
  - Kaggle sessions crash after importing EasyDeL (Kaggle's latest environment is unstable; this is not related to EasyDeL). Fixed in EasyDeL version 0.0.67.
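
To make the 0.0.66 additions concrete, here is a rough sketch of loading a model, quantizing its parameters with `to_8bit`, and streaming text through `GenerationPipeLine`. Only the class and method names come from this changelog; the constructor arguments, the `(model, params)` return shape, and the `AutoTokenizer` usage are assumptions and may differ from the real API.

```python
# Illustrative sketch only -- argument names and return shapes are assumptions
# built around the identifiers mentioned in this changelog.
import easydel as ed
from transformers import AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # any causal-LM checkpoint

# Assumed to mirror the Hugging Face from_pretrained convention and to
# return the Flax module together with its parameters.
model, params = ed.AutoEasyDeLModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# "All models now support `to_8bit` for params" -- assumed call shape.
params = model.to_8bit(params)

# GenerationPipeLine is described as a fast streaming generator; the
# constructor and generate() signatures below are illustrative only.
pipeline = ed.GenerationPipeLine(model=model, params=params, tokenizer=tokenizer)
for piece in pipeline.generate("Explain JAX sharding in one sentence."):
    print(piece, end="", flush=True)
```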

0.0.65

- New Features
  - Pallas Flash Attention is available on CPU/GPU/TPU via FJFormer and supports bias.
  - The ORPO Trainer has been added.
  - WebSocket Serve Engine.
  - EasyDeL is now 30% faster on GPUs.
  - JAX-Triton is no longer needed to run GPU kernels.
  - You can now specify the backward kernel implementation for Pallas Attention.
  - EasyDeL must now be imported as `easydel` instead of `EasyDel` (see the snippet after this entry).

- New Models
  - OpenELM model series are now available.
  - DeepseekV2 model series are now available.

- Fixed Bugs
  - CUDNN FlashAttention bugs are now fixed.
  - 8-bit parameter quantization for the Llama3 model has been significantly improved.
  - Splash Attention bugs on TPUs are now fixed.
  - Dbrx model bugs are fixed.
  - DPOTrainer bugs are fixed (dataset creation).

- Known Bugs
  - Splash Attention won't work on TPU-v3.
  - Pallas Attention won't work on TPU-v3.
  - You need to install `flash_attn` in order to convert HF DeepseekV2 to EasyDeL (a bug in the original authors' DeepseekV2 implementation).
  - Some examples are outdated.


**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.63...0.0.65
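
The package rename above is the change most users will hit first; the snippet below shows the before/after import. The `ed` alias is just a convention, and the `__version__` check assumes the package exposes that attribute.

```python
# Prior to 0.0.65 the package was imported with a capitalized name:
#   import EasyDel as ed        # no longer works
# From 0.0.65 onward it must be imported in lowercase:
import easydel as ed

print(ed.__version__)  # assumes the package exposes __version__, as most do
```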

0.0.63

What's Changed

- Phi3 Model Added.
- Dbrx Model Added.
- Arctic Model Added.
- Lora Fine-Tuning Bugs Fixed.
- Vanilla Attention is Optimized.
- Sharded Vanilla Attention is now the default attention mechanism.

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.61...0.0.63

0.0.61

What's Changed
* Add support for iterable dataset loading by yhavinga in https://github.com/erfanzar/EasyDeL/pull/138
* `SFTTrainer` bugs are fixed.
* `Parameter quantization` is now available for all models.
* `AutoEasyDeLModelForCausalLM` now supports `load_in_8bit` (see the sketch after this entry).
* Memory management has been improved.
* The `Gemma` models' generation issue is now fixed.
* Trainers are now 2~8% faster.
* Attention Operation is improved.
* The `Cohere` Model is now present.
* `JAXServer` is improved.
* Due to recent changes, many documentation examples have changed and more will be updated soon.

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.60...0.0.61
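
The `load_in_8bit` flag mentioned above would presumably be passed at load time, roughly as in the sketch below. Only the class name and the keyword come from this changelog; the repo id, the `(model, params)` return shape, and the overall `from_pretrained` signature are assumptions.

```python
import easydel as ed

# Hypothetical: quantize parameters to 8-bit while loading.
# Only `AutoEasyDeLModelForCausalLM` and `load_in_8bit` are named in this
# changelog; everything else here is illustrative.
model, params = ed.AutoEasyDeLModelForCausalLM.from_pretrained(
    "google/gemma-2b",
    load_in_8bit=True,
)
```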

0.0.60

What's Changed

* `SFTTrainer` is now available.
* `VideoCausalLanguageModelTrainer` is now available.
* New models such as Grok-1, Qwen2Moe, Mamba, Rwkv, and Whisper are available.
* MoE models had some speed improvements.
* Training Speed is now 18%~42% faster.
* Normal Attention is now 12%~30% faster.
* DPOTrainer Bugs Fixed.
* CausalLanguageModelTrainer is now more customizable.
* WANDB logging has improved.
* Performance Mode has been added to Training Arguments.
* Model configs pass attributes to PretrainedConfig to prevent override… by yhavinga in https://github.com/erfanzar/EasyDeL/pull/122
* Ignore token label smooth z loss by yhavinga in https://github.com/erfanzar/EasyDeL/pull/123
* Time the whole train loop instead of only call to train step function by yhavinga in https://github.com/erfanzar/EasyDeL/pull/124
* Add save_total_limit argument to delete older checkpoints by yhavinga in https://github.com/erfanzar/EasyDeL/pull/127 (a sketch of this argument follows this entry).
* Add gradient norm logging, fix metric collection on multi-worker setup by yhavinga in https://github.com/erfanzar/EasyDeL/pull/135


**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.55...0.0.60
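
A minimal sketch of the checkpoint-retention setting from the `save_total_limit` PR above. Only the argument name comes from this changelog; the `TrainingArguments` class name and the other fields are assumptions about EasyDeL's trainer configuration.

```python
import easydel as ed

# Hypothetical training-arguments object: keep only the two newest
# checkpoints and delete older ones. `save_total_limit` is the argument
# named in this changelog; the class name and other fields are assumptions.
train_args = ed.TrainingArguments(
    model_name="my-finetune-run",  # illustrative
    num_train_epochs=1,            # illustrative
    save_total_limit=2,            # prune all but the 2 most recent checkpoints
)
```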

0.0.55

- JAX `DPOTrainer` bugs fixed.
- StableLM models are supported with FlashAttention and RING-Attention.
- RingAttention is supported for training and inference at up to 512K or 1M tokens.
- Chunked MLP is supported for training and inference at up to 512K or 1M tokens.
- All models now support sharded key/value caching for high-context-length inference, which can be enabled via `use_sharded_kv_caching=True` in the model config (see the examples and the sketch after this entry).
- EasyDeL successfully ran inference at a context length of 1,256,000 tokens on TPUs (tested with a Llama model).
- A Vision Trainer has been added; expect some bugs from it.

**Full Changelog**: https://github.com/erfanzar/EasyDeL/compare/0.0.50...0.0.55
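
A sketch of turning on the sharded KV cache mentioned above for long-context inference. The flag name `use_sharded_kv_caching` comes from this changelog; the `LlamaConfig` class and the other attributes used here are assumptions.

```python
import easydel as ed

# Hypothetical: enable sharded key/value caching for long-context inference.
# `use_sharded_kv_caching` is the flag named in this changelog; the config
# class and the other attributes below are illustrative only.
config = ed.LlamaConfig(
    max_position_embeddings=1_048_576,  # illustrative long-context setting
    use_sharded_kv_caching=True,
)
```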
