Megatron-energon

Latest version: v4.0.0

4.0.0

What's Changed
* Enable adding additional data by joining with another dataset, by voegtlel and philipp-fischer in https://github.com/NVIDIA/Megatron-Energon/pull/20
* Replace the dataset type in `dataset.yaml` with the sample type directly, by voegtlel and philipp-fischer in https://github.com/NVIDIA/Megatron-Energon/pull/29


**Full Changelog**: https://github.com/NVIDIA/Megatron-Energon/compare/3.0.1...4.0.0

3.0.1

What's Changed
* Fix `AttributeError: module 'fsspec' has no attribute 'asyn'` by philipp-fischer in https://github.com/NVIDIA/Megatron-Energon/pull/26
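
The release note does not say what caused the error. As general background, an `AttributeError` of this shape often appears in Python when a submodule is accessed as an attribute of its parent package without having been imported explicitly. The sketch below reproduces that general behavior with a throwaway package; `demo_pkg` is a hypothetical name invented for the illustration, and nothing here is taken from Energon or from the linked pull request.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package "demo_pkg" (hypothetical) with one submodule "asyn".
with tempfile.TemporaryDirectory() as tmp:
    pkg_dir = Path(tmp) / "demo_pkg"
    pkg_dir.mkdir()
    (pkg_dir / "__init__.py").write_text("")  # parent does not import its submodule
    (pkg_dir / "asyn.py").write_text("VALUE = 42\n")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        import demo_pkg

        # Importing the parent alone does not bind the submodule as an attribute.
        before = hasattr(demo_pkg, "asyn")

        # An explicit submodule import binds it on the parent package.
        import demo_pkg.asyn

        after = hasattr(demo_pkg, "asyn")
    finally:
        sys.path.remove(tmp)

print(before, after)
```

If a similar error shows up with another package, importing the submodule explicitly (for example `import fsspec.asyn`) before using it is a low-risk thing to try.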


**Full Changelog**: https://github.com/NVIDIA/Megatron-Energon/compare/3.0.0...3.0.1

3.0.0

What's Changed
* Allow reproducible scaling with different micro batch sizes, by philipp-fischer in https://github.com/NVIDIA/Megatron-Energon/pull/11
* Introduce sequence packing and sample restore, by voegtlel and philipp-fischer in https://github.com/NVIDIA/Megatron-Energon/pull/12
* Add `energon info` command, by voegtlel in https://github.com/NVIDIA/Megatron-Energon/pull/21
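
For readers unfamiliar with the term, sequence packing generally means grouping several short samples into one fixed-length sequence so that less of each sequence is wasted on padding. The sketch below shows the idea with a simple first-fit strategy. It is an illustration only, not Energon's implementation or API: `pack_first_fit` and its parameters are hypothetical names, and the packing strategy a real pipeline uses may differ.

```python
from typing import List


def pack_first_fit(lengths: List[int], max_len: int) -> List[List[int]]:
    """Group sample indices into packs whose total length is <= max_len.

    Hypothetical helper for illustration; uses a first-fit strategy.
    """
    packs: List[List[int]] = []  # sample indices in each pack
    remaining: List[int] = []    # free space left in each pack
    for idx, n in enumerate(lengths):
        if n > max_len:
            raise ValueError(f"sample {idx} (length {n}) exceeds max_len={max_len}")
        for p, free in enumerate(remaining):
            if n <= free:
                # The sample fits into an existing pack.
                packs[p].append(idx)
                remaining[p] -= n
                break
        else:
            # No existing pack has room, so open a new one.
            packs.append([idx])
            remaining.append(max_len - n)
    return packs


packs = pack_first_fit([5, 3, 4, 2, 6], max_len=8)
print(packs)  # [[0, 1], [2, 3], [4]]
```

Here five samples fit into three sequences of length 8 instead of five padded ones.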


**Full Changelog**: https://github.com/NVIDIA/Megatron-Energon/compare/2.3.0...3.0.0

2.3.0

What's Changed
* Support loading datasets via dict() & update SimilarityInterleaved by paul-gibbons in https://github.com/NVIDIA/Megatron-Energon/pull/4
* Fix gc_init_worker for ShardedTensor by philipp-fischer in https://github.com/NVIDIA/Megatron-Energon/pull/7
* Fix dataloader getting stuck while iterating samples under a certain condition, by voegtlel in https://github.com/NVIDIA/Megatron-Energon/pull/6
* Fix epath with relative usage, by voegtlel in https://github.com/NVIDIA/Megatron-Energon/pull/3

New Contributors
* paul-gibbons made their first contribution in https://github.com/NVIDIA/Megatron-Energon/pull/4

**Full Changelog**: https://github.com/NVIDIA/Megatron-Energon/compare/2.2.0...2.3.0

2.2.0

**Full Changelog**: https://github.com/NVIDIA/Megatron-Energon/compare/2.1.1...2.2.0

Version 2.2.0 introduces support for video files in your dataset and adds new sample and dataset classes for video question answering (VidQA).
The new classes are `VidQASample` and `VidQAWebdataset`.

If your dataset shards contain videos, they should now be decoded automatically, yielding a `VideoData` object that contains the frames, the audio data, and some metadata.

2.1.1

You can find the installable package here: https://pypi.org/project/megatron-energon/2.1.1/
