We are excited to share Composer v0.5, a library of speed-up methods for efficient neural network training. This release features:
* Revamped checkpointing API based on community feedback
* New baselines: ResNet34-SSD, GPT-3, and Vision Transformers
* Additional improvements to our [documentation](https://docs.mosaicml.com/en/latest/)
* Support for `bfloat16`
* Streaming dataset support
* Unified functional API for our algorithms
## Highlights
### Checkpointing API
Checkpointing is now implemented as a Callback, so users can easily write and add their own checkpointing callbacks. The callback is automatically added if a `save_folder` is provided to the Trainer:
```python
trainer = Trainer(
    model=model,
    algorithms=algorithms,
    save_folder="checkpoints",
    save_interval="1ep"
)
```
Alternatively, `CheckpointSaver` can be directly added as a callback:
```python
trainer = Trainer(..., callbacks=[
    CheckpointSaver(
        save_folder='checkpoints',
        name_format="ep{epoch}-ba{batch}/rank_{rank}",
        save_latest_format="latest/rank_{rank}",
        save_interval="1ep",
        weights_only=False,
    )
])
```
Subclass `CheckpointSaver` to add your own logic, such as saving only the best model or saving at specific intervals. Thanks to mansheej, siriuslee, and other users for their feedback.
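As a minimal sketch of that pattern, the subclass below only writes a checkpoint when a monitored eval metric improves. The import path, the `epoch_checkpoint` hook, and the way metrics are read from `state` are assumptions about Composer's internals, so adapt them to your version:

```python
from composer.callbacks import CheckpointSaver  # import path may differ by version


class BestCheckpointSaver(CheckpointSaver):
    """Sketch: only keep checkpoints when a monitored eval metric improves."""

    def __init__(self, metric_name="accuracy", **kwargs):
        super().__init__(**kwargs)
        self.metric_name = metric_name
        self.best_value = float("-inf")

    def epoch_checkpoint(self, state, logger):
        # Assumption: eval metrics are readable from `state` like this, and the
        # parent class performs its save when this event hook fires.
        current = getattr(state, "current_metrics", {}).get("eval", {}).get(self.metric_name)
        if current is not None and current <= self.best_value:
            return  # skip saving; the model has not improved
        if current is not None:
            self.best_value = current
        super().epoch_checkpoint(state, logger)  # defer to the parent's save logic
```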
### `bfloat16`
We've added experimental support for `bfloat16`, which can be provided via the `precision` argument to the Trainer:
```python
trainer = Trainer(
    ...,
    precision="bfloat16"
)
```
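For orientation only (this is plain PyTorch, not the Composer API), `bfloat16` mixed precision on CUDA corresponds roughly to wrapping the forward pass in autocast with `dtype=torch.bfloat16`, which requires PyTorch 1.10+ and hardware with native `bfloat16` support such as A100s:

```python
import torch
from torch import nn

# Toy model and batch; requires a GPU with native bfloat16 support (e.g. A100).
model = nn.Linear(16, 4).cuda()
inputs = torch.randn(8, 16, device="cuda")
targets = torch.randint(0, 4, (8,), device="cuda")

# The forward pass runs in bfloat16 where safe; gradients are computed as usual.
with torch.cuda.amp.autocast(dtype=torch.bfloat16):
    loss = nn.functional.cross_entropy(model(inputs), targets)
loss.backward()
```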
### Streaming datasets
We've added support for fast streaming datasets. For NLP datasets such as C4, we use the HuggingFace `datasets` backend and add dataset-specific shuffling, tokenization, and grouping on the fly. To support data-parallel training, we added dedicated sharding logic for efficiency. See `C4Datasets` for more details.
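Conceptually, the pipeline resembles streaming C4 directly with the HuggingFace `datasets` library, as in the illustration below; the tokenizer choice, buffer size, and sequence length are arbitrary placeholders, not Composer's settings:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Stream C4 without downloading the full corpus; shuffle and tokenize on the fly.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
stream = load_dataset("c4", "en", split="train", streaming=True)
stream = stream.shuffle(buffer_size=10_000, seed=17)
stream = stream.map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
    batched=True,
)

for sample in stream.take(2):
    print(len(sample["input_ids"]))
```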
Vision streaming datasets are supported via a patched version of the `webdataset` package, with added support for sharding data across workers for fast augmentations. See `composer.datasets.webdataset` for more details.
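For reference, streaming a sharded vision dataset with the upstream `webdataset` package looks like the sketch below; the shard URL pattern is a placeholder, and Composer's patched version adds the per-worker sharding described above:

```python
import webdataset as wds

# Placeholder shard pattern; each worker streams its own subset of shards.
urls = "https://example.com/shards/train-{000000..000146}.tar"

dataset = (
    wds.WebDataset(urls)
    .shuffle(1000)               # shuffle within a streaming buffer
    .decode("pil")               # decode stored images to PIL
    .to_tuple("jpg;png", "cls")  # yield (image, label) pairs
)

image, label = next(iter(dataset))
```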
### Baseline GPT-3, ResNet34-SSD, and Vision Transformer benchmarks
Configurations for GPT-3-like models ranging from 125m to 760m parameters are now released, and use DeepSpeed ZeRO Stage 0 for memory-efficient training (see the sketch after this list):
* [GPT3-125m](https://github.com/mosaicml/composer/blob/v0.5.0/composer/yamls/models/gpt3_125m.yaml)
* [GPT3-350m](https://github.com/mosaicml/composer/blob/v0.5.0/composer/yamls/models/gpt3_350m.yaml)
* [GPT3-760m](https://github.com/mosaicml/composer/blob/v0.5.0/composer/yamls/models/gpt3_760m.yaml)
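For readers unfamiliar with ZeRO stages, Stage 0 means DeepSpeed's optimizer-state partitioning is turned off, i.e. plain data-parallel training; in a standalone DeepSpeed config it is just the `zero_optimization` block below. The batch-size values are placeholders, and the released YAMLs handle this wiring for you:

```python
# Minimal DeepSpeed-style config dict illustrating ZeRO Stage 0.
deepspeed_config = {
    "train_micro_batch_size_per_gpu": 8,  # placeholder
    "gradient_accumulation_steps": 1,     # placeholder
    "zero_optimization": {"stage": 0},    # Stage 0 = no ZeRO partitioning
}
```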
We've also added the Single Shot MultiBox Detector (SSD) model ([Liu et al., 2016](https://arxiv.org/abs/1512.02325)) with a ResNet34 backbone, based on the MLPerf reference implementation.
Our first Vision Transformer benchmark is the ViT-S/16 model from [Touvron et al., 2021](https://arxiv.org/pdf/2012.12877.pdf), and is based on the `vit-pytorch` package.
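For context, a ViT-S/16-style model in the `vit-pytorch` package looks roughly like the following (16x16 patches, width 384, 12 blocks, 6 heads); the exact hyperparameters of the benchmark may differ:

```python
import torch
from vit_pytorch import ViT

# ViT-S/16-style configuration; hyperparameters are illustrative.
model = ViT(
    image_size=224,
    patch_size=16,
    num_classes=1000,
    dim=384,
    depth=12,
    heads=6,
    mlp_dim=1536,
)

logits = model(torch.randn(2, 3, 224, 224))  # -> shape (2, 1000)
```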
See below for the full details:
## What's Changed
* Export Transforms in `composer.algorithms` by ajaysaini725 in https://github.com/mosaicml/composer/pull/603
* Make batchnorm default for UNet by dskhudia in https://github.com/mosaicml/composer/pull/535
* Fix no_op_model algorithm by dskhudia in https://github.com/mosaicml/composer/pull/614
* Pin pre-1.0 packages by bandish-shah in https://github.com/mosaicml/composer/pull/595
* Updated dark mode composer logo, and graph by nqn in https://github.com/mosaicml/composer/pull/617
* Jenkins + Docker Improvements by ravi-mosaicml in https://github.com/mosaicml/composer/pull/621
* update README links by hanlint in https://github.com/mosaicml/composer/pull/628
* Remove all old timing calls by ravi-mosaicml in https://github.com/mosaicml/composer/pull/594
* Remove state shorthand by mvpatel2000 in https://github.com/mosaicml/composer/pull/629
* add bfloat16 support by nikhilsardana in https://github.com/mosaicml/composer/pull/433
* v0.4.0 Hotfix: Docker documentation updates by bandish-shah in https://github.com/mosaicml/composer/pull/631
* Fix wrong icons in the method cards by hanlint in https://github.com/mosaicml/composer/pull/636
* fix autocast for pytorch < 1.10 by nikhilsardana in https://github.com/mosaicml/composer/pull/639
* Add tutorial notebooks to the README by moinnadeem in https://github.com/mosaicml/composer/pull/630
* Converted Stateless Schedulers to Classes by ravi-mosaicml in https://github.com/mosaicml/composer/pull/632
* Jenkinsfile Fixes Part 2 by ravi-mosaicml in https://github.com/mosaicml/composer/pull/627
* Add C4 Streaming dataset by abhi-mosaic in https://github.com/mosaicml/composer/pull/489
* CONTRIBUTING.md additions by kobindra in https://github.com/mosaicml/composer/pull/648
* Hide showing `object` as a base class; fix skipping documentation of `forward`; fixed docutils dependency. by ravi-mosaicml in https://github.com/mosaicml/composer/pull/643
* Matthew/functional docstrings update by growlix in https://github.com/mosaicml/composer/pull/622
* docstrings improvements for core modules by dskhudia in https://github.com/mosaicml/composer/pull/598
* ssd-resnet34 on COCO map 0.23 by florescl in https://github.com/mosaicml/composer/pull/646
* Fix broken "best practices" link by growlix in https://github.com/mosaicml/composer/pull/649
* Update progressive resizing to work for semantic segmentation by coryMosaicML in https://github.com/mosaicml/composer/pull/604
* Let C4 Dataset overwrite `num_workers` if set incorrectly by abhi-mosaic in https://github.com/mosaicml/composer/pull/655
* Lazy imports for `pycocotools` by abhi-mosaic in https://github.com/mosaicml/composer/pull/656
* W&B excludes final eval metrics when plotted as a fxn of epoch or trainer/global_step by growlix in https://github.com/mosaicml/composer/pull/633
* Update GPT3-yamls for default 8xA100-40GB by abhi-mosaic in https://github.com/mosaicml/composer/pull/663
* Set WandB default to log rank zero only by abhi-mosaic in https://github.com/mosaicml/composer/pull/461
* Update schedulers guide by hanlint in https://github.com/mosaicml/composer/pull/661
* [XS] Fix a TQDM deserialization bug by jbloxham in https://github.com/mosaicml/composer/pull/665
* Add defaults to the docstrings for algorithms by hanlint in https://github.com/mosaicml/composer/pull/662
* Fix ZeRO config by jbloxham in https://github.com/mosaicml/composer/pull/667
* [XS] fix formatting for colout by hanlint in https://github.com/mosaicml/composer/pull/666
* Composer.core docstring touch-up by ravi-mosaicml in https://github.com/mosaicml/composer/pull/657
* Add Uniform bounding box sampling option for CutOut and CutMix by coryMosaicML in https://github.com/mosaicml/composer/pull/634
* Update README.md by ravi-mosaicml in https://github.com/mosaicml/composer/pull/678
* Fix bug in trainer test by hanlint in https://github.com/mosaicml/composer/pull/651
* InMemoryLogger has get_timeseries() method by growlix in https://github.com/mosaicml/composer/pull/644
* Batchwise resolution for SWA by growlix in https://github.com/mosaicml/composer/pull/654
* Fixed the conda build script so it runs on jenkins by ravi-mosaicml in https://github.com/mosaicml/composer/pull/676
* Yahp version update to 0.1.0 by Averylamp in https://github.com/mosaicml/composer/pull/674
* Streaming vision datasets by knighton in https://github.com/mosaicml/composer/pull/284
* Fix DeepSpeed checkpointing by jbloxham in https://github.com/mosaicml/composer/pull/686
* Vit by A-Jacobson in https://github.com/mosaicml/composer/pull/243
* [S] cleanup tldr; standardize `__all__` by hanlint in https://github.com/mosaicml/composer/pull/688
* Unify algorithms part 2: mixup, cutmix, label smoothing by dblalock in https://github.com/mosaicml/composer/pull/658
* `composer.optim` docstrings by jbloxham in https://github.com/mosaicml/composer/pull/653
* Fix DatasetHparams, WebDatasetHparams docstring by growlix in https://github.com/mosaicml/composer/pull/697
* Models docstrings by A-Jacobson in https://github.com/mosaicml/composer/pull/469
* docstrings improvements for composer.datasets by dskhudia in https://github.com/mosaicml/composer/pull/694
* Updated contributing.md and the style guide by ravi-mosaicml in https://github.com/mosaicml/composer/pull/670
* Ability to retry ADE20k crop transform by Landanjs in https://github.com/mosaicml/composer/pull/702
* Add mmsegmentation DeepLabv3(+) by Landanjs in https://github.com/mosaicml/composer/pull/684
* Unify functional API part 3 by dblalock in https://github.com/mosaicml/composer/pull/715
* Update example notebooks by coryMosaicML in https://github.com/mosaicml/composer/pull/707
* [Checkpointing - PR1] Store the `rank_zero_seed` on state by ravi-mosaicml in https://github.com/mosaicml/composer/pull/680
* [Checkpointing - PR2] Added in new Checkpointing Events by ravi-mosaicml in https://github.com/mosaicml/composer/pull/690
* [Checkpointing - PR3] Clean up RNG and State serialization by ravi-mosaicml in https://github.com/mosaicml/composer/pull/692
* [Checkpointing - PR4] Refactored the `CheckpointLoader` into a `load_checkpoint` function by ravi-mosaicml in https://github.com/mosaicml/composer/pull/693
* Update {blurpool,factorize,ghostbn} method cards by dblalock in https://github.com/mosaicml/composer/pull/711
* [Checkpointing - PR 5] Move the `CheckpointSaver` to a callback. by ravi-mosaicml in https://github.com/mosaicml/composer/pull/687
* Update datasets docstrings by growlix in https://github.com/mosaicml/composer/pull/709
* add notebooks and functional api by hanlint in https://github.com/mosaicml/composer/pull/714
* Migrating from PTL notebook by florescl in https://github.com/mosaicml/composer/pull/436
* Docs 0.4.1: Profiler section and tutorials by bandish-shah in https://github.com/mosaicml/composer/pull/696
* Improve datasets docstrings by knighton in https://github.com/mosaicml/composer/pull/695
* Update `C4Dataset` to repeat, handle `max_samples` safely by abhi-mosaic in https://github.com/mosaicml/composer/pull/722
* Fix docs build by ravi-mosaicml in https://github.com/mosaicml/composer/pull/773