Composer

0.10.1

```bash
pip install --upgrade mosaicml==0.10.1
```


New Features

1. **𐄷 Weight Standardization**

Weight Standardization reparametrizes convolutional weights such that the fan-in dimensions have zero mean and unit standard deviation. This can slightly improve performance at the expense of 5% lower throughput. This technique has been used in several papers to train with smaller batch sizes, with normalization layers besides batch norm, and for transfer learning.


Using Weight Standardization with the Composer Trainer:

```python
import composer

# Apply Weight Standardization (when training is initialized)
weight_std = composer.algorithms.WeightStandardization()

# Train with Weight Standardization
trainer = composer.trainer.Trainer(
    ...,
    algorithms=[weight_std]
)
trainer.fit()
```


Using Weight Standardization with the Composer functional interface:

```python
import composer
from torchvision.models import resnet50

my_model = resnet50()

# Apply weight standardization to the model
my_model = composer.functional.apply_weight_standardization(my_model)
```


Please see the [Weight Standardization Method Card](https://docs.mosaicml.com/en/stable/method_cards/weight_standardization.html) for more details.

Bug Fixes

* Fix for checkpoints not being saved automatically at the end of a run (1552)
* Fix Onnx export for Composer HuggingFaceModels (1557)
* Fix for MIoU metric producing NaN's (1558)
* CometML logger documentation updates and fixes (1567, 1570, 1571)
* WandB image visualizer fix (1591)

What's Changed
* Update evaluate_periodically() when eval interval is of type Duration by karan6181 in https://github.com/mosaicml/composer/pull/1523
* Quality of life updates to EMA by coryMosaicML in https://github.com/mosaicml/composer/pull/1524
* Add ADE20K and COCO v2 dataset behind a version flag by karan6181 in https://github.com/mosaicml/composer/pull/1528
* Pinned setuptools version to fix distutils version error by karan6181 in https://github.com/mosaicml/composer/pull/1536
* Less strict name formatting by hanlint in https://github.com/mosaicml/composer/pull/1535
* Defaulting streaming dataset version to 1 and add a deprecation warning by karan6181 in https://github.com/mosaicml/composer/pull/1532
* Changing 'stable' to 'latest' in notebooks in examples by bcui19 in https://github.com/mosaicml/composer/pull/1534
* Bump furo from 2022.6.21 to 2022.9.15 by dependabot in https://github.com/mosaicml/composer/pull/1540
* Bump fasteners from 0.17.3 to 0.18 by dependabot in https://github.com/mosaicml/composer/pull/1538
* Add Pandoc to Docker images, bump version to 2.19.2 by bandish-shah in https://github.com/mosaicml/composer/pull/1550
* Removed streaming version 2 from yaml since version 1 is default by karan6181 in https://github.com/mosaicml/composer/pull/1551
* Bump ipykernel from 6.15.2 to 6.15.3 by dependabot in https://github.com/mosaicml/composer/pull/1548
* Bump yamllint from 1.27.1 to 1.28.0 by dependabot in https://github.com/mosaicml/composer/pull/1546
* Bump traitlets from 5.3.0 to 5.4.0 by dependabot in https://github.com/mosaicml/composer/pull/1539
* Object Store Logger Race Condition + EMA Fix by mvpatel2000 in https://github.com/mosaicml/composer/pull/1552
* Adding in erroring for when using GradMonitor and DeepSpeed by bcui19 in https://github.com/mosaicml/composer/pull/1555
* Bump pypandoc from 1.8.1 to 1.9 by dependabot in https://github.com/mosaicml/composer/pull/1559
* Update context to raise errror by mvpatel2000 in https://github.com/mosaicml/composer/pull/1561
* Fix MIoU metric when `self.total_union==0` by abhi-mosaic in https://github.com/mosaicml/composer/pull/1558
* Move dataloader `initialize_object` to factory methods by hanlint in https://github.com/mosaicml/composer/pull/1510
* Weight Standardization method by Landanjs in https://github.com/mosaicml/composer/pull/1562
* Update comet links to include query params and point to main site by dakinggg in https://github.com/mosaicml/composer/pull/1567
* remove dead line in alibi by mvpatel2000 in https://github.com/mosaicml/composer/pull/1568
* GLU Fixes by mvpatel2000 in https://github.com/mosaicml/composer/pull/1564
* Add FSDP strategy by abhi-mosaic in https://github.com/mosaicml/composer/pull/1553
* Comet example by dakinggg in https://github.com/mosaicml/composer/pull/1570
* Add missing _enabled flag, post_close, and clean up comet ml tests by dakinggg in https://github.com/mosaicml/composer/pull/1571
* Consistent Method Card Style by growlix in https://github.com/mosaicml/composer/pull/1407
* add missing return in context by mvpatel2000 in https://github.com/mosaicml/composer/pull/1574
* Remove eval batch split by mvpatel2000 in https://github.com/mosaicml/composer/pull/1576
* Fix Onnx Export for Composer HuggingFaceModels by nik-mosaic in https://github.com/mosaicml/composer/pull/1557
* Revert checkpoint rename by hanlint in https://github.com/mosaicml/composer/pull/1579

New Contributors
* bcui19 made their first contribution in https://github.com/mosaicml/composer/pull/1534

**Full Changelog**: https://github.com/mosaicml/composer/compare/v0.10.0...v0.10.1

0.10.0

```bash
pip install --upgrade mosaicml==0.10.0
```


New Features

1. **:comet: Comet Experiment Tracking (1490)**

We've added support for the popular [Comet](https://www.comet.com) experiment tracker! To enable, simply create the logger and pass it to the `Trainer` object at initialization:

```python
from composer import Trainer
from composer.loggers import CometMLLogger

cometml_logger = CometMLLogger()

trainer = Trainer(
    ...,
    loggers=[cometml_logger],
)
```


Please see our [Logging](https://docs.mosaicml.com/en/stable/trainer/logging.html) and [CometMLLogger](https://docs.mosaicml.com/en/stable/api_reference/generated/composer.loggers.CometMLLogger.html#composer.loggers.CometMLLogger) docs pages for details on usage.


1. **:magic_wand: Automatic Evaluation Batch Size Selection (1417)**

Composer now supports `eval_batch_size='auto'`, which will choose the right evaluation batch size to avoid CUDA OOMs! Now, in conjunction with `grad_accum='auto'`, you can run the same code on any hardware with no changes necessary. This makes it easy to add evaluation to a training script without having to pick and choose the right batch sizes to avoid CUDA OOMs.
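
For reference, here is a minimal sketch of a run that lets Composer pick both values. Note that `eval_batch_size='auto'` is shown here as a `Trainer` keyword argument alongside `grad_accum='auto'`; that placement is our assumption, so consult the Trainer docs for the authoritative signature:

```python
from composer import Trainer

# model, train_dataloader, and eval_dataloader are placeholders you supply
trainer = Trainer(
    model=model,
    train_dataloader=train_dataloader,
    eval_dataloader=eval_dataloader,
    max_duration='10ep',
    grad_accum='auto',       # pick the training microbatch size automatically
    eval_batch_size='auto',  # assumed kwarg placement; avoids eval-time CUDA OOMs
)
trainer.fit()
```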

1. **:dart: Evaluation API Changes (1479)**

The Evaluation API has been updated to be consistent with the Trainer API. If the `eval_dataloader` was provided to the Trainer during initialization, `eval` can be invoked without needing to provide anything additional:

```python
trainer = Trainer(
    eval_dataloader=...
)
trainer.eval()
```


Alternatively, the `eval_dataloader` can be passed directly to the `eval()` method:

```python
trainer = Trainer(
    ...
)
trainer.eval(
    eval_dataloader=...
)
```


The `eval_dataloader` can be a PyTorch dataloader, or, for multiple metrics, a list of `Evaluator` objects.

1. :wood: **Simplified Logging (1416)**

We've significantly simplified our internal logging interface:
- Removed the use of `LogLevel` throughout the logging, which was a mostly unused feature. Filtering logs is now the responsibility of the logger.
- For better compatibility with external logging interfaces such as CometML or Weights & Biases, loggers now support the following methods: `log_metrics`, `log_hyperparameters`, and `log_artifacts`. Previous calls to `data_fit`, `data_epoch`, etc. have been removed (see the sketch below).
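
As a rough sketch of the new surface, a custom `LoggerDestination` only needs to override these methods; the signatures here are abbreviated and may differ slightly from the released API:

```python
from typing import Any, Dict, Optional

from composer.loggers import LoggerDestination

class PrintLogger(LoggerDestination):
    """Toy destination that prints whatever the Trainer logs."""

    def log_metrics(self, metrics: Dict[str, float], step: Optional[int] = None) -> None:
        print(f'step={step} metrics={metrics}')

    def log_hyperparameters(self, hyperparameters: Dict[str, Any]) -> None:
        print(f'hparams={hyperparameters}')
```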

1. :dart: **validate --> eval_forward (1411, 1419)**

Previously, `ComposerModel` implemented the `validate(batch: Any) -> Tuple[Any, Any]` method, which returns an `(input, target)` tuple, and the Trainer handled updating the metrics. In `v0.10`, we return control of metric updates to the user.

Now, models instead implement `def eval_forward(batch: Any)` which returns the outputs of evaluation, and also `def update_metric(batch, outputs, metric)` which updates the metric.

An example implementation for classification can be found in our `ComposerClassifier` base class:

```python
def update_metric(self, batch: Any, outputs: Any, metric: Metric) -> None:
    _, targets = batch
    metric.update(outputs, targets)

def eval_forward(self, batch: Any, outputs: Optional[Any] = None) -> Any:
    return outputs if outputs is not None else self.forward(batch)
```


1. :female_detective: **Evaluator changes**

The `Evaluator` class now stores evaluation metric _names_ instead of metric _instances_. For example:

```python
glue_mrpc_task = Evaluator(
    label='glue_mrpc',
    dataloader=mrpc_dataloader,
    metric_names=['BinaryF1Score', 'Accuracy']
)
```


These metric names are matched against the metrics returned by the `ComposerModel`. The metric _instances_ are now stored as deep copies in the `State` class as `state.train_metrics` or `state.eval_metrics`.

1. **:construction: Streaming Datasets Repository Preview**

We're in the process of splitting out streaming datasets into its own repository! Streaming datasets is a high-performance drop-in replacement for Torch `IterableDataset` objects and enables you to stream your training data from cloud-based object stores. For an early preview, please check out the [Streaming repo](https://github.com/mosaicml/streaming).

1. :x: **YAHP deprecation**

We are deprecating support for [yahp](https://github.com/mosaicml/yahp), our hyperparameter configuration tool. Support will be removed in the next minor release of Composer. We recommend users migrate to OmegaConf or Hydra.
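
For users planning the migration, a minimal sketch of a config-driven run with OmegaConf might look like the following (the `train_config.yaml` file and its fields are hypothetical):

```python
from omegaconf import OmegaConf

from composer import Trainer
from composer.models import create_bert_mlm

# Load a plain YAML config with OmegaConf instead of yahp
cfg = OmegaConf.load('train_config.yaml')  # hypothetical config file

trainer = Trainer(
    model=create_bert_mlm(),
    max_duration=cfg.max_duration,  # e.g. '10ep', read from the config
)
trainer.fit()
```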

Bug Fixes

* Documentation fixes (1408, 1422, 1425, 1413, 1432, 1403, 1426, 1396, 1446, 1466, 1443)
* Upgrade WandB version (1440)
* fix import (1442)
* fix wrong extra deps group (1449)
* wandb bug fix (1488)
* Reset train metrics every batch (1496)
* fix auto grad accum (1515)
* Fix compression file remote download exception handling (1526)
* Add Pandoc to Docker images, bump version to 2.19.2 (1550)

What's Changed
* current metrics docs by A-Jacobson in https://github.com/mosaicml/composer/pull/1402
* merge nlp+hf notebooks by A-Jacobson in https://github.com/mosaicml/composer/pull/1406
* Add break epoch exception by mvpatel2000 in https://github.com/mosaicml/composer/pull/1415
* Upgrade to torch 1.12.1 by abhi-mosaic in https://github.com/mosaicml/composer/pull/1409
* Metrics refactor pt1 by ishanashastri in https://github.com/mosaicml/composer/pull/1411
* Use state algos by mvpatel2000 in https://github.com/mosaicml/composer/pull/1412
* Add default ignore index by moinnadeem in https://github.com/mosaicml/composer/pull/1421
* Update default hparams for ResNet model card by abhi-mosaic in https://github.com/mosaicml/composer/pull/1423
* update colout link in custom speedup notebook by A-Jacobson in https://github.com/mosaicml/composer/pull/1408
* Clean up prose in key files by dblalock in https://github.com/mosaicml/composer/pull/1422
* Relax codeowners by bandish-shah in https://github.com/mosaicml/composer/pull/1424
* Fix typo by Landanjs in https://github.com/mosaicml/composer/pull/1425
* Fix pre-commit checks failing on fresh checkout of dev by dblalock in https://github.com/mosaicml/composer/pull/1414
* Have docs use preferred import paths, not longest import paths by dblalock in https://github.com/mosaicml/composer/pull/1413
* Fix missing indent by Landanjs in https://github.com/mosaicml/composer/pull/1432
* eval_batch_size=auto by mvpatel2000 in https://github.com/mosaicml/composer/pull/1417
* Simplify helper for conflicting files by hanlint in https://github.com/mosaicml/composer/pull/1427
* add install from dev instructions by A-Jacobson in https://github.com/mosaicml/composer/pull/1403
* Style/tone consistency update for tutorial notebooks by alextrott16 in https://github.com/mosaicml/composer/pull/1426
* Dynamic quantization + minor improvements in inference APIs by dskhudia in https://github.com/mosaicml/composer/pull/1433
* Upgrade WandB version by moinnadeem in https://github.com/mosaicml/composer/pull/1440
* Log multiple losses by Landanjs in https://github.com/mosaicml/composer/pull/1375
* Fix attribute by mvpatel2000 in https://github.com/mosaicml/composer/pull/1442
* Expand evaluation doc by alextrott16 in https://github.com/mosaicml/composer/pull/1396
* Metrics Refactor Part 2 by ishanashastri in https://github.com/mosaicml/composer/pull/1419
* Create dependabot.yml by mvpatel2000 in https://github.com/mosaicml/composer/pull/1448
* Methods overview fix by growlix in https://github.com/mosaicml/composer/pull/1446
* Bump custom-inherit from 2.3.2 to 2.4.0 by dependabot in https://github.com/mosaicml/composer/pull/1451
* Bump junitparser from 2.4.3 to 2.8.0 by dependabot in https://github.com/mosaicml/composer/pull/1453
* Update moto[s3] requirement from <3.2,>=3.1.12 to >=4.0.1,<5 by dependabot in https://github.com/mosaicml/composer/pull/1450
* Update monai requirement from <0.9,>=0.8.0 to >=0.9.0,<0.10 by dependabot in https://github.com/mosaicml/composer/pull/1452
* Update torch-optimizer requirement from <0.2,>=0.1.0 to >=0.3.0,<0.4 by dependabot in https://github.com/mosaicml/composer/pull/1454
* Bump cryptography from 37.0.2 to 37.0.4 by dependabot in https://github.com/mosaicml/composer/pull/1457
* Bump sphinxext-opengraph from 0.6.1 to 0.6.3 by dependabot in https://github.com/mosaicml/composer/pull/1458
* Bump coverage[toml] from 6.3.2 to 6.4.4 by dependabot in https://github.com/mosaicml/composer/pull/1460
* Bump nbsphinx from 0.8.8 to 0.8.9 by dependabot in https://github.com/mosaicml/composer/pull/1459
* Fix incorrect deps group in `streaming` requirement by hanlint in https://github.com/mosaicml/composer/pull/1449
* Logger Destination Refactor by eracah in https://github.com/mosaicml/composer/pull/1416
* Bump sphinx-markdown-tables from 0.0.15 to 0.0.17 by dependabot in https://github.com/mosaicml/composer/pull/1463
* Bump traitlets from 5.1.1 to 5.3.0 by dependabot in https://github.com/mosaicml/composer/pull/1462
* Bump vit-pytorch from 0.27 to 0.35.8 by dependabot in https://github.com/mosaicml/composer/pull/1465
* Bump furo from 2022.3.4 to 2022.6.21 by dependabot in https://github.com/mosaicml/composer/pull/1467
* Bump ipykernel from 6.9.2 to 6.15.1 by dependabot in https://github.com/mosaicml/composer/pull/1470
* Bump pytest from 7.1.0 to 7.1.2 by dependabot in https://github.com/mosaicml/composer/pull/1469
* Bump sphinxcontrib-katex from 0.8.6 to 0.9.0 by dependabot in https://github.com/mosaicml/composer/pull/1476
* Bump tabulate from 0.8.9 to 0.8.10 by dependabot in https://github.com/mosaicml/composer/pull/1478
* Bump yamllint from 1.26.3 to 1.27.1 by dependabot in https://github.com/mosaicml/composer/pull/1481
* Bump ipykernel from 6.15.1 to 6.15.2 by dependabot in https://github.com/mosaicml/composer/pull/1482
* Refactor CheckpointSaver by hanlint in https://github.com/mosaicml/composer/pull/1428
* Clean up docs Makefile by eracah in https://github.com/mosaicml/composer/pull/1466
* Model surgery info -> debug by mvpatel2000 in https://github.com/mosaicml/composer/pull/1485
* Docker image with Flash Attention by abhi-mosaic in https://github.com/mosaicml/composer/pull/1471
* Fix WandBLogger bug with inaccurate step count by eracah in https://github.com/mosaicml/composer/pull/1488
* Update Eval API by hanlint in https://github.com/mosaicml/composer/pull/1479
* Random Names with Fixed Seed by mvpatel2000 in https://github.com/mosaicml/composer/pull/1487
* ResNet50 on ImageNet training script example by Landanjs in https://github.com/mosaicml/composer/pull/1434
* Remove hparams from `test_precision` and `test_state` by hanlint in https://github.com/mosaicml/composer/pull/1486
* Clean up `save_checkpoint` by hanlint in https://github.com/mosaicml/composer/pull/1484
* Remove hparams from test_ddp by hanlint in https://github.com/mosaicml/composer/pull/1489
* update model token embeddings according to tokenizer len by ananyahjha93 in https://github.com/mosaicml/composer/pull/1493
* BERT classifier metrics depend on num_labels by alextrott16 in https://github.com/mosaicml/composer/pull/1495
* Reset train metrics every batch by abhi-mosaic in https://github.com/mosaicml/composer/pull/1496
* Algolia doc search by nqn in https://github.com/mosaicml/composer/pull/1443
* Squelch Engine debug logs by hanlint in https://github.com/mosaicml/composer/pull/1497
* Remove TODO by mvpatel2000 in https://github.com/mosaicml/composer/pull/1499
* Remove hparams from checkpoint tests by hanlint in https://github.com/mosaicml/composer/pull/1491
* [Docs] Training ResNet-50 on AWS tutorial by bandish-shah in https://github.com/mosaicml/composer/pull/1444
* Refactor hparams in tests by hanlint in https://github.com/mosaicml/composer/pull/1498
* Bump pytest from 7.1.2 to 7.1.3 by dependabot in https://github.com/mosaicml/composer/pull/1500
* Improved comments and improved test code by karan6181 in https://github.com/mosaicml/composer/pull/1502
* Refactor GLUE fine-tune queuing to improve efficiency and add task-specific seed sweeps by alextrott16 in https://github.com/mosaicml/composer/pull/1363
* Raise ValueError for Profiler + Auto Grad Accum by mvpatel2000 in https://github.com/mosaicml/composer/pull/1504
* add yahp deprecation warnings by hanlint in https://github.com/mosaicml/composer/pull/1505
* Move logic from `initialize_object` to object store class by hanlint in https://github.com/mosaicml/composer/pull/1508
* Fix run name comment by mvpatel2000 in https://github.com/mosaicml/composer/pull/1509
* Add CometML Support by eracah in https://github.com/mosaicml/composer/pull/1490
* Raise ValueError if missing a surgery algorithm by mvpatel2000 in https://github.com/mosaicml/composer/pull/1506
* remove datasets from gitignore by hanlint in https://github.com/mosaicml/composer/pull/1513
* fix auto grad accum by mvpatel2000 in https://github.com/mosaicml/composer/pull/1515
* Use eval context by mvpatel2000 in https://github.com/mosaicml/composer/pull/1516
* Update tensorflow-io requirement from <0.27,>=0.26.0 to >=0.26.0,<0.28 by dependabot in https://github.com/mosaicml/composer/pull/1522
* Bump cryptography from 37.0.4 to 38.0.1 by dependabot in https://github.com/mosaicml/composer/pull/1521
* Fix SAM loss by mvpatel2000 in https://github.com/mosaicml/composer/pull/1518
* Fixed remote path in streaming dataloader facesynthetics jupyter notebook by karan6181 in https://github.com/mosaicml/composer/pull/1519
* Rework auto grad accum checks by mvpatel2000 in https://github.com/mosaicml/composer/pull/1517
* [xs] remove libcloudhparams from `test_filehelpers.py` by hanlint in https://github.com/mosaicml/composer/pull/1514
* Add v2 datasets behind a version flag by knighton in https://github.com/mosaicml/composer/pull/1507
* Fix compression file remote download exception handling. by knighton in https://github.com/mosaicml/composer/pull/1526

New Contributors
* ananyahjha93 made their first contribution in https://github.com/mosaicml/composer/pull/1493

**Full Changelog**: https://github.com/mosaicml/composer/compare/v0.9.0...v0.10.0

0.9.0

Excited to share the release of Composer v0.9.0, which comes with an Inference Export API, beta support for Apple Silicon and TPU training, as well as expanded usability of NLP-related speed-up methods. This release includes 175 commits from 34 contributors, including 10 new contributors :raised_hands: !

```bash
pip install --upgrade mosaicml==0.9.0
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.9.0
```


New Features

1. **:package: Export for inference APIs**

Train with Composer and deploy anywhere! We have added a dedicated [export API](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.utils.inference.html) as well as an [export training callback](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.callbacks.export_for_inference.html) to allow you to export Composer-trained models for inference, supporting popular formats such as [torchscript](https://pytorch.org/docs/stable/jit.html) and [ONNX](https://onnx.ai/).

For example, here's how to export a model in torchscript format:

```python
from composer.utils import export_for_inference

# Invoking export with a trained model
export_for_inference(model=model,
                     save_format='torchscript',
                     save_path=model_save_path)
```


Here's an example of using the training callback, which automatically exports the model at the end of training to ONNX format:

```python
from composer.callbacks import ExportForInferenceCallback

# Initializing Trainer with the export callback
callback = ExportForInferenceCallback(save_format='onnx',
                                      save_path=model_save_path)
trainer = Trainer(model=model,
                  callbacks=callback,
                  train_dataloader=dataloader,
                  max_duration='10ep')

# Model will be exported at the end of training
trainer.fit()
```


Please see our [Exporting for Inference](https://docs.mosaicml.com/en/stable/examples/exporting_for_inference.html) notebook for more information.

1. **:chart_with_upwards_trend: ALiBi support for BERT training**

You can now use ALiBi (**A**ttention with **Li**near **Bi**ases; [Press et al., 2021](https://arxiv.org/abs/2108.12409)) when training BERT models with Composer, delivering faster training and higher accuracy by leveraging shorter sequence lengths.

ALiBi improves the quality of BERT pre-training, especially when pre-training uses shorter sequence lengths than the downstream (fine-tuning) task. This allows models with ALiBi to reach higher downstream accuracy with less pre-training time.

Example of using ALiBi as an algorithm with the Composer Trainer:

```python
# Create an instance of a BERT masked language model
model = composer.models.create_bert_mlm()

# Apply ALiBi (when training is initialized)
alibi = composer.algorithms.Alibi(max_sequence_length=1024)

# Train with ALiBi
trainer = composer.trainer.Trainer(
    model=model,
    train_dataloader=train_dataloader,
    algorithms=[alibi]
)
trainer.fit()
```


Example using the Composer Functional API:

```python
import composer.functional as cf

# Create an instance of a BERT masked language model
model = composer.models.create_bert_mlm()

# Apply ALiBi and expand the model's maximum sequence length to 1024
cf.apply_alibi(model=model, max_sequence_length=1024)
```


ALiBi can now also be extended to work with custom models by registering your attention and embedding layers. Please see our [ALiBi method card](https://docs.mosaicml.com/en/stable/method_cards/alibi.html) for more information.

1. **🧐 Entry point for GLUE tasks pre-training and fine-tuning**

You can now easily pre-train and fine-tune NLP models across all [GLUE](https://gluebenchmark.com/) (General Language Understanding Evaluation) tasks through one simple entry point! The entry point handles model saving and loading, spawns GLUE tasks in parallel across all available GPUs, and delivers a highly efficient evaluation of model performance.

Example of launching the entrypoint:

```bash
# This runs pre-training followed by fine-tuning.
# --training_scheme can take either pretrain, finetune, or all depending on the task!
python run_glue_trainer.py -f glue_example.yaml --training_scheme all
```


Please see our [GLUE entrypoint notebook](https://docs.mosaicml.com/en/v0.9.0/examples/glue/glue_entrypoint.html) for more information.

1. **🤖 TPU support (in beta)**

You can now use Composer to train your models on TPUs! Support is in beta and currently limited to single-core TPU training. Try it out, explore optimizations, and share your feedback and feature requests with us so we can make it better for you and for the community.

To use TPUs with Composer, simply specify a `tpu` device:

```python
# Set device to `tpu`
trainer = composer.trainer.Trainer(
    model=model,
    train_dataloader=train_dataloader,
    max_duration=train_epochs,
    device='tpu')

# Run fit
trainer.fit()
```


Please see our [Training with TPUs notebook](https://docs.mosaicml.com/en/v0.9.0/examples/TPU_Training_in_composer.html) for more information.

1. **:apple: Apple Silicon support (beta)**

Leverage Apple Silicon chips to train your models with Composer by providing the `device='mps'` argument:

```python
trainer = Trainer(
    ...,
    device='mps'
)
```


We use the latest PyTorch MPS backend to execute the training. This requires torch version ≥1.12 and Mac OSX 12.3+.

For more information on training with Apple M chips, see the [PyTorch 1.12 blog](https://pytorch.org/blog/pytorch-1.12-released/#prototype-introducing-accelerated-pytorch-training-on-mac) and our [API Reference](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.trainer.devices.device_mps.html) for Composer specific details.

1. **:construction: Contrib repository**

Got a new method idea, or published a paper and want those methods to be easily accessible? We've created the [`mcontrib` repository](https://github.com/mosaicml/mcontrib), with a lightweight process to contribute new algorithms. We're happy to work directly with you to benchmark these methods and eventually "promote" them to Composer for use by end customers.

Please check out the [README](https://github.com/mosaicml/mcontrib#adding-algorithms) for details on how to contribute a new algorithm. For more details on how to write speed-up methods, see our notebook on [custom speed-up methods](https://docs.mosaicml.com/en/v0.9.0/examples/custom_speedup_methods.html).

Additional API Changes

1. **:1234: Passes Module**

The order in which algorithms are run matters significantly during composition. With this release we refactored algorithm passes into their own [`passes` module](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.core.passes.html). Users can now register custom passes (for custom algorithms) with the Engine. Please see #1377 for more information.

1. **:file_cabinet: Default Checkpoint Extension**

The CheckpointSaver now defaults to using the `*.pt` extension for checkpoint filenames. Please see 1370 for more information.

1. **:eye: Models Refactor**

Most vision models (ResNet, MNIST, ViT, EfficientNet) have been refactored from classes to factory functions. For example, `ComposerResNet` -> `composer_resnet`.

```python
# before
from composer.models import ComposerResNet
model = ComposerResNet(...)

# after
from composer.models import composer_resnet
model = composer_resnet(...)
```


The same refactor has been done for NLP as well, e.g. `BERTModel` -> `create_bert_mlm` and `create_bert_classification`.

See 1227 (vision) and 1130 (NLP) for more details.

1. **:heavy_plus_sign: Misc API Changes**

* `BreakEpochException` has been removed.
* `state.is_model_deepspeed` has been moved to `composer.utils.is_model_deepspeed`.
* Helper function `monitored_barrier` has been added to `composer` distributed.


Bug Fixes

* Add informative error for infer batch size issues (1401)
* Fix ImagenetDatasetHparams bug (1392), resolves 1111
* Fix hparams error condition checking (1394)
* Fix AMP resumption with grad scaler (1376)
* Auto Grad Accum Cache Clearing (1380), fixes issue reported in 1331
* Fix default precision (1369)
* Fix the profiler on multi-node training (1358), resolves 1270
* Retry SFTP on Size Mismatch (1300)
* Fix scheduler edge cases (1350), resolves 1077
* Fix a race condition in the object store logger (1328)
* Fix WandB load from checkpoint (1326)
* Fix Notebook Progress Bars (1313)

Commits

What's Changed
* Fix DeepSpeed typo in docstring by abhi-mosaic in https://github.com/mosaicml/composer/pull/1188
* Move grad_accum logging to every step by coryMosaicML in https://github.com/mosaicml/composer/pull/1187
* Update STYLE_GUIDE with details on Documentation by bandish-shah in https://github.com/mosaicml/composer/pull/1183
* ProgressBar Units by hanlint in https://github.com/mosaicml/composer/pull/1190
* Added Xavier Normal initializer by vladd-i in https://github.com/mosaicml/composer/pull/1196
* Updated cost figure by nqn in https://github.com/mosaicml/composer/pull/1180
* Remove algorithm yamls by hanlint in https://github.com/mosaicml/composer/pull/1193
* Fix the Composer Launch Script for the Composer Dockerimage; Default `nproc = torch.cuda.device_count()` if not specified via env by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1195
* Bert model card by A-Jacobson in https://github.com/mosaicml/composer/pull/1198
* Add Notes on Early Stopping by anisehsani in https://github.com/mosaicml/composer/pull/1182
* Stochastic depth that preserves weights by Landanjs in https://github.com/mosaicml/composer/pull/1085
* Adding Gated Linear Units as an algorithm by moinnadeem in https://github.com/mosaicml/composer/pull/1192
* A utility to fuse parallel linear layers in FX-traced models by dskhudia in https://github.com/mosaicml/composer/pull/1189
* Build+push Composer dockerimages to `mosaicml/composer_staging` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1197
* Fix the SFTP Object Store by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1202
* Bert emoji by A-Jacobson in https://github.com/mosaicml/composer/pull/1205
* Adding a constant warmup scheduler by linden-li in https://github.com/mosaicml/composer/pull/1203
* Fix multi-GPU conflicts when downloading `torchvision` datasets by abhi-mosaic in https://github.com/mosaicml/composer/pull/1201
* Add caveats about automatic gradient accumulation by hanlint in https://github.com/mosaicml/composer/pull/1207
* Remove the `composer_train` entrypoint; put it back in `examples` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1211
* Fix Composer staging dockerimages by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1210
* Set SFTP Object Store Private Key Filepath from an Environ by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1212
* [xs] Fix progress bars in `get_file` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1216
* Cleanup SFTP url parsing for StreamingDataset by abhi-mosaic in https://github.com/mosaicml/composer/pull/1217
* Fix Symlinks on Non-Libcloud Object Stores by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1209
* Fix the ObjectStoreLogger with Overwrite=True by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1208
* Throughput metrics by linden-li in https://github.com/mosaicml/composer/pull/1215
* Fix module surgery for training resumptions with optimizers that save state by dskhudia in https://github.com/mosaicml/composer/pull/1200
* Update bert-base.yaml by moinnadeem in https://github.com/mosaicml/composer/pull/1219
* StreamingDataset: make remote optional, attempt to prettify docstrings. by knighton in https://github.com/mosaicml/composer/pull/1220
* Update vision-style `StreamingDataset`s to subclass `VisionDataset` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1223
* Improve docstrings. by knighton in https://github.com/mosaicml/composer/pull/1222
* shardwise zip streaming datasets by milocress in https://github.com/mosaicml/composer/pull/1177
* updated mosaic logos to composer logos in docs by ejyuen in https://github.com/mosaicml/composer/pull/1221
* Add `COMPOSER_KNOWN_HOSTS_FILENAME` for setting the sftp known hosts file environ by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1224
* StreamingDataset: correctly handle exceptions in child download thread. by knighton in https://github.com/mosaicml/composer/pull/1228
* hot fix compression 404 by milocress in https://github.com/mosaicml/composer/pull/1229
* Treat any dropped SSH/SFTP connection as a transient error by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1225
* refactor bert and gpt by A-Jacobson in https://github.com/mosaicml/composer/pull/1130
* Hotfix for S3 `FileNotFoundError` by abhi-mosaic in https://github.com/mosaicml/composer/pull/1233
* Fix StreamingDataset compression with multi-rank by milocress in https://github.com/mosaicml/composer/pull/1231
* Refactor vision models by Landanjs in https://github.com/mosaicml/composer/pull/1227
* Update resnet50_medium.yaml by lupesko in https://github.com/mosaicml/composer/pull/1235
* Increase default timeout for `StreamingC4` to 120s by abhi-mosaic in https://github.com/mosaicml/composer/pull/1234
* Add Debug Log Statements; Fix Pyright by hanlint in https://github.com/mosaicml/composer/pull/1218
* Hotfix deeplabv3 by Landanjs in https://github.com/mosaicml/composer/pull/1238
* Add Tensorboard Logger by eracah in https://github.com/mosaicml/composer/pull/1194
* Move the model and optimizers to the device before `Event.INIT` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1084
* Fix bug in streaming iteration/downloading, refactor by knighton in https://github.com/mosaicml/composer/pull/1239
* Support sequence of losses in backwards pass by Landanjs in https://github.com/mosaicml/composer/pull/1240
* Add device_id param to DeviceGPU by ishanashastri in https://github.com/mosaicml/composer/pull/1244
* Update CutMix to work with segmentation style labels by coryMosaicML in https://github.com/mosaicml/composer/pull/1230
* Catching ChannelErrors on SFTP Failures by moinnadeem in https://github.com/mosaicml/composer/pull/1245
* Make `StreamingDataset` compression file easier to write/read by abhi-mosaic in https://github.com/mosaicml/composer/pull/1246
* [XS] Updating console progress_bar logger to use max_duration units by moinnadeem in https://github.com/mosaicml/composer/pull/1243
* Catch botocore ClientError 403 by abhi-mosaic in https://github.com/mosaicml/composer/pull/1249
* Tensorboard Notebook + Tutorial by eracah in https://github.com/mosaicml/composer/pull/1250
* Fix repeated words in event.py by isaac0804 in https://github.com/mosaicml/composer/pull/1254
* Make progressive resizing quieter by coryMosaicML in https://github.com/mosaicml/composer/pull/1255
* fix typo in example by xloem in https://github.com/mosaicml/composer/pull/1259
* Create a new `boto3.Session()` per `S3ObjectStore` instance by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1260
* Fix recipe yamls for `v0.8`, add testing by hanlint in https://github.com/mosaicml/composer/pull/1257
* Automatic Stochastic depth on residual blocks by dskhudia in https://github.com/mosaicml/composer/pull/1253
* Sequence length warmup update and tests by alextrott16 in https://github.com/mosaicml/composer/pull/1199
* ProgressBarLogger UX Enhancements by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1264
* Update to latest pytorch by mvpatel2000 in https://github.com/mosaicml/composer/pull/1262
* Add packaging to `meta.yaml`; add `py-cpuinfo` max version by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1271
* Fix Flaky Tests by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1272
* Add callback for visualizing image inputs and outputs by coryMosaicML in https://github.com/mosaicml/composer/pull/1266
* Add `scale_warmup` argument to schedulers by hanlint in https://github.com/mosaicml/composer/pull/1268
* Switch Jenkins to r1z3 by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1277
* BERT and C4 updates by abhi-mosaic in https://github.com/mosaicml/composer/pull/1252
* Default to `allow_tf32=True` for GPU Devices by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1275
* Fix grad accum parsing in hparams by hanlint in https://github.com/mosaicml/composer/pull/1256
* Fix issue with doctest format in some docstring examples by Landanjs in https://github.com/mosaicml/composer/pull/1269
* Adds S3ObjectStore import to util __init__.py by codestar12 in https://github.com/mosaicml/composer/pull/1274
* Add tutorial on exporting for inference by hanlint in https://github.com/mosaicml/composer/pull/1276
* HTTPS downloads for streaming datasets by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1258
* object stores for streaming datasets by milocress in https://github.com/mosaicml/composer/pull/1248
* Allow object name prefix for S3ObjectStore by abhi-mosaic in https://github.com/mosaicml/composer/pull/1278
* Hotfix CO-658 by milocress in https://github.com/mosaicml/composer/pull/1273
* Fix S3 remote paths for StreamingDataset download by abhi-mosaic in https://github.com/mosaicml/composer/pull/1280
* Add combo loss to DeepLabv3+ by Landanjs in https://github.com/mosaicml/composer/pull/1265
* Checkpoint backwards compatibility for ProgressBar by hanlint in https://github.com/mosaicml/composer/pull/1287
* Add missing callbacks by hanlint in https://github.com/mosaicml/composer/pull/1286
* Fix S3 prefix upload/download by abhi-mosaic in https://github.com/mosaicml/composer/pull/1288
* Fix device inference in module surgery by hanlint in https://github.com/mosaicml/composer/pull/1290
* Actual fix to backwards compatibility by hanlint in https://github.com/mosaicml/composer/pull/1289
* Bugs in getting_started.ipynb by rahulvigneswaran in https://github.com/mosaicml/composer/pull/1285
* Add pytorch 1.12.0 docker image by linden-li in https://github.com/mosaicml/composer/pull/1247
* Fix TB Logger + ObjectStore quadratic complexity issue by doing 1 file per flush by eracah in https://github.com/mosaicml/composer/pull/1283
* Enable README Doctests with GPUs by mvpatel2000 in https://github.com/mosaicml/composer/pull/1279
* Fix logging of hparams to object stores by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1297
* [xs] Reformat the Composer Version String by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1301
* Add monitored barrier for autograd accum by mvpatel2000 in https://github.com/mosaicml/composer/pull/1295
* [xs] Notebook Fixes by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1299
* [xs] Store the Composer version in one place. by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1302
* model export for inference. Functional API by dskhudia in https://github.com/mosaicml/composer/pull/1294
* Add a `return_outputs` flag to `predict()` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1307
* Integration Testing by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1305
* Fix `get_file_artifact` in the WandBLogger to work on all ranks by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1304
* Add documentation about `run_name` to Composer by eracah in https://github.com/mosaicml/composer/pull/1298
* Enforce FusedLayerNorm is ordered last by alextrott16 in https://github.com/mosaicml/composer/pull/1309
* Revert monitored barrier by mvpatel2000 in https://github.com/mosaicml/composer/pull/1311
* [xs] Build the Composer Docker Image only on `dev` branch merges by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1308
* Fix Notebook Progress Bars by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1313
* Remove `pytest-timeout` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1317
* [Minor] Inference API parameter name change by dskhudia in https://github.com/mosaicml/composer/pull/1315
* Matthew/swa readme by growlix in https://github.com/mosaicml/composer/pull/1292
* Enable gloo backend by mvpatel2000 in https://github.com/mosaicml/composer/pull/1321
* [xs] Fix pytest test filtering; Bump the minimum pytorch version to 1.10 by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1320
* revert gloo by mvpatel2000 in https://github.com/mosaicml/composer/pull/1324
* Fix WandB load from checkpoint by abhi-mosaic in https://github.com/mosaicml/composer/pull/1326
* ALiBi for BERT and ALiBi testing by alextrott16 in https://github.com/mosaicml/composer/pull/1267
* Update HF example with read of model eval accuracy by lupesko in https://github.com/mosaicml/composer/pull/1332
* Cleanup API Reference Titles by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1336
* Fix a race condition in the object store logger by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1328
* Auto Grad Accum Change to Warning by mvpatel2000 in https://github.com/mosaicml/composer/pull/1338
* Add export for inference callback by nik-mosaic in https://github.com/mosaicml/composer/pull/1323
* Add save fine-tune model to HuggingFace example by lupesko in https://github.com/mosaicml/composer/pull/1333
* Update DWD optimizers by abhi-mosaic in https://github.com/mosaicml/composer/pull/1339
* Cap Numpy Version by mvpatel2000 in https://github.com/mosaicml/composer/pull/1345
* Update slack link by hanlint in https://github.com/mosaicml/composer/pull/1344
* Fix scheduler edge cases by abhi-mosaic in https://github.com/mosaicml/composer/pull/1350
* Integration Tests for Object Stores and Loggers by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1322
* Retry SFTP on Size Mismatch by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1300
* [xs] Restore the dataloader and training properties in `predict()` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1352
* Add Precision Contexts by mvpatel2000 in https://github.com/mosaicml/composer/pull/1347
* Update GLU logging strings by moinnadeem in https://github.com/mosaicml/composer/pull/1348
* Add domain-specific codeowners by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1354
* fix marker by mvpatel2000 in https://github.com/mosaicml/composer/pull/1359
* Fix the profiler on multi-node training by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1358
* Glue Entrypoint by ishanashastri in https://github.com/mosaicml/composer/pull/1263
* Yahp v0.1.3 by mvpatel2000 in https://github.com/mosaicml/composer/pull/1346
* Move metrics to context by mvpatel2000 in https://github.com/mosaicml/composer/pull/1361
* Refactor multiple losses to support dictionaries and fix discrepancies by Landanjs in https://github.com/mosaicml/composer/pull/1349
* Fix Coverage Reports on Jenkins by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1114
* JSON Schemas by mvpatel2000 in https://github.com/mosaicml/composer/pull/1371
* add filename extension by mvpatel2000 in https://github.com/mosaicml/composer/pull/1370
* JSON Schemas pt 2 by mvpatel2000 in https://github.com/mosaicml/composer/pull/1373
* Update Export for Inference methods by nik-mosaic in https://github.com/mosaicml/composer/pull/1355
* Fix default precision by A-Jacobson in https://github.com/mosaicml/composer/pull/1369
* Clean up unused exception by mvpatel2000 in https://github.com/mosaicml/composer/pull/1368
* Revert "Clean up unused exception" by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1378
* Remove Unused Exception by mvpatel2000 in https://github.com/mosaicml/composer/pull/1379
* Auto Grad Accum Cache Clearing by mvpatel2000 in https://github.com/mosaicml/composer/pull/1380
* Add ability to register algorithm passes by hanlint in https://github.com/mosaicml/composer/pull/1377
* Fix AMP resumption with grad scaler by hanlint in https://github.com/mosaicml/composer/pull/1376
* Update CUDA and remove NCCL downgrade from Dockerfile by abhi-mosaic in https://github.com/mosaicml/composer/pull/1362
* Add Notes on Artifact Logging by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1381
* Print the microbatch size when using Adaptive Gradient Accumulation by hanlint in https://github.com/mosaicml/composer/pull/1387
* Cleaner API reference part 1: references with minimal import paths by dblalock in https://github.com/mosaicml/composer/pull/1385
* Add Event.BEFORE_DATALOADER by mvpatel2000 in https://github.com/mosaicml/composer/pull/1388
* remove private s3 paths by A-Jacobson in https://github.com/mosaicml/composer/pull/1389
* Tutorial on training without Local Storage by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1351
* [inference] Update export_for_inference notebook with new APIs by dskhudia in https://github.com/mosaicml/composer/pull/1360
* Fix resnet warnings criteria by mvpatel2000 in https://github.com/mosaicml/composer/pull/1395
* Fix hparams error by mvpatel2000 in https://github.com/mosaicml/composer/pull/1394
* Add knighton to codeowners for datasets by knighton in https://github.com/mosaicml/composer/pull/1397
* Fix ImagenetDatasetHparams bug by nik-mosaic in https://github.com/mosaicml/composer/pull/1392
* Decouple GLUE entry point saving and loading logic by ishanashastri in https://github.com/mosaicml/composer/pull/1390
* Glue example notebook by ishanashastri in https://github.com/mosaicml/composer/pull/1383
* Add informative error for infer batch size issues by hanlint in https://github.com/mosaicml/composer/pull/1401
* Only sync batchnorm statistics within a node for deeplab by Landanjs in https://github.com/mosaicml/composer/pull/1391
* Update DeepLabv3 pretrained weight interface to work with PyTorch 1.12 by Landanjs in https://github.com/mosaicml/composer/pull/1399
* tpu single core by florescl in https://github.com/mosaicml/composer/pull/1400
* Add support for Apple M chips by hanlint in https://github.com/mosaicml/composer/pull/1405
* [xs] Add `mps` and `tpu` device to Trainer docstrings by hanlint in https://github.com/mosaicml/composer/pull/1410

**Full Changelog**: https://github.com/mosaicml/composer/compare/v0.8.2...v0.9.0

New Contributors
* vladd-i made their first contribution in https://github.com/mosaicml/composer/pull/1196
* linden-li made their first contribution in https://github.com/mosaicml/composer/pull/1203
* ejyuen made their first contribution in https://github.com/mosaicml/composer/pull/1221
* lupesko made their first contribution in https://github.com/mosaicml/composer/pull/1235
* isaac0804 made their first contribution in https://github.com/mosaicml/composer/pull/1254
* xloem made their first contribution in https://github.com/mosaicml/composer/pull/1259
* alextrott16 made their first contribution in https://github.com/mosaicml/composer/pull/1199
* codestar12 made their first contribution in https://github.com/mosaicml/composer/pull/1274
* rahulvigneswaran made their first contribution in https://github.com/mosaicml/composer/pull/1285
* nik-mosaic made their first contribution in https://github.com/mosaicml/composer/pull/1323

0.8.2

```bash
pip install --upgrade mosaicml==0.8.2
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.2
```


🐛 Bug Fixes

1. **Fixed Notebook Progress Bars in Colab**

Fixes a bug introduced by 1264 which caused Composer running in Colab notebooks to error out with `UnsupportedOperation: fileno`.

Closes 1312. Fixed in PR 1314.

Changelog

https://github.com/mosaicml/composer/compare/v0.8.1...v0.8.2

0.8.1

```bash
pip install --upgrade mosaicml==0.8.1
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.1
```


🎁 New Features


1. **🖼️ Image Visualizer**

The [`ImageVisualizer`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.callbacks.image_visualizer.html#composer.callbacks.image_visualizer.ImageVisualizer) callback periodically logs the training and validation images when using the WandB logger. This is great for validating your dataloader pipeline, especially if extensive data augmentations are used. Also, when training on a semantic segmentation task, the callback can log the target segmentation mask and the predicted segmentation mask by setting the argument `mode='segmentation'`. See PR 1266 for more details. Here is an example of using the `ImageVisualizer` callback:

```python
from composer import Trainer
from composer.callbacks import ImageVisualizer

# Callback to log 8 training images after every 100 batches
image_visualizer = ImageVisualizer()

# Construct trainer
trainer = Trainer(
    ...,
    callbacks=image_visualizer
)

# Train!
trainer.fit()
```



Here is an example visualization from the training set of ADE20k:

![](https://i.imgur.com/iszIRLS.jpg)


1. **📶 TensorBoard Logging**

You can now log metrics and losses from your Composer training runs with Tensorboard! See 1250 and 1283 for more details. All you have to do is create a [`TensorboardLogger`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.loggers.tensorboard_logger.html#composer.loggers.tensorboard_logger.TensorboardLogger) object and add it
to the list of loggers in your [`Trainer`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) object like so:

```python
from composer import Trainer
from composer.loggers import TensorboardLogger

tb_logger = TensorboardLogger(log_dir="./my_tensorboard_logs")

trainer = Trainer(
    ...,
    # Add your Tensorboard Logger to the trainer here.
    loggers=[tb_logger],
)

trainer.fit()
```


For more information, see this [tutorial](https://docs.mosaicml.com/en/v0.8.1/notes/tensorboard_logger.html).




1. **🔙 Multiple Losses**

Adds support for multiple losses. If a model returns a tuple of losses, they are summed before the `loss.backward()` call. See 1240 for more details.


1. **🌎️ Stream Datasets from HTTP URIs**

You can now specify an HTTP URI for a [Streaming Dataset](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.datasets.streaming.dataset.html#composer.datasets.streaming.dataset.StreamingDataset) remote. See 1258 for more details. For example:

```python
from composer.datasets.streaming import StreamingDataset
from torch.utils.data import DataLoader

# Construct the Dataset
dataset = StreamingDataset(
    ...,
    remote="https://example.com/dataset/",
)

# Construct the DataLoader
train_dl = DataLoader(dataset)

# Construct the Trainer
trainer = Trainer(
    ...,
    train_dataloader=train_dl,
)

# Train!
trainer.fit()
```


For more information on streaming datasets, see this [tutorial](https://docs.mosaicml.com/en/v0.8.1/examples/streaming_dataloader_facesynthetics.html).


1. **🏄️ GPU Devices default to TF32 Matmuls**

Beginning with PyTorch 1.12, the default behavior for computing FP32 matrix multiplies on NVIDIA Ampere devices was switched from TF32 to FP32. See [PyTorch documentation here](https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices).

Since Composer is designed specifically for ML training with a focus on efficiency, we choose to preserve the old default of using TF32 on Ampere devices. This leads to significantly higher throughput when training in single precision, [without impacting training convergence](https://developer.nvidia.com/blog/accelerating-ai-training-with-tf32-tensor-cores/). See PR #1275 for implementation details.
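
If you prefer the new PyTorch default, you can opt back out at any point (for example, right after constructing the Trainer) using the standard PyTorch flags; this is plain PyTorch, not a Composer-specific API:

```python
import torch

# Restore the PyTorch 1.12+ default of full-FP32 matmuls and convolutions
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False
```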

1. **👋 Set the Device ID for GPU Devices**

Specify the device ID to train on by passing a `DeviceGPU` with that ID when instantiating a Trainer object, instead of using the local ID! For example:

```python
from composer.trainer.devices.device_gpu import DeviceGPU

# Specify to use GPU 3 to train
device = DeviceGPU(device_id=3)

# Construct the Trainer
trainer = Trainer(
    ...,
    device=device
)

# Train!
trainer.fit()
```




1. **BERT and C4 Updates**

We make some minor adjustments to our `bert-base-uncased.yaml` training config. In particular, we make the global train and eval batch sizes a power of 2. This maintains divisibility when using many GPUs in multi-node training. We also adjust the `max_duration` so that it converts cleanly to 70,000 batches.

We also upgrade our StreamingDataset C4 conversion script (`scripts/mds/c4.py`) to use a multi-threaded reader. On a 64-core machine we are able to convert the 770GB train split to `.mds` format in ~1.5hr.


1. **📂 Set a `prefix` when using an `S3ObjectStore`**

When using `S3ObjectStore` for applications like checkpointing, it can be useful to provide path prefixes, mimicking `folder/subfolder` directories like on a local filesystem. When `prefix` is provided, any objects uploaded with `S3ObjectStore` will be stored at `f's3://{self.bucket}/{self.prefix}{object_name}'`.
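
For example, a minimal sketch (the bucket and prefix values are placeholders):

```python
from composer.utils.object_store import S3ObjectStore

# Objects uploaded through this store land under
# s3://my-bucket/checkpoints/run-1/<object_name>
store = S3ObjectStore(bucket='my-bucket', prefix='checkpoints/run-1/')
```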


1. **⚖️ Scale the Warmup Period of Composer Schedulers**

Added a new flag `scale_warmup` to schedulers that will scale the warmup period when a scale schedule ratio is applied. The default is `False` to mirror the previous behavior. See 1268 for more details.
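
For example, a sketch with a warmup scheduler; the class name and flag placement follow the Composer schedulers docs, but treat the details as illustrative:

```python
from composer import Trainer
from composer.optim import CosineAnnealingWithWarmupScheduler

scheduler = CosineAnnealingWithWarmupScheduler(
    t_warmup='1ep',
    scale_warmup=True,  # warmup is scaled along with the rest of the schedule
)

trainer = Trainer(
    ...,
    schedulers=scheduler,
    scale_schedule_ratio=0.5,  # the 1ep warmup is scaled down to 0.5ep
)
```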

1. **🧊 Stochastic Depth on Residual Blocks**

Residual blocks are detected automatically and replaced with stochastic versions. See 1253 for more details.

🐛 Bug Fixes

1. **Fixed Progress Bars**

Fixed a bug where the Progress Bars jumped around and did not stream properly when tailing the terminal over the network. Fixed in 1264, 1287, and 1289.

1. **Fixed S3ObjectStore in Multithreaded Environments**

Fixed a bug where `boto3` crashed when creating the default session in multiple threads simultaneously (see https://github.com/boto/boto3/issues/1592). Fixed in #1260.

1. **Retry on `ChannelException` errors in the `SFTPObjectStore`**

Catch `ChannelException` SFTP transient error and retry. Fixed in 1245.

1. **Treating S3 Permission Denied Errors as Not Found Errors**

We update our handling of `botocore` 403 ClientErrors to interpret them as `FileNotFoundErrors`. We do this because of a situation that occurs when a user has no S3 credentials configured, and tries to read from a bucket with public files. For privacy, Amazon S3 raises 403 (Permission Denied) instead of 404 (Not Found) errors. As such, PR 1249 treats 403 ClientErrors as FileNotFoundErrors.

1. **Fixed Parsing of `grad_accum` in the `TrainerHparams`**

Fixes an error where the command line override `--grad_accum` led to incorrect parsing. Fixed in 1256.

1. **Fixed Example YAML Files**

Our recipe configurations (YAML) are updated to the latest version, and a test was added to enforce correctness moving forward. Fixed in 1235 and 1257.




Changelog

https://github.com/mosaicml/composer/compare/v0.8.0...v0.8.1

0.8.0

```bash
pip install --upgrade mosaicml==0.8.0
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.0
```


New Features


1. **🤗 HuggingFace ComposerModel**

Train your HuggingFace models with Composer! We introduced a [`HuggingFaceModel`](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.models.huggingface.html#composer.models.huggingface.HuggingFaceModel) that converts your existing 🤗 Transformers models into a ComposerModel.

For example:

```python
import transformers
from composer.models import HuggingFaceModel

# Define the model
hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Convert it into a ComposerModel
model = HuggingFaceModel(hf_model)

# Construct the trainer
trainer = Trainer(
    ...,
    model=model,
)

# Train!
trainer.fit()
```


For more information, see the example on [fine-tuning a pretrained BERT with Composer](https://docs.mosaicml.com/en/v0.8.0/examples/huggingface_models.html).

1. **🫕 Fused Layer Norm**


Fused LayerNorm replaces implementations of [`torch.nn.LayerNorm`](https://pytorch.org/docs/1.11/generated/torch.nn.LayerNorm.html) with [`apex.normalization.fused_layer_norm`](https://nvidia.github.io/apex/layernorm.html). The fused kernel provides increased GPU utilization.

For example:

```python
from composer.trainer import Trainer
from composer.algorithms import FusedLayerNorm

# Initialize the algorithm
alg = FusedLayerNorm()

# Construct the trainer
trainer = Trainer(
    algorithms=alg,
)

# Train!
trainer.fit()
```


See the [method card](https://docs.mosaicml.com/en/v0.8.0/method_cards/fused_layernorm.html) for more information.

1. **💾 Ignore Checkpoint Parameters**

If you have a checkpoint and don't want to restore some elements of the checkpoint to the [state](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.core.state.html#composer.core.state.State), we added a `load_ignore_keys` parameter. Any specified (nested) keys will be ignored. Glob syntax is supported!

For example, to restore a checkpoint without the seed:

```python
from composer import Trainer

trainer = Trainer(
    ...,
    load_path="path/to/my/checkpoint.pt",
    load_ignore_keys=["state/rank_zero_seed", "rng"],
)
```


See the [Trainer API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) for more information.


1. **🪣 Object Stores**

Composer v0.8.0 introduces an abstract [Object Store API](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.utils.object_store.object_store.html#composer.utils.object_store.object_store.ObjectStore) to support multiple object store drivers, such as boto3 (for Amazon S3) and Paramiko (for SFTP), in addition to the existing libcloud implementation.

For example, if you are training on AWS where credentials are available in the environment, here's how to save checkpoints to an S3 object store via Boto3.

```python
from composer import Trainer
from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import S3ObjectStore

logger = ObjectStoreLogger(
    object_store_cls=S3ObjectStore,
    object_store_kwargs={
        # These arguments will be passed into the S3ObjectStore -- e.g.:
        # object_store = S3ObjectStore(**object_store_kwargs)
        # Refer to the S3ObjectStore class for documentation
        'bucket': 'my-bucket',
    },
)

trainer = Trainer(
    ...,
    loggers=logger,
)

# Train!
trainer.fit()
```


See the [Object Store API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.utils.object_store.html#module-composer.utils.object_store) for more information.



1. **🪨 Artifact Metadata**

Composer automatically logs the epoch, batch, sample, and token counts as metadata when storing artifacts in Weights & Biases. See the [API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.loggers.wandb_logger.html#composer.loggers.wandb_logger.WandBLogger) for more information.


API Changes

1. **✂️ Gradient Clipping is now an Algorithm**

To clean up the Trainer, we moved gradient clipping into an Algorithm. The ``grad_clip_norm`` argument in the Trainer is deprecated and will be removed in a future version of Composer. Instead, use the [Gradient Clipping](https://docs.mosaicml.com/en/v0.8.0/method_cards/gradient_clipping.html) algorithm:

For example:

```python
from composer.algorithms import GradientClipping
from composer.trainer import Trainer

# Configure gradient clipping
gradient_clipping = GradientClipping()

# Configure the trainer
trainer = Trainer(
    ...,
    algorithms=gradient_clipping,
)

# Train!
trainer.fit()
```


See the [method card](https://docs.mosaicml.com/en/v0.8.0/method_cards/gradient_clipping.html) for more information.

1. **🕒️ Removed `batch_num_samples` and `batch_num_tokens` from the state.**

State properties `batch_num_samples` and `batch_num_tokens` have been removed.
Instead, use [`State.timestamp`](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.core.time.html#composer.core.time.Timestamp) for token and sample tracking.
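
For example, a rough sketch of reading these counts from a callback (the `Timestamp` attribute names follow its docs; treat the details as illustrative):

```python
from composer.core import Callback, State
from composer.loggers import Logger

class SampleCounter(Callback):
    """Toy callback that reads progress counts from the unified timestamp."""

    def batch_end(self, state: State, logger: Logger) -> None:
        print('samples seen:', int(state.timestamp.sample))
        print('tokens seen:', int(state.timestamp.token))
```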

1. **🧑‍🤝‍🧑 DDP Sync Strategy**

We changed the default [DDP Sync Strategy](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.trainer.ddp.html?highlight=MULTI_AUTO_SYNC#composer.trainer.ddp.DDPSyncStrategy) to `MULTI_AUTO_SYNC`, as `FORCED_SYNC` doesn't work with all algorithms.
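
If you need the previous behavior, the strategy can still be set explicitly when constructing the Trainer; a sketch (the accepted string values follow the DDP docs):

```python
from composer import Trainer

trainer = Trainer(
    ...,
    ddp_sync_strategy='forced_sync',  # or 'multi_auto_sync', the new default
)
```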

1. **🏃 Moved the `run_name` into the `State`**

The `run_name` has been added to the [State](https://docs.mosaicml.com/en/latest/api_reference/composer.core.state.html#composer.core.state.State.run_name) object, so it is persisted with checkpoints. It has been removed from the Logger.


Bug Fixes

* In the Object Store Logger, added retries for credential validation, and credentials are now validated only on global rank zero. (1144)
* Fixed a bug in the speed monitor where it returned negative wall clock times. (1123)
* Fixed a bug where block-wise Stochastic Depth could freeze the trainer. (1087)
* Fixed a bug in the `MLPerfCallback` where sample counts were incorrect on per-sharded datasets. (1156)



Changelog

https://github.com/mosaicml/composer/compare/v0.7.1...v0.8.0
