Mosaicml

Latest version: v0.27.0


0.9.0

Not secure
New Features

1. **:package: Export for inference APIs**

Train with Composer and deploy anywhere! We have added a dedicated [export API](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.utils.inference.html) as well as an [export training callback](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.callbacks.export_for_inference.html) to allow you to export Composer-trained models for inference, supporting popular formats such as [torchscript](https://pytorch.org/docs/stable/jit.html) and [ONNX](https://onnx.ai/).

For example, here's how to export a model in torchscript format:

```python
from composer.utils import export_for_inference

# Invoking export with a trained model
export_for_inference(model=model,
                     save_format='torchscript',
                     save_path=model_save_path)
```


Here's an example of using the training callback, which automatically exports the model to ONNX format at the end of training:

```python
from composer import Trainer
from composer.callbacks import ExportForInferenceCallback

# Initializing Trainer with the export callback
callback = ExportForInferenceCallback(save_format='onnx',
                                      save_path=model_save_path)
trainer = Trainer(model=model,
                  callbacks=callback,
                  train_dataloader=dataloader,
                  max_duration='10ep')

# Model will be exported at the end of training
trainer.fit()
```


Please see our [Exporting for Inference](https://docs.mosaicml.com/en/stable/examples/exporting_for_inference.html) notebook for more information.

1. **:chart_with_upwards_trend: ALiBi support for BERT training**

You can now use ALiBi (**A**ttention with **Li**near **Bi**ases; [Press et al., 2021](https://arxiv.org/abs/2108.12409)) when training BERT models with Composer, delivering faster training and higher accuracy by leveraging shorter sequence lengths.

ALiBi improves the quality of BERT pre-training, especially when pre-training uses shorter sequence lengths than the downstream (fine-tuning) task. This allows models with ALiBi to reach higher downstream accuracy with less pre-training time.

Example of using ALiBi as an algorithm with the Composer Trainer:

```python
import composer.algorithms
import composer.models
import composer.trainer

# Create an instance of a BERT masked language model
model = composer.models.create_bert_mlm()

# Apply ALiBi (when training is initialized)
alibi = composer.algorithms.Alibi(max_sequence_length=1024)

# Train with ALiBi
trainer = composer.trainer.Trainer(
    model=model,
    train_dataloader=train_dataloader,
    algorithms=[alibi]
)
trainer.fit()
```


Example using the Composer Functional API:

```python
import composer.functional as cf
import composer.models

# Create an instance of a BERT masked language model
model = composer.models.create_bert_mlm()

# Apply ALiBi and expand the model's maximum sequence length to 1024
cf.apply_alibi(model=model, max_sequence_length=1024)
```


ALiBi can now also be extended to work with custom models by registering your attention and embedding layers. Please see our [ALiBi method card](https://docs.mosaicml.com/en/stable/method_cards/alibi.html) for more information.
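
To make the idea concrete, here is a minimal pure-Python sketch of the bias that ALiBi adds to attention scores. It assumes the slope schedule described by Press et al. (a geometric sequence ending at 2^-8 for a power-of-two number of heads) and a symmetric distance penalty, which is one natural choice for a bidirectional model such as BERT. The function names are illustrative, and this is not Composer's implementation.

```python
from __future__ import annotations


def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes 2^(-8/n), 2^(-16/n), ..., 2^(-8) for n heads."""
    return [2.0 ** (-8.0 * (h + 1) / num_heads) for h in range(num_heads)]


def alibi_bias(seq_len: int, slope: float) -> list[list[float]]:
    """Bias added to one head's attention logits: -slope * |i - j|."""
    return [[-slope * abs(i - j) for j in range(seq_len)] for i in range(seq_len)]


slopes = alibi_slopes(8)          # first head penalizes distance the most
bias = alibi_bias(4, slopes[0])   # 4x4 bias matrix for the first head
```

Because the bias depends only on the distance between positions, the same function can produce a bias for a longer sequence at fine-tuning time than was used during pre-training.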

1. **Entry point for GLUE tasks pre-training and fine-tuning**

You can now easily pre-train and fine-tune NLP models across all [GLUE](https://gluebenchmark.com/) (General Language Understanding Evaluation) tasks through one simple entry point! The entry point handles model saving and loading, spawns GLUE tasks in parallel across all available GPUs, and delivers a highly efficient evaluation of model performance.

Example of launching the entrypoint:

```bash
# This runs pre-training followed by fine-tuning.
# --training_scheme can take either pretrain, finetune, or all depending on the task!
python run_glue_trainer.py -f glue_example.yaml --training_scheme all
```


Please see our [GLUE entrypoint notebook](https://docs.mosaicml.com/en/v0.9.0/examples/glue/glue_entrypoint.html) for more information.

1. **🤖 TPU support (in beta)**

You can now use Composer to train your models on TPUs! TPU support is in beta and is currently limited to single-core training. Try it out, explore optimizations, and share your feedback and feature requests with us so we can make it better for you and for the community.

To use TPUs with Composer, simply specify a `tpu` device:

```python
import composer.trainer

# Set device to `tpu`
trainer = composer.trainer.Trainer(
    model=model,
    train_dataloader=train_dataloader,
    max_duration=train_epochs,
    device='tpu')

# Run fit
trainer.fit()
```


Please see our [Training with TPUs notebook](https://docs.mosaicml.com/en/v0.9.0/examples/TPU_Training_in_composer.html) for more information.

1. **:apple: Apple Silicon support (beta)**

Leverage Apple Silicon chips to train your models with Composer by providing the `device='mps'` argument:

```python
from composer import Trainer

trainer = Trainer(
    ...,
    device='mps'
)
```


We use the latest PyTorch MPS backend to execute the training. This requires torch version ≥1.12 and macOS 12.3+.

For more information on training with Apple M chips, see the [PyTorch 1.12 blog](https://pytorch.org/blog/pytorch-1.12-released/#prototype-introducing-accelerated-pytorch-training-on-mac) and our [API Reference](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.trainer.devices.device_mps.html) for Composer specific details.
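
As a rough sketch, the device string can be chosen at runtime by probing what PyTorch reports as available. The helper name `pick_device` is hypothetical, and the `'gpu'` and `'cpu'` strings are assumed to be the values the Trainer accepts alongside the `'mps'` and `'tpu'` values shown above. The code falls back to `'cpu'` when PyTorch is not installed or predates the MPS backend.

```python
import importlib.util


def pick_device() -> str:
    """Return a device string based on what this machine offers."""
    if importlib.util.find_spec("torch") is None:
        return "cpu"  # PyTorch is not installed, so there is nothing to probe
    import torch

    mps = getattr(torch.backends, "mps", None)  # attribute is absent before torch 1.12
    if mps is not None and mps.is_available():
        return "mps"
    if torch.cuda.is_available():
        return "gpu"
    return "cpu"


device = pick_device()
```

The result can then be passed as the `device` argument when constructing the Trainer.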

1. **:construction: Contrib repository**

Got a new method idea, or published a paper and want those methods to be easily accessible? We've created the [`mcontrib` repository](https://github.com/mosaicml/mcontrib), with a lightweight process to contribute new algorithms. We're happy to work directly with you to benchmark these methods and eventually “promote” them to Composer for use by end customers.

Please check out the [README](https://github.com/mosaicml/mcontrib#adding-algorithms) for details on how to contribute a new algorithm. For more details on how to write speed-up methods, see our notebook on [custom speed-up methods](https://docs.mosaicml.com/en/v0.9.0/examples/custom_speedup_methods.html).

Additional API Changes

1. **:1234: Passes Module**

The order in which algorithms are run matters significantly during composition. With this release we refactored algorithm passes into their own [`passes` module](https://docs.mosaicml.com/en/v0.9.0/api_reference/composer.core.passes.html). Users can now register custom passes (for custom algorithms) with the Engine. Please see #1377 for more information.
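
Conceptually, a pass is a function that takes the list of algorithms about to run and returns it in a new order. The sketch below illustrates that idea with stand-in classes only: `Algorithm`, `run_last`, and `apply_passes` are hypothetical names and do not reflect the signatures of the `passes` module or the Engine, which are documented at the link above.

```python
from __future__ import annotations

from typing import Callable, Sequence


class Algorithm:
    """Stand-in for an algorithm; only the name matters for ordering here."""

    def __init__(self, name: str) -> None:
        self.name = name


Pass = Callable[[Sequence[Algorithm]], Sequence[Algorithm]]


def run_last(name: str) -> Pass:
    """Build a pass that moves the named algorithm to the end, keeping the rest in order."""

    def _pass(algorithms: Sequence[Algorithm]) -> Sequence[Algorithm]:
        # sorted() is stable, so algorithms with the same key keep their relative order.
        return sorted(algorithms, key=lambda a: a.name == name)

    return _pass


def apply_passes(algorithms: Sequence[Algorithm], passes: Sequence[Pass]) -> list[Algorithm]:
    """Apply each registered pass in turn to obtain the final execution order."""
    for p in passes:
        algorithms = p(algorithms)
    return list(algorithms)


ordered = apply_passes(
    [Algorithm("fused_layernorm"), Algorithm("alibi"), Algorithm("gated_linear_units")],
    [run_last("fused_layernorm")],
)
names = [a.name for a in ordered]
```

Here `names` ends with `fused_layernorm`, the kind of ordering constraint that a pass is meant to enforce.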

1. **:file_cabinet: Default Checkpoint Extension**

The CheckpointSaver now defaults to using the `*.pt` extension for checkpoint filenames. Please see #1370 for more information.

1. **:eye: Models Refactor**

Most vision models (ResNet, MNIST, ViT, EfficientNet) have been refactored from classes to factory functions. For example, `ComposerResNet` -> `composer_resnet`.

```python
# Before
from composer.models import ComposerResNet
model = ComposerResNet(...)

# After
from composer.models import composer_resnet
model = composer_resnet(...)
```


The same refactor has been done for NLP as well, e.g. `BERTModel` -> `create_bert_mlm` and `create_bert_classification`.

See #1227 (vision) and #1130 (NLP) for more details.

1. **:heavy_plus_sign: Misc API Changes**

* `BreakEpochException` has been removed.
* `state.is_model_deepspeed` has been moved to `composer.utils.is_model_deepspeed`.
* Helper function `monitored_barrier` has been added to `composer` distributed.


Bug Fixes

* Add informative error for infer batch size issues (#1401)
* Fix ImagenetDatasetHparams bug (#1392), resolves #1111
* Fix hparams error condition checking (#1394)
* Fix AMP resumption with grad scaler (#1376)
* Auto Grad Accum Cache Clearing (#1380), fixes issue reported in #1331
* Fix default precision (#1369)
* Fix the profiler on multi-node training (#1358), resolves #1270
* Retry SFTP on Size Mismatch (#1300)
* Fix scheduler edge cases (#1350), resolves #1077
* Fix a race condition in the object store logger (#1328)
* Fix WandB load from checkpoint (#1326)
* Fix Notebook Progress Bars (#1313)

Commits

What's Changed
* Fix DeepSpeed typo in docstring by abhi-mosaic in https://github.com/mosaicml/composer/pull/1188
* Move grad_accum logging to every step by coryMosaicML in https://github.com/mosaicml/composer/pull/1187
* Update STYLE_GUIDE with details on Documentation by bandish-shah in https://github.com/mosaicml/composer/pull/1183
* ProgressBar Units by hanlint in https://github.com/mosaicml/composer/pull/1190
* Added Xavier Normal initializer by vladd-i in https://github.com/mosaicml/composer/pull/1196
* Updated cost figure by nqn in https://github.com/mosaicml/composer/pull/1180
* Remove algorithm yamls by hanlint in https://github.com/mosaicml/composer/pull/1193
* Fix the Composer Launch Script for the Composer Dockerimage; Default `nproc = torch.cuda.device_count()` if not specified via env by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1195
* Bert model card by A-Jacobson in https://github.com/mosaicml/composer/pull/1198
* Add Notes on Early Stopping by anisehsani in https://github.com/mosaicml/composer/pull/1182
* Stochastic depth that preserves weights by Landanjs in https://github.com/mosaicml/composer/pull/1085
* Adding Gated Linear Units as an algorithm by moinnadeem in https://github.com/mosaicml/composer/pull/1192
* A utility to fuse parallel linear layers in FX-traced models by dskhudia in https://github.com/mosaicml/composer/pull/1189
* Build+push Composer dockerimages to `mosaicml/composer_staging` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1197
* Fix the SFTP Object Store by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1202
* Bert emoji by A-Jacobson in https://github.com/mosaicml/composer/pull/1205
* Adding a constant warmup scheduler by linden-li in https://github.com/mosaicml/composer/pull/1203
* Fix multi-GPU conflicts when downloading `torchvision` datasets by abhi-mosaic in https://github.com/mosaicml/composer/pull/1201
* Add caveats about automatic gradient accumulation by hanlint in https://github.com/mosaicml/composer/pull/1207
* Remove the `composer_train` entrypoint; put it back in `examples` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1211
* Fix Composer staging dockerimages by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1210
* Set SFTP Object Store Private Key Filepath from an Environ by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1212
* [xs] Fix progress bars in `get_file` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1216
* Cleanup SFTP url parsing for StreamingDataset by abhi-mosaic in https://github.com/mosaicml/composer/pull/1217
* Fix Symlinks on Non-Libcloud Object Stores by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1209
* Fix the ObjectStoreLogger with Overwrite=True by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1208
* Throughput metrics by linden-li in https://github.com/mosaicml/composer/pull/1215
* Fix module surgery for training resumptions with optimizers that save state by dskhudia in https://github.com/mosaicml/composer/pull/1200
* Update bert-base.yaml by moinnadeem in https://github.com/mosaicml/composer/pull/1219
* StreamingDataset: make remote optional, attempt to prettify docstrings. by knighton in https://github.com/mosaicml/composer/pull/1220
* Update vision-style `StreamingDataset`s to subclass `VisionDataset` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1223
* Improve docstrings. by knighton in https://github.com/mosaicml/composer/pull/1222
* shardwise zip streaming datasets by milocress in https://github.com/mosaicml/composer/pull/1177
* updated mosaic logos to composer logos in docs by ejyuen in https://github.com/mosaicml/composer/pull/1221
* Add `COMPOSER_KNOWN_HOSTS_FILENAME` for setting the sftp known hosts file environ by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1224
* StreamingDataset: correctly handle exceptions in child download thread. by knighton in https://github.com/mosaicml/composer/pull/1228
* hot fix compression 404 by milocress in https://github.com/mosaicml/composer/pull/1229
* Treat any dropped SSH/SFTP connection as a transient error by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1225
* refactor bert and gpt by A-Jacobson in https://github.com/mosaicml/composer/pull/1130
* Hotfix for S3 `FileNotFoundError` by abhi-mosaic in https://github.com/mosaicml/composer/pull/1233
* Fix StreamingDataset compression with multi-rank by milocress in https://github.com/mosaicml/composer/pull/1231
* Refactor vision models by Landanjs in https://github.com/mosaicml/composer/pull/1227
* Update resnet50_medium.yaml by lupesko in https://github.com/mosaicml/composer/pull/1235
* Increase default timeout for `StreamingC4` to 120s by abhi-mosaic in https://github.com/mosaicml/composer/pull/1234
* Add Debug Log Statements; Fix Pyright by hanlint in https://github.com/mosaicml/composer/pull/1218
* Hotfix deeplabv3 by Landanjs in https://github.com/mosaicml/composer/pull/1238
* Add Tensorboard Logger by eracah in https://github.com/mosaicml/composer/pull/1194
* Move the model and optimizers to the device before `Event.INIT` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1084
* Fix bug in streaming iteration/downloading, refactor by knighton in https://github.com/mosaicml/composer/pull/1239
* Support sequence of losses in backwards pass by Landanjs in https://github.com/mosaicml/composer/pull/1240
* Add device_id param to DeviceGPU by ishanashastri in https://github.com/mosaicml/composer/pull/1244
* Update CutMix to work with segmentation style labels by coryMosaicML in https://github.com/mosaicml/composer/pull/1230
* Catching ChannelErrors on SFTP Failures by moinnadeem in https://github.com/mosaicml/composer/pull/1245
* Make `StreamingDataset` compression file easier to write/read by abhi-mosaic in https://github.com/mosaicml/composer/pull/1246
* [XS] Updating console progress_bar logger to use max_duration units by moinnadeem in https://github.com/mosaicml/composer/pull/1243
* Catch botocore ClientError 403 by abhi-mosaic in https://github.com/mosaicml/composer/pull/1249
* Tensorboard Notebook + Tutorial by eracah in https://github.com/mosaicml/composer/pull/1250
* Fix repeated words in event.py by isaac0804 in https://github.com/mosaicml/composer/pull/1254
* Make progressive resizing quieter by coryMosaicML in https://github.com/mosaicml/composer/pull/1255
* fix typo in example by xloem in https://github.com/mosaicml/composer/pull/1259
* Create a new `boto3.Session()` per `S3ObjectStore` instance by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1260
* Fix recipe yamls for `v0.8`, add testing by hanlint in https://github.com/mosaicml/composer/pull/1257
* Automatic Stochastic depth on residual blocks by dskhudia in https://github.com/mosaicml/composer/pull/1253
* Sequence length warmup update and tests by alextrott16 in https://github.com/mosaicml/composer/pull/1199
* ProgressBarLogger UX Enhancements by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1264
* Update to latest pytorch by mvpatel2000 in https://github.com/mosaicml/composer/pull/1262
* Add packaging to `meta.yaml`; add `py-cpuinfo` max version by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1271
* Fix Flaky Tests by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1272
* Add callback for visualizing image inputs and outputs by coryMosaicML in https://github.com/mosaicml/composer/pull/1266
* Add `scale_warmup` argument to schedulers by hanlint in https://github.com/mosaicml/composer/pull/1268
* Switch Jenkins to r1z3 by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1277
* BERT and C4 updates by abhi-mosaic in https://github.com/mosaicml/composer/pull/1252
* Default to `allow_tf32=True` for GPU Devices by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1275
* Fix grad accum parsing in hparams by hanlint in https://github.com/mosaicml/composer/pull/1256
* Fix issue with doctest format in some docstring examples by Landanjs in https://github.com/mosaicml/composer/pull/1269
* Adds S3ObjectStore import to util __init__.py by codestar12 in https://github.com/mosaicml/composer/pull/1274
* Add tutorial on exporting for inference by hanlint in https://github.com/mosaicml/composer/pull/1276
* HTTPS downloads for streaming datasets by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1258
* object stores for streaming datasets by milocress in https://github.com/mosaicml/composer/pull/1248
* Allow object name prefix for S3ObjectStore by abhi-mosaic in https://github.com/mosaicml/composer/pull/1278
* Hotfix CO-658 by milocress in https://github.com/mosaicml/composer/pull/1273
* Fix S3 remote paths for StreamingDataset download by abhi-mosaic in https://github.com/mosaicml/composer/pull/1280
* Add combo loss to DeepLabv3+ by Landanjs in https://github.com/mosaicml/composer/pull/1265
* Checkpoint backwards compatibility for ProgressBar by hanlint in https://github.com/mosaicml/composer/pull/1287
* Add missing callbacks by hanlint in https://github.com/mosaicml/composer/pull/1286
* Fix S3 prefix upload/download by abhi-mosaic in https://github.com/mosaicml/composer/pull/1288
* Fix device inference in module surgery by hanlint in https://github.com/mosaicml/composer/pull/1290
* Actual fix to backwards compatibility by hanlint in https://github.com/mosaicml/composer/pull/1289
* Bugs in getting_started.ipynb by rahulvigneswaran in https://github.com/mosaicml/composer/pull/1285
* Add pytorch 1.12.0 docker image by linden-li in https://github.com/mosaicml/composer/pull/1247
* Fix TB Logger + ObjectStore quadratic complexity issue by doing 1 file per flush by eracah in https://github.com/mosaicml/composer/pull/1283
* Enable README Doctests with GPUs by mvpatel2000 in https://github.com/mosaicml/composer/pull/1279
* Fix logging of hparams to object stores by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1297
* [xs] Reformat the Composer Version String by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1301
* Add monitored barrier for autograd accum by mvpatel2000 in https://github.com/mosaicml/composer/pull/1295
* [xs] Notebook Fixes by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1299
* [xs] Store the Composer version in one place. by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1302
* model export for inference. Functional API by dskhudia in https://github.com/mosaicml/composer/pull/1294
* Add a `return_outputs` flag to `predict()` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1307
* Integration Testing by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1305
* Fix `get_file_artifact` in the WandBLogger to work on all ranks by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1304
* Add documentation about `run_name` to Composer by eracah in https://github.com/mosaicml/composer/pull/1298
* Enforce FusedLayerNorm is ordered last by alextrott16 in https://github.com/mosaicml/composer/pull/1309
* Revert monitored barrier by mvpatel2000 in https://github.com/mosaicml/composer/pull/1311
* [xs] Build the Composer Docker Image only on `dev` branch merges by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1308
* Fix Notebook Progress Bars by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1313
* Remove `pytest-timeout` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1317
* [Minor] Inference API parameter name change by dskhudia in https://github.com/mosaicml/composer/pull/1315
* Matthew/swa readme by growlix in https://github.com/mosaicml/composer/pull/1292
* Enable gloo backend by mvpatel2000 in https://github.com/mosaicml/composer/pull/1321
* [xs] Fix pytest test filtering; Bump the minimum pytorch version to 1.10 by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1320
* revert gloo by mvpatel2000 in https://github.com/mosaicml/composer/pull/1324
* Fix WandB load from checkpoint by abhi-mosaic in https://github.com/mosaicml/composer/pull/1326
* ALiBi for BERT and ALiBi testing by alextrott16 in https://github.com/mosaicml/composer/pull/1267
* Update HF example with read of model eval accuracy by lupesko in https://github.com/mosaicml/composer/pull/1332
* Cleanup API Reference Titles by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1336
* Fix a race condition in the object store logger by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1328
* Auto Grad Accum Change to Warning by mvpatel2000 in https://github.com/mosaicml/composer/pull/1338
* Add export for inference callback by nik-mosaic in https://github.com/mosaicml/composer/pull/1323
* Add save fine-tune model to HuggingFace example by lupesko in https://github.com/mosaicml/composer/pull/1333
* Update DWD optimizers by abhi-mosaic in https://github.com/mosaicml/composer/pull/1339
* Cap Numpy Version by mvpatel2000 in https://github.com/mosaicml/composer/pull/1345
* Update slack link by hanlint in https://github.com/mosaicml/composer/pull/1344
* Fix scheduler edge cases by abhi-mosaic in https://github.com/mosaicml/composer/pull/1350
* Integration Tests for Object Stores and Loggers by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1322
* Retry SFTP on Size Mismatch by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1300
* [xs] Restore the dataloader and training properties in `predict()` by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1352
* Add Precision Contexts by mvpatel2000 in https://github.com/mosaicml/composer/pull/1347
* Update GLU logging strings by moinnadeem in https://github.com/mosaicml/composer/pull/1348
* Add domain-specific codeowners by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1354
* fix marker by mvpatel2000 in https://github.com/mosaicml/composer/pull/1359
* Fix the profiler on multi-node training by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1358
* Glue Entrypoint by ishanashastri in https://github.com/mosaicml/composer/pull/1263
* Yahp v0.1.3 by mvpatel2000 in https://github.com/mosaicml/composer/pull/1346
* Move metrics to context by mvpatel2000 in https://github.com/mosaicml/composer/pull/1361
* Refactor multiple losses to support dictionaries and fix discrepancies by Landanjs in https://github.com/mosaicml/composer/pull/1349
* Fix Coverage Reports on Jenkins by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1114
* JSON Schemas by mvpatel2000 in https://github.com/mosaicml/composer/pull/1371
* add filename extension by mvpatel2000 in https://github.com/mosaicml/composer/pull/1370
* JSON Schemas pt 2 by mvpatel2000 in https://github.com/mosaicml/composer/pull/1373
* Update Export for Inference methods by nik-mosaic in https://github.com/mosaicml/composer/pull/1355
* Fix default precision by A-Jacobson in https://github.com/mosaicml/composer/pull/1369
* Clean up unused exception by mvpatel2000 in https://github.com/mosaicml/composer/pull/1368
* Revert "Clean up unused exception" by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1378
* Remove Unused Exception by mvpatel2000 in https://github.com/mosaicml/composer/pull/1379
* Auto Grad Accum Cache Clearing by mvpatel2000 in https://github.com/mosaicml/composer/pull/1380
* Add ability to register algorithm passes by hanlint in https://github.com/mosaicml/composer/pull/1377
* Fix AMP resumption with grad scaler by hanlint in https://github.com/mosaicml/composer/pull/1376
* Update CUDA and remove NCCL downgrade from Dockerfile by abhi-mosaic in https://github.com/mosaicml/composer/pull/1362
* Add Notes on Artifact Logging by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1381
* Print the microbatch size when using Adaptive Gradient Accumulation by hanlint in https://github.com/mosaicml/composer/pull/1387
* Cleaner API reference part 1: references with minimal import paths by dblalock in https://github.com/mosaicml/composer/pull/1385
* Add Event.BEFORE_DATALOADER by mvpatel2000 in https://github.com/mosaicml/composer/pull/1388
* remove private s3 paths by A-Jacobson in https://github.com/mosaicml/composer/pull/1389
* Tutorial on training without Local Storage by ravi-mosaicml in https://github.com/mosaicml/composer/pull/1351
* [inference] Update export_for_inference notebook with new APIs by dskhudia in https://github.com/mosaicml/composer/pull/1360
* Fix resnet warnings criteria by mvpatel2000 in https://github.com/mosaicml/composer/pull/1395
* Fix hparams error by mvpatel2000 in https://github.com/mosaicml/composer/pull/1394
* Add knighton to codeowners for datasets by knighton in https://github.com/mosaicml/composer/pull/1397
* Fix ImagenetDatasetHparams bug by nik-mosaic in https://github.com/mosaicml/composer/pull/1392
* Decouple GLUE entry point saving and loading logic by ishanashastri in https://github.com/mosaicml/composer/pull/1390
* Glue example notebook by ishanashastri in https://github.com/mosaicml/composer/pull/1383
* Add informative error for infer batch size issues by hanlint in https://github.com/mosaicml/composer/pull/1401
* Only sync batchnorm statistics within a node for deeplab by Landanjs in https://github.com/mosaicml/composer/pull/1391
* Update DeepLabv3 pretrained weight interface to work with PyTorch 1.12 by Landanjs in https://github.com/mosaicml/composer/pull/1399
* tpu single core by florescl in https://github.com/mosaicml/composer/pull/1400
* Add support for Apple M chips by hanlint in https://github.com/mosaicml/composer/pull/1405
* [xs] Add `mps` and `tpu` device to Trainer docstrings by hanlint in https://github.com/mosaicml/composer/pull/1410

**Full Changelog**: https://github.com/mosaicml/composer/compare/v0.8.2...v0.9.0

New Contributors
* vladd-i made their first contribution in https://github.com/mosaicml/composer/pull/1196
* linden-li made their first contribution in https://github.com/mosaicml/composer/pull/1203
* ejyuen made their first contribution in https://github.com/mosaicml/composer/pull/1221
* lupesko made their first contribution in https://github.com/mosaicml/composer/pull/1235
* isaac0804 made their first contribution in https://github.com/mosaicml/composer/pull/1254
* xloem made their first contribution in https://github.com/mosaicml/composer/pull/1259
* alextrott16 made their first contribution in https://github.com/mosaicml/composer/pull/1199
* codestar12 made their first contribution in https://github.com/mosaicml/composer/pull/1274
* rahulvigneswaran made their first contribution in https://github.com/mosaicml/composer/pull/1285
* nik-mosaic made their first contribution in https://github.com/mosaicml/composer/pull/1323

0.8.2

Not secure
๐Ÿ› Bug Fixes

1. **Fixed Notebook Progress Bars in Colab**

Fixes a bug introduced by #1264 that causes Composer running in Colab notebooks to error out with `UnsupportedOperation: fileno`.

Closes #1312. Fixed in PR #1314.

Changelog

https://github.com/mosaicml/composer/compare/v0.8.1...v0.8.2

0.8.1

Not secure
๐ŸŽ New Features


1. **๐Ÿ–ผ๏ธ Image Visualizer**

The [`ImageVisualizer`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.callbacks.image_visualizer.html#composer.callbacks.image_visualizer.ImageVisualizer) callback periodically logs the training and validation images when using the WandB logger. This is great for validating your dataloader pipeline, especially if extensive data augmentations are used. Also, when training on a semantic segmentation task, the callback can log the target segmentation mask and the predicted segmentation mask by setting the argument `mode='segmentation'`. See PR 1266 for more details. Here is an example of using the `ImageVisualizer` callback:

```python
from composer import Trainer
from composer.callbacks import ImageVisualizer

# Callback to log 8 training images after every 100 batches
image_visualizer = ImageVisualizer()

# Construct trainer
trainer = Trainer(
    ...,
    callbacks=image_visualizer
)

# Train!
trainer.fit()
```



Here is an example visualization from the training set of ADE20k:

![](https://i.imgur.com/iszIRLS.jpg)


1. **📶 TensorBoard Logging**

You can now log metrics and losses from your Composer training runs with TensorBoard! See #1250 and #1283 for more details. All you have to do is create a [`TensorboardLogger`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.loggers.tensorboard_logger.html#composer.loggers.tensorboard_logger.TensorboardLogger) object and add it to the list of loggers in your [`Trainer`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) object like so:

```python
from composer import Trainer
from composer.loggers import TensorboardLogger

tb_logger = TensorboardLogger(log_dir="./my_tensorboard_logs")

trainer = Trainer(
    ...,
    # Add your Tensorboard Logger to the trainer here.
    loggers=[tb_logger],
)

trainer.fit()
```


For more information, see this [tutorial](https://docs.mosaicml.com/en/v0.8.1/notes/tensorboard_logger.html).




1. **🔙 Multiple Losses**

Adds support for multiple losses. If a model returns a tuple of losses, they are summed before the `loss.backward()` call. See #1240 for more details.
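
The summing behavior can be pictured with a small helper. This is only a sketch: plain floats stand in for loss tensors, and `total_loss` is a hypothetical name rather than a Composer function.

```python
from __future__ import annotations

from typing import Sequence, Union


def total_loss(loss: Union[float, Sequence[float]]) -> float:
    """Collapse a single loss or a tuple of losses into one scalar."""
    if isinstance(loss, (tuple, list)):
        return sum(loss)  # multiple losses are added together
    return loss           # a single loss passes through unchanged


combined = total_loss((0.75, 0.25))  # e.g. a main loss plus an auxiliary loss
single = total_loss(0.5)
```

With real tensors, the backward pass would then be run once on the combined value.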


1. **🌎️ Stream Datasets from HTTP URIs**

You can now specify an HTTP URI for a [Streaming Dataset](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.datasets.streaming.dataset.html#composer.datasets.streaming.dataset.StreamingDataset) remote. See #1258 for more details. For example:

```python
from composer import Trainer
from composer.datasets.streaming import StreamingDataset
from torch.utils.data import DataLoader

# Construct the Dataset
dataset = StreamingDataset(
    ...,
    remote="https://example.com/dataset/",
)

# Construct the DataLoader
train_dl = DataLoader(dataset)

# Construct the Trainer
trainer = Trainer(
    ...,
    train_dataloader=train_dl,
)

# Train!
trainer.fit()
```


For more information on streaming datasets, see this [tutorial](https://docs.mosaicml.com/en/v0.8.1/examples/streaming_dataloader_facesynthetics.html).


1. **GPU Devices default to TF32 Matmuls**

Beginning with PyTorch 1.12, the default behavior for computing FP32 matrix multiplies on NVIDIA Ampere devices was switched from TF32 to FP32. See [PyTorch documentation here](https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices).

Since Composer is designed specifically for ML training with a focus on efficiency, we chose to preserve the old default of using TF32 on Ampere devices. This leads to significantly higher throughput when training in single precision, [without impacting training convergence](https://developer.nvidia.com/blog/accelerating-ai-training-with-tf32-tensor-cores/). See PR #1275 for implementation details.
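
For reference, PyTorch exposes TF32 through two backend flags, `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32`. The sketch below sets both in plain PyTorch; whether Composer toggles exactly these flags internally is an assumption here, so treat PR #1275 as the authoritative source. The helper name `enable_tf32` is hypothetical, and it returns `False` when PyTorch is not installed.

```python
import importlib.util


def enable_tf32() -> bool:
    """Turn on TF32 for matmuls and cuDNN convolutions; return False if PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    torch.backends.cuda.matmul.allow_tf32 = True  # FP32 matmuls may use TF32 tensor cores
    torch.backends.cudnn.allow_tf32 = True        # cuDNN convolutions may use TF32
    return True


tf32_enabled = enable_tf32()
```

The flags only change behavior on GPUs that have TF32 tensor cores (Ampere and newer); on other hardware they have no effect.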

1. **👋 Set the Device ID for GPU Devices**

You can now choose which GPU to train on by passing a device ID to `DeviceGPU` when instantiating a `Trainer`, instead of using the local ID. For example:

```python
from composer import Trainer
from composer.trainer.devices.device_gpu import DeviceGPU

# Specify to use GPU 3 to train
device = DeviceGPU(device_id=3)

# Construct the Trainer
trainer = Trainer(
    ...,
    device=device
)

# Train!
trainer.fit()
```




1. **BERT and C4 Updates**

We make some minor adjustments to our `bert-base-uncased.yaml` training config. In particular, we make the global train and eval batch sizes a power of 2. This maintains divisibility when using many GPUs in multi-node training. We also adjust the `max_duration` so that it converts cleanly to 70,000 batches.
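
The divisibility point can be checked with a few lines of arithmetic. The global batch size of 4096 and the device counts below are illustrative values, not the ones in `bert-base-uncased.yaml`, and `per_device_batch_size` is a hypothetical helper.

```python
def per_device_batch_size(global_batch_size: int, num_devices: int) -> int:
    """Split a global batch evenly across devices, failing loudly if it does not divide."""
    if global_batch_size % num_devices != 0:
        raise ValueError(
            f"global batch size {global_batch_size} is not divisible by {num_devices} devices"
        )
    return global_batch_size // num_devices


# A power-of-two global batch size divides evenly for every power-of-two device count.
sizes = {n: per_device_batch_size(4096, n) for n in (8, 16, 64, 128)}
```

A global batch size such as 4000, by contrast, would raise for 64 devices because 4000 is not a multiple of 64.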

We also upgrade our StreamingDataset C4 conversion script (`scripts/mds/c4.py`) to use a multi-threaded reader. On a 64-core machine we are able to convert the 770GB train split to `.mds` format in ~1.5hr.


1. **📂 Set a `prefix` when using a `S3ObjectStore`**

When using `S3ObjectStore` for applications like checkpointing, it can be useful to provide path prefixes, mimicking `folder/subfolder` directories like on a local filesystem. When `prefix` is provided, any objects uploaded with `S3ObjectStore` will be stored at `f's3://{self.bucket}/{self.prefix}{object_name}'`.
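
Taken literally, the format string above concatenates the prefix and the object name with no separator. The sketch below reproduces only that string formatting; `s3_uri` is a hypothetical helper, the bucket and object names are made up, and the library may normalize the prefix itself, so check the `S3ObjectStore` documentation before relying on either form.

```python
def s3_uri(bucket: str, prefix: str, object_name: str) -> str:
    """Build the object URI using the format quoted in the release note."""
    return f"s3://{bucket}/{prefix}{object_name}"


with_slash = s3_uri("my-bucket", "checkpoints/run-1/", "ep1.pt")
without_slash = s3_uri("my-bucket", "checkpoints/run-1", "ep1.pt")
```

Under this literal reading, a prefix without a trailing slash is fused onto the object name instead of acting as a folder.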


1. **⚖️ Scale the Warmup Period of Composer Schedulers**

Added a new `scale_warmup` flag to schedulers that scales the warmup period when a scale schedule ratio is applied. The default is `False`, which preserves the existing behavior. See #1268 for more details.
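The effect of the flag can be pictured with a simplified sketch. It assumes the warmup is expressed as a number of batches and that scaling is a plain multiplication; the `effective_warmup` helper is hypothetical, and the exact time units and rounding used by Composer's schedulers may differ.

```python
def effective_warmup(t_warmup: int, ssr: float, scale_warmup: bool = False) -> int:
    """Return the warmup length after applying a scale schedule ratio (ssr).

    With scale_warmup=False the warmup is left untouched; with True it is
    shrunk or stretched by the same ratio as the rest of the schedule.
    """
    return int(t_warmup * ssr) if scale_warmup else t_warmup


unscaled = effective_warmup(1000, ssr=0.5)                   # warmup stays at 1000 batches
scaled = effective_warmup(1000, ssr=0.5, scale_warmup=True)  # warmup halves with the schedule
```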

1. **🧊 Stochastic Depth on Residual Blocks**

Residual blocks are detected automatically and replaced with stochastic versions. See #1253 for more details.

🐛 Bug Fixes

1. **Fixed Progress Bars**

Fixed a bug where the progress bars jumped around and did not stream properly when tailing the terminal over the network. Fixed in #1264, #1287, and #1289.

1. **Fixed S3ObjectStore in Multithreaded Environments**

Fixed a bug where `boto3` crashed when creating the default session in multiple threads simultaneously (see https://github.com/boto/boto3/issues/1592). Fixed in #1260.

1. **Retry on `ChannelException` errors in the `SFTPObjectStore`**

Catch transient `ChannelException` SFTP errors and retry. Fixed in #1245.
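The retry-on-transient-error pattern can be sketched generically as below. `TransientChannelError` is a stand-in for the SFTP exception, and the `retry` helper, its attempt count, and its backoff are assumptions for illustration, not the `SFTPObjectStore` implementation.

```python
import time


class TransientChannelError(Exception):
    """Stand-in for a transient SFTP channel error (hypothetical)."""


def retry(fn, retry_on, num_attempts=3, backoff=0.0):
    """Call fn(), retrying on the given exception type up to num_attempts times."""
    for attempt in range(num_attempts):
        try:
            return fn()
        except retry_on:
            if attempt == num_attempts - 1:
                raise  # out of attempts: surface the original error
            time.sleep(backoff * (attempt + 1))


calls = {"count": 0}


def flaky_download():
    """Fail twice with a transient error, then succeed."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientChannelError("channel closed")
    return "ok"


result = retry(flaky_download, retry_on=TransientChannelError)
```

Only the named exception type is retried; any other error propagates immediately.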

1. **Treating S3 Permission Denied Errors as Not Found Errors**

We updated our handling of `botocore` 403 `ClientError`s to interpret them as `FileNotFoundError`s. This covers the situation where a user has no S3 credentials configured and tries to read from a bucket with public files: for privacy, Amazon S3 raises 403 (Permission Denied) instead of 404 (Not Found) errors. As such, PR #1249 treats 403 `ClientError`s as `FileNotFoundError`s.
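A minimal sketch of this translation is shown below. `FakeClientError` is a stand-in that mimics the `response["Error"]["Code"]` layout of a `botocore` `ClientError`, and `translate_client_error` is a hypothetical helper, not the code from the PR.

```python
class FakeClientError(Exception):
    """Stand-in mimicking the response layout of botocore's ClientError."""

    def __init__(self, code: str):
        super().__init__(f"ClientError {code}")
        self.response = {"Error": {"Code": code}}


def translate_client_error(exc, object_name: str):
    """Raise FileNotFoundError for 403/404 codes; re-raise anything else."""
    code = str(exc.response["Error"]["Code"])
    if code in ("403", "404"):
        raise FileNotFoundError(f"Object {object_name} not found") from exc
    raise exc


def outcome(code: str) -> str:
    """Report which exception type a given error code ends up as."""
    try:
        translate_client_error(FakeClientError(code), "data/file.txt")
    except FileNotFoundError:
        return "not found"
    except FakeClientError:
        return "client error"


outcomes = {code: outcome(code) for code in ("403", "404", "500")}
```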

1. **Fixed Parsing of `grad_accum` in the `TrainerHparams`**

Fixed an error where the command line override `--grad_accum` led to incorrect parsing. Fixed in #1256.

1. **Fixed Example YAML Files**

Our recipe configurations (YAML) have been updated to the latest version, and a test was added to enforce correctness going forward. Fixed in #1235 and #1257.




Changelog

https://github.com/mosaicml/composer/compare/v0.8.0...v0.8.1

0.8.0

Not secure
New Features


1. **🤗 HuggingFace ComposerModel**

Train your HuggingFace models with Composer! We introduced a [`HuggingFaceModel`](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.models.huggingface.html#composer.models.huggingface.HuggingFaceModel) that converts your existing 🤗 Transformers models into a ComposerModel.

For example:

python
import transformers
from composer import Trainer
from composer.models import HuggingFaceModel

# Define the model
hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Convert it into a ComposerModel
model = HuggingFaceModel(hf_model)

# Construct the trainer
trainer = Trainer(
    ...,
    model=model,
)

# Train!
trainer.fit()


For more information, see the example on [fine-tuning a pretrained BERT with Composer](https://docs.mosaicml.com/en/v0.8.0/examples/huggingface_models.html).

1. **🫕 Fused Layer Norm**

Fused LayerNorm replaces instances of [`torch.nn.LayerNorm`](https://pytorch.org/docs/1.11/generated/torch.nn.LayerNorm.html) with [`apex.normalization.fused_layer_norm`](https://nvidia.github.io/apex/layernorm.html). The fused kernel provides increased GPU utilization.

For example:

python
from composer.trainer import Trainer
from composer.algorithms import FusedLayerNorm

# Initialize the algorithm
alg = FusedLayerNorm()

# Construct the trainer
trainer = Trainer(
    ...,
    algorithms=alg,
)

# Train!
trainer.fit()


See the [method card](https://docs.mosaicml.com/en/v0.8.0/method_cards/fused_layernorm.html) for more information.

1. **💾 Ignore Checkpoint Parameters**

If you have a checkpoint and don't want to restore some of its elements to the [state](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.core.state.html#composer.core.state.State), use the new `load_ignore_keys` parameter. Any specified (nested) keys will be ignored. Glob syntax is supported!

For example, to restore a checkpoint without the seed:

python
from composer import Trainer

trainer = Trainer(
    ...,
    load_path="path/to/my/checkpoint.pt",
    load_ignore_keys=["state/rank_zero_seed", "rng"],
)


See the [Trainer API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) for more information.
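To give a feel for how slash-separated, glob-style keys can select nested checkpoint entries, here is a small sketch built on the standard library's `fnmatch`. The `drop_ignored_keys` helper and the toy checkpoint are hypothetical; this is not Composer's implementation, and its exact glob semantics may differ.

```python
from fnmatch import fnmatch


def drop_ignored_keys(tree: dict, patterns: list, path: str = "") -> dict:
    """Return a copy of a nested dict without entries whose path matches a pattern."""
    kept = {}
    for key, value in tree.items():
        full_path = f"{path}/{key}" if path else key
        if any(fnmatch(full_path, pattern) for pattern in patterns):
            continue  # skip this key and everything nested under it
        if isinstance(value, dict):
            value = drop_ignored_keys(value, patterns, full_path)
        kept[key] = value
    return kept


# Toy checkpoint layout (hypothetical).
checkpoint = {
    "state": {"rank_zero_seed": 42, "model": {"w": 1.0}},
    "rng": [1, 2, 3],
}
restored = drop_ignored_keys(checkpoint, ["state/rank_zero_seed", "rng"])
globbed = drop_ignored_keys(checkpoint, ["state/*"])
```

Here `restored` keeps only the model weights, while the `state/*` glob drops every entry directly under `state`.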


1. **🪣 Object Stores**

Composer v0.8.0 introduces an abstract [Object Store API](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.utils.object_store.object_store.html#composer.utils.object_store.object_store.ObjectStore) to support multiple object store drivers, such as boto3 (for Amazon S3) and Paramiko (for SFTP), in addition to the existing libcloud implementation.

For example, if you are training on AWS where credentials are available in the environment, here's how to save checkpoints to an S3 object store via Boto3:

python
from composer import Trainer
from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import S3ObjectStore

logger = ObjectStoreLogger(
    object_store_cls=S3ObjectStore,
    object_store_kwargs={
        # These arguments will be passed into the S3ObjectStore -- e.g.:
        # object_store = S3ObjectStore(**object_store_kwargs)
        # Refer to the S3ObjectStore class for documentation
        'bucket': 'my-bucket',
    },
)

trainer = Trainer(
    ...,
    loggers=logger,
)

# Train!
trainer.fit()


See the [Object Store API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.utils.object_store.html#module-composer.utils.object_store) for more information.



1. **🪨 Artifact Metadata**

Composer automatically logs the epoch, batch, sample, and token counts as metadata when storing artifacts in Weights & Biases. See the [API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.loggers.wandb_logger.html#composer.loggers.wandb_logger.WandBLogger) for more information.


API Changes

1. **✂️ Gradient Clipping is now an Algorithm**

To clean up the Trainer, we moved gradient clipping into an Algorithm. The `grad_clip_norm` argument in the Trainer is deprecated and will be removed in a future version of Composer. Instead, use the [Gradient Clipping](https://docs.mosaicml.com/en/v0.8.0/method_cards/gradient_clipping.html) algorithm.

For example:

python
from composer.algorithms import GradientClipping
from composer.trainer import Trainer

# Configure gradient clipping
gradient_clipping = GradientClipping()

# Configure the trainer
trainer = Trainer(
    ...,
    algorithms=gradient_clipping,
)

# Train!
trainer.fit()


See the [method card](https://docs.mosaicml.com/en/v0.8.0/method_cards/gradient_clipping.html) for more information.
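For readers unfamiliar with the technique, norm-based gradient clipping (the kind the deprecated `grad_clip_norm` argument is named after) can be sketched in plain Python. This illustrates the math only, on a flat list of hypothetical gradient values; it is not the `GradientClipping` algorithm's code.

```python
import math


def clip_by_norm(grads, max_norm):
    """Rescale gradients so that their L2 norm does not exceed max_norm."""
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)  # already within the limit: leave unchanged
    scale = max_norm / total_norm
    return [g * scale for g in grads]


# The gradient [3, 4] has norm 5, so it is scaled by 1/5 to reach norm 1.
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)
untouched = clip_by_norm([0.3, 0.4], max_norm=1.0)
```

Clipping preserves the gradient's direction and only shrinks its magnitude.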

1. **🕒 Removed `batch_num_samples` and `batch_num_tokens` from the state.**

State properties `batch_num_samples` and `batch_num_tokens` have been removed.
Instead, use [`State.timestamp`](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.core.time.html#composer.core.time.Timestamp) for token and sample tracking.

1. **🧑‍🤝‍🧑 DDP Sync Strategy**

We changed the default [DDP Sync Strategy](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.trainer.ddp.html?highlight=MULTI_AUTO_SYNC#composer.trainer.ddp.DDPSyncStrategy) to `MULTI_AUTO_SYNC`, as `FORCED_SYNC` doesn't work with all algorithms.

1. **🏃 Moved the `run_name` into the `State`**

The `run_name` has been added to the [State](https://docs.mosaicml.com/en/latest/api_reference/composer.core.state.html#composer.core.state.State.run_name) object, so it is persisted with checkpoints. It has been removed from the Logger.


Bug Fixes

* In the Object Store Logger, added retries for credential validation, and credentials are now validated only on global rank zero. (#1144)
* Fixed a bug in the speed monitor where it returned negative wall clock times. (#1123)
* Fixed a bug where block-wise Stochastic Depth could freeze the trainer. (#1087)
* Fixed a bug in the `MLPerfCallback` where sample counts were incorrect on per-sharded datasets. (#1156)



Changelog

https://github.com/mosaicml/composer/compare/v0.7.1...v0.8.0

0.7.6

Streaming `v0.7.6` is released! Install via `pip`:


pip install --upgrade mosaicml-streaming==0.7.6


:gem: New Features

1. `device_per_stream` batching method
Users can now construct batches such that each device sees only samples from a single stream. This is very useful in cases where different data sources have samples/tensors of different sizes, but the model should still see samples from these different data sources at each optimizer step.
* Adding `device_per_stream` batching by snarayan21 in https://github.com/mosaicml/streaming/pull/661

2. Add `ndarray` type for Spark dataframes.
Enable parsing Spark's ArrayType (of ShortType, LongType, IntegerType, FloatType, DoubleType) when converting a Spark dataframe to MDS.
* Add ndarray type by XiaohanZhangCMU in https://github.com/mosaicml/streaming/pull/623

3. Support for Alipan storage
Adds support for Alipan, Alibaba's cloud storage service.
* Add support for Alipan Storage backend by PeterDing in https://github.com/mosaicml/streaming/pull/651
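The `device_per_stream` batching method described above can be pictured with a toy sketch: at each step, every device receives a batch drawn from exactly one stream, while different devices cover different streams. The round-robin assignment, the helper name, and the sample data are simplifying assumptions, not how Streaming actually selects streams.

```python
def device_per_stream_step(streams, num_devices, batch_size):
    """Build one batch per device, each drawn from a single stream.

    Simplification: devices are assigned to streams round-robin.
    """
    names = list(streams)
    step = []
    for device in range(num_devices):
        name = names[device % len(names)]
        step.append((name, streams[name][:batch_size]))
    return step


# Hypothetical streams whose samples have different lengths.
streams = {
    "short_text": [[1, 2], [3, 4], [5, 6]],
    "long_text": [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
}
step = device_per_stream_step(streams, num_devices=4, batch_size=2)
```

Within each device's batch all samples share one shape, yet the optimizer step as a whole still sees both data sources.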

What's Changed
* Bump fastapi from 0.110.0 to 0.110.2 by dependabot in https://github.com/mosaicml/streaming/pull/660
* Bump pydantic from 2.6.4 to 2.7.0 by dependabot in https://github.com/mosaicml/streaming/pull/653
* Bump pydantic from 2.7.0 to 2.7.1 by dependabot in https://github.com/mosaicml/streaming/pull/666
* Bump pytest from 8.1.1 to 8.2.0 by dependabot in https://github.com/mosaicml/streaming/pull/664
* Bump databricks-sdk from 0.23.0 to 0.27.0 by dependabot in https://github.com/mosaicml/streaming/pull/667
* Version bump to v0.7.6 by snarayan21 in https://github.com/mosaicml/streaming/pull/669

New Contributors
* PeterDing made their first contribution in https://github.com/mosaicml/streaming/pull/651

**Full Changelog**: https://github.com/mosaicml/streaming/compare/v0.7.5...v0.7.6

0.7.5

Streaming `v0.7.5` is released! Install via `pip`:


pip install --upgrade mosaicml-streaming==0.7.5


:gem: New Features

1. Tensor/Sequence Parallelism Support
Using the `replication` argument, easily share data samples across multiple ranks, enabling sequence or tensor parallelism.
* Replicating samples across devices (SP / TP enablement) by knighton in https://github.com/mosaicml/streaming/pull/597
* Expanded replication testing + documentation by snarayan21 in https://github.com/mosaicml/streaming/pull/607
* Make streaming use the correct number of unique samples with SP/TP by snarayan21 in https://github.com/mosaicml/streaming/pull/619

2. Overhauled Streaming Documentation
New and improved streaming documentation can be found [here](https://docs.mosaicml.com/projects/streaming/en/stable/#) -- please submit issues with any feedback.
* Major overhaul of Streaming documentation by snarayan21 in https://github.com/mosaicml/streaming/pull/636

3. `batch_size` is now required for StreamingDataset
As we have seen multiple errors and performance degradations from users not setting the `batch_size` argument to StreamingDataset, we are making it a requirement to iterate over the dataset.
* You must set batch size. There is no other way. by snarayan21 in https://github.com/mosaicml/streaming/pull/624

4. Support for Python 3.11, deprecate Python 3.8
* Add support for Python 3.11 and deprecate Python 3.8 by karan6181 in https://github.com/mosaicml/streaming/pull/586
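The `replication` argument from the Tensor/Sequence Parallelism item above can be pictured as grouping ranks so that each group reads the same samples. The sketch below assumes consecutive ranks are grouped together; `sample_group` is a hypothetical helper, not part of the Streaming API.

```python
def sample_group(rank: int, replication: int) -> int:
    """Map a rank to the index of the sample group it shares with its neighbors."""
    return rank // replication


world_size = 8
replication = 2  # every 2 consecutive ranks see identical samples
groups = [sample_group(rank, replication) for rank in range(world_size)]
num_unique_sample_groups = len(set(groups))
```

With 8 ranks and `replication=2`, ranks 0 and 1 share one group, ranks 2 and 3 the next, and so on, so only 4 distinct sets of samples are consumed per step.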

🐛 Bug Fixes
* [easy typo fix] fix f-string by bigning in https://github.com/mosaicml/streaming/pull/596
* Change comparison in partitions to include equals by JAEarly in https://github.com/mosaicml/streaming/pull/587
* Use type int when initializing SharedMemory size by bchiang2 in https://github.com/mosaicml/streaming/pull/604
* COCO Dataset fix -- avoids `allow_unsafe_types=True` by snarayan21 in https://github.com/mosaicml/streaming/pull/647

🔧 Improvements
* Allow writers to overwrite existing data by JAEarly in https://github.com/mosaicml/streaming/pull/594
* Update careers link by milocress in https://github.com/mosaicml/streaming/pull/611
* Update license by b-chu in https://github.com/mosaicml/streaming/pull/568
* Updated documentation for S3-compatible object stores by AIproj in https://github.com/mosaicml/streaming/pull/592
* Make yamllint consistent with Composer by b-chu in https://github.com/mosaicml/streaming/pull/583
* Switch linting workflows to ci-testing repo by b-chu in https://github.com/mosaicml/streaming/pull/616

What's Changed
* Bump uvicorn from 0.26.0 to 0.27.1 by dependabot in https://github.com/mosaicml/streaming/pull/599
* Bump pytest-split from 0.8.1 to 0.8.2 by dependabot in https://github.com/mosaicml/streaming/pull/581
* Update ruff to 0.2.2 by Skylion007 in https://github.com/mosaicml/streaming/pull/608
* Bump fastapi from 0.109.0 to 0.110.0 by dependabot in https://github.com/mosaicml/streaming/pull/610
* Bump yamllint from 1.33.0 to 1.35.1 by dependabot in https://github.com/mosaicml/streaming/pull/601
* Bump uvicorn from 0.27.1 to 0.28.0 by dependabot in https://github.com/mosaicml/streaming/pull/626
* Update moto requirement from <5,>=4.0 to >=4.0,<6 by dependabot in https://github.com/mosaicml/streaming/pull/580
* Bump furo from 2023.7.26 to 2024.1.29 by dependabot in https://github.com/mosaicml/streaming/pull/631
* Bump pypandoc from 1.12 to 1.13 by dependabot in https://github.com/mosaicml/streaming/pull/630
* Bump databricks-sdk from 0.14.0 to 0.22.0 by dependabot in https://github.com/mosaicml/streaming/pull/629
* Add batch_size to 1 if not provided for regression testing by karan6181 in https://github.com/mosaicml/streaming/pull/635
* Fixed docstring note for getting sequential sample ordering by snarayan21 in https://github.com/mosaicml/streaming/pull/632
* Bump pytest and fix failing test by snarayan21 in https://github.com/mosaicml/streaming/pull/642
* Update pytest-cov requirement from <5,>=4 to >=4,<6 by dependabot in https://github.com/mosaicml/streaming/pull/638
* Bump pydantic from 2.5.3 to 2.6.4 by dependabot in https://github.com/mosaicml/streaming/pull/639
* Bump uvicorn from 0.28.0 to 0.29.0 by dependabot in https://github.com/mosaicml/streaming/pull/640
* Bump databricks-sdk from 0.22.0 to 0.23.0 by dependabot in https://github.com/mosaicml/streaming/pull/644
* Version bump to 0.7.5 by snarayan21 in https://github.com/mosaicml/streaming/pull/650

New Contributors
* bigning made their first contribution in https://github.com/mosaicml/streaming/pull/596
* JAEarly made their first contribution in https://github.com/mosaicml/streaming/pull/587
* AIproj made their first contribution in https://github.com/mosaicml/streaming/pull/592
* milocress made their first contribution in https://github.com/mosaicml/streaming/pull/611
* bchiang2 made their first contribution in https://github.com/mosaicml/streaming/pull/604

**Full Changelog**: https://github.com/mosaicml/streaming/compare/v0.7.4...v0.7.5


© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.