Composer

0.8.1

```bash
pip install --upgrade mosaicml==0.8.1
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.1
```


🎁 New Features


1. **🖼️ Image Visualizer**

The [`ImageVisualizer`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.callbacks.image_visualizer.html#composer.callbacks.image_visualizer.ImageVisualizer) callback periodically logs the training and validation images when using the WandB logger. This is great for validating your dataloader pipeline, especially if extensive data augmentations are used. Also, when training on a semantic segmentation task, the callback can log the target segmentation mask and the predicted segmentation mask by setting the argument `mode='segmentation'`. See PR 1266 for more details. Here is an example of using the `ImageVisualizer` callback:

```python
from composer import Trainer
from composer.callbacks import ImageVisualizer

# Callback to log 8 training images after every 100 batches
image_visualizer = ImageVisualizer()

# Construct the trainer
trainer = Trainer(
    ...,
    callbacks=image_visualizer,
)

# Train!
trainer.fit()
```



Here is an example visualization from the training set of ADE20k:

![](https://i.imgur.com/iszIRLS.jpg)


1. **📶 TensorBoard Logging**

You can now log metrics and losses from your Composer training runs with Tensorboard! See 1250 and 1283 for more details. All you have to do is create a [`TensorboardLogger`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.loggers.tensorboard_logger.html#composer.loggers.tensorboard_logger.TensorboardLogger) object and add it
to the list of loggers in your [`Trainer`](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) object like so:

```python
from composer import Trainer
from composer.loggers import TensorboardLogger

tb_logger = TensorboardLogger(log_dir="./my_tensorboard_logs")

trainer = Trainer(
    ...,
    # Add your Tensorboard logger to the trainer here.
    loggers=[tb_logger],
)

trainer.fit()
```


For more information, see this [tutorial](https://docs.mosaicml.com/en/v0.8.1/notes/tensorboard_logger.html).




1. **🔙 Multiple Losses**

Adds support for multiple losses. If a model returns a tuple of losses, they are summed before the `loss.backward()` call. See 1240 for more details.
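
For example, here is a minimal sketch of a `ComposerModel` whose `loss()` returns two terms; the model and the loss terms are purely illustrative:

```python
import torch.nn.functional as F

from composer.models import ComposerModel


class MultiLossModel(ComposerModel):
    """Illustrative model returning a tuple of losses."""

    def __init__(self, module):
        super().__init__()
        self.module = module

    def forward(self, batch):
        inputs, _ = batch
        return self.module(inputs)

    def loss(self, outputs, batch):
        _, targets = batch
        # Both terms are summed by the trainer before loss.backward()
        return (F.cross_entropy(outputs, targets), 0.1 * outputs.pow(2).mean())
```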


1. **🌎️ Stream Datasets from HTTP URIs**

You can now specify an HTTP URI for a [Streaming Dataset](https://docs.mosaicml.com/en/v0.8.1/api_reference/composer.datasets.streaming.dataset.html#composer.datasets.streaming.dataset.StreamingDataset) remote. See 1258 for more details. For example:

```python
from torch.utils.data import DataLoader

from composer import Trainer
from composer.datasets.streaming import StreamingDataset

# Construct the dataset
dataset = StreamingDataset(
    ...,
    remote="https://example.com/dataset/",
)

# Construct the DataLoader
train_dl = DataLoader(dataset)

# Construct the trainer
trainer = Trainer(
    ...,
    train_dataloader=train_dl,
)

# Train!
trainer.fit()
```


For more information on streaming datasets, see this [tutorial](https://docs.mosaicml.com/en/v0.8.1/examples/streaming_dataloader_facesynthetics.html).


1. **🏄️ GPU Devices default to TF32 Matmuls**

Beginning with PyTorch 1.12, the default behavior for computing FP32 matrix multiplies on NVIDIA Ampere devices was switched from TF32 to FP32. See [PyTorch documentation here](https://pytorch.org/docs/stable/notes/cuda.html#tensorfloat-32-tf32-on-ampere-devices).

Since Composer is designed specifically for ML training with a focus on efficiency, we choose to preserve the old default of using TF32 on Ampere devices. This leads to significantly higher throughput when training in single precision, [without impacting training convergence](https://developer.nvidia.com/blog/accelerating-ai-training-with-tf32-tensor-cores/). See PR #1275 for implementation details.
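
If you prefer PyTorch's new FP32 default for your workload, you can opt back out of TF32 yourself; this is plain PyTorch and independent of Composer:

```python
import torch

# Restore full-precision FP32 matrix multiplies on Ampere GPUs
torch.backends.cuda.matmul.allow_tf32 = False
torch.backends.cudnn.allow_tf32 = False
```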

1. **👋 Set the Device ID for GPU Devices**

You can now specify which GPU to train on by passing a device ID to `DeviceGPU` when constructing the `Trainer`, instead of relying on the local rank. For example:

```python
from composer import Trainer
from composer.trainer.devices.device_gpu import DeviceGPU

# Specify to use GPU 3 to train
device = DeviceGPU(device_id=3)

# Construct the trainer
trainer = Trainer(
    ...,
    device=device,
)

# Train!
trainer.fit()
```




1. **BERT and C4 Updates**

We make some minor adjustments to our `bert-base-uncased.yaml` training config. In particular, we make the global train and eval batch sizes a power of 2. This maintains divisibility when using many GPUs in multi-node training. We also adjust the `max_duration` so that it converts cleanly to 70,000 batches.

We also upgrade our StreamingDataset C4 conversion script (`scripts/mds/c4.py`) to use a multi-threaded reader. On a 64-core machine we are able to convert the 770GB train split to `.mds` format in ~1.5hr.


1. **📂 Set a `prefix` when using a `S3ObjectStore`**

When using `S3ObjectStore` for applications like checkpointing, it can be useful to provide path prefixes, mimicking `folder/subfolder` directories like on a local filesystem. When `prefix` is provided, any objects uploaded with `S3ObjectStore` will be stored at `f's3://{self.bucket}/{self.prefix}{object_name}'`.
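
As a rough sketch (the bucket and prefix values are placeholders), objects written through the store below land under `s3://my-bucket/checkpoints/run-1/`:

```python
from composer.utils.object_store import S3ObjectStore

# Uploads of `object_name` go to s3://my-bucket/checkpoints/run-1/<object_name>
object_store = S3ObjectStore(bucket="my-bucket", prefix="checkpoints/run-1/")
```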


1. **⚖️ Scale the Warmup Period of Composer Schedulers**

Added a new `scale_warmup` flag to schedulers that scales the warmup period when a scale schedule ratio is applied. The default is `False`, which preserves the existing behavior. See 1268 for more details.
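
For example, here is a sketch using a warmup scheduler; the scheduler class and warmup length shown are illustrative:

```python
from composer import Trainer
from composer.optim.scheduler import LinearWithWarmupScheduler

scheduler = LinearWithWarmupScheduler(
    t_warmup="5ep",
    scale_warmup=True,  # shrink the warmup along with the rest of the schedule
)

trainer = Trainer(
    ...,
    max_duration="100ep",
    schedulers=scheduler,
    scale_schedule_ratio=0.5,  # trains for 50ep; the warmup becomes 2.5ep
)
```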

1. **🧊 Stochastic Depth on Residual Blocks**

Residual blocks are detected automatically and replaced with stochastic versions. See 1253 for more details.
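
A minimal sketch of enabling the algorithm; the target layer name and drop rate below are illustrative, so see the PR and method card for the exact options:

```python
from composer import Trainer
from composer.algorithms import StochasticDepth

# Replace matching residual blocks with stochastic versions
stochastic_depth = StochasticDepth(target_layer_name='ResNetBottleneck', drop_rate=0.2)

trainer = Trainer(
    ...,
    algorithms=stochastic_depth,
)

trainer.fit()
```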

🐛 Bug Fixes

1. **Fixed Progress Bars**

Fixed a bug where the progress bars jumped around and did not stream properly when tailing the terminal output over a network connection. Fixed in 1264, 1287, and 1289.

1. **Fixed S3ObjectStore in Multithreaded Environments**

Fixed a bug where `boto3` crashed when creating the default session from multiple threads simultaneously (see https://github.com/boto/boto3/issues/1592). Fixed in #1260.

1. **Retry on `ChannelException` errors in the `SFTPObjectStore`**

Transient SFTP `ChannelException` errors are now caught and retried. Fixed in 1245.

1. **Treating S3 Permission Denied Errors as Not Found Errors**

We update our handling of `botocore` 403 ClientErrors to interpret them as `FileNotFoundErrors`. We do this because of a situation that occurs when a user has no S3 credentials configured, and tries to read from a bucket with public files. For privacy, Amazon S3 raises 403 (Permission Denied) instead of 404 (Not Found) errors. As such, PR 1249 treats 403 ClientErrors as FileNotFoundErrors.

1. **Fixed Parsing of `grad_accum` in the `TrainerHparams`**

Fixed an error where the command line override `--grad_accum` led to incorrect parsing. Fixed in 1256.

1. **Fixed Example YAML Files**

Our recipe configurations (YAML) are updated to the latest version, and a test was added to enforce correctness moving forward. Fixed in 1235 and 1257.




Changelog

https://github.com/mosaicml/composer/compare/v0.8.0...v0.8.1

0.8.0

```bash
pip install --upgrade mosaicml==0.8.0
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.8.0
```


New Features


1. **🤗 HuggingFace ComposerModel**

Train your HuggingFace models with Composer! We introduced a [`HuggingFaceModel`](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.models.huggingface.html#composer.models.huggingface.HuggingFaceModel) that converts your existing 🤗 Transformers models into a ComposerModel.

For example:

```python
import transformers

from composer import Trainer
from composer.models import HuggingFaceModel

# Define the model
hf_model = transformers.AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)

# Convert it into a ComposerModel
model = HuggingFaceModel(hf_model)

# Construct the trainer
trainer = Trainer(
    ...,
    model=model,
)

# Train!
trainer.fit()
```


For more information, see the example on [fine-tuning a pretrained BERT with Composer](https://docs.mosaicml.com/en/v0.8.0/examples/huggingface_models.html).

1. **🫕 Fused Layer Norm**


Fused LayerNorm replaces instances of [`torch.nn.LayerNorm`](https://pytorch.org/docs/1.11/generated/torch.nn.LayerNorm.html) with [`apex.normalization.fused_layer_norm`](https://nvidia.github.io/apex/layernorm.html). The fused kernel provides increased GPU utilization.

For example:

```python
from composer.trainer import Trainer
from composer.algorithms import FusedLayerNorm

# Initialize the algorithm
alg = FusedLayerNorm()

# Construct the trainer
trainer = Trainer(
    algorithms=alg,
)

# Train!
trainer.fit()
```


See the [method card](https://docs.mosaicml.com/en/v0.8.0/method_cards/fused_layernorm.html) for more information.

1. **💾 Ignore Checkpoint Parameters**

If you have a checkpoint and don't want to restore some elements of the checkpoint to the [state](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.core.state.html#composer.core.state.State), we added a `load_ignore_keys` parameter. Any specified (nested) keys will be ignored. Glob syntax is supported!

For example, to restore a checkpoint without the seed:

```python
from composer import Trainer

trainer = Trainer(
    ...,
    load_path="path/to/my/checkpoint.pt",
    load_ignore_keys=["state/rank_zero_seed", "rng"],
)
```


See the [Trainer API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) for more information.


1. **🪣 Object Stores**

Composer v0.8.0 introduces an abstract [Object Store API](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.utils.object_store.object_store.html#composer.utils.object_store.object_store.ObjectStore) to support multiple object store drivers, such as boto3 (for Amazon S3) and Paramiko (for SFTP), in addition to the existing libcloud implementation.

For example, if you are training on AWS where credentials are available in the environment, here's how to save checkpoints to an S3 object store via Boto3.

```python
from composer import Trainer
from composer.loggers import ObjectStoreLogger
from composer.utils.object_store import S3ObjectStore

logger = ObjectStoreLogger(
    object_store_cls=S3ObjectStore,
    object_store_kwargs={
        # These arguments will be passed into the S3ObjectStore -- e.g.:
        #   object_store = S3ObjectStore(**object_store_kwargs)
        # Refer to the S3ObjectStore class for documentation
        'bucket': 'my-bucket',
    },
)

trainer = Trainer(
    ...,
    loggers=logger,
)

# Train!
trainer.fit()
```


See the [Object Store API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.utils.object_store.html#module-composer.utils.object_store) for more information.



1. **🪨 Artifact Metadata**

Composer automatically logs the epoch, batch, sample, and token counts as metadata when storing artifacts in Weights & Biases. See the [API Reference](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.loggers.wandb_logger.html#composer.loggers.wandb_logger.WandBLogger) for more information.


API Changes

1. **✂️ Gradient Clipping is now an Algorithm**

To clean up the Trainer, we moved gradient clipping into an Algorithm. The ``grad_clip_norm`` argument in the Trainer is deprecated and will be removed in a future version of Composer. Instead, use the [Gradient Clipping](https://docs.mosaicml.com/en/v0.8.0/method_cards/gradient_clipping.html) algorithm:

For example:

```python
from composer.algorithms import GradientClipping
from composer.trainer import Trainer

# Configure gradient clipping
gradient_clipping = GradientClipping()

# Configure the trainer
trainer = Trainer(
    ...,
    algorithms=gradient_clipping,
)

# Train!
trainer.fit()
```


See the [method card](https://docs.mosaicml.com/en/v0.8.0/method_cards/gradient_clipping.html) for more information.

1. **🕒️ Removed `batch_num_samples` and `batch_num_tokens` from the state.**

State properties `batch_num_samples` and `batch_num_tokens` have been removed.
Instead, use [`State.timestamp`](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.core.time.html#composer.core.time.Timestamp) for token and sample tracking.
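
For instance, a callback can read the running counts from the timestamp; this is a sketch following the callback pattern used elsewhere in these notes:

```python
from composer import Callback

class CountLogger(Callback):

    def batch_end(self, state, logger):
        # Cumulative counts since the start of training
        print(f"Samples seen: {state.timestamp.sample}")
        print(f"Tokens seen: {state.timestamp.token}")
```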

1. **🧑‍🤝‍🧑 DDP Sync Strategy**

We changed the default [DDP Sync Strategy](https://docs.mosaicml.com/en/v0.8.0/api_reference/composer.trainer.ddp.html?highlight=MULTI_AUTO_SYNC#composer.trainer.ddp.DDPSyncStrategy) to `MULTI_AUTO_SYNC`, as `FORCED_SYNC` doesn't work with all algorithms.

1. **🏃 Moved the `run_name` into the `State`**

The `run_name` has been added to the [State](https://docs.mosaicml.com/en/latest/api_reference/composer.core.state.html#composer.core.state.State.run_name) object, so it is persisted with checkpoints. It has been removed from the Logger.
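
For example (a sketch), the run name supplied to the trainer is now available on the state:

```python
from composer import Trainer

trainer = Trainer(
    ...,
    run_name="resnet50-baseline",
)

# The run name is stored on the state and persisted with checkpoints
print(trainer.state.run_name)  # "resnet50-baseline"
```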


Bug Fixes

* In the Object Store Logger, added in retries for credential validation, and validating credentials only on global rank zero. (1144)
* Fixed a bug in the speed monitor where it returned negative wall clock times. (1123)
* Fixed how block-wise Stochastic Depth could freeze the trainer. (1087)
* Fixed a bug in the `MLPerfCallback` where sample counts were incorrect on sharded datasets. (1156)



Changelog

https://github.com/mosaicml/composer/compare/v0.7.1...v0.8.0

0.7.1

```bash
pip install --upgrade mosaicml==0.7.1
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.7.1
```


Bug Fixes

* Upgraded `wandb>=0.12.17`, to fix incompatibility with protobuf >= 4 (https://github.com/wandb/client/pull/3709)

Changelog

https://github.com/mosaicml/composer/compare/v0.7.0...v0.7.1

0.7.0

```bash
pip install --upgrade mosaicml==0.7.0
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.7.0
```


New Features

1. **🏎️ FFCV Integration**

Composer supports [FFCV](https://ffcv.io/), a fast dataloader for image datasets. We've found FFCV can speed up ResNet-56 training by 16%, in addition to the existing speed-ups already supported by Composer! It's easy to use FFCV with any existing image dataset:

```python
import ffcv
from ffcv.fields.decoders import IntDecoder, SimpleRGBImageDecoder
from torchvision.datasets import ImageFolder

from composer import Trainer
from composer.datasets.ffcv_utils import write_ffcv_dataset, ffcv_monkey_patches

# Convert the dataset to FFCV format
# This step needs to be done only once per dataset
dataset = ImageFolder(...)
ffcv_dataset_path = "my_ffcv_dataset.ffcv"
write_ffcv_dataset(dataset=dataset, write_path=ffcv_dataset_path)

# In FFCV v0.0.3, len(dataloader) is expensive. Fix that via a monkeypatch
ffcv_monkey_patches()

# Construct the train dataloader
train_dl = ffcv.Loader(
    ffcv_dataset_path,
    ...
)

# Construct the trainer
trainer = Trainer(
    train_dataloader=train_dl,
)

# Train using FFCV!
trainer.fit()
```


See our notebook on [training with FFCV](https://github.com/mosaicml/composer/blob/v0.7.0/notebooks/composer_with_ffcv_dataloaders.ipynb) for a full example.

1. **✅ Autoresume from Checkpoints**

When setting `autoresume=True`, Composer can automatically resume from an existing checkpoint before starting a new training run. Specifically, the trainer will look in the `save_folder` (and any loggers that save artifacts) for the latest checkpoint; if none is found, then it'll start from the beginning.

This feature does not require a different entrypoint to distinguish between starting a new training run or automatically resuming from an existing one, making it easy to use Composer on spot preemptable cloud instances. Simply set `autoresume=True`, point the instance to your training script, and Composer will handle the rest!


```python
from composer import Trainer

# When using `autoresume`, it is required to specify the
# `run_name`, so Composer will know which training run to
# resume
run_name = "my_autoresume_training_run"

trainer = Trainer(
    ...,
    run_name=run_name,
    # Specify where to save checkpoints
    save_folder="./my_autoresume_training_run",
    autoresume=True,
)

# Train! Composer will handle loading an existing
# checkpoint or starting a new training run
trainer.fit()
```


See the [Trainer API Reference](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer) for more information.

1. **♻️ Reuse the Trainer**

Want to train on multiple dataloaders sequentially? Each trainer object now supports multiple calls to [`Trainer.fit()`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer.fit), so you can continue training an existing model on a new dataloader, with new schedulers, all while using the same model and trainer object.

For example:

```python
from torch.utils.data import DataLoader

from composer import Trainer

train_dl_1 = DataLoader(...)
trainer = Trainer(
    model=model,
    max_duration='5ep',
    train_dataloader=train_dl_1,
)

# Train once!
trainer.fit()

# Train again with a new dataloader for another 5 epochs
train_dl_2 = DataLoader(...)
trainer.fit(
    train_dataloader=train_dl_2,
    duration='5ep',
)
```


See the [Trainer API Reference](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer.fit) for more information.

1. **⚖️ Eval or Predict Only? No Problem**

You can evaluate or predict on an existing model, without having to supply a train dataloader or training duration argument -- they're now optional.

```python
import torchmetrics
from torch.utils.data import DataLoader

from composer import Trainer

# Construct the trainer
trainer = Trainer(model=model)

# Evaluate!
eval_dl = DataLoader(...)
trainer.eval(
    dataloader=eval_dl,
    metrics=torchmetrics.Accuracy(),
)

# Examine evaluation metrics
print("Eval metrics", trainer.state.metrics['eval'])

# Or, predict!
predict_dl = DataLoader(...)
trainer.predict(dataloader=predict_dl)
```


See the [Trainer API Reference](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer.eval) for more information.


1. **🛑 Early Stopper and Threshold Stopper Callbacks**

The [Early Stopper](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.callbacks.early_stopper.html#composer.callbacks.early_stopper.EarlyStopper) and [Threshold Stopper](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.callbacks.threshold_stopper.html#composer.callbacks.threshold_stopper.ThresholdStopper) callbacks end training early when the target metrics are met:

```python
from torchmetrics.classification.accuracy import Accuracy

from composer import Trainer
from composer.callbacks.early_stopper import EarlyStopper

# Construct the callback
early_stopper = EarlyStopper(
    monitor="Accuracy",
    dataloader_label="eval",
    patience=2,
)

# Construct the trainer
trainer = Trainer(
    ...,
    callbacks=early_stopper,
    max_duration="100ep",
)

# Train!
# Training will end early if the accuracy does not improve
# over two epochs
trainer.fit()
```

1. **🪵 Load Checkpoints from Loggers**

It's now possible to restore checkpoints from loggers that support file artifacts (such as the [Weights & Biases Logger](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.loggers.wandb_logger.html#composer.loggers.wandb_logger.WandBLogger)). No need to download your checkpoints manually anymore.

```python
from composer import Trainer
from composer.loggers import WandBLogger

# Configure the W&B logger
wandb_logger = WandBLogger(
    # Set to True to capture artifacts, like checkpoints
    log_artifacts=True,
    init_params={
        'project': 'my-wandb-project-name',
    },
)

# Then, to train and save checkpoints to W&B:
trainer = Trainer(
    ...,
    loggers=wandb_logger,
    save_folder="/tmp/checkpoints",
    save_interval="1ep",
    save_artifact_name="epoch{epoch}.pt",
)

# Finally, to load checkpoints from W&B:
trainer = Trainer(
    ...,
    load_object_store=wandb_logger,
    load_path="epoch1.pt:latest",
)
```



1. **⌛ Wall Clock, Evaluation, and Prediction Time Tracking**

The [timestamp](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.time.html#composer.core.time.Timestamp) object measures wall clock time via three new fields: `total_wct`, `epoch_wct`, and `batch_wct`. These fields track the total elapsed training time, the elapsed training time of the current epoch, and the time to train the last batch. Read the wall clock time via a callback:

```python
from composer import Callback, Trainer

class MyCallback(Callback):

    def batch_end(self, state, logger):
        print(f"Total wct: {state.timestamp.total_wct}")
        print(f"Epoch wct: {state.timestamp.epoch_wct}")
        print(f"Batch wct: {state.timestamp.batch_wct}")

# Construct the trainer with this callback
trainer = Trainer(
    ...,
    callbacks=MyCallback(),
)

# Train!
trainer.fit()
```


In addition, the training [state](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State) object has two new fields for tracking time during evaluation and prediction: [`eval_timestamp`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.eval_timestamp) and [`predict_timestamp`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.predict_timestamp). These fields, just like any others on the state object, are accessible to algorithms, callbacks, and loggers.

1. **Training DeepLabv3+ on the ADE20k Dataset**

[DeepLabv3+](https://arxiv.org/abs/1802.02611) is a common baseline model for semantic segmentation tasks. We provide a `ComposerModel` implementation for DeepLabv3+ built using [torchvision](https://pytorch.org/vision/stable/index.html) and [mmsegmentation](https://github.com/open-mmlab/mmsegmentation) for the backbone and head, respectively.

We found the DeepLabv3+ baseline can be significantly improved using the [new PyTorch pre-trained weights](https://pytorch.org/blog/introducing-torchvision-new-multi-weight-support-api/). Additional gains are made through a hyperparameter sweep.

We benchmark our DeepLabv3+ model on a single 8xA100 machine using [ADE20k](https://arxiv.org/abs/1608.05442), a popular semantic segmentation dataset. The final results on ADE20k are:

| Model | mIoU | Time-to-Train |
| ---------------------- | -------------- | ------------- |
| Unoptimized DeepLabv3+ | 44.17 +/- 0.14 | 6.39 hr |
| Optimized DeepLabv3+ | 45.78 +/- 0.26 | 4.67 hr |

Check out [our documentation](https://docs.mosaicml.com/en/v0.7.0/model_cards/deeplabv3.html) for more info!

API Changes

1. **🍪 Additional Batch Type Support**

Composer v0.7.0 removed the `BatchDict` and `BatchPair` types, and now supports any batch type. We're updating our algorithms to support batches of custom formats.

1. **🏎️ Simplified Profiling Arguments**

To simplify the Trainer constructor, the profiling arguments were replaced with a single `profiler` argument, which takes an instance of the [Profiler](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.profiler.profiler.html#composer.profiler.profiler.Profiler).

```python
from composer.trainer import Trainer
from composer.profiler import Profiler, JSONTraceHandler, cyclic_schedule

trainer = Trainer(
    ...,
    profiler=Profiler(
        trace_handlers=JSONTraceHandler(
            folder=composer_trace_dir,
            overwrite=True,
        ),
        schedule=cyclic_schedule(
            wait=0,
            warmup=1,
            active=4,
            repeat=1,
        ),
        torch_prof_folder=torch_trace_dir,
        torch_prof_overwrite=True,
        ...,
    ),
)
```


See the [profiling guide](https://docs.mosaicml.com/en/v0.7.0/trainer/performance_tutorials/profiling.html) for additional information.

1. **🚪 `Event.FIT_END` and `Engine.close()`**

With support for reusing the trainer for multiple calls to [`Trainer.fit`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer.fit), callbacks and loggers are no longer closed at the end of a training run.

Instead, [`Event.FIT_END`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.event.html#composer.core.event.Event.FIT_END) was added, which can be used by Callbacks for anything that should happen at the end of _each_ invocation of [`Trainer.fit`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer.fit). See the [Event Guide](https://docs.mosaicml.com/en/v0.7.0/trainer/events.html) for additional information.

Finally, whenever the trainer is garbage collected or [`Trainer.close`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.trainer.trainer.html#composer.trainer.trainer.Trainer.close) is called, [`Callback.close`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.callback.html#composer.core.callback.Callback.close) and [`Callback.post_close`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.callback.html#composer.core.callback.Callback.post_close) are invoked, ensuring that they will be called only once per trainer.
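
For instance, a callback that needs to act at the end of each fit invocation, and separately when the trainer shuts down, might look like this sketch:

```python
from composer import Callback

class ReportFitEnd(Callback):

    def fit_end(self, state, logger):
        # Runs at the end of every Trainer.fit() invocation
        print(f"Finished fit at {state.timestamp.batch} batches")

    def close(self, state, logger):
        # Runs once, when the trainer is closed or garbage collected
        print("Trainer closed")
```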

1. **⌛ `State.timestamp` replaces `State.timer`**

Removed `State.timer` and replaced it with [`State.timestamp`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.timestamp), which is now a static [Timestamp](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.time.html#composer.core.time.Timestamp) object. The training loop replaces `State.timestamp` with a new object on each batch. See the [Time Guide](https://docs.mosaicml.com/en/v0.7.0/trainer/time.html#tracking-time) for additional information.

1. **💿 Data Configuration**

Two new properties, [`State.dataloader`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.dataloader) and [`State.dataloader_label`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.dataloader_label), were added to the state. These properties track the currently active dataloader (e.g. the training dataloader when training; the evaluation dataloader when evaluating).

In addition, `State.subset_num_batches` was renamed to [`State.dataloader_len`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.dataloader_len) to reflect the actual dataloader length that will be used for training and evaluation.

A helper method [`State.set_dataloader`](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.state.html#composer.core.state.State.set_dataloader) was added to ensure the dataloader properties are updated correctly.
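
A small sketch of reading these properties from a callback:

```python
from composer import Callback

class DataloaderInfo(Callback):

    def epoch_start(self, state, logger):
        # 'train' while training; the evaluator label while evaluating
        print(f"Active dataloader: {state.dataloader_label}")
        print(f"Batches per epoch: {state.dataloader_len}")
```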


1. **⚖️ Removed the Deprecated Scale Schedule Algorithm**

The scale schedule algorithm class, deprecated in v0.4.0, has been removed. Instead, use the `scale_schedule_ratio` argument when constructing the trainer.

```python
from composer import Trainer
from composer.optim.scheduler import MultiStepScheduler

trainer = Trainer(
    ...,
    max_duration="20ep",
    schedulers=MultiStepScheduler(milestones=["10ep", "16ep"]),
    scale_schedule_ratio=0.5,
)
```


See the [Scale Schedule Method Card](https://docs.mosaicml.com/en/v0.7.0/method_cards/scale_schedule.html) for additional info.

Bug Fixes

* Fixed a bug where `Event.FIT_END` was not being called in the training loop (1054)
* Fixed a bug where evaluation would not run at the end of training unless it aligned with the ``eval_interval`` (1045)
* Fixed a bug where models trained with SWA could not be used with checkpoints (1015)
* Fixed a bug where the [Speed Monitor](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.callbacks.speed_monitor.html#composer.callbacks.speed_monitor.SpeedMonitor) included validation time in the training throughput measurements, resulting in slower reported throughput measurements (1053)
* Fixed a bug to make the [ComposerClassifier](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.models.tasks.classification.html#composer.models.tasks.classification.ComposerClassifier) compatible with TorchScript (1036)
* Fixed a bug where fractional [Time Objects](https://docs.mosaicml.com/en/v0.7.0/api_reference/composer.core.time.html#composer.core.time.Time) were being truncated instead of raising an exception (1038)
* Changed the defaults for [Selective Backprop](https://docs.mosaicml.com/en/v0.7.0/method_cards/selective_backprop.html) to not scale inputs, so the algorithm can work with non-vision workloads (#896)


New Contributors
* ofirpress made their first contribution in https://github.com/mosaicml/composer/pull/955
* QiyaoWei made their first contribution in https://github.com/mosaicml/composer/pull/866
* pavithranrao made their first contribution in https://github.com/mosaicml/composer/pull/879


Changelog

https://github.com/mosaicml/composer/compare/v0.6.1...v0.7.0

0.6.1

Go ahead and upgrade; it's fully backwards compatible with Composer v0.6.0.

Install via `pip`:

```bash
pip install --upgrade mosaicml==0.6.1
```

Alternatively, install Composer with Conda:

```bash
conda install -c mosaicml mosaicml=0.6.1
```


What's New?

1. **📎 Adaptive Gradient Clipping (AGC)**

[Adaptive Gradient Clipping (AGC)](https://docs.mosaicml.com/en/v0.6.1/method_cards/agc.html) clips gradients based on the ratio of their norms with weights' norms. This technique helps stabilize training with large batch sizes, especially for models without batchnorm layers.
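
A minimal sketch of enabling AGC, assuming the algorithm is exposed as `composer.algorithms.AGC`; see the method card for the constructor options:

```python
from composer import Trainer
from composer.algorithms import AGC

trainer = Trainer(
    ...,
    algorithms=[AGC()],
)

trainer.fit()
```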

1. **🚚 Exponential Moving Average (EMA)**

[Exponential Moving Average (EMA)](https://docs.mosaicml.com/en/v0.6.1/method_cards/ema.html) is a model averaging technique that maintains an exponentially weighted moving average of the model parameters during training. The averaged parameters are used for model evaluation. EMA typically results in less noisy validation metrics over the course of training, and sometimes increased generalization.
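
A sketch of enabling EMA; the `half_life` argument name and value are illustrative and may differ by version:

```python
from composer import Trainer
from composer.algorithms import EMA

# Maintain an exponentially weighted moving average of the weights for evaluation
ema = EMA(half_life="100ba")

trainer = Trainer(
    ...,
    algorithms=[ema],
)

trainer.fit()
```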

1. **🪵 Logger is available in the ComposerModel**

The [Logger](https://docs.mosaicml.com/en/v0.6.1/trainer/logging.html) is bound to the [ComposerModel](https://docs.mosaicml.com/en/v0.6.1/composer_model.html) via the ``self.logger`` attribute. It is available during training on all methods (other than `__init__`).

For example, to log hidden activation:

```python
import torch.nn.functional as F

from composer.models import ComposerModel

class Net(ComposerModel):

    def forward(self, x):
        x = F.relu(F.max_pool2d(self.conv1(x), 2))
        x = F.relu(F.max_pool2d(self.conv2_drop(self.conv2(x)), 2))
        if self.logger:
            self.logger.data_batch({
                "hidden_activation_norm": x.norm(2).item(),
            })
        x = x.view(-1, 320)
        x = F.relu(self.fc1(x))
        x = F.dropout(x, training=self.training)
        x = self.fc2(x)
        return F.log_softmax(x)
```


1. **🐛 Environment Collection Script**

Composer v0.6.1 includes an [environment collection script](https://docs.mosaicml.com/en/v0.6.1/api_reference/composer.utils.collect_env.html#module-composer.utils.collect_env) which generates a printout of your system configuration and python environment. If you run into a bug, the results from this script will help us debug the issue and fix Composer.

To collect your environment information:

```bash
$ pip install mosaicml  # if composer is not already installed
$ composer_collect_env
```


Then, include the output in your [GitHub Issue](https://github.com/mosaicml/composer/issues/new?assignees=&labels=bug&template=---bug-report.md&title=).

What's Improved?

1. **📜 TorchScriptable Algorithms**

[BlurPool](https://docs.mosaicml.com/en/v0.6.1/method_cards/blurpool.html), [Ghost BatchNorm](https://docs.mosaicml.com/en/v0.6.1/method_cards/ghost_batchnorm.html), and [Stochastic Depth](https://docs.mosaicml.com/en/v0.6.1/method_cards/stochastic_depth.html) are now TorchScript-compatible. Try exporting your models with these algorithms enabled!
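
For example, after training with these algorithms enabled, the underlying module can be exported with standard PyTorch TorchScript; in this sketch, `my_model` is assumed to be a plain `torch.nn.Module` (e.g. the module wrapped by your ComposerModel):

```python
import torch

# Script and save the trained model for deployment
scripted = torch.jit.script(my_model)
scripted.save("model_scripted.pt")
```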

1. **🏛️ ColOut on Segmentation**

[ColOut](https://docs.mosaicml.com/en/v0.6.1/method_cards/colout.html) now supports segmentation-style models.
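
A minimal sketch of enabling ColOut (the drop probabilities are illustrative):

```python
from composer import Trainer
from composer.algorithms import ColOut

# Randomly drop ~15% of rows and columns from each input image
colout = ColOut(p_row=0.15, p_col=0.15)

trainer = Trainer(
    ...,
    algorithms=[colout],
)
```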

What's Fixed?

1. **🚑️ Loggers capture the Traceback**

We fixed a bug so the [Loggers](https://docs.mosaicml.com/en/v0.6.1/trainer/logging.html), such as the [Weights & Biases Logger](https://docs.mosaicml.com/en/v0.6.1/api_reference/composer.loggers.wandb_logger.html) and the [File Logger](https://docs.mosaicml.com/en/v0.6.1/api_reference/composer.loggers.file_logger.html), will capture the traceback of any exception that crashes the training process.

1. **🏋️ Weights & Biases Logger Config**

We fixed a bug where the [Weights & Biases Logger](https://docs.mosaicml.com/en/v0.6.1/api_reference/composer.loggers.wandb_logger.html) was not properly recording the configuration.

Full Changelog

https://github.com/mosaicml/composer/compare/v0.6.0...v0.6.1

0.6.0

New Contributors
* vahidfazelrezai made their first contribution in https://github.com/mosaicml/composer/pull/781
* murthyn made their first contribution in https://github.com/mosaicml/composer/pull/789
* dlmgary made their first contribution in https://github.com/mosaicml/composer/pull/818
* IanWorley made their first contribution in https://github.com/mosaicml/composer/pull/835

**Full Changelog**: https://github.com/mosaicml/composer/compare/v0.5.0...v0.6.0
