Lhotse

Latest version: v1.31.0


1.24.2

New recipes
* Add KsponSpeech recipe by whsqkaak in https://github.com/lhotse-speech/lhotse/pull/1353

New features
Several new APIs for manifest classes were added in https://github.com/lhotse-speech/lhotse/pull/1361:
* `cut.iter_data()` iterates over (key, manifest) pairs of all data items attached to a given cut (e.g., `("recording", Recording(...)), ("custom_features", TemporalArray(...))`)
* `is_in_memory` property on all manifest types indicates whether the manifest contains data held in memory
* `is_placeholder` on non-cut manifests indicates whether a manifest is just a placeholder (it has some metadata, but can't be used to load data)
* `cut.drop_in_memory_data()` converts manifests with in-memory data to placeholders (useful for manifests that outlive dataloading, to avoid blowing up CPU memory and/or slowing down the program)
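
To make the relationship between these four APIs concrete, here is a minimal toy model in plain Python. `ToyManifest` and `ToyCut` are hypothetical stand-ins written for this illustration; they are not Lhotse classes, and the real implementation may differ. The sketch only mirrors the behavior described above: in-memory data can be discovered, and dropping it leaves a metadata-only placeholder.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, Iterator, Optional, Tuple


@dataclass(frozen=True)
class ToyManifest:
    """Toy stand-in for a non-cut manifest (NOT a Lhotse class)."""

    id: str
    duration: float
    path: Optional[str] = None      # on-disk location of the data, if any
    memory: Optional[bytes] = None  # raw bytes held in RAM, if any

    @property
    def is_in_memory(self) -> bool:
        return self.memory is not None

    @property
    def is_placeholder(self) -> bool:
        # Metadata only: there is nothing to load from memory or from disk.
        return self.memory is None and self.path is None


@dataclass(frozen=True)
class ToyCut:
    """Toy stand-in for a cut holding named data items (NOT a Lhotse class)."""

    id: str
    data: Dict[str, ToyManifest] = field(default_factory=dict)

    def iter_data(self) -> Iterator[Tuple[str, ToyManifest]]:
        # Yield (key, manifest) pairs for every attached data item.
        yield from self.data.items()

    @property
    def is_in_memory(self) -> bool:
        return any(manifest.is_in_memory for _, manifest in self.iter_data())

    def drop_in_memory_data(self) -> "ToyCut":
        # Replace in-memory items with placeholders; leave on-disk items alone.
        slim = {
            key: replace(manifest, memory=None) if manifest.is_in_memory else manifest
            for key, manifest in self.iter_data()
        }
        return replace(self, data=slim)


cut = ToyCut(
    "cut-0",
    {
        "recording": ToyManifest("rec-0", 2.5, memory=b"\x00" * 16000),
        "custom_features": ToyManifest("feat-0", 2.5, path="feats/feat-0.npy"),
    },
)
slim_cut = cut.drop_in_memory_data()
```

In this toy model the slimmed cut keeps all metadata (IDs, durations) but no longer holds the audio bytes, which is the property that matters for manifests kept around after dataloading.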

Bug fixes
* Restoring smart open for local files if available by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1360
* Fix Recording.to_dict() when transforms are dicts and transform pickling issues by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1355
* Utils for discovering attached data and dropping in-memory data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1361
* Numpy 2.0 compatibility by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1362

New Contributors
* whsqkaak made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1353

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.24.1...v1.24.2

1.24.1

What's Changed
* Support for reading data from AIStore using Python SDK by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1354


**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.24...v1.24.1

1.24

What's Changed

New features

Notably, there's a new optimization for the dynamic bucketing sampler in multi-GPU training: it chooses the same (or the closest possible) bucket on each DDP rank to keep training step times closer across ranks. The expected speedup depends on the model and the number of GPUs. We observed 8% and 13% speedups in two experiments compared to non-synchronized bucket selection. The new option is called `sync_buckets` and is enabled by default.
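
The intuition behind synchronized bucket selection can be sketched with a small simulation. This is not Lhotse's sampler code: the bucket costs, rank count, and seeding scheme below are assumptions chosen for illustration. It only shows why, when ranks wait for each other at every step, picking the same bucket everywhere avoids paying for the slowest rank.

```python
import random

# Hypothetical per-step compute time for each duration bucket (arbitrary units).
BUCKET_COST = [1.0, 2.0, 4.0, 8.0]
NUM_RANKS = 4
NUM_STEPS = 1000


def total_time(sync_buckets: bool, seed: int = 0) -> float:
    """Simulate total training time for synchronized vs. independent bucket choice."""
    # Synchronized: every rank seeds its bucket-selection RNG identically.
    # Independent: each rank gets its own RNG stream (seed + rank).
    rngs = [
        random.Random(seed if sync_buckets else seed + rank)
        for rank in range(NUM_RANKS)
    ]
    total = 0.0
    for _ in range(NUM_STEPS):
        picks = [rng.randrange(len(BUCKET_COST)) for rng in rngs]
        # Ranks synchronize gradients each step, so the slowest rank sets the pace.
        total += max(BUCKET_COST[bucket] for bucket in picks)
    return total


synced = total_time(sync_buckets=True)
unsynced = total_time(sync_buckets=False)
print(f"synced: {synced:.0f}, unsynced: {unsynced:.0f}")
```

Because rank 0 draws the same sequence in both modes, the unsynchronized total can never be lower than the synchronized one in this toy setup. The size of the gap here says nothing about real speedups, which depend on the model and hardware as noted above.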

* Dynamic bucket selection RNG sync by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1341
* Add new sampler: weighted sampler by marcoyang1998 in https://github.com/lhotse-speech/lhotse/pull/1344
* `reverb_rir`: support Cut input and in memory data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1332

Recipes

* Add the ReazonSpeech recipe by Triplecq in https://github.com/lhotse-speech/lhotse/pull/1330

Other improvements

* Missing 'subset' parameter by daniel-dona in https://github.com/lhotse-speech/lhotse/pull/1336
* Fix describe on cuts by keeofkoo in https://github.com/lhotse-speech/lhotse/pull/1340
* Use libsndfile in recording chunk dataset by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1335
* Fix librispeech manifest caching by haerski in https://github.com/lhotse-speech/lhotse/pull/1343
* Fix one-off edge case in split_lazy by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1347
* Increase the start diff tolerance for feature loading by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1349
* More test coverage for lhotse subset by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1345


New Contributors
* keeofkoo made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1340
* haerski made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1343
* Triplecq made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1330

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.23...v1.24

1.23

What's Changed

Recipes
* MDCC recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1302
* Updated text_norm for `aishell` recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1305
* Allow skipping missing files in AMI download by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1318
* Add Chinese TTS dataset `baker`. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1304
* In CommonVoice corpus, use .tsv headers to parse and not column index by daniel-dona in https://github.com/lhotse-speech/lhotse/pull/1328

Fixes to a regression in noise mixing augmentations
* Enhance `CutSet.mix()` randomness and data utilization by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1315
* Fix randomness in CutMix transform by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1316
* select a random sub-region of the noise based on the delta duration by osadj in https://github.com/lhotse-speech/lhotse/pull/1317

Other improvements
* Add dataset for audio tagging by marcoyang1998 in https://github.com/lhotse-speech/lhotse/pull/1241
* Fix _get_strided_batch device by lifeiteng in https://github.com/lhotse-speech/lhotse/pull/1303
* Fix typo in README.md by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1308
* Fix export of features/array to shar by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1323
* Fix `trim_to_supervision_groups` by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1322

New Contributors
* daniel-dona made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1328

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.22...v1.23

1.22

What's Changed

New features

* Extending Lhotse dataloading to text/multimodal data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1295

As an experimental feature, we are extending the API of Lhotse samplers to enable key sampling features for non-audio data such as text. That means text (and other) data can be dynamically multiplexed and bucketed in the same way as audio data, using some lightweight wrappers. Please refer to the new documentation here: https://lhotse.readthedocs.io/en/latest/datasets.html#customizing-sampling-constraints
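
As a rough illustration of what bucketing text "in the same way as audio" means, the sketch below groups sentences into length buckets and emits a batch once a token budget would be exceeded. The function name, the whitespace token count, and the parameters are all assumptions made for this example; Lhotse's own wrappers and constraints are described in the linked documentation and are not reproduced here.

```python
import bisect
from typing import Iterable, Iterator, List


def num_tokens(text: str) -> int:
    # Assumption: whitespace splitting stands in for a real tokenizer.
    return len(text.split())


def dynamic_text_batches(
    texts: Iterable[str], bucket_bounds: List[int], max_tokens: int
) -> Iterator[List[str]]:
    """Yield batches of similar-length texts, each within a token budget.

    `bucket_bounds` are sorted upper length limits separating the buckets.
    A single text longer than `max_tokens` still forms its own batch.
    """
    buckets: List[List[str]] = [[] for _ in range(len(bucket_bounds) + 1)]
    for text in texts:
        n = num_tokens(text)
        bucket = buckets[bisect.bisect_left(bucket_bounds, n)]
        # Flush the bucket if adding this text would exceed the budget.
        if bucket and sum(num_tokens(t) for t in bucket) + n > max_tokens:
            yield list(bucket)
            bucket.clear()
        bucket.append(text)
    # Emit whatever is left in each bucket at the end of the stream.
    for bucket in buckets:
        if bucket:
            yield list(bucket)


texts = ["a b", "c d", "e f g h i j", "k l", "m n o p q r s"]
batches = list(dynamic_text_batches(texts, bucket_bounds=[4], max_tokens=8))
```

Here the token budget plays the role that a maximum total duration plays for audio: short sentences end up batched together, and long ones are kept apart so padding stays small.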

* Multi-channel support improvements:
  * Fix loading multi-channel custom recording fields in multi cuts by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1298
  * Channel selection for multi-channel custom recording fields by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1299

Lhotse `MultiCut`s:
* are now exportable into Lhotse Shar format
* gained a new method `cut = cut.with_channels([0, 1, ...])` to modify the channels they refer to
* can have multi-channel custom Recordings with channels selectable via a special custom key (e.g., if defining `cut.target_recording`, audio can be read via `cut.load_target_recording()` and channels will be auto-selected by looking up `cut.target_recording_channel_selector`).

Recipes

* Add new recipe: speechio by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1297
* tedlium2 recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1296

Other improvements

* Use audio backends and export custom fields in Lhotse Shar by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1290
* Documentation for random seeds in lhotse + extended support of lazy r… by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1291
* Cutconcat fixed max duration by swigls in https://github.com/lhotse-speech/lhotse/pull/1292
* Fix feature_dim of Spectrogram extractors. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1294
* fix whisper for multi-channel data by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1289
* Xfail flaky SileroVAD tests by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1300

New Contributors
* swigls made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1292

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.21...v1.22

1.21

What's Changed

This release patches Lhotse to handle cases where libsox is not available for torchaudio. The audio backend code went through an additional round of refactoring, and `libsndfile` is now preferred as the default since it showed faster audio decoding performance in our testing. Going forward, when `LHOTSE_AUDIO_BACKEND` is set, we will use the same backend for audio loading, audio saving, and reading audio metadata (if possible). This release also adds support for Python 3.12 and PyTorch 2.2.
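
A minimal sketch of pinning the backend through the `LHOTSE_AUDIO_BACKEND` environment variable from Python follows. The helper `pin_audio_backend` is hypothetical, and the value `"LibsndfileBackend"` is an assumption used as a placeholder; the accepted backend names should be checked against the Lhotse documentation for the installed version. Setting the variable before Lhotse is imported is a conservative choice, not a documented requirement.

```python
import os
from typing import Optional

BACKEND_VAR = "LHOTSE_AUDIO_BACKEND"


def pin_audio_backend(name: str) -> Optional[str]:
    """Set the backend variable and return its previous value (None if unset)."""
    previous = os.environ.get(BACKEND_VAR)
    os.environ[BACKEND_VAR] = name
    return previous


# Assumption: "LibsndfileBackend" is a placeholder, not a verified backend name.
previous_backend = pin_audio_backend("LibsndfileBackend")
```

Returning the previous value makes it easy to restore the original setting afterwards, for example in a test teardown.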

* Add VAD to Supervisions in LibriLight Recipe by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1280
* Fixes for manifest validation and fixing by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1284
* Handle error with cachedir creation gracefully by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1287
* `AudioBackend` specific `save_audio` and `info`, managing missing SoX in torchaudio, Python 3.12 / PyTorch 2.2 support, using `libsndfile` as preferred audio backend by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1288


**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.20...v1.21
