Lhotse

Latest version: v1.29.0

Page 2 of 8

1.24.1

What's Changed
* Support for reading data from AIStore using Python SDK by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1354


**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.24...v1.24.1

1.24

What's Changed

New features

Notably, there is a new optimization for the dynamic bucketing sampler in multi-GPU training: it chooses the same (or the closest possible) bucket on each DDP rank, which keeps training step times across ranks closer together. The expected speedup depends on the model and the number of GPUs. We observed 8% and 13% speedups in two experiments compared to non-synchronized bucket selection. The new option is called `sync_buckets` and is enabled by default.

* Dynamic bucket selection RNG sync by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1341
* Add new sampler: weighted sampler by marcoyang1998 in https://github.com/lhotse-speech/lhotse/pull/1344
* `reverb_rir`: support Cut input and in memory data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1332
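
The idea behind synchronized bucket selection can be illustrated in plain Python. This is a minimal sketch and not Lhotse's implementation: the helper `bucket_choices` and the seed value are hypothetical, and the sketch only assumes that every rank seeds a dedicated RNG with the same value so that each rank draws the same bucket index at each step.

```python
import random


def bucket_choices(seed: int, num_buckets: int, num_steps: int) -> list:
    """Return the bucket index chosen at each training step.

    Hypothetical helper: the RNG is seeded identically on every rank,
    so the sequence of choices does not depend on the rank.
    """
    rng = random.Random(seed)
    return [rng.randrange(num_buckets) for _ in range(num_steps)]


# Simulate four DDP ranks that share one seed for bucket selection.
shared_seed = 1234
per_rank = [
    bucket_choices(shared_seed, num_buckets=10, num_steps=5) for _rank in range(4)
]

# True when every rank picked the same bucket at every step.
all_ranks_agree = all(choices == per_rank[0] for choices in per_rank)
print(all_ranks_agree)
```

Because buckets group examples of similar duration, ranks that draw the same bucket process batches of similar size, which is what keeps their step times close.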

Recipes

* Add the ReazonSpeech recipe by Triplecq in https://github.com/lhotse-speech/lhotse/pull/1330

Other improvements

* Missing 'subset' parameter by daniel-dona in https://github.com/lhotse-speech/lhotse/pull/1336
* Fix describe on cuts by keeofkoo in https://github.com/lhotse-speech/lhotse/pull/1340
* Use libsndfile in recording chunk dataset by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1335
* Fix librispeech manifest caching by haerski in https://github.com/lhotse-speech/lhotse/pull/1343
* Fix one-off edge case in split_lazy by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1347
* Increase the start diff tolerance for feature loading by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1349
* More test coverage for lhotse subset by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1345


New Contributors
* keeofkoo made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1340
* haerski made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1343
* Triplecq made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1330

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.23...v1.24

1.23

What's Changed

Recipes
* MDCC recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1302
* Updated text_norm for `aishell` recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1305
* Allow skipping missing files in AMI download by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1318
* Add Chinese TTS dataset `baker`. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1304
* In CommonVoice corpus, use .tsv headers to parse and not column index by daniel-dona in https://github.com/lhotse-speech/lhotse/pull/1328

Fixes to a regression in noise mixing augmentations
* Enhance `CutSet.mix()` randomness and data utilization by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1315
* Fix randomness in CutMix transform by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1316
* select a random sub-region of the noise based on the delta duration by osadj in https://github.com/lhotse-speech/lhotse/pull/1317
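
As a rough illustration of the last fix, the sketch below picks a random sub-region of a noise recording from the difference ("delta") between the noise duration and the duration that is needed. The function name and arguments are hypothetical, and the uniform draw is an assumption made for the sketch, not a description of the code in the linked PR.

```python
import random


def random_noise_offset(
    noise_duration: float, target_duration: float, rng: random.Random
) -> float:
    """Pick a start offset (in seconds) for a noise excerpt of `target_duration`.

    `delta` is how much longer the noise is than the excerpt we need; the
    offset is drawn uniformly from [0, delta]. If the noise is not longer
    than the target, the only valid offset is 0.
    """
    delta = noise_duration - target_duration
    if delta <= 0:
        return 0.0
    return rng.uniform(0.0, delta)


rng = random.Random(0)
offset = random_noise_offset(noise_duration=30.0, target_duration=4.5, rng=rng)
short_noise_offset = random_noise_offset(noise_duration=3.0, target_duration=4.5, rng=rng)
```

Drawing the offset at random, rather than always starting at 0, lets a long noise recording contribute different excerpts across mixes.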

Other improvements
* Add dataset for audio tagging by marcoyang1998 in https://github.com/lhotse-speech/lhotse/pull/1241
* Fix _get_strided_batch device by lifeiteng in https://github.com/lhotse-speech/lhotse/pull/1303
* Fix typo in README.md by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1308
* Fix export of features/array to shar by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1323
* Fix `trim_to_supervision_groups` by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1322

New Contributors
* daniel-dona made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1328

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.22...v1.23

1.22

What's Changed

New features

* Extending Lhotse dataloading to text/multimodal data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1295

As an experimental feature, we are extending the API of Lhotse samplers to enable key sampling features for non-audio data such as text. That means text (and other) data can be dynamically multiplexed and bucketed in the same way as audio data with some lightweight wrappers. Please refer to the new documentation here: https://lhotse.readthedocs.io/en/latest/datasets.html#customizing-sampling-constraints
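
To make the idea of a non-audio sampling constraint concrete, here is a minimal plain-Python sketch that batches text by a token budget instead of by total audio duration. It does not use Lhotse's API: `batch_by_token_budget` is a hypothetical helper, and counting tokens by whitespace splitting is an assumption made only for this example.

```python
from typing import Iterable, Iterator, List


def batch_by_token_budget(texts: Iterable[str], max_tokens: int) -> Iterator[List[str]]:
    """Group texts into batches whose total token count stays within `max_tokens`.

    Tokens are approximated by whitespace splitting. A single text longer
    than the budget is emitted as a batch of its own.
    """
    batch: List[str] = []
    used = 0
    for text in texts:
        n = len(text.split())
        # Close the current batch if adding this text would exceed the budget.
        if batch and used + n > max_tokens:
            yield batch
            batch, used = [], 0
        batch.append(text)
        used += n
    if batch:
        yield batch


examples = ["a b c", "d e", "f g h i", "j"]
batches = list(batch_by_token_budget(examples, max_tokens=5))
```

The constraint plays the same role for text that a maximum total duration plays for audio: it bounds the amount of data per mini-batch.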

* Multi-channel support improvements
  * Fix loading multi-channel custom recording fields in multi cuts by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1298
  * Channel selection for multi-channel custom recording fields by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1299

Lhotse `MultiCut`s:
* are now exportable into Lhotse Shar format
* gained a new method `cut = cut.with_channels([0, 1, ...])` to modify the channels they refer to
* can have multi-channel custom Recordings with channels selectable via a special custom key (e.g., if defining `cut.target_recording`, audio can be read via `cut.load_target_recording()` and channels will be auto-selected by looking up `cut.target_recording_channel_selector`).
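
The channel-selector lookup described in the last item can be sketched with plain Python data. This is a stand-in, not Lhotse's classes: the `custom` dict, the helper `load_custom_audio`, and the channel-major list layout are all assumptions made for illustration. Only the `<field>_channel_selector` naming convention comes from the notes above.

```python
from typing import Dict, List


def load_custom_audio(custom: Dict[str, object], field: str) -> List[List[float]]:
    """Return the samples stored under `field`, restricted to selected channels.

    If `custom` also holds `<field>_channel_selector` (an int or a list of
    ints), only those channels are returned; otherwise all channels are.
    """
    samples = custom[field]
    selector = custom.get(f"{field}_channel_selector")
    if selector is None:
        return samples
    if isinstance(selector, int):
        selector = [selector]
    return [samples[c] for c in selector]


# Toy data: three channels with four samples each.
audio = [[0.0, 0.1, 0.2, 0.3], [1.0, 1.1, 1.2, 1.3], [2.0, 2.1, 2.2, 2.3]]
custom = {"target_recording": audio, "target_recording_channel_selector": [0, 2]}
picked = load_custom_audio(custom, "target_recording")
```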

Recipes

* Add new recipe: speechio by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1297
* tedlium2 recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1296

Other improvements

* Use audio backends and export custom fields in Lhotse Shar by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1290
* Documentation for random seeds in lhotse + extended support of lazy r… by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1291
* Cutconcat fixed max duration by swigls in https://github.com/lhotse-speech/lhotse/pull/1292
* Fix feature_dim of Spectrogram extractors. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1294
* fix whisper for multi-channel data by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1289
* Xfail flaky SileroVAD tests by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1300

New Contributors
* swigls made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1292

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.21...v1.22

1.21

What's Changed

This release patches lhotse to handle cases when libsox is not available for torchaudio. The audio backend code went through an additional round of refactoring, and `libsndfile` is now preferred as the default since it showed faster audio decoding performance in our testing. Going forward, when `LHOTSE_AUDIO_BACKEND` is set, we will use the same backend for audio loading, audio saving, and reading audio metadata (if possible). This release also adds support for Python 3.12 and PyTorch 2.2.
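
A minimal sketch of setting `LHOTSE_AUDIO_BACKEND` from Python follows. The value `"LibsndfileBackend"` is an assumption used for illustration; check which backend names your installed Lhotse version accepts. Setting the variable before Lhotse is imported is the cautious choice, since this page does not say when the variable is read.

```python
import os

# Assumed backend name, for illustration only.
BACKEND_NAME = "LibsndfileBackend"

# Set the variable early, before any Lhotse code has a chance to read it.
os.environ["LHOTSE_AUDIO_BACKEND"] = BACKEND_NAME

# Anything started from this process (including dataloader workers)
# inherits the same setting.
configured_backend = os.environ["LHOTSE_AUDIO_BACKEND"]
print(configured_backend)
```

The same effect can be had by exporting the variable in the shell that launches the training script.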

* Add VAD to Supervisions in LibriLight Recipe by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1280
* Fixes for manifest validation and fixing by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1284
* Handle error with cachedir creation gracefully by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1287
* `AudioBackend` specific `save_audio` and `info`, managing missing SoX in torchaudio, Python 3.12 / PyTorch 2.2 support, using `libsndfile` as preferred audio backend by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1288


**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.20...v1.21

1.20

What's Changed

New features
* Extended the subset of lhotse that works without installing torchaudio by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1253 and https://github.com/lhotse-speech/lhotse/pull/1255
* Ensure `drop_last=False` always returns an equal number of mini-batches by re-distributing and/or duplicating some data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1277
* Improved CPU memory usage and shuffling + bucketing in `DynamicBucketingSampler` by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1276
* Enable seed randomization in dynamic samplers by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1278
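
The `drop_last=False` behavior above (every rank gets the same number of mini-batches) can be illustrated with a small plain-Python sketch. It is not Lhotse's algorithm: `equalize_across_ranks` is a hypothetical helper, and both the round-robin split and the choice of which batches to duplicate are assumptions made for the example.

```python
from typing import List, TypeVar

T = TypeVar("T")


def equalize_across_ranks(batches: List[T], world_size: int) -> List[List[T]]:
    """Split batches round-robin across ranks, then pad the shorter ranks.

    Ranks that received fewer batches are topped up with duplicates taken
    from the start of the global list, so all ranks run the same number
    of steps.
    """
    per_rank = [batches[rank::world_size] for rank in range(world_size)]
    target = max(len(rank_batches) for rank_batches in per_rank)
    for rank_batches in per_rank:
        missing = target - len(rank_batches)
        rank_batches.extend(batches[:missing])
    return per_rank


all_batches = ["b0", "b1", "b2", "b3", "b4"]
split = equalize_across_ranks(all_batches, world_size=2)
```

Equal batch counts matter in DDP training because a rank that runs out of data early would otherwise leave the remaining ranks waiting in a collective operation.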

Recipes
* Fluent Speech Commands dataset, SLU task by HSTEHSTEHSTE in https://github.com/lhotse-speech/lhotse/pull/1272

Other improvements
* Update docs with env vars used by Lhotse by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1252
* support whisper large v3; deepspeed launcher rank world_size setting by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1260
* Fix non-deterministic tests by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1261
* Fix duplication issues in CutSet.mix() by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1268
* Support controllable `CutSet.mux` weights in multiprocess dataloading by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1266
* Fix distributed sampler initialization and `exceeded` sampler warning false positives by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1270
* Install kaldi-native-io explicitly in the kaldi doc example. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1275
* Allow duplicate cut IDs in a CutSet (CutSet is list-like instead of dict-like) by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1279

New Contributors
* HSTEHSTEHSTE made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1272

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.19...v1.20
