Lhotse

Latest version: v1.23.0

1.23

What's Changed

Recipes
* MDCC recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1302
* Updated text_norm for `aishell` recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1305
* Allow skipping missing files in AMI download by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1318
* Add Chinese TTS dataset `baker`. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1304
* In CommonVoice corpus, use .tsv headers to parse and not column index by daniel-dona in https://github.com/lhotse-speech/lhotse/pull/1328
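
The CommonVoice change above replaces positional column access with header-based lookup. A minimal standard-library sketch of that idea follows; the column names (`client_id`, `path`, `sentence`) and the helper `parse_tsv` are assumptions for illustration, not the recipe's actual code.

```python
import csv
import io
from typing import Dict, List


def parse_tsv(text: str) -> List[Dict[str, str]]:
    """Parse TSV content into dicts keyed by the header row (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


# The same record stored with two different column orders.
old_layout = "client_id\tpath\tsentence\nspk1\ta.mp3\thello\n"
new_layout = "path\tsentence\tclient_id\na.mp3\thello\tspk1\n"

rows_old = parse_tsv(old_layout)
rows_new = parse_tsv(new_layout)

# Lookup by name is unaffected by the column order.
print(rows_old[0]["sentence"], rows_new[0]["sentence"])
```

Indexing by position (for example `row[2]`) would return `hello` for one layout and `spk1` for the other, which is the kind of breakage header-based parsing avoids.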

Fixes to a regression in noise mixing augmentations
* Enhance `CutSet.mix()` randomness and data utilization by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1315
* Fix randomness in CutMix transform by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1316
* select a random sub-region of the noise based on the delta duration by osadj in https://github.com/lhotse-speech/lhotse/pull/1317
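
The last item above can be pictured as sliding a window of the required length to a random position inside a longer noise recording. The sketch below is an assumption about the general technique, not the code from the pull request; `random_noise_region` and its parameters are hypothetical names, and "delta duration" is taken here to mean the duration that still needs to be filled with noise.

```python
import random
from typing import Tuple


def random_noise_region(
    noise_duration: float, needed_duration: float, rng: random.Random
) -> Tuple[float, float]:
    """Return (start, duration) of a random noise sub-region (hypothetical helper)."""
    if noise_duration <= needed_duration:
        # The noise is too short to choose from: use all of it from the start.
        return 0.0, noise_duration
    # The slack is how far the window can slide inside the noise recording.
    slack = noise_duration - needed_duration
    start = rng.uniform(0.0, slack)
    return start, needed_duration


rng = random.Random(0)
start, duration = random_noise_region(noise_duration=10.0, needed_duration=3.0, rng=rng)
print(f"use noise from {start:.2f}s for {duration:.2f}s")
```

Drawing the start offset at random, rather than always reading from the beginning, lets different parts of a long noise recording be used across examples.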

Other improvements
* Add dataset for audio tagging by marcoyang1998 in https://github.com/lhotse-speech/lhotse/pull/1241
* Fix _get_strided_batch device by lifeiteng in https://github.com/lhotse-speech/lhotse/pull/1303
* Fix typo in README.md by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1308
* Fix export of features/array to shar by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1323
* Fix `trim_to_supervision_groups` by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1322

New Contributors
* daniel-dona made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1328

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.22...v1.23

1.22

What's Changed

New features

* Extending Lhotse dataloading to text/multimodal data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1295

As an experimental feature, we are extending the API of Lhotse samplers to enable key sampling features for non-audio data such as text. That means text (and other) data can be dynamically multiplexed and bucketed in the same way as audio data with some lightweight wrappers. Please refer to the new documentation here: https://lhotse.readthedocs.io/en/latest/datasets.html#customizing-sampling-constraints

* Multi-channel support improvements
  * Fix loading multi-channel custom recording fields in multi cuts by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1298
  * Channel selection for multi-channel custom recording fields by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1299

Lhotse `MultiCut`s:
* are now exportable into Lhotse Shar format
* gained a new method `cut = cut.with_channels([0, 1, ...])` to modify the channels they refer to
* can have multi-channel custom Recordings with channels selectable via a special custom key (e.g., if defining `cut.target_recording`, audio can be read via `cut.load_target_recording()` and channels will be auto-selected by looking up `cut.target_recording_channel_selector`).

Recipes

* Add new recipe: speechio by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1297
* tedlium2 recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1296

Other improvements

* Use audio backends and export custom fields in Lhotse Shar by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1290
* Documentation for random seeds in lhotse + extended support of lazy r… by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1291
* Cutconcat fixed max duration by swigls in https://github.com/lhotse-speech/lhotse/pull/1292
* Fix feature_dim of Spectrogram extractors. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1294
* fix whisper for multi-channel data by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1289
* Xfail flaky SileroVAD tests by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1300

New Contributors
* swigls made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1292

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.21...v1.22

1.21

What's Changed

This release patches lhotse to handle cases when libsox is not available for torchaudio. The audio backend code went through an additional round of refactoring, and `libsndfile` is now preferred as the default since it showed faster audio decoding performance in our testing. Going forward, when `LHOTSE_AUDIO_BACKEND` is set, we will use the same backend for audio loading, audio saving, and reading audio metadata (if possible). This release also adds support for Python 3.12 and PyTorch 2.2.

* Add VAD to Supervisions in LibriLight Recipe by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1280
* Fixes for manifest validation and fixing by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1284
* Handle error with cachedir creation gracefully by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1287
* `AudioBackend` specific `save_audio` and `info`, managing missing SoX in torchaudio, Python 3.12 / PyTorch 2.2 support, using `libsndfile` as preferred audio backend by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1288

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.20...v1.21

1.20

What's Changed

New features
* Extended the subset of lhotse that works without installing torchaudio by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1253 and https://github.com/lhotse-speech/lhotse/pull/1255
* Ensure `drop_last=False` always returns an equal number of mini-batches by re-distributing and/or duplicating some data by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1277
* Improved CPU memory usage and shuffling + bucketing in `DynamicBucketingSampler` by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1276
* Enable seed randomization in dynamic samplers by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1278
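
The `drop_last=False` item above matters in distributed training, where ranks that run different numbers of steps can stall each other. The sketch below illustrates only the duplication half of the idea, under simplifying assumptions: `equalize_batches` is a hypothetical helper, batches are plain lists of cut IDs, and ranks with no batches at all are left untouched. It is not Lhotse's implementation.

```python
from typing import List


def equalize_batches(per_rank: List[List[List[str]]]) -> List[List[List[str]]]:
    """Pad shorter ranks by repeating their own batches until all match the longest."""
    target = max(len(batches) for batches in per_rank)
    equalized = []
    for batches in per_rank:
        padded = list(batches)
        i = 0
        # Cycle through this rank's batches until it has as many as the longest rank.
        while len(padded) < target and batches:
            padded.append(batches[i % len(batches)])
            i += 1
        equalized.append(padded)
    return equalized


# Rank 0 has three mini-batches, rank 1 has only two.
rank_batches = [
    [["a1", "a2"], ["a3", "a4"], ["a5"]],
    [["b1", "b2"], ["b3"]],
]
balanced = equalize_batches(rank_batches)
print([len(batches) for batches in balanced])
```

The trade-off is that a small amount of data is seen twice in an epoch, in exchange for every rank taking the same number of steps.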

Recipes
* Fluent Speech Commands dataset, SLU task by HSTEHSTEHSTE in https://github.com/lhotse-speech/lhotse/pull/1272

Other improvements
* Update docs with env vars used by Lhotse by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1252
* support whisper large v3; deepspeed launcher rank world_size setting by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1260
* Fix non-deterministic tests by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1261
* Fix duplication issues in CutSet.mix() by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1268
* Support controllable `CutSet.mux` weights in multiprocess dataloading by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1266
* Fix distributed sampler initialization and `exceeded` sampler warning false positives by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1270
* Install kaldi-native-io explicitly in the kaldi doc example. by csukuangfj in https://github.com/lhotse-speech/lhotse/pull/1275
* Allow duplicate cut IDs in a CutSet (CutSet is list-like instead of dict-like) by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1279

New Contributors
* HSTEHSTEHSTE made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1272

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.19...v1.20

1.19

What's Changed

Features

* Support for OPUS encoding in Lhotse Shar format by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1238
* Perform CutSet.mix() lazily by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1244
* `CutSampler.map()` for transforming `CutSet` mini-batches by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1246
* Support multiplexing with a limited number of open streams by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1248
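
Multiplexing with a limited number of open streams can be sketched as follows: items are drawn at random from a small pool of active iterators, and a new source is opened only when an active one is exhausted. This is a simplified illustration with uniform sampling; `mux_limited` and its parameters are hypothetical names and do not reflect Lhotse's API or its weighting logic.

```python
import random
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def mux_limited(sources: List[Iterable[T]], max_open: int, seed: int = 0) -> Iterator[T]:
    """Randomly interleave items from sources, keeping at most max_open iterators active."""
    if max_open < 1:
        raise ValueError("max_open must be at least 1")
    rng = random.Random(seed)
    pending = list(sources)          # sources that have not been opened yet
    active: List[Iterator[T]] = []   # iterators that are currently open
    while pending or active:
        # Top up the pool of open streams.
        while pending and len(active) < max_open:
            active.append(iter(pending.pop(0)))
        idx = rng.randrange(len(active))
        try:
            yield next(active[idx])
        except StopIteration:
            # Close the exhausted stream; a pending source takes its slot next round.
            active.pop(idx)


shards = [[1, 2], [3, 4], [5, 6]]
mixed = list(mux_limited(shards, max_open=2))
print(mixed)
```

Capping the number of open streams bounds memory and file-handle usage when there are many sources, at the cost of mixing items from fewer sources at any one time.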

Recipes

* support icmc eval track 1 by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1235
* updating the voxpopuli recipe by vesis84 in https://github.com/lhotse-speech/lhotse/pull/1243
* Allowing downloading Edin. ver. of VCTK by JinZr in https://github.com/lhotse-speech/lhotse/pull/1247

Other improvements
* Micro-optimization for LazyJsonlIterator len() by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1237
* Drop python3.7 support by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1245
* Fix `normalize_loudness` for MixedCuts with PaddingCuts by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1249

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.18...v1.19

1.18

What's Changed

New features

* MMS forced alignment backend by flyingleafe in https://github.com/lhotse-speech/lhotse/pull/1185
* Two new options: `CutSet.from_shar(seed="trng")` and `DynamicCutSampler(quadratic_duration=...)` by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1199
* Faster initialization option in `DynamicBucketingSampler` + various fixes by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1210
* CLI to estimate and print bucket bins for a cut set by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1214
* More flexible setting of audio backends by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1219
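
To give a feel for the `quadratic_duration` option listed above, the sketch below assumes the penalty takes the form `duration + duration**2 / quadratic_duration`, so that long cuts count for more than their raw length when filling a batch. Both that formula and the helper name `effective_duration` are assumptions made for illustration; check the sampler documentation for the exact behaviour.

```python
from typing import Optional


def effective_duration(duration: float, quadratic_duration: Optional[float] = None) -> float:
    """Duration with an optional quadratic length penalty (assumed formula)."""
    if quadratic_duration is None:
        return duration
    return duration + duration ** 2 / quadratic_duration


for d in (5.0, 10.0, 30.0):
    print(d, round(effective_duration(d, quadratic_duration=30.0), 2))
```

Under this assumption a 30-second cut is charged as 60 seconds while a 5-second cut is charged as under 6, so batches of long cuts end up holding fewer examples. That is useful when the model's cost grows faster than linearly with input length.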

Recipes

* Add recipe for Medical Corpus by yfyeung in https://github.com/lhotse-speech/lhotse/pull/1212
* minor fix for the AMI recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1178
* fixes compatibility with Edin. ver. VCTK dataset by JinZr in https://github.com/lhotse-speech/lhotse/pull/1182
* Minor bug fix for eval2000 recipe by JinZr in https://github.com/lhotse-speech/lhotse/pull/1127
* support far field data for icmcasr challenge by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1189
* fixed text norm for `tal_csasr` by JinZr in https://github.com/lhotse-speech/lhotse/pull/1198 https://github.com/lhotse-speech/lhotse/pull/1213

Other improvements

* `MixedCut.truncate`: fix the case when only `PaddingCut`s are left by flyingleafe in https://github.com/lhotse-speech/lhotse/pull/1157
* Fix some potential problems in OPUS file reading by yangb05 in https://github.com/lhotse-speech/lhotse/pull/1181
* fix an issue where 404 exception leaves 0 byte placeholder by JinZr in https://github.com/lhotse-speech/lhotse/pull/1190
* Prevent accidental renaming when using with_suffix by chiiyeh in https://github.com/lhotse-speech/lhotse/pull/1192
* Fix shar export for `num_jobs>1` and recordings with transforms by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1196
* fix speaker error by yzmyyff in https://github.com/lhotse-speech/lhotse/pull/1197
* Fix for `trim_to_alignments` issue by desh2608 in https://github.com/lhotse-speech/lhotse/pull/1193
* Add `deterministic_rng` to more flaky tests by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1200
* update_recipes by vesis84 in https://github.com/lhotse-speech/lhotse/pull/1208
* SpeechSynthesisDataset returns `speaker_ids` by JinZr in https://github.com/lhotse-speech/lhotse/pull/1206
* Fix audio backend selection by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1216
* save sdm files into a single mdm file to do gss by yuekaizhang in https://github.com/lhotse-speech/lhotse/pull/1221
* Modify SpeechSynthesisDataset class, make it return text by yaozengwei in https://github.com/lhotse-speech/lhotse/pull/1205
* Allow lhotse installation without torchaudio for a limited set of features by pzelasko in https://github.com/lhotse-speech/lhotse/pull/1231
* Use `attacut` module for Thai word tokenization (in MMS forced alignment) by flyingleafe in https://github.com/lhotse-speech/lhotse/pull/1232

New Contributors
* yangb05 made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1181
* chiiyeh made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1192
* yzmyyff made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1197
* yaozengwei made their first contribution in https://github.com/lhotse-speech/lhotse/pull/1205

**Full Changelog**: https://github.com/lhotse-speech/lhotse/compare/v1.17...v1.18
