Torchaudio

Latest version: v2.6.0

0.6.0

Highlights

torchaudio now includes a new model module (with wav2letter included), new functionals (contrast, cvm, dcshift, overdrive, vad, phaser, flanger, biquad), datasets (GTZAN, CMU), and a new optional sox backend with support for torchscript. torchaudio now also supports Windows, with the soundfile backend.

torchaudio requires python 3.6 or more recent.

Backwards Incompatible Changes

* We reorganized the C++ resources (630) and replaced C++ bindings for sox_effects init/list/shutdown with torch binding (748).
* We removed code specific to python 2 (691), and we no longer test against python 2 (575) and 3.5 (577).

New Features

* We now support Windows. (604, 637, 642, 655, 743)
* We now have a model module which includes wav2letter. (462, 722)
* We added the GTZAN and CMU datasets. (668, 710)
* We now have the contrast functional (551), cvm (540), dcshift (558), overdrive (569), vad (578, 599), phaser (587, 607, 702), flanger (651, 702), biquad (661).
* We added a new sox_io backend (718, 728, 734, 727, 763, 752, 731, 732, 726, 780) that is compatible with torchscript with a new AudioMetaData class (761).
* MelSpectrogram now has power and normalized parameters (633), and slaney normalization (589, 641).
* lfilter now has a clamp option. (600)
* Griffin-Lim can now have zero momentum. (601)
* sliding_window_cmn now supports batching. (570)
* Downloaded datasets now verify checksums. (499)
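
The checksum item above can be illustrated with a minimal, standard-library sketch. It is not torchaudio's download code: the helper names `file_sha256` and `verify_download` are hypothetical, and SHA-256 is an assumption made for the example (the release notes do not say which hash is used).

```python
import hashlib
import tempfile
from pathlib import Path


def file_sha256(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large archives never sit fully in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Raise RuntimeError if the file's digest differs from the expected one."""
    actual = file_sha256(path)
    if actual != expected_hex:
        raise RuntimeError(f"checksum mismatch for {path}: got {actual}")
    return True


# A throwaway file stands in for a downloaded dataset archive.
with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "archive.bin"
    archive.write_bytes(b"example payload")
    expected = hashlib.sha256(b"example payload").hexdigest()
    checksum_ok = verify_download(archive, expected)
```

Verifying before extraction means a truncated or corrupted download fails loudly instead of producing a partially usable dataset.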

Improvements

* We added ogg/vorbis/opus support to the binary distribution (750, 755).
* We replaced the use of torch.norm in spectrogram to improve performance (747).
* We now use fused operations in lfilter for faster computation. (517, 564)
* STFT is now called directly from torchaudio. (531)
* We redesigned the backend mechanism to support torchscript by restructuring the code (695, 696, 700, 706, 707, 698) and adding dynamic listing (697).
* torchaudio can be built along with sox, or can use external sox. (625, 669, 739)
* We redesigned the sox_effects module. (708)
* We added more details to compilation instructions. (667)
* We updated the README with instructions on changing the backend. (553)
* We now have a version compatibility matrix in README. (685)
* We now use cmake to build third party libraries (753).
* We now use CircleCI instead of travis (576, 584, 598, 603, 636, 738) and we test on GPU (586, 777).
* We run the test suite against nightlies. (538, 678)
* We redesigned our test suite with new helper functions (514, 519, 521, 565, 616, 690, 692, 694), standard pytorch test utilities (513, 640, 643, 645, 646, 652, 650, 712), separate CPU and GPU tests (513, 528, 644), more descriptive names (532), clearer organization (539, 541, 542, 664, 672, 687, 703, 716, 732), standardized names (559), and backend awareness (719). This is detailed in a new README for testing (566, 759).
* We now support typing, for datasets (511, 522), for backends (527), for init (526), and inline (530), with mypy configuration (524, 544, 590).

Bug Fixes

* We removed in place operations so that Griffin-Lim can be backpropagated through. (730)
* We fixed kaldi MFCC on GPU. (681)
* We removed multiple definitions of SoxEffect in C++. (635)
* We fixed the docstring of masking. (612)
* We replaced views by reshape for batching. (594)
* We fixed missing conda environment when testing in python 3.8. (582)
* We ensure that sox is not exposed on Windows. (579)
* We corrected the instructions to install nightlies. (547, 552)
* We fixed the seed of mask_along_iid. (529)
* We correctly report GPU tests as skipped instead of passed. (516)

Deprecations

* Since sox_effects is now automatically initialized and shut down (572, 693), we are deprecating these functions (709).
* ISTFT is migrating to torch. (523)

0.5.1

Highlights

* Updated pinned version of PyTorch to [`v1.5.1`](https://github.com/pytorch/pytorch/releases/tag/v1.5.1)

0.5.0

Highlights

torchaudio includes new transforms (e.g. Griffin-Lim and inverse Mel scale), new filters (e.g. all pass, fade, band pass/reject, band, treble, deemph, riaa), and datasets (LJ Speech and SpeechCommands).

Backwards Incompatible Changes

* torchaudio no longer supports python 2. We removed future and six imports. We added inline typing. (413, 478, 479, 482, 486)
* We fixed CommonVoice dataset download, and updated to the latest version. (498)
* We now skip data points with missing data in the VCTK dataset. (484)

New Features

* We now have the Vol transform and DB_to_amplitude. (468, 469)
* We now have the InverseMelScale. (448)
* We now have the Griffin-Lim functional. (365)
* We now support allpass, fade, bandpass, bandreject, band, treble, deemph, riaa. (444, 449, 464, 470, 508)
* We now offer LJSpeech and SpeechCommands datasets. (439, 437)
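
For readers unfamiliar with the decibel conversion behind DB_to_amplitude, here is a plain-Python sketch of the arithmetic. It is an illustration, not the library's code: the function name `db_to_amplitude` and the `ref` and `power` parameters are assumptions for this example, with `power=1.0` meaning the input is on the power scale and `power=0.5` meaning it is on the amplitude scale.

```python
import math


def db_to_amplitude(x_db, ref=1.0, power=1.0):
    """Convert a decibel value back to a linear value.

    Assumed convention: x_db = 10 * log10(value / ref) for power-scale
    input, so power=0.5 recovers an amplitude from 20 * log10(amplitude).
    """
    return ref * (10.0 ** (0.1 * x_db)) ** power


# +20 dB on the amplitude scale is a tenfold amplitude increase.
amplitude_ratio = db_to_amplitude(20.0, power=0.5)
# +10 dB on the power scale is a tenfold power increase.
power_ratio = db_to_amplitude(10.0, power=1.0)
```

Both ratios come out at approximately 10, which is the usual sanity check for the two dB conventions.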

Improvements

* We added inline typing to SoxEffects and Kaldi compliance. (490, 497)
* We refactored the tests. (480, 485, 496, 491, 501, 502, 503, 506, 507, 509)
* We now run tests with sox only when sox is available. (419)
* We extended batch support to MelScale, MelSpectrogram, MFCC, Resample. (391, 435)
* The speed of torchaudio.functional.istft was improved. (471)
* We now have transform and functional tests for AmplitudeToDB. (463)
* We now ignore pycharm and OSX files in git. (461)
* TimeStretch now has a batch test. (459)
* Docstrings in transforms were polished. (442)
* TimeStretch and AmplitudeToDB are now torch.nn.Module. (456)
* Resample is now jitable. (441)
* We support python 3.8. (397)
* Add cuda test for complex norm. (421)
* Dither is jitable with the latest version of pytorch. (417)
* Batching uses view instead of reshape. (409)
* We refactored the jitability test. (395)
* In .circleci, we removed a conditional block that wasn't doing anything. (399)
* We now have Windows CI for building. (394 and 398)
* We corrected the use of standard variable names in code. (393)
* We adopted native-Python code generation convention. (378)
* torchaudio.istft creates tensors directly on device. (377)
* torchaudio.compliance.kaldi.resample_waveform is now jitable. (362)
* The runtime of torchaudio.functional.lfilter was decreased. (374)
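
As background for the lfilter item above, the following is a deliberately slow, pure-Python sketch of the difference equation that an IIR filter of this kind evaluates. It is not torchaudio's implementation: `lfilter_sketch` is a hypothetical name, the argument order (denominator `a_coeffs` before numerator `b_coeffs`) and the `clamp` flag are assumptions chosen for illustration, and real code would vectorize the loops.

```python
def lfilter_sketch(samples, a_coeffs, b_coeffs, clamp=True):
    """Apply a0*y[n] = sum(b[k]*x[n-k]) - sum(a[k]*y[n-k], k >= 1)."""
    if len(a_coeffs) != len(b_coeffs):
        raise ValueError("a_coeffs and b_coeffs must have the same length")
    if a_coeffs[0] == 0:
        raise ValueError("a_coeffs[0] must be non-zero")

    output = []
    for n in range(len(samples)):
        acc = 0.0
        # Feed-forward part: weighted sum of current and past inputs.
        for k, b in enumerate(b_coeffs):
            if n - k >= 0:
                acc += b * samples[n - k]
        # Feedback part: subtract weighted past outputs.
        for k in range(1, len(a_coeffs)):
            if n - k >= 0:
                acc -= a_coeffs[k] * output[n - k]
        output.append(acc / a_coeffs[0])

    if clamp:
        # Clamp only the final result, so the recursion itself is unaffected.
        output = [max(-1.0, min(1.0, y)) for y in output]
    return output


# Impulse response of a one-pole smoother: y[n] = 0.5*x[n] + 0.5*y[n-1].
impulse_response = lfilter_sketch([1.0, 0.0, 0.0, 0.0], [1.0, -0.5], [0.5, 0.0])
```

The impulse response halves at every step, which makes the feedback term easy to check by hand.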

Bug Fixes

* We fixed flake8 errors. (504, 505)
* We fixed Windows test by only testing with cpu-only binaries. (489)
* Spelling correction in docstrings for transforms.FrequencyMasking and transforms.TimeMasking. (474)
* In .circleci, we switched to use token for conda uploads. (460)
* The default value of dither parameter was changed. (453)
* TimeStretch moves device correctly. (457)
* Adding dev-other option in librispeech. (433)
* In build script, we install the correct version of pytorch for pip. (412)
* Upgrading dataset DeprecationWarning to UserWarning so that the user gets the warning. (402)
* Make power of spectrogram a float to work with complex norm. (392)
* Fix random seed for flaky test_griffinlim test. (388)
* Apply 'nightly' branch filter to binary uploads. (385)
* Fixed build errors: explicitly added utf8 decoration, added an explicit utf_8_encoder definition if not available, and explicitly cast to int. (380)

Deprecations

* None

0.4

* We introduce an interactive speech recognition demo. (266, 229, 248)
* SoX is now optional, and a new extensible backend dispatch mechanism exposes SoundFile as an alternative to SoX.
* The interface for datasets has been unified. This enables the addition of two large datasets: LibriSpeech and Common Voice.
* New filters such as biquad, data augmentation such as time and frequency masking, and transforms such as gain and dither, and new feature computation such as deltas, are now available.
* Transformations now support batches and are jitable.

We would like to thank our contributors and the wider community again for their significant contributions to this release. In particular, we'd like to thank keunwoochoi, ksanjeevan, and all the other maintainers and contributors of torchaudio-contrib for their significant and valuable additions around augmentations (285) and batching (327).

Breaking Changes

* torchaudio now requires PyTorch 1.3.0 or newer, see https://pytorch.org/ for installation instructions. (#312)
* We make jit compilation optional for functions and use nn.Module where possible. (314, 326, 342, 369)
* By unifying the interface for datasets, we changed the interface for VCTK and YESNO (303, 316). In particular, the construction parameters `downsample`, `transform`, `target_transform`, and `return_dict` are being deprecated.
* SoxEffectsChain.EFFECTS_AVAILABLE replaced by SoxEffectsChain().EFFECTS_AVAILABLE (355)
* This is the last version to support Python 2.

New Features

* SoX is now optional, and a new extensible backend dispatch mechanism exposes SoundFile as an alternative to SoX. This makes it possible to use torchaudio even when SoX or SoundFile are not installed or available. (355)
* We now have a unified dataset interface that loads only one item into memory at a time, enabling new large datasets: LibriSpeech and CommonVoice. (303, 316, 330)
* We introduce a pitch detection algorithm: `torchaudio.functional.detect_pitch_frequency`. (313, 322)
* We offer data augmentations in `torchaudio.transforms`: `TimeStretch`, `FrequencyMasking`, `TimeMasking`. (285, 333, 348)
* We introduce a complex norm transform: `torchaudio.transforms.ComplexNorm`. (285, 333)
* We now have a new audio feature generation for computing deltas: `torchaudio.functional.compute_deltas`. (268, 326)
* We introduce `torchaudio.functional.gain` and `torchaudio.functional.dither` (319, 360). We welcome work to continue the effort to implement features available in SoX, see 260.
* We now include `equalizer_biquad` (315, 340), `lowpass_biquad`, `highpass_biquad` (275), `lfilter`, and `biquad` (275, 291, 326) in `torchaudio.functional`.
* MFCC is available as `torchaudio.functional.mfcc`. (228)
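
The masking augmentations listed above (`FrequencyMasking`, `TimeMasking`) both blank out one contiguous band of a spectrogram. A rough plain-Python sketch of that idea follows, using nested lists instead of tensors. The function name `mask_along_axis_sketch`, its parameters, and the way the band width and start are drawn are assumptions for illustration, not the library's code.

```python
import random


def mask_along_axis_sketch(spec, mask_param, axis, mask_value=0.0, rng=None):
    """Return a copy of spec ([freq][time]) with one random band masked.

    axis=0 masks frequency bins (rows); axis=1 masks time frames (columns).
    The band width is drawn below mask_param and never exceeds the axis size.
    """
    rng = rng or random.Random()
    n_freq, n_time = len(spec), len(spec[0])
    size = n_freq if axis == 0 else n_time

    width = min(int(rng.random() * mask_param), size)
    start = int(rng.random() * (size - width))
    end = start + width

    masked = [row[:] for row in spec]  # copy, leaving the input untouched
    for f in range(n_freq):
        for t in range(n_time):
            index = f if axis == 0 else t
            if start <= index < end:
                masked[f][t] = mask_value
    return masked


# Time-mask a 4x6 spectrogram of ones with a fixed seed for repeatability.
spec = [[1.0] * 6 for _ in range(4)]
masked = mask_along_axis_sketch(spec, mask_param=3, axis=1, rng=random.Random(0))
```

Passing a seeded `random.Random` keeps the augmentation reproducible in tests while leaving production use random.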

Improvements

* We now support batching in transforms. (327, 337, 404)
* Functions are now jitable, and nn.Module is used where possible. (314, 326, 342, 362, 369, 395)
* Downloads of large files are now automatically resumed with new download function. (320)
* New tests for ISTFT are added. (279)
* We introduce nightly builds. (301)
* We now have smoke tests for builds. (346, 359)

Bug Fixes

* Fix mismatch between `MelScale` and librosa. (294)
* Fix `torchaudio.compliance.kaldi.resample_waveform` where internal variables were not moved to the GPU when used. (277)
* Fix a bug that occurred when importing torchaudio built outside of a git repository. (276)
* Fix `istft` where internal parameters were not created with the same `dtype` and `device` as the tensor provided by the user. (264)
* Fix size mismatch when saving and loading from state dictionary (`load_state_dict`). (246)
* Clarified internal naming convention within transforms and functionals. (298)
* Fix build script to be more tolerant to download drops. (280, 284, 305)
* Correct documentation for SoxEffectsChain. (283)
* Fix resample error with cuda tensors. (277)
* Fix error when importing version outside of git. (276)
* Fix missing asound in linux build. (254)
* Fix deprecated torch. (254)
* Fix link in README. (253)
* Fix window device in ISTFT. (240)
* Documentation: Fix range in documentation for `torchaudio.load` to [-1, 1]. (283)

0.4.0

0.3.2

This release is to update the dependency to PyTorch 1.3.1.
