Faster-whisper

Latest version: v1.1.1

1.1.1

What's Changed
* Brings back original VAD parameters naming by Purfview in https://github.com/SYSTRAN/faster-whisper/pull/1181
* Make batched `suppress_tokens` behaviour same as in sequential by Purfview in https://github.com/SYSTRAN/faster-whisper/pull/1194
* Fixes OOM Errors - too high RAM usage by VAD by Purfview in https://github.com/SYSTRAN/faster-whisper/pull/1198
* Add duration of audio and VAD removed duration to `BatchedInferencePipeline` by greenw0lf in https://github.com/SYSTRAN/faster-whisper/pull/1186
* Fix `neg_threshold` by Purfview in https://github.com/SYSTRAN/faster-whisper/pull/1191
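
As a rough illustration of the VAD-related items above, the sketch below passes VAD options by name to the batched pipeline and reports how much audio the VAD filter removed. The option names in `VAD_PARAMETERS`, the `duration` and `duration_after_vad` fields on the returned info object, and both helper functions are assumptions made for illustration; check them against your installed version.

```python
def vad_removed_seconds(duration: float, duration_after_vad: float) -> float:
    """Seconds of audio dropped by the VAD filter (never negative)."""
    return max(0.0, duration - duration_after_vad)


# Assumed option names and example values; verify against your installed version.
VAD_PARAMETERS = {
    "threshold": 0.5,
    "neg_threshold": 0.35,
    "min_silence_duration_ms": 500,
}


def transcribe_with_vad(audio_path: str, model_size: str = "small"):
    """Hypothetical helper: batched transcription with VAD enabled."""
    # Imported lazily so the helper above works without faster-whisper installed.
    from faster_whisper import BatchedInferencePipeline, WhisperModel

    model = WhisperModel(model_size, device="cpu", compute_type="int8")
    pipeline = BatchedInferencePipeline(model=model)
    segments, info = pipeline.transcribe(
        audio_path,
        vad_filter=True,
        vad_parameters=VAD_PARAMETERS,
        batch_size=8,
    )
    removed = vad_removed_seconds(info.duration, info.duration_after_vad)
    return list(segments), removed
```

Calling `transcribe_with_vad("audio.mp3")` would return the segments together with the number of seconds that VAD removed.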

New Contributors
* greenw0lf made their first contribution in https://github.com/SYSTRAN/faster-whisper/pull/1186

**Full Changelog**: https://github.com/SYSTRAN/faster-whisper/compare/v1.1.0...v1.1.1

1.1

Updated the default settings in main.py and added detailed instructions. I've also quantized all of the Whisper models myself and uploaded them to huggingface.co; the script now uses them.

Read the instructions in main.py before changing the recommended size and/or quantization level.

Also added two .exe builds; no installation is necessary.

* The CUDA version uses the small.en model.
* The other uses base.en and is CPU-only, with 4 threads.

1.1.0

New Features
* New batched inference that is 4x faster and accurate. Refer to the [README](https://github.com/SYSTRAN/faster-whisper/tree/v1.1.0?tab=readme-ov-file#batched-transcription) for usage instructions.
* Support for the new `large-v3-turbo` model.
* VAD filter is now 3x faster on CPU.
* Feature Extraction is now 3x faster.
* Added `log_progress` to `WhisperModel.transcribe` to print transcription progress.
* Added a `multilingual` option to transcription to allow transcribing multilingual audio. Note that the large models already have code-switching capabilities, so this is mostly beneficial for the `medium` model or smaller.
* `WhisperModel.detect_language` can now optionally use the VAD filter, and language detection is improved through `language_detection_segments` and `language_detection_threshold`.
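
The features listed above can be sketched as follows. This is a minimal example, not the project's reference usage: the device, compute type, and batch size are placeholder choices, and `format_segment`, `transcribe_batched`, and `transcribe_sequential` are hypothetical helper names.

```python
def format_segment(start: float, end: float, text: str) -> str:
    """Render one segment as a single readable line."""
    return f"[{start:.2f}s -> {end:.2f}s] {text.strip()}"


def transcribe_batched(audio_path: str, batch_size: int = 16) -> list:
    """Batched transcription with the large-v3-turbo model."""
    # Imported lazily so format_segment works without faster-whisper installed.
    from faster_whisper import BatchedInferencePipeline, WhisperModel

    model = WhisperModel("large-v3-turbo", device="cuda", compute_type="float16")
    batched = BatchedInferencePipeline(model=model)
    segments, _info = batched.transcribe(audio_path, batch_size=batch_size)
    return [format_segment(s.start, s.end, s.text) for s in segments]


def transcribe_sequential(audio_path: str) -> list:
    """Sequential transcription with progress logging and multilingual mode."""
    from faster_whisper import WhisperModel

    model = WhisperModel("medium", device="cpu", compute_type="int8")
    segments, _info = model.transcribe(
        audio_path,
        log_progress=True,   # print transcription progress
        multilingual=True,   # allow more than one language in the audio
    )
    return [format_segment(s.start, s.end, s.text) for s in segments]
```

Both functions consume the segment generator fully, because transcription only runs as the segments are iterated.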

Bug Fixes
* Use correct features padding for encoder input when `chunk_length` <30s
* Use correct `seek` value in output

Other Changes
* Replaced `NamedTuple` with `dataclass` in `Word`, `Segment`, `TranscriptionOptions`, `TranscriptionInfo`, and `VadOptions`. This allows conversion to `json` without nesting. Note that the `_asdict()` method is still available in the `Word` and `Segment` classes for backward compatibility but will be removed in the next release; use `dataclasses.asdict()` instead.
* Added new tests for development
* Updated benchmarks in the Readme
* Use `jiwer` instead of `evaluate` in benchmarks
* Filter out non_speech_tokens in suppressed tokens by jordimas in https://github.com/SYSTRAN/faster-whisper/pull/898
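
The move from `_asdict()` to `dataclasses.asdict()` mentioned above can be sketched with the standard library alone. The `Word` and `Segment` classes below are simplified, hypothetical stand-ins rather than the library's actual definitions, and `segment_to_json` is an illustrative helper.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class Word:  # hypothetical stand-in for the library's Word
    start: float
    end: float
    word: str
    probability: float


@dataclass
class Segment:  # hypothetical stand-in with only a few fields
    id: int
    start: float
    end: float
    text: str
    words: List[Word] = field(default_factory=list)


def segment_to_json(segment: Segment) -> str:
    """Serialize a segment, including its nested words, to a JSON string."""
    # asdict() converts nested dataclasses recursively into plain dicts and lists.
    return json.dumps(asdict(segment))
```

Because `asdict()` recurses into the nested `words` list, the result can be handed to `json.dumps` directly, with no per-field conversion.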

New Contributors
* Jiltseb made their first contribution in https://github.com/SYSTRAN/faster-whisper/pull/856
* heimoshuiyu made their first contribution in https://github.com/SYSTRAN/faster-whisper/pull/1092

**Full Changelog**: https://github.com/SYSTRAN/faster-whisper/compare/v1.0.3...v1.1.0

1.0.3

Upgrade Silero-Vad model to latest V5 version (https://github.com/SYSTRAN/faster-whisper/pull/884)

Silero-vad V5 release: https://github.com/snakers4/silero-vad/releases/tag/v5.0
- window_size_samples parameter is fixed at 512.
- Change to use the state variable instead of the existing h and c variables.
- Slightly changed internal logic, now some context (part of previous chunk) is passed along with the current chunk.
- Change the dimensions of the state variable from 64 to 128.
- Replace ONNX file with V5 version
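
The fixed window size and the context carried over from the previous chunk can be pictured with a small, library-free sketch. The context length of 64 samples and the function name are assumptions for illustration only; the notes above say only that part of the previous chunk is passed along.

```python
from typing import Iterator, List, Sequence

WINDOW_SIZE = 512  # fixed window size, per the V5 notes above
CONTEXT_SIZE = 64  # assumption for illustration only


def iter_windows_with_context(
    samples: Sequence[float],
    window: int = WINDOW_SIZE,
    context: int = CONTEXT_SIZE,
) -> Iterator[List[float]]:
    """Yield fixed-size windows, each prefixed with the tail of the previous one."""
    previous_tail = [0.0] * context  # the first window has no history, so use silence
    for start in range(0, len(samples), window):
        chunk = list(samples[start : start + window])
        chunk += [0.0] * (window - len(chunk))  # zero-pad the final partial window
        yield previous_tail + chunk
        previous_tail = chunk[window - context :]
```

Each yielded list has `context + window` samples, so a model consuming it sees a little of what came before the current chunk.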

Other changes

* Improve language detection when using clip_timestamps (#867)
* Docker file improvements (#848)
* Fix #839 incorrect clip_timestamps being used in model (#842)

1.0.2

* Add support for distil-large-v3 (https://github.com/SYSTRAN/faster-whisper/pull/755)
  The latest Distil-Whisper model, [distil-large-v3](https://huggingface.co/distil-whisper/distil-large-v3-ct2), is intrinsically designed to work with the OpenAI sequential algorithm.

* Benchmarks (https://github.com/SYSTRAN/faster-whisper/pull/773)
  Introduces functionality to benchmark memory usage, Word Error Rate (WER), and speed in Faster-whisper.

* Support initializing more whisper model args (https://github.com/SYSTRAN/faster-whisper/pull/807)

* Small bug fixes:
  * Code breaks if audio is empty (https://github.com/SYSTRAN/faster-whisper/pull/768)
  * Foolproof: Disable VAD if clip_timestamps is in use (https://github.com/SYSTRAN/faster-whisper/pull/769)
  * Make faster_whisper.assets a valid Python package to distribute (https://github.com/SYSTRAN/faster-whisper/pull/774)
  * Loosen tokenizers version constraint (https://github.com/SYSTRAN/faster-whisper/pull/804)
  * CUDA version and updated installation instructions (https://github.com/SYSTRAN/faster-whisper/pull/785)

* New features from the original OpenAI Whisper project:
  * Feature/add hotwords (https://github.com/SYSTRAN/faster-whisper/pull/731)
  * Improve language detection (https://github.com/SYSTRAN/faster-whisper/pull/732)
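
The "disable VAD if clip_timestamps is in use" safeguard listed above can be pictured roughly as below. This is a sketch of the idea, not the library's code: both function names are hypothetical, and treating `"0"` as the "not in use" default is an assumption.

```python
from typing import List, Union


def clip_timestamps_in_use(clip_timestamps: Union[str, List[float]]) -> bool:
    """True when the caller requested clips other than the assumed default "0"."""
    if isinstance(clip_timestamps, str):
        return clip_timestamps.strip() not in ("", "0")
    # A list counts as "in use" once it holds any non-zero timestamp.
    return any(t != 0 for t in clip_timestamps)


def effective_vad_filter(
    vad_filter: bool, clip_timestamps: Union[str, List[float]]
) -> bool:
    """VAD only runs when requested and no explicit clips were given."""
    return vad_filter and not clip_timestamps_in_use(clip_timestamps)
```

The point of the guard is that explicit clip timestamps refer to positions in the original audio, which would no longer line up once VAD has removed silence.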

1.0.1

* Bug fixes and performance improvements:
  * Update logic to get segment from features before encoding (https://github.com/SYSTRAN/faster-whisper/pull/705)
  * Fix window end heuristic for hallucination_silence_threshold (https://github.com/SYSTRAN/faster-whisper/pull/706)
