Faster-whisper

Latest version: v1.1.1


0.7.1

* Fix a bug related to `no_speech_threshold`: when the threshold was met for a segment, the next 30-second window reused the same encoder output and was also considered non-speech
* Improve selection of the final result when all temperature fallbacks failed by returning the result with the best log probability
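The fallback selection described above can be sketched in plain Python (the helper below is illustrative, not faster-whisper's internals):

```python
# Illustrative sketch: when every temperature fallback fails its quality
# checks, keep the candidate with the highest average log probability
# instead of simply returning the last one generated.

def select_best_result(candidates):
    """candidates: list of (text, avg_logprob) pairs, one per fallback."""
    return max(candidates, key=lambda c: c[1])

results = [
    ("guess at T=0.2", -1.8),
    ("guess at T=0.4", -0.9),
    ("guess at T=0.6", -1.3),
]
best_text, best_logprob = select_best_result(results)
# best_text is "guess at T=0.4", the candidate with the best log probability
```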

0.7.0

Improve word-level timestamp heuristics

Some recent improvements from openai-whisper are ported to faster-whisper:

* Squash long words at window and sentence boundaries (https://github.com/openai/whisper/commit/255887f219e6b632bc1a6aac1caf28eecfca1bac)
* Improve timestamp heuristics (https://github.com/openai/whisper/commit/f572f2161ba831bae131364c3bffdead7af6d210)

Support download of user converted models from the Hugging Face Hub

The `WhisperModel` constructor now accepts any repository ID as argument, for example:

```python
model = WhisperModel("username/whisper-large-v2-ct2")
```


The utility function `download_model` has been updated similarly.

Other changes

* Accept an iterable of token IDs for the argument `initial_prompt` (useful to include timestamp tokens in the prompt)
* Avoid computing higher temperatures when `no_speech_threshold` is met (same as https://github.com/openai/whisper/commit/e334ff141d5444fbf6904edaaf408e5b0b416fe8)
* Fix truncated output when using a prefix without disabling timestamps
* Update the minimum required CTranslate2 version to 3.17.0 to include the latest fixes

0.6.0

Extend `TranscriptionInfo` with additional properties

* `all_language_probs`: the probability of each language (only set when `language=None`)
* `vad_options`: the VAD options that were used for this transcription

Improve robustness on temporary connection issues to the Hugging Face Hub

When the model is loaded from its name like `WhisperModel("large-v2")`, a request is made to the Hugging Face Hub to check if some files should be downloaded.

It can happen that this request raises an exception: the Hugging Face Hub is down, the internet connection is temporarily lost, etc. These exceptions are now caught, and the library falls back to loading the model directly from the local cache if it exists.

Other changes

* Enable the `onnxruntime` dependency for Python 3.11 as the latest version now provides binary wheels for Python 3.11
* Fix occasional `IndexError` on empty segments when using `word_timestamps=True`
* Export `__version__` at the module level
* Include missing requirement files in the released source distribution

0.5.1

Fix `download_root` to correctly set the cache directory where the models are downloaded.

0.5.0

Improved logging

Some information is now logged under the `INFO` and `DEBUG` levels. The logging level can be configured like this:

```python
import logging

logging.basicConfig()
logging.getLogger("faster_whisper").setLevel(logging.DEBUG)
```


More control over model downloads

New arguments were added to the `WhisperModel` constructor to better control how the models are downloaded:

* `download_root` to specify where the model should be downloaded.
* `local_files_only` to avoid downloading the model and directly return the path to the cached model, if it exists.
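The semantics of the two options can be sketched as follows (a hypothetical helper for illustration, not faster-whisper's internals):

```python
import os

def resolve_model_path(name, download_root, local_files_only, download):
    """Sketch of the new constructor options' behavior.

    `download` stands in for whatever fetches the model from the Hub;
    names and signature are illustrative only.
    """
    local_path = os.path.join(download_root, name)
    if os.path.isdir(local_path):
        return local_path  # already cached under download_root
    if local_files_only:
        # Never touch the network: fail if the model is not cached.
        raise FileNotFoundError(f"{name} not found in {download_root}")
    return download(name, download_root)  # fetch into download_root
```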

Other changes

* Improve the default VAD behavior to prevent some words from being assigned to the incorrect speech chunk in the original audio
* Fix incorrect application of option `condition_on_previous_text=False` (note that the bug still exists in openai/whisper v20230314)
* Fix segment timestamps that are sometimes inconsistent with the word timestamps after VAD
* Extend the `Segment` structure with additional properties to match openai/whisper
* Rename `AudioInfo` to `TranscriptionInfo` and add a new property `options` to summarize the transcription options that were used

0.4.1

Fix some `IndexError` exceptions:

* when VAD is enabled and a predicted timestamp is after the last speech chunk
* when word timestamps are enabled and the model predicts a token sequence that decodes to invalid Unicode characters
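The first fix amounts to clamping a predicted timestamp to the end of the last speech chunk; a minimal sketch of that guard (illustrative, not the library's code):

```python
def clamp_to_speech_chunks(timestamp, speech_chunks):
    """speech_chunks: list of (start, end) pairs in seconds, sorted by start.

    A timestamp predicted past the last chunk previously caused an
    IndexError when mapped back to the original audio; clamping it to the
    final chunk's end keeps the lookup within the chunk list.
    """
    last_end = speech_chunks[-1][1]
    return min(timestamp, last_end)

print(clamp_to_speech_chunks(12.7, [(0.0, 3.2), (4.1, 9.8)]))  # 9.8
```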
