RealtimeTTS

Latest version: v0.4.21


0.3.2

- fixes a bug that caused play_async() to fail after calling stop(); affected all engines (see the sketch below)
- added tqdm to requirements (used to show a progress bar while the Coqui model downloads)
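
A minimal sketch of the call sequence that used to fail (assuming a TextToAudioStream named `stream` has already been constructed with any engine):

```python
stream.feed("First sentence to speak.")
stream.play_async()              # start asynchronous playback
stream.stop()                    # interrupt playback

stream.feed("Second sentence.")
stream.play_async()              # before 0.3.2, this follow-up call could fail
```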

0.3.0

To use the new tokenizer, call the `play` or `play_async` methods with the new parameter `tokenizer="stanza"` and provide the language shortcut. Also adjust the `minimum_sentence_length`, `minimum_first_fragment_length` and `context_size` parameters to the average word length of the desired language.

For example (Chinese):
```python
self.stream.play_async(
    minimum_sentence_length=2,
    minimum_first_fragment_length=2,
    tokenizer="stanza",
    language="zh",
    context_size=2)
```


Example implementations [here](https://github.com/KoljaB/RealtimeTTS/blob/master/tests/chinese_test.py) and [here](https://github.com/KoljaB/RealtimeTTS/blob/master/tests/pyqt6_speed_test_chinese.py).

Fallback engines

Fallback is now supported for the Azure, Coqui and System engines (ElevenLabs coming soon), enhancing reliability for real-time scenarios by switching to an alternate engine if one fails.

To use the fallback mechanism, **just submit a list of engines** to the TextToAudioStream constructor instead of a single engine. If synthesis with the first engine in the list throws an exception or returns a result indicating unsuccessful synthesis, the next engine in the list is tried.

For example:
```python
engines = [AzureEngine(azure_speech_key, azure_speech_region),
           coqui_engine,
           system_engine]
stream = TextToAudioStream(engines)
```


Example implementation [here](https://github.com/KoljaB/RealtimeTTS/blob/master/tests/fallback_test.py).

Audio file saving feature

Use the `output_wavfile` parameter of the `play` and `play_async` methods. This allows real-time synthesized audio to be saved as it is generated, enabling later playback of the live synthesis.

For example:
```python
filename = "synthesis_" + engine.engine_name
stream.play(output_wavfile=f"{filename}.wav")
```


See also the usage [here](https://github.com/KoljaB/RealtimeTTS/blob/master/tests/chinese_test.py).

0.2.7

- added `specific_model` parameter, which allows using XTTS model checkpoints like "2.0.2" or "2.0.1" (see the sketch after this list)
  - currently set to "2.0.2" as the default because 2.0.3 seems to perform worse
  - set to None if you want to always use the latest model
- added `local_models_path`
  - if not specified, `specific_model` checkpoints are loaded into a "models" directory inside the script directory
  - if no `specific_model` is set, Coqui's default model directory is used (users/start/appdata/local/tts under Windows)
- added `use_deepspeed` parameter in case anybody has it installed
- added `prepare_text_for_synthesis_callback` parameter, in case the default handler for preparing text for synthesis fails (maybe due to a language I am not familiar with) and you want to implement your own
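
A minimal sketch of how these constructor options might be combined (the values and the callback below are illustrative assumptions, not library defaults):

```python
from RealtimeTTS import TextToAudioStream, CoquiEngine

def my_prepare_text(text):
    # hypothetical custom preprocessing applied before synthesis
    return text.strip()

engine = CoquiEngine(
    specific_model="2.0.2",          # pin an XTTS checkpoint; None = always use the latest
    local_models_path="models",      # directory where pinned checkpoints are stored
    use_deepspeed=False,             # enable only if DeepSpeed is installed
    prepare_text_for_synthesis_callback=my_prepare_text)

stream = TextToAudioStream(engine)
stream.feed("Hello from a pinned XTTS checkpoint.")
stream.play()
```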

0.2.6

- metal shader support configurable via the CoquiEngine constructor (set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use MPS; see the sketch below)
- fix for a sentence-end punctuation handling bug
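
A minimal sketch of enabling the MPS fallback (assuming the variable should be set before the engine, and therefore PyTorch, is loaded):

```python
import os

# Assumption: set early, before the engine (and PyTorch) is loaded, so ops not
# yet implemented in MPS can fall back to the CPU.
os.environ["PYTORCH_ENABLE_MPS_FALLBACK"] = "1"

from RealtimeTTS import CoquiEngine  # imported after the variable is set

engine = CoquiEngine()
```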

0.2.4

- added more config options to the CoquiEngine constructor
- support for Metal shader GPU acceleration (you may need to set the environment variable `PYTORCH_ENABLE_MPS_FALLBACK=1` to use the CPU as a fallback for ops not yet implemented in MPS)
- default voice added in case no wav or speaker-latents JSON file is present

0.2.1

General stability improvements around Coqui XTTS 2.0 synthesis
