RealtimeSTT

Latest version: v0.3.93

Page 2 of 5

0.3.6

- more logging for client and server
  Additional parameters for the server:
  - --use_extended_logging, writes extensive log messages for the recording worker, which processes the audio chunks
  - --debug, enables debug logging for detailed server operations
  - --logchunks, enables logging of incoming audio chunks (periods)
  - --writechunks, saves received audio chunks to a WAV file
  Additional parameters for the client:
  - --debug, enables debug logging for detailed client operations
  - --writechunks, saves recorded audio chunks to a WAV file
- more logging for AudioToTextRecorder when called with use_extended_logging=True
- new init_realtime_after_seconds parameter for AudioToTextRecorder to fine-tune the value, which defaults to 0.2s

0.3.5

- some upgrades and bugfixes for the CLI and server (Linux support)

0.3.4

- some upgrades and bugfixes for server
- v0.3.2 yanked

0.3.2

New Features:
- server/stt_server.py and the AudioToTextRecorderClient class now support wake words. All parameters and callbacks of AudioToTextRecorder should now be implemented in AudioToTextRecorderClient; please open an issue if any functionality is missing.
- updated microphone reconnect

0.3.1

New Features:
- **AudioToTextRecorderClient class**: automatically starts a server if none is running and connects to it. The class shares the same interface as AudioToTextRecorder, making it easy to upgrade or switch between the two. (Work in progress: most, but not all, parameters and callbacks of AudioToTextRecorder are already implemented in AudioToTextRecorderClient. The server also cannot handle concurrent (parallel) requests yet.)
- **Reworked CLI**: "stt-server" starts the server and "stt" starts the client; see the "server" folder for more information.

- fixed 127
- integrated PR 131

0.3.0

New Features:
- **Soundcard Compatibility**: Automatically adjusts from 48kHz downwards if 16kHz is unsupported, resampling to 16kHz.
- **Early Transcription**: Added `early_transcription_on_silence` parameter to enable transcription during speech pauses, reducing overall latency.
- **Transcription Process Optimizations**: The transcription process was moved into a separate class, and pipe communication was optimized for stability and speed, leading to fewer occurrences of audio chunks being discarded due to queue size overflows.
- **Immediate Listen State**: Fixed an issue so that the system immediately returns to the listening state after recording stops, preventing lost chunks.
- **Improved Logging**: Always logs debug messages to a file, even if not explicitly configured. Option to disable logging with `no_log_file` parameter.
- **Transcription Time Display**: New `print_transcription_time` parameter to show model processing time.
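
The soundcard fallback described above can be pictured roughly as follows. This is a minimal sketch and not the library's actual code: the function names, the candidate rate list, and the `is_supported` callback are all hypothetical, and the resampler uses plain linear interpolation without the low-pass filtering a production resampler would normally apply.

```python
from typing import Callable, Iterable, List, Optional

TARGET_RATE = 16_000
# Hypothetical candidate list: common device rates, tried from highest to lowest.
FALLBACK_RATES = (48_000, 44_100, 32_000, 22_050, 8_000)


def pick_sample_rate(
    is_supported: Callable[[int], bool],
    target: int = TARGET_RATE,
    fallbacks: Iterable[int] = FALLBACK_RATES,
) -> Optional[int]:
    """Return the target rate if the device supports it, else the highest
    supported fallback rate, else None."""
    if is_supported(target):
        return target
    for rate in sorted(fallbacks, reverse=True):
        if is_supported(rate):
            return rate
    return None


def resample_linear(
    samples: List[float], src_rate: int, dst_rate: int = TARGET_RATE
) -> List[float]:
    """Naive linear-interpolation resampler (illustration only, no anti-aliasing)."""
    if src_rate == dst_rate or not samples:
        return list(samples)
    out_len = max(1, round(len(samples) * dst_rate / src_rate))
    step = src_rate / dst_rate
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out
```

In this sketch, a device that rejects 16kHz but accepts 44.1kHz would be opened at 44.1kHz, and each captured block would be passed through `resample_linear` before it reaches the transcription stage.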

Bugfixes:
- **Chunk Handling**: Enhanced chunk handling with the new `allowed_latency_limit` parameter, reducing dropped data during high-latency scenarios.
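
One way to read `allowed_latency_limit` is as a cap on how many unprocessed audio chunks may queue up before the oldest ones are discarded. That reading is an assumption, and the sketch below is not the library's implementation: the `ChunkBuffer` class, its methods, and the limit value used are hypothetical.

```python
from collections import deque
from typing import Deque, List


class ChunkBuffer:
    """Holds audio chunks awaiting processing and trims the backlog past a limit."""

    def __init__(self, allowed_latency_limit: int = 100) -> None:
        # Assumed meaning: maximum number of unprocessed chunks kept in the queue.
        self.allowed_latency_limit = allowed_latency_limit
        self.chunks: Deque[bytes] = deque()
        self.dropped = 0

    def put(self, chunk: bytes) -> None:
        """Append a chunk, discarding the oldest ones if the backlog is too long."""
        self.chunks.append(chunk)
        while len(self.chunks) > self.allowed_latency_limit:
            self.chunks.popleft()
            self.dropped += 1

    def get_all(self) -> List[bytes]:
        """Drain and return every chunk currently waiting."""
        items = list(self.chunks)
        self.chunks.clear()
        return items
```

Under this reading, raising the limit lets a slow consumer catch up without losing audio, at the cost of processing chunks that are further behind real time.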

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.