Livekit-agents

Latest version: v0.11.3

Page 4 of 4

0.8.3

There were breaking changes from v0.7.x to v0.8.x. See the full 0.8.0 changelog [here](https://github.com/livekit/agents/releases/tag/livekit-agents%400.8.0)

Patch Changes

- voiceassistant: run function calls sequentially - [554](https://github.com/livekit/agents/pull/554) ([theomonnom](https://github.com/theomonnom))

- configure plugins loggers & more debug logs on the voiceassistant - [555](https://github.com/livekit/agents/pull/555) ([theomonnom](https://github.com/theomonnom))

- warn when there is still no room connection 10 seconds after job_entry was called - [558](https://github.com/livekit/agents/pull/558) ([theomonnom](https://github.com/theomonnom))

- deepgram: reduce chunk size to 100ms - [561](https://github.com/livekit/agents/pull/561) ([theomonnom](https://github.com/theomonnom))

- voiceassistant: cleanup validation behaviour 545 - [553](https://github.com/livekit/agents/pull/553) ([theomonnom](https://github.com/theomonnom))

- voiceassistant: commit user question directly when allow_interruptions=False - [547](https://github.com/livekit/agents/pull/547) ([theomonnom](https://github.com/theomonnom))

- ipc: increase high ping threshold - [556](https://github.com/livekit/agents/pull/556) ([theomonnom](https://github.com/theomonnom))

- voiceassistant: interrupt on final transcript - [546](https://github.com/livekit/agents/pull/546) ([theomonnom](https://github.com/theomonnom))

- voiceassistant: tweaks & fix speech being removed too soon from the queue - [560](https://github.com/livekit/agents/pull/560) ([theomonnom](https://github.com/theomonnom))

- voiceassistant: fix duplicate answers - [548](https://github.com/livekit/agents/pull/548) ([theomonnom](https://github.com/theomonnom))

- reduce the default load threshold to a more appropriate default - [559](https://github.com/livekit/agents/pull/559) ([theomonnom](https://github.com/theomonnom))

0.8.2

There were breaking changes from v0.7.x to v0.8.x. See the full 0.8.0 changelog [here](https://github.com/livekit/agents/releases/tag/livekit-agents%400.8.0)

Patch Changes

- fix: remove unnecessary async function - [540](https://github.com/livekit/agents/pull/540) ([Nabil372](https://github.com/Nabil372))


livekit-plugins-silero 0.6.1
Patch Changes

- fix end_input not flushing & unhandled flush messages - [528](https://github.com/livekit/agents/pull/528) ([theomonnom](https://github.com/theomonnom))


livekit-plugins-openai 0.7.1
Patch Changes

- set timeout to 5 seconds - [524](https://github.com/livekit/agents/pull/524) ([nbsp](https://github.com/nbsp))


livekit-plugins-google 0.6.1
Patch Changes

- fix end_input not flushing & unhandled flush messages - [528](https://github.com/livekit/agents/pull/528) ([theomonnom](https://github.com/theomonnom))


livekit-plugins-elevenlabs 0.7.1
Patch Changes

- fix end_input not flushing & unhandled flush messages - [528](https://github.com/livekit/agents/pull/528) ([theomonnom](https://github.com/theomonnom))


livekit-plugins-deepgram 0.6.1
Patch Changes

- fix end_input not flushing & unhandled flush messages - [528](https://github.com/livekit/agents/pull/528) ([theomonnom](https://github.com/theomonnom))


livekit-plugins-azure 0.3.1
Patch Changes

- fix end_input not flushing & unhandled flush messages - [528](https://github.com/livekit/agents/pull/528) ([theomonnom](https://github.com/theomonnom))

0.8.1

There were breaking changes from v0.7.x to v0.8.x. Please reference the full 0.8.0 changelog [here](https://github.com/livekit/agents/releases/tag/livekit-agents%400.8.0)

Patch Changes

- update livekit-rtc to v0.12.0 - [535](https://github.com/livekit/agents/pull/535) ([theomonnom](https://github.com/theomonnom))

- automatically create stt.StreamAdapter when provided stt doesn't support streaming - [536](https://github.com/livekit/agents/pull/536) ([theomonnom](https://github.com/theomonnom))

- update examples to the latest API & export AutoSubscribe - [534](https://github.com/livekit/agents/pull/534) ([theomonnom](https://github.com/theomonnom))

- fix end_input not flushing & unhandled flush messages - [528](https://github.com/livekit/agents/pull/528) ([theomonnom](https://github.com/theomonnom))


livekit-plugins-silero 0.6.0
Minor Changes

- dev prerelease - [435](https://github.com/livekit/agents/pull/435) ([theomonnom](https://github.com/theomonnom))

Patch Changes

- test release - [435](https://github.com/livekit/agents/pull/435) ([theomonnom](https://github.com/theomonnom))

- pull: '--rebase --autostash ...' - [435](https://github.com/livekit/agents/pull/435) ([theomonnom](https://github.com/theomonnom))

- Default loglevel to warn - [472](https://github.com/livekit/agents/pull/472) ([lukasIO](https://github.com/lukasIO))

- bump versions to update dependencies - [510](https://github.com/livekit/agents/pull/510) ([theomonnom](https://github.com/theomonnom))

- test release - [435](https://github.com/livekit/agents/pull/435) ([theomonnom](https://github.com/theomonnom))

- fix changesets release CI - [435](https://github.com/livekit/agents/pull/435) ([theomonnom](https://github.com/theomonnom))

0.8.0

v0.8.0 is our biggest release yet, featuring significant reliability improvements to VoiceAssistant. This update includes a few breaking API changes that will impact the way you build your agents. We strive to minimize breaking changes and will stabilize the API as we approach version 1.0.

Migrating to v0.8.0 (Breaking Changes)
<details>
<summary><h2>Job and Worker</h2></summary>

entrypoint moved from req.accept() to WorkerOptions

Previously the job entrypoint was in the req.accept() method call. Now the job entrypoint has been moved into WorkerOptions.

namespace removed

The WorkerOptions namespace field has been removed and will be replaced in the future.

explicit connection to the room

You now need to call ctx.connect() to initiate the connection to the room. This allows for pre-connect setup (such as callback registrations) to avoid race conditions.

The following shows a minimal_worker.py example:

```python
from livekit.agents import JobContext, JobRequest, WorkerOptions, cli

async def job_entrypoint(ctx: JobContext):
    await ctx.connect()
    ...

if __name__ == "__main__":
    cli.run_app(
        WorkerOptions(entrypoint_fnc=job_entrypoint)
    )
```
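
To see why an explicit connect step helps with race conditions, here is a small self-contained sketch. It does not use the livekit API: `FakeRoom`, its `on()` and `connect()` methods, and the `participant_joined` event name are hypothetical stand-ins. It only illustrates that a handler registered before connecting sees events emitted during the connection, while one registered afterwards can miss them.

```python
import asyncio
from typing import Callable


class FakeRoom:
    """Hypothetical stand-in for a room; NOT the livekit API."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Callable[[str], None]]] = {}

    def on(self, event: str, handler: Callable[[str], None]) -> None:
        self._handlers.setdefault(event, []).append(handler)

    async def connect(self) -> None:
        # Events fire as part of connecting; handlers added later miss them.
        await asyncio.sleep(0)
        for handler in self._handlers.get("participant_joined", []):
            handler("alice")


async def register_then_connect() -> list[str]:
    seen: list[str] = []
    room = FakeRoom()
    room.on("participant_joined", seen.append)  # registered before connecting
    await room.connect()
    return seen


async def connect_then_register() -> list[str]:
    seen: list[str] = []
    room = FakeRoom()
    await room.connect()
    room.on("participant_joined", seen.append)  # too late: event already fired
    return seen
```

In this toy model, `register_then_connect()` observes the join event and `connect_then_register()` does not, which is the kind of race the explicit connect step is meant to avoid.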

</details>

<details>
<summary><h2>LLM</h2></summary>

> 💡 These changes may not be relevant to users of the VoiceAssistant class.

The LLM class has been restructured to enhance ergonomics and improve the function calling experience.

Function/tool calling

Function calling has gotten a complete overhaul in v0.8.0. Most of the changes are additive and can be found in the New Features section.

The primary breaking change is that function calls are now **NOT** automatically invoked when iterating the LLM stream. `LLMStream.execute_functions` needs to be called instead.
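
The following toy sketch illustrates the idea of deferred execution only. `ToyStream` is a hypothetical stand-in, not the livekit `LLMStream`; the real signature and return type of `execute_functions` are not shown in these notes. Iterating the stream records requested calls without running them, and nothing is invoked until `execute_functions()` is called.

```python
import asyncio
from typing import Any, Callable


class ToyStream:
    """Hypothetical stand-in for an LLM stream; NOT the livekit LLMStream."""

    def __init__(self, chunks: list[Any], functions: dict[str, Callable[..., Any]]) -> None:
        self._chunks = chunks
        self._functions = functions
        self._pending: list[tuple[str, dict[str, Any]]] = []

    async def __aiter__(self):
        for chunk in self._chunks:
            if isinstance(chunk, tuple):
                # A function call requested by the model: record it, do not run it.
                self._pending.append(chunk)
            else:
                yield chunk

    def execute_functions(self) -> list[Any]:
        # Run the recorded calls only when the caller asks for it.
        results = [self._functions[name](**kwargs) for name, kwargs in self._pending]
        self._pending.clear()
        return results


async def demo() -> tuple[list[str], list[str], list[Any]]:
    calls: list[str] = []

    def get_weather(city: str) -> str:
        calls.append(city)
        return f"sunny in {city}"

    stream = ToyStream(
        ["Let me ", "check. ", ("get_weather", {"city": "Paris"})],
        {"get_weather": get_weather},
    )
    text = [chunk async for chunk in stream]
    calls_after_iteration = list(calls)   # still empty: nothing ran yet
    results = stream.execute_functions()  # now the call is invoked
    return text, calls_after_iteration, results
```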

TODO: insert code snippet showing some ai_callable fncs

`LLM.chat()` is no longer an async method

Previously, LLM.chat() was an async method that returned an LLMStream (which itself was an AsyncIterable).

We found it easier and less confusing for LLM.chat() to be synchronous, while still returning the same AsyncIterable LLMStream.
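
As a rough illustration of that shape, the sketch below uses hypothetical `ToyLLM` and `ToyLLMStream` classes (not the livekit classes, and with a made-up `prompt` parameter). The `chat()` method is a plain function that returns immediately, and the caller only awaits while iterating the result.

```python
import asyncio
from typing import AsyncIterator


class ToyLLMStream:
    """Hypothetical async-iterable stream; NOT the livekit LLMStream."""

    def __init__(self, words: list[str]) -> None:
        self._words = words

    async def __aiter__(self) -> AsyncIterator[str]:
        for word in self._words:
            await asyncio.sleep(0)  # pretend to wait for the next chunk
            yield word


class ToyLLM:
    def chat(self, prompt: str) -> ToyLLMStream:
        # Plain (non-async) method: the stream is returned without being awaited.
        return ToyLLMStream(prompt.split())


async def collect(prompt: str) -> list[str]:
    stream = ToyLLM().chat(prompt)  # no `await` here
    return [word async for word in stream]
```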

LLM.chat ‘history’ has been renamed to ‘chat_ctx’

This rename improves consistency and reduces confusion.

TODO: insert code snippet

</details>

<details>
<summary><h2>STT</h2></summary>

> 💡 These changes may not be relevant to users of the VoiceAssistant class.

SpeechStream.flush()

Previously, to tell an STT provider that you had sent enough input to generate a response, you could call push_frame(None) to coax the STT into producing one.

In v0.8.0 that API has been removed and replaced with flush().

SpeechStream.end_input()

`end_input` signals to the STT provider that the input is complete and no additional input will follow. Previously, this was done using aclose(wait=True).

SpeechStream.aclose()

The “wait” arg of aclose has been removed in favor of SpeechStream.end_input (see above). Now, if you call SpeechStream.aclose() without first calling SpeechStream.end_input(), the request is cancelled.

```python
stt_stream = my_stt_instance.stream()
async for ev in audio_stream:
    stt_stream.push_frame(ev.frame)
    # optionally flush when enough frames have been pushed
    stt_stream.flush()

stt_stream.end_input()
await stt_stream.aclose()
```
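
The lifecycle described above can be modelled with a toy class. `ToyInputStream` is hypothetical and only mimics the semantics stated in these notes: `flush()` asks for a result from the input sent so far, `end_input()` marks the input as complete, and `aclose()` without a prior `end_input()` cancels whatever is still pending. It says nothing about how the real streams are implemented.

```python
import asyncio


class ToyInputStream:
    """Hypothetical model of the flush/end_input/aclose semantics; NOT livekit code."""

    def __init__(self) -> None:
        self._buffer: list[str] = []
        self._input_ended = False
        self.results: list[list[str]] = []
        self.cancelled = False

    def push_frame(self, frame: str) -> None:
        self._buffer.append(frame)

    def flush(self) -> None:
        # Emit a result for everything pushed since the last flush.
        if self._buffer:
            self.results.append(self._buffer)
            self._buffer = []

    def end_input(self) -> None:
        # No more input will follow: process what is left.
        self.flush()
        self._input_ended = True

    async def aclose(self) -> None:
        if not self._input_ended:
            # Closing without end_input() drops pending input (cancellation).
            self.cancelled = True
            self._buffer = []


async def run(close_gracefully: bool) -> ToyInputStream:
    stream = ToyInputStream()
    stream.push_frame("f1")
    stream.push_frame("f2")
    stream.flush()
    stream.push_frame("f3")
    if close_gracefully:
        stream.end_input()
    await stream.aclose()
    return stream
```

With `run(True)` the trailing frame is processed before closing. With `run(False)` it is discarded and the stream is marked as cancelled.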


</details>

<details>
<summary><h2>TTS</h2></summary>

> 💡 These changes may not be relevant to users of the VoiceAssistant class.

SynthesizedAudio changed and SynthesisEvent removed

Most of the fields of the SynthesizedAudio dataclass have been changed:

```python
# New SynthesizedAudio dataclass
@dataclass
class SynthesizedAudio:
    request_id: str
    """Request ID (one segment could be made up of multiple requests)"""
    segment_id: str
    """Segment ID, each segment is separated by a flush"""
    frame: rtc.AudioFrame
    """Synthesized audio frame"""
    delta_text: str = ""
    """Current segment of the synthesized audio"""

# Old SynthesizedAudio dataclass
@dataclass
class SynthesizedAudio:
    text: str
    data: rtc.AudioFrame
```

SynthesisEvent has been removed entirely. All occurrences of it have been replaced with SynthesizedAudio.

SynthesizeStream.flush()

Similar to the STT changes, this coaxes the TTS provider into generating a response. The SynthesizedAudio response will have a new segment_id after calls to flush().

SynthesizeStream.end_input()

Similar to the STT changes, this replaces aclose(wait=True).

SynthesizeStream.aclose()

Similar to the STT changes, the wait arg has been removed.

```python
tts_stream = my_tts_instance.stream()
tts_stream.push_text("This is the first sentence")
tts_stream.flush()
tts_stream.push_text("This is the second sentence")
tts_stream.end_input()
await tts_stream.aclose()
```
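
One way a consumer might use segment_id is to group output by segment. The sketch below is self-contained and rests on assumptions: `ToyAudio` mirrors the field names of the new SynthesizedAudio dataclass but uses `bytes` in place of `rtc.AudioFrame`, and it assumes `delta_text` carries incremental text that can be concatenated per segment.

```python
from dataclasses import dataclass


@dataclass
class ToyAudio:
    """Toy mirror of the new SynthesizedAudio fields; NOT the livekit class."""

    request_id: str
    segment_id: str
    frame: bytes  # stand-in for rtc.AudioFrame
    delta_text: str = ""


def text_by_segment(events: list[ToyAudio]) -> dict[str, str]:
    """Concatenate delta_text per segment_id (assumes delta_text is incremental)."""
    out: dict[str, str] = {}
    for ev in events:
        out[ev.segment_id] = out.get(ev.segment_id, "") + ev.delta_text
    return out


events = [
    ToyAudio("r1", "s1", b"\x00", "This is "),
    ToyAudio("r1", "s1", b"\x00", "the first sentence"),
    ToyAudio("r2", "s2", b"\x00", "This is the second sentence"),
]
segments = text_by_segment(events)
```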


</details>

<details>
<summary><h2>VAD</h2></summary>

flush(), end_input(), aclose()

The same changes made to STT and TTS have also been made to VAD

```python
vad_stream = my_vad_instance.stream()
async for ev in audio_stream:
    vad_stream.push_frame(ev.frame)
    # optionally flush when enough frames have been pushed
    vad_stream.flush()

vad_stream.end_input()
await vad_stream.aclose()
```

</details>

<details>
<summary><h2>VoiceAssistant</h2></summary>

Much of the VoiceAssistant API remains unchanged, despite significant improvements to functionality and internals. However, there have been changes to the configuration.

Initialization args

- Removed
  - base_volume
  - debug
  - sentence_tokenizer, word_tokenizer, hyphenate_word
- Changed
  - transcription related options now all fall into the “transcription” arg

```python
class VoiceAssistant(utils.EventEmitter[EventTypes]):
    def __init__(
        self,
        *,
        vad: vad.VAD,
        stt: stt.STT,
        llm: LLM,
        tts: tts.TTS,
        chat_ctx: ChatContext | None = None,
        fnc_ctx: FunctionContext | None = None,
        allow_interruptions: bool = True,
        interrupt_speech_duration: float = 0.6,
        interrupt_min_words: int = 0,
        preemptive_synthesis: bool = True,
        transcription: AssistantTranscriptionOptions = AssistantTranscriptionOptions(),
        will_synthesize_assistant_reply: WillSynthesizeAssistantReply = _default_will_synthesize_assistant_reply,
        plotting: bool = False,
        loop: asyncio.AbstractEventLoop | None = None,
    ) -> None:
        ...
```

</details>

New features

Job and Worker

- New prewarm_fnc in WorkerOptions that can be used to set up agent subprocesses before the agent joins the room. Useful for things like loading model weights.
- New num_idle_processes in WorkerOptions for keeping a process pool available for subsequent agents. This improves the latency of agents joining rooms and being ready to participate.
- Health server listens on 0.0.0.0 by default now instead of localhost

LLM

- You can now add AI functions at runtime.
- AI functions can now return values and throw exceptions. The return values and exception are automatically added to the chat_ctx so the LLM is aware of them.
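
A minimal sketch of the second point, under stated assumptions: `run_tool` is a hypothetical helper, and the chat context is modelled as a plain list of dicts with made-up `role`, `name`, and `content` keys rather than the livekit ChatContext. It shows both the return value and a raised exception ending up where the model can see them.

```python
from typing import Any, Callable


def run_tool(
    chat_ctx: list[dict[str, str]],
    name: str,
    fn: Callable[..., Any],
    **kwargs: Any,
) -> None:
    """Call `fn` and append its return value, or its error, to the toy chat context."""
    try:
        content = str(fn(**kwargs))
    except Exception as exc:  # report the failure to the model instead of crashing
        content = f"error: {exc}"
    chat_ctx.append({"role": "tool", "name": name, "content": content})


def get_weather(city: str) -> str:
    # Hypothetical AI function used only for this illustration.
    if city != "Paris":
        raise ValueError(f"unknown city: {city}")
    return "sunny"


chat_ctx: list[dict[str, str]] = []
run_tool(chat_ctx, "get_weather", get_weather, city="Paris")
run_tool(chat_ctx, "get_weather", get_weather, city="Atlantis")
```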

VAD

- livekit-plugins-silero
  - The ONNX runtime is now used directly, which removes the PyTorch dependency
  - Model weights are included in the Python package itself, so you no longer need to download model weights as a build step
  - The model has been updated to the latest silero model (V5) which has improved [accuracy](https://github.com/snakers4/silero-vad/issues/2#issuecomment-2195433115)
  - Logic fixes to inference + hidden state which improve accuracy

TTS

- A new Cartesia plugin has been introduced
- SynthesizeStream now has flush() and end_input() for better control over synchronizing text input with audio output
- SynthesizedAudio now has a segment_id for more granularity around what audio corresponds to what input text

VoiceAssistant

- Big improvements and bug fixes to interrupt logic
- Bug fixes for duplicated responses
- Bug fixes for stuck responses

RAG

- New livekit-plugins-rag package to help with RAG related tasks
  - Index builder for creating a searchable index
  - Nearest neighbor search on indexes, based on Spotify's Annoy library

New Contributors

Thanks to Ocupe, mattherzog, lukasIO, seanmuirhead, PaulLockett, CalinR, cs50victor, vanics, brightsparc, ty-elastic, naman-scogo, eltociear, hauntsaninja, minhpq331, and nbsp for their first contributions to the project!

**Full Detailed Changes**: [https://github.com/livekit/agents/compare/3c340eabfc6fc42bcd88fb08c90c101463cca8f5..596ac9042b3ecbe40c270d035d5da8f25474e569?diff=split&w=](https://github.com/livekit/agents/compare/3c340eabfc6fc42bcd88fb08c90c101463cca8f5..596ac9042b3ecbe40c270d035d5da8f25474e569?diff=split&w=)
