Gateway API
The Gateway API is a load balancing and resiliency solution for embeddings. It sits in front of Azure OpenAI, serving vectorization embedding requests with the correct model and automatically handling rate limits.
- Vectorization Text Embedding Profiles can be configured to use `GatewayTextEmbedding`, complementing the existing `SemanticKernelTextEmbedding`
- Vectorization with the Gateway API only supports asynchronous requests
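As an illustration of what "automatically handling rate limits" can mean in practice, the following is a minimal sketch of one common technique: retrying a throttled call with exponential backoff and honoring a server-supplied retry delay. It is not taken from the Gateway API source. The function `call_with_backoff`, its parameters, and the shape of the `send` callable are all hypothetical names chosen for this example.

```python
import time
from typing import Any, Callable, Optional, Tuple

# Hypothetical response shape: (status_code, retry_after_seconds_or_None, payload)
Response = Tuple[int, Optional[float], Any]


def call_with_backoff(
    send: Callable[[], Response],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call send() until it stops returning HTTP 429 or attempts run out.

    Assumption: send() wraps a single embedding request. Any status other
    than 429 is treated as final and its payload is returned to the caller.
    """
    for attempt in range(max_attempts):
        status, retry_after, payload = send()
        if status != 429:
            return payload
        if attempt == max_attempts - 1:
            break  # no point in sleeping after the last attempt
        # Prefer the delay suggested by the server; otherwise back off exponentially.
        delay = retry_after if retry_after is not None else base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError("rate limit persisted after all retries")
```

Injecting `sleep` keeps the sketch easy to exercise without real waiting. A production gateway would typically also spread requests across multiple model deployments, which this example does not attempt.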
Agent RBAC
Agent-level RBAC enables FoundationaLLM administrators to manage access to individual agents, helping protect organizations from data exfiltration. When a user creates an agent through the Management API, that user is automatically granted Owner access to it.
Vectorization Request Management Through the Management API
Users can submit and trigger Vectorization requests through the Management API, rather than the separate Vectorization API, improving consistency across the platform. Creating and triggering Vectorization requests are handled as two separate HTTP requests.
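The two-step flow can be sketched as follows. This is a minimal illustration that only builds the two request descriptions and does not send them. The route layout, the `/process` suffix, the HTTP methods, and the helper names `build_create_request` and `build_trigger_request` are assumptions made for this example; consult the Management API reference for the actual endpoints and payload schema.

```python
import json
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class HttpRequest:
    """Plain description of an HTTP request (nothing is sent in this sketch)."""
    method: str
    url: str
    body: Optional[str] = None


def _request_url(base_url: str, instance_id: str, request_name: str) -> str:
    # Hypothetical route layout, assumed for illustration only.
    return (
        f"{base_url}/instances/{instance_id}"
        f"/providers/FoundationaLLM.Vectorization"
        f"/vectorizationRequests/{request_name}"
    )


def build_create_request(
    base_url: str, instance_id: str, request_name: str, payload: Dict[str, Any]
) -> HttpRequest:
    """Step 1: create (submit) the vectorization request."""
    url = _request_url(base_url, instance_id, request_name)
    return HttpRequest("POST", url, json.dumps(payload))


def build_trigger_request(
    base_url: str, instance_id: str, request_name: str
) -> HttpRequest:
    """Step 2: trigger processing of the request created in step 1."""
    url = _request_url(base_url, instance_id, request_name) + "/process"
    return HttpRequest("POST", url)


if __name__ == "__main__":
    # Placeholder values; the payload fields are not the real schema.
    create = build_create_request(
        "https://management.example.invalid", "my-instance", "req-001", {"name": "req-001"}
    )
    trigger = build_trigger_request(
        "https://management.example.invalid", "my-instance", "req-001"
    )
    print(create.method, create.url)
    print(trigger.method, trigger.url)
```

Keeping creation and triggering separate means a client can create a request, inspect or store it, and start processing later with a second call.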
Citations Available in the Chat UI
Responses from Knowledge Management agents without Inline Contexts will include citations that indicate which documents from the vector store were used to answer the user's request.
Agent-to-Agent Conversations
Through the Semantic Kernel API, FoundationaLLM enables robust agent-to-agent interactions. Users can develop complex, multi-agent workflows that perform well across a variety of tasks.
End-to-End Testing Architecture
With the release of 0.7.0, FoundationaLLM has established an architecture for end-to-end (E2E) testing.
Improvements
- User portal session linking and loading improvements
- Documentation updates for ACA and AKS deployments
- Added a fix to ensure API keys are unique
- Restructured folders and relocated files
- Added support for prompt injection detection
- Added support for authorizing multiple resources in a single request
- Vectorization pipeline execution and state management improvements
- Added the ability to invoke external orchestration services
- Added the ability to create synchronous and asynchronous vectorization requests for OneLake
- Added support for GPT-3.5 1106 and GPT-4o