ClientAI

Latest version: v0.3.3


0.3.3

What's Changed
* Some improvements by igorbenav in https://github.com/igorbenav/clientai/pull/11

**Full Changelog**: https://github.com/igorbenav/clientai/compare/v0.3.2...v0.3.3

0.3.2

What's Changed
* groq-fix by igorbenav in https://github.com/igorbenav/clientai/pull/10


**Full Changelog**: https://github.com/igorbenav/clientai/compare/v0.3.1...v0.3.2

0.3.1

Groq Support

```python title="groq_setup.py" hl_lines="4"
from clientai import ClientAI

# Initialize the Groq client
client = ClientAI('groq', api_key="your-groq-api-key")

# Now you can use the client for text generation or chat
```

Groq-Specific Parameters in ClientAI

This guide covers the Groq-specific parameters that can be passed to ClientAI's `generate_text` and `chat` methods. These parameters are passed as additional keyword arguments to customize Groq's behavior.

generate_text Method

Basic Structure
```python
from clientai import ClientAI

client = ClientAI('groq', api_key="your-groq-api-key")
response = client.generate_text(
    prompt="Your prompt here",         # Required
    model="llama3-8b-8192",            # Required
    frequency_penalty=0.5,             # Groq-specific
    presence_penalty=0.2,              # Groq-specific
    max_tokens=100,                    # Groq-specific
    response_format={"type": "json"},  # Groq-specific
    seed=12345,                        # Groq-specific
    temperature=0.7,                   # Groq-specific
    top_p=0.9,                         # Groq-specific
    n=1,                               # Groq-specific
    stop=["END"],                      # Groq-specific
    stream=False,                      # Groq-specific
    stream_options=None,               # Groq-specific
    functions=None,                    # Groq-specific (Deprecated)
    function_call=None,                # Groq-specific (Deprecated)
    tools=None,                        # Groq-specific
    tool_choice=None,                  # Groq-specific
    parallel_tool_calls=True,          # Groq-specific
    user="user_123"                    # Groq-specific
)
```


Groq-Specific Parameters

`frequency_penalty: Optional[float]`
- Range: -2.0 to 2.0
- Default: 0
- Penalizes tokens based on their frequency in the text
```python
response = client.generate_text(
    prompt="Write a creative story",
    model="llama3-8b-8192",
    frequency_penalty=0.7  # Reduces repetition
)
```


`presence_penalty: Optional[float]`
- Range: -2.0 to 2.0
- Default: 0
- Penalizes tokens based on their presence in prior text
```python
response = client.generate_text(
    prompt="Write a varied story",
    model="llama3-8b-8192",
    presence_penalty=0.6  # Encourages topic diversity
)
```


`max_tokens: Optional[int]`
- Maximum tokens for completion
- Limited by model's context length
```python
response = client.generate_text(
    prompt="Write a summary",
    model="llama3-8b-8192",
    max_tokens=100
)
```


`response_format: Optional[Dict]`
- Controls output structure
- Requires explicit JSON instruction in prompt
```python
response = client.generate_text(
    prompt="List three colors in JSON",
    model="llama3-8b-8192",
    response_format={"type": "json_object"}
)
```


`seed: Optional[int]`
- For deterministic generation
```python
response = client.generate_text(
    prompt="Generate a random number",
    model="llama3-8b-8192",
    seed=12345
)
```


`temperature: Optional[float]`
- Range: 0 to 2
- Default: 1
- Controls randomness in output
```python
response = client.generate_text(
    prompt="Write creatively",
    model="llama3-8b-8192",
    temperature=0.7  # More creative output
)
```


`top_p: Optional[float]`
- Range: 0 to 1
- Default: 1
- Alternative to temperature, called nucleus sampling
```python
response = client.generate_text(
    prompt="Generate text",
    model="llama3-8b-8192",
    top_p=0.1  # Only consider top 10% probability tokens
)
```


`n: Optional[int]`
- Default: 1
- Number of completions to generate
- Note: Currently only n=1 is supported
```python
response = client.generate_text(
    prompt="Generate a story",
    model="llama3-8b-8192",
    n=1
)
```


`stop: Optional[Union[str, List[str]]]`
- Up to 4 sequences where generation stops
```python
response = client.generate_text(
    prompt="Write until you see END",
    model="llama3-8b-8192",
    stop=["END", "STOP"]  # Stops at either sequence
)
```


`stream: Optional[bool]`
- Default: False
- Enable token streaming
```python
for chunk in client.generate_text(
    prompt="Tell a story",
    model="llama3-8b-8192",
    stream=True
):
    print(chunk, end="", flush=True)
```


`stream_options: Optional[Dict]`
- Options for streaming responses
- Only used when stream=True
```python
response = client.generate_text(
    prompt="Long story",
    model="llama3-8b-8192",
    stream=True,
    stream_options={"chunk_size": 1024}
)
```


`user: Optional[str]`
- Unique identifier for end-user tracking
```python
response = client.generate_text(
    prompt="Hello",
    model="llama3-8b-8192",
    user="user_123"
)
```


chat Method

Basic Structure
```python
response = client.chat(
    model="llama3-8b-8192",            # Required
    messages=[...],                    # Required
    tools=[...],                       # Groq-specific
    tool_choice="auto",                # Groq-specific
    parallel_tool_calls=True,          # Groq-specific
    response_format={"type": "json"},  # Groq-specific
    temperature=0.7,                   # Groq-specific
    frequency_penalty=0.5,             # Groq-specific
    presence_penalty=0.2,              # Groq-specific
    max_tokens=100,                    # Groq-specific
    seed=12345,                        # Groq-specific
    stop=["END"],                      # Groq-specific
    stream=False,                      # Groq-specific
    stream_options=None,               # Groq-specific
    top_p=0.9,                         # Groq-specific
    n=1,                               # Groq-specific
    user="user_123"                    # Groq-specific
)
```


Groq-Specific Parameters

`tools: Optional[List[Dict]]`
- List of available tools (max 128)
```python
response = client.chat(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "What's the weather?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather data",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }]
)
```


`tool_choice: Optional[Union[str, Dict]]`
- Controls tool selection behavior
- Values: "none", "auto", "required"
```python
response = client.chat(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "Calculate something"}],
    tool_choice="auto"  # or "none" / "required"
)
```
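
Because the type is `Union[str, Dict]`, `tool_choice` can also force a specific tool. The snippet below is a hedged sketch that assumes the OpenAI-compatible dict form is accepted:

```python
# Sketch: force the model to call get_weather (assumes the
# OpenAI-compatible {"type": "function", ...} dict format)
response = client.chat(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "What's the weather in Paris?"}],
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather data",
            "parameters": {
                "type": "object",
                "properties": {"location": {"type": "string"}}
            }
        }
    }],
    tool_choice={"type": "function", "function": {"name": "get_weather"}}
)
```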


`parallel_tool_calls: Optional[bool]`
- Default: True
- Enable parallel function calling
```python
response = client.chat(
    model="llama3-70b-8192",
    messages=[{"role": "user", "content": "Multiple tasks"}],
    parallel_tool_calls=True
)
```


Complete Examples

Example 1: Structured Output with Tools
```python
response = client.chat(
    model="llama3-70b-8192",
    messages=[
        {"role": "system", "content": "You are a data assistant"},
        {"role": "user", "content": "Get weather for Paris"}
    ],
    response_format={"type": "json_object"},
    tools=[{
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get weather data",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {"type": "string"}
                }
            }
        }
    }],
    tool_choice="auto",
    temperature=0.7,
    max_tokens=200,
    seed=42
)
```


Example 2: Advanced Text Generation
```python
response = client.generate_text(
    prompt="Write a technical analysis",
    model="mixtral-8x7b-32768",
    max_tokens=500,
    frequency_penalty=0.7,
    presence_penalty=0.6,
    temperature=0.4,
    top_p=0.9,
    stop=["END", "CONCLUSION"],
    user="analyst_1",
    seed=42
)
```


Example 3: Streaming Generation
```python
for chunk in client.chat(
    model="llama3-8b-8192",
    messages=[{"role": "user", "content": "Explain quantum physics"}],
    stream=True,
    temperature=0.7,
    max_tokens=1000,
    stream_options={"chunk_size": 1024}
):
    print(chunk, end="", flush=True)
```


Parameter Validation Notes

1. Both `model` and `prompt`/`messages` are required
2. Model must be one of: "gemma-7b-it", "llama3-70b-8192", "llama3-8b-8192", "mixtral-8x7b-32768"
3. `n` parameter only supports value of 1
4. `stop` sequences limited to 4 maximum
5. Tool usage limited to 128 functions
6. `response_format` requires explicit JSON instruction in prompt
7. Parameters like `logprobs`, `logit_bias`, and `top_logprobs` are not yet supported
8. Deterministic generation with `seed` is best-effort
9. `functions` and `function_call` are deprecated in favor of `tools` and `tool_choice`
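
A few of these constraints can be checked client-side before sending a request. The helper below is a hypothetical illustration of notes 2-4 (it is not part of ClientAI):

```python
# Hypothetical pre-flight check mirroring notes 2-4 above; not a ClientAI API
SUPPORTED_MODELS = {
    "gemma-7b-it", "llama3-70b-8192", "llama3-8b-8192", "mixtral-8x7b-32768"
}

def validate_groq_kwargs(model, stop=None, n=1):
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"Unsupported model: {model}")
    if n != 1:
        raise ValueError("Only n=1 is currently supported")
    if isinstance(stop, str):
        stop = [stop]
    if stop is not None and len(stop) > 4:
        raise ValueError("At most 4 stop sequences are allowed")
```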

These parameters allow you to fully customize Groq's behavior while working with ClientAI's abstraction layer.

What's Changed
* documentation fixes by igorbenav in https://github.com/igorbenav/clientai/pull/7
* Groq support by igorbenav in https://github.com/igorbenav/clientai/pull/8
* mkdocs settings fix by igorbenav in https://github.com/igorbenav/clientai/pull/9


**Full Changelog**: https://github.com/igorbenav/clientai/compare/v0.3.0...v0.3.1

0.3.0

Ollama Manager Guide

Introduction

Ollama Manager provides a streamlined way to prototype and develop applications using Ollama's AI models. Instead of manually managing the Ollama server process, installing it as a service, or running it in a separate terminal, Ollama Manager handles the entire lifecycle programmatically.

**Key Benefits for Prototyping:**
- Start/stop Ollama server automatically within your Python code
- Configure resources dynamically based on your needs
- Handle multiple server instances for testing
- Automatic cleanup of resources
- Platform-independent operation

Quick Start

```python
from clientai import ClientAI
from clientai.ollama import OllamaManager

# Basic usage - server starts automatically and stops when done
with OllamaManager() as manager:
    # Create a client that connects to the managed server
    client = ClientAI('ollama', host="http://localhost:11434")

    # Use the client normally
    response = client.generate_text(
        "Explain quantum computing",
        model="llama2"
    )
    print(response)
# Server automatically stops when exiting the context
```



Installation

```bash
# Install with Ollama support
pip install "clientai[ollama]"

# Install with all providers
pip install "clientai[all]"
```


Core Concepts

Server Lifecycle Management

1. **Context Manager (Recommended)**
```python
with OllamaManager() as manager:
    # Server starts automatically
    client = ClientAI('ollama')
    # Use client...
# Server stops automatically
```


2. **Manual Management**
```python
manager = OllamaManager()
try:
    manager.start()
    client = ClientAI('ollama')
    # Use client...
finally:
    manager.stop()
```


Configuration Management

```python
from clientai.ollama import OllamaServerConfig

# Create custom configuration
config = OllamaServerConfig(
    host="127.0.0.1",
    port=11434,
    gpu_layers=35,
    memory_limit="8GiB"
)

# Use configuration with manager
with OllamaManager(config) as manager:
    client = ClientAI('ollama')
    # Use client...
```

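The "multiple server instances" benefit listed above can be sketched the same way. The example below is an assumption-laden sketch: it assumes each manager can be given its own `OllamaServerConfig` with a distinct port, and that each client's `host` points at the matching instance:

```python
# Hypothetical sketch: two managed servers on different ports
config_a = OllamaServerConfig(host="127.0.0.1", port=11434)
config_b = OllamaServerConfig(host="127.0.0.1", port=11435)

with OllamaManager(config_a), OllamaManager(config_b):
    client_a = ClientAI('ollama', host="http://127.0.0.1:11434")
    client_b = ClientAI('ollama', host="http://127.0.0.1:11435")
    # Compare responses from the two instances...
```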

For more information, see the [docs](https://igorbenav.github.io/clientai/usage/ollama_manager).

What's Changed
* Ollama manager added by igorbenav in https://github.com/igorbenav/clientai/pull/5


**Full Changelog**: https://github.com/igorbenav/clientai/compare/v0.2.1...v0.3.0

0.2.1

Extended Error Handling in ClientAI

Changes
* Added HTTP status codes to Ollama error handling by igorbenav in https://github.com/igorbenav/clientai/pull/2

Ollama errors now include standard HTTP status codes for better error handling and compatibility:
- `AuthenticationError`: 401
- `RateLimitError`: 429
- `ModelError`: 404
- `InvalidRequestError`: 400
- `TimeoutError`: 408
- `APIError`: 500

This improves error handling consistency across all providers and enables better error status tracking.

Example:
```python
try:
    response = client.generate_text("Hello", model="llama2")
except RateLimitError as e:
    print(e)              # Will now show: "[429] Rate limit exceeded"
    print(e.status_code)  # Will return: 429
```


What's Changed
* advanced docs by igorbenav in https://github.com/igorbenav/clientai/pull/2
* ollama exceptions now with status_codes by igorbenav in https://github.com/igorbenav/clientai/pull/3
* project version bumped to 0.2.1 by igorbenav in https://github.com/igorbenav/clientai/pull/4


**Full Changelog**: https://github.com/igorbenav/clientai/compare/v0.2.0...v0.2.1

0.2.0

Error Handling in ClientAI

ClientAI now provides an error handling system that unifies exceptions across different AI providers. This guide covers how to handle potential errors when using ClientAI.

Table of Contents

1. [Exception Hierarchy](#exception-hierarchy)
2. [Handling Errors](#handling-errors)
3. [Provider-Specific Error Mapping](#provider-specific-error-mapping)
4. [Best Practices](#best-practices)

Exception Hierarchy

ClientAI uses a custom exception hierarchy to provide consistent error handling across different AI providers:

```python
from clientai.exceptions import (
    ClientAIError,
    AuthenticationError,
    RateLimitError,
    InvalidRequestError,
    ModelError,
    TimeoutError,
    APIError
)
```


- `ClientAIError`: Base exception class for all ClientAI errors.
- `AuthenticationError`: Raised when there's an authentication problem with the AI provider.
- `RateLimitError`: Raised when the AI provider's rate limit is exceeded.
- `InvalidRequestError`: Raised when the request to the AI provider is invalid.
- `ModelError`: Raised when there's an issue with the specified model.
- `TimeoutError`: Raised when a request to the AI provider times out.
- `APIError`: Raised when there's an API-related error from the AI provider.

Handling Errors

Here's how to handle potential errors when using ClientAI:

```python
from clientai import ClientAI
from clientai.exceptions import (
    ClientAIError,
    AuthenticationError,
    RateLimitError,
    InvalidRequestError,
    ModelError,
    TimeoutError,
    APIError
)

client = ClientAI('openai', api_key="your-openai-api-key")

try:
    response = client.generate_text("Tell me a joke", model="gpt-3.5-turbo")
    print(f"Generated text: {response}")
except AuthenticationError as e:
    print(f"Authentication error: {e}")
except RateLimitError as e:
    print(f"Rate limit exceeded: {e}")
except InvalidRequestError as e:
    print(f"Invalid request: {e}")
except ModelError as e:
    print(f"Model error: {e}")
except TimeoutError as e:
    print(f"Request timed out: {e}")
except APIError as e:
    print(f"API error: {e}")
except ClientAIError as e:
    print(f"An unexpected ClientAI error occurred: {e}")
```


Provider-Specific Error Mapping

ClientAI maps provider-specific errors to its custom exception hierarchy. For example:

OpenAI

```python
def _map_exception_to_clientai_error(self, e: Exception) -> None:
    error_message = str(e)
    status_code = getattr(e, 'status_code', None)

    if isinstance(e, OpenAIAuthenticationError) or "incorrect api key" in error_message.lower():
        raise AuthenticationError(error_message, status_code, original_error=e)
    elif status_code == 429 or "rate limit" in error_message.lower():
        raise RateLimitError(error_message, status_code, original_error=e)
    elif status_code == 404 or "not found" in error_message.lower():
        raise ModelError(error_message, status_code, original_error=e)
    elif status_code == 400 or "invalid" in error_message.lower():
        raise InvalidRequestError(error_message, status_code, original_error=e)
    elif status_code == 408 or "timeout" in error_message.lower():
        raise TimeoutError(error_message, status_code, original_error=e)
    elif status_code and status_code >= 500:
        raise APIError(error_message, status_code, original_error=e)

    raise ClientAIError(error_message, status_code, original_error=e)
```


Replicate

```python
def _map_exception_to_clientai_error(self, e: Exception, status_code: int = None) -> ClientAIError:
    error_message = str(e)
    status_code = status_code or getattr(e, 'status_code', None)

    if "authentication" in error_message.lower() or "unauthorized" in error_message.lower():
        return AuthenticationError(error_message, status_code, original_error=e)
    elif "rate limit" in error_message.lower():
        return RateLimitError(error_message, status_code, original_error=e)
    elif "not found" in error_message.lower():
        return ModelError(error_message, status_code, original_error=e)
    elif "invalid" in error_message.lower():
        return InvalidRequestError(error_message, status_code, original_error=e)
    elif "timeout" in error_message.lower() or status_code == 408:
        return TimeoutError(error_message, status_code, original_error=e)
    elif status_code == 400:
        return InvalidRequestError(error_message, status_code, original_error=e)
    else:
        return APIError(error_message, status_code, original_error=e)
```


Best Practices

1. **Specific Exception Handling**: Catch specific exceptions when you need to handle them differently.
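
For example, you might fail fast on `AuthenticationError` while treating `RateLimitError` as retryable (illustrative sketch using the `client` from above):

```python
try:
    response = client.generate_text("Tell me a joke", model="gpt-3.5-turbo")
except AuthenticationError:
    raise  # Misconfigured credentials: surface immediately
except RateLimitError:
    response = None  # Transient: back off and retry (see the retry example below)
```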

2. **Logging**: Log errors for debugging and monitoring purposes.

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

try:
    response = client.generate_text("Tell me a joke", model="gpt-3.5-turbo")
except ClientAIError as e:
    logger.error(f"An error occurred: {e}", exc_info=True)
```


3. **Retry Logic**: Implement retry logic for transient errors like rate limiting.

```python
import time
from clientai.exceptions import RateLimitError

def retry_generate(prompt, model, max_retries=3, delay=1):
    for attempt in range(max_retries):
        try:
            return client.generate_text(prompt, model=model)
        except RateLimitError as e:
            if attempt == max_retries - 1:
                raise
            wait_time = e.retry_after if hasattr(e, 'retry_after') else delay * (2 ** attempt)
            logger.warning(f"Rate limit reached. Waiting for {wait_time} seconds...")
            time.sleep(wait_time)
```

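Calling the helper is then a drop-in replacement for a direct `generate_text` call:

```python
response = retry_generate("Tell me a joke", model="gpt-3.5-turbo")
```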

4. **Graceful Degradation**: Implement fallback options when errors occur.

```python
def generate_with_fallback(prompt, primary_client, fallback_client):
    try:
        return primary_client.generate_text(prompt, model="gpt-3.5-turbo")
    except ClientAIError as e:
        logger.warning(f"Primary client failed: {e}. Falling back to secondary client.")
        return fallback_client.generate_text(prompt, model="llama-2-70b-chat")
```


By following these practices and utilizing ClientAI's unified error handling system, you can create more robust and maintainable applications that gracefully handle errors across different AI providers.

See this in the [docs](https://igorbenav.github.io/clientai/usage/error_handling/).

What's Changed
* unified error handling by igorbenav in https://github.com/igorbenav/clientai/pull/1

New Contributors
* igorbenav made their first contribution in https://github.com/igorbenav/clientai/pull/1

**Full Changelog**: https://github.com/igorbenav/clientai/compare/v0.1.2...v0.2.0
