## New feature
This release comes with a new feature: you can now pass `format=''` to the `chat_completion` of `OllamaInstructorClient` and `OllamaInstructorAsyncClient`. This allows LLMs and vision models to reason before responding with the JSON output.
### How is this done?
By setting `format=''` the LLM is not forced to respond with JSON. The instructions are therefore crucial to get JSON anyway. But with `format=''` the capability of LLMs to reason step by step before answering can be used. When you set `format=''` the LLM is instructed differently (have a look at the code in the file `prompt_manager.py`). After the step-by-step reasoning, the LLM is instructed to respond within a code block starting with `` ```json `` and ending with `` ``` ``. The content of this code block is extracted and validated against the Pydantic model. If comments occur within the JSON code block, the code tries to delete them.
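The extract-and-validate step can be sketched roughly like this. This is a minimal, hypothetical illustration, not the actual implementation: `extract_json_block` and its regex are my own stand-ins for whatever the library does internally.

````python
import json
import re

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

def extract_json_block(text: str) -> dict:
    # Pull the content of the first ```json ... ``` code block.
    match = re.search(r"```json\s*(.*?)```", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON code block found in the response")
    block = match.group(1)
    # Drop any // line comments the model may have added inside the block.
    block = re.sub(r"^\s*//.*$", "", block, flags=re.MULTILINE)
    return json.loads(block)

# A made-up raw reply in the shape the prompt asks for:
# free-form reasoning first, then the fenced JSON block.
raw_reply = """Reasoning: the name and age are stated directly in the text.

```json
{
    // extracted fields
    "name": "Jane Smith",
    "age": 25
}
```"""

user = User(**extract_json_block(raw_reply))
print(user)
````

The reasoning text before the fence is simply ignored; only the fenced block is parsed and validated.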
### Why?
Letting the LLM reason before producing the final output appears to lead to better responses.
### Example

Here is an example of the usage:
```python
from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel
from typing import List
from enum import Enum
import rich
from rich.console import Console
from rich.markdown import Markdown

class Gender(Enum):
    male = "male"
    female = "female"
    other = "other"

class User(BaseModel):
    name: str
    email: str
    age: int
    gender: Gender
    friends: List[str]

client = OllamaInstructorClient(
    host='http://localhost:11434',
    debug=True
)

response = client.chat_completion(
    pydantic_model=User,
    model='mistral',
    messages=[
        {
            'role': 'user',
            'content': 'Extract the name, email and age from this text: "My name is Jane Smith. I am 25 years old. My email is janeexample.com. My friends are Karen and Becky."'
        },
    ],
    format='',
    allow_partial=False
)

rich.print(response)

console = Console()
md = Markdown(response['raw_message']['content'])
console.print(md)
```
Output of the reasoning, which is stored in `response['raw_message']['content']`:
Task Description: Given a JSON schema, extract the named properties 'name', 'email', 'age' and 'gender' from the provided text and return a valid JSON response adhering to the schema.

Reasoning:

1. 'name': The name is mentioned as "My name is Jane Smith".
2. 'email': The email is mentioned as "My email is janeexample.com".
3. 'age': The age is mentioned as "I am 25 years old".
4. 'gender': According to the text, 'Jane' is a female, so her gender would be represented as 'female' in the JSON response.
5. 'friends': The friends are mentioned as "Her friends are Karen and Becky", but according to the schema, 'friends' should be an array of strings, not an object with a title. However, the schema does not explicitly state that each friend must have a unique name, so both 'Karen' and 'Becky' could be included in the 'friends' list.

JSON response:

```json
{
    "name": "Jane Smith",
    "email": "janeexample.com",
    "age": 25,
    "gender": "female",
    "friends": ["Karen", "Becky"]
}
```
Output of the extracted JSON within `response['message']['content']`:
```json
{
    "name": "Jane Smith",
    "email": "janeexample.com",
    "age": 25,
    "gender": "female",
    "friends": ["Karen", "Becky"]
}
```
> *Note*: This feature is currently not available for `chat_completion_with_stream`. I am trying to make this happen.
## Fixes
**Important**
Due to the last refactoring, it turned out the retry logic was broken. With this update the logic is fixed for `chat_completion` and `chat_completion_with_stream`. `chat_completion` remains on the refactored code, but for `chat_completion_with_stream` I had to revert to the old version of the code.
Sorry for any inconvenience with the last release!
**Full Changelog**: https://github.com/lennartpollvogt/ollama-instructor/compare/v0.3.0...v0.4.0