## New feature
This release comes with a new feature: you can now pass `format=''` to the `chat_completion` of `OllamaInstructorClient` and `OllamaInstructorAsyncClient`. This allows LLMs and vision models to reason before responding with the JSON output.
### How is this done?
By setting `format=''` the LLM is not forced to respond with JSON. The instructions are therefore crucial to get JSON anyway. But with `format=''` the capability of LLMs to reason step by step before answering can be used. When you set `format=''` the LLM is instructed differently (have a look at the code in the file `prompt_manager.py`). After the step-by-step reasoning, the LLM is instructed to respond within a code block starting with `` ```json `` and ending with `` ``` ``. The content of this code block is extracted and validated against the Pydantic model. If comments occur within the JSON code block, the code tries to delete them.
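The extract-and-validate step can be sketched roughly like this. This is a minimal, hypothetical illustration, not the actual implementation: `extract_json_block` and its regex are my own stand-ins for whatever the library does internally.

````python
import json
import re

from pydantic import BaseModel

class User(BaseModel):
    name: str
    age: int

def extract_json_block(text: str) -> dict:
    # Pull the content of the first ```json ... ``` code block.
    match = re.search(r"```json\s*(.*?)```", text, re.DOTALL)
    if match is None:
        raise ValueError("no JSON code block found in the response")
    block = match.group(1)
    # Drop any // line comments the model may have added inside the block.
    block = re.sub(r"^\s*//.*$", "", block, flags=re.MULTILINE)
    return json.loads(block)

# A made-up raw reply in the shape the prompt asks for:
# free-form reasoning first, then the fenced JSON block.
raw_reply = """Reasoning: the name and age are stated directly in the text.

```json
{
    // extracted fields
    "name": "Jane Smith",
    "age": 25
}
```"""

user = User(**extract_json_block(raw_reply))
print(user)
````

The reasoning text before the fence is simply ignored; only the fenced block is parsed and validated.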
### Why?
Letting the LLM reason before producing the final output appears to lead to better responses.
### Example

Here is an example of the usage:
```python
from ollama_instructor.ollama_instructor_client import OllamaInstructorClient
from pydantic import BaseModel
from typing import List
from enum import Enum
import rich
from rich.console import Console
from rich.markdown import Markdown

class Gender(Enum):
    male = "male"
    female = "female"
    other = "other"

class User(BaseModel):
    name: str
    email: str
    age: int
    gender: Gender
    friends: List[str]

client = OllamaInstructorClient(
    host='http://localhost:11434',
    debug=True
)

response = client.chat_completion(
    pydantic_model=User,
    model='mistral',
    messages=[
        {
            'role': 'user',
            'content': 'Extract the name, email and age from this text: "My name is Jane Smith. I am 25 years old. My email is janeexample.com. My friends are Karen and Becky."'
        },
    ],
    format='',
    allow_partial=False
)

rich.print(response)

console = Console()
md = Markdown(response['raw_message']['content'])
console.print(md)
```
Output of the reasoning, which is stored in `response['raw_message']['content']`:
Task Description: Given a JSON schema, extract the named properties 'name', 'email', 'age' and 'gender' from the provided text and return a valid JSON response adhering to the schema.

Reasoning:

1. 'name': The name is mentioned as "My name is Jane Smith".
2. 'email': The email is mentioned as "My email is janeexample.com".
3. 'age': The age is mentioned as "I am 25 years old".
4. 'gender': According to the text, 'Jane' is a female, so her gender would be represented as 'female' in the JSON response.
5. 'friends': The friends are mentioned as "Her friends are Karen and Becky", but according to the schema, 'friends' should be an array of strings, not an object with a title. However, the schema does not explicitly state that each friend must have a unique name, so both 'Karen' and 'Becky' could be included in the 'friends' list.

JSON response:

```json
{
    "name": "Jane Smith",
    "email": "janeexample.com",
    "age": 25,
    "gender": "female",
    "friends": ["Karen", "Becky"]
}
```
Output of the extracted JSON within `response['message']['content']`:
```json
{
    "name": "Jane Smith",
    "email": "janeexample.com",
    "age": 25,
    "gender": "female",
    "friends": ["Karen", "Becky"]
}
```
> *Note*: This feature is currently not available for `chat_completion_with_stream`. I am trying to make this happen.
## Fixes
**Important**
Due to the last refactoring, it turned out the retry logic was broken. With this update the logic is fixed for `chat_completion` and `chat_completion_with_stream`. `chat_completion` remains on the refactored code, but for `chat_completion_with_stream` I had to revert to the old version of the code.
Sorry for any inconvenience with the last release!
**Full Changelog**: https://github.com/lennartpollvogt/ollama-instructor/compare/v0.3.0...v0.4.0