llm-ie Changelog

0.4.6

Documentation

Changes
- Added `allow_overlap_entities` parameter to `FrameExtractor.extract_frames()` method. This allows LLM to output multiple frames with overlapping entity spans. For example, the text below has two "headache" mentions,
python
text = "In trial 12345, headache was reported in 5% of patients, while headache was reported in 10% of patients in arm A."

LLM generated:

"[
{"ClinicalTrial": "12345", "Arm": "A", "AdverseReaction": "headache", "Percentage": "10%"},
{"ClinicalTrial": "12345", "Arm": "", "AdverseReaction": "headache", "Percentage": "5%"}
]"

When `allow_overlap_entities=False`, the two frames will be the two "headache" mentions:
python
[
{'frame_id': '0', 'start': 17, 'end': 25, 'entity_text': 'headache', 'attr': {'ClinicalTrial': 'trial 12345', 'Arm': 'arm A', 'Percentage': '5%'}}
{'frame_id': '1', 'start': 64, 'end': 72, 'entity_text': 'headache', 'attr': {'ClinicalTrial': 'trial 12345', 'Arm': 'arm A', 'Percentage': '10%'}}
]

While `allow_overlap_entities=True`, the two frames will overlap on the first mention:
python
[
{'frame_id': '0', 'start': 17, 'end': 25, 'entity_text': 'headache', 'attr': {'ClinicalTrial': '12345', 'Arm': 'A', 'Percentage': '10%'}}
{'frame_id': '1', 'start': 17, 'end': 25, 'entity_text': 'headache', 'attr': {'ClinicalTrial': '12345', 'Percentage': '5%'}}
]

- Fixed **UnboundLocalError** in `extract_async()`. The issue happened when input `text_content` is too short to be sentence tokenized.

0.4.5

Documentation

Changes
- Added option to adjust the number of context sentences in sentence-based extractors.
The `context_sentences` sets number of sentences before and after the sentence of interest to provide additional context. When `context_sentences=2`, 2 sentences before and 2 sentences after are included in the user prompt as context. When `context_sentences="all"`, the entire document is included as context. When `context_sentences=0`, no context is provided and LLM will only extract based on the current sentence of interest.

python
from llm_ie.extractors import SentenceFrameExtractor

extractor = SentenceFrameExtractor(inference_engine=inference_engine,
prompt_template=prompt_temp,
context_sentences=2)
frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis", case_sensitive=False, fuzzy_match=True, stream=True)

For the sentence:

*The patient has a history of hypertension, hyperlipidemia, and Type 2 diabetes mellitus.*

The context is "previous sentence 2" "previous sentence 1" "the sentence of interest" "proceeding sentence 1" "proceeding sentence 2":

*Emily Brown, MD (Cardiology), Dr. Michael Green, MD (Pulmonology)

* Reason for Admission*
*John Doe, a 49-year-old male, was admitted to the hospital with complaints of chest pain, shortness of breath, and dizziness. The patient has a history of hypertension, hyperlipidemia, and Type 2 diabetes mellitus. History of Present Illness*
*The patient reported that the chest pain started two days prior to admission. The pain was described as a pressure-like sensation in the central chest, radiating to the left arm and jaw.*

- Added support for OpenAI reasoning models ("o" series).
For reasoning models ("o" series), use the `reasoning_model=True` flag. The `max_completion_tokens` will be used instead of the `max_tokens`. `temperature` will be ignored.

python
from llm_ie.engines import OpenAIInferenceEngine

inference_engine = OpenAIInferenceEngine(model="o1-mini", reasoning_model=True)

0.4.4

Documentation

Changes
- Fixed a dependency version error.

0.4.3

Documentation

Changes
- Added Azure OpenAI support

export AZURE_OPENAI_API_KEY="<your_API_key>"
export AZURE_OPENAI_ENDPOINT="<your_endpoint>"

python
from llm_ie.engines import AzureOpenAIInferenceEngine

inference_engine = AzureOpenAIInferenceEngine(model="gpt-4o-mini")

- Fixed issues in `PromptEditor.chat()` kwrs passing.
- Added missing dependency for `nest-asyncio` .

0.4.2

Documentation

Changes
- Fixed bug in `LLMInformationExtractionDocument.add_frame()`. Previously, when `create_id=Ture`, the frame_id would not be reassigned. Now a sequential ID will be assigned.
- Added error message for `FrameExtractor.extract_frames()` when `text_content` is a dictionary and `document_key` is None. Now a ValueError will be thrown.
- Added error message for `Extractor._get_user_prompt()` when some values in `text_content` dictionary are not string. A ValueError will be thrown.

0.4.1

Documentation

New features
- Added filters, table view, and more user-friendly visualization features.
- Please make sure to update visualization dependency [ie-viz](https://github.com/daviden1013/ie-viz) with `pip install -U ie-viz`

![visualization](https://github.com/user-attachments/assets/12b02d6d-6047-4b6d-be73-8bab501271a5)

Llm-ie

Page 1 of 3