## Documentation

## New features
- **Concurrent extraction** for extractors that require multiple LLM inference calls: `SentenceFrameExtractor`, `SentenceReviewFrameExtractor`, `SentenceCoTFrameExtractor`, `BinaryRelationExtractor`, and `MultiClassRelationExtractor`. We use Python `asyncio` for concurrent, high-throughput inference. On a 4×A100 GPU server running vLLM, concurrent extraction is 10× faster than synchronous extraction.
To use concurrent inference for sentence-level frame extraction, set `concurrent=True`. The `concurrent_batch_size=32` argument processes 32 sentences at a time.
```python
from llm_ie.extractors import SentenceFrameExtractor

extractor = SentenceFrameExtractor(inference_engine, prompt_temp)
frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis",
                                  concurrent=True, concurrent_batch_size=32)
```
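Note that a larger `concurrent_batch_size` increases throughput only up to the number of requests the inference server can serve in parallel, so it is worth tuning to the serving setup.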
To use concurrent inference for relation extraction, set `concurrent=True`. The `concurrent_batch_size=32` argument processes 32 frame pairs at a time.
```python
from llm_ie.extractors import MultiClassRelationExtractor

extractor = MultiClassRelationExtractor(inference_engine, prompt_template=re_prompt_template,
                                        possible_relation_types_func=possible_relation_types_func)
relations = extractor.extract_relations(doc, concurrent=True, concurrent_batch_size=32)
```
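The `possible_relation_types_func` is user-defined. Below is a minimal sketch; the `"EntityType"` attribute and the relation label are hypothetical, and it assumes each frame exposes its attributes as an `attr` dict and that the function returns the candidate relation types for a given pair of frames:

```python
# Hypothetical example: attribute names and relation labels are illustrative.
def possible_relation_types_func(frame_1, frame_2):
    pair = {frame_1.attr.get("EntityType"), frame_2.attr.get("EntityType")}
    if pair == {"Diagnosis", "Medication"}:
        return ["Treats_for"]
    # An empty list means no possible relation for this pair
    return []
```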
- Support for 🚅 [LiteLLM](https://github.com/BerriAI/litellm)
```python
from llm_ie.engines import LiteLLMInferenceEngine

inference_engine = LiteLLMInferenceEngine(model="openai/Llama-3.3-70B-Instruct",
                                          base_url="http://localhost:8000/v1",
                                          api_key="EMPTY")
```
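The resulting engine is a drop-in replacement wherever an inference engine is expected. For example, reusing it with the sentence-level frame extractor above (a sketch; `prompt_temp` and `text` are assumed to be defined as in the earlier example):

```python
from llm_ie.extractors import SentenceFrameExtractor

# The LiteLLM-backed engine works with any extractor
extractor = SentenceFrameExtractor(inference_engine, prompt_temp)
frames = extractor.extract_frames(text_content=text, entity_key="Diagnosis")
```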
- The `PromptEditor` LLM agent now accepts a `prompt_guide` argument for customized prompt-writing guidelines.
```python
from llm_ie import PromptEditor, BasicFrameExtractor, OllamaInferenceEngine

# Define an LLM inference engine
inference_engine = OllamaInferenceEngine(model_name="llama3.1:8b-instruct-q8_0")

# Define the editor with a customized prompt guideline
editor = PromptEditor(inference_engine, BasicFrameExtractor,
                      prompt_guide="<a customized guideline>")

# Start the interactive chat session
editor.chat()
```