Added
- `toolio.responder` module, with coherent factoring from `server.py`
- `llm_helper.model_manager` convenience API for direct Python loading of & inference over models
- `llm_helper.extract_content` helper to simplify handling of OpenAI-style streaming completion responses
- `test/quick_check.py` for quick assessment of LLMs in Toolio
- Mistral model type support
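For context on the `extract_content` addition above: OpenAI-style streaming responses arrive as nested chunk objects, and the helper condenses them down to plain text. A minimal self-contained sketch of that chunk-flattening idea follows; the chunk shape mirrors the OpenAI chat-completion streaming format, but the function name and details here are illustrative, not Toolio's actual implementation:

```python
import asyncio


async def extract_text(chunks):
    '''Yield just the text deltas from OpenAI-style streaming chunks.

    Illustrative sketch only; Toolio's actual extract_content helper
    may differ in signature and behavior.
    '''
    async for chunk in chunks:
        choices = chunk.get('choices', [])
        if choices:
            # Streaming responses carry incremental text in choices[0].delta.content
            content = choices[0].get('delta', {}).get('content')
            if content:
                yield content


async def demo():
    # Stub stream standing in for a real streaming completion response
    async def fake_stream():
        for text in ('Hello', ', ', 'world'):
            yield {'choices': [{'delta': {'content': text}}]}
    return ''.join([t async for t in extract_text(fake_stream())])


result = asyncio.run(demo())  # 'Hello, world'
```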
Changed
- Turn off prompt caching until we figure out [#12](https://github.com/OoriData/Toolio/issues/12)
- Have responders return actual dicts, rather than label + JSON dump
- Factor out HTTP protocol schematics to a new module
- Handle more nuances of tool-calling tokenizer setup
- Harmonize tool definition patterns across invocation styles
Fixed
- More vector shape management fixes
Removed
- Legacy OpenAI-style function-calling support