- Handle special tokens properly while tokenizing
- Handle incomplete UTF-8 multi-byte characters while generating text
- Increase buffer size in `tokenize()` for BOS token
- Update ggml and llama.cpp
0.2.8
Changes
- Add support for the new k-quantization formats
- Update GGML
0.2.7
Changes
- Add support for StarChat special tokens
- Add `context_length` parameter support for MPT
- Add low-level API to get `eos_token_id`, `vocab_size`, `context_length`