New Features
- Added a transformers embedding provider for running local Hugging Face models
- Support for BERT and other transformer models
- Multiple pooling strategies (mean, max, cls), as sketched below
- Device configuration (CPU, CUDA, MPS) with automatic detection
- Optional 4-bit and 8-bit model quantization (see the second sketch below)
- Model caching configuration
- Batch processing for efficient embedding generation
- Both synchronous and asynchronous embedding methods
- Comprehensive test coverage
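
The provider's exact configuration surface isn't shown here, but the pooling, device, and async behaviour follow standard Hugging Face usage. The sketch below is a minimal, illustrative version of mean pooling with automatic device selection and an async wrapper, written directly against the `transformers` and `torch` APIs; the function names (`pick_device`, `embed`, `embed_async`) are placeholders, not the provider's actual interface.

```python
# Illustrative only: a minimal mean-pooling embedder with device auto-detection.
# Function names are placeholders, not the provider's actual API.
import asyncio

import torch
from transformers import AutoModel, AutoTokenizer


def pick_device() -> str:
    # Prefer CUDA, then Apple Silicon (MPS), then fall back to CPU.
    if torch.cuda.is_available():
        return "cuda"
    if torch.backends.mps.is_available():
        return "mps"
    return "cpu"


def embed(texts: list[str], model_name: str = "bert-base-uncased") -> torch.Tensor:
    device = pick_device()
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name).to(device).eval()

    # Tokenize the whole batch at once; padding keeps the tensor shapes rectangular.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt").to(device)
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state  # (batch, seq_len, dim)

    # Mean pooling: average the token vectors, masking out padding positions.
    mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)


async def embed_async(texts: list[str]) -> torch.Tensor:
    # Run the blocking call in a worker thread so the event loop stays responsive.
    return await asyncio.to_thread(embed, texts)
```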
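
For the quantization option, 4-bit and 8-bit loading typically goes through bitsandbytes via transformers' `BitsAndBytesConfig`. The snippet below shows that standard pattern as an illustration, not the provider's actual configuration code; it requires a CUDA device plus the `bitsandbytes` and `accelerate` packages.

```python
# Illustrative only: loading a model in 8-bit (or 4-bit) with bitsandbytes.
from transformers import AutoModel, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(load_in_8bit=True)  # or load_in_4bit=True
model = AutoModel.from_pretrained(
    "bert-base-uncased",
    quantization_config=quant_config,
    device_map="auto",  # let accelerate place layers automatically
)
```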