First release of the EchoSwift CLI tool
Features
- Benchmark LLM inference across multiple providers (e.g., Ollama, vLLM, TGI)
- Measure key performance metrics: latency, throughput, and time to first token (TTFT)
- Support for varying input and output token lengths
- Simulate concurrent users to test scalability
- Easy-to-use command-line interface
- Detailed logging and progress tracking
Performance metrics
The benchmark captures the following metrics for each combination of input token length, output token length, and number of parallel users:
- Latency (ms/token)
- TTFT (ms)
- Throughput (tokens/sec)
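
As a rough illustration of how these three metrics relate, the sketch below derives them from the timestamps of a single streamed request. The `RequestTiming` class, the function names, and the exact formulas are assumptions made for this example, not EchoSwift's actual implementation; in particular, the tool may define latency differently (for example, by excluding the time to first token).

```python
from dataclasses import dataclass


@dataclass
class RequestTiming:
    """Hypothetical record of one streamed request (times in seconds)."""
    start: float          # when the request was sent
    first_token: float    # when the first output token arrived
    end: float            # when the last output token arrived
    output_tokens: int    # number of output tokens generated


def ttft_ms(t: RequestTiming) -> float:
    # Time to first token: delay between sending the request
    # and receiving the first output token, in milliseconds.
    return (t.first_token - t.start) * 1000.0


def latency_ms_per_token(t: RequestTiming) -> float:
    # Assumed definition: total request time divided by output tokens.
    return (t.end - t.start) * 1000.0 / t.output_tokens


def throughput_tokens_per_sec(t: RequestTiming) -> float:
    # Assumed definition: output tokens generated per second of request time.
    return t.output_tokens / (t.end - t.start)


timing = RequestTiming(start=0.0, first_token=0.25, end=2.0, output_tokens=100)
print(ttft_ms(timing))                    # 250.0
print(latency_ms_per_token(timing))       # 20.0
print(throughput_tokens_per_sec(timing))  # 50.0
```

Under these assumed definitions, per-request latency and throughput are reciprocals of each other up to the unit conversion. With parallel users, aggregate throughput would typically be summed across concurrent requests, while latency and TTFT would be averaged or reported as percentiles.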