- **Source Distribution Fixes**: Added the missing `.h` header files to the source distribution so the package can be compiled when installed from the PyPI sdist.
- **Cross-Platform Support**: Added wheels for Linux, macOS, and Windows, extending compatibility across major operating systems.
- **Performance Benchmarks**: Implemented detailed benchmarks and CPU performance calculations for ByteCore and ByteCoreFast.
- **Optimizations**: Optimized memory index calculation and addition operations, enabled the `-O3` compiler optimization flag, and used `inline` functions for better performance.
- **Windows Compatibility**: Adjusted `fast_emulator.c` to resolve a compilation issue on Windows.
- **Workflow Enhancements**: Consolidated multiple CI jobs into a single pipeline that builds, tests, and publishes the Python package.