What's Changed
* Cache CLIP embeddings for the dataset instead of recomputing them each time they are needed.
* Reduce model file sizes by storing them at lower precision.
* Add an `ImageCaptionInferenceModel` class for easier out-of-the-box use.
* Fix some broken unit tests.
* Better Data Caching by fkodom in https://github.com/fkodom/clip-text-decoder/pull/3
* Bug Fixes for Broken Tests by fkodom in https://github.com/fkodom/clip-text-decoder/pull/4
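
The embedding cache mentioned above can be illustrated with a minimal sketch. This is not the project's actual implementation: `cached_embedding`, `fake_encode`, and the on-disk pickle layout are all hypothetical names and choices made for illustration, and `fake_encode` merely stands in for a real CLIP encoder.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path
from typing import Callable, List


def cached_embedding(
    item_id: str,
    encode: Callable[[str], List[float]],
    cache_dir: Path,
) -> List[float]:
    """Return the embedding for item_id, computing it only on a cache miss."""
    # Hash the identifier so any string maps to a safe file name.
    key = hashlib.sha256(item_id.encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.pkl"
    if path.exists():
        # Cache hit: load the stored embedding instead of re-encoding.
        with path.open("rb") as f:
            return pickle.load(f)
    # Cache miss: compute once, then persist for later calls.
    embedding = encode(item_id)
    with path.open("wb") as f:
        pickle.dump(embedding, f)
    return embedding


# Hypothetical stand-in for an expensive encoder; records each call it receives.
calls: List[str] = []


def fake_encode(item_id: str) -> List[float]:
    calls.append(item_id)
    return [float(len(item_id)), 0.5]


cache_dir = Path(tempfile.mkdtemp())
first = cached_embedding("image_001.jpg", fake_encode, cache_dir)
second = cached_embedding("image_001.jpg", fake_encode, cache_dir)
```

In this sketch the second call reads the embedding from disk, so `fake_encode` runs only once for the same input.
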
**Full Changelog**: https://github.com/fkodom/clip-text-decoder/compare/1.1.0...1.2.0
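
To give a rough sense of why lower precision shrinks model files, here is a small standard-library sketch. It assumes a move from 32-bit to 16-bit floats, which the release notes do not state explicitly, and the `weights` values are made up for illustration.

```python
import struct

# Made-up values standing in for model weights.
weights = [0.5, -1.25, 3.0, 0.1]
n = len(weights)

# Serialize the same values as little-endian float32 ("f") and float16 ("e").
as_float32 = struct.pack(f"<{n}f", *weights)
as_float16 = struct.pack(f"<{n}e", *weights)

# Half precision uses 2 bytes per value instead of 4.
size_ratio = len(as_float16) / len(as_float32)

# Reading the values back shows the small rounding error this trade introduces.
restored = struct.unpack(f"<{n}e", as_float16)
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The serialized size is halved, at the cost of a small rounding error for values such as 0.1 that half precision cannot represent exactly.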