Breaking changes
None
New features
* [`cappr.huggingface.classify`](https://cappr.readthedocs.io/en/latest/cappr.huggingface.classify.html) no longer copies the prompt's cached keys and values (KVs) when broadcasting the prompt to its completions if `batch_size=1` or if you pass in a single prompt. Instead, it repeats a view of them. This change saves memory for tasks with many completions. For example, in the [Banking 77 demo](https://github.com/kddubey/cappr/blob/main/demos/huggingface/banking_77_classes.ipynb), peak reserved CUDA memory drops from 13.8 GB to 8.3 GB (~40% decrease), and peak allocated CUDA memory drops from 9.3 GB to 7.7 GB (~17% decrease).
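
To illustrate why repeating a view is cheaper than copying, here is a minimal sketch. It is not CAPPr's implementation: NumPy stands in for PyTorch so the snippet runs without a GPU, and the array shapes and variable names are hypothetical. `np.repeat` plays the role of a copy and `np.broadcast_to` the role of a view; PyTorch's `Tensor.repeat` and `Tensor.expand` behave analogously.

```python
import numpy as np

# Hypothetical shapes, not taken from CAPPr: one prompt's cached keys for
# 4 attention heads, 8 prompt tokens, and a head dimension of 16.
num_completions = 77
prompt_kv = np.zeros((1, 4, 8, 16), dtype=np.float32)

# Copying: each completion gets its own duplicate of the prompt's cache,
# so memory grows linearly with the number of completions.
copied = np.repeat(prompt_kv, num_completions, axis=0)

# Viewing: each completion reads the same underlying buffer. The batch
# axis gets a stride of 0, so no new memory is allocated for the data.
viewed = np.broadcast_to(prompt_kv, (num_completions, 4, 8, 16))

print("same shape:", copied.shape == viewed.shape)
print("copy shares memory with prompt:", np.shares_memory(copied, prompt_kv))
print("view shares memory with prompt:", np.shares_memory(viewed, prompt_kv))
print("view stride along batch axis:", viewed.strides[0])
print("copy size relative to prompt:", copied.nbytes // prompt_kv.nbytes)
```

Both arrays have the same shape, so downstream code can index them identically. Only the copy pays for `num_completions` times the prompt's cache, which is why the saving grows with the number of completions.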
Bug fixes
None