What's Changed
* Apply prompt style for tp.py and sequentially.py by Andrei-Aksionov in https://github.com/Lightning-AI/litgpt/pull/1629
* Fix prompt docstring in Python API by rasbt in https://github.com/Lightning-AI/litgpt/pull/1635
* Update Windows cpu-tests.yml by rasbt in https://github.com/Lightning-AI/litgpt/pull/1630
* Remove NumPy < 2.0 pin by rasbt in https://github.com/Lightning-AI/litgpt/pull/1631
* Fix kv-cache issue in Python API streaming mode by rasbt in https://github.com/Lightning-AI/litgpt/pull/1633
* Update installation requirements to install only the minimal packages needed for basic use by rasbt in https://github.com/Lightning-AI/litgpt/pull/1634
* Faster safetensors conversion when downloading a model by awaelchli in https://github.com/Lightning-AI/litgpt/pull/1624
* Add Sebastian as code owner by awaelchli in https://github.com/Lightning-AI/litgpt/pull/1641
* Add missing super() call in data modules by awaelchli in https://github.com/Lightning-AI/litgpt/pull/1639
* Update Lightning version to 2.4.0 pre by awaelchli in https://github.com/Lightning-AI/litgpt/pull/1640
* Add tunable KV cache with error handling for invalid inputs by apaz-cli in https://github.com/Lightning-AI/litgpt/pull/1636
* Use Python API in serve code by rasbt in https://github.com/Lightning-AI/litgpt/pull/1644
* Fix autodownload + conversion issue by rasbt in https://github.com/Lightning-AI/litgpt/pull/1645
* Properly clear kv-cache by rasbt in https://github.com/Lightning-AI/litgpt/pull/1647
* Fix error raising when max_returned_tokens > the max_seq_length setting by rasbt in https://github.com/Lightning-AI/litgpt/pull/1648
* Add quantization support to litgpt serve by rasbt in https://github.com/Lightning-AI/litgpt/pull/1646
* Bump version for the 0.4.7 release by rasbt in https://github.com/Lightning-AI/litgpt/pull/1649
**Full Changelog**: https://github.com/Lightning-AI/litgpt/compare/v0.4.6...v0.4.7