Added:
- Groundwork for a server compatible with the [Chat Completions API by OpenAI ©](https://platform.openai.com/docs/overview).
Fixed:
- The `options` argument is now `deepcopy`-ed when passed to `llama_generate(options)`, so the caller's object can be reused across calls.
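
A minimal sketch of why the deep copy matters, assuming a plain `dict` as a stand-in for `options` and a hypothetical `generate_stub` in place of `llama_generate` (neither reflects the library's actual types or internals):

```python
from copy import deepcopy


def generate_stub(options):
    """Hypothetical stand-in for `llama_generate`; not the real implementation."""
    # Work on a private copy so the caller's object is never touched.
    options = deepcopy(options)
    # Internal mutations (illustrative only) stay local to the copy.
    options["stop"].append("</s>")
    options["prompt"] = options["prompt"].strip()
    return options


# Assumed option names ("prompt", "stop") are for illustration only.
caller_options = {"prompt": " Hello ", "stop": ["\n"]}
snapshot = deepcopy(caller_options)

first = generate_stub(caller_options)
second = generate_stub(caller_options)

# The caller's object is unchanged, so reusing it gives identical results.
assert caller_options == snapshot
assert first == second
```

Without the copy, the nested `stop` list would grow on every call, and the second call would see different options than the first.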
Changed:
- Build for `manylinux_2_28` and `musllinux_1_2`.
- Build for [CUDA Compute Capability](https://developer.nvidia.com/cuda-gpus) >= 6.1.