- feat: Update llama.cpp to ggerganov/llama.cpp@0e18b2e7d0b5c0a509ea40098def234b8d4a938a
- feat: Add offload_kqv option to llama and server by abetlen in 095c65000642a3cf73055d7428232fb18b73c6f3
- feat: n_ctx=0 now uses the n_ctx_train of the model by DanieleMorotti in #1015
- feat: logits_to_logprobs supports both 2-D and 3-D logits arrays by kddubey in #1002
- fix: Remove f16_kv, add offload_kqv fields in low level and llama apis by brandonrobertz in #1019
- perf: Don't convert logprobs arrays to lists by kddubey in #1021
- docs: Fix README.md functionary demo typo by evelynmitchell in #996
- examples: Update low_level_api_llama_cpp.py to match current API by jsoma in #1023
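
As a rough illustration of the logits_to_logprobs entry above, the sketch below shows one way a log-softmax over the last axis can handle both 2-D and 3-D logits arrays with the same code path. The helper name `log_softmax_last_axis` is hypothetical and this is not the library's implementation; it only assumes NumPy is installed.

```python
import numpy as np


def log_softmax_last_axis(logits):
    """Hypothetical helper (not the library's function): convert logits to
    log-probabilities along the last axis for 2-D or 3-D input."""
    logits = np.asarray(logits, dtype=np.float64)
    if logits.ndim not in (2, 3):
        raise ValueError("expected a 2-D or 3-D array of logits")
    # Subtract the per-row maximum first so the exponentials cannot overflow.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    # log_softmax(x) = x - logsumexp(x), computed on the shifted values.
    log_norm = np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return shifted - log_norm


# 2-D input, e.g. (tokens, vocabulary)
two_d = log_softmax_last_axis(np.zeros((2, 4)))
# 3-D input, e.g. (batch, tokens, vocabulary)
three_d = log_softmax_last_axis(np.arange(24, dtype=float).reshape(2, 3, 4))
```

Because the reduction uses `axis=-1` with `keepdims=True`, the output keeps the input's shape and no branching on dimensionality is needed. Returning the array directly, rather than calling `.tolist()`, avoids an extra copy into Python lists.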