ScaleLLM

Latest version: v0.2.2


0.2.1

What's Changed
* feat: added awq marlin qlinear by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/315
* build: speed up compilation for marlin kernels by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/316
* test: added unittests for marlin kernels by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/317
* refactor: clean up build warnings and refactor marlin kernels by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/318
* fix: clean up build warnings: "LOG" redefined by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/319
* cmake: make includes private and disable jinja2cpp build by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/320
* ci: allow build without requiring a physical gpu device by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/321
* fix: put item into asyncio.Queue in a thread-safe way by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/324
* refactor: added static switch for marlin kernel dispatch by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/325
* feat: fix and use marlin kernel for awq by default by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/326
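One fix above (#324) concerns putting items into an `asyncio.Queue` from a non-event-loop thread. `asyncio.Queue` is not thread-safe, so a producer thread must hand the item to the loop's own thread. A minimal sketch of that pattern (the function names here are illustrative, not ScaleLLM's actual code):

```python
import asyncio
import threading

def put_threadsafe(loop, queue, item):
    # asyncio.Queue is not thread-safe: schedule the put on the
    # event loop's thread instead of calling it directly.
    loop.call_soon_threadsafe(queue.put_nowait, item)

async def main():
    loop = asyncio.get_running_loop()
    q = asyncio.Queue()
    # a worker thread produces an item for the async consumer
    t = threading.Thread(target=put_threadsafe, args=(loop, q, "token"))
    t.start()
    item = await q.get()
    t.join()
    return item

print(asyncio.run(main()))  # token
```

Calling `queue.put_nowait` directly from the thread could corrupt the queue's internal state; `call_soon_threadsafe` is the documented way to cross the thread boundary.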


**Full Changelog**: https://github.com/vectorch-ai/ScaleLLM/compare/v0.2.0...v0.2.1

0.2.0

What's Changed
* kernel: port softcap support for flash attention by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/298
* test: added unittests for attention sliding window by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/299
* model: added gemma2 with softcap and sliding window support by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/300
* kernel: support kernel test in python via pybind by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/301
* test: added unittests for marlin fp16xint4 gemm by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/302
* fix: move eos out of stop token list to honor ignore_eos option by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/305
* refactor: move models to upper folder by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/306
* kernel: port gptq marlin kernel and fp8 marlin kernel by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/307
* rust: upgrade rust libs to latest version by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/309
* refactor: remove the logic loading individual weight from shared partitions by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/311
* feat: added fused column parallel linear by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/313
* feat: added gptq marlin qlinear layer by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/312
* kernel: port awq repack kernel by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/314


**Full Changelog**: https://github.com/vectorch-ai/ScaleLLM/compare/v0.1.9...v0.2.0

0.1.9

What's Changed
* ci: cancel all previous runs if a new one is triggered by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/283
* pypi: fix invalid classifier by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/284
* refactor: remove exllama kernels by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/285
* kernel: added marlin dense and sparse kernels by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/287
* debug: added environment collection script. by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/288
* kernel: added triton kernel build support by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/289
* feat: added THUDM/glm-4* support by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/292
* fix: handle unfinished utf8 bytes for tiktoken tokenizer by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/293
* triton: fix build error and add example with unittest by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/294
* model: added qwen2 support by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/295
* feat: added sliding window support for QWen2 by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/296
* ci: fix pytest version to avoid flakiness by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/297
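The tiktoken fix above (#293) deals with streaming detokenization: a decoded chunk may end mid-way through a multi-byte UTF-8 sequence, and the incomplete tail must be held back rather than emitted as replacement characters. A minimal sketch of the idea (the helper name is illustrative):

```python
def decode_incremental(byte_buffer: bytes):
    # Stream-decode UTF-8, holding back an unfinished trailing
    # multi-byte sequence for the next chunk instead of failing.
    try:
        return byte_buffer.decode("utf-8"), b""
    except UnicodeDecodeError as e:
        # e.start marks where the undecodable tail begins
        return byte_buffer[:e.start].decode("utf-8"), byte_buffer[e.start:]
```

For example, the first two bytes of `"héllo"` encode to `b"h\xc3"`; the sketch emits `"h"` and keeps `b"\xc3"` until the continuation byte arrives.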


**Full Changelog**: https://github.com/vectorch-ai/ScaleLLM/compare/v0.1.8...v0.1.9

0.1.8

What's Changed
* ci: increase ccache max size from 5GB(default) to 25GB by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/279
* upgrade torch to 2.4.0 by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/280
* default use cuda 12.1 for wheel package by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/281
* ci: fix cuda version for wheel build workflow by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/282


**Full Changelog**: https://github.com/vectorch-ai/ScaleLLM/compare/v0.1.7...v0.1.8

0.1.7

What's Changed
* build: fix build error with gcc-13 by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/264
* kernel: upgrade cutlass to 3.5.0 + cuda 12.4 for sm89 fp8 support by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/265
* cmake: define header only library instead of symbol link for cutlass and flashinfer by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/266
* feat: added range to support Range-for loops by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/267
* kernel: added attention cpu implementation for testing by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/268
* build: added nvbench as submodule by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/269
* build: upgrade cmake required version from 3.18 to 3.26 by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/270
* ci: build and test in devel docker image by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/272
* ci: use manylinux image to build wheel and run pytest by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/271
* attention: added tile logic using cute::local_tile into cpu attention by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/273
* kernel: added playground for learning and experimenting cute. by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/274
* feat: added rope scaling support for llama3.1 by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/277
* update docs for llama3.1 support and bump up version by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/278
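The RoPE scaling added for Llama 3.1 (#277) extends the context window by rescaling the rotary inverse frequencies: long-wavelength components are divided by a scale factor, short-wavelength components are left alone, and the band in between is interpolated. A hedged sketch of that scheme (parameter values follow the commonly published Llama 3.1 defaults; function and argument names are illustrative):

```python
import math

def scale_rope_freqs(inv_freqs, factor=8.0, low_freq_factor=1.0,
                     high_freq_factor=4.0, original_ctx=8192):
    # Llama-3.1-style RoPE frequency scaling (sketch):
    # low frequencies are scaled down by `factor`, high frequencies
    # pass through, and the middle band is smoothly interpolated.
    low = original_ctx / low_freq_factor
    high = original_ctx / high_freq_factor
    out = []
    for f in inv_freqs:
        wavelen = 2 * math.pi / f
        if wavelen > low:            # very low frequency: scale down
            out.append(f / factor)
        elif wavelen < high:         # high frequency: unchanged
            out.append(f)
        else:                        # interpolate in between
            smooth = (original_ctx / wavelen - low_freq_factor) / (
                high_freq_factor - low_freq_factor)
            out.append((1 - smooth) * f / factor + smooth * f)
    return out
```

High-frequency components (short wavelengths, which encode local position) are preserved exactly, while low-frequency components are stretched to cover the longer context.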


**Full Changelog**: https://github.com/vectorch-ai/ScaleLLM/compare/v0.1.6...v0.1.7

0.1.6

What's Changed
* allow deploy docs when triggered on demand by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/253
* [model] support vision language model llava. by liutongxuan in https://github.com/vectorch-ai/ScaleLLM/pull/178
* dev: fix issues in run_in_docker script by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/254
* dev: added cuda 12.4 build support by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/255
* build: fix multiple definition issue by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/256
* fix: check against num_tokens instead of num_prompt_tokens for shared blocks by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/257
* bugfix: fix invalid max_cache_size when device is cpu. by liutongxuan in https://github.com/vectorch-ai/ScaleLLM/pull/259
* ci: fail test if not all tests were passed successfully by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/263
* Revert "[model] support vision language model llava. (178)" by guocuimi in https://github.com/vectorch-ai/ScaleLLM/pull/262
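The shared-blocks fix above (#257) is about prefix caching: only KV-cache blocks fully covered by a sequence's *total* token count, not just its prompt tokens, are eligible for sharing. A minimal sketch of that bound (function and parameter names are hypothetical, for illustration only):

```python
def num_shared_blocks(num_tokens: int, block_size: int,
                      cached_blocks: int) -> int:
    # Only blocks completely filled by the sequence's total tokens
    # (prompt + generated) can safely be shared with other sequences.
    full_blocks = num_tokens // block_size
    return min(full_blocks, cached_blocks)
```

Checking against prompt tokens alone could over-share a partially filled trailing block that generation is still writing into.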


**Full Changelog**: https://github.com/vectorch-ai/ScaleLLM/compare/v0.1.5...v0.1.6
