## What's Changed
* Update README.md by eltociear in https://github.com/fla-org/flash-linear-attention/pull/2
* fix simple gla backward by sunyt32 in https://github.com/fla-org/flash-linear-attention/pull/6
* Adding RWKV-v4. by ridgerchu in https://github.com/fla-org/flash-linear-attention/pull/8
* fixed hgrn.py paper link and title by ridgerchu in https://github.com/fla-org/flash-linear-attention/pull/10
* Update recurrent_naive.py by hypnopump in https://github.com/fla-org/flash-linear-attention/pull/12
* fix: calculate du on different batch by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/35
* fix: enhance state gradient when bf16 by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/37
* Add implementations of Mamba 2 into FLA by DanFosing in https://github.com/fla-org/flash-linear-attention/pull/39
* Minor mamba-2 fixes by DanFosing in https://github.com/fla-org/flash-linear-attention/pull/40
* [DeltaNet] Adds beta as a vector option by hypnopump in https://github.com/fla-org/flash-linear-attention/pull/42
* [DRAFT] Beta gradient does not match by hypnopump in https://github.com/fla-org/flash-linear-attention/pull/43
* [Attn] fix negative value of seqlen offset during sft by ChaosCodes in https://github.com/fla-org/flash-linear-attention/pull/45
* [RWKV6] fix backward if h0 not passed by hypnopump in https://github.com/fla-org/flash-linear-attention/pull/48
* Replace mamba2 `mamba_chunk_scan_combined` triton kernel with `simple_gla` triton kernel by learning-chip in https://github.com/fla-org/flash-linear-attention/pull/49
* benchmark script for simple_gla vs mamba2 kernel by learning-chip in https://github.com/fla-org/flash-linear-attention/pull/50
* Update amp custom_fwd, custom_bwd usage for torch 2.4.0 compatibility by mirceamironenco in https://github.com/fla-org/flash-linear-attention/pull/54
* Fix syntax error by JulienSiems in https://github.com/fla-org/flash-linear-attention/pull/55
* Add `__init__.py` in `fla/ops/common` for automatic package discovery by zhixuan-lin in https://github.com/fla-org/flash-linear-attention/pull/56
* [`Mamba2`] Post Merge Fixes - `norm_before_gate` and generation with `inputs_embeds` by vasqu in https://github.com/fla-org/flash-linear-attention/pull/57
* Correctly compute `max_seqlen` when `max_position_embeddings` is `None` by zhixuan-lin in https://github.com/fla-org/flash-linear-attention/pull/59
* add chunked KL div by ChaosCodes in https://github.com/fla-org/flash-linear-attention/pull/62
* Add fine-grained warning category for easier suppression by mirceamironenco in https://github.com/fla-org/flash-linear-attention/pull/65
* Update fused_chunk.py by hypnopump in https://github.com/fla-org/flash-linear-attention/pull/72
* [`Mamba2`] Fix slow path by vasqu in https://github.com/fla-org/flash-linear-attention/pull/84
* Add BitNet by DustinWang1 in https://github.com/fla-org/flash-linear-attention/pull/85
* Fix RWKV6 Cache Problems by WorldEditors in https://github.com/fla-org/flash-linear-attention/pull/78
* Fix bugs in RWKV6 OP by WorldEditors in https://github.com/fla-org/flash-linear-attention/pull/87
* fix mamba2 cache bug by WorldEditors in https://github.com/fla-org/flash-linear-attention/pull/89
* fix backward pass breaking when `dh0` is None by Sxela in https://github.com/fla-org/flash-linear-attention/pull/102
* support varlen training for conv1d by LKJacky in https://github.com/fla-org/flash-linear-attention/pull/116
* blood for the torch.compile gods by harrisonvanderbyl in https://github.com/fla-org/flash-linear-attention/pull/119
* Add forward pass for chunkwise TTT-Linear; varlen is supported. by Pan-Yuqi in https://github.com/fla-org/flash-linear-attention/pull/124
* Add scripts for converting pretrained RWKV7 models to fla format by Triang-jyed-driung in https://github.com/fla-org/flash-linear-attention/pull/128
* [`Mamba2`] Fixes for caching and multiple other small issues by vasqu in https://github.com/fla-org/flash-linear-attention/pull/129
* [LinAttn] Fix handling of None scale in chunk_linear_attn for output normalization by HallerPatrick in https://github.com/fla-org/flash-linear-attention/pull/130
* Fix incorrect kwarg name in `fused_recurrent` by fffffgggg54 in https://github.com/fla-org/flash-linear-attention/pull/134
* RWKV-7 conversion and evals by Triang-jyed-driung in https://github.com/fla-org/flash-linear-attention/pull/135
* Fixed dtype mismatch of mamba & mamba2 under residual_in_fp32 setting by chengshuang18 in https://github.com/fla-org/flash-linear-attention/pull/137
* [RWKV7] Fix masking before time shifting modules by Triang-jyed-driung in https://github.com/fla-org/flash-linear-attention/pull/141
* [RWKV7, but applicable to all models] Update modeling_rwkv7.py: Fixing `base_model_prefix` by Triang-jyed-driung in https://github.com/fla-org/flash-linear-attention/pull/143
* fix bitattn with latest attn implementation. by ridgerchu in https://github.com/fla-org/flash-linear-attention/pull/146
* [RWKV7] Remove in-place operations and add gradient checkpointing for `v_first` by Triang-jyed-driung in https://github.com/fla-org/flash-linear-attention/pull/145
* [BitNet] Fix bugs of model definitions by ridgerchu in https://github.com/fla-org/flash-linear-attention/pull/147
* Fix #157 by jannalulu in https://github.com/fla-org/flash-linear-attention/pull/167
* [Mamba, Samba] Add weight initializations and reset_parameters() in _init_weights() for compatibility with Flame by zaydzuhri in https://github.com/fla-org/flash-linear-attention/pull/169
* fix lint errors by jannalulu in https://github.com/fla-org/flash-linear-attention/pull/170
* [RWKV] Follow-up to fix cache management by jannalulu in https://github.com/fla-org/flash-linear-attention/pull/168
* one liner by seanxwzhang in https://github.com/fla-org/flash-linear-attention/pull/178
* [Attn] Fix cache update of swa by Pan-Yuqi in https://github.com/fla-org/flash-linear-attention/pull/183
* [RWKV7] Fix conversion precision by Triang-jyed-driung in https://github.com/fla-org/flash-linear-attention/pull/188
* [GRPO]: add grpo functions by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/189
* [RWKV] fix logits handling by jannalulu in https://github.com/fla-org/flash-linear-attention/pull/192
* [Modules]: Enhance the precision of the fused LayerNorm OP. by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/200
* [MISC] fix delta_net logit handling by jannalulu in https://github.com/fla-org/flash-linear-attention/pull/205
* [RWKV7] Keep compatibility with Torch Compiler. by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/208
* [Misc.] Update wrapper to support contiguous and guard custom device … by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/212
* [Models] Fix the erroneous past_key_values check when inputs… by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/213
* [Titans] Update Titans implementation by rucnyz in https://github.com/fla-org/flash-linear-attention/pull/214
* [Mamba2] Fix initialization by HanGuo97 in https://github.com/fla-org/flash-linear-attention/pull/225
* [TTT] Update fused chunk ops and state bias term by Pan-Yuqi in https://github.com/fla-org/flash-linear-attention/pull/230
* Enable utils.py to be imported on CPU-only machines (#231) by zhuzeyuan in https://github.com/fla-org/flash-linear-attention/pull/232
* [Utils] use fla.utils.device instead of cuda by uniartisan in https://github.com/fla-org/flash-linear-attention/pull/163
* fix(GatedDeltaNet): Ensure integer dimensions when using `expand_v` by vladislavalerievich in https://github.com/fla-org/flash-linear-attention/pull/234
## New Contributors
* eltociear made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/2
* sunyt32 made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/6
* ridgerchu made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/8
* hypnopump made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/12
* DanFosing made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/39
* ChaosCodes made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/45
* learning-chip made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/49
* mirceamironenco made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/54
* JulienSiems made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/55
* zhixuan-lin made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/56
* vasqu made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/57
* DustinWang1 made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/85
* WorldEditors made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/78
* Sxela made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/102
* LKJacky made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/116
* harrisonvanderbyl made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/119
* Triang-jyed-driung made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/128
* HallerPatrick made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/130
* fffffgggg54 made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/134
* chengshuang18 made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/137
* jannalulu made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/167
* zaydzuhri made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/169
* seanxwzhang made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/178
* zhuzeyuan made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/232
* vladislavalerievich made their first contribution in https://github.com/fla-org/flash-linear-attention/pull/234
**Full Changelog**: https://github.com/fla-org/flash-linear-attention/commits/v0.1.0