Sae-lens

Latest version: v5.2.0

5.2.0

Chore

* chore: fix tokenizer typing for bos_token_id (#399) ([`b3b67d6`](https://github.com/jbloomAus/SAELens/commit/b3b67d6c26d7a088f02bd78f4082ec134a6fd6a0))

* chore: Replace isort black and flake8 with ruff (#393)

* replaces in cache_activations_runner.py

* replaces isort, black, and flake8 with Ruff

* adds SIM lint rule

* fixes for CI check

* adds RET lint rule

* adds LOG lint rule

* fixes RET error

* resolves conflicts

* applies make format

* adds T20 rule

* replaces extend-select with select

* resolves conflicts

* fixes lint errors

* update .vscode/settings.json

* Revert "update .vscode/settings.json"

This reverts commit 1bb5497d7495f7fb0843bc4eb885ba90cf6b4f47.

* updates .vscode/settings.json

* adds newline ([`52dbff9`](https://github.com/jbloomAus/SAELens/commit/52dbff9d4311b873641c17cadcdc8a7f2c562269))

Feature

* feat: Save estimated norm scaling factor during checkpointing (#395)

* refactor saving

* save estimated_norm_scaling_factor

* use new constant names elsewhere

* estimate norm scaling factor in `ActivationsStore` init

* fix tests

* add test

* tweaks

* safetensors path

* remove scaling factor on fold

* test scaling factor value

* format

* format

* undo silly change

* format

* save fn protocol

* make save fn static

* test which checkpoints have estimated norm scaling factor

* fix test

* fmt ([`63a15a0`](https://github.com/jbloomAus/SAELens/commit/63a15a010c3f018ae227584a0bc2866b04fe4f79))
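As a rough illustration of what saving an estimated norm scaling factor alongside a checkpoint can look like, the sketch below computes a factor of `sqrt(d_in)` divided by the mean activation norm and writes it to a JSON file in the checkpoint directory. The formula, the function names, and the file name `scaling_factor.json` are assumptions made for illustration; they are not taken from the SAELens source.

```python
import json
import math
from pathlib import Path
from typing import List, Optional

# Hypothetical file name, chosen only for this sketch.
SCALING_FACTOR_FILE = "scaling_factor.json"


def estimate_norm_scaling_factor(activations: List[List[float]], d_in: int) -> float:
    """Return sqrt(d_in) / mean L2 norm of the sampled activations (assumed formula)."""
    norms = [math.sqrt(sum(x * x for x in row)) for row in activations]
    mean_norm = sum(norms) / len(norms)
    return math.sqrt(d_in) / mean_norm


def save_scaling_factor(checkpoint_dir: Path, factor: float) -> Path:
    """Write the factor to a JSON file inside the checkpoint directory."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    path = checkpoint_dir / SCALING_FACTOR_FILE
    path.write_text(json.dumps({"estimated_norm_scaling_factor": factor}))
    return path


def load_scaling_factor(checkpoint_dir: Path) -> Optional[float]:
    """Return the saved factor, or None if the checkpoint does not have one."""
    path = checkpoint_dir / SCALING_FACTOR_FILE
    if not path.exists():
        return None
    return json.loads(path.read_text())["estimated_norm_scaling_factor"]
```

Returning `None` for a missing file mirrors the situation the entry's tests hint at, where only some checkpoints carry a scaling factor.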

Fix

* fix: force build ([`53180e0`](https://github.com/jbloomAus/SAELens/commit/53180e000928695748dc56787f9995f3ee35096c))

* fix: typo in pretrained yaml ([`9db9e36`](https://github.com/jbloomAus/SAELens/commit/9db9e3660f866322a756ca7f596077537a5fa25e))

Unknown

* Merge pull request #397 from jbloomAus/np_yaml

fix: typo in pretrained yaml ([`19bcb2e`](https://github.com/jbloomAus/SAELens/commit/19bcb2e3245962add858c289802cf9fb57c014b4))

5.1.0

Feature

* feat: Replace print with controllable logging (#388)

* replaces in pretrained_sae_loaders.py

* replaces in load_model.py

* replaces in neuronpedia_integration.py

* replaces in tsea.py

* replaces in pretrained_saes.py

* replaces in cache_activations_runner.py

* replaces in activations_store.py

* replaces in training_sae.py

* replaces in upload_saes_to_huggingface.py

* replaces in sae_training_runner.py

* replaces in config.py

* fixes error for CI

---------

Co-authored-by: David Chanin <chanindavgmail.com> ([`2bcd646`](https://github.com/jbloomAus/SAELens/commit/2bcd646bf69a116d5a7df14d2fe07988539a930b))
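For readers unfamiliar with the pattern, here is a minimal sketch of replacing `print` with a logger whose verbosity the caller controls, using only the standard `logging` module. The logger name `sae_lens_example` and the helper functions are hypothetical; the logger name SAELens actually uses may differ.

```python
import logging

# Hypothetical logger name; check the library source for the real one.
logger = logging.getLogger("sae_lens_example")


def load_model(name: str) -> str:
    """Stand-in for library code that used to call print()."""
    # The message is only emitted if the logger is enabled for INFO.
    logger.info("Loading model %s", name)
    return name


def set_verbosity(level: int) -> None:
    """Let the caller decide how much the library-style code reports."""
    logger.setLevel(level)
```

A caller would then silence informational messages with `set_verbosity(logging.WARNING)` or re-enable them with `set_verbosity(logging.INFO)`, which is not possible with bare `print` calls.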

5.0.0

Breaking

* feat: Cleaned up CacheActivationsRunnerConfig (#389)

BREAKING CHANGE: Superfluous config options have been removed

* Cleaned up CacheActivationsRunnerConfig

Previously, `CacheActivationsRunnerConfig` had an inconsistent set of
options, kept for some interoperability with
`LanguageModelSAERunnerConfig`. It was unclear which parameters were
necessary and which were redundant.

Simplified to the required arguments:

- `dataset_path`: Tokenized or untokenized dataset
- `total_training_tokens`
- `model_name`
- `model_batch_size`
- `hook_name`
- `final_hook_layer`
- `d_in`

I think this scheme captures everything you need when attempting to
cache activations and makes it a lot easier to reason about.

Optional:


- `activation_save_path`: defaults to `"activations/{dataset}/{model}/{hook_name}"`
- `shuffle=True`
- `prepend_bos=True`
- `streaming=True`
- `seqpos_slice`
- `buffer_size_gb=2`: size of each buffer; affects memory usage and saving frequency
- `device="cuda"` or `"cpu"`
- `dtype="float32"`
- `autocast_lm=False`
- `compile_llm=True`
- `hf_repo_id`: push to Hugging Face
- `model_kwargs`: `run_with_cache`
- `model_from_pretrained_kwargs`


* Keep compatibility with old config

- Renamed to keep values same where possible
- Moved `_from_saved_activations` (private API for `CacheActivationsRunner`)
to `cache_activations_runner.py`
- Use properties instead of `__post_init__` ([`d81e286`](https://github.com/jbloomAus/SAELens/commit/d81e2862ce914c0b0f86c544fa8f4320c82032ac))
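To make the shape of the simplified config concrete, the sketch below models the required arguments and a few of the optional defaults listed in this entry as a dataclass, and derives the default save path through a property rather than in `__post_init__`. The class name `ExampleCacheConfig`, the property name `resolved_save_path`, and the way the path components are derived are all hypothetical; this is not the library's actual class.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExampleCacheConfig:
    """Hypothetical stand-in for a cache-activations config, for illustration only."""

    # Required arguments, as listed in the entry above.
    dataset_path: str
    total_training_tokens: int
    model_name: str
    model_batch_size: int
    hook_name: str
    final_hook_layer: int
    d_in: int

    # A subset of the optional arguments, with the defaults quoted above.
    activation_save_path: Optional[str] = None
    shuffle: bool = True
    prepend_bos: bool = True
    streaming: bool = True
    buffer_size_gb: float = 2.0
    device: str = "cuda"
    dtype: str = "float32"

    @property
    def resolved_save_path(self) -> str:
        """Compute the default path on access instead of mutating fields in __post_init__."""
        if self.activation_save_path is not None:
            return self.activation_save_path
        # Assumption: use the last path segment of the dataset and model names.
        dataset = self.dataset_path.split("/")[-1]
        model = self.model_name.split("/")[-1]
        return f"activations/{dataset}/{model}/{self.hook_name}"
```

One reason to prefer a property here is that the derived value always reflects the current field values, whereas a value computed once in `__post_init__` goes stale if a field is changed later.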

4.4.5

Fix

* fix: add missing np ([`9d26da4`](https://github.com/jbloomAus/SAELens/commit/9d26da40a35dcd335038df8724f94558a80766e0))

Unknown

* Merge pull request #387 from jbloomAus/np_yaml

fix: add missing neuronpedia yaml entries ([`deae2a7`](https://github.com/jbloomAus/SAELens/commit/deae2a7e81552b9f055baf7da1084231a3a5811c))

4.4.4

Fix

* fix: add missing np ([`3192463`](https://github.com/jbloomAus/SAELens/commit/31924632d0e49c57fb3cf939a789e0b2aa10152d))

Unknown

* Merge pull request #386 from jbloomAus/np_yaml

fix: add missing neuronpedia yaml entries ([`e35f998`](https://github.com/jbloomAus/SAELens/commit/e35f998f3d51ec5491347a3157838c96794f09d9))

4.4.3

Fix

* fix: add missing np ([`9b2a19c`](https://github.com/jbloomAus/SAELens/commit/9b2a19c86dea6c8b7d34736cdfaec28c40859440))

Unknown

* Merge pull request #385 from jbloomAus/np_yaml

fix: add missing neuronpedia yaml entry ([`7ac5253`](https://github.com/jbloomAus/SAELens/commit/7ac5253d0f9c3fcc180b82e087070b96498da988))
