Swissarmytransformer

Latest version: v0.4.12


Page 3 of 6

2023.4.9

Large update v0.3.0
1. Delete `--sandwich-ln`.
2. `from_pretrained(args, name)` => `from_pretrained(name, args=None)`.
3. Fix a typo in `MODEL_URLS`.
4. Enable model-only mode.
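
The argument reorder in item 2 can be illustrated with a minimal sketch. `ToyModel` is a hypothetical stand-in written for this example, not the library's class, and the empty `Namespace` fallback is an assumption; only the call shape `from_pretrained(name, args=None)` comes from the note above.

```python
from argparse import Namespace


class ToyModel:
    """Hypothetical stand-in; not the real SwissArmyTransformer class."""

    def __init__(self, name, args):
        self.name = name
        self.args = args

    @classmethod
    def from_pretrained(cls, name, args=None):
        # New order: the model name comes first and `args` is optional.
        if args is None:
            args = Namespace()  # assumed fallback, for illustration only
        return cls(name, args)


# Before v0.3.0 the call shape was from_pretrained(args, name).
model_default = ToyModel.from_pretrained("toy-base")
model_custom = ToyModel.from_pretrained("toy-base", args=Namespace(fp16=True))
```

Code written against the old order passes `args` positionally as the first argument, so it would need to be updated when upgrading.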

2022.6.27

1. Fix *flat_output bug.
2. Fix the default mpu `init_method` bug.

2022.6.6

1. `from_pretrained` now downloads models automatically. There are two usages: `SomeModel.from_pretrained(args, name)` loads the weights of the `name` model into a `SomeModel` that uses the same architecture hyperparameters as `name`; `AutoModel.from_pretrained(args, name)` returns an official model (of class `model_class`) with the pretrained weights.
2. The environment variable `SAT_HOME` is where the models are stored. Set it in your shell startup file.
3. `from_pretrained` no longer necessarily needs `deepspeed_config` or the model architecture hyperparameters to be passed. Use `zero-stage 0/1/2`.
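
As a rough sketch of how a storage root such as `SAT_HOME` might be resolved, the helper below reads the variable and falls back to a default when it is unset. `resolve_model_dir` and the fallback directory name are hypothetical and only illustrate the pattern; they do not describe the library's actual lookup.

```python
import os
from pathlib import Path


def resolve_model_dir(name, env=None):
    """Return the directory where model `name` would be stored.

    The fallback path is a made-up default for this sketch.
    """
    env = os.environ if env is None else env
    root = env.get("SAT_HOME") or str(Path.home() / ".sat_models_example")
    return Path(root) / name


# Passing `env` explicitly keeps the example independent of the real shell.
print(resolve_model_dir("toy-base", env={"SAT_HOME": "/data/sat"}))
```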

2022.6.3

1. Split out all the default hooks.
2. Change the order: model hooks no longer override everything. They are now treated the same as mixin hooks added in **front** of all the mixins.

2022.1.13

1. Add ViT.
2. Fix the evaluation `all_reduce` bug.

2021.12.13

1. Ensure enough training data, no longer always 200 times
2. In `position/word_embedding_forward`, you can set `kw_args['cross_layer_output']['new_key'] = xxx` to pass other results to each layer.
3. Add `--train-data-weights`.
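
The `cross_layer_output` mechanism in item 2 can be pictured with a toy sketch. All function bodies below, the `token_count` key, and the `run` driver are hypothetical; only the idea of writing into `kw_args['cross_layer_output']` during the embedding step so that later layers can read it comes from the note above.

```python
def word_embedding_forward(token_ids, **kw_args):
    # Stash an extra result for later layers (the key name is illustrative).
    kw_args['cross_layer_output']['token_count'] = len(token_ids)
    return [float(t) for t in token_ids]


def layer_forward(hidden, **kw_args):
    # Each layer reads what the embedding step stored.
    scale = kw_args['cross_layer_output']['token_count']
    return [h / scale for h in hidden]


def run(token_ids, num_layers=2):
    # One inner dict is shared by every call, so writes stay visible.
    shared = {'cross_layer_output': {}}
    hidden = word_embedding_forward(token_ids, **shared)
    for _ in range(num_layers):
        hidden = layer_forward(hidden, **shared)
    return hidden
```

The sketch relies on the inner dictionary being the same object in every call, which is why the embedding step mutates it rather than rebinding it.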

