Swissarmytransformer

Latest version: v0.4.12



2021.11.5

1. Add `generation.autoregressive_sampling.evaluate_perplexity`
2. Fix `RuntimeError` when skipping NaN loss
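
For readers unfamiliar with the metric behind item 1, perplexity is the exponential of the mean negative log-likelihood per token. The snippet below is a minimal standalone sketch of that definition, not SAT's implementation; the function name `perplexity` and its list-of-log-probabilities input are assumptions made for illustration.

```python
import math


def perplexity(token_log_probs):
    """Perplexity of one sequence from per-token natural-log probabilities.

    Hypothetical helper for illustration; it is not part of SAT.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Mean negative log-likelihood over the tokens of the sequence.
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that assigns probability 0.25 to every token is as uncertain
# as a uniform choice among 4 options, so its perplexity is close to 4.
uniform_log_probs = [math.log(0.25)] * 8
print(perplexity(uniform_log_probs))
```

Lower values mean the model assigns higher probability to the observed tokens.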

2021.10.29

1. change `mixins` from `ModuleList` to `ModuleDict`
2. `fill_sequence` now returns both tokens and mems, and mems is now a tensor.
3. `CachedAutoRegressiveMixin`

How to migrate an old SAT checkpoint to the new version? Example:

```python
import torch
old = torch.load('xxxxx/mp_rank_00_model_states.pt.old', map_location='cpu')

# replace names, mixins index to keys
oldm = old['module']
for k in list(oldm.keys()):
    if k.startswith('mixins.0'):
        new_k = k.replace('mixins.0', 'mixins.extra_position_embedding')
    elif k.startswith('mixins.1'):
        new_k = k.replace('mixins.1', 'mixins.attention_plus')
    else:
        continue
    oldm[new_k] = oldm[k]
    del oldm[k]
# save to destination
torch.save(old, 'xxxxx/mp_rank_00_model_states.pt')
```


For the older framework, you also need:

```python
old['module']['transformer.word_embeddings.weight'] = old['module']['word_embeddings.weight']
del old['module']['word_embeddings.weight']
```
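
The two migration steps above can be folded into one reusable function. The following is a sketch under stated assumptions: `migrate_state_dict`, its parameters, and the sample key suffixes are hypothetical and not part of SAT. It works on any plain mapping of key names to values, so it can be tried without a real checkpoint. It matches on the prefix with a trailing dot (`mixins.1.`), which avoids accidentally renaming a key such as `mixins.10.…` in models with many mixins.

```python
def migrate_state_dict(module_state, mixin_names, move_word_embeddings=False):
    """Return a copy of module_state with index-based mixin keys renamed.

    Hypothetical helper: mixin_names[i] is the ModuleDict key that
    replaces the old ModuleList index i.
    """
    migrated = {}
    for key, value in module_state.items():
        new_key = key
        for index, name in enumerate(mixin_names):
            prefix = f"mixins.{index}."
            if key.startswith(prefix):
                new_key = f"mixins.{name}." + key[len(prefix):]
                break
        # Extra step needed only for checkpoints from the older framework.
        if move_word_embeddings and key == "word_embeddings.weight":
            new_key = "transformer.word_embeddings.weight"
        migrated[new_key] = value
    return migrated


# Toy stand-in for old['module']; the key suffixes are made up.
old_module = {
    "mixins.0.position_embeddings.weight": "pos",
    "mixins.1.query_key_value.0.weight": "qkv",
    "word_embeddings.weight": "emb",
}
new_module = migrate_state_dict(
    old_module,
    ["extra_position_embedding", "attention_plus"],
    move_word_embeddings=True,
)
print(sorted(new_module))
```

With a real checkpoint, the result would be assigned back to `old['module']` before calling `torch.save`.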

0.4.11

1. fix the tarfile buffer_size bug in 0.4.9 and 0.4.10.
2. fix a potential problem when passing a mixed-device model to `training_main`
3. fix emaadam no use error introduced in 0.4.9 and 0.4.10.

0.4.10

1. fix model parallel init possible bug by additional broadcast
2. add nsys profiling
3. add gated mlp option
4. support batch_from_same_dataset for multi-webds
5. fix cmp kernel quant no bias bug

0.4.6

1. add droppath and checkpoint last layer skip
2. support multiple webdataset weighting
3. fix lora merging
4. support different learning rates for different parts of the model: set an 'lr' attribute on parameters in `disable_untrainable_params`.
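
To illustrate the idea behind item 4, the sketch below groups parameters by an optional `lr` attribute into the list-of-dicts format that PyTorch optimizers accept as parameter groups. How SAT consumes the attribute internally is not stated in this changelog, so treat this as an assumption; `FakeParam` and `build_param_groups` are hypothetical names, and the example uses plain Python objects so it runs without PyTorch.

```python
from collections import defaultdict


class FakeParam:
    """Hypothetical stand-in for a parameter; not a SAT or PyTorch class."""

    def __init__(self, name, lr=None):
        self.name = name
        if lr is not None:
            # Mimics tagging a parameter with its own learning rate.
            self.lr = lr


def build_param_groups(params, default_lr):
    """Group parameters by their 'lr' attribute, falling back to default_lr."""
    by_lr = defaultdict(list)
    for p in params:
        by_lr[getattr(p, "lr", default_lr)].append(p)
    # Same shape as optimizer parameter groups: [{'params': [...], 'lr': x}, ...]
    return [{"params": group, "lr": lr} for lr, group in by_lr.items()]


params = [FakeParam("backbone.w"), FakeParam("head.w", lr=1e-5), FakeParam("backbone.b")]
groups = build_param_groups(params, default_lr=1e-4)
for g in groups:
    print(g["lr"], [p.name for p in g["params"]])
```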

0.4.1

1. better model parallel support (training mode split)
2. better default zero 1/2 config
3. test bf16 training
4. change qkv order of chatglm1
5. only use pytorch 2.0 attention when full / causal.
