Rtg

Latest version: v0.7.2

0.7

- Big improvements:
  - Autocast / mixed precision: `bfloat16` instead of `float16`. We can now train larger models on larger batches using 16-bit float ops without the loss overflowing to infinity (see the sketch after this list).
    - WARNING: requires PyTorch 1.10 or newer. Please upgrade!
  - Validation BLEU scores are computed without teacher forcing, i.e. similar to inference, so they are a more realistic estimate of test-time BLEU.
    - WARNING: validations can be slower. Don't use an overly large validation set.
- Schedule:
  - `inverse_sqrt` supports a scaler multiplier term, similar to `noam` (see the sketch below).
  - `inverse_root` schedule added; a generalization of `inverse_sqrt`.
- Fixes:
  - `rtg.prep` CLI arguments now work.
  - Optimizer state is now loaded correctly when resuming training.
  - Parent model is recreated if missing, even when the `_PREPARED` flag exists.
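The bfloat16 autocast item above boils down to a small change in the training step. Here is a minimal PyTorch sketch (the model, batch, and optimizer are placeholders, not RTG code); because `bfloat16` keeps float32's exponent range, no `GradScaler` is needed and the loss does not overflow the way it can with `float16`:

```python
import torch

# assumes PyTorch >= 1.10, where torch.autocast accepts dtype=torch.bfloat16,
# and a CUDA device; model and data below are placeholders, not RTG's API
model = torch.nn.Linear(512, 512).cuda()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
x = torch.randn(32, 512, device='cuda')
target = torch.randn(32, 512, device='cuda')

optimizer.zero_grad()
with torch.autocast(device_type='cuda', dtype=torch.bfloat16):
    # forward pass runs in bfloat16 where safe; reductions stay in float32
    loss = torch.nn.functional.mse_loss(model(x), target)
loss.backward()   # no GradScaler: bfloat16 has float32's exponent range
optimizer.step()
```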

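The `inverse_sqrt` / `inverse_root` schedules follow the usual warmup-then-decay shape; the function below is a hedged sketch of that shape (parameter names and defaults are illustrative, not RTG's exact config keys, and RTG's formula may differ in detail):

```python
def inverse_root_lr(step: int, peak_lr: float = 3e-4, warmup: int = 4000,
                    root: float = 2.0) -> float:
    """Inverse-root learning-rate schedule: linear warmup to peak_lr,
    then decay proportional to step ** (-1 / root).
    root=2.0 reproduces the classic inverse_sqrt decay.
    Illustrative sketch only; not RTG's actual implementation."""
    step = max(step, 1)
    if step < warmup:
        return peak_lr * step / warmup                  # linear warmup
    return peak_lr * (warmup / step) ** (1.0 / root)    # inverse-root decay
```

With `root=2.0` this reduces to the familiar inverse-sqrt decay; larger roots decay more slowly. The scaler multiplier term mentioned above corresponds to scaling `peak_lr`.
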
0.6.1

- `rtg.fork` accepts multiple `to_dir` arguments, so an experiment can be cloned multiple times at once
- Bug fix: early stopping on distributed parallel training
- `rtg.tool.augment` to support data augmentations
- Attention visualization added in `rtg.serve`, powered by Plotly
- `rtg.pipeline` and `rtg.fork` use relative symlinks instead of absolute paths
- `rtg.decode` reports decoding speed (`segs`, `src_toks`, `hyp_toks`)
- `batch_size` is auto-adjusted based on the number of workers and `gradient_accum` (finally! see the sketch after this list)
- `batch_size` normalizer in the distributed training setting (fix: faster convergence now)
- support for `byte` encoding added
- Validation metrics: previously BLEU was computed with teacher forcing (like the validation loss); now BLEU comes from autoregressive output, resembling test time
- Use `bfloat16` for mixed-precision training; requires torch 1.10+
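
The batch-size auto-adjustment and normalization above is, at its core, simple arithmetic: with N data-parallel workers and G gradient-accumulation steps, each worker only needs `batch_size / (N * G)` examples per step to keep the effective global batch constant. A back-of-the-envelope sketch (illustrative only, not RTG's internal logic):

```python
def per_step_batch_size(global_batch_size: int, num_workers: int,
                        grad_accum: int) -> int:
    """Split a desired global batch size across data-parallel workers and
    gradient-accumulation steps. Illustrative only; RTG's internal logic
    may round or cap differently."""
    per_step = global_batch_size // (num_workers * grad_accum)
    assert per_step > 0, "global batch size too small for this setup"
    return per_step

# e.g. a 4096-token global batch on 4 GPUs with 2 accumulation steps
# -> each GPU processes 512 tokens per forward/backward pass
print(per_step_batch_size(4096, num_workers=4, grad_accum=2))  # 512
```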

0.6.0

- Redesign of the registry; decorators are now used to register all modules (see the sketch at the end of this section)
- `optim` block is split into `optimizer`, `schedule`, and `criterion`; as a result, **this version is not backward compatible with prior versions.** Refer to the migration guide
- `NoamOpt` replaced with `ScheduledOptimizer` which takes scheduler and optimizer objects which are independently configurable from conf.yml

- Add transformer sequence classification model `tfmcls`; it supports initialization from a pretrained NMT model (picks encoder layers, source embeddings, and source vocabulary from the NMT experiment)
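
A decorator-based registry typically reads like the sketch below (the `register` function, `MODELS` dict, and class names here are illustrative, not RTG's actual identifiers):

```python
from typing import Callable, Dict, Type

MODELS: Dict[str, Type] = {}   # registry: name -> model class

def register(name: str) -> Callable[[Type], Type]:
    """Class decorator that records a model class under a string key."""
    def wrap(cls: Type) -> Type:
        if name in MODELS:
            raise KeyError(f"{name!r} is already registered")
        MODELS[name] = cls
        return cls
    return wrap

@register("my_nmt")        # hypothetical key for illustration
class TransformerNMT:
    ...

@register("tfmcls")        # the classifier model mentioned above
class TransformerClassifier:
    ...

# the model name from a config can then be resolved without if/else chains
model_cls = MODELS["tfmcls"]
```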

0.5.2

- Fix a bug in `rtg.decode` caused by partial migration to the new API
- Test case added for the `decode` API so such errors are caught in the future

0.5.1

- `rtg.serve` supports flexible transformations on source (pre-processing) and target (post-processing)
- Travis build configured to auto run tests
- Sequence classification is now supported via the `tfmcls` model (a conceptual sketch follows this list)
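
Conceptually, a `tfmcls`-style model reuses an encoder and puts a classification head on top. A minimal, self-contained sketch of that idea (module and argument names are invented for illustration; RTG's actual `tfmcls` model is configured differently):

```python
import torch
import torch.nn as nn

class EncoderClassifier(nn.Module):
    """Mean-pool a (pretrained) encoder's outputs and apply a linear head.
    Illustrative sketch only, not RTG's tfmcls implementation."""
    def __init__(self, encoder: nn.Module, d_model: int, n_classes: int):
        super().__init__()
        self.encoder = encoder           # e.g. layers taken from an NMT model
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.encoder(x)              # (batch, seq_len, d_model)
        return self.head(h.mean(dim=1))  # pool over time, then classify

# toy usage with a stand-in encoder
enc = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True),
    num_layers=2)
clf = EncoderClassifier(enc, d_model=64, n_classes=3)
logits = clf(torch.randn(8, 20, 64))     # -> shape (8, 3)
```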

0.5.0

- DDP: multi-node training; see `scripts/slurm-multinode-launch.sh`
- FP16 and mixed precision (upgraded from APEX to torch's built-in AMP)
- NLCodec & NLDb integration for scaling to large datasets using a PySpark backend
- Web UI for `rtg-serve`
- Cache ensemble state for `rtg-decode`
- Docker images for the 500-eng model
- Parent-child transfer: shrink the parent model's vocabulary and embeddings to the child dataset (see the sketch after this list)
- Fix packaging of the Flask app: templates and static files are now also included in the PyPI package
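
The parent-child transfer step above amounts to selecting the parent embedding rows that the child vocabulary actually uses. A hedged sketch, assuming a simple token-to-id mapping (not RTG's NLCodec types, and without the special-token handling RTG does):

```python
import torch

def shrink_embedding(parent_emb: torch.Tensor, parent_vocab: dict,
                     child_tokens: list) -> torch.Tensor:
    """Pick the parent embedding rows for tokens kept in the child vocab.
    parent_emb: (parent_vocab_size, d_model); parent_vocab: token -> row id.
    Illustrative only; RTG handles unknown and special tokens explicitly."""
    rows = [parent_vocab[tok] for tok in child_tokens if tok in parent_vocab]
    return parent_emb[torch.tensor(rows)]   # (child_vocab_size, d_model)

# toy example: keep 3 of 5 parent types
parent_emb = torch.randn(5, 8)
parent_vocab = {'<pad>': 0, '<unk>': 1, 'hello': 2, 'world': 3, 'rtg': 4}
child_emb = shrink_embedding(parent_emb, parent_vocab, ['<pad>', '<unk>', 'rtg'])
print(child_emb.shape)   # torch.Size([3, 8])
```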
