- Extended parallelization of data preparation to vocabulary and statistics creation while minimizing the overhead of sharding.
2.3.20
Added
- Added debug logging for `restrict_lexicon` lookups.
2.3.19
Changed
- When training only the decoder (`--fixed-param-strategy all_except_decoder`), disable autograd for the encoder and embeddings to save memory.
2.3.18
Changed
- Updated Docker builds and documentation. See [sockeye_contrib/docker](sockeye_contrib/docker).
2.3.17
Added
- Added an alternative, faster implementation of greedy search. The `--greedy` flag to `sockeye.translate` will enable it. This implementation does not support hypothesis scores, batch decoding, or lexical constraints.
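
For readers unfamiliar with the technique, the following is a minimal, library-free sketch of greedy search in general, not Sockeye's implementation. The names `greedy_search`, `toy_step`, and `NEXT_TOKEN` are hypothetical, and the "model" is a toy lookup table standing in for a real decoder step.

```python
from typing import Callable, List, Sequence

def greedy_search(step_fn: Callable[[Sequence[int]], Sequence[float]],
                  bos_id: int, eos_id: int, max_len: int) -> List[int]:
    """Pick the single highest-scoring token at each step.

    Decoding stops once EOS is produced or max_len tokens were generated.
    No alternative hypotheses are kept, which is what makes it cheap.
    """
    tokens = [bos_id]
    for _ in range(max_len):
        scores = step_fn(tokens)  # one score per vocabulary id
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:
            break
    return tokens[1:]  # drop the BOS symbol

# Hypothetical toy "model": each token deterministically prefers one successor.
NEXT_TOKEN = {0: 3, 3: 2, 2: 4, 4: 1}

def toy_step(prefix: Sequence[int]) -> List[float]:
    scores = [0.0] * 5
    scores[NEXT_TOKEN[prefix[-1]]] = 1.0
    return scores

print(greedy_search(toy_step, bos_id=0, eos_id=1, max_len=10))
```

Because only one token sequence is tracked, there is no beam to score or constrain, which is consistent with the limitations listed in the entry above.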
2.3.16
Added
- Added option `--transformer-feed-forward-use-glu` to use Gated Linear Units in transformer feed forward networks ([Dauphin et al., 2016](https://arxiv.org/abs/1612.08083); [Shazeer, 2020](https://arxiv.org/abs/2002.05202)).
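
As a rough illustration of what a gated feed-forward layer computes, here is a plain-Python sketch of the general GLU formulation from the cited papers, `(gate(x W) * (x V)) W2`. It is not Sockeye's code: the function and parameter names are hypothetical, a sigmoid gate is assumed, and bias terms and dropout are omitted.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[List[float]]  # row-major: matrix[i] is row i

def matvec(x: Vector, w: Matrix) -> Vector:
    """Compute x @ w for a row vector x and a matrix w with len(x) rows."""
    return [sum(x[i] * w[i][j] for i in range(len(x)))
            for j in range(len(w[0]))]

def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def glu_feed_forward(x: Vector, w_gate: Matrix, w_value: Matrix,
                     w_out: Matrix) -> Vector:
    """Feed-forward block with a gated hidden layer (sigmoid gate assumed)."""
    gate = [sigmoid(g) for g in matvec(x, w_gate)]  # values in (0, 1)
    value = matvec(x, w_value)                      # linear projection
    hidden = [g * v for g, v in zip(gate, value)]   # element-wise gating
    return matvec(hidden, w_out)                    # project back down

x = [1.0, 2.0]
w_gate = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]   # zero logits give a gate of 0.5
w_value = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
w_out = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print(glu_feed_forward(x, w_gate, w_value, w_out))
```

Compared with a standard feed-forward block, the gated variant uses two input projections instead of one, so the hidden size is often reduced to keep the parameter count comparable.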