Changed - Extracted the vocabulary preparation logic of the build vocab step into its own function. This matches the pattern in prepare data and train, where `main()` only parses arguments and then invokes a separate function to do the work. Modules that import this one can therefore call that function directly and bypass the command line.
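The pattern described in this entry can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `prepare_vocab`, its parameters, and the `--inputs` and `--num-words` flags are hypothetical and chosen only for the example.

```python
import argparse
from collections import Counter
from typing import Dict, List, Optional


def prepare_vocab(sentences: List[str],
                  num_words: Optional[int] = None,
                  min_count: int = 1) -> Dict[str, int]:
    """Hypothetical worker function: builds a word-to-id map from tokenized lines.

    It takes plain Python arguments, so importing modules can call it
    without constructing a command line.
    """
    counts = Counter(tok for line in sentences for tok in line.split())
    kept = [w for w, c in counts.most_common(num_words) if c >= min_count]
    return {w: i for i, w in enumerate(kept)}


def main(argv: Optional[List[str]] = None) -> None:
    """Command-line entry point: only parses arguments, then delegates."""
    parser = argparse.ArgumentParser(description="Build a vocabulary (sketch).")
    parser.add_argument("--inputs", nargs="+", required=True)   # hypothetical flag
    parser.add_argument("--num-words", type=int, default=None)  # hypothetical flag
    args = parser.parse_args(argv)

    sentences: List[str] = []
    for path in args.inputs:
        with open(path, encoding="utf-8") as f:
            sentences.extend(f)

    vocab = prepare_vocab(sentences, num_words=args.num_words)
    print(len(vocab))
```

A caller that imports the module can then use `prepare_vocab(["a b a", "b a c"])` directly, while a script wrapper would call `main()` to go through argument parsing.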
1.18.95
Changed - Removed custom operators from transformer models and replaced them with symbolic operators. This improves performance.
1.18.94
Added - Added the ability to accumulate gradients over multiple batches (`--update-interval`). This allows simulating large batch sizes in environments with limited memory. For example, training with `--batch-size 4096 --update-interval 2` should be close to training with `--batch-size 8192`, with a smaller memory footprint.
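To illustrate why accumulating gradients over several small batches approximates one large batch, here is a toy sketch in plain Python. It is not the project's implementation: the one-parameter model, the helper names `grad_sum` and `train`, and the choice to average the accumulated gradient over all samples seen are assumptions made for the example.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (x, y) pair for the toy model y ≈ w * x


def grad_sum(w: float, batch: List[Sample]) -> float:
    """Sum over the batch of d/dw (w*x - y)^2 = 2 * (w*x - y) * x."""
    return sum(2.0 * (w * x - y) * x for x, y in batch)


def train(w: float, batches: List[List[Sample]],
          update_interval: int, lr: float) -> float:
    """Update `w` once every `update_interval` batches.

    Gradients are accumulated without touching `w` in between, then
    averaged over all samples seen since the last update (an assumption
    of this sketch).
    """
    acc, n = 0.0, 0
    for i, batch in enumerate(batches, start=1):
        acc += grad_sum(w, batch)
        n += len(batch)
        if i % update_interval == 0:
            w -= lr * acc / n
            acc, n = 0.0, 0
    return w


data = [(1.0, 2.0), (2.0, 4.1), (3.0, 5.9), (4.0, 8.2)]

# One batch of 4 samples, updated immediately.
w_large = train(0.0, [data], update_interval=1, lr=0.01)

# Two batches of 2 samples each, updated after both have been seen.
w_accum = train(0.0, [data[:2], data[2:]], update_interval=2, lr=0.01)
```

In this toy setting both runs take the same step, because every gradient is evaluated at the same `w` before the update. Only one small batch has to be held in memory at a time, which is where the memory saving comes from. In a real model, effects such as per-batch normalization or padding can make the two settings close rather than identical.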
1.18.93
Fixed - Made the `brevity_penalty` argument of the `Translator` class optional to ensure backwards compatibility.
1.18.92
Added - Added sentence length (and length ratio) prediction to discourage hypotheses that are too short at inference time. It can be enabled with `--length-task` during training and with `--brevity-penalty-type` during inference.