Added
- Added ELECTRA pretraining support.
- Added better support for configuring model architectures when training language models from scratch.
- Any options that should be overridden in the default config can now be specified in the `args` dict under the `config` key (see the sketch below).
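For example, the override entries might look like the following minimal sketch; the specific config values are illustrative assumptions, not defaults documented in this release:

```python
# Entries under the "config" key override the corresponding values in the
# default model config when training a language model from scratch.
model_args = {
    "config": {
        "hidden_size": 256,       # illustrative value
        "num_hidden_layers": 4,   # illustrative value
    },
}
# model_args would then be passed as the args dict to LanguageModelingModel.
```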
Changed
- Removed the default `vocab_size` entry from `args` for `LanguageModelingModel`, since the appropriate value differs between model types.
- `vocab_size` must now be specified explicitly whenever a new tokenizer is to be trained (see the sketch below).
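A sketch of what this looks like in practice; the model type, vocabulary size, and file name are illustrative assumptions rather than values from this release:

```python
from simpletransformers.language_modeling import LanguageModelingModel

model_args = {
    "vocab_size": 52000,  # no library default any more; set explicitly when training a new tokenizer
}

# model_name=None trains from scratch, and train_files supplies the raw text
# used to train the new tokenizer; "corpus.txt" is a placeholder.
model = LanguageModelingModel(
    "bert",
    None,
    args=model_args,
    train_files="corpus.txt",
)
```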
Fixed
- Fixed bugs that occurred when training BERT (WordPiece tokenization) language models from scratch.
- Fixed incorrect special tokens being used with BERT models when training a new tokenizer.
- Fixed potential bugs with BERT tokenizer training.