News
====
- GluonNLP will be featured at KDD 2019 in Alaska! Check out our tutorial: [From Shallow to Deep Language Representations: Pre-training, Fine-tuning, and Beyond](https://www.kdd.org/kdd2019/hands-on-tutorials).
- GluonNLP was featured at JSALT 2019 in Montreal on June 14, 2019! Check out https://jsalt19.mxnet.io.
- This is the last GluonNLP release that will officially support Python 2. (721)
Models and Scripts
==================
BERT
- Added a BERT BASE model pre-trained on a large corpus including the [OpenWebText Corpus](https://skylion007.github.io/OpenWebTextCorpus/), BooksCorpus, and English Wikipedia, with performance comparable to Google's BERT LARGE model. Test scores on the GLUE benchmark are reported below, and a loading sketch follows the table. The usability of the BERT pre-training script is also improved: on-the-fly training data generation, SentencePiece support, Horovod support, etc. (799, 687, 806, 669, 665). Thanks to davisliang, vanyacohen, and Skylion007.
| Source | GluonNLP | google-research/bert | google-research/bert |
|-----------|-----------------------------------------|-----------------------------|-----------------------------|
| Model | bert_12_768_12 | bert_12_768_12 | bert_24_1024_16 |
| Dataset | `openwebtext_book_corpus_wiki_en_uncased` | `book_corpus_wiki_en_uncased` | `book_corpus_wiki_en_uncased` |
| SST-2 | **95.3** | 93.5 | 94.9 |
| RTE | **73.6** | 66.4 | 70.1 |
| QQP | **72.3** | 71.2 | 72.1 |
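For reference, the new model can be loaded through the standard GluonNLP model API. The snippet below is a minimal sketch, not the official fine-tuning script: the `get_model` and `BERTSentenceTransform` calls follow the usual GluonNLP BERT patterns, and the dataset name is the one listed in the table above.

```python
import mxnet as mx
import gluonnlp as nlp

# Sketch: load the BERT BASE model pre-trained on OpenWebText + BooksCorpus +
# English Wikipedia. The dataset name is taken from the table above.
bert, vocab = nlp.model.get_model(
    'bert_12_768_12',
    dataset_name='openwebtext_book_corpus_wiki_en_uncased',
    pretrained=True,
    ctx=mx.cpu(),
    use_pooler=True,
    use_decoder=False,     # masked-LM decoder not needed for fine-tuning
    use_classifier=False)  # next-sentence classifier not needed either

# Encode one sentence into token ids, valid length, and segment ids.
tokenizer = nlp.data.BERTTokenizer(vocab, lower=True)
transform = nlp.data.BERTSentenceTransform(tokenizer, max_seq_length=128,
                                           pair=False)
ids, valid_len, segments = transform(('Hello, GluonNLP!',))

# Forward pass; with use_pooler=True the model returns the sequence
# encoding and the pooled [CLS] representation.
seq_encoding, pooled = bert(mx.nd.array([ids]),
                            mx.nd.array([segments]),
                            mx.nd.array([valid_len]))
print(seq_encoding.shape, pooled.shape)  # (1, 128, 768) (1, 768)
```

The pooled output is the usual starting point for sentence-level fine-tuning tasks such as the GLUE ones reported above.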