========
------------------------------
Major features and improvements
------------------------------
* Introducing MarianTransformer annotator for machine translation based on MarianNMT models. Marian is an efficient, free Neural Machine Translation framework mainly being developed by the Microsoft Translator team (646+ pretrained models & pipelines in 192+ languages)
* Introducing T5Transformer annotator for Text-To-Text Transfer Transformer (Google T5) models to achieve state-of-the-art results on multiple NLP tasks such as Translation, Summarization, Question Answering, Sentence Similarity, and so on
* Introducing brand new and refactored language detection and identification models. The new LanguageDetectorDL is faster, more accurate, and supports up to 375 languages
* Introducing WordSegmenter annotator, a trainable annotator for word segmentation of languages without any rule-based tokenization such as Chinese, Japanese, or Korean
* Introducing DocumentNormalizer annotator cleaning content from HTML or XML documents, applying either data cleansing using an arbitrary number of custom regular expressions either data extraction following the different parameters
* [Spark NLP Display](https://github.com/JohnSnowLabs/spark-nlp-display) for visualization of different types of annotations
* Add support for new multi-lingual models in UniversalSentenceEncoder annotator
* Add support to Lemmatizer to be trained directly from a DataFrame instead of a text file
* Add training helper to transform CoNLL-U into Spark NLP annotator type columns
----------------
Bugfixes and Enhancements
----------------
* Fix all the known issues in ClassifierDL, SentimentDL, and MultiClassifierDL annotators in a Cluster
* NerDL enhancements for memory optimization and logging during the training with the test dataset
* SentenceEmbeddings annotator now reuses the storageRef of any embeddings used in prior
* Fix dropout in SentenceDetectorDL models for more deterministic results. Both English and Multi-lingual models are retrained for the 2.7.0 release
* Fix Python dataType Annotation
* Upgrade to Apache Spark 2.4.7
========