We are incredibly excited to announce John Snow Labs NLU 3.4.0 has been released!
This release features `11 new annotator classes` and `80` new models, including 3 `OCR Transformers` which enable you to extract text
from various file types, support for `GPT2` and new pretrained `T5` models for **Text Generation**, and dozens more new transformer-based models
for **Token and Sequence Classification**.
This includes `8 new Sequence classifier models` which can be pretrained in Huggingface and imported into Spark NLP and NLU.
Finally, the NLU tutorial page with [140+ notebooks has been updated](https://nlu.johnsnowlabs.com/docs/en/notebooks).
**New** NLU OCR Features
3 new OCR-based spells are supported, which enable extracting `text` from files of type
`JPEG`, `PNG`, `BMP`, `WBMP`, `GIF`, `JPG`, `TIFF`, `DOCX`, `PDF` in just 1 line of code.
You need a Spark OCR license to use them, which is available for [free here](https://www.johnsnowlabs.com/spark-nlp-try-free/). For a walkthrough, refer to the new
[OCR tutorial notebook](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/ocr/ocr_for_img_pdf_docx_files.ipynb)
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/ocr/ocr_for_img_pdf_docx_files.ipynb)
Find more details on the [NLU OCR documentation page](https://nlu.johnsnowlabs.com/docs/en/nlu_for_ocr).
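As a minimal sketch of the one-line usage described above, the helper below picks an OCR spell from a file extension and passes the file path to `nlu.load(...).predict(...)`. The extension-to-spell grouping and the names `OCR_SPELLS`, `spell_for` and `extract_text` are assumptions made for illustration and are not part of NLU; running `extract_text` also assumes that `nlu`, `pyspark` and a valid Spark OCR license are already set up.

```python
from pathlib import Path

# Extension -> NLU OCR spell. This grouping is an assumption derived from the
# file types and the three spells named in these release notes.
OCR_SPELLS = {
    **{ext: "img2text" for ext in ("jpeg", "jpg", "png", "bmp", "wbmp", "gif", "tiff")},
    "pdf": "pdf2text",
    "docx": "doc2text",
}


def spell_for(path: str) -> str:
    """Return the OCR spell matching the file extension of `path` (hypothetical helper)."""
    ext = Path(path).suffix.lower().lstrip(".")
    if ext not in OCR_SPELLS:
        raise ValueError(f"Unsupported file type: {path}")
    return OCR_SPELLS[ext]


def extract_text(path: str):
    """Run OCR on a single file and return the NLU prediction result.

    Needs nlu, pyspark and a Spark OCR license; nlu is imported lazily so
    that spell_for() can be used without NLU installed.
    """
    import nlu

    return nlu.load(spell_for(path)).predict(path)
```

With a licensed environment, a call such as `extract_text("path/to/scan.png")` should be equivalent to `nlu.load("img2text").predict("path/to/scan.png")`.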
**New** NLU Healthcare Features
The healthcare side features a new `MedicalBertForTokenClassifier` annotator, which is a BERT-based model for token classification problems like `Named Entity Recognition`,
`Parts of Speech` and much more. Overall there are `28` new models, which include German De-Identification models, English NER models for extracting `Drug Development Trials` and
`Clinical Abbreviations and Acronyms`, NER models for chemical compounds/drugs and genes/proteins, and updated `MedicalBertForTokenClassifier` NER models for the medical domains `Adverse Drug Events`,
`Anatomy`, `Chemicals`, `Genes`, `Proteins`, `Cellular/Molecular Biology`, `Drugs`, `Bacteria`, `De-Identification` and general Medical and Clinical Named Entities.
For **Entity Relation Extraction** between entity pairs, there are new models for interactions between `Drugs and Proteins`.
For **Entity Resolution**, there are new models for resolving `Clinical Abbreviations and Acronyms` to their full-length names, a model for resolving `Drug Substance Entities` to the categories
`Clinical Drug`, `Pharmacologic Substance`, `Antibiotic` and `Hazardous or Poisonous Substance`, and new resolvers for `LOINC` and `SNOMED` terminologies.
**New** NLU Open source Features
On the open source side we have new support for [OpenAI's `GPT2`](https://openai.com/blog/tags/gpt-2/) for various text sequence to sequence problems.
Additionally, the following new Transformer models are supported:
`RoBertaForSequenceClassification`, `XlmRoBertaForSequenceClassification`, `LongformerForSequenceClassification`,
`AlbertForSequenceClassification`, `XlnetForSequenceClassification` and `Word2Vec`, with pre-trained weights for various problems!
New **GPT2** models for generating text conditioned on some input.
New **T5 style transfer models** for `active to passive`, `formal to informal`, `informal to formal`, `passive to active` sequence to sequence generation.
Additionally, a new T5 model for generating SQL code from natural language input is provided.
On top of this, dozens of new Transformer-based Sequence Classifiers and Token Classifiers have been released. For `Token Classifier`, this includes the following models:
Multi-Lingual general NER models for **10 African Languages** (`Amharic`, `Hausa`, `Igbo`, `Kinyarwanda`, `Luganda`, `Luo`, `Nigerian Pidgin`, `Swahili`, `Wolof`, and `Yorùbá`),
**10 high resourced languages** (`Arabic`, `German`, `English`, `Spanish`, `French`, `Italian`, `Latvian`, `Dutch`, `Portuguese` and `Chinese`),
**6 Scandinavian languages** (`Danish`, `Norwegian-Bokmål`, `Norwegian-Nynorsk`, `Swedish`, `Icelandic`, `Faroese`),
Uni-Lingual NER models for general entities in the languages `Chinese`, `Hindi`, `Icelandic`, `Indonesian`
and finally English NER models for extracting entities related to `Stock Ticker Symbols`, `Restaurants`, `Time`.
For `Sequence Classification`, there are new models for classifying `Toxicity in Russian text` and English models for
`Movie Reviews`, `News Categorization`, `Sentimental Tone` and `General Sentiment`.
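The open source models described above are all loaded the same way, through their NLU reference. The sketch below illustrates that pattern, assuming `nlu` and `pyspark` are installed. The references are copied from the New Open Source Models table in these notes, while the task labels, `EXAMPLE_SPELLS` and `run_spell` are hypothetical names chosen for this example. Some T5 models may additionally need a task prefix configured, which is not shown here.

```python
# NLU references taken from the "New Open Source Models" table in these notes.
# The task labels used as keys are made up for this example.
EXAMPLE_SPELLS = {
    "text_generation": "en.gpt2",
    "text_to_sql": "en.t5.wikiSQL",
    "informal_to_formal": "en.t5.informal_to_formal_styletransfer",
    "stock_ticker_ner": "en.ner.stocks_ticker",
    "movie_review_classifier": "en.classify.roberta.imdb",
}


def run_spell(task: str, text: str):
    """Load the model registered for `task` and predict on `text` (hypothetical helper).

    The lookup happens before the import, so an unknown task raises KeyError
    even when NLU is not installed.
    """
    reference = EXAMPLE_SPELLS[task]
    import nlu  # lazy import: requires nlu and pyspark at call time

    return nlu.load(reference).predict(text)
```

For instance, `run_spell("movie_review_classifier", "A wonderful film")` would download the model on first use and return the prediction result from NLU.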
New NLU OCR Models
The following Transformers have been integrated from [Spark OCR](https://nlp.johnsnowlabs.com/docs/en/ocr_pipeline_components)
| NLU Spell | Transformer Class |
|----------------------|-----------------------------------------------------------------------------------------|
| nlu.load(`img2text`) | [ImageToText](https://nlp.johnsnowlabs.com/docs/en/ocr_pipeline_components#imagetotext) |
| nlu.load(`pdf2text`) | [PdfToText](https://nlp.johnsnowlabs.com/docs/en/ocr_pipeline_components#pdftotext) |
| nlu.load(`doc2text`) | [DocToText](https://nlp.johnsnowlabs.com/docs/en/ocr_pipeline_components#doctotext) |
New Open Source Models
Integration for the 49 new models from the colossal [Spark NLP 3.4.0 release](https://nlp.johnsnowlabs.com/docs/en/release_notes#340)
| Language | NLU Reference | Spark NLP Reference | Task | Annotator Class |
|:-----------|:----------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------|:------------------------------------|
| en | [en.gpt2.distilled](https://nlp.johnsnowlabs.com/2021/12/03/gpt2_distilled_en.html) | [gpt2_distilled](https://nlp.johnsnowlabs.com/2021/12/03/gpt2_distilled_en.html) | Text Generation | GPT2Transformer |
| en | [en.gpt2](https://nlp.johnsnowlabs.com/2021/12/03/gpt2_en.html) | [gpt2](https://nlp.johnsnowlabs.com/2021/12/03/gpt2_en.html) | Text Generation | GPT2Transformer |
| en | [en.gpt2.medium](https://nlp.johnsnowlabs.com/2021/12/03/gpt2_medium_en.html) | [gpt2_medium](https://nlp.johnsnowlabs.com/2021/12/03/gpt2_medium_en.html) | Text Generation | GPT2Transformer |
| en | [en.gpt2.large](https://nlp.johnsnowlabs.com/2021/12/03/gpt_large_en.html) | [gpt_large](https://nlp.johnsnowlabs.com/2021/12/03/gpt_large_en.html) | Text Generation | GPT2Transformer |
| en | [en.t5.active_to_passive_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_active_to_passive_styletransfer_en.html) | [t5_active_to_passive_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_active_to_passive_styletransfer_en.html) | Text Generation | T5Transformer |
| en | [en.t5.formal_to_informal_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_formal_to_informal_styletransfer_en.html) | [t5_formal_to_informal_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_formal_to_informal_styletransfer_en.html) | Text Generation | T5Transformer |
| en | [en.t5.grammar_error_corrector](https://nlp.johnsnowlabs.com/2022/01/12/t5_grammar_error_corrector_en.html) | [t5_grammar_error_corrector](https://nlp.johnsnowlabs.com/2022/01/12/t5_grammar_error_corrector_en.html) | Text Generation | T5Transformer |
| en | [en.t5.informal_to_formal_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_informal_to_formal_styletransfer_en.html) | [t5_informal_to_formal_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_informal_to_formal_styletransfer_en.html) | Text Generation | T5Transformer |
| en | [en.t5.passive_to_active_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_passive_to_active_styletransfer_en.html) | [t5_passive_to_active_styletransfer](https://nlp.johnsnowlabs.com/2022/01/12/t5_passive_to_active_styletransfer_en.html) | Text Generation | T5Transformer |
| en | [en.t5.wikiSQL](https://nlp.johnsnowlabs.com/2022/01/12/t5_small_wikiSQL_en.html) | [t5_small_wikiSQL](https://nlp.johnsnowlabs.com/2022/01/12/t5_small_wikiSQL_en.html) | Text Generation | T5Transformer |
| xx | [xx.ner.masakhaner](https://nlp.johnsnowlabs.com/2021/12/06/xlm_roberta_large_token_classifier_masakhaner_xx.html) | [xlm_roberta_large_token_classifier_masakhaner](https://nlp.johnsnowlabs.com/2021/12/06/xlm_roberta_large_token_classifier_masakhaner_xx.html) | Named Entity Recognition | XlmRoBertaForTokenClassification |
| xx | [xx.ner.high_resourced_lang](https://nlp.johnsnowlabs.com/2021/12/26/xlm_roberta_large_token_classifier_hrl_xx.html) | [xlm_roberta_large_token_classifier_hrl](https://nlp.johnsnowlabs.com/2021/12/26/xlm_roberta_large_token_classifier_hrl_xx.html) | Named Entity Recognition | XlmRoBertaForTokenClassification |
| xx | [xx.ner.scandinavian](https://nlp.johnsnowlabs.com/2021/12/09/bert_token_classifier_scandi_ner_xx.html) | [bert_token_classifier_scandi_ner](https://nlp.johnsnowlabs.com/2021/12/09/bert_token_classifier_scandi_ner_xx.html) | Named Entity Recognition | BertForTokenClassification |
| en | [en.embed.electra.medical](https://nlp.johnsnowlabs.com/2022/01/04/electra_medal_acronym_en.html) | [electra_medal_acronym](https://nlp.johnsnowlabs.com/2022/01/04/electra_medal_acronym_en.html) | Embeddings | BertEmbeddings |
| en | [en.ner.restaurant](https://nlp.johnsnowlabs.com/2021/12/31/nerdl_restaurant_100d_en.html) | [nerdl_restaurant_100d](https://nlp.johnsnowlabs.com/2021/12/31/nerdl_restaurant_100d_en.html) | Named Entity Recognition | NerDLModel |
| en | [en.embed.word2vec.gigaword_wiki](https://nlp.johnsnowlabs.com/2022/01/03/word2vec_gigaword_wiki_300_en.html) | [word2vec_gigaword_wiki_300](https://nlp.johnsnowlabs.com/2022/01/03/word2vec_gigaword_wiki_300_en.html) | Embeddings | Word2VecModel |
| en | [en.embed.word2vec.gigaword](https://nlp.johnsnowlabs.com/2022/01/03/word2vec_gigaword_300_en.html) | [word2vec_gigaword_300](https://nlp.johnsnowlabs.com/2022/01/03/word2vec_gigaword_300_en.html) | Embeddings | Word2VecModel |
| en | [en.classify.xlm_roberta.imdb](https://nlp.johnsnowlabs.com/2021/12/23/xlm_roberta_base_sequence_classifier_imdb_en.html) | [xlm_roberta_base_sequence_classifier_imdb](https://nlp.johnsnowlabs.com/2021/12/23/xlm_roberta_base_sequence_classifier_imdb_en.html) | Text Classification | XlmRoBertaForSequenceClassification |
| en | [en.classify.xlm_roberta.ag_news](https://nlp.johnsnowlabs.com/2021/12/23/xlm_roberta_base_sequence_classifier_ag_news_en.html) | [xlm_roberta_base_sequence_classifier_ag_news](https://nlp.johnsnowlabs.com/2021/12/23/xlm_roberta_base_sequence_classifier_ag_news_en.html) | Text Classification | XlmRoBertaForSequenceClassification |
| en | [en.classify.roberta.imdb](https://nlp.johnsnowlabs.com/2021/12/16/roberta_base_sequence_classifier_imdb_en.html) | [roberta_base_sequence_classifier_imdb](https://nlp.johnsnowlabs.com/2021/12/16/roberta_base_sequence_classifier_imdb_en.html) | Text Classification | RoBertaForSequenceClassification |
| en | [en.classify.roberta.ag_news](https://nlp.johnsnowlabs.com/2021/12/16/roberta_base_sequence_classifier_ag_news_en.html) | [roberta_base_sequence_classifier_ag_news](https://nlp.johnsnowlabs.com/2021/12/16/roberta_base_sequence_classifier_ag_news_en.html) | Text Classification | RoBertaForSequenceClassification |
| en | [en.classify.albert.ag_news](https://nlp.johnsnowlabs.com/2021/12/16/albert_base_sequence_classifier_ag_news_en.html) | [albert_base_sequence_classifier_ag_news](https://nlp.johnsnowlabs.com/2021/12/16/albert_base_sequence_classifier_ag_news_en.html) | Text Classification | AlbertForSequenceClassification |
| en | [en.classify.albert.imdb](https://nlp.johnsnowlabs.com/2021/12/16/albert_base_sequence_classifier_imdb_en.html) | [albert_base_sequence_classifier_imdb](https://nlp.johnsnowlabs.com/2021/12/16/albert_base_sequence_classifier_imdb_en.html) | Text Classification | AlbertForSequenceClassification |
| en | [en.classify.ag_news.longformer](https://nlp.johnsnowlabs.com/2021/12/16/longformer_base_sequence_classifier_ag_news_en.html) | [longformer_base_sequence_classifier_ag_news](https://nlp.johnsnowlabs.com/2021/12/16/longformer_base_sequence_classifier_ag_news_en.html) | Text Classification | LongformerForSequenceClassification |
| en | [en.classify.imdb.xlnet](https://nlp.johnsnowlabs.com/2021/12/23/xlnet_base_sequence_classifier_imdb_en.html) | [xlnet_base_sequence_classifier_imdb](https://nlp.johnsnowlabs.com/2021/12/23/xlnet_base_sequence_classifier_imdb_en.html) | Text Classification | XlnetForSequenceClassification |
| en | [en.classify.finance_sentiment](https://nlp.johnsnowlabs.com/2021/12/21/bert_sequence_classifier_finbert_tone_en.html) | [bert_sequence_classifier_finbert_tone](https://nlp.johnsnowlabs.com/2021/12/21/bert_sequence_classifier_finbert_tone_en.html) | Sentiment Analysis | BertForSequenceClassification |
| en | [en.classify.imdb.longformer](https://nlp.johnsnowlabs.com/2021/12/16/longformer_base_sequence_classifier_imdb_en.html) | [longformer_base_sequence_classifier_imdb](https://nlp.johnsnowlabs.com/2021/12/16/longformer_base_sequence_classifier_imdb_en.html) | Text Classification | LongformerForSequenceClassification |
| en | [en.ner.time](https://nlp.johnsnowlabs.com/2021/12/28/roberta_token_classifier_timex_semeval_en.html) | [roberta_token_classifier_timex_semeval](https://nlp.johnsnowlabs.com/2021/12/28/roberta_token_classifier_timex_semeval_en.html) | Named Entity Recognition | RoBertaForTokenClassification |
| en | [en.ner.stocks_ticker](https://nlp.johnsnowlabs.com/2021/12/27/roberta_token_classifier_ticker_en.html) | [roberta_token_classifier_ticker](https://nlp.johnsnowlabs.com/2021/12/27/roberta_token_classifier_ticker_en.html) | Named Entity Recognition | RoBertaForTokenClassification |
| ru | [ru.classify.toxic](https://nlp.johnsnowlabs.com/2021/12/22/bert_sequence_classifier_toxicity_ru.html) | [bert_sequence_classifier_toxicity](https://nlp.johnsnowlabs.com/2021/12/22/bert_sequence_classifier_toxicity_ru.html) | Text Classification | BertForSequenceClassification |
| it | [it.classify.sentiment](https://nlp.johnsnowlabs.com/2021/12/21/bert_sequence_classifier_sentiment_it.html) | [bert_sequence_classifier_sentiment](https://nlp.johnsnowlabs.com/2021/12/21/bert_sequence_classifier_sentiment_it.html) | Sentiment Analysis | BertForSequenceClassification |
| es | [es.ner](https://nlp.johnsnowlabs.com/2020/02/03/wikiner_6B_100_es.html) | [wikiner_6B_100](https://nlp.johnsnowlabs.com/2020/02/03/wikiner_6B_100_es.html) | Named Entity Recognition | NerDLModel |
| is | [is.ner](https://nlp.johnsnowlabs.com/2021/12/06/roberta_token_classifier_icelandic_ner_is.html) | [roberta_token_classifier_icelandic_ner](https://nlp.johnsnowlabs.com/2021/12/06/roberta_token_classifier_icelandic_ner_is.html) | Named Entity Recognition | RoBertaForTokenClassification |
| id | [id.pos](https://nlp.johnsnowlabs.com/2021/12/27/roberta_token_classifier_pos_tagger_id.html) | [roberta_token_classifier_pos_tagger](https://nlp.johnsnowlabs.com/2021/12/27/roberta_token_classifier_pos_tagger_id.html) | Part of Speech Tagging | RoBertaForTokenClassification |
| tr | [tr.ner](https://nlp.johnsnowlabs.com/2020/11/10/turkish_ner_840B_300_tr.html) | [turkish_ner_840B_300](https://nlp.johnsnowlabs.com/2020/11/10/turkish_ner_840B_300_tr.html) | Named Entity Recognition | NerDLModel |
| id | [id.ner](https://nlp.johnsnowlabs.com/2021/12/03/xlm_roberta_large_token_classification_ner_id.html) | [xlm_roberta_large_token_classification_ner](https://nlp.johnsnowlabs.com/2021/12/03/xlm_roberta_large_token_classification_ner_id.html) | Named Entity Recognition | XlmRoBertaForTokenClassification |
| de | [de.ner](https://nlp.johnsnowlabs.com/2021/12/25/xlm_roberta_large_token_classifier_conll03_de.html) | [xlm_roberta_large_token_classifier_conll03](https://nlp.johnsnowlabs.com/2021/12/25/xlm_roberta_large_token_classifier_conll03_de.html) | Named Entity Recognition | XlmRoBertaForTokenClassification |
| hi | [hi.ner](https://nlp.johnsnowlabs.com/2021/12/27/bert_token_classifier_hi_en_ner_hi.html) | [bert_token_classifier_hi_en_ner](https://nlp.johnsnowlabs.com/2021/12/27/bert_token_classifier_hi_en_ner_hi.html) | Named Entity Recognition | BertForTokenClassification |
| nl | [nl.ner](https://nlp.johnsnowlabs.com/2020/05/10/wikiner_6B_100_nl.html) | [wikiner_6B_100](https://nlp.johnsnowlabs.com/2020/05/10/wikiner_6B_100_nl.html) | Named Entity Recognition | NerDLModel |
| zh | [zh.ner](https://nlp.johnsnowlabs.com/2021/12/07/bert_token_classifier_chinese_ner_zh.html) | [bert_token_classifier_chinese_ner](https://nlp.johnsnowlabs.com/2021/12/07/bert_token_classifier_chinese_ner_zh.html) | Named Entity Recognition | BertForTokenClassification |
| fr | [fr.classify.xlm_roberta.allocine](https://nlp.johnsnowlabs.com/2021/12/23/xlm_roberta_base_sequence_classifier_allocine_fr.html) | [xlm_roberta_base_sequence_classifier_allocine](https://nlp.johnsnowlabs.com/2021/12/23/xlm_roberta_base_sequence_classifier_allocine_fr.html) | Text Classification | XlmRoBertaForSequenceClassification |
| ur | [ur.classify.fakenews](https://nlp.johnsnowlabs.com/2021/12/29/classifierdl_urduvec_fakenews_ur.html) | [classifierdl_urduvec_fakenews](https://nlp.johnsnowlabs.com/2021/12/29/classifierdl_urduvec_fakenews_ur.html) | Text Classification | ClassifierDLModel |
| ur | [ur.classify.news](https://nlp.johnsnowlabs.com/2021/12/10/classifierdl_bert_news_ur.html) | [classifierdl_bert_news](https://nlp.johnsnowlabs.com/2021/12/10/classifierdl_bert_news_ur.html) | Text Classification | ClassifierDLModel |
| fi | [fi.embed_sentence.bert.uncased](https://nlp.johnsnowlabs.com/2022/01/03/bert_base_finnish_uncased_fi.html) | [bert_base_finnish_uncased](https://nlp.johnsnowlabs.com/2022/01/03/bert_base_finnish_uncased_fi.html) | Embeddings | BertSentenceEmbeddings |
| fi | [fi.embed_sentence.bert](https://nlp.johnsnowlabs.com/2022/01/03/bert_base_finnish_uncased_fi.html) | [bert_base_finnish_uncased](https://nlp.johnsnowlabs.com/2022/01/03/bert_base_finnish_uncased_fi.html) | Embeddings | BertSentenceEmbeddings |
| fi | [fi.embed_sentence.bert.cased](https://nlp.johnsnowlabs.com/2022/01/03/bert_base_finnish_cased_fi.html) | [bert_base_finnish_cased](https://nlp.johnsnowlabs.com/2022/01/03/bert_base_finnish_cased_fi.html) | Embeddings | BertSentenceEmbeddings |
| te | [te.embed.distilbert](https://nlp.johnsnowlabs.com/2021/12/14/distilbert_uncased_te.html) | [distilbert_uncased](https://nlp.johnsnowlabs.com/2021/12/14/distilbert_uncased_te.html) | Embeddings | DistilBertEmbeddings |
| sw | [sw.embed.xlm_roberta](https://nlp.johnsnowlabs.com/2021/10/16/xlm_roberta_base_finetuned_swahili_sw.html) | [xlm_roberta_base_finetuned_swahili](https://nlp.johnsnowlabs.com/2021/10/16/xlm_roberta_base_finetuned_swahili_sw.html) | Embeddings | XlmRoBertaEmbeddings |
New Healthcare Models
Integration for the 28 new models from the amazing [Spark NLP for healthcare 3.4.0 release](https://nlp.johnsnowlabs.com/docs/en/licensed_release_notes#340)
| Language | NLU Reference | Spark NLP Reference | Task | Annotator Class |
|:-----------|:------------------------------------------------------------------------------------------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------------------------------------------|:-------------------------|:----------------------------|
| en | [en.med_ner.chemprot.bert](https://nlp.johnsnowlabs.com/2021/10/19/bert_token_classifier_ner_chemprot_en.html) | [bert_token_classifier_ner_chemprot](https://nlp.johnsnowlabs.com/2021/10/19/bert_token_classifier_ner_chemprot_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.med_ner.chemprot.bert](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_chemprot_en.html) | [bert_token_classifier_ner_chemprot](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_chemprot_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_bacteria](https://nlp.johnsnowlabs.com/2021/09/30/bert_token_classifier_ner_bacteria_en.html) | [bert_token_classifier_ner_bacteria](https://nlp.johnsnowlabs.com/2021/09/30/bert_token_classifier_ner_bacteria_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_bacteria](https://nlp.johnsnowlabs.com/2022/01/07/bert_token_classifier_ner_bacteria_en.html) | [bert_token_classifier_ner_bacteria](https://nlp.johnsnowlabs.com/2022/01/07/bert_token_classifier_ner_bacteria_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_anatomy](https://nlp.johnsnowlabs.com/2021/09/30/bert_token_classifier_ner_anatomy_en.html) | [bert_token_classifier_ner_anatomy](https://nlp.johnsnowlabs.com/2021/09/30/bert_token_classifier_ner_anatomy_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_anatomy](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_anatomy_en.html) | [bert_token_classifier_ner_anatomy](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_anatomy_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_drugs](https://nlp.johnsnowlabs.com/2021/09/20/bert_token_classifier_ner_drugs_en.html) | [bert_token_classifier_ner_drugs](https://nlp.johnsnowlabs.com/2021/09/20/bert_token_classifier_ner_drugs_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_drugs](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_drugs_en.html) | [bert_token_classifier_ner_drugs](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_drugs_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_jsl_slim](https://nlp.johnsnowlabs.com/2021/09/24/bert_token_classifier_ner_jsl_slim_en.html) | [bert_token_classifier_ner_jsl_slim](https://nlp.johnsnowlabs.com/2021/09/24/bert_token_classifier_ner_jsl_slim_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_jsl_slim](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_jsl_slim_en.html) | [bert_token_classifier_ner_jsl_slim](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_jsl_slim_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_ade](https://nlp.johnsnowlabs.com/2021/09/30/bert_token_classifier_ner_ade_en.html) | [bert_token_classifier_ner_ade](https://nlp.johnsnowlabs.com/2021/09/30/bert_token_classifier_ner_ade_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_ade](https://nlp.johnsnowlabs.com/2022/01/04/bert_token_classifier_ner_ade_en.html) | [bert_token_classifier_ner_ade](https://nlp.johnsnowlabs.com/2022/01/04/bert_token_classifier_ner_ade_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_deid](https://nlp.johnsnowlabs.com/2021/09/13/bert_token_classifier_ner_deid_en.html) | [bert_token_classifier_ner_deid](https://nlp.johnsnowlabs.com/2021/09/13/bert_token_classifier_ner_deid_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_deid](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_deid_en.html) | [bert_token_classifier_ner_deid](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_deid_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_clinical](https://nlp.johnsnowlabs.com/2021/08/28/bert_token_classifier_ner_clinical_en.html) | [bert_token_classifier_ner_clinical](https://nlp.johnsnowlabs.com/2021/08/28/bert_token_classifier_ner_clinical_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_clinical](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_clinical_en.html) | [bert_token_classifier_ner_clinical](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_clinical_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_jsl](https://nlp.johnsnowlabs.com/2021/08/28/bert_token_classifier_ner_jsl_en.html) | [bert_token_classifier_ner_jsl](https://nlp.johnsnowlabs.com/2021/08/28/bert_token_classifier_ner_jsl_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_jsl](https://nlp.johnsnowlabs.com/2021/09/16/bert_token_classifier_ner_jsl_en.html) | [bert_token_classifier_ner_jsl](https://nlp.johnsnowlabs.com/2021/09/16/bert_token_classifier_ner_jsl_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_jsl](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_jsl_en.html) | [bert_token_classifier_ner_jsl](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_jsl_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_chemical](https://nlp.johnsnowlabs.com/2021/10/19/bert_token_classifier_ner_chemicals_en.html) | [bert_token_classifier_ner_chemicals](https://nlp.johnsnowlabs.com/2021/10/19/bert_token_classifier_ner_chemicals_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.ner_chemical](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_chemicals_en.html) | [bert_token_classifier_ner_chemicals](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_chemicals_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.bionlp](https://nlp.johnsnowlabs.com/2021/11/03/bert_token_classifier_ner_bionlp_en.html) | [bert_token_classifier_ner_bionlp](https://nlp.johnsnowlabs.com/2021/11/03/bert_token_classifier_ner_bionlp_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.bionlp](https://nlp.johnsnowlabs.com/2022/01/03/bert_token_classifier_ner_bionlp_en.html) | [bert_token_classifier_ner_bionlp](https://nlp.johnsnowlabs.com/2022/01/03/bert_token_classifier_ner_bionlp_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.cellular](https://nlp.johnsnowlabs.com/2021/11/03/bert_token_classifier_ner_cellular_en.html) | [bert_token_classifier_ner_cellular](https://nlp.johnsnowlabs.com/2021/11/03/bert_token_classifier_ner_cellular_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.classify.token_bert.cellular](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_cellular_en.html) | [bert_token_classifier_ner_cellular](https://nlp.johnsnowlabs.com/2022/01/06/bert_token_classifier_ner_cellular_en.html) | Named Entity Recognition | MedicalBertForTokenClassifier |
| en | [en.med_ner.abbreviation_clinical](https://nlp.johnsnowlabs.com/2021/12/30/ner_abbreviation_clinical_en.html) | [ner_abbreviation_clinical](https://nlp.johnsnowlabs.com/2021/12/30/ner_abbreviation_clinical_en.html) | Named Entity Recognition | MedicalNerModel |
| en | [en.med_ner.drugprot_clinical](https://nlp.johnsnowlabs.com/2021/12/20/ner_drugprot_clinical_en.html) | [ner_drugprot_clinical](https://nlp.johnsnowlabs.com/2021/12/20/ner_drugprot_clinical_en.html) | Named Entity Recognition | MedicalNerModel |
| en | [en.ner.drug_development_trials](https://nlp.johnsnowlabs.com/2021/12/17/bert_token_classifier_drug_development_trials_en.html) | [bert_token_classifier_drug_development_trials](https://nlp.johnsnowlabs.com/2021/12/17/bert_token_classifier_drug_development_trials_en.html) | Named Entity Recognition | BertForTokenClassification |
| en | [en.med_ner.chemprot](https://nlp.johnsnowlabs.com/2021/04/01/ner_chemprot_biobert_en.html) | [ner_chemprot_biobert](https://nlp.johnsnowlabs.com/2021/04/01/ner_chemprot_biobert_en.html) | Named Entity Recognition | MedicalNerModel |
| en | [en.relation.drugprot](https://nlp.johnsnowlabs.com/2022/01/05/redl_drugprot_biobert_en.html) | [redl_drugprot_biobert](https://nlp.johnsnowlabs.com/2022/01/05/redl_drugprot_biobert_en.html) | Relation Extraction | RelationExtractionDLModel |
| en | [en.relation.drugprot.clinical](https://nlp.johnsnowlabs.com/2022/01/05/re_drugprot_clinical_en.html) | [re_drugprot_clinical](https://nlp.johnsnowlabs.com/2022/01/05/re_drugprot_clinical_en.html) | Relation Extraction | RelationExtractionModel |
| en | [en.resolve.clinical_abbreviation_acronym](https://nlp.johnsnowlabs.com/2021/12/11/sbiobertresolve_clinical_abbreviation_acronym_en.html) | [sbiobertresolve_clinical_abbreviation_acronym](https://nlp.johnsnowlabs.com/2021/12/11/sbiobertresolve_clinical_abbreviation_acronym_en.html) | Entity Resolution | SentenceEntityResolverModel |
| en | [en.resolve.clinical_abbreviation_acronym](https://nlp.johnsnowlabs.com/2022/01/03/sbiobertresolve_clinical_abbreviation_acronym_en.html) | [sbiobertresolve_clinical_abbreviation_acronym](https://nlp.johnsnowlabs.com/2022/01/03/sbiobertresolve_clinical_abbreviation_acronym_en.html) | Entity Resolution | SentenceEntityResolverModel |
| en | [en.resolve.umls_drug_substance](https://nlp.johnsnowlabs.com/2021/12/06/sbiobertresolve_umls_drug_substance_en.html) | [sbiobertresolve_umls_drug_substance](https://nlp.johnsnowlabs.com/2021/12/06/sbiobertresolve_umls_drug_substance_en.html) | Entity Resolution | SentenceEntityResolverModel |
| en | [en.resolve.loinc_cased](https://nlp.johnsnowlabs.com/2021/12/24/sbiobertresolve_loinc_cased_en.html) | [sbiobertresolve_loinc_cased](https://nlp.johnsnowlabs.com/2021/12/24/sbiobertresolve_loinc_cased_en.html) | Entity Resolution | SentenceEntityResolverModel |
| en | [en.resolve.loinc_uncased](https://nlp.johnsnowlabs.com/2021/12/31/sbluebertresolve_loinc_uncased_en.html) | [sbluebertresolve_loinc_uncased](https://nlp.johnsnowlabs.com/2021/12/31/sbluebertresolve_loinc_uncased_en.html) | Entity Resolution | SentenceEntityResolverModel |
| en | [en.embed_sentence.biobert.rxnorm](https://nlp.johnsnowlabs.com/2021/12/23/sbiobert_jsl_rxnorm_cased_en.html) | [sbiobert_jsl_rxnorm_cased](https://nlp.johnsnowlabs.com/2021/12/23/sbiobert_jsl_rxnorm_cased_en.html) | Embeddings | BertSentenceEmbeddings |
| en | [en.embed_sentence.bert_uncased.rxnorm](https://nlp.johnsnowlabs.com/2021/12/23/sbert_jsl_medium_rxnorm_uncased_en.html) | [sbert_jsl_medium_rxnorm_uncased](https://nlp.johnsnowlabs.com/2021/12/23/sbert_jsl_medium_rxnorm_uncased_en.html) | Embeddings | BertSentenceEmbeddings |
| en | [en.embed_sentence.bert_uncased.rxnorm](https://nlp.johnsnowlabs.com/2022/01/03/sbert_jsl_medium_rxnorm_uncased_en.html) | [sbert_jsl_medium_rxnorm_uncased](https://nlp.johnsnowlabs.com/2022/01/03/sbert_jsl_medium_rxnorm_uncased_en.html) | Embeddings | BertSentenceEmbeddings |
| en | [en.resolve.snomed_drug](https://nlp.johnsnowlabs.com/2022/01/01/sbiobertresolve_snomed_drug_en.html) | [sbiobertresolve_snomed_drug](https://nlp.johnsnowlabs.com/2022/01/01/sbiobertresolve_snomed_drug_en.html) | Entity Resolution | SentenceEntityResolverModel |
| de | [de.med_ner.deid_subentity](https://nlp.johnsnowlabs.com/2022/01/06/ner_deid_subentity_de.html) | [ner_deid_subentity](https://nlp.johnsnowlabs.com/2022/01/06/ner_deid_subentity_de.html) | Named Entity Recognition | MedicalNerModel |
| de | [de.med_ner.deid_generic](https://nlp.johnsnowlabs.com/2022/01/06/ner_deid_generic_de.html) | [ner_deid_generic](https://nlp.johnsnowlabs.com/2022/01/06/ner_deid_generic_de.html) | Named Entity Recognition | MedicalNerModel |
| de | [de.embed.w2v](https://nlp.johnsnowlabs.com/2020/09/06/w2v_cc_300d_de.html) | [w2v_cc_300d](https://nlp.johnsnowlabs.com/2020/09/06/w2v_cc_300d_de.html) | Embeddings | WordEmbeddingsModel |
Additional NLU resources
* [NLU OCR tutorial notebook](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/ocr/ocr_for_img_pdf_docx_files.ipynb)
* [140+ NLU Tutorials](https://nlu.johnsnowlabs.com/docs/en/notebooks)
* [NLU in Action](https://nlp.johnsnowlabs.com/demo)
* [Streamlit visualizations docs](https://nlu.johnsnowlabs.com/docs/en/streamlit_viz_examples)
* The complete list of all 4000+ models & pipelines in 200+ languages is available on [Models Hub](https://nlp.johnsnowlabs.com/models).
* [Spark NLP publications](https://medium.com/spark-nlp)
* [NLU documentation](https://nlu.johnsnowlabs.com/docs/en/install)
* [Discussions](https://github.com/JohnSnowLabs/spark-nlp/discussions) Engage with other community members, share ideas, and show off how you use Spark NLP and NLU!
1 line Install NLU on Google Colab
!wget https://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash
1 line Install NLU on Kaggle
!wget https://setup.johnsnowlabs.com/nlu/kaggle.sh -O - | bash
Install via PIP
! pip install nlu pyspark streamlit==0.80.0