Nlu

Latest version: v5.4.0



Powered by [MPNet](https://sparknlp.org/docs/en/transformers#mpnetforsequenceclassification)

|Language|nlp.load() reference|Spark NLP Model reference|
|---|---|---|
|en|en.classify.mpnet.ukr_message|[mpnet_sequence_classifier_ukr_message](https://sparknlp.org/2024/01/10/mpnet_sequence_classifier_ukr_message_en.html)|

---
Pipeline Tracer

[Tutorial Notebook](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/healthcare/pipeline_parser/Parser.ipynb)

The PipelineTracer is now accessible on NLU pipelines. It is a versatile class that traces and analyzes the stages of a pipeline, offering in-depth insight into entities, assertions, deidentification, classification, and relationships. It also helps create the parser dictionaries used to build a PipelineOutputParser. Key functions include printing the pipeline schema, creating parser dictionaries, and retrieving the possible assertions, relations, and entities. It also provides direct access to parser dictionaries and available pipeline schemas.


Load a pipe
python
from johnsnowlabs import nlp  # provides nlp.load
pipe = nlp.load("en.explain_doc.clinical_oncology.pipeline")



Get all assertions predictable with pipe
python
pipe.getPossibleAssertions()
>>> ['Past', 'Family', 'Absent', 'Hypothetical', 'Possible', 'Present']


Get all entities predictable with pipe
python
pipe.getPossibleEntities()
>>> ['Cycle_Number','Direction','Histological_Type', .... ]


Get all relations predictable with pipe
python
pipe.getPossibleRelations()
>>> ['is_size_of', 'is_date_of', 'is_location_of', 'is_finding_of']


Predict with parsed output using a parser config
python
import pandas as pd

# Build the default parser dictionary from the loaded pipe, then override one key
column_maps = pipe.createParserDictionary()
column_maps.update({"document_identifier": "clinical_deidentification"})
# `data` is the input text (or collection of texts) to annotate
res = pipe.predict(data, parser_output=True, parser_config=column_maps)
pd.json_normalize(res['result'][0]["entities"])



![Pasted image 20240713173038](https://github.com/user-attachments/assets/62920dae-56b7-473b-8543-37e9acf63b56)
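The last step above passes the parsed entities to `pd.json_normalize`, which flattens nested records into a table with dotted column names. The sketch below imitates that flattening with the standard library only, so the resulting shape is easier to picture. The record keys (`chunk`, `label`, `span`) and the values are hypothetical; the real keys depend on the pipeline and on the parser config.

```python
def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested dicts and prefix their keys.
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical entity records, standing in for res['result'][0]["entities"].
entities = [
    {"chunk": "adenocarcinoma", "label": "Histological_Type", "span": {"begin": 10, "end": 23}},
    {"chunk": "left", "label": "Direction", "span": {"begin": 31, "end": 34}},
]

# One flat dict per entity, which maps directly onto one table row each.
rows = [flatten(entity) for entity in entities]
for row in rows:
    print(row)
```

Each flattened dict corresponds to one row, and keys such as `span.begin` correspond to the dotted column names that `pd.json_normalize` produces for nested dicts.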



**Powered By**: [PipelineTracer](https://nlp.johnsnowlabs.com/licensed/api/com/johnsnowlabs/util/tracer/PipelineTracer.html)



-----



📖Additional NLU resources

* [140+ NLU Tutorials](https://nlp.johnsnowlabs.com/docs/en/jsl/notebooks)
* [Streamlit visualizations docs](https://nlp.johnsnowlabs.com/docs/en/jsl/streamlit_viz_examples)
* The complete list of all 20000+ models & pipelines in 300+ languages is available on [Models Hub](https://nlp.johnsnowlabs.com/models)
* [Spark NLP publications](https://medium.com/spark-nlp)
* [NLU documentation](https://nlp.johnsnowlabs.com/docs/en/jsl/install)
* [Discussions](https://github.com/JohnSnowLabs/spark-nlp/discussions) Engage with other community members, share ideas, and show off how you use Spark NLP and NLU!


---

Installation
shell
pip install johnsnowlabs


Extract Tables from DOC/DOCX files as Pandas DataFrames

Sample DOCX:
![Sample DOCX](https://github.com/JohnSnowLabs/nlu/blob/4.0.0/docs/assets/images/ocr/nlu_ocr/tables/doc.png?raw=true)

python
nlu.load('doc2table').predict('/path/to/sample.docx')

**Output of DOCX Table OCR:**

| Screen Reader | Responses | Share |
|:----------------|------------:|:--------|
| JAWS | 853 | 49% |
| NVDA | 238 | 14% |
| Window-Eyes | 214 | 12% |
| System Access | 181 | 10% |
| VoiceOver | 159 | 9% |
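Because the extracted table arrives as a Pandas DataFrame, ordinary pandas operations apply to it. The sketch below rebuilds the sample output by hand and shows one possible clean-up step. It assumes the `Share` column is extracted as strings like `49%`, as displayed above; how the prediction result is wrapped (a single frame or a list of frames) is not stated here, so the frame is constructed directly instead of being taken from `predict`.

```python
import pandas as pd

# Rows copied from the sample output above; in practice this frame would come
# from the doc2table prediction instead of being typed in by hand.
table = pd.DataFrame(
    {
        "Screen Reader": ["JAWS", "NVDA", "Window-Eyes", "System Access", "VoiceOver"],
        "Responses": [853, 238, 214, 181, 159],
        "Share": ["49%", "14%", "12%", "10%", "9%"],
    }
)

# Turn percentage strings such as "49%" into fractions such as 0.49.
table["Share"] = table["Share"].str.rstrip("%").astype(float) / 100

# Simple aggregates over the cleaned table.
total_responses = int(table["Responses"].sum())
top_reader = table.loc[table["Responses"].idxmax(), "Screen Reader"]

print(total_responses, top_reader)
```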





Extract Tables from PPT files as Pandas DataFrames

Sample PPT with two tables:
![Sample PPT with two tables](https://github.com/JohnSnowLabs/nlu/blob/4.0.0/docs/assets/images/ocr/nlu_ocr/tables/ppt.png?raw=true)

python
nlu.load('ppt2table').predict('/path/to/sample.pptx')


**Output of PPT Table OCR:**


| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---------------:|--------------:|---------------:|--------------:|:----------|
