Stanza

Latest version: v1.10.1

0.2.0

This release features major improvements in the memory efficiency and speed of the neural network pipeline in stanfordnlp, along with various bugfixes. These changes include:

- The downloadable pretrained neural network models are now substantially smaller (due to the use of smaller pretrained vocabularies) with comparable performance. Notably, the default English model is now ~9x smaller, German ~11x, French ~6x, and Chinese ~4x. As a result, the memory efficiency of the neural pipelines for most languages is substantially improved.

- Substantial speedup of the neural lemmatizer, achieved by reducing the amount of neural sequence-to-sequence computation.

- The neural network pipeline can now accept a Python list of strings representing pre-tokenized text (see the sketch at the end of this list). (https://github.com/stanfordnlp/stanfordnlp/issues/58)

- A requirements-checking framework has been added to the neural pipeline, ensuring that the proper processors are specified for a given pipeline configuration. The pipeline now raises an exception when a requirement is not satisfied. (https://github.com/stanfordnlp/stanfordnlp/issues/42)

- Bugfix for the alignment between tokens and words after the multi-word expansion processor. (https://github.com/stanfordnlp/stanfordnlp/issues/71)

- More options have been added for customizing the Stanford CoreNLP server at start time, including specifying properties for the default pipeline and setting server options such as username/password. For more details, please check out the [client documentation page](https://stanfordnlp.github.io/stanfordnlp/corenlp_client.html#customizing-properties-for-server-start-and-requests); a combined sketch of these options also appears at the end of this list.

- A `CoreNLPClient` instance can now be created with CoreNLP default language properties, for example:
```python
client = CoreNLPClient(properties='chinese')
```


- Alternatively, a properties file can now be used during the creation of a `CoreNLPClient`:
```python
client = CoreNLPClient(properties='/path/to/corenlp.props')
```


- All specified CoreNLP annotators are now preloaded by default when a `CoreNLPClient` instance is created. (https://github.com/stanfordnlp/stanfordnlp/issues/56)
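
Putting the pipeline changes above together, here is a minimal sketch of passing pre-tokenized input to the neural pipeline. This is not taken from the release itself: the sentences are made up, it assumes the English models have already been downloaded with `stanfordnlp.download('en')`, and the exact input shape (a flat list of tokens vs. a list of tokenized sentences) should be checked against the linked issue.

```python
import stanfordnlp

# Assumes stanfordnlp.download('en') has already been run.
# tokenize_pretokenized=True skips the neural tokenizer entirely.
nlp = stanfordnlp.Pipeline(lang='en', tokenize_pretokenized=True)

# Hypothetical pre-tokenized input: one inner list per sentence.
doc = nlp([['This', 'is', 'a', 'test', 'sentence', '.'],
           ['It', 'was', 'tokenized', 'beforehand', '.']])

for sentence in doc.sentences:
    print([word.text for word in sentence.words])
```

Similarly, a sketch of the new server start-time options for `CoreNLPClient`, combining default-pipeline properties with server credentials. The annotator list, property key, and credentials below are illustrative assumptions; the authoritative set of options is on the linked client documentation page.

```python
from stanfordnlp.server import CoreNLPClient

# Illustrative configuration only; see the client documentation for
# the full set of supported server options.
with CoreNLPClient(
        annotators=['tokenize', 'ssplit', 'pos', 'lemma'],
        properties={'tokenize.language': 'en'},   # assumed property key
        username='myuser',                        # hypothetical credentials
        password='mypassword') as client:
    ann = client.annotate('Chris wrote a simple sentence.')
```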

0.1.2

This is a maintenance release of stanfordnlp. It features:

* The tokenizer can now treat the incoming document as pretokenized, with space-separated words in newline-separated sentences. Set `tokenize_pretokenized` to `True` when building the pipeline to skip the neural tokenizer and run all downstream components on your own tokenized text (see the sketch after this list). (24, 34)
* Speedup of the POS/Feats tagger at evaluation time (up to 2 orders of magnitude). (18)
* Various minor fixes and documentation improvements.
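
As referenced in the first item above, a minimal sketch of the pretokenized mode. The sentences are made-up examples, and it assumes the English models have already been downloaded with `stanfordnlp.download('en')`.

```python
import stanfordnlp

# tokenize_pretokenized=True: words are assumed to be separated by spaces
# and sentences by newlines, so the neural tokenizer is skipped.
nlp = stanfordnlp.Pipeline(lang='en', tokenize_pretokenized=True)

doc = nlp('This is already tokenized .\nSo is this sentence .')

for sentence in doc.sentences:
    print([word.text for word in sentence.words])
```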

We would also like to thank the following community members for their contributions:
Code improvements: lwolfsonkin
Documentation improvements: 0xflotus
And thanks to everyone who raised issues and helped improve stanfordnlp!

0.1.0

The initial release of StanfordNLP. StanfordNLP combines the software package used by the Stanford team in the CoNLL 2018 Shared Task on Universal Dependency Parsing with the group’s official Python interface to the [Stanford CoreNLP software](https://stanfordnlp.github.io/CoreNLP). This package is built with highly accurate neural network components that enable efficient training and evaluation with your own annotated data. The modules are built on top of [PyTorch](https://pytorch.org/) (v1.0.0).

StanfordNLP features:

- Native Python implementation requiring minimal effort to set up;
- Full neural network pipeline for robust text analytics, including tokenization, multi-word token (MWT) expansion, lemmatization, part-of-speech (POS) and morphological feature tagging, and dependency parsing;
- Pretrained neural models supporting 53 (human) languages featured in 73 treebanks;
- A stable, officially maintained Python interface to CoreNLP.
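
As an illustration of the full pipeline, here is a minimal usage sketch: download the default English models, build the default pipeline, and print the dependency parse of the first sentence. The example sentence is arbitrary.

```python
import stanfordnlp

stanfordnlp.download('en')    # download the default English neural models
nlp = stanfordnlp.Pipeline()  # build the default English neural pipeline
doc = nlp("Barack Obama was born in Hawaii.")
doc.sentences[0].print_dependencies()
```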
