Streusle

Latest version: v4.5


Page 1 of 2

4.5

4.4

* Update govobj.py to recognize a different style of annotation for preposition stranding.
* Update UD to v2.6.
* Link from README to [a new paper](https://arxiv.org/abs/2005.12889) on converting STREUSLE annotations to UCCA (Universal Conceptual Cognitive Annotation), which uses this version of the data in experiments.

4.3

* Updated preposition/possessive annotations to [SNACS v2.5 guidelines](https://arxiv.org/abs/1704.02134v6), which include changes in the set of labels.
* Added a sentence that had been omitted from a document in the training set.
* Updated UD parses to the latest dev version (post-v2.5). This improves lemmas for misspelled words and adds paragraph boundaries.
* Link from README to new Pepper converter module.
* Link from README to online search tool using ANNIS.

4.2

Annotations

* Manually corrected all tokens with the placeholder lexcat symbol `!!` (introduced in v4.0) to have a real lexcat and, if appropriate, a supersense (issue 15).
* A number of revisions to SNACS (preposition/possessive supersense) annotations coordinated with updated guidelines (specifically SNACS v2.4, <https://arxiv.org/abs/1704.02134v5>; this incorporates updates for SNACS v2.3 as well).
* Minor corrections in the data and validation improvements.
* Updated UD parses to the latest dev version (post-v2.5). Among other things, this improves lemmas for words with nonstandard spellings.

Utilities and data formats

* Added streuseval.py, a unified evaluation script for MWEs + supersenses (issue 31).
* Added streusvis.py, for viewing sentences with their MWE and supersense annotations.
* Added supdate.py (sentence-wise) and tupdate.py (token-wise) for editing lexical semantic annotations (issue 54).
* Added format conversion scripts conllulex2json.py, conllulex2UDlextag.py, and UDlextag2json.py.
* Normalized the way MWEs within a sentence are numbered in markup (normalize_mwe_numbering.py, issue 42).
* Several improvements to govobj.py (most notably issue 35, affecting 184 tokens, and a small fix in 58db569 which affected 53 tokens).
* Subdirectories for splits (train/, dev/, test/) now include .json and .govobj.json files alongside the source .conllulex.
* Added release preparation scripts under releaseutil/.
* Added setup.py.
* Fixed a very small bug in tquery.py affecting the display of sentence-final matches, and made minor changes in functionality involving null values and negative constraints; token-level attributes of multiword expressions; and a new option to filter by sentence length.
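
As an illustration of what normalizing MWE numbering can mean, the sketch below renumbers the MWE groups of one sentence so that they count up from 1 in order of first appearance. It is not the code in normalize_mwe_numbering.py: the function name and the per-token list of group labels are assumptions made for this example only.

```python
def renumber_mwe_groups(group_ids):
    """Renumber MWE group labels as 1, 2, 3, ... in order of first appearance.

    group_ids holds one entry per token: None for a token outside any MWE,
    otherwise an arbitrary hashable group label. This per-token list is a
    hypothetical representation, not the .conllulex markup itself.
    """
    mapping = {}       # old label -> new consecutive number
    normalized = []
    for gid in group_ids:
        if gid is None:
            normalized.append(None)
            continue
        if gid not in mapping:
            mapping[gid] = len(mapping) + 1
        normalized.append(mapping[gid])
    return normalized


if __name__ == "__main__":
    # Groups labeled 3 and 1 become 1 and 2, because group 3 starts first.
    print(renumber_mwe_groups([None, 3, 3, None, 1, 3, 1]))
```

Numbering by first appearance makes two files with the same groupings compare as identical, regardless of the order in which the groups were originally annotated.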

4.1

- Added subtypes to verbal MWEs (871 tokens) per [PARSEME Shared Task 1.1 guidelines](http://parsemefr.lif.univ-mrs.fr/parseme-st-guidelines/1.1/?page=home); some MWE groupings revised in the process.
- Minor improvements to SNACS (preposition/possessive supersense) annotations coordinated with updated [guidelines](https://arxiv.org/abs/1704.02134v3).
- Implementation of SNACS (preposition/possessive supersense) target identification heuristics from Nathan Schneider, Jena D. Hwang, Vivek Srikumar, Jakob Prange, Austin Blodgett, Sarah R. Moeller, Aviram Stern, Adi Bitan, and Omri Abend. Comprehensive supersense disambiguation of English prepositions and possessives. _Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics_, Melbourne, Australia, July 15–20, 2018. <http://people.cs.georgetown.edu/nschneid/p/pssdisambig.pdf>
- New utility scripts for listing/filtering tokens (tquery.py) and converting to and from an Excel-compatible CSV format.

4.0

* Updated preposition supersenses to new annotation scheme (4398 tokens).
* Annotated possessives (1117 tokens) using preposition supersenses.
* Revised a considerable number of MWEs involving prepositions.
* Added lexical category for every single-word or strong multiword expression.
* New data format (.conllulex) integrates gold syntactic annotations from the Universal Dependencies project.
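
A minimal sketch of reading one token line from such a file, assuming that a .conllulex token line is tab-separated and begins with the ten standard CoNLL-U columns, followed by additional lexical-semantic columns. The function name is hypothetical, and the extra columns are returned unnamed because their layout is not described here.

```python
# The ten standard CoNLL-U columns, in order.
CONLLU_FIELDS = ("id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc")


def split_token_line(line):
    """Split a tab-separated token line into its UD part and the remainder.

    Returns (ud, extra): ud maps each CoNLL-U field name to its value, and
    extra lists any columns after the tenth. Treating those extra columns as
    the lexical-semantic annotations is an assumption of this sketch.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) < len(CONLLU_FIELDS):
        raise ValueError(
            f"expected at least {len(CONLLU_FIELDS)} columns, got {len(cols)}"
        )
    ud = dict(zip(CONLLU_FIELDS, cols[:len(CONLLU_FIELDS)]))
    extra = cols[len(CONLLU_FIELDS):]
    return ud, extra


if __name__ == "__main__":
    # Made-up token line: ten UD columns plus two placeholder extra columns.
    demo = "\t".join(["1", "Great", "great", "ADJ", "JJ", "Degree=Pos",
                      "2", "amod", "_", "_", "extra1", "extra2"])
    ud, extra = split_token_line(demo)
    print(ud["form"], ud["upos"], extra)
```

Comment lines (starting with `#`) and blank sentence separators would need to be filtered out before calling this function.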
