Scribe-data

Latest version: v3.3.0

4.0.0

3.3.0

✨ Features

- The translation process has been updated to allow for translations from non-English languages ([#72](https://github.com/scribe-org/Scribe-Data/issues/72), [#73](https://github.com/scribe-org/Scribe-Data/issues/73), [#74](https://github.com/scribe-org/Scribe-Data/issues/74), [#75](https://github.com/scribe-org/Scribe-Data/issues/75), [#76](https://github.com/scribe-org/Scribe-Data/issues/76), [#77](https://github.com/scribe-org/Scribe-Data/issues/77), [#78](https://github.com/scribe-org/Scribe-Data/issues/78), [#79](https://github.com/scribe-org/Scribe-Data/issues/79)).

📝 Documentation

- The documentation has been given a new layout with the logo in the top left ([#90](https://github.com/scribe-org/Scribe-Data/issues/90)).
- The documentation now has links to the code at the top of each page ([#91](https://github.com/scribe-org/Scribe-Data/issues/91)).

🐞 Bug Fixes

- Annotation bugs such as repeated or empty values were removed.
- Perfect tenses of Portuguese verbs were fixed by finding the appropriate PID ([#68](https://github.com/scribe-org/Scribe-Data/issues/68)).
  - Note that the most common past perfect property is not the standard one, so this will need to be fixed.

♻️ Code Refactoring

- [pre-commit](https://pre-commit.com/) hooks have been added to the repo to improve the development experience ([#137](https://github.com/scribe-org/Scribe-Data/issues/137)).
- Code formatting was shifted from [black](https://github.com/psf/black) to [Ruff](https://github.com/astral-sh/ruff).
- A Ruff-based GitHub workflow was added to check the code formatting and lint the codebase on each pull request ([#109](https://github.com/scribe-org/Scribe-Data/issues/109)).
- The `_update_files` directory was renamed `update_files` as these files are now used in non-internal ways ([#57](https://github.com/scribe-org/Scribe-Data/issues/57)).
- A common function has been created to map Wikidata ids to noun genders ([#69](https://github.com/scribe-org/Scribe-Data/issues/69)).
- The project is now installed locally for development and command line usage, so usages of `sys.path` have been removed from files ([#122](https://github.com/scribe-org/Scribe-Data/issues/122)).
- The directory structure has been dramatically streamlined and includes folders for future projects where language data could come from other sources like Wiktionary ([#139](https://github.com/scribe-org/Scribe-Data/issues/139)).
- Translation files are moved to their own directory.
- The `extract_transform` directory has been removed and all files within it have been moved one level up.
- The `languages` directory has been renamed `language_data_extraction`.
- All files within `wikidata/_resources` have been moved to the `resources` directory.
- The gender and case annotations for data formatting have now been commonly defined.
- All language directory `formatted_data` files have now been moved to the `scribe_data_json_export` directory, in preparation for outputs being directed to a directory outside of the package.
- Path computing has been refactored throughout the codebase, and unneeded functions for data transfers have been removed.
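
As an illustration of the common gender-mapping function mentioned above, here is a minimal sketch. The function name `qid_to_gender`, the short annotations `M`/`F`/`N`, and the specific QIDs are assumptions made for this example and are not taken from the Scribe-Data source; the QIDs in particular should be verified against Wikidata before use.

```python
# Hypothetical lookup table. The QIDs are assumed to be the Wikidata items
# for each grammatical gender; the annotation letters are illustrative only.
GENDER_QIDS = {
    "Q499327": "M",   # masculine (assumed QID)
    "Q1775415": "F",  # feminine (assumed QID)
    "Q1775461": "N",  # neuter (assumed QID)
}


def qid_to_gender(wikidata_gender: str) -> str:
    """Return a short gender annotation for a Wikidata entity URI or bare QID.

    Unknown values map to an empty string so callers can skip them.
    """
    # Query results often contain full entity URIs, so keep only the
    # final path segment before looking the id up.
    qid = wikidata_gender.rsplit("/", 1)[-1].strip()
    return GENDER_QIDS.get(qid, "")
```

Keeping the mapping in one shared function means each language's formatting script can call it instead of repeating the same id comparisons.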

3.2.2

- Minor fixes to documentation index and file docstrings to fix errors.
- Revert change to package path definition to hopefully register the resources directory.

3.2.1

♻️ Code Refactoring

- The docs and tests were grafted into the package using `MANIFEST.in`.
- Minor fixes to file and function docstrings and documentation files.
- `include_package_data=True` is used in `setup.py` to hopefully include all files in the package distribution.

3.2.0

✨ Features

- The data and process needed for an English keyboard have been added ([#39](https://github.com/scribe-org/Scribe-Data/issues/39)).
- The Wikidata queries for English have been updated to get all nouns and verbs.
- Formatting scripts have been written to prepare the queried data and load it into an SQLite database.
- The data update process has been cleaned up in preparation for future changes to Scribe-Data and to implement better practices.
- Language data was extracted into a JSON file for more succinct referencing ([#52](https://github.com/scribe-org/Scribe-Data/issues/52)).
- Language codes are now checked with the package [langcodes](https://github.com/rspeer/langcodes) for easier expansion.
- A process has been created to check and update words that can be translated for each Scribe language ([#44](https://github.com/scribe-org/Scribe-Data/issues/44)).
- The baseline data returned from Wikidata queries is now removed once a formatted data file is created.
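
The step of loading formatted data into an SQLite database, noted in the list above, could look roughly like the following. This is a sketch using only the Python standard library; the table name `nouns`, its columns, the helper `load_nouns`, and the sample data are all hypothetical and do not reflect the actual Scribe-Data schema.

```python
import json
import sqlite3


def load_nouns(conn: sqlite3.Connection, nouns: dict) -> int:
    """Insert formatted noun data into a (hypothetical) `nouns` table.

    `nouns` maps each noun to a dict with optional "plural" and "form" keys.
    Returns the number of rows written.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS nouns "
        "(noun TEXT PRIMARY KEY, plural TEXT, form TEXT)"
    )
    rows = [
        (noun, values.get("plural", ""), values.get("form", ""))
        for noun, values in nouns.items()
    ]
    # INSERT OR REPLACE lets the loader be re-run after a data update.
    conn.executemany(
        "INSERT OR REPLACE INTO nouns (noun, plural, form) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return len(rows)


# Illustrative formatted data, as it might be read from a JSON file.
formatted = json.loads(
    '{"book": {"plural": "books", "form": ""},'
    ' "house": {"plural": "houses", "form": ""}}'
)
conn = sqlite3.connect(":memory:")  # in-memory database for the example
inserted = load_nouns(conn, formatted)
```

In practice the connection would point at a database file rather than `:memory:`, and the JSON would be read from the formatted data file for the language in question.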

✅ Tests

- A full testing suite has been added to run on GitHub Actions ([#37](https://github.com/scribe-org/Scribe-Data/issues/37)).
- Unit tests have been added for Wikidata queries ([#48](https://github.com/scribe-org/Scribe-Data/issues/48)) and utility functions ([#50](https://github.com/scribe-org/Scribe-Data/issues/50)).

🐞 Bug Fixes

- TensorFlow was removed from the download wiki process to fix build problems on Macs.

♻️ Code Refactoring

- The Anaconda-based virtual environment was removed and documentation was updated to reflect this.
- Language data processes were moved into the `src/scribe_data/extract_transform/languages` directory to clean up the structure.
- Code formatting processes were defined with common structures based on language and word type variables defined at the top of files.

3.1.0

✨ Features

- The word "Scribe" is now added to language database nouns files if it's not already present ([#35](https://github.com/scribe-org/Scribe-Data/issues/35)).
- German contracted prepositions have been added to the German prepositions formatting process ([#34](https://github.com/scribe-org/Scribe-Data/issues/34)).
- Words that are upper case are now better included in the autocomplete lexicon with their lower case equivalents being removed.
- Words with apostrophes have been removed from the autocomplete lexicon.
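
The two autocomplete lexicon rules above can be sketched as a small filter. This is one possible reading of the changelog entries, not the project's actual implementation: the function name `build_lexicon` is hypothetical, and "upper case" is interpreted here as any word that is not entirely lower case (for example a capitalized noun).

```python
def build_lexicon(words: list[str]) -> list[str]:
    """Filter a word list for autocomplete use.

    - Words containing an apostrophe are dropped.
    - If a word exists in both a cased and an all-lower-case form,
      only the cased form is kept.
    Order of first appearance is preserved and duplicates are removed.
    """
    # Lower-cased versions of every word that contains upper-case letters.
    lower_of_cased = {w.lower() for w in words if w != w.lower()}

    lexicon = []
    seen = set()
    for word in words:
        if "'" in word or "\u2019" in word:
            continue  # skip words with straight or curly apostrophes
        if word == word.lower() and word in lower_of_cased:
            continue  # lower-case duplicate of a cased word
        if word not in seen:
            seen.add(word)
            lexicon.append(word)
    return lexicon
```

For example, given both `buch` and `Buch`, only `Buch` is kept, so the autocomplete suggestion carries the capitalization a German noun needs.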

♻️ Code Refactoring

- Database output column names are now zero indexed to better align with Python and other language standards.
