S3-ocr

Latest version: v0.6.3



0.3

First non-alpha release.

- Breaking change: the order of arguments for `s3-ocr index <bucket> <database_file>` has been swapped, for consistency with other commands. [9](https://github.com/simonw/s3-ocr/issues/9)
- Breaking change: the `start` command no longer defaults to processing every `.pdf` file in the bucket. It now accepts a list of keys; pass the `--all` option to process every PDF file. [10](https://github.com/simonw/s3-ocr/issues/10)
- New `s3-ocr fetch <bucket> <path>` command for fetching the raw OCR JSON data for that file. [7](https://github.com/simonw/s3-ocr/issues/7)
- New `s3-ocr text <bucket> <path>` command for outputting just the extracted OCR text for a specified file. [8](https://github.com/simonw/s3-ocr/issues/8)
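The relationship between the raw JSON returned by `fetch` and the plain text returned by `text` can be illustrated with a minimal sketch. It assumes the JSON follows the Textract convention of a list of blocks, where each line of recognized text is a block with `"BlockType": "LINE"` and a `"Text"` field. The function name `extract_text` and the sample data are hypothetical, and s3-ocr's actual implementation may differ.

```python
import json


def extract_text(blocks):
    """Join the text of every LINE block, one output line per block.

    Hypothetical helper: assumes Textract-style blocks, not s3-ocr's own code.
    """
    return "\n".join(
        block["Text"]
        for block in blocks
        if block.get("BlockType") == "LINE" and "Text" in block
    )


# Hand-written sample in the shape of Textract blocks (not real OCR output).
sample = json.loads("""
[
  {"BlockType": "PAGE", "Id": "p1"},
  {"BlockType": "LINE", "Id": "l1", "Text": "Hello world"},
  {"BlockType": "WORD", "Id": "w1", "Text": "Hello"},
  {"BlockType": "WORD", "Id": "w2", "Text": "world"},
  {"BlockType": "LINE", "Id": "l2", "Text": "Second line"}
]
""")

print(extract_text(sample))
```

Filtering on `LINE` blocks avoids repeating each word, since in this format the same text also appears in the individual `WORD` blocks.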

0.2a0

- New `s3-ocr index database.db name-of-bucket` command for creating a SQLite database containing the OCR results that have been written to the bucket. [2](https://github.com/simonw/s3-ocr/issues/2)

0.1a0

- `s3-ocr start <bucket>` command for triggering OCR runs using [Textract](https://aws.amazon.com/textract/) for every PDF file in a bucket. [1](https://github.com/simonw/s3-ocr/issues/1)
- `s3-ocr status <bucket>` command for checking on the status of the ongoing OCR tasks.
