Colusa

Latest version: v0.12.0

0.12.0

New

* Add new command to crawl a URL. [Nguyen Huu Hoa]

Crawl a URL (website) to help generate a list of URLs, mostly story chapters.
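
As a rough illustration of what such a crawl step involves (this is not colusa's actual implementation), the sketch below collects the links found on a page and resolves them against the page URL. The names `LinkCollector` and `collect_links` are hypothetical, and the function takes an HTML string instead of fetching anything over the network.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values of <a> tags, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            absolute = urljoin(self.base_url, href)
            # Keep only the first occurrence so the chapter order is preserved.
            if absolute not in self.links:
                self.links.append(absolute)


def collect_links(html, base_url):
    """Return the de-duplicated absolute URLs linked from an HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    return parser.links
```

The resulting list could then be filtered (for example by a chapter URL pattern) before being pasted into a book configuration.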

Changes

* Add some debug capability. [Nguyen Huu Hoa]

* Add support for new websites. [Nguyen Huu Hoa]

Other

* Merge branch 'main' of github.com:huuhoa/colusa. [Nguyen Huu Hoa]

* Chg(plugins/truyenfull): clean ads content. [Nguyen Huu Hoa]

0.11.0

Changes

* Support for tangthuvien (72) [Huu Hoa NGUYEN]

Fix

* Gitchangelog ignore pattern. [Nguyen Huu Hoa]

0.10.0

Changes

* Update dev requirements. [Nguyen Huu Hoa]

* Improve code coverage (21) [Huu Hoa NGUYEN]

Mock two functions, `download_image` and `download_content`:

+ `download_content` returns the existing cached file, so that we don't have to redownload it every time we run the tests
+ `download_image` just returns True and does nothing, so that we don't have to download images
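
A minimal sketch of this mocking approach with `unittest.mock` follows. The `Fetcher` class, `fetch_article`, and `run_with_mocks` are hypothetical stand-ins; only the two function names `download_content` and `download_image` come from the entry above, and where they live in colusa is not assumed here.

```python
from unittest import mock


class Fetcher:
    """Hypothetical stand-in for the code that performs network downloads."""

    def download_content(self, url):
        raise RuntimeError("network access is not expected in tests")

    def download_image(self, url, path):
        raise RuntimeError("network access is not expected in tests")


def fetch_article(fetcher, url):
    """Toy code under test: downloads a page and one image."""
    html = fetcher.download_content(url)
    ok = fetcher.download_image(url + "/cover.png", "cover.png")
    return html, ok


def run_with_mocks(cached_html):
    """Run fetch_article with both download functions replaced by mocks."""
    fetcher = Fetcher()
    with mock.patch.object(
        Fetcher, "download_content", return_value=cached_html
    ) as dl_content, mock.patch.object(
        Fetcher, "download_image", return_value=True
    ) as dl_image:
        result = fetch_article(fetcher, "https://example.com/post")
    return result, dl_content.call_count, dl_image.call_count
```

In a real test suite, `cached_html` would be read from a file checked in next to the tests, which is what keeps the run offline and repeatable.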

Other

* Add: support for techtarget.com (32) [Huu Hoa NGUYEN]

* chg(asciidoc_visitor): support parsing data-src and data-srcset for img
* add(web): support for techtarget.com
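
To illustrate why lazy-loading attributes matter when converting `img` tags, here is a small sketch that picks an image URL from a tag's attributes. The function name `pick_image_url`, the attribute names `data-src` and `srcset`, and the priority order are assumptions for illustration, not the logic used in `asciidoc_visitor`.

```python
def pick_image_url(attrs):
    """Return the most likely real image URL from an <img> tag's attributes."""
    # Lazy-loading pages often keep the real URL in data-src and put a
    # placeholder in src, so data-src is checked first.
    for key in ("data-src", "src"):
        value = attrs.get(key, "").strip()
        if value:
            return value
    # Fall back to the first candidate of a srcset-style attribute:
    # "a.png 480w, b.png 800w" -> "a.png"
    # (assumes the URLs themselves contain no commas)
    for key in ("data-srcset", "srcset"):
        value = attrs.get(key, "")
        candidates = [c.split()[0] for c in value.split(",") if c.strip()]
        if candidates:
            return candidates[0]
    return None
```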

0.9.0

New

* Integration tests (20) [Huu Hoa NGUYEN]

* chg: add tox.ini for running tox
* chg(colusa): move colusa source to src folder

* Support parsing site xp123.com (18) [Huu Hoa NGUYEN]

Other

* Chore: update setup.cfg for version location. [Nguyen Huu Hoa]

* Prepare for next release. [Nguyen Huu Hoa]

* Refactor(etr): Rework on Extractor (19) [Huu Hoa NGUYEN]

* refactor(etr): move _parse_yoast from a plugin extractor to the base Extractor
* refactor(etr): rename methods for clarification

+ rename `internal_init` to `_find_main_content`
+ replace `get_author` with the field `author`, with `_parse_author` parsing its value
+ replace `get_published` with the field `published`, with `_parse_published` parsing its value
+ replace `get_title` with the field `title`, with `_parse_title` parsing its value
+ add `_parse_metadata` for parsing all related metadata from html

* refactor(etr): change signature of Extractor._find_main_content

+ `_find_main_content` now returns a bs.Tag instead of setting the field `main_content`. This makes the purpose of `_find_main_content` clearer: it only finds the main content and does not modify anything
+ `_parse_metadata` is executed after the main content has been found
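
The shape of this refactoring can be sketched as below. This is a simplified illustration, not the real class: it uses `xml.etree.ElementTree` from the standard library in place of BeautifulSoup, and the `parse` and `_meta` methods, the `article` lookup, and the `meta` tag names are assumptions.

```python
import xml.etree.ElementTree as ET


class Extractor:
    """Simplified sketch; the real class works on BeautifulSoup objects."""

    def __init__(self, document):
        self.document = document
        self.main_content = None
        self.title = None
        self.author = None
        self.published = None

    def parse(self):
        # The caller stores the result; _find_main_content has no side effects.
        self.main_content = self._find_main_content()
        # Metadata is parsed only after the main content is known.
        self._parse_metadata()

    def _find_main_content(self):
        """Only locate and return the main content node."""
        return self.document.find(".//article")

    def _parse_metadata(self):
        self.title = self._parse_title()
        self.author = self._parse_author()
        self.published = self._parse_published()

    def _meta(self, name):
        node = self.document.find(f".//meta[@name='{name}']")
        return node.get("content") if node is not None else None

    def _parse_title(self):
        return self.document.findtext(".//title")

    def _parse_author(self):
        return self._meta("author")

    def _parse_published(self):
        return self._meta("published")
```

A site-specific plugin would then override only the `_find_*` or `_parse_*` method it needs, leaving the ordering in `parse` untouched.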

0.8.0

New

* Support rendering additional book properties (16) [Huu Hoa NGUYEN]

In the book configuration file, add a new array `book_properties`
whose content is a list of strings. Those strings will be rendered as
book properties in the master file (index.asciidoc).

Example (JSON):

    "book_properties": [
        "ifdef::backend-pdf[]",
        ":front-cover-image: image:cover.pdf[]",
        ":notitle:",
        "endif::[]",
        "ifdef::backend-epub3[]",
        ":front-cover-image: image:cover.png[]",
        "endif::[]"
    ]


The example above instructs the asciidoctor processor to use:
+ cover.pdf as the front cover image when generating pdf
+ cover.png as the front cover image when generating epub3
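
How such properties might end up in the master file can be sketched as follows. The function name `render_master_header` and the header layout (title line followed directly by the properties) are assumptions for illustration; only the `book_properties` key comes from the description above.

```python
import json


def render_master_header(title, config):
    """Build the header lines of the master file, including book properties."""
    lines = [f"= {title}"]
    # Each string from book_properties is emitted verbatim, one per line.
    lines.extend(config.get("book_properties", []))
    return "\n".join(lines) + "\n"


# Usage: load the book configuration and render the header.
config = json.loads(
    '{"book_properties": ["ifdef::backend-pdf[]", ":notitle:", "endif::[]"]}'
)
header = render_master_header("My Book", config)
```

Emitting the strings verbatim keeps the configuration flexible: any attribute or conditional directive that asciidoctor understands can be passed through without the tool needing to know about it.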

Changes

* Render html table as native asciidoc table (17) [Huu Hoa NGUYEN]
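
For reference, a native asciidoc table is delimited by `|===` lines with one `|`-prefixed cell per column. The sketch below renders already-extracted rows in that syntax; the function name `rows_to_asciidoc_table` is hypothetical, and extracting the rows from the HTML `table` element is left out.

```python
def rows_to_asciidoc_table(rows, header=True):
    """Render a list of rows (lists of cell strings) as an AsciiDoc table."""
    lines = ['[options="header"]'] if header and rows else []
    lines.append("|===")
    for row in rows:
        # Escape literal pipes so they are not read as cell separators.
        cells = [cell.replace("|", "\\|") for cell in row]
        lines.append("|" + " |".join(cells))
    lines.append("|===")
    return "\n".join(lines)
```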

Other

* Prepare for bump version 0.8. [Nguyen Huu Hoa]

* Add(plugins): support for scrumcrazy.wordpress.com. [Nguyen Huu Hoa]

0.7.0

Changes

* Improve article parsing (15) [Huu Hoa NGUYEN]

* add: agilethought support to get article's author
* add: support website tech.trivago.com

* Metadata rendering (14) [Huu Hoa NGUYEN]

Render metadata in a cleaner way. The format should be:

`by **{author}** on {published_date} at {url | domain}`
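
A minimal sketch of rendering this format, assuming that `{url | domain}` means the domain part of the article URL. The function name `render_metadata` is hypothetical.

```python
from urllib.parse import urlparse


def render_metadata(author, published_date, url):
    """Format the metadata line shown under an article title."""
    # Assumption: "{url | domain}" is the URL reduced to its host name.
    domain = urlparse(url).netloc
    return f"by **{author}** on {published_date} at {domain}"
```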

Other

* Chore: refactor project's setup configurations. [Nguyen Huu Hoa]

* Setup codeql-analysis. [Huu Hoa NGUYEN]

* Setup dependabot. [Huu Hoa NGUYEN]

* Dev: update requirements_dev.txt. [Nguyen Huu Hoa]

* Docs: update CHANGELOG. [Nguyen Huu Hoa]
