In one month's time we have added a lot to the sadedegel library.
News
* We have [doruktiktiklar](https://github.com/doruktiktiklar) as our first code contributor from outside the Global Maksimum AI team.
New Capabilities
* **ADD**: Vocabulary and Token concepts added to the library
* `Token`: singleton per word (case sensitive) to store unique token features (lower form, shape, document frequency, etc.)
* New `sadedegel-build-vocabulary` to manage **sadedegel** vocabularies.
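
To illustrate what "singleton per word" means, here is a minimal sketch of the pattern. It is not sadedegel's implementation: the class body, the attribute names (`lower_`, `shape`, `df`) and the `word_shape` helper are assumptions made for this example only.

```python
def word_shape(word):
    """Hypothetical helper: map each character to a coarse class."""
    out = []
    for ch in word:
        if ch.isdigit():
            out.append("d")
        elif ch.isupper():
            out.append("X")
        elif ch.islower():
            out.append("x")
        else:
            out.append(ch)  # keep punctuation and other symbols as they are
    return "".join(out)


class Token:
    """Toy per-word singleton (illustrative only, not sadedegel's Token)."""

    _cache = {}  # exact (case-sensitive) word -> the single Token for it

    def __new__(cls, word):
        token = cls._cache.get(word)
        if token is None:
            token = super().__new__(cls)
            token.word = word
            # Assumption: plain str.lower(), which is not Turkish-aware (I/İ).
            token.lower_ = word.lower()
            token.shape = word_shape(word)
            token.df = 0  # document frequency, to be filled from a vocabulary
            cls._cache[word] = token
        return token


a = Token("Ankara")
b = Token("Ankara")
c = Token("ankara")
print(a is b)   # same object for the same word
print(a is c)   # different object: lookup is case sensitive
print(a.shape, c.shape)
```

Because every occurrence of a word resolves to the same object, per-word features are computed and stored once rather than once per occurrence.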
New Summarizers
* **ADD**: TextRank Summarizer
The TextRank summarizer applies Google's PageRank algorithm to the sentences of a document, using the cosine distance/similarity between BERT embeddings as the distance/similarity measure (the only one available as of this release, with more to come).
* **ADD**: TFIDF Summarizer
The TFIDF summarizer uses the sum of the elements of a sentence's tfidf vector as the relevance score of that sentence in a document.
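
The TextRank idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not the library's code: toy 2-d vectors stand in for BERT sentence embeddings, negative similarities are clamped to zero, and the `damping` and `iterations` values are common defaults chosen for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def textrank_scores(embeddings, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration over a sentence similarity graph."""
    n = len(embeddings)
    if n == 0:
        return []
    # Edge weight between sentences i and j: cosine similarity, no self loops.
    # Assumption: negative similarities are clamped to 0 to keep weights valid.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                weights[i][j] = max(cosine_similarity(embeddings[i], embeddings[j]), 0.0)
    out_sum = [sum(row) for row in weights]

    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            rank = 0.0
            for j in range(n):
                if out_sum[j] > 0:
                    rank += weights[j][i] / out_sum[j] * scores[j]
                else:
                    rank += scores[j] / n  # isolated sentence: spread its score evenly
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


# Toy "embeddings": the middle sentence is similar to both of the others.
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
toy_scores = textrank_scores(toy_embeddings)
best = max(range(len(toy_scores)), key=toy_scores.__getitem__)
print(best, toy_scores)
```

Sentences that are similar to many other sentences accumulate a higher score, so a summary can be formed by keeping the top-scoring ones.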
Others
* **UPDATE**: Some annotator consensus issues in the summary corpus are addressed.
* **UPDATE**: A better command-line interface for summarizer evaluation. Check `sadedegel-summarize evaluate` for more.
* **ADD**: Sentence-level `tf`, `idf` and `tfidf` embeddings
* **ADD**: `Doc` has a `tfidf_embeddings` property, similar to the `bert_embeddings` property.
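
As a rough picture of sentence-level `tfidf` vectors and of the element-sum relevance score used by the TFIDF summarizer, consider the sketch below. The weighting variant is an assumption: raw term counts for `tf`, `log(n / df) + 1` for `idf`, and the sentences of a single document treated as the collection. The library may use a different formula and a corpus-wide vocabulary.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """sentences: list of token lists. Returns (vocabulary, one tfidf vector per sentence)."""
    vocabulary = sorted({tok for sent in sentences for tok in sent})
    n = len(sentences)
    # Number of sentences each token occurs in (counted once per sentence).
    df = Counter(tok for sent in sentences for tok in set(sent))
    # Assumed idf variant for this example: log(n / df) + 1.
    idf = {tok: math.log(n / df[tok]) + 1.0 for tok in vocabulary}
    vectors = []
    for sent in sentences:
        tf = Counter(sent)  # raw term counts within the sentence
        vectors.append([tf[tok] * idf[tok] for tok in vocabulary])
    return vocabulary, vectors


def relevance_scores(vectors):
    """Element sum of each tfidf vector, used as the sentence relevance score."""
    return [sum(vec) for vec in vectors]


toy_sentences = [["kedi", "uyudu"], ["kedi", "kedi", "geldi"], ["hava"]]
vocab, vecs = tfidf_vectors(toy_sentences)
print(vocab)
print(relevance_scores(vecs))
```

With this scoring, sentences that contain more tokens, or rarer ones, receive a higher relevance score.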
Documentation
* **ADD**: YouTube webinar videos (in Turkish) on the [sadedeGel YouTube Channel](https://www.youtube.com/channel/UCyNG1Mehl44XWZ8LzkColuw)
Contribution Guidelines
* **ADD**: Commit Guidelines
* **ADD**: **New Feature** checklist
Feature Drop & Deprecation
* **DROP**: Code quality guidelines are removed since [Code Inspector](https://www.code-inspector.com) limits the number of lines per open source project. We might continue with another provider in the future.
* **DEPRECATED**: `Doc.sents` will be removed by version `0.17`
* Use `[i]` to access the **i**th sentence of a document.
* The `Doc` object now implements `__iter__`, so you can iterate over all sentences of a document.
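
The access pattern that replaces `Doc.sents` can be pictured with a small stand-in class. `ToyDoc` is hypothetical: its naive split on `.` is a placeholder rather than the library's sentence boundary detection, and the `DeprecationWarning` raised by its `sents` property is an assumption about how such a deprecation is typically signalled.

```python
import warnings


class ToyDoc:
    """Hypothetical stand-in for Doc, showing indexing and iteration over sentences."""

    def __init__(self, text):
        # Placeholder sentence splitting: cut on "." and drop empty pieces.
        self._sents = [s.strip() for s in text.split(".") if s.strip()]

    def __getitem__(self, i):
        return self._sents[i]  # doc[i] returns the i-th sentence

    def __iter__(self):
        return iter(self._sents)  # for sentence in doc: ...

    def __len__(self):
        return len(self._sents)

    @property
    def sents(self):
        # Assumption: the old accessor keeps working but warns until it is removed.
        warnings.warn("sents is deprecated; use doc[i] or iterate over doc",
                      DeprecationWarning, stacklevel=2)
        return self._sents


doc = ToyDoc("Kedi uyudu. Yemek hazır.")
print(doc[0])
for sentence in doc:
    print(sentence)
```

In this sketch an empty input such as `ToyDoc("")` simply yields no sentences when iterated.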
Bugfix
* Properly handle empty documents, e.g. `Doc("")` or `Doc('')`.