BERTopic

Latest version: v0.17.0



0.2.1

*Release date: 11 October, 2020*

**Highlights**:

* Improved the class-based TF-IDF (c-TF-IDF) calculation by restricting it to sparse matrices. This prevents out-of-memory errors on large datasets.
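
To illustrate why staying sparse helps, here is a minimal sketch of a class-based TF-IDF weighting that never densifies the class-by-term count matrix. The function name `class_tfidf`, the toy counts, and the exact weighting (L1-normalised term frequency times `log(1 + average words per class / term frequency)`) are assumptions for illustration and may differ from what this release shipped.

```python
import numpy as np
import scipy.sparse as sp


def class_tfidf(counts):
    """Weight an (n_classes x n_terms) count matrix while keeping it sparse.

    Hypothetical helper for illustration; not BERTopic's actual code.
    """
    counts = sp.csr_matrix(counts, dtype=np.float64)

    # Total frequency of each term across all classes (dense vector, n_terms long).
    term_freq = np.asarray(counts.sum(axis=0)).ravel()
    # Average number of words per class.
    avg_words = counts.sum() / counts.shape[0]

    # Assumed IDF-like weight: log(1 + avg_words / term_freq).
    # np.maximum guards against division by zero for all-zero columns.
    idf = np.log1p(avg_words / np.maximum(term_freq, 1.0))

    # L1-normalise each row by multiplying with a sparse diagonal matrix.
    row_sums = np.asarray(counts.sum(axis=1)).ravel()
    tf = sp.diags(1.0 / np.maximum(row_sums, 1.0)) @ counts

    # Scale each column by its weight; sparse times sparse stays sparse.
    return (tf @ sp.diags(idf)).tocsr()


# Toy input: 2 classes, 3 terms.
counts = sp.csr_matrix(np.array([[2, 0, 1], [0, 3, 1]]))
weights = class_tfidf(counts)
print(sp.issparse(weights), weights.shape)
```

Only the two small vectors (`term_freq` and `row_sums`) are dense, so memory grows with the number of non-zero counts rather than with classes times vocabulary size.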

0.2.0

*Release date: 11 October, 2020*

**Highlights**:

- Changed the c-TF-IDF procedure so that it implements a version of scikit-learn's procedure. This should also speed up the calculation of the sparse matrix and prevent memory errors.
- Added automated unit tests

0.1.2

*Release date: 1 October, 2020*

**Highlights**:

* When transforming new documents, `self.mapped_topics` could be missing. It is now set in `__init__`.
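
A minimal sketch of this class of bug and its fix, using a hypothetical `TopicModel` class rather than BERTopic's actual code. Treating `mapped_topics` as a dictionary from old to new topic ids is an assumption made only for this example.

```python
class TopicModel:
    """Hypothetical stand-in for a model class; not BERTopic's actual code."""

    def __init__(self):
        # Initialise the attribute up front so it always exists,
        # even when no topic mapping was ever computed.
        self.mapped_topics = None

    def transform(self, topics):
        # Without the assignment in __init__, this attribute access would
        # raise AttributeError on a model that never set mapped_topics.
        if self.mapped_topics is None:
            return list(topics)
        return [self.mapped_topics.get(t, t) for t in topics]


model = TopicModel()
unchanged = model.transform([0, 1, 2])

model.mapped_topics = {2: 0}  # assumed shape: old topic id -> new topic id
remapped = model.transform([0, 1, 2])
print(unchanged, remapped)
```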

0.1.1

*Release date: 24 September, 2020*

**Highlights**:

* Fixed requirements: resolved an issue with PyTorch
* Updated documentation

0.01

`new_topics = [np.argmax(prob) if max(prob) >= probability_threshold else -1 for prob in probs]`
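
The list comprehension above can be run on its own as follows. The `probs` array and the threshold value are made-up inputs for illustration. A document keeps its most likely topic only if that probability reaches the threshold, and is otherwise assigned `-1`.

```python
import numpy as np

# Hypothetical per-document topic probabilities
# (rows: documents, columns: topics).
probs = np.array([
    [0.10, 0.70, 0.20],
    [0.30, 0.35, 0.35],
    [0.05, 0.05, 0.90],
])
probability_threshold = 0.5  # assumed value for this example

new_topics = [
    int(np.argmax(prob)) if np.max(prob) >= probability_threshold else -1
    for prob in probs
]
print(new_topics)  # [1, -1, 2]
```

The second document has no topic at or above 0.5, so it falls back to `-1`.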

**Fixes**:

* Fixed `None` being returned for probabilities when transforming unseen documents
* Replaced all instances of `arg:` with `Arguments:` for consistency
* Before saving a fitted BERTopic instance, we now remove the stop words stored in the fitted CountVectorizer model, because that collection can grow quite large when `min_df` is set to a value larger than 1 and many words end up as stop words
* Set `"hdbscan>=0.8.28"` to prevent numpy issues
  * Although this was already fixed by the new release of HDBSCAN, it is technically still possible to install 0.8.27 alongside BERTopic, which leads to these numpy issues
* Update gensim dependency to `>=4.0.0` ([#371](https://github.com/MaartenGr/BERTopic/issues/371))
* Fix topic 0 not appearing in visualizations ([#472](https://github.com/MaartenGr/BERTopic/issues/472))
* Fix ([#506](https://github.com/MaartenGr/BERTopic/issues/506))
* Fix ([#429](https://github.com/MaartenGr/BERTopic/issues/429))
* Fix typo in DTM documentation by [hp0404](https://github.com/hp0404) in [#386](https://github.com/MaartenGr/BERTopic/pull/386)

0.1.0

*Release date: 24 September, 2020*

**Highlights**:

- First release of `BERTopic`
- Added parameters for UMAP and HDBSCAN
- Option to choose sentence-transformer model
- Method for transforming unseen documents
- Save and load trained models (UMAP and HDBSCAN)
- Extract topics and their sizes

**Notable Changes**:

- Optimized c-TF-IDF
- Improved documentation
- Improved topic reduction
