Bertopic

Latest version: v0.16.4

Safety actively analyzes 706259 Python packages for vulnerabilities to keep your Python projects secure.

Scan your dependencies

Page 5 of 6

0.4.0

*Release date: 21 December, 2020*

**Highlights**:

* Visualize Topics similar to [LDAvis](https://github.com/cpsievert/LDAvis)
* Added option to reduce topics after training
* Added option to update topic representation after training
* Added option to search topics using a search term
* Significantly improved the stability of generating clusters
* Finetune the topic words by selecting the most coherent words with the highest c-TF-IDF values
* More extensive tutorials in the documentation

**Notable Changes**:

* Option to select language instead of sentence-transformers models to minimize the complexity of using BERTopic
* Improved logging (remove duplicates)
* Check if BERTopic is fitted
* Added TF-IDF as an embedder instead of transformer models (see tutorial)
* Numpy for Python 3.6 will be dropped and was therefore removed from the workflow.
* Preprocess text before passing it through c-TF-IDF
* Merged `get_topics_freq()` with `get_topic_freq()`

**Fixes**:

* Fix error handling topic probabilities

0.3.2

*Release date: 16 November, 2020*

**Highlights**:

* Fixed a bug with the topic reduction method that seems to reduce the number of topics but not to the nr_topics as defined in the class. Since this was, to a certain extend, breaking the topic reduction method a new release was necessary.

0.3.1

*Release date: 4 November, 2020*

**Highlights**:

* Adding the option to use custom embeddings or embeddings that you generated beforehand with whatever package you'd like to use. This allows users to further customize BERTopic to their liking.

0.3.0

*Release date: 29 October, 2020*

**Highlights**:

- transform() and fit_transform() now also return the topic probability distributions
- Added visualize_distribution() which visualizes the topic probability distribution for a single document

0.2.2

*Release date: 17 October, 2020*

**Highlights**:

- Fixed n_gram_range not being used
- Added option for using stopwords

0.2.1

*Release date: 11 October, 2020*

**Highlights**:

* Improved the calculation of the class-based TF-IDF procedure by limiting the calculation to sparse matrices. This prevents out-of-memory problems when faced with large datasets.

Page 5 of 6

© 2025 Safety CLI Cybersecurity Inc. All Rights Reserved.