Tinytopics

Latest version: v0.7.4

0.7.4

Improvements

- Add Python 3.13 support by conditionally requiring torch >= 2.6.0 under
Python >= 3.13 (47).
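
A conditional requirement like this is usually declared with a PEP 508 environment marker, for example `torch>=2.6.0; python_version >= "3.13"`. The sketch below mirrors that rule in plain Python. The helper name `torch_floor_for` is hypothetical and is not part of tinytopics; the changelog does not state the bound used on older Python versions, so the helper returns `None` there.

```python
import sys


def torch_floor_for(python_version=None):
    """Return the torch lower bound implied by the rule above, or None.

    Hypothetical helper for illustration only; not part of tinytopics.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    # Equivalent in spirit to the marker: python_version >= "3.13"
    if tuple(python_version) >= (3, 13):
        return "2.6.0"
    # The changelog does not say which bound applies below Python 3.13.
    return None


if __name__ == "__main__":
    print(torch_floor_for())
```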

Documentation

- Extend the installation section in `README.md` to cover GPU support,
dependency overrides, and project dependency management (48).

Maintenance

- Manage project with uv (46).
- Change logo typeface for a fresh look. Improve the logo text rendering
workflow to use SVG (45).
- Change logo image path from relative to absolute URL for proper rendering
on PyPI (44).

0.7.3

Maintenance

- Use `.yml` extension for GitHub Actions workflows consistently (40).
- Use isort and ruff to sort imports and format Python code (41).

0.7.2

New features

- Add `TorchDiskDataset` class to support using `.pt` or `.pth` files
as inputs for `fit_model()` and `fit_model_distributed()` (38).
Similar to `NumpyDiskDataset` added in tinytopics 0.6.0, this class also
uses memory-mapped mode to load data, so that datasets larger than system
memory can be used for training.

0.7.1

Documentation

- Add distributed training speed and cost metrics on 8x A100 (40 GB SXM4) to
the [distributed training](https://nanx.me/tinytopics/articles/distributed/)
article (34). This supplements the existing 1x H100 and 4x H100 metrics.

Testing

- Add unit tests for `fit_model_distributed()` (35).
- Add pytest-cov to development dependencies (35).

0.7.0

New features

- Add `fit_model_distributed()` to support distributed training using
Hugging Face Accelerate.
See the [distributed training](https://nanx.me/tinytopics/articles/distributed/)
article for details (32).

Improvements

- Use `tqdm.auto` for better progress bar visuals when used in notebooks (30).
- Move dataset classes and loss functions into dedicated modules to improve
code structure and reusability (31).

0.6.0

New features

- `fit_model()` now accepts a PyTorch `Dataset` as input, in addition
to in-memory tensors. This allows fitting topic models on data larger than
GPU VRAM or system RAM. The `NumpyDiskDataset` class is added to read
`.npy` document-term matrices from disk on demand (26).
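
To illustrate reading a `.npy` matrix from disk on demand, the sketch below uses NumPy's memory-mapped loading. The class `NpyRowDataset` and the function `demo` are hypothetical and are not the `NumpyDiskDataset` implementation; the class exposes only `__len__` and `__getitem__`, the interface a map-style PyTorch `Dataset` provides, and avoids importing PyTorch to keep the example small.

```python
import os
import tempfile

import numpy as np


class NpyRowDataset:
    """Hypothetical stand-in: serves rows of a .npy matrix without loading it all."""

    def __init__(self, path):
        # mmap_mode="r" opens the file read-only as a memory map;
        # data is paged in from disk only when it is accessed.
        self.data = np.load(path, mmap_mode="r")

    def __len__(self):
        return self.data.shape[0]

    def __getitem__(self, idx):
        # Copy just the requested row into an ordinary in-memory array.
        return np.array(self.data[idx], dtype=np.float32)


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "X.npy")
        np.save(path, np.arange(12, dtype=np.float32).reshape(4, 3))
        ds = NpyRowDataset(path)
        result = (len(ds), ds[2].tolist())
        del ds  # release the mapping before the directory is removed
        return result
```

In a real training setup, a class like this would typically subclass `torch.utils.data.Dataset` and return tensors.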

Documentation

- Add a [memory-efficient training](https://nanx.me/tinytopics/articles/memory/)
article demonstrating the new features for fitting topic models on
large datasets (27).
