Min-dalle

Latest version: v0.4.11

0.4

- Fixed a critical CUDA runtime error that occurred when generating tokens larger than the VQGAN's vocabulary
- Added `generate_images_stream` and `generate_images` to generate individual images. These are in active use in the Discord bot.
- Faster inference: a 9x9 grid now generates in 38 seconds on an A100
- Added `temperature`, `top_k`, and `supercondition_factor` parameters
- Added a simple Tkinter UI (thanks to 20kdc)
- Added an option to tile images in token space instead of pixel space. This creates a seamless effect where the borders between images are blended.
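
To illustrate how `temperature`, `top_k`, and `supercondition_factor` typically interact when sampling an image token, here is a minimal pure-Python sketch. It is not min-dalle's code: the helper names (`supercondition`, `top_k_probabilities`, `sample_token`) and the toy logits are hypothetical, and the blending formula is the usual super-conditioning form, assumed here rather than taken from the release notes.

```python
import math
import random


def supercondition(logits_uncond, logits_cond, factor):
    """Blend unconditioned and text-conditioned logits.

    A factor above 1 extrapolates past the conditioned logits,
    pushing the distribution further toward the text prompt.
    """
    return [u * (1 - factor) + c * factor
            for u, c in zip(logits_uncond, logits_cond)]


def top_k_probabilities(logits, top_k, temperature):
    """Return probabilities that are zero outside the top_k largest logits.

    Ties at the threshold are all kept, so slightly more than top_k
    entries can survive when logits are equal.
    """
    threshold = sorted(logits, reverse=True)[top_k - 1]
    peak = max(logits)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [
        math.exp((x - peak) / temperature) if x >= threshold else 0.0
        for x in logits
    ]
    total = sum(weights)
    return [w / total for w in weights]


def sample_token(logits, top_k, temperature, rng):
    """Draw one token index from the truncated, temperature-scaled distribution."""
    probs = top_k_probabilities(logits, top_k, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


# Toy logits over a 4-token vocabulary (hypothetical values).
logits_uncond = [0.5, 1.0, 0.2, 0.1]
logits_cond = [0.4, 2.0, 1.5, 0.0]

logits = supercondition(logits_uncond, logits_cond, factor=4)
probs = top_k_probabilities(logits, top_k=2, temperature=1.0)
token = sample_token(logits, top_k=2, temperature=1.0, rng=random.Random(0))
```

In this sketch a lower `temperature` sharpens the distribution over the surviving tokens, a smaller `top_k` discards more of the tail, and a larger super-conditioning factor weights the text-conditioned logits more heavily.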

0.3

- added `is_reusable` parameter. Turning it off saves memory (e.g. for the command line script), and keeping it on makes multiple calls to `generate_image` faster
- added `log2_k` parameter to control top-k image token sampling
- added `log2_supercondition_factor` parameter to control the super conditioning amount
- added `log2_mid_count` and `generate_image_stream` to stream intermediate outputs. Incomplete tokens are detokenized to an image multiple times during the decoding process. This adds very little time to the overall run time
- added `dtype` parameter to autocast operations to `float32`, `float16`, or `bfloat16`
- a grid size of 8x8 now generates in 35 seconds on an A100
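
The `log2_` prefix suggests that each of these parameters is an exponent of two. The sketch below shows that reading and how evenly spaced intermediate outputs could be scheduled during decoding. It rests on assumptions: `stream_steps` is a hypothetical helper, the 256-token image length and the even spacing are illustrative choices, and none of this is taken from the library's source.

```python
IMAGE_TOKEN_COUNT = 256  # assumed: a 16x16 grid of image tokens per image


def stream_steps(log2_mid_count):
    """Yield the 1-based token counts at which a partial image would be emitted.

    With log2_mid_count = 3 there are 2**3 = 8 outputs, one every 32 tokens,
    the last of which is the finished image.
    """
    mid_count = 2 ** log2_mid_count
    interval = IMAGE_TOKEN_COUNT // mid_count
    for step in range(1, IMAGE_TOKEN_COUNT + 1):
        if step % interval == 0:
            yield step


# Assumed mapping from the log2 parameter to the actual top-k value.
log2_k = 8
top_k = 2 ** log2_k

steps = list(stream_steps(3))
print(top_k)   # 256
print(steps)   # [32, 64, 96, 128, 160, 192, 224, 256]
```

Expressing such parameters as exponents keeps the values at powers of two, which is convenient for sliders and for dividing the token sequence evenly.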

0.2.0

- [Added to PyPI](https://pypi.org/project/min-dalle/) so now the entire setup process is `pip install min-dalle`
- Pre-converted PyTorch weights are downloaded when needed from [the Hugging Face Hub](https://huggingface.co/kuprel/min-dalle/tree/main), so there is no more converting from Flax

Breaking Changes
- `MinDalleTorch` is now `MinDalle`
- `MinDalleFlax` and flax-to-torch conversion code have been moved to a different repository

0.1.1

Important Bug Fixes
- Image tokens were mistakenly being computed twice in the command line script when using torch
- Tokenizer was not working correctly on some machines (e.g. Windows). Files are now read with UTF-8 encoding.
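
The tokenizer fix comes down to passing an explicit encoding when opening files, because Python's `open` otherwise falls back to a platform-dependent default (often not UTF-8 on Windows). The sketch below shows the pattern with a throwaway file; the file name `vocab.json` and its contents are hypothetical stand-ins, not the project's actual tokenizer files.

```python
import json
import os
import tempfile

# Hypothetical vocabulary containing non-ASCII tokens.
vocab = {"café": 0, "naïve": 1}

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "vocab.json")

    # Write and read with an explicit encoding so the result does not
    # depend on the operating system's default locale.
    with open(path, "w", encoding="utf8") as f:
        json.dump(vocab, f, ensure_ascii=False)

    with open(path, "r", encoding="utf8") as f:
        loaded = json.load(f)
```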

New Features
- `is_expendable` argument reduces memory usage for command line script by loading then unloading encoder/decoder/detokenizer when needed
- simpler 4D `attention_state` replacing 5D `keys_values_state` and faster inference time
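
The load-then-unload behavior described for `is_expendable` can be sketched as follows. This is a generic illustration, not the library's implementation: the `ExpendablePipeline` class, its `run_stage` method, and the toy "encoder" and "decoder" callables are all hypothetical.

```python
class ExpendablePipeline:
    """Runs named stages, optionally freeing each stage right after use."""

    def __init__(self, is_expendable, loaders):
        self.is_expendable = is_expendable
        self.loaders = loaders      # stage name -> factory that builds the stage
        self.loaded = {}            # stages currently held in memory
        self.load_count = 0
        if not is_expendable:
            # Keep everything resident so repeated calls are fast.
            for name in loaders:
                self._load(name)

    def _load(self, name):
        self.loaded[name] = self.loaders[name]()
        self.load_count += 1

    def run_stage(self, name, value):
        if name not in self.loaded:
            self._load(name)
        result = self.loaded[name](value)
        if self.is_expendable:
            # Drop the stage immediately to keep peak memory low.
            del self.loaded[name]
        return result


# Toy stand-ins for real model components.
loaders = {
    "encoder": lambda: (lambda text: text.upper()),
    "decoder": lambda: (lambda state: state + "!"),
}

pipeline = ExpendablePipeline(is_expendable=True, loaders=loaders)
encoded = pipeline.run_stage("encoder", "hi")
output = pipeline.run_stage("decoder", encoded)
print(output)  # HI!
```

The trade-off is the one the notes describe: freeing each stage after use lowers peak memory for a one-shot script, while keeping stages loaded avoids paying the load cost again on later calls.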

0.1

`MinDalleTorch` and `MinDalleFlax` classes to initialize the model once and run it multiple times
