Diffusers


0.13.0

:dart: Controlling Generation

There has been much recent work on fine-grained control of diffusion networks!

Diffusers now supports:
1. [Instruct Pix2Pix](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/pix2pix)
2. [Pix2Pix Zero](pix2pix-zero), more details in [docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/pix2pix_zero)
3. [Attend-and-Excite](attend-excite), more details in [docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/attend_and_excite)
4. [Semantic guidance](semantic-guidance), more details in [docs](https://huggingface.co/docs/diffusers/main/en/api/pipelines/semantic_stable_diffusion)
5. [Self-attention guidance](self-attention-guidance), more details in [docs](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/self_attention_guidance)
6. [Depth2image](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion_2#depthtoimage)
7. [MultiDiffusion panorama](panorama), more details in [docs](https://huggingface.co/docs/diffusers/main/en/api/pipelines/stable_diffusion/panorama)

See our doc on [controlling image generation](https://huggingface.co/docs/diffusers/main/en/using-diffusers/controlling_generation) and the individual pipeline docs for more details on each method.
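
As an example, MultiDiffusion panorama can be run like this (a minimal sketch following its docs; the base checkpoint and the `DDIMScheduler` choice come from there):

```python
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

model_ckpt = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_ckpt, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(model_ckpt, scheduler=scheduler, torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# the pipeline fuses overlapping denoising windows into one wide panorama
image = pipe("a photo of the dolomites").images[0]
```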


:up: Latent Upscaler

Latent Upscaler is a diffusion model designed explicitly for Stable Diffusion. You can take the generated latent from Stable Diffusion and pass it into the upscaler before decoding with your standard VAE. Or you can take any image, encode it into the latent space, use the upscaler, and decode it. It is incredibly flexible and can work with any SD checkpoint.
Original output image | 2x upscaled output image
:-------------------------:|:-------------------------:
![](https://pbs.twimg.com/media/Fg8UijAaEAAqfvS?format=png&name=small) | ![](https://pbs.twimg.com/media/Fg8UjCmaMAAAUdS?format=jpg&name=medium)

The model was developed by [Katherine Crowson](https://github.com/crowsonkb/k-diffusion) in collaboration with [Stability AI](https://stability.ai/).

```python
from diffusers import StableDiffusionLatentUpscalePipeline, StableDiffusionPipeline
import torch

pipeline = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16)
pipeline.to("cuda")

upscaler = StableDiffusionLatentUpscalePipeline.from_pretrained("stabilityai/sd-x2-latent-upscaler", torch_dtype=torch.float16)
upscaler.to("cuda")

prompt = "a photo of an astronaut high resolution, unreal engine, ultra realistic"
generator = torch.manual_seed(33)

# we stay in latent space! Let's make sure that Stable Diffusion returns the image
# in latent space
low_res_latents = pipeline(prompt, generator=generator, output_type="latent").images

upscaled_image = upscaler(
    prompt=prompt,
    image=low_res_latents,
    num_inference_steps=20,
    guidance_scale=0,
    generator=generator,
).images[0]

# Let's save the upscaled image under "astronaut_1024.png"
upscaled_image.save("astronaut_1024.png")

# as a comparison: Let's also save the low-res image
with torch.no_grad():
    image = pipeline.decode_latents(low_res_latents)
image = pipeline.numpy_to_pil(image)[0]

image.save("astronaut_512.png")
```




:zap: Optimization

In addition to new features and an increasing number of pipelines, `diffusers` cares a lot about performance. This release brings a number of optimizations that you can turn on easily.

xFormers

Memory efficient attention, as implemented by [xFormers](https://github.com/facebookresearch/xformers), has been available in `diffusers` for some time. The problem was that installing `xFormers` could be complicated because there were no official `pip` wheels (or they were outdated), and you had to resort to installing from source.

Starting with `xFormers` 0.0.16, official pip wheels are published with every release, so installing and using xFormers is now as simple as these two steps:

1. `pip install xformers` in your terminal.
2. `pipe.enable_xformers_memory_efficient_attention()` in your code to opt in for your pipelines.

These actions will unlock dramatic memory savings, and usually faster inference too!
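
Putting both steps together (the checkpoint here is just an example):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# opt in to xFormers memory-efficient attention
pipe.enable_xformers_memory_efficient_attention()

image = pipe("a photo of an astronaut riding a horse on mars").images[0]
```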

See more details in [the documentation](https://huggingface.co/docs/diffusers/v0.13.0/en/optimization/xformers).

0.12.1

Make sure cached models can be loaded in offline mode.

* Don't call the Hub if `local_files_only` is specified by patrickvonplaten in 2119
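
With this fix, a previously cached model loads without any network access, e.g.:

```python
from diffusers import StableDiffusionPipeline

# only reads from the local cache; raises an error instead of calling the Hub
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", local_files_only=True
)
```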

0.12.0

πŸͺ„ Instruct-Pix2Pix
Instruct-Pix2Pix is a Stable Diffusion model fine-tuned for editing images from human instructions. Given an input image and a written instruction that tells the model what to do, the model follows the instruction to edit the image.

![image](https://huggingface.co/datasets/diffusers/diffusers-images-docs/resolve/main/pix2pix.jpeg)

The model was released with the paper [InstructPix2Pix: Learning to Follow Image Editing Instructions](https://arxiv.org/abs/2211.09800). More information about the model can be found in the paper.


```
pip install diffusers transformers safetensors accelerate
```


```python
import requests
import torch
from PIL import Image, ImageOps
from diffusers import StableDiffusionInstructPix2PixPipeline

model_id = "timbrooks/instruct-pix2pix"
pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

url = "https://huggingface.co/datasets/diffusers/diffusers-images-docs/resolve/main/mountain.png"

def download_image(url):
    image = Image.open(requests.get(url, stream=True).raw)
    image = ImageOps.exif_transpose(image)
    image = image.convert("RGB")
    return image

image = download_image(url)

prompt = "make the mountains snowy"
edit = pipe(prompt, image=image, num_inference_steps=20, image_guidance_scale=1.5, guidance_scale=7).images[0]
edit.save("snowy_mountains.png")
```

* Add InstructPix2Pix pipeline by patil-suraj 2040


πŸ€– DiT

Diffusion Transformers (DiT) is a class-conditional latent diffusion model that replaces the commonly used U-Net backbone with a transformer operating on latent patches. The pretrained model is trained on the ImageNet-1K dataset and can generate class-conditional images of 256x256 or 512x512 pixels.

![dit](https://user-images.githubusercontent.com/8100/214593099-3b478e53-64ca-4265-925c-50eb0ea5da3e.png)

The model was released with the paper [Scalable Diffusion Models with Transformers](https://www.wpeebles.com/DiT).

```python
import torch
from diffusers import DiTPipeline

model_id = "facebook/DiT-XL-2-256"
pipe = DiTPipeline.from_pretrained(model_id, torch_dtype=torch.float16).to("cuda")

# pick words that exist in ImageNet
words = ["white shark", "umbrella"]
class_ids = pipe.get_label_ids(words)

output = pipe(class_labels=class_ids)
image = output.images[0]  # label 'white shark'
```

⚑ LoRA

LoRA is a technique for performing parameter-efficient fine-tuning for large models. LoRA works by adding so-called "update matrices" to specific blocks of a pre-trained model. During fine-tuning, only these update matrices are updated while the pre-trained model parameters are kept frozen. This allows us to achieve greater memory efficiency as well as easier portability during fine-tuning.
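
To make the idea concrete, here is a toy PyTorch sketch of such update matrices wrapped around a frozen linear layer (illustrative only, not diffusers' actual implementation):

```python
import torch.nn as nn

class LoRALinear(nn.Module):
    """Computes W x + scale * B(A(x)), training only the low-rank A and B."""

    def __init__(self, base: nn.Linear, rank: int = 4, scale: float = 1.0):
        super().__init__()
        self.base = base
        self.base.requires_grad_(False)  # pre-trained weights stay frozen
        self.down = nn.Linear(base.in_features, rank, bias=False)  # A: project down to `rank`
        self.up = nn.Linear(rank, base.out_features, bias=False)   # B: project back up
        nn.init.zeros_(self.up.weight)  # the update starts as a no-op
        self.scale = scale

    def forward(self, x):
        return self.base(x) + self.scale * self.up(self.down(x))
```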

LoRA was proposed in [LoRA: Low-Rank Adaptation of Large Language Models](https://arxiv.org/abs/2106.09685). In the original paper, the authors investigated LoRA for fine-tuning large language models like GPT-3. [cloneofsimo](https://github.com/cloneofsimo) was the first to try out LoRA training for Stable Diffusion in the popular [lora](https://github.com/cloneofsimo/lora) GitHub repository.

Diffusers now supports [LoRA](https://arxiv.org/abs/2106.09685)! This means you can now fine-tune a model like Stable Diffusion using consumer GPUs like a Tesla T4 or RTX 2080 Ti. LoRA support was added to [`UNet2DConditionModel`](https://huggingface.co/docs/diffusers/main/en/api/models#diffusers.UNet2DConditionModel) and the DreamBooth training script by patrickvonplaten in 1884.

By using LoRA, the fine-tuned checkpoints will be **just 3 MB in size**. After fine-tuning, you can use the LoRA checkpoints like so:

```py
from diffusers import StableDiffusionPipeline
import torch

model_path = "sayakpaul/sd-model-finetuned-lora-t4"
pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16)
pipe.unet.load_attn_procs(model_path)
pipe.to("cuda")

prompt = "A pokemon with blue eyes."
image = pipe(prompt, num_inference_steps=30, guidance_scale=7.5).images[0]
image.save("pokemon.png")
```


![pokemon-image](https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/pokemon-collage.png)

You can follow these resources to learn more about using LoRA in diffusers:

* [text2image fine-tuning script](https://github.com/huggingface/diffusers/tree/main/examples/text_to_image#training-with-lora) (by sayakpaul in 2031).
* [Official documentation discussing how LoRA is supported](https://huggingface.co/docs/diffusers/main/en/training/lora) (by sayakpaul in #2086).

πŸ“ Customizable Cross Attention

LoRA leverages a new method to customize the cross attention layers deep in the UNet. This can be useful for other creative approaches such as [Prompt-to-Prompt](https://arxiv.org/abs/2208.01626), and it makes it easier to apply optimizers like [xFormers](https://github.com/facebookresearch/xformers). This new "attention processor" abstraction was created by patrickvonplaten in #1639 after discussing the design with the community, and we have used it to rewrite our xFormers and attention slicing implementations!
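
The pattern itself is small; here is a self-contained toy version (illustrative, not diffusers' actual code) of a module that delegates its attention computation to a swappable processor:

```python
import torch
import torch.nn as nn

class DefaultProcessor:
    """Plain scaled dot-product attention built from the module's projections."""

    def __call__(self, attn, hidden_states, encoder_hidden_states=None):
        context = encoder_hidden_states if encoder_hidden_states is not None else hidden_states
        q, k, v = attn.to_q(hidden_states), attn.to_k(context), attn.to_v(context)
        scores = q @ k.transpose(-1, -2) / q.shape[-1] ** 0.5
        return attn.to_out(scores.softmax(dim=-1) @ v)

class ToyCrossAttention(nn.Module):
    """Owns the projections; the processor owns the attention computation."""

    def __init__(self, dim):
        super().__init__()
        self.to_q, self.to_k, self.to_v = (nn.Linear(dim, dim) for _ in range(3))
        self.to_out = nn.Linear(dim, dim)
        self.processor = DefaultProcessor()

    def set_processor(self, processor):
        self.processor = processor  # swap in xFormers / sliced / prompt-to-prompt variants

    def forward(self, hidden_states, encoder_hidden_states=None):
        return self.processor(self, hidden_states, encoder_hidden_states)
```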

🌿 Flax => PyTorch

Converting Flax model weights to PyTorch was a long-requested feature, and prolific community member camenduru took up the gauntlet in 1900 to make it possible. This means that you can train or fine-tune models super fast using Google TPUs, and then convert the weights to PyTorch for everybody to use. Thanks camenduru!
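
At the model level, the conversion looks like this (paths are placeholders):

```python
from diffusers import UNet2DConditionModel

# load Flax weights and convert them to PyTorch on the fly
unet = UNet2DConditionModel.from_pretrained("path/to/flax-checkpoint", from_flax=True)
unet.save_pretrained("path/to/pytorch-checkpoint")
```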

πŸŒ€ Flax Img2Img

Another community member, dhruvrnaik, ported the image-to-image pipeline to Flax in 1355! Using a TPU v2-8 (available in Colab's free tier), you can generate 8 images at once in a few seconds!

🎲 DEIS Scheduler
DEIS (Diffusion Exponential Integrator Sampler) is a new fast multistep scheduler that can generate high-quality samples in fewer steps.
The scheduler was introduced in the paper [Fast Sampling of Diffusion Models with Exponential Integrator](https://arxiv.org/abs/2204.13902). More information about the scheduler can be found in the paper.

```python
from diffusers import StableDiffusionPipeline, DEISMultistepScheduler
import torch

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = DEISMultistepScheduler.from_config(pipe.scheduler.config)
pipe = pipe.to("cuda")

prompt = "a photo of an astronaut riding a horse on mars"
generator = torch.Generator(device="cuda").manual_seed(0)
image = pipe(prompt, generator=generator, num_inference_steps=25).images[0]
```

* feat : add log-rho deis multistep scheduler by qsh-zh 1432

Reproducibility

One can now pass CPU generators to all pipelines even if the pipeline is on GPU. This ensures
much better reproducibility across GPU hardware:

```python
import torch
from diffusers import DDIMPipeline
import numpy as np

model_id = "google/ddpm-cifar10-32"

# load model and scheduler
ddim = DDIMPipeline.from_pretrained(model_id)
ddim.to("cuda")

# create a generator for reproducibility
generator = torch.manual_seed(0)

# run pipeline for just two steps and return numpy tensor
image = ddim(num_inference_steps=2, output_type="np", generator=generator).images
print(np.abs(image).sum())
```


See: 1902 and https://huggingface.co/docs/diffusers/using-diffusers/reproducibility

Important New Guides

- Stable Diffusion 101: https://huggingface.co/docs/diffusers/stable_diffusion
- Reproducibility: https://huggingface.co/docs/diffusers/using-diffusers/reproducibility
- LoRA: https://huggingface.co/docs/diffusers/training/lora

Important Bug Fixes

- Don't download safetensors if library is not installed: 2057
- Make sure that `save_pretrained(...)` doesn't accidentally delete files: 2038
- Fix CPU offload docs for maximum memory gain: 1968
- Fix conversion for exotically sorted weight names: 1959
- Fix intermediate checkpointing for textual inversion, thanks lstein 2072

All commits

* update composable diffusion for an updated diffuser library by nanlliu in 1697
* [Tests] Fix UnCLIP cpu offload tests by anton-l in 1769
* Bump to 0.12.0.dev0 by anton-l in 1771
* [Dreambooth] flax fixes by pcuenca in 1765
* update train_unconditional_ort.py by prathikr in 1775
* Only test for xformers when enabling them 1773 by kig in 1776
* expose polynomial:power and cosine_with_restarts:num_cycles params by zetyquickly in 1737
* [Flax] Stateless schedulers, fixes and refactors by skirsten in 1661
* Correct hf hub download by patrickvonplaten in 1767
* Dreambooth docs: minor fixes by pcuenca in 1758
* Fix num images per prompt unclip by patil-suraj in 1787
* Add Flax stable diffusion img2img pipeline by dhruvrnaik in 1355
* Refactor cross attention and allow mechanism to tweak cross attention function by patrickvonplaten in 1639
* Fix OOM when using PyTorch with JAX installed. by pcuenca in 1795
* reorder model wrap + bug fix by prathikr in 1799
* Remove hardcoded names from PT scripts by patrickvonplaten in 1778
* [textual_inversion] unwrap_model text encoder before accessing weights by patil-suraj in 1816
* fix small mistake in annotation: 32 -> 64 by Line290 in 1780
* Make safety_checker optional in more pipelines by pcuenca in 1796
* Device to use (e.g. cpu, cuda:0, cuda:1, etc.) by camenduru in 1844
* Avoid duplicating PyTorch + safetensors downloads. by pcuenca in 1836
* Width was typod as weight by Helw150 in 1800
* fix: resize transform now preserves aspect ratio by parlance-zz in 1804
* Make xformers optional even if it is available by kn in 1753
* Allow selecting precision to make Dreambooth class images by kabachuha in 1832
* unCLIP image variation by williamberman in 1781
* [Community Pipeline] MagicMix by daspartho in 1839
* [Versatile Diffusion] Fix cross_attention_kwargs by patrickvonplaten in 1849
* [Dtype] Align dtype casting behavior with Transformers and Accelerate by patrickvonplaten in 1725
* [StableDiffusionInpaint] Correct test by patrickvonplaten in 1859
* [textual inversion] add gradient checkpointing and small fixes. by patil-suraj in 1848
* Flax: Fix img2img and align with other pipeline by skirsten in 1824
* Make repo structure consistent by patrickvonplaten in 1862
* [Unclip] Make sure text_embeddings & image_embeddings can directly be passed to enable interpolation tasks. by patrickvonplaten in 1858
* Fix ema decay by pcuenca in 1868
* [Docs] Improve docs by patrickvonplaten in 1870
* [examples] update loss computation by patil-suraj in 1861
* [train_text_to_image] allow using non-ema weights for training by patil-suraj in 1834
* [Attention] Finish refactor attention file by patrickvonplaten in 1879
* Fix typo in train_dreambooth_inpaint by pcuenca in 1885
* Update ONNX Pipelines to use np.float64 instead of np.float by agizmo in 1789
* [examples] misc fixes by patil-suraj in 1886
* Fixes to the help for `report_to` in training scripts by pcuenca in 1888
* updated doc for stable diffusion pipelines by yiyixuxu in 1770
* Add UnCLIPImageVariationPipeline to dummy imports by anton-l in 1897
* Add accelerate and xformers versions to `diffusers-cli env` by anton-l in 1898
* [addresses issue 1642] add add_noise to scheduling-sde-ve by aengusng8 in 1827
* Add condtional generation to AudioDiffusionPipeline by teticio in 1826
* Fixes in comments in SD2 D2I by neverix in 1903
* [Deterministic torch randn] Allow tensors to be generated on CPU by patrickvonplaten in 1902
* [Docs] Remove duplicated API doc string by patrickvonplaten in 1901
* fix: DDPMScheduler.set_timesteps() by Joqsan in 1912
* Fix --resume_from_checkpoint step in train_text_to_image.py by merfnad in 1914
* Support training SD V2 with Flax by yasyf in 1783
* Fix lr-scaling store_true & default=True cli argument for textual_inversion training. by aredden in 1090
* Various Fixes for Flax Dreambooth by yasyf in 1782
* Test ResnetBlock2D by hchings in 1850
* Init for korean docs by seriousran in 1910
* New Pipeline: Tiled-upscaling with depth perception to avoid blurry spots by peterwilli in 1615
* Improve reproduceability 2/3 by patrickvonplaten in 1906
* feat : add log-rho deis multistep scheduler by qsh-zh in 1432
* Feature/colossalai by Fazziekey in 1793
* [Docs] Add TRANSLATING.md file by seriousran in 1920
* [StableDiffusionimg2img] validating input type by Shubhamai in 1913
* [dreambooth] low precision guard by williamberman in 1916
* [Stable Diffusion Guide] 101 Stable Diffusion Guide directly into the docs by patrickvonplaten in 1927
* [Conversion] Make sure ema weights are extracted correctly by patrickvonplaten in 1937
* fix path to logo by vvssttkk in 1939
* Add automatic doc sorting by patrickvonplaten in 1940
* update to latest colossalai by Fazziekey in 1951
* fix typo in imagic_stable_diffusion.py by andreemic in 1956
* [Conversion SD] Make sure weirdly sorted keys work as well by patrickvonplaten in 1959
* allow loading ddpm models into ddim by patrickvonplaten in 1932
* [Community] Correct checkpoint merger by patrickvonplaten in 1965
* Update CLIPGuidedStableDiffusion.feature_extractor.size to fix TypeError by oxidase in 1938
* [CPU offload] correct cpu offload by patrickvonplaten in 1968
* [Docs] Update README.md by haofanwang in 1960
* Research project multi subject dreambooth by klopsahlong in 1948
* Example tests by patrickvonplaten in 1982
* Fix slow tests by patrickvonplaten in 1983
* Fix unused upcast_attn flag in convert_original_stable_diffusion_to_diffusers script by kn in 1942
* Allow converting Flax to PyTorch by adding a "from_flax" keyword by camenduru in 1900
* Update docstring by Warvito in 1971
* [SD Img2Img] resize source images to multiple of 8 instead of 32 by vvsotnikov in 1571
* Update README.md to include our blog post by sayakpaul in 1998
* Fix a couple typos in Dreambooth readme by pcuenca in 2004
* Add tests for 2D UNet blocks by hchings in 1945
* [Conversion] Support convert diffusers to safetensors by hua1995116 in 1996
* [Community] Fix merger by patrickvonplaten in 2006
* [Conversion] Improve safetensors by patrickvonplaten in 1989
* [Black] Update black library by patrickvonplaten in 2007
* Fix typos in ColossalAI example by haofanwang in 2001
* Use pipeline tests mixin for UnCLIP pipeline tests + unCLIP MPS fixes by williamberman in 1908
* Change PNDMPipeline to use PNDMScheduler by willdalh in 2003
* [train_unconditional] fix LR scheduler init by patil-suraj in 2010
* [Docs] No more autocast by patrickvonplaten in 2021
* [Flax] Add Flax inpainting impl by xvjiarui in 1966
* Check k-diffusion version is at least 0.0.12 by pcuenca in 2022
* DiT Pipeline by kashif in 1806
* fix dit doc header by patil-suraj in 2027
* [LoRA] Add LoRA training script by patrickvonplaten in 1884
* [Dit] Fix dit tests by patrickvonplaten in 2034
* Fix typos and minor redundancies by Joqsan in 2029
* [Lora] Model card by patrickvonplaten in 2032
* [Save Pretrained] Remove dead code lines that can accidentally remove pytorch files by patrickvonplaten in 2038
* Fix EMA for multi-gpu training in the unconditional example by anton-l in 1930
* Minor fix in the documentation of LoRA by hysts in 2045
* Add InstructPix2Pix pipeline by patil-suraj in 2040
* Create repo before cloning in examples by Wauplin in 2047
* Remove modelcards dependency by Wauplin in 2050
* Module-ise "original stable diffusion to diffusers" conversion script by damian0815 in 2019
* [StableDiffusionInstructPix2Pix] use cpu generator in slow tests by patil-suraj in 2051
* [From pretrained] Don't download .safetensors files if safetensors is… by patrickvonplaten in 2057
* Correct Pix2Pix example by patrickvonplaten in 2056
* add community pipeline: StableUnCLIPPipeline by budui in 2037
* [LoRA] Adds example on text2image fine-tuning with LoRA by sayakpaul in 2031
* Safetensors loading in "convert_diffusers_to_original_stable_diffusion" by cafeai in 2054
* [examples] add dataloader_num_workers argument by patil-suraj in 2070
* Dreambooth: reduce VRAM usage by gleb-akhmerov in 2039
* [Paint by example] Fix cpu offload for paint by example by patrickvonplaten in 2062
* [textual_inversion] Fix resuming state when using gradient checkpointing by pcuenca in 2072
* [lora] Log images when using tensorboard by pcuenca in 2078
* Fix resume epoch for all training scripts except textual_inversion by pcuenca in 2079
* [dreambooth] fix multi on gpu. by patil-suraj in 2088
* Run inference on a specific condition and fix call of manual_seed() by shirayu in 2074
* [Feat] checkpoint_merger works on local models as well as ones that use safetensors by lstein in 2060
* xFormers attention op arg by takuma104 in 2049
* [docs] [dreambooth] note random crop by williamberman in 2085
* Remove wandb from text_to_image requirements.txt by pcuenca in 2092
* [doc] update example for pix2pix by patil-suraj in 2101
* Add `lora` tag to the model tags by apolinario in 2103
* [docs] Adds a doc on LoRA support for diffusers by sayakpaul in 2086
* Allow directly passing text embeddings to Stable Diffusion Pipeline for prompt weighting by patrickvonplaten in 2071
* Improve transformers versions handling by patrickvonplaten in 2104
* Reproducibility 3/3 by patrickvonplaten in 1924

πŸ™Œ Significant community contributions πŸ™Œ

The following contributors have made significant changes to the library over the last release:

* nanlliu
* update composable diffusion for an updated diffuser library (1697)
* skirsten
* [Flax] Stateless schedulers, fixes and refactors (1661)
* Flax: Fix img2img and align with other pipeline (1824)
* hchings
* Test ResnetBlock2D (1850)
* Add tests for 2D UNet blocks (1945)
* seriousran
* Init for korean docs (1910)
* [Docs] Add TRANSLATING.md file (1920)
* qsh-zh
* feat : add log-rho deis multistep scheduler (1432)
* Fazziekey
* Feature/colossalai (1793)
* update to latest colossalai (1951)
* klopsahlong
* Research project multi subject dreambooth (1948)
* xvjiarui
* [Flax] Add Flax inpainting impl (1966)
* damian0815
* Module-ise "original stable diffusion to diffusers" conversion script (2019)
* camenduru
* Allow converting Flax to PyTorch by adding a "from_flax" keyword (1900)

0.11.1

This patch release fixes a bug with `num_images_per_prompt` in the `UnCLIPPipeline`.
* Fix num images per prompt unclip by patil-suraj in 1787
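
For illustration (reusing the Karlo checkpoint from the 0.11.0 notes below):

```python
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# now returns the requested four images per prompt
images = pipe("a red frog on a green leaf", num_images_per_prompt=4).images
```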

0.11.0

:magic_wand: Karlo UnCLIP by Kakao Brain

Karlo is a text-conditional image generation model based on OpenAI's unCLIP architecture, with an improved super-resolution module that upscales from 64px to 256px while recovering high-frequency details in a small number of denoising steps.

This alpha version of Karlo is trained on 115M image-text pairs, including [COYO](https://github.com/kakaobrain/coyo-dataset)-100M high-quality subset, CC3M, and CC12M.
For more information about the architecture, see the Karlo repository: https://github.com/kakaobrain/karlo
![image](https://user-images.githubusercontent.com/26864830/208464171-a46be794-ca3c-4d39-80ab-cee71402f0f0.png)


```
pip install diffusers transformers safetensors accelerate
```


```python
import torch
from diffusers import UnCLIPPipeline

pipe = UnCLIPPipeline.from_pretrained("kakaobrain/karlo-v1-alpha", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "a high-resolution photograph of a big red frog on a green leaf."
image = pipe(prompt).images[0]
```


![img](https://huggingface.co/datasets/patrickvonplaten/images/resolve/main/frog.png)

:octocat: Community pipeline versioning

The community pipelines hosted in `diffusers/examples/community` will now follow the installed version of the library.

E.g. if you have `diffusers==0.9.0` installed, the pipelines from the `v0.9.0` branch will be used: https://github.com/huggingface/diffusers/tree/v0.9.0/examples/community

If you've installed diffusers from source, e.g. with `pip install git+https://github.com/huggingface/diffusers` then the latest versions of the pipelines will be fetched from the `main` branch.

To change the custom pipeline version, set the `custom_revision` argument like so:

```python
from diffusers import DiffusionPipeline

pipeline = DiffusionPipeline.from_pretrained(
    "google/ddpm-cifar10-32", custom_pipeline="one_step_unet", custom_revision="0.10.2"
)
```


:safety_vest: safetensors

Many of the most important checkpoints now have [safetensors](https://github.com/huggingface/safetensors) weights available. Upon installing `safetensors` with:

```
pip install safetensors
```

You will see a nice speed-up when loading your model :rocket:

Some of the most important checkpoints now have safetensors weights:
- https://huggingface.co/stabilityai/stable-diffusion-2
- https://huggingface.co/stabilityai/stable-diffusion-2-1
- https://huggingface.co/stabilityai/stable-diffusion-2-depth
- https://huggingface.co/stabilityai/stable-diffusion-2-inpainting
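
No code changes are needed to benefit; with `safetensors` installed, `from_pretrained` picks up the `.safetensors` weights automatically:

```python
import torch
from diffusers import StableDiffusionPipeline

# loads the .safetensors weights when the safetensors library is installed
pipe = StableDiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-2", torch_dtype=torch.float16)
```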

Batched generation bug fixes :bug:

* Make sure all pipelines can run with batched input by patrickvonplaten in 1669

We fixed a lot of bugs for batched generation. All pipelines should now correctly process batches of prompts and images :hugs:
Also we made it much easier to tweak images with reproducible seeds:
https://huggingface.co/docs/diffusers/using-diffusers/reusing_seeds
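
In the spirit of that guide, one generator per prompt makes each image in a batch individually reproducible (a sketch, with example seeds):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

# one generator per prompt; re-seed a single entry to tweak just that image
prompts = ["a photo of a corgi"] * 4
generators = [torch.Generator(device="cuda").manual_seed(i) for i in range(4)]
images = pipe(prompts, generator=generators).images
```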

:memo: Changelog
* Remove spurious arg in training scripts by pcuenca in 1644
* dreambooth: fix 1566: maintain fp32 wrapper when saving a checkpoint to avoid crash when running fp16 by timh in 1618
* Allow k pipeline to generate > 1 images by pcuenca in 1645
* Remove unnecessary offset in img2img by patrickvonplaten in 1653
* Remove unnecessary kwargs in depth2img by maruel in 1648
* Add text encoder conversion by lawfordp2017 in 1559
* VersatileDiffusion: fix input processing by LukasStruppek in 1568
* tensor format ort bug fix by prathikr in 1557
* Deprecate init image correctly by patrickvonplaten in 1649
* fix bug if we don't do_classifier_free_guidance by MKFMIKU in 1601
* Handle missing global_step key in scripts/convert_original_stable_diffusion_to_diffusers.py by Cyberes in 1612
* [SD] Make sure scheduler is correct when converting by patrickvonplaten in 1667
* [Textual Inversion] Do not update other embeddings by patrickvonplaten in 1665
* Added Community pipeline for comparing Stable Diffusion v1.1-4 checkpoints by suvadityamuk in 1584
* Fix wrong type checking in `convert_diffusers_to_original_stable_diffusion.py` by apolinario in 1681
* [Version] Bump to 0.11.0.dev0 by patrickvonplaten in 1682
* Dreambooth: save / restore training state by pcuenca in 1668
* Disable telemetry when DISABLE_TELEMETRY is set by w4ffl35 in 1686
* Change one-step dummy pipeline for testing by patrickvonplaten in 1690
* [Community pipeline] Add github mechanism by patrickvonplaten in 1680
* Dreambooth: use warnings instead of logger in parse_args() by pcuenca in 1688
* manually update train_unconditional_ort by prathikr in 1694
* Remove all local telemetry by anton-l in 1702
* Update main docs by patrickvonplaten in 1706
* [Readme] Clarify package owners by anton-l in 1707
* Fix the bug that torch version less than 1.12 throws TypeError by chinoll in 1671
* RePaint fast tests and API conforming by anton-l in 1701
* Add state checkpointing to other training scripts by pcuenca in 1687
* Improve pipeline_stable_diffusion_inpaint_legacy.py by cyber-meow in 1585
* apply amp bf16 on textual inversion by jiqing-feng in 1465
* Add examples with Intel optimizations by hshen14 in 1579
* Added a README page for docs and a "schedulers" page by yiyixuxu in 1710
* Accept latents as optional input in Latent Diffusion pipeline by daspartho in 1723
* Fix ONNX img2img preprocessing and add fast tests coverage by anton-l in 1727
* Fix ldm tests on master by not running the CPU tests on GPU by patrickvonplaten in 1729
* Docs: recommend xformers by pcuenca in 1724
* Nightly integration tests by anton-l in 1664
* [Batched Generators] This PR adds generators that are useful to make batched generation fully reproducible by patrickvonplaten in 1718
* Fix ONNX img2img preprocessing by peterto in 1736
* Fix MPS fast test warnings by anton-l in 1744
* Fix/update the LDM pipeline and tests by anton-l in 1743
* kakaobrain unCLIP by williamberman in 1428
* [fix] pipeline_unclip generator by williamberman in 1751
* unCLIP docs by williamberman in 1754
* Correct help text for scheduler_type flag in scripts. by msiedlarek in 1749
* Add resnet_time_scale_shift to VD layers by anton-l in 1757
* Add attention mask to uclip by patrickvonplaten in 1756
* Support attn2==None for xformers by anton-l in 1759
* [UnCLIPPipeline] fix num_images_per_prompt by patil-suraj in 1762
* Add CPU offloading to UnCLIP by anton-l in 1761
* [Versatile] fix attention mask by patrickvonplaten in 1763
* [Revision] Don't recommend using revision by patrickvonplaten in 1764
* [Examples] Update train_unconditional.py to include logging argument for Wandb by ash0ts in 1719
* Transformers version req for UnCLIP by anton-l in 1766

0.10.2

This patch removes the hard requirement for `transformers>=4.25.1` in case external libraries were downgrading the library upon startup in a non-controllable way.

* do not automatically enable xformers by patrickvonplaten in 1640
* Adapt to forced transformers version in some dependent libraries by anton-l in 1638
* Re-add xformers enable to UNet2DCondition by patrickvonplaten in 1627

🚨🚨🚨 **Note that xformers is not automatically enabled anymore** 🚨🚨🚨

The reasons for this are given here: https://github.com/huggingface/diffusers/pull/1640#discussion_r1044651551:

> We should not automatically enable xformers for three reasons:
>
> 1. It's not a PyTorch-like API. PyTorch doesn't enable all the fastest options available by default
> 2. We allocate GPU memory before the user even does `.to("cuda")`
> 3. This behavior is not consistent with cases where xformers is not installed

**=> This means**: If you relied on xformers being enabled automatically, please make sure to add the following now:

```python
import logging

from diffusers.utils.import_utils import is_xformers_available

logger = logging.getLogger(__name__)

unet = ...  # load unet

if is_xformers_available():
    try:
        unet.enable_xformers_memory_efficient_attention(True)
    except Exception as e:
        logger.warning(
            "Could not enable memory efficient attention. Make sure xformers is installed"
            f" correctly and a GPU is available: {e}"
        )
```


for the UNet (e.g. in dreambooth) or for the pipeline:

```py
import logging

from diffusers.utils.import_utils import is_xformers_available

logger = logging.getLogger(__name__)

pipe = ...  # load pipeline

if is_xformers_available():
    try:
        pipe.enable_xformers_memory_efficient_attention(True)
    except Exception as e:
        logger.warning(
            "Could not enable memory efficient attention. Make sure xformers is installed"
            f" correctly and a GPU is available: {e}"
        )
```
