Diffusers

Latest version: v0.29.0

0.4.2

This patch release allows the img2img pipeline to be run on fp16 and fixes a bug with the "mps" device.

- [schedulers handle dtype in add_noise](https://github.com/huggingface/diffusers/pull/767) by patil-suraj
- [img2img, inpainting fix fp16 inference](https://github.com/huggingface/diffusers/pull/769/files) by patil-suraj
- [mps: Alternative implementation for repeat_interleave](https://github.com/huggingface/diffusers/pull/766) by pcuenca

0.4.1

This patch release fixes a bug with incorrect module naming for community pipelines and an incorrect breaking change when moving pipelines in fp16 to "cpu" or "mps".

- [Change fp16 error to warning](https://github.com/huggingface/diffusers/pull/764) by apolinario
- [Community Pipeline - Fix module bug & Lower required memory for clip guided](https://github.com/huggingface/diffusers/pull/754) by patrickvonplaten

0.4.0

🚗 Faster

We have [thoroughly profiled](https://twitter.com/nouamanetazi/status/1576959648912973826?s=21&t=DaKkhPZ5zn1IVJooMJ5J3A) our codebase and applied a number of incremental improvements that, when combined, provide a speed improvement of almost **3x**.

On top of that, we now default to using the `float16` format. It's much faster than `float32` and, according to our tests, produces images with no discernible difference in quality. This beats the use of `autocast`, so the resulting code is cleaner!

🔑 `use_auth_token` no more

The recently released version of `huggingface-hub` automatically uses your access token if you are logged in, so you don't need to put it everywhere in your code. All you need to do is authenticate once using `huggingface-cli login` in your terminal and you're all set.

```diff
- pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
+ pipe = DiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4")
```


We bumped the `huggingface-hub` version to `0.10.0` in our dependencies to achieve this.

🎈More flexible APIs

- Schedulers now use a common, simpler unified API design. This has allowed us to remove many conditionals and special cases in the rest of the code, including the pipelines. This is very important for us and for the users of 🧨 diffusers: we all gain clarity and a solid abstraction for schedulers. See the description in https://github.com/huggingface/diffusers/pull/719 for more details

Please update any custom Stable Diffusion pipelines accordingly:
```diff
- if isinstance(self.scheduler, LMSDiscreteScheduler):
-     latents = latents * self.scheduler.sigmas[0]
+ latents = latents * self.scheduler.init_noise_sigma
```

```diff
- if isinstance(self.scheduler, LMSDiscreteScheduler):
-     sigma = self.scheduler.sigmas[i]
-     latent_model_input = latent_model_input / ((sigma**2 + 1) ** 0.5)
+ latent_model_input = self.scheduler.scale_model_input(latent_model_input, t)
```

```diff
- if isinstance(self.scheduler, LMSDiscreteScheduler):
-     latents = self.scheduler.step(noise_pred, i, latents, **extra_step_kwargs).prev_sample
- else:
-     latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).prev_sample
+ latents = self.scheduler.step(noise_pred, t, latents, **extra_step_kwargs).prev_sample
```


- Pipeline callbacks. As a community project (h/t jamestiotio!), `diffusers` pipelines can now invoke a callback function during generation, providing the latents at each step of the process. This makes it easier to perform tasks such as visualization, inspection, explainability and others the community may invent.
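The two API changes above (the unified scheduler interface and pipeline callbacks) can be sketched together in a small, dependency-free loop. `ToyScheduler`, `toy_model`, and `run_loop` are hypothetical names invented for this illustration and are not part of `diffusers`; only the member names `init_noise_sigma`, `scale_model_input`, and `step` come from the snippets above. The callback signature `(step, timestep, latents)` and the `callback_steps` interval are assumptions modeled on the Stable Diffusion pipeline, and the update rule is a toy.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StepOutput:
    prev_sample: List[float]


class ToyScheduler:
    """Hypothetical stand-in that exposes the unified scheduler interface."""

    def __init__(self, sigmas: List[float]):
        self.sigmas = sigmas  # one noise level per step, descending
        # Real schedulers use actual timestep values; plain indices keep this short.
        self.timesteps = list(range(len(sigmas)))
        self.init_noise_sigma = sigmas[0]  # replaces `scheduler.sigmas[0]`

    def scale_model_input(self, sample: List[float], t: int) -> List[float]:
        # Replaces the scheduler-specific division by sqrt(sigma**2 + 1).
        sigma = self.sigmas[t]
        return [x / (sigma**2 + 1) ** 0.5 for x in sample]

    def step(self, noise_pred: List[float], t: int, sample: List[float]) -> StepOutput:
        # Toy update rule, chosen only to keep the example small.
        sigma = self.sigmas[t]
        sigma_next = self.sigmas[t + 1] if t + 1 < len(self.sigmas) else 0.0
        return StepOutput([x + (sigma_next - sigma) * e for x, e in zip(sample, noise_pred)])


def toy_model(sample: List[float]) -> List[float]:
    """Hypothetical noise predictor; a real pipeline would call the UNet here."""
    return [0.5 * x for x in sample]


def run_loop(
    scheduler: ToyScheduler,
    latents: List[float],
    callback: Optional[Callable[[int, int, List[float]], None]] = None,
    callback_steps: int = 1,
) -> List[float]:
    # No isinstance checks: every scheduler is driven through the same three members.
    latents = [x * scheduler.init_noise_sigma for x in latents]
    for i, t in enumerate(scheduler.timesteps):
        model_input = scheduler.scale_model_input(latents, t)
        noise_pred = toy_model(model_input)
        latents = scheduler.step(noise_pred, t, latents).prev_sample
        if callback is not None and i % callback_steps == 0:
            callback(i, t, latents)  # hand the intermediate latents to the caller
    return latents


# Record the largest absolute latent value at every step.
history = []
final = run_loop(
    ToyScheduler([4.0, 2.0, 1.0]),
    [1.0, -1.0],
    callback=lambda i, t, lat: history.append((i, max(abs(x) for x in lat))),
)
```

Because the loop touches only those three scheduler members, swapping in another scheduler that implements them would not require changes to `run_loop`.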

🛠️ More tasks

Building on top of the previous foundations, this release incorporates several new tasks that have been adapted from research papers or community projects. These include:

- **Textual inversion**. Makes it possible to quickly train a new concept or style and incorporate it into the vocabulary of Stable Diffusion. Hundreds of people have already created theirs, and they can be shared and combined together. See the [training Colab](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb) to get started.
- **Dreambooth**. Similar goal to textual inversion, but instead of creating a new item in the vocabulary it fine-tunes the model to make it learn a new concept. [Training Colab](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_dreambooth_training.ipynb).
- **Negative prompts**. Another community effort led by shirayu. The Stable Diffusion pipeline can now receive both a positive prompt (the one you want to create), and a negative prompt (something you want to drive the model away from). This opens up a lot of creative possibilities!
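One common way to realize a negative prompt is through classifier-free guidance: the noise prediction for the negative prompt takes the place of the empty-prompt (unconditional) prediction, so each step moves toward the positive prompt and away from the negative one. The sketch below shows only that arithmetic on plain lists. `guided_noise` and its parameters are hypothetical names, and treating this as the pipeline's exact implementation is an assumption.

```python
from typing import List


def guided_noise(
    noise_negative: List[float],
    noise_positive: List[float],
    guidance_scale: float = 7.5,
) -> List[float]:
    """Combine two noise predictions with classifier-free guidance.

    noise_negative: prediction conditioned on the negative prompt
                    (or on the empty prompt when no negative prompt is given).
    noise_positive: prediction conditioned on the positive prompt.
    """
    return [n + guidance_scale * (p - n) for n, p in zip(noise_negative, noise_positive)]


# With a scale of 1.0 the result is just the positive prediction;
# larger scales push further away from the negative prediction.
neutral = guided_noise([0.0, 1.0], [1.0, 1.0], guidance_scale=1.0)
pushed = guided_noise([0.0, 1.0], [1.0, 1.0], guidance_scale=2.0)
```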

πŸƒβ€β™€οΈ Under the hood changes to support better fine-tuning

Gradient checkpointing and 8-bit optimizers have been successfully applied to achieve Dreambooth fine-tuning in a Colab notebook! These updates will make it easier for `diffusers` to support general-purpose fine-tuning (coming soon!).

⚠️ Experimental: community pipelines

This is big, but it's still an experimental feature that may change in the future.

We are constantly amazed at the amount of imagination and creativity in the `diffusers` community, so we've made it easy to create custom pipelines and share them with others. You can write your own pipeline code, store it in 🤗 Hub, GitHub or your local filesystem and `StableDiffusionPipeline.from_pretrained` will be able to load and run it. Read more in [the documentation](https://huggingface.co/docs/diffusers/main/en/using-diffusers/custom_pipelines).

We can't wait to see what new tasks the community creates!

💪 Quality of life fixes

Bug fixes, improved documentation, and better tests are all important to ensure `diffusers` is a high-quality codebase, and we always spend a lot of effort working on them. Several first-time contributors have helped here, and we are very grateful for their efforts!

🙏 Significant community contributions

The following people have made significant contributions to the library over the last release:

* Victarry – Add training example for DreamBooth (554)
* jamestiotio – Add callback parameters for Stable Diffusion pipelines (521)
* jachiam – Allow resolutions that are not multiples of 64 (505)
* johnowhitaker – Adding pred_original_sample to SchedulerOutput for some samplers (614).
* keturn – Interesting discussions and insights on many topics.

✏️ Change list

* [Docs] Correct links by patrickvonplaten in 432
* [Black] Update black by patrickvonplaten in 433
* use torch.matmul instead of einsum in attnetion. by patil-suraj in 445
* Renamed variables from single letter to better naming by daspartho in 449
* Docs: fix installation typo by daspartho in 453
* fix table formatting for stable diffusion pipeline doc (add blank line) by natolambert in 471
* update expected results of slow tests by kashif in 268
* [Flax] Make room for more frameworks by patrickvonplaten in 494
* Fix `disable_attention_slicing` in pipelines by pcuenca in 498
* Rename test_scheduler_outputs_equivalence in model tests. by pcuenca in 451
* Scheduler docs update by natolambert in 464
* Fix scheduler inference steps error with power of 3 by natolambert in 466
* initial flax pndm schedular by kashif in 492
* Fix vae tests for cpu and gpu by kashif in 480
* [Docs] Add subfolder docs by patrickvonplaten in 500
* docs: broken doc links for relative links by jjmachan in 504
* Removing `.float()` (`autocast` in fp16 will discard this (I think)). by Narsil in 495
* Fix MPS scheduler indexing when using `mps` by pcuenca in 450
* [CrossAttention] add different method for sliced attention by patil-suraj in 446
* Implement `FlaxModelMixin` by mishig25 in 493
* Karras VE, DDIM and DDPM flax schedulers by kashif in 508
* [UNet2DConditionModel, UNet2DModel] pass norm_num_groups to all the blocks by patil-suraj in 442
* Add `init_weights` method to `FlaxMixin` by mishig25 in 513
* UNet Flax with FlaxModelMixin by pcuenca in 502
* Stable diffusion text2img conversion script. by patil-suraj in 154
* [CI] Add stalebot by anton-l in 481
* Fix is_onnx_available by SkyTNT in 440
* [Tests] Test attention.py by sidthekidder in 368
* Finally fix the image-based SD tests by anton-l in 509
* Remove the usage of numpy in up/down sample_2d by ydshieh in 503
* Fix typos and add Typo check GitHub Action by shirayu in 483
* Quick fix for the img2img tests by anton-l in 530
* [Tests] Fix spatial transformer tests on GPU by anton-l in 531
* [StableDiffusionInpaintPipeline] accept tensors for init and mask image by patil-suraj in 439
* adding more typehints to DDIM scheduler by vishnu-anirudh in 456
* Revert "adding more typehints to DDIM scheduler" by patrickvonplaten in 533
* Add LMSDiscreteSchedulerTest by sidthekidder in 467
* [Download] Smart downloading by patrickvonplaten in 512
* [Hub] Update hub version by patrickvonplaten in 538
* Unify offset configuration in DDIM and PNDM schedulers by jonatanklosko in 479
* [Configuration] Better logging by patrickvonplaten in 545
* `make fixup` support by younesbelkada in 546
* FlaxUNet2DConditionOutput flax.struct.dataclass by mishig25 in 550
* [Flax] fix Flax scheduler by kashif in 564
* JAX/Flax safety checker by pcuenca in 558
* Flax: ignore dtype for configuration by pcuenca in 565
* Remove check_tf_utils to avoid an unnecessary TF import for now by anton-l in 566
* Fix `_upsample_2d` by ydshieh in 535
* [Flax] Add Vae for Stable Diffusion by patrickvonplaten in 555
* [Flax] Solve problem with VAE by patrickvonplaten in 574
* [Tests] Upload custom test artifacts by anton-l in 572
* [Tests] Mark the ncsnpp model tests as slow by anton-l in 575
* [examples/community] add CLIPGuidedStableDiffusion by patil-suraj in 561
* Fix `CrossAttention._sliced_attention` by ydshieh in 563
* Fix typos by shirayu in 568
* Add `from_pt` argument in `.from_pretrained` by younesbelkada in 527
* [FlaxAutoencoderKL] rename weights to align with PT by patil-suraj in 584
* Fix BaseOutput initialization from dict by anton-l in 570
* Add the K-LMS scheduler to the inpainting pipeline + tests by anton-l in 587
* [flax safety checker] Use `FlaxPreTrainedModel` for saving/loading by patil-suraj in 591
* FlaxDiffusionPipeline & FlaxStableDiffusionPipeline by mishig25 in 559
* [Flax] Fix unet and ddim scheduler by patrickvonplaten in 594
* Fix params replication when using the dummy checker by pcuenca in 602
* Allow dtype to be specified in Flax pipeline by pcuenca in 600
* Fix flax from_pretrained pytorch weight check by mishig25 in 603
* Mv weights name consts to diffusers.utils by mishig25 in 605
* Replace `dropout_prob` by `dropout` in `vae` by younesbelkada in 595
* Add smoke tests for the training examples by anton-l in 585
* Add torchvision to training deps by anton-l in 607
* Return Flax scheduler state by pcuenca in 601
* [ONNX] Collate the external weights, speed up loading from the hub by anton-l in 610
* docs: fix `Berkeley` ref by ryanrussell in 611
* Handle the PIL.Image.Resampling deprecation by anton-l in 588
* Make flax from_pretrained work with local subfolder by mishig25 in 608
* [flax] 'dtype' should not be part of self._internal_dict by mishig25 in 609
* [UNet2DConditionModel] add gradient checkpointing by patil-suraj in 461
* docs: fix `stochastic_karras_ve` ref by ryanrussell in 618
* Adding pred_original_sample to SchedulerOutput for some samplers by johnowhitaker in 614
* docs: `.md` readability fixups by ryanrussell in 619
* Flax documentation by younesbelkada in 589
* fix docs: change sample to images by AbdullahAlfaraj in 613
* refactor: pipelines readability improvements by ryanrussell in 622
* Allow passing session_options for ORT backend by cloudhan in 620
* Fix breaking error: "ort is not defined" by pcuenca in 626
* docs: `src/diffusers` readability improvements by ryanrussell in 629
* Fix formula for noise levels in Karras scheduler and tests by sgrigory in 627
* [CI] Fix onnxruntime installation order by anton-l in 633
* Warning for too long prompts in DiffusionPipelines (Resolve 447) by shirayu in 472
* Fix docs link to train_unconditional.py by AbdullahAlfaraj in 642
* Remove deprecated `torch_device` kwarg by pcuenca in 623
* refactor: `custom_init_isort` readability fixups by ryanrussell in 631
* Remove inappropriate docstrings in LMS docstrings. by pcuenca in 634
* Flax pipeline pndm by pcuenca in 583
* Fix `SpatialTransformer` by ydshieh in 578
* Add training example for DreamBooth. by Victarry in 554
* [Pytorch] Pytorch only schedulers by kashif in 534
* [examples/dreambooth] don't pass tensor_format to scheduler. by patil-suraj in 649
* [dreambooth] update install section by patil-suraj in 650
* [DDIM, DDPM] fix add_noise by patil-suraj in 648
* [Pytorch] add dep. warning for pytorch schedulers by kashif in 651
* [CLIPGuidedStableDiffusion] remove set_format from pipeline by patil-suraj in 653
* Fix onnx tensor format by anton-l in 654
* Fix `main`: stable diffusion pipelines cannot be loaded by pcuenca in 655
* Fix the LMS pytorch regression by anton-l in 664
* Added script to save during textual inversion training. Issue 524 by isamu-isozaki in 645
* [CLIPGuidedStableDiffusion] take the correct text embeddings by patil-suraj in 667
* Update index.mdx by tmabraham in 670
* [examples] update transfomers version by patil-suraj in 665
* [gradient checkpointing] lower tolerance for test by patil-suraj in 652
* Flax `from_pretrained`: clean up `mismatched_keys`. by pcuenca in 630
* `trained_betas` ignored in some schedulers by vishnu-anirudh in 635
* Renamed x -> hidden_states in resnet.py by daspartho in 676
* Optimize Stable Diffusion by NouamaneTazi in 371
* Allow resolutions that are not multiples of 64 by jachiam in 505
* refactor: update ldm-bert `config.json` url closes 675 by ryanrussell in 680
* [docs] fix table in fp16.mdx by NouamaneTazi in 683
* Fix slow tests by NouamaneTazi in 689
* Fix BibText citation by osanseviero in 693
* Add callback parameters for Stable Diffusion pipelines by jamestiotio in 521
* [dreambooth] fix applying clip_grad_norm_ by patil-suraj in 686
* Flax: add shape argument to `set_timesteps` by pcuenca in 690
* Fix type annotations on StableDiffusionPipeline.__call__ by tasercake in 682
* Fix import with Flax but without PyTorch by pcuenca in 688
* [Support PyTorch 1.8] Remove inference mode by patrickvonplaten in 707
* [CI] Speed up slow tests by anton-l in 708
* [Utils] Add deprecate function and move testing_utils under utils by patrickvonplaten in 659
* Checkpoint conversion script from Diffusers => Stable Diffusion (CompVis) by jachiam in 701
* [Docs] fix docstring for issue 709 by kashif in 710
* Update schedulers README.md by tmabraham in 694
* add accelerate to load models with smaller memory footprint by piEsposito in 361
* Fix typos by shirayu in 718
* Add an argument "negative_prompt" by shirayu in 549
* Fix import if PyTorch is not installed by pcuenca in 715
* Remove comments no longer appropriate by pcuenca in 716
* [train_unconditional] fix applying clip_grad_norm_ by patil-suraj in 721
* renamed x to meaningful variable in resnet.py by i-am-epic in 677
* [Tests] Add accelerate to testing by patrickvonplaten in 729
* [dreambooth] Using already created `Path` in dataset by DrInfiniteExplorer in 681
* Include CLIPTextModel parameters in conversion by kanewallmann in 695
* Avoid negative strides for tensors by shirayu in 717
* [Pytorch] pytorch only timesteps by kashif in 724
* [Scheduler design] The pragmatic approach by anton-l in 719
* Removing `autocast` for `35-25% speedup`. (`autocast` considered harmful). by Narsil in 511
* No more use_auth_token=True by patrickvonplaten in 733
* remove use_auth_token from remaining places by patil-suraj in 737
* Replace messages that have empty backquotes by pcuenca in 738
* [Docs] Advertise fp16 instead of autocast by patrickvonplaten in 740
* remove use_auth_token from for TI test by patil-suraj in 747
* allow multiple generations per prompt by patil-suraj in 741
* Add back-compatibility to LMS timesteps by anton-l in 750
* update the clip guided PR according to the new API by patil-suraj in 751
* Raise an error when moving an fp16 pipeline to CPU by anton-l in 749
* Better steps deprecation for LMS by anton-l in 753

0.3

```python
import torch
from diffusers.utils import load_image

# NOTE: `pipeline` must already exist at this point; its setup (an
# IP-Adapter-enabled pipeline) is not included in this excerpt.
pipeline.enable_model_cpu_offload()

face_image = load_image("https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/women_input.png")

style_folder = "https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/style_ziggy"
style_images = [load_image(f"{style_folder}/img{i}.png") for i in range(10)]

generator = torch.Generator(device="cpu").manual_seed(0)

image = pipeline(
    prompt="wonderwoman",
    ip_adapter_image=[style_images, face_image],
    negative_prompt="monochrome, lowres, bad anatomy, worst quality, low quality",
    num_inference_steps=50,
    generator=generator,
).images[0]
```


Reference style images:
<img src=https://github.com/huggingface/diffusers/assets/12631849/84f15215-7ac2-40ef-a552-3bdad3fdfba0 width=700/>
<table>
<tr>
<th><strong>Reference face Image</strong></th>
<th><strong>Output Image</strong></th>
</tr>
<tr>
<td><img src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/women_input.png" width=500></td>
<td><img src="https://huggingface.co/datasets/YiYiXu/testing-images/resolve/main/ip_multi_out.png" width=500></td>
</tr>
</table>

📜 Check out the docs [here](https://huggingface.co/docs/diffusers/using-diffusers/loading_adapters#ip-adapter).

Single-file checkpoint loading

The `from_single_file()` utility has been refactored for better readability and to follow semantics similar to `from_pretrained()`. Support for loading single-file checkpoints and configs from URLs has also been added.

DPM scheduler fix

We introduced a [fix](https://github.com/huggingface/diffusers/pull/6477) for DPM schedulers, so you can now use them with SDXL to generate high-quality images in fewer steps than the Euler scheduler.

Apart from these, we have done a great deal of refactoring to improve the library design and will continue to do so in the coming days.

All commits

* [docs] Fix missing API function by stevhliu in 6604
* Fix failing tests due to Posix Path by DN6 in 6627
* Update convert_from_ckpt.py / read checkpoint config yaml contents by spezialspezial in 6633
* [Community] Experimental AnimateDiff Image to Video (open to improvements) by a-r-r-o-w in 6509
* refactor: extract init/forward function in UNet2DConditionModel by ultranity in 6478
* Modularize InstructPix2Pix SDXL inferencing during and after training in examples by sang-k in 6569
* Fixed the bug related to saving DeepSpeed models. by HelloWorldBeginner in 6628
* fix DPM Scheduler with `use_karras_sigmas` option by yiyixuxu in 6477
* fix SDXL-kdiffusion tests by yiyixuxu in 6647
* add padding_mask_crop to all inpaint pipelines by rootonchair in 6360
* add Sa-Solver by lawrence-cj in 5975
* Add tearDown method to LoRA tests. by DN6 in 6660
* [Diffusion DPO] apply fixes from 6547 by sayakpaul in 6668
* Update README by standardAI in 6669
* [Big refactor] move unets to `unets` module 🦋 by sayakpaul in 6630
* Standardise outputs for video pipelines by DN6 in 6626
* fix dpm related slow test failure by yiyixuxu in 6680
* [Tests] Test for passing local config file to `from_single_file()` by sayakpaul in 6638
* [Refactor] Update from single file by DN6 in 6428
* [WIP][Community Pipeline] InstaFlow! One-Step Stable Diffusion with Rectified Flow by ayushtues in 6057
* Add InstantID Pipeline by haofanwang in 6673
* [Docs] update: tutorials ja | AUTOPIPELINE.md by YasunaCoffee in 6629
* [Fix bugs] pipeline_controlnet_sd_xl.py by haofanwang in 6653
* SD 1.5 Support For Advanced Lora Training (train_dreambooth_lora_sdxl_advanced.py) by brandostrong in 6449
* AnimateDiff Video to Video by a-r-r-o-w in 6328
* [docs] UViT2D by stevhliu in 6643
* Correct sigmas cpu settings by patrickvonplaten in 6708
* [docs] AnimateDiff Video-to-Video by a-r-r-o-w in 6712
* fix community README by a-r-r-o-w in 6645
* fix custom diffusion training with concept list by AIshutin in 6710
* Add IP Adapters to slow tests by DN6 in 6714
* Move tests for SD inference variant pipelines into their own modules by DN6 in 6707
* Add Community Example Consistency Training Script by dg845 in 6717
* Add UFOGenScheduler to Community Examples by dg845 in 6650
* [Hub] feat: explicitly tag to diffusers when using push_to_hub by sayakpaul in 6678
* Correct SNR weighted loss in v-prediction case by only adding 1 to SNR on the denominator by thuliu-yt16 in 6307
* changed to posix unet by gzguevara in 6719
* Change os.path to pathlib Path by Stepheni12 in 6737
* correct hflip arg by sayakpaul in 6743
* Add unload_textual_inversion method by fabiorigano in 6656
* [Core] move transformer scripts to `transformers` modules by sayakpaul in 6747
* Update lora.md with a more accurate description of rank by xhedit in 6724
* Fix mixed precision fine-tuning for text-to-image-lora-sdxl example. by sajadn in 6751
* update ip-adapter slow tests by yiyixuxu in 6760
* Update export to video to support new `tensor_to_vid` function in video pipelines by DN6 in 6715
* [DDPMScheduler] Load `alpha_cumprod` to device to avoid redundant data movement. by woshiyyya in 6704
* Fix bug in ResnetBlock2D.forward where LoRA Scale gets Overwritten by dg845 in 6736
* add note about serialization by sayakpaul in 6764
* Update train_diffusion_dpo.py by viettmab in 6754
* Pin torch < 2.2.0 in test runners by DN6 in 6780
* [Kandinsky tests] add `is_flaky` to test_model_cpu_offload_forward_pass by sayakpaul in 6762
* add ipo, hinge and cpo loss to dpo trainer by kashif in 6788
* Fix setting scaling factor in VAE config by DN6 in 6779
* Add PIA Model/Pipeline by DN6 in 6698
* [docs] Add missing parameter by stevhliu in 6775
* [IP-Adapter] Support multiple IP-Adapters by yiyixuxu in 6573
* [sdxl k-diffusion pipeline]move sigma to device by yiyixuxu in 6757
* [Feat] add I2VGenXL for image-to-video generation by sayakpaul in 6665

0.3.0

:books: Shiny new docs!

Thanks to the community efforts for [[Docs]](https://github.com/huggingface/diffusers/issues/293) and [[Type Hints]](https://github.com/huggingface/diffusers/issues/287) we've started populating the [**Diffusers documentation**](https://huggingface.co/docs/diffusers/main/en/index) pages with lots of helpful guides, links and API references.

:memo: New API & breaking changes

New API
Pipeline, Model, and Scheduler outputs can now be accessed as dataclasses, dicts, or tuples:

```python
image = pipe("The red cat is sitting on a chair")["sample"][0]
```

is now replaced by:

```python
image = pipe("The red cat is sitting on a chair").images[0]
# or
image = pipe("The red cat is sitting on a chair")["images"][0]
# or (integer indexing selects the first output field, the list of images)
image = pipe("The red cat is sitting on a chair")[0][0]
```

Similarly:

```python
sample = unet(...).sample
```

and

```python
prev_sample = scheduler.step(...).prev_sample
```


is now possible!
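To make the three access styles concrete, here is a minimal sketch of an output object that supports attribute, key, and index access at once. `SketchOutput` is a hypothetical class written for illustration and is not the library's implementation; the field names are chosen only for the example, and integer indexing assumes the fields are taken in declaration order.

```python
from dataclasses import dataclass, fields
from typing import Any, List, Tuple, Union


@dataclass
class SketchOutput:
    images: List[Any]
    nsfw_content_detected: List[bool]

    def to_tuple(self) -> Tuple[Any, ...]:
        # Field values in declaration order, so integer indexing is well defined.
        return tuple(getattr(self, f.name) for f in fields(self))

    def __getitem__(self, key: Union[str, int]) -> Any:
        if isinstance(key, str):
            # Dict-style access by field name; unknown names raise KeyError.
            return {f.name: getattr(self, f.name) for f in fields(self)}[key]
        # Tuple-style access by position.
        return self.to_tuple()[key]


out = SketchOutput(images=["img0"], nsfw_content_detected=[False])

first_by_attr = out.images[0]      # dataclass style
first_by_key = out["images"][0]    # dict style
first_by_index = out[0][0]         # tuple style: field 0, then element 0
```

All three expressions select the same element, which is why code written against any one of the styles keeps working.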

🚨🚨🚨 Breaking change 🚨🚨🚨

This PR introduces breaking changes for the following public-facing methods:

- `VQModel.encode` -> we now return a dict/dataclass instead of a single tensor, since returning more than one tensor will very likely be required in the future. Please change `latents = model.encode(...)` to `latents = model.encode(...)[0]` or `latents = model.encode(...).latents`
- `VQModel.decode` -> we now return a dict/dataclass instead of a single tensor, since returning more than one tensor will very likely be required in the future. Please change `sample = model.decode(...)` to `sample = model.decode(...)[0]` or `sample = model.decode(...).sample`
- `VQModel.forward` -> we now return a dict/dataclass instead of a single tensor, since returning more than one tensor will very likely be required in the future. Please change `sample = model(...)` to `sample = model(...)[0]` or `sample = model(...).sample`
- `AutoencoderKL.encode` -> we now return a dict/dataclass instead of a single tensor, since returning more than one tensor will very likely be required in the future. Please change `latent_dist = model.encode(...)` to `latent_dist = model.encode(...)[0]` or `latent_dist = model.encode(...).latent_dist`
- `AutoencoderKL.decode` -> we now return a dict/dataclass instead of a single tensor, since returning more than one tensor will very likely be required in the future. Please change `sample = model.decode(...)` to `sample = model.decode(...)[0]` or `sample = model.decode(...).sample`
- `AutoencoderKL.forward` -> we now return a dict/dataclass instead of a single tensor, since returning more than one tensor will very likely be required in the future. Please change `sample = model(...)` to `sample = model(...)[0]` or `sample = model(...).sample`

:art: New Stable Diffusion pipelines

A couple of new pipelines have been added to Diffusers! We invite you to experiment with them, and to take them as inspiration to create your own cool new tasks. These are the new pipelines:

- **Image-to-image generation**. In addition to using a text prompt, this pipeline lets you include an example image to be used as the initial state of the process. [🤗 Diffuse the Rest](https://huggingface.co/spaces/huggingface/diffuse-the-rest) is a cool demo about it!
- **Inpainting** (_experimental_). You can provide an image and a mask and ask Stable Diffusion to replace the masked region.

For more details about how they work, please visit our new [API documentation](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion).

This is a summary of all the Stable Diffusion tasks that can be easily used with 🤗 Diffusers:

| Pipeline | Tasks | Colab | Demo |
|---|---|:---:|:---:|
| [pipeline_stable_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py) | *Text-to-Image Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_diffusion.ipynb) | [🤗 Stable Diffusion](https://huggingface.co/spaces/stabilityai/stable-diffusion) |
| [pipeline_stable_diffusion_img2img.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py) | *Image-to-Image Text-Guided Generation* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/image_2_image_using_diffusers.ipynb) | [🤗 Diffuse the Rest](https://huggingface.co/spaces/huggingface/diffuse-the-rest) |
| [pipeline_stable_diffusion_inpaint.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py) | **Experimental** – *Text-Guided Image Inpainting* | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/in_painting_with_stable_diffusion_using_diffusers.ipynb) | Coming soon |

:candy: Less memory usage for smaller GPUs

Diffusion models can now take up significantly less VRAM (3.2 GB for Stable Diffusion) at the cost of a 10% slowdown, thanks to the optimizations discussed in https://github.com/basujindal/stable-diffusion/pull/117.

To make use of the attention optimization, just enable it with `.enable_attention_slicing()` after loading the pipeline:
```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True,
)
pipe = pipe.to("cuda")
# Compute attention in slices to lower peak VRAM usage.
pipe.enable_attention_slicing()
```


This will allow many more users to play with Stable Diffusion on their own computers! We can't wait to see what new ideas and results will be created by the community!
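The idea behind attention slicing can be sketched without any tensor library: instead of computing attention for all heads in one batched operation, the heads are processed a few at a time, so fewer score matrices need to exist at once. The functions `attention` and `sliced_attention` below are hypothetical illustrations, not the library's kernels, and plain Python lists stand in for tensors, so the example demonstrates the control flow and the unchanged result rather than the memory saving itself.

```python
import math
from typing import List

Matrix = List[List[float]]


def attention(q: Matrix, k: Matrix, v: Matrix) -> Matrix:
    """Scaled dot-product attention for a single head."""
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for q_row in q:
        scores = [scale * sum(a * b for a, b in zip(q_row, k_row)) for k_row in k]
        peak = max(scores)  # subtract the maximum for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        out.append(
            [sum(w / total * v_row[j] for w, v_row in zip(weights, v)) for j in range(len(v[0]))]
        )
    return out


def sliced_attention(qs: List[Matrix], ks: List[Matrix], vs: List[Matrix], slice_size: int = 1) -> List[Matrix]:
    """Run attention over heads in groups of `slice_size` instead of all at once."""
    out: List[Matrix] = []
    for start in range(0, len(qs), slice_size):
        end = start + slice_size
        # Only this slice's score matrices are needed at the same time.
        out.extend(attention(q, k, v) for q, k, v in zip(qs[start:end], ks[start:end], vs[start:end]))
    return out


# Two heads, two tokens each, head dimension 2.
qs = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, -1.0]]]
ks = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, -1.0]]]
vs = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]

full = [attention(q, k, v) for q, k, v in zip(qs, ks, vs)]
sliced = sliced_attention(qs, ks, vs, slice_size=1)
```

Slicing changes only the order in which heads are processed, so `sliced` matches `full`; the trade-off is more sequential steps in exchange for a smaller working set.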

:black_cat: Textual Inversion

Textual Inversion lets you personalize a [Stable Diffusion](https://huggingface.co/blog/stable_diffusion) model on your own images with just 3-5 samples.

GitHub: https://github.com/huggingface/diffusers/tree/main/examples/textual_inversion
Training: https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/sd_textual_inversion_training.ipynb
Inference: https://colab.research.google.com/github/huggingface/notebooks/blob/main/diffusers/stable_conceptualizer_inference.ipynb

:apple: MPS backend for Apple Silicon

🤗 Diffusers is compatible with Apple silicon for Stable Diffusion inference, using the PyTorch `mps` device. You need to install PyTorch Preview (Nightly) on a Mac with an M1 or M2 chip, and then use the pipeline as usual:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("CompVis/stable-diffusion-v1-4", use_auth_token=True)
pipe = pipe.to("mps")

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
```


We are seeing great speedups (31s vs 214s on an M1 Max), but there are still a couple of limitations. We encourage you to read the [documentation](https://huggingface.co/docs/diffusers/optimization/mps) for the details.

:factory: Experimental ONNX exporter and pipeline for Stable Diffusion

We introduce a new (and experimental) Stable Diffusion pipeline compatible with the ONNX Runtime. This allows you to run Stable Diffusion on any hardware that supports ONNX (including a significant speedup on CPUs).

You need to use `StableDiffusionOnnxPipeline` instead of `StableDiffusionPipeline`. You also need to download the weights from the `onnx` branch of the repository, and indicate the runtime provider you want to use (CPU, in the following example):

```python
from diffusers import StableDiffusionOnnxPipeline

pipe = StableDiffusionOnnxPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="onnx",
    provider="CPUExecutionProvider",
    use_auth_token=True,
)

prompt = "a photo of an astronaut riding a horse on mars"
image = pipe(prompt).images[0]
```


:warning: Warning: the script above takes a long time to download the external ONNX weights, so it will be faster to convert the checkpoint yourself (see below).

To convert your own checkpoint, run the [conversion script](https://github.com/huggingface/diffusers/blob/main/scripts/convert_stable_diffusion_checkpoint_to_onnx.py) locally:
```bash
python scripts/convert_stable_diffusion_checkpoint_to_onnx.py --model_path="CompVis/stable-diffusion-v1-4" --output_path="./stable_diffusion_onnx"
```

After that it can be loaded from the local path:
```python
pipe = StableDiffusionOnnxPipeline.from_pretrained("./stable_diffusion_onnx", provider="CPUExecutionProvider")
```


Improvements and bugfixes

* Mark in painting experimental by patrickvonplaten in 430
* Add config docs by patrickvonplaten in 429
* [Docs] Models by kashif in 416
* [Docs] Using diffusers by patrickvonplaten in 428
* [Outputs] Improve syntax by patrickvonplaten in 423
* Initial ONNX doc (TODO: Installation) by pcuenca in 426
* [Tests] Correct image folder tests by patrickvonplaten in 427
* [MPS] Make sure it doesn't break torch < 1.12 by patrickvonplaten in 425
* [ONNX] Stable Diffusion exporter and pipeline by anton-l in 399
* [Tests] Make image-based SD tests reproducible with fixed datasets by anton-l in 424
* [Docs] Outputs.mdx by patrickvonplaten in 422
* [Docs] Fix scheduler docs by patrickvonplaten in 421
* [Docs] DiffusionPipeline by patrickvonplaten in 418
* Improve unconditional diffusers example by satpalsr in 414
* Improve latent diff example by satpalsr in 413
* Inference support for `mps` device by pcuenca in 355
* [Docs] Minor fixes in optimization section by patrickvonplaten in 420
* [Docs] Pipelines for inference by satpalsr in 417
* [Docs] Training docs by patrickvonplaten in 415
* Docs: fp16 page by pcuenca in 404
* Add typing to scheduling_sde_ve: init, set_timesteps, and set_sigmas function definitions by danielpatrickhug in 412
* Docs fix some typos by natolambert in 408
* [docs sprint] schedulers docs, will update by natolambert in 376
* Docs: fix undefined in toctree by natolambert in 406
* Attention slicing by patrickvonplaten in 407
* Rename variables from single letter to meaningful name fix by rashmimarganiatgithub in 395
* Docs: Stable Diffusion pipeline by pcuenca in 386
* Small changes to Philosophy by pcuenca in 403
* karras-ve docs by kashif in 401
* Score sde ve doc by kashif in 400
* [Docs] Finish Intro Section by patrickvonplaten in 402
* [Docs] Quicktour by patrickvonplaten in 397
* ddim docs by kashif in 396
* Docs: optimization / special hardware by pcuenca in 390
* added pndm docs by kashif in 391
* Update text_inversion.mdx by johnowhitaker in 393
* [Docs] Logging by patrickvonplaten in 394
* [Pipeline Docs] ddpm docs for sprint by kashif in 382
* [Pipeline Docs] Unconditional Latent Diffusion by satpalsr in 388
* Docs: Conceptual section by pcuenca in 392
* [Pipeline Docs] Latent Diffusion by patrickvonplaten in 377
* [textual-inversion] fix saving embeds by patil-suraj in 387
* [Docs] Let's go by patrickvonplaten in 385
* Add colab links to textual inversion by apolinario in 375
* Efficient Attention by patrickvonplaten in 366
* Use `expand` instead of ones to broadcast tensor by pcuenca in 373
* [Tests] Fix SD slow tests by anton-l in 364
* [Type Hint] VAE models by daspartho in 365
* [Type hint] scheduling lms discrete by santiviquez in 360
* [Type hint] scheduling karras ve by santiviquez in 359
* type hints: models/vae.py by shepherd1530 in 346
* [Type Hints] DDIM pipelines by sidthekidder in 345
* [ModelOutputs] Replace dict outputs with Dict/Dataclass and allow to return tuples by patrickvonplaten in 334
* package `version` on main should have `.dev0` suffix by mishig25 in 354
* [textual_inversion] use tokenizer.add_tokens to add placeholder_token by patil-suraj in 357
* [Type hint] scheduling ddim by santiviquez in 343
* [Type Hints] VAE models by daspartho in 344
* [Type Hint] DDPM schedulers by daspartho in 349
* [Type hint] PNDM schedulers by daspartho in 335
* Fix typo in unet_blocks.py by da03 in 353
* [Commands] Add env command by patrickvonplaten in 352
* Add transformers and scipy to dependency table by patrickvonplaten in 348
* [Type Hint] Unet Models by sidthekidder in 330
* [Img2Img2] Re-add K LMS scheduler by patrickvonplaten in 340
* Use ONNX / Core ML compatible method to broadcast by pcuenca in 310
* [Type hint] PNDM pipeline by daspartho in 327
* [Type hint] Latent Diffusion Uncond pipeline by santiviquez in 333
* Add contributions to README and re-order a bit by patrickvonplaten in 316
* [CI] try to fix GPU OOMs between tests and excessive tqdm logging by anton-l in 323
* README: stable diffusion version v1-3 -> v1-4 by pcuenca in 331
* Textual inversion by patil-suraj in 266
* [Type hint] Score SDE VE pipeline by santiviquez in 325
* [CI] Cancel pending jobs for PRs on new commits by anton-l in 324
* [train_unconditional] fix gradient accumulation. by patil-suraj in 308
* Fix nondeterministic tests for GPU runs by anton-l in 314
* Improve README to show how to use SD without an access token by patrickvonplaten in 315
* Fix flake8 F401 imported but unused by anton-l in 317
* Allow downloading of revisions for models. by okalldal in 303
* Fix more links by python273 in 312
* Changed variable name from "h" to "hidden_states" by JC-swEng in 285
* Fix stable-diffusion-seeds.ipynb link by python273 in 309
* [Tests] Add fast pipeline tests by patrickvonplaten in 302
* Improve README by patrickvonplaten in 301
* [Refactor] Remove set_seed by patrickvonplaten in 289
* [Stable Diffusion] Hotfix by patrickvonplaten in 299
* Check dummy file by patrickvonplaten in 297
* Add missing auth tokens for two SD tests by anton-l in 296
* Fix GPU tests (token + single-process) by anton-l in 294
* [PNDM Scheduler] format timesteps attrs to np arrays by NouamaneTazi in 273
* Fix link by python273 in 286
* [Type hint] Karras VE pipeline by patrickvonplaten in 288
* Add datasets + transformers + scipy to test deps by anton-l in 279
* Easily understandable error if inference steps not set before using scheduler by samedii in 263
* [Docs] Add some guides by patrickvonplaten in 276
* [README] Add readme for SD by patrickvonplaten in 274
* Refactor Pipelines / Community pipelines and add better explanations. by patrickvonplaten in 257
* Refactor progress bar by hysts in 242
* Support K-LMS in img2img by anton-l in 270
* [BugFix]: Fixed add_noise in LMSDiscreteScheduler by nicolas-dufour in 253
* [Tests] Make sure tests are on GPU by patrickvonplaten in 269
* Adds missing torch imports to inpainting and image_to_image example by PulkitMishra in 265
* Fix typo in README.md by webel in 260
* Fix inpainting script by patil-suraj in 258
* Initialize CI for code quality and testing by anton-l in 256
* add inpainting example script by nagolinc in 241
* Update README.md with examples by natolambert in 252
* Reproducible images by supplying latents to pipeline by pcuenca in 247
* Style the `scripts` directory by anton-l in 250
* Pin black==22.3 to keep a stable --preview flag by anton-l in 249
* [Clean up] Clean unused code by patrickvonplaten in 245
* added test workflow and fixed failing test by kashif in 237
* split tests_modeling_utils by kashif in 223
* [example/image2image] raise error if strength is not in desired range by patil-suraj in 238
* Add image2image example script. by patil-suraj in 231
* Remove dead code in `resnet.py` by ydshieh in 218

Significant community contributions

The following contributors have made significant changes to the library over the last release:

* kashif
    * [Docs] Models (416)
    * karras-ve docs (401)
    * Score sde ve doc (400)
    * ddim docs (396)
    * added pndm docs (391)
    * [Pipeline Docs] ddpm docs for sprint (382)
    * added test workflow and fixed failing test (237)
    * split tests_modeling_utils (223)

0.2.4

This patch release allows the Stable Diffusion pipelines to be loaded with `float16` precision:
python
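import torch
from diffusers import StableDiffusionPipeline

# Load the weights from the "fp16" revision in half precision.
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    revision="fp16",
    torch_dtype=torch.float16,
    use_auth_token=True,
)
pipe = pipe.to("cuda")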


The resulting models take up less than `6900 MiB` of GPU memory.
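
As a rough illustration of why half precision helps, the sketch below compares weight storage at 4 bytes per parameter (`float32`) with 2 bytes per parameter (`float16`). The parameter count is a made-up round number, not the size of any Stable Diffusion component, and `weight_memory_mib` is a hypothetical helper. It counts weight storage only; real GPU usage also includes activations and framework overhead, so it is not meant to reproduce the figure above.

```python
# Bytes needed to store one parameter in each floating-point format.
BYTES_PER_PARAM = {"float32": 4, "float16": 2}


def weight_memory_mib(num_params: int, dtype: str) -> float:
    """Hypothetical helper: memory taken by the weights alone, in MiB."""
    return num_params * BYTES_PER_PARAM[dtype] / (1024 ** 2)


# Assumption: an illustrative model with one billion parameters.
hypothetical_params = 1_000_000_000

fp32_mib = weight_memory_mib(hypothetical_params, "float32")
fp16_mib = weight_memory_mib(hypothetical_params, "float16")

print(f"float32 weights: {fp32_mib:.0f} MiB")
print(f"float16 weights: {fp16_mib:.0f} MiB")
```

Whatever the parameter count, the `float16` weights take half the space of the `float32` weights.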

- [Loading] allow modules to be loaded in fp16 by patrickvonplaten in 230
