Riffusion



0.3.0

Riffusion is a library for real-time music and audio generation with Stable Diffusion.

Read about it at https://www.riffusion.com/about and try it at https://www.riffusion.com/.

🖊️ Full Rewrite
This release contains a full rewrite of the Riffusion codebase to go from a hack to a quality software project.

* Rename the repository from `riffusion-inference` to `riffusion`.
* `SpectrogramParams` class that contains all conversion parameters, with sane defaults.
* `SpectrogramConverter` class that converts between spectrogram tensors and audio.
* `SpectrogramImageConverter` class that converts between spectrogram images and audio.
* Leverage [pydub](https://github.com/jiaaro/pydub) `AudioSegment` in more places rather than raw numpy arrays.
* Move common code into the `util` package.
* Cache more computation and be careful about error checking.
* Move third-party integrations into the `integrations` package. They now share most of their code, which greatly simplifies them.
* Add `pyproject.toml` for tool configuration.
* Overhaul README with more descriptive instructions.

🚨 This release is API-compatible with the web app, but code that used this repository directly will need to be updated.

👩‍💻 Riffusion CLI

Extensible command line interface for performing common tasks. See the README for details.


    $ python -m riffusion.cli -h
    usage: cli.py [-h] {audio-to-image,image-to-audio,sample-clips,print-exif} ...

    positional arguments:
      {audio-to-image,image-to-audio,sample-clips,print-exif}
        audio-to-image      Compute a spectrogram image from a waveform.
        image-to-audio      Reconstruct an audio clip from a spectrogram image.
        sample-clips        Slice an audio file into clips of the given duration.
        print-exif          Print the params of a spectrogram image as saved in the exif data.

    options:
      -h, --help            show this help message and exit
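
For readers who want to extend the CLI, a subcommand layout like the one in the help text can be sketched with `argparse`. This is an illustration only, not the code in `riffusion/cli.py`: the subcommand names and help strings are copied from the output above, `build_parser` is a hypothetical name, and the only option shown is `--image`, which these notes use with `print-exif`. Marking it required is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser whose subcommands mirror the help text above."""
    parser = argparse.ArgumentParser(prog="cli.py")
    subparsers = parser.add_subparsers(dest="command", required=True)

    # Subcommand names and help strings are copied from the help output.
    subparsers.add_parser(
        "audio-to-image", help="Compute a spectrogram image from a waveform."
    )
    subparsers.add_parser(
        "image-to-audio", help="Reconstruct an audio clip from a spectrogram image."
    )
    subparsers.add_parser(
        "sample-clips", help="Slice an audio file into clips of the given duration."
    )
    print_exif = subparsers.add_parser(
        "print-exif",
        help="Print the params of a spectrogram image as saved in the exif data.",
    )
    # Assumption: --image is required. The real option set is in the README.
    print_exif.add_argument("--image", required=True, help="Path to a spectrogram image.")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch is safe to import.
args = build_parser().parse_args(["print-exif", "--image", "spectrogram.jpg"])
print(args.command, args.image)
```

A new task would be added as one more `add_parser` call plus a handler keyed on `args.command`.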

🤾‍♂️ Riffusion Playground

Extensible [Streamlit](https://streamlit.io/) app for interactive exploration of Riffusion. See the README for details.

![image](https://user-images.githubusercontent.com/1524208/209698164-276a8eba-9017-4465-93e7-08c01136136e.png)

🔥 MPS and CPU Backends

Riffusion can now run on MPS and CPU backends in addition to CUDA. See the README for details.

This release also adds graceful device detection and fallback.

Closes: 15
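
Device detection with fallback might look roughly like the sketch below. It is not the library's implementation: `resolve_device`, `detect_available`, and the `cuda` then `mps` then `cpu` preference order are assumptions made for illustration. The PyTorch probes used (`torch.cuda.is_available()` and `torch.backends.mps.is_available()`) are real calls, and the sketch degrades to CPU-only when PyTorch is not installed.

```python
from typing import Iterable, Set

# Assumed preference order: accelerators first, CPU as the last resort.
FALLBACK_ORDER = ("cuda", "mps", "cpu")


def resolve_device(requested: str, available: Iterable[str]) -> str:
    """Return `requested` if it is usable, otherwise the best available fallback."""
    usable = set(available) | {"cpu"}  # CPU is assumed to always be usable
    if requested in usable:
        return requested
    # "cpu" is always in `usable`, so this search always finds a match.
    return next(name for name in FALLBACK_ORDER if name in usable)


def detect_available() -> Set[str]:
    """Probe PyTorch for backends; report only CPU if torch is not installed."""
    found = {"cpu"}
    try:
        import torch
    except ImportError:
        return found
    if torch.cuda.is_available():
        found.add("cuda")
    mps = getattr(torch.backends, "mps", None)  # missing on older torch builds
    if mps is not None and mps.is_available():
        found.add("mps")
    return found


# Asking for CUDA on a machine that only has MPS falls back to MPS.
print(resolve_device("cuda", {"cpu", "mps"}))
```

Keeping the probe separate from the decision makes the fallback logic testable on machines without a GPU.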

👓 Stereo Spectrograms

Add tools to encode and decode stereo audio as spectrograms, using the G and B channels for left and right.

![clip_2_start_103694_ms_duration_5678_ms_stereo](https://user-images.githubusercontent.com/1524208/209697313-b472a7f1-917d-47a2-9108-d869645eb1e3.png)
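
The channel layout can be illustrated with a small pure-Python sketch that packs two 8-bit spectrogram grids into RGB pixels, with left in G and right in B as described above. The function names are hypothetical, and setting R to 0 is an assumption, since these notes do not say what the library stores in the red channel.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]
Grid = Sequence[Sequence[int]]


def encode_stereo(left: Grid, right: Grid) -> List[List[Pixel]]:
    """Pack two equally shaped 8-bit grids into one RGB grid (G=left, B=right)."""
    if len(left) != len(right) or any(len(a) != len(b) for a, b in zip(left, right)):
        raise ValueError("left and right spectrograms must have the same shape")
    # Assumption: R is left at 0; only G and B carry audio information here.
    return [
        [(0, l_val, r_val) for l_val, r_val in zip(l_row, r_row)]
        for l_row, r_row in zip(left, right)
    ]


def decode_stereo(image: Sequence[Sequence[Pixel]]) -> Tuple[List[List[int]], List[List[int]]]:
    """Recover the left (G) and right (B) grids from an RGB grid."""
    left = [[pixel[1] for pixel in row] for row in image]
    right = [[pixel[2] for pixel in row] for row in image]
    return left, right


left = [[10, 20], [30, 40]]
right = [[200, 210], [220, 230]]
image = encode_stereo(left, right)
print(decode_stereo(image) == (left, right))
```

Because each channel is stored independently, decoding is lossless at the pixel level, and any loss comes from quantizing the spectrogram to 8 bits in the first place.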

🖼️ Encode Spectrogram Params in Image EXIF

Add the ability to store spectrogram conversion parameters in EXIF metadata of the images, and the ability to decode back to audio from those params. This allows more flexibility for usage without assuming default parameters.

The `SpectrogramParams` class has methods to convert to and from EXIF.


    $ python -m riffusion.cli print-exif --image spectrogram.jpg
    NUM_FREQUENCIES = 512
    STEP_SIZE_MS = 10
    MAX_VALUE = 46801012.0
    MIN_FREQUENCY = 0
    WINDOW_DURATION_MS = 100
    MAX_FREQUENCY = 10000
    PADDED_DURATION_MS = 400
    SAMPLE_RATE = 44100
    STEREO = 1
    POWER_FOR_IMAGE = 0.25
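
The round trip between a parameter object and image metadata can be sketched with a plain dataclass. This is not the `SpectrogramParams` API: `SketchParams`, `to_metadata`, and `from_metadata` are hypothetical names, the string keys stand in for real EXIF tags, and the values are taken from the sample output above rather than from the library's defaults. Keys the sketch does not model, such as `MAX_VALUE`, are ignored.

```python
from dataclasses import asdict, dataclass, fields
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class SketchParams:
    """Hypothetical stand-in for a spectrogram parameter container."""

    num_frequencies: int
    step_size_ms: int
    window_duration_ms: int
    padded_duration_ms: int
    min_frequency: int
    max_frequency: int
    sample_rate: int
    stereo: bool
    power_for_image: float

    def to_metadata(self) -> Dict[str, Any]:
        """Serialize to upper-case keys like those printed by print-exif."""
        return {name.upper(): value for name, value in asdict(self).items()}

    @classmethod
    def from_metadata(cls, metadata: Mapping[str, Any]) -> "SketchParams":
        """Rebuild params from metadata, ignoring keys this sketch does not model."""
        return cls(**{f.name: metadata[f.name.upper()] for f in fields(cls)})


params = SketchParams(
    num_frequencies=512,
    step_size_ms=10,
    window_duration_ms=100,
    padded_duration_ms=400,
    min_frequency=0,
    max_frequency=10000,
    sample_rate=44100,
    stereo=True,
    power_for_image=0.25,
)
metadata = params.to_metadata()
print(SketchParams.from_metadata(metadata) == params)
```

Storing the parameters next to the pixels means a decoder never has to guess which settings produced an image.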


🔉 Post-Processing Filters

Add the ability to apply normalization and compression to audio using pydub.
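
To show what these two filters do to a signal, here is a dependency-free sketch on float samples in the range [-1.0, 1.0]. It does not use pydub and is not how this release implements the filters. pydub's `effects` module provides `normalize` and `compress_dynamic_range`, but whether the release calls those is not stated here. The compressor below is a deliberate simplification: a static, sample-by-sample gain curve with no attack or release, and all parameter names and defaults are assumptions.

```python
import math
from typing import List, Sequence


def peak_normalize(samples: Sequence[float], headroom_db: float = 0.1) -> List[float]:
    """Scale samples so the loudest peak sits `headroom_db` below full scale (1.0)."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silence: nothing to scale
    target = 10.0 ** (-headroom_db / 20.0)  # convert dB of headroom to linear gain
    gain = target / peak
    return [s * gain for s in samples]


def compress(samples: Sequence[float], threshold: float = 0.5, ratio: float = 4.0) -> List[float]:
    """Static compression: shrink the part of each sample that exceeds `threshold`."""
    out = []
    for s in samples:
        magnitude = abs(s)
        if magnitude > threshold:
            magnitude = threshold + (magnitude - threshold) / ratio
        out.append(math.copysign(magnitude, s))  # restore the original sign
    return out


quiet = [0.1, -0.5, 0.25]
print(peak_normalize(quiet, headroom_db=0.0))
print(compress([0.5, 1.0, -1.0]))
```

Normalization raises the overall level without changing dynamics, while compression narrows the gap between loud and quiet passages.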

🟢 Test Suite

Add a suite of tests in the `test/` package, and check in some test data.

They are automatically run on pull requests, configured from [ci.yml](https://github.com/riffusion/riffusion/blob/main/.github/workflows/ci.yml).

* `audio_to_image_test.py`
* `image_to_audio_test.py`
* `image_util_test.py`
* `linter_test.py`
* `print_exif_test.py`
* `sample_clips_test.py`
* `spectrogram_converter_test.py`
* `spectrogram_image_converter_test.py`

🧹 Lint Tools
These tools run in CI and must pass cleanly to merge.

* [ruff](https://github.com/charliermarsh/ruff) for linting (`ruff --fix .`)
* [black](https://github.com/psf/black) for formatting (`black .`)
* [mypy](https://github.com/python/mypy) for typing (`mypy .`)
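
For running the same checks locally before opening a pull request, a small helper along these lines could wrap the three commands. It is a sketch, not the contents of `linter_test.py` or `ci.yml`: `run_linters` and its injectable `run` argument are hypothetical, and the commands are copied from the list above.

```python
import subprocess
from typing import Callable, List, Sequence

# Commands exactly as listed in these notes. Newer ruff releases spell the
# first one `ruff check --fix .`, so adjust it to the installed version.
LINT_COMMANDS: List[List[str]] = [
    ["ruff", "--fix", "."],
    ["black", "."],
    ["mypy", "."],
]


def run_linters(run: Callable[[Sequence[str]], int] = subprocess.call) -> List[List[str]]:
    """Run every tool in order and return the commands that exited non-zero."""
    return [command for command in LINT_COMMANDS if run(command) != 0]
```

Calling `run_linters()` with no arguments shells out to the real tools, and an empty result means all three passed. Passing a custom `run` callable makes the helper easy to test without the tools installed.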

PRs

* Rewrite the codebase to be high quality by hmartiro in https://github.com/riffusion/riffusion/pull/36
* Enable ruff import sorting by hmartiro in https://github.com/riffusion/riffusion/pull/38
* Add CI with github actions by hmartiro in https://github.com/riffusion/riffusion/pull/37
* Streamlit app for interactive use of the model by hmartiro in https://github.com/riffusion/riffusion/pull/40
* Add detail to readme by hmartiro in https://github.com/riffusion/riffusion/pull/46
* Disable compression by default, too slow by hmartiro in https://github.com/riffusion/riffusion/pull/47
* Improve interpolation playground by hmartiro in https://github.com/riffusion/riffusion/pull/45

**Full Changelog**: https://github.com/riffusion/riffusion/compare/v0.2.0...v0.3.0
