MLX-Audio

Latest version: v0.0.3


0.0.3

What's Changed
* Add verbose logging and model selection support by ivanfioravanti in https://github.com/Blaizzy/mlx-audio/pull/22
* Pulsating effect by ivanfioravanti in https://github.com/Blaizzy/mlx-audio/pull/23
* Compile the decoder for Kokoro by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/24
* Play audio segments as they are generated by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/26
* Evaluate the computation graph before returning results by ivanfioravanti in https://github.com/Blaizzy/mlx-audio/pull/35
* Add Mimi neural audio codec by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/34
* Add model for Sesame TTS by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/36
* Sphere speed up during audio generation by ivanfioravanti in https://github.com/Blaizzy/mlx-audio/pull/40
* Added more Voices by andrepadez in https://github.com/Blaizzy/mlx-audio/pull/37
* Feature: External API for Audiobook Generation by sergenes in https://github.com/Blaizzy/mlx-audio/pull/19
* Add EnCodec neural audio codec by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/46
* Add Suno bark by Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/45
* Update README.md to fix lang_code error by zboyles in https://github.com/Blaizzy/mlx-audio/pull/49
* fix model config by Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/50
* Add Vocos neural audio codec by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/48
* Fix Kokoro audio generation by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/52
* Add orpheus by Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/47
* Resample and Transcribe by chigkim in https://github.com/Blaizzy/mlx-audio/pull/51
* Fix vocos config loading by Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/53

New Contributors
* andrepadez made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/37
* sergenes made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/19
* zboyles made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/49
* chigkim made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/51

**Full Changelog**: https://github.com/Blaizzy/mlx-audio/compare/v0.0.2...v0.0.3

0.0.2

What's Changed
* fix workflows and readme by Blaizzy in https://github.com/Blaizzy/mlx-audio/pull/5
* Add soundfile to requirements and Quick Start in README by ivanfioravanti in https://github.com/Blaizzy/mlx-audio/pull/8
* Remove librosa dependency by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/11
* Add support for command-line playback with the --play argument by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/10
* Allow receiving text input from stdin or an entry prompt by lucasnewman in https://github.com/Blaizzy/mlx-audio/pull/12
* Add web server and improve audio player by ivanfioravanti in https://github.com/Blaizzy/mlx-audio/pull/14
* Use phonemizer-fork to avoid espeak errors by rampadc in https://github.com/Blaizzy/mlx-audio/pull/17
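
The input handling described in this release (a `--play` flag, plus text taken from stdin or an entry prompt) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the `--text` flag, the `resolve_input` function, and the prompt wording are hypothetical names chosen for the example; only `--play` is named in the notes above.

```python
import argparse
import sys
from typing import Callable, Optional, Sequence, TextIO, Tuple


def resolve_input(
    argv: Optional[Sequence[str]] = None,
    stdin: Optional[TextIO] = None,
    ask: Callable[[str], str] = input,
) -> Tuple[str, bool]:
    """Return (text, play) from flags, piped stdin, or an interactive prompt."""
    parser = argparse.ArgumentParser(description="Hypothetical TTS front end")
    # `--text` is an assumed flag name used only for this sketch.
    parser.add_argument("--text", default=None, help="text to synthesize")
    parser.add_argument("--play", action="store_true",
                        help="play the audio after generation")
    args = parser.parse_args(argv)

    stream = sys.stdin if stdin is None else stdin
    if args.text is not None:
        text = args.text              # 1. an explicit flag wins
    elif not stream.isatty():
        text = stream.read()          # 2. piped or redirected stdin
    else:
        text = ask("Enter text: ")    # 3. fall back to an entry prompt
    return text.strip(), args.play
```

Passing `argv`, `stdin`, and `ask` as parameters keeps the function testable without a real terminal; a caller would normally invoke `resolve_input()` with no arguments.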

New Contributors
* Blaizzy made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/5
* ivanfioravanti made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/8
* lucasnewman made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/11
* rampadc made their first contribution in https://github.com/Blaizzy/mlx-audio/pull/17

**Full Changelog**: https://github.com/Blaizzy/mlx-audio/compare/v0.0.1...v0.0.2

0.0.1

Release Notes - February 28, 2025

Overview
This release introduces a fully functional MLX-Audio package with text-to-speech capabilities, complete with testing infrastructure and CI/CD integration via GitHub Actions.

New Features
- **Text-to-Speech Generation**: Added complete generation pipeline with audio output functionality
- **Audio Joining**: New functionality to join multiple audio segments
- **Model Quantization**: Added support for model quantization to improve performance
- **GitHub Actions**: Implemented CI/CD workflows for automated testing and deployment
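
As a rough illustration of the audio joining feature listed above, the sketch below concatenates mono segments into one sample sequence, optionally inserting silence between them. It is an assumption-laden example in plain Python rather than the package's implementation: `join_audio`, `gap_seconds`, and the 24000 Hz default sample rate are hypothetical choices for illustration.

```python
from typing import Iterable, List, Sequence


def join_audio(
    segments: Iterable[Sequence[float]],
    sample_rate: int = 24000,      # assumed default, not taken from the notes
    gap_seconds: float = 0.0,
) -> List[float]:
    """Concatenate mono segments, inserting optional silence between them."""
    gap = [0.0] * int(round(sample_rate * gap_seconds))
    joined: List[float] = []
    for index, segment in enumerate(segments):
        if index > 0:
            joined.extend(gap)     # silence only between segments, not at the ends
        joined.extend(segment)
    return joined
```

With array libraries the same idea is usually a single concatenation along the time axis; the list version here just keeps the example dependency-free.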

Improvements
- **Kokoro MLX porting**: Completed refactoring of the entire model to the MLX framework:
  - Text encoder with BERT implementation
  - Decoder with improved audio quality
  - Duration, indices, and alignment target prediction
  - Custom bidirectional LSTM, weight norm for CNNs, AdaLayerNorm, and Generator layers
- **SafeTensors Support**: Added working implementation for SafeTensors format
- **Pipeline Structure**: Restructured the generation pipeline for better maintainability
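
For readers unfamiliar with the weight norm mentioned in the Kokoro port, the general technique reparameterizes each weight vector as a direction `v` scaled by a learned magnitude `g`, so that `w = g * v / ||v||`. The sketch below shows that formula on plain Python lists. The `weight_norm` helper and the row-wise layout are assumptions for illustration; the notes do not say which axis or API the MLX port uses.

```python
import math
from typing import List


def weight_norm(v: List[List[float]], g: List[float]) -> List[List[float]]:
    """Return w where each row is w[i] = g[i] * v[i] / ||v[i]||.

    Assumes every row of v has a non-zero Euclidean norm.
    """
    weights = []
    for row, gain in zip(v, g):
        norm = math.sqrt(sum(x * x for x in row))
        weights.append([gain * x / norm for x in row])
    return weights
```

Each output row has Euclidean norm equal to its gain, which is what decouples the weight's magnitude from its direction.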

Bug Fixes
- Fixed model loading mechanism
- Resolved issues with text encoder LayerNorm operation
- Fixed generator functionality
- Addressed issues in LSTM and AdaLayerNorm implementations
- Refactored and fixed ConvWeight component

**Full Changelog**: https://github.com/Blaizzy/mlx-audio/commits/v0.0.1
