ebook2audiobook Changelog

1.2.1

Fixed custom model loading issue.

What's Changed
* chinese readme by WUYIN66 in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/25
* Installation with pip in edit mode by ROBERT-MCDOWELL in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/26
* Revert "Installation with pip in edit mode" by DrewThomasson in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/27
* Merge new Kaggle additions by DrewThomasson in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/29

New Contributors
* WUYIN66 made their first contribution in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/25
* ROBERT-MCDOWELL made their first contribution in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/26
* DrewThomasson made their first contribution in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/27

**Full Changelog**: https://github.com/DrewThomasson/ebook2audiobookXTTS/compare/1.2...1.2.1

1.2

New and improved App

- Single app that runs in GUI or headless mode

- Fixed sentence splitting for all 16 languages

New and Improved Web GUI
![demo_web_gui](https://github.com/user-attachments/assets/0c0cec8b-f681-4b65-acf9-02ed9ab616bc)

Added these parameters for headless mode:

usage: app.py [-h] [--share SHARE] [--headless HEADLESS] [--ebook EBOOK] [--voice VOICE]
[--language LANGUAGE] [--use_custom_model USE_CUSTOM_MODEL]
[--custom_model CUSTOM_MODEL] [--custom_config CUSTOM_CONFIG]
[--custom_vocab CUSTOM_VOCAB] [--custom_model_url CUSTOM_MODEL_URL]
[--temperature TEMPERATURE] [--length_penalty LENGTH_PENALTY]
[--repetition_penalty REPETITION_PENALTY] [--top_k TOP_K] [--top_p TOP_P]
[--speed SPEED] [--enable_text_splitting ENABLE_TEXT_SPLITTING]

Convert eBooks to Audiobooks using a Text-to-Speech model. You can either launch the
Gradio interface or run the script in headless mode for direct conversion.

options:
-h, --help show this help message and exit
--share SHARE Set to True to enable a public shareable Gradio link. Defaults
to False.
--headless HEADLESS Set to True to run in headless mode without the Gradio
interface. Defaults to False.
--ebook EBOOK Path to the ebook file for conversion. Required in headless
mode.
--voice VOICE Path to the target voice file for TTS. Optional, uses a default
voice if not provided.
--language LANGUAGE Language for the audiobook conversion. Options: en, es, fr, de,
it, pt, pl, tr, ru, nl, cs, ar, zh-cn, ja, hu, ko. Defaults to
English (en).
--use_custom_model USE_CUSTOM_MODEL
Set to True to use a custom TTS model. Defaults to False. Must
be True to use custom models, otherwise you'll get an error.
--custom_model CUSTOM_MODEL
Path to the custom model file (.pth). Required if using a custom
model.
--custom_config CUSTOM_CONFIG
Path to the custom config file (config.json). Required if using
a custom model.
--custom_vocab CUSTOM_VOCAB
Path to the custom vocab file (vocab.json). Required if using a
custom model.
--custom_model_url CUSTOM_MODEL_URL
URL to download the custom model as a zip file. Optional, but
will be used if provided. Examples include David Attenborough's
model:
'https://huggingface.co/drewThomasson/xtts_David_Attenborough_fine_tune/resolve/main/Finished_model_files.zip?download=true'.
More XTTS fine-tunes can be found on my Hugging Face at
'https://huggingface.co/drewThomasson'.
--temperature TEMPERATURE
Temperature for the model. Defaults to 0.65. Higher temperatures
lead to more creative outputs, i.e. more hallucinations. Lower
temperatures lead to more monotone outputs, i.e. fewer
hallucinations.
--length_penalty LENGTH_PENALTY
A length penalty applied to the autoregressive decoder. Defaults
to 1.0.
--repetition_penalty REPETITION_PENALTY
A penalty that prevents the autoregressive decoder from
repeating itself. Defaults to 2.0.
--top_k TOP_K Top-k sampling. Lower values mean more likely outputs and
increased audio generation speed. Defaults to 50.
--top_p TOP_P Top-p sampling. Lower values mean more likely outputs and
increased audio generation speed. Defaults to 0.8.
--speed SPEED Speed factor for the speech generation, i.e. how fast the
narrator will speak. Defaults to 1.0.
--enable_text_splitting ENABLE_TEXT_SPLITTING
Enable splitting text into sentences. Defaults to True.

Example: python app.py --headless True --ebook path_to_ebook --voice path_to_voice
--language en --use_custom_model True --custom_model model.pth --custom_config
config.json --custom_vocab vocab.json
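
For scripted or batch conversions, the headless flags above can be assembled programmatically. The following is a minimal sketch, not part of the project: `build_headless_command` is a hypothetical helper, it assumes `app.py` sits in the current working directory, and it passes `True` explicitly to `--headless` and `--use_custom_model` because the usage line shows both flags taking a value. It only builds the argument list and does not run the conversion. The `--custom_model_url` alternative is left out for brevity.

```python
import sys

# Language codes listed in the --language help text above.
SUPPORTED_LANGUAGES = {
    "en", "es", "fr", "de", "it", "pt", "pl", "tr",
    "ru", "nl", "cs", "ar", "zh-cn", "ja", "hu", "ko",
}


def build_headless_command(ebook, voice=None, language="en",
                           custom_model=None, custom_config=None,
                           custom_vocab=None, script="app.py"):
    """Return an argv list for a headless conversion (hypothetical helper)."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")

    # Required part: headless mode needs an ebook path.
    cmd = [sys.executable, script, "--headless", "True",
           "--ebook", ebook, "--language", language]

    # Optional: without --voice the app falls back to its default voice.
    if voice is not None:
        cmd += ["--voice", voice]

    custom = (custom_model, custom_config, custom_vocab)
    if any(part is not None for part in custom):
        # The help text marks all three files as required for a custom model.
        if any(part is None for part in custom):
            raise ValueError("custom_model, custom_config and custom_vocab "
                             "must be given together")
        cmd += ["--use_custom_model", "True",
                "--custom_model", custom_model,
                "--custom_config", custom_config,
                "--custom_vocab", custom_vocab]
    return cmd
```

The returned list can be handed to `subprocess.run(cmd, check=True)` to start the conversion. Using a list rather than a shell string avoids quoting problems with ebook paths that contain spaces.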

What's Changed
* 1.wav missing - change to default_voice.wav by matthiss in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/12

New Contributors
* matthiss made their first contribution in https://github.com/DrewThomasson/ebook2audiobookXTTS/pull/12

**Full Changelog**: https://github.com/DrewThomasson/ebook2audiobookXTTS/compare/1.1...1.2

1.1

**Full Changelog**: https://github.com/DrewThomasson/ebook2audiobookXTTS/commits/1.1
