New
- Apple MPS GPU support (experimental, off by default) (261, 472) by rasonyang
- Replacement of rare Chinese characters (350) by 6drf21e
- New `local` loading mode; the original `local` mode is renamed to `custom` (361) by fumiama
- Core supports streaming inference (360) by Ox0400
- WebUI supports streaming inference (380) by v3ucn
- User customizable logger (398) by fumiama
- CMD supports batch inference (366) by Ox0400
- Customizable DVAE coef parameter (405) by fumiama
- `download_models` and `unload` APIs (4dd1f88) by fumiama
- Normalizers are now registered: users can register their own normalizer as long as it meets the interface requirements (420) by fumiama
- Improved type annotations; all dict parameters are now dataclasses, so editors can auto-complete them at call sites (422) by fumiama
- Interruptible inference, which returns the part inferred so far (433) by fumiama
- **Experimental**: NVIDIA TransformerEngine support (496) by fumiama
- `show_tqdm` parameter for `infer` (3836db8) by fumiama
- **Experimental**: flash_attention_2 support (c109089) by fumiama
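
The registration-style normalizer and the dict-to-dataclass change listed above can be illustrated with a minimal sketch. This is not the ChatTTS implementation: `NormalizerRegistry`, `InferParams`, the probe-based check, and the `temperature` field are hypothetical names and choices made for illustration. Only the parameter name `show_tqdm` comes from these notes, and its default here is assumed.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Assumed contract: a normalizer is any callable that maps text to text.
Normalizer = Callable[[str], str]


class NormalizerRegistry:
    """Hypothetical registry for illustration; not the ChatTTS class."""

    def __init__(self) -> None:
        self._by_lang: Dict[str, Normalizer] = {}

    def register(self, lang: str, fn: Normalizer) -> bool:
        """Accept fn only if it is callable and returns a str for a probe input."""
        if not callable(fn):
            return False
        try:
            if not isinstance(fn("probe"), str):
                return False
        except Exception:
            return False
        self._by_lang[lang] = fn
        return True

    def normalize(self, lang: str, text: str) -> str:
        """Apply the registered normalizer, or return text unchanged if none."""
        fn = self._by_lang.get(lang)
        return fn(text) if fn is not None else text


@dataclass(frozen=True)
class InferParams:
    """Hypothetical parameter bundle standing in for a plain dict."""

    temperature: float = 0.3  # illustrative field and default, not from the notes
    show_tqdm: bool = True    # name from the notes; the default is assumed


if __name__ == "__main__":
    registry = NormalizerRegistry()
    registry.register("en", str.lower)
    params = InferParams(show_tqdm=False)
    print(registry.normalize("en", "Hello World"), params)
```

Checking a candidate at registration time means a bad normalizer is rejected up front instead of failing in the middle of inference. A dataclass gives editors named, typed fields to complete, which a plain dict cannot.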
Fixed
- Normalizer initialization error (343) by fumiama
- Compile error handling (377, 413) by asamaayako
- `[spk_emb]` could be inserted during `refine_text()` (464) by fumiama
- Inconsistent voice timbre when inferring a list of texts (492) by fumiama
- Inference could occasionally return `None` audio (511) by fumiama
Optimized
- DVAE tensor operation process (273) by ain-soph
- MPS inference sound quality (373) by LeoN0425
- Added a `_` prefix to internal calls (4dd1f88) by fumiama
- Renamed `check_model` to `has_loaded` (4dd1f88) by fumiama
- Renamed `load_model` to `load` (432) by fumiama
- File hashes are verified when loading models from a custom path, to guard against tampering (453) by fumiama
- Default output format is now mp3 (449) by fumiama
- Changed `spk_emb` to str type so voice timbres are easy to customize, copy, and share (463) by fumiama
- Removed useless tensor dimension swap in DVAE (488) by charSLee013
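
Two of the items above, hash verification of model files and a string-typed `spk_emb`, can be sketched with the standard library alone. The notes do not say which hash algorithm or string encoding ChatTTS uses, so SHA-256 and base64 over little-endian float32 values are assumptions here, and every function name below is hypothetical.

```python
import base64
import hashlib
import hmac
import struct
from pathlib import Path
from typing import List


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large model files are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path: Path, expected_hex: str) -> bool:
    """Return True only if the file's SHA-256 matches the expected hex digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.lower())


def encode_embedding(values: List[float]) -> str:
    """Pack floats as little-endian float32 and return a copy-pasteable ASCII string."""
    raw = struct.pack(f"<{len(values)}f", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_embedding(text: str) -> List[float]:
    """Inverse of encode_embedding (values come back at float32 precision)."""
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


if __name__ == "__main__":
    encoded = encode_embedding([0.5, -1.25, 2.0])
    print(encoded, decode_embedding(encoded))
```

A plain string can be pasted into a config file or a chat message, which is the sharing convenience the changelog entry describes. The trade-off in this sketch is that values are stored at float32 precision.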
Dependencies
- Relaxed dependency restrictions for easier installation