## Model
* propose and support U2++, shown in the figure below, which exploits both forward (left-to-right) and backward (right-to-left) information at training and decoding time; see the rescoring sketch after this list.
![image](https://user-images.githubusercontent.com/6036624/122723076-19401900-d2a5-11eb-8c8a-8c0c0fac3065.png)
* support dynamic left chunk training and decoding, so the number of history chunks used at decoding time can be limited to save memory and computation; see the mask sketch after this list.
* support distributed training.
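
Below is a minimal sketch of U2++-style rescoring, assuming a CTC prefix beam search has already produced scored n-best hypotheses. The `ctc_weight`/`reverse_weight` blending follows the usual U2++ formulation, but the function names and signatures are illustrative assumptions, not WeNet's actual API.

```python
# Sketch of U2++ rescoring: a left-to-right (forward) and a right-to-left
# (backward) attention decoder jointly rescore CTC n-best hypotheses.
# `ctc_weight` and `reverse_weight` follow common WeNet config naming,
# but the callables below are assumptions for illustration.
from typing import Callable, List, Tuple

def u2pp_rescore(
    hyps: List[Tuple[List[int], float]],        # (token ids, CTC score) pairs
    fwd_decoder: Callable[[List[int]], float],  # log P(y | x), left-to-right
    bwd_decoder: Callable[[List[int]], float],  # log P(y | x), right-to-left
    ctc_weight: float = 0.3,
    reverse_weight: float = 0.3,
) -> List[int]:
    """Return the hypothesis with the best combined score."""
    best_hyp, best_score = [], float("-inf")
    for tokens, ctc_score in hyps:
        fwd = fwd_decoder(tokens)
        bwd = bwd_decoder(list(reversed(tokens)))
        # Blend the two attention directions, then mix in the CTC score.
        att = (1.0 - reverse_weight) * fwd + reverse_weight * bwd
        score = att + ctc_weight * ctc_score
        if score > best_score:
            best_hyp, best_score = tokens, score
    return best_hyp
```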
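
And here is a sketch of the chunk-based attention mask that makes limiting the left context possible. The masking rule matches the description above, but this is a conceptual reimplementation under stated assumptions, not WeNet's code.

```python
# Sketch of a chunk-based attention mask with a limited number of left
# (history) chunks: frame i may attend to all frames in its own chunk and
# in up to `num_left_chunks` preceding chunks, which is what bounds memory
# and computation at streaming decode time.
import torch

def chunk_attention_mask(size: int, chunk_size: int, num_left_chunks: int) -> torch.Tensor:
    """Boolean (size, size) mask; True marks allowed attention positions."""
    mask = torch.zeros(size, size, dtype=torch.bool)
    for i in range(size):
        chunk_idx = i // chunk_size
        start = max(0, (chunk_idx - num_left_chunks) * chunk_size)
        end = min(size, (chunk_idx + 1) * chunk_size)  # end of current chunk
        mask[i, start:end] = True
    return mask

# Example: 8 frames, chunks of 2, keep only 1 history chunk.
print(chunk_attention_mask(8, 2, 1).int())
```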
## Dataset
We now support the following five standard speech datasets, with SOTA or near-SOTA results.
| Dataset     | Language | Data (h) | Test set   | CER/WER (%) | SOTA           |
|-------------|----------|----------|------------|-------------|----------------|
| aishell-1   | Chinese  | 200      | test       | 4.36        | 4.36 (WeNet)   |
| aishell-2   | Chinese  | 1000     | test_ios   | 5.39        | 5.39 (WeNet)   |
| multi-cn    | Chinese  | 2385     | /          | /           | /              |
| librispeech | English  | 1000     | test_clean | 2.66        | 2.10 (ESPnet)  |
| gigaspeech  | English  | 10000    | test       | 11.0        | 10.80 (ESPnet) |
## Productivity
Here are some features related to productivity.
* LM support. The figure below shows the system design for LM support; WeNet works with or without an LM, depending on your application or scenario. A fusion sketch follows this list.
![image](https://user-images.githubusercontent.com/6036624/122722822-bfd7ea00-d2a4-11eb-9c5b-9fcfafe207ff.png)
* timestamp support (see the timestamp sketch after this list).
* n-best support (also illustrated by the scoring sketch after this list).
* endpoint support (see the endpointing sketch after this list).
* gRPC support.
* further refined the x86 server and on-device Android recipes.
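
As a conceptual sketch of the LM and n-best bullets above: an optional LM score can be folded into hypothesis scoring so the same decoder runs with or without an LM. Note that WeNet's runtime realizes LM support with a WFST-based design rather than this kind of shallow fusion, and `lm_score`/`lm_weight` here are illustrative assumptions.

```python
# Conceptual sketch only: fold an optional LM into hypothesis scoring so
# the same decoder works with or without one. `lm_score` and `lm_weight`
# are illustrative assumptions, not WeNet's actual API.
from typing import Callable, List, Optional, Tuple

def score_hypotheses(
    hyps: List[Tuple[List[int], float]],          # (tokens, acoustic score)
    lm_score: Optional[Callable[[List[int]], float]] = None,
    lm_weight: float = 0.5,
) -> List[Tuple[List[int], float]]:
    """Rank hypotheses, optionally adding a weighted LM score."""
    rescored = []
    for tokens, am in hyps:
        total = am + (lm_weight * lm_score(tokens) if lm_score else 0.0)
        rescored.append((tokens, total))
    # The first element of the returned sorted list is the 1-best.
    return sorted(rescored, key=lambda x: x[1], reverse=True)
```

Returning the whole sorted list rather than just the top entry is what makes n-best output essentially free.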
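
For the timestamp bullet, a common approach is to map each token's CTC peak (emission) frame to a time offset. The 40 ms encoder frame shift below matches typical 4x-subsampled configs over 10 ms frames, but is an assumption here.

```python
# Sketch: turning CTC emission frame indices into timestamps. Assumes a
# 40 ms encoder frame shift (10 ms frames with 4x subsampling), which is
# typical but an assumption here.
def frames_to_timestamps(emit_frames, frame_shift_ms: float = 40.0):
    """Map each token's CTC peak frame index to a time in seconds."""
    return [f * frame_shift_ms / 1000.0 for f in emit_frames]

print(frames_to_timestamps([3, 10, 25]))  # [0.12, 0.4, 1.0]
```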
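
For the endpoint bullet, here is a sketch of CTC-based endpointing that treats consecutive blank frames as trailing silence, with different thresholds before and after the first decoded token. The threshold values are illustrative assumptions, not WeNet's shipped defaults.

```python
# Sketch of CTC-based endpointing: consecutive CTC blank frames are treated
# as trailing silence. Threshold values are illustrative assumptions.
def should_endpoint(
    trailing_blank_ms: float,   # duration of consecutive blank frames so far
    decoded_something: bool,    # has any non-blank token been emitted yet?
    max_silence_before_speech_ms: float = 5000.0,
    max_silence_after_speech_ms: float = 1000.0,
) -> bool:
    """Fire the endpoint when trailing silence exceeds the active threshold."""
    if not decoded_something:
        return trailing_blank_ms >= max_silence_before_speech_ms
    return trailing_blank_ms >= max_silence_after_speech_ms
```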