Whisperx

Latest version: v3.3.0

3.3.0

What's Changed
* Update faster-whisper to 1.0.2 to enable model distil-large-v3 by moritzbrantner in https://github.com/m-bain/whisperX/pull/814
* latest faster-whisper support added by Hasan-Naseer in https://github.com/m-bain/whisperX/pull/875
* Working version with pyannote:3.3.2 and faster-whisper:1.1.0 by ibombonato in https://github.com/m-bain/whisperX/pull/936
* Add utilization to verbose flag by H4CK3Rabhi in https://github.com/m-bain/whisperX/pull/759
* Added local_files_only option on whisperx.load_model for offline mode by RoqueGio in https://github.com/m-bain/whisperX/pull/867
* adding cache_dir to wav2vec2 by bnitsan in https://github.com/m-bain/whisperX/pull/681
* feat: add basic installation test flow & restrict python versions by Barabazs in https://github.com/m-bain/whisperX/pull/965
* chore: add build and release workflow by Barabazs in https://github.com/m-bain/whisperX/pull/966
* fix: update README image source and enhance setup.py for long description by Barabazs in https://github.com/m-bain/whisperX/pull/968
* docs: update installation instructions in README by Barabazs in https://github.com/m-bain/whisperX/pull/969
* fix: add UTF-8 encoding when reading README.md by xigh in https://github.com/m-bain/whisperX/pull/970
* chore: loosen ctranslate2 version restriction & bump whisperX version by Barabazs in https://github.com/m-bain/whisperX/pull/971

New Contributors
* moritzbrantner made their first contribution in https://github.com/m-bain/whisperX/pull/814
* Hasan-Naseer made their first contribution in https://github.com/m-bain/whisperX/pull/875
* ibombonato made their first contribution in https://github.com/m-bain/whisperX/pull/936
* H4CK3Rabhi made their first contribution in https://github.com/m-bain/whisperX/pull/759
* RoqueGio made their first contribution in https://github.com/m-bain/whisperX/pull/867
* bnitsan made their first contribution in https://github.com/m-bain/whisperX/pull/681
* xigh made their first contribution in https://github.com/m-bain/whisperX/pull/970

**Full Changelog**: https://github.com/m-bain/whisperX/compare/v3.2.0...v3.3.0

3.2.0

Device and Language Support
* added Korean wav2vec2 model by Boulaouaney in https://github.com/m-bain/whisperX/pull/277
* Add Czech alignment model by Thebys in https://github.com/m-bain/whisperX/pull/280
* Adding Norwegian Bokmål and Norwegian Nynorsk by peregilk in https://github.com/m-bain/whisperX/pull/636
* Support language names in `--language` parameter. by jkukul in https://github.com/m-bain/whisperX/pull/517
* Add align model for catalan language. by davidmartinrius in https://github.com/m-bain/whisperX/pull/581
* add missing Cantonese in supported languages by MahmoudAshraf97 in https://github.com/m-bain/whisperX/pull/617
* Add alignment model for Malayalam by kurianbenoy in https://github.com/m-bain/whisperX/pull/585
* Added Romanian phoneme-based ASR model by Majdoddin in https://github.com/m-bain/whisperX/pull/791
* added alignment for sk and sl languages by jan-panoch in https://github.com/m-bain/whisperX/pull/852
* Add wav2vec model for Vietnamese in https://github.com/m-bain/whisperX/pull/278
* Add Urdu model support for alignment by abCods in https://github.com/m-bain/whisperX/pull/374
* chore(writer): Join words without spaces for ja, zh by jim60105 in https://github.com/m-bain/whisperX/pull/440
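
The `--language` change listed above lets the parameter take a language name as well as a code. A minimal sketch of that kind of normalization follows. It assumes a small hypothetical lookup table; `NAME_TO_CODE` and `normalize_language` are illustrative names and are not taken from whisperX.

```python
# Hypothetical subset of a name -> code table (illustrative only).
NAME_TO_CODE = {
    "english": "en",
    "korean": "ko",
    "czech": "cs",
    "catalan": "ca",
    "vietnamese": "vi",
}


def normalize_language(value: str) -> str:
    """Return a language code for either a code ("ko") or a name ("Korean")."""
    key = value.strip().lower()
    if key in NAME_TO_CODE.values():
        # Already a known code: pass it through unchanged.
        return key
    if key in NAME_TO_CODE:
        # A known language name: map it to its code.
        return NAME_TO_CODE[key]
    raise ValueError(f"Unsupported language: {value!r}")
```

With this table, `normalize_language("Korean")` and `normalize_language("ko")` resolve to the same code, so later stages only ever see codes.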

Bug Fixes and Stability Improvements
* fix Unequal Stack Size VAD error by m-bain in https://github.com/m-bain/whisperX/pull/281
* fix: Bug in type hinting by VisionOra in https://github.com/m-bain/whisperX/pull/294
* pin faster whisper by sorgfresser in https://github.com/m-bain/whisperX/pull/474
* Fix repeat transcription on different languages and proper suppress_numerals use by Joemgu7 in https://github.com/m-bain/whisperX/pull/395
* fix writer fail on segments 0 by sorgfresser in https://github.com/m-bain/whisperX/pull/429
* fix missing speaker prefix by invisprints in https://github.com/m-bain/whisperX/pull/438
* fix: correct default_asr_options with new options (patch 0.8) by remic33 in https://github.com/m-bain/whisperX/pull/458
* Fixes --model_dir path by canoalberto in https://github.com/m-bain/whisperX/pull/648
* fix: force ctranslate to version 4.4.0 by Barabazs in https://github.com/m-bain/whisperX/pull/946
* fix: update faster-whisper dependencies by cococig in https://github.com/m-bain/whisperX/pull/716
* fix: ZeroDivisionError when --print_progress True by mvoggu in https://github.com/m-bain/whisperX/pull/494
* Minor fixes for word options and subtitles by amolinasalazar in https://github.com/m-bain/whisperX/pull/549
* fix unboundlocalerror by sorgfresser in https://github.com/m-bain/whisperX/pull/554
* Fix: Allow vad options to be configurable by passing to FasterWhisperPipeline and merge_chunks. by abettke in https://github.com/m-bain/whisperX/pull/507
* fix minimum input length for torch wav2vec2 models by MahmoudAshraf97 in https://github.com/m-bain/whisperX/pull/510
* fix(diarize): key error on empty track by characat0 in https://github.com/m-bain/whisperX/pull/518
* pip compliance for git+ installs by spbisc97 in https://github.com/m-bain/whisperX/pull/603

Documentation Updates
* adds link to whisperX medium on replicate.com by CaRniFeXeR in https://github.com/m-bain/whisperX/pull/431
* Document --compute_type command line option by dotgrid in https://github.com/m-bain/whisperX/pull/430
* adding link to Replicate demo by daanelson in https://github.com/m-bain/whisperX/pull/352
* fix: typo in error message by zamoshchin in https://github.com/m-bain/whisperX/pull/493
* Fix link in README.md by jimregan in https://github.com/m-bain/whisperX/pull/668
* Update README.md by valentt in https://github.com/m-bain/whisperX/pull/509
* Add a special note about Speaker-Diarization-3.0 in readme by kaihe-stori in https://github.com/m-bain/whisperX/pull/521
* Update README to correct speaker diarization version link by gillens in https://github.com/m-bain/whisperX/pull/618
* Update README.md by mlopsengr in https://github.com/m-bain/whisperX/pull/630
* fix link by M0HID in https://github.com/m-bain/whisperX/pull/605
* Remove torchvision from README by baer in https://github.com/m-bain/whisperX/pull/378

Miscellaneous Changes
* move model to assets by m-bain in https://github.com/m-bain/whisperX/pull/945
* Update alignment.py by Ayushi-Desynova in https://github.com/m-bain/whisperX/pull/418
* Update alignment.py by awerks in https://github.com/m-bain/whisperX/pull/427
* push contributions from main by m-bain in https://github.com/m-bain/whisperX/pull/290
* make diarization faster by davidas1 in https://github.com/m-bain/whisperX/pull/400
* Add device_index option by sorgfresser in https://github.com/m-bain/whisperX/pull/266
* Add transcribe keywords by sorgfresser in https://github.com/m-bain/whisperX/pull/269
* Added download path parameter. by prameshbajra in https://github.com/m-bain/whisperX/pull/284
* Suppress numerals by m-bain in https://github.com/m-bain/whisperX/pull/303
* Add Audacity export by Ca-ressemble-a-du-fake in https://github.com/m-bain/whisperX/pull/309
* Update transcribe.py -> small change in `batch_size` description by mabergerx in https://github.com/m-bain/whisperX/pull/382
* Suggest using pytorch-cuda 11.8 instead of 11.7 by tijszwinkels in https://github.com/m-bain/whisperX/pull/255
* feat: Add merge chunks chunk_size as arguments. by jim60105 in https://github.com/m-bain/whisperX/pull/445
* A solution to long subtitles and words without timestamps by awerks in https://github.com/m-bain/whisperX/pull/459
* chore(writer): improve text display(ja etc) in json file by darwintree in https://github.com/m-bain/whisperX/pull/472
* add faster whisper threading by sorgfresser in https://github.com/m-bain/whisperX/pull/473
* Pyannote3 by remic33 in https://github.com/m-bain/whisperX/pull/492
* Update alignment.py by piuy11 in https://github.com/m-bain/whisperX/pull/487
* Pass patience and beam_size to faster-whisper. by jkukul in https://github.com/m-bain/whisperX/pull/527
* remove the minimum length for alignment and print the failing segment by MahmoudAshraf97 in https://github.com/m-bain/whisperX/pull/529
* Update setup.py to use pyannote.audio version with working GPU by wuurrd in https://github.com/m-bain/whisperX/pull/531
* Update setup.py to download pyannote depending on platform by justinwlin in https://github.com/m-bain/whisperX/pull/541
* Drop ffmpeg-python dependency and call ffmpeg directly. by hidenori-endo in https://github.com/m-bain/whisperX/pull/570
* no align based on space by sorgfresser in https://github.com/m-bain/whisperX/pull/556
* Update asr.py and make the model parameter be used by kaka1909 in https://github.com/m-bain/whisperX/pull/580
* Move load_model after WhisperModel by DougTrajano in https://github.com/m-bain/whisperX/pull/584
* Update pyannote to 3.1.0 by remic33 in https://github.com/m-bain/whisperX/pull/586
* support for `large-v3` by MahmoudAshraf97 in https://github.com/m-bain/whisperX/pull/599
* Added option to load Custom VAD model to load model method by Swami-Abhinav in https://github.com/m-bain/whisperX/pull/654
* Update pyannote to v3.1.1 to fix a diarization problem (and diarize.py) by santialferez in https://github.com/m-bain/whisperX/pull/646
* Get rid of numeral_symbol_tokens variable in printed message by KossaiSbai in https://github.com/m-bain/whisperX/pull/669
* Add Replicate large-v3 demo by victor-upmeet in https://github.com/m-bain/whisperX/pull/703
* local vad model by m-bain in https://github.com/m-bain/whisperX/pull/944
* Feat: add new align models - SHORT by Equipo45 in https://github.com/m-bain/whisperX/pull/922
* Update alignment.py by peregilk in https://github.com/m-bain/whisperX/pull/687
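
For the Audacity export entry above, here is a rough sketch of writing transcript segments in Audacity's label-track text format, which is one `start<TAB>end<TAB>label` line per label with times in seconds. The `start`, `end`, and `text` keys and the function name `to_audacity_labels` are assumptions made for illustration. This is not the code from the pull request.

```python
def to_audacity_labels(segments):
    """Render segments as Audacity label-track text: start<TAB>end<TAB>label."""
    lines = []
    for seg in segments:
        # Tabs separate the fields, so replace any tab inside the label itself.
        text = seg["text"].strip().replace("\t", " ")
        lines.append(f"{seg['start']:.3f}\t{seg['end']:.3f}\t{text}\n")
    return "".join(lines)


if __name__ == "__main__":
    demo = [
        {"start": 0.0, "end": 1.5, "text": " Hello "},
        {"start": 1.5, "end": 3.25, "text": "world"},
    ]
    print(to_audacity_labels(demo), end="")
```

The resulting text can be saved to a `.txt` file and loaded into Audacity through its label import.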

**Full Changelog**: https://github.com/m-bain/whisperX/compare/v3.1.1...v3.2.0

3.1.1

- `translate` functionality added
- fix word timestamp bug (words no longer have consecutive timestamps)

3.1.0

- 70x real time transcription, <8GB gpu memory requirement ⚡️⚡️
- each transcript segment is a sentence (using `nltk.sent_tokenize`)
- diarization now assigned per sentence (and outputted to srt)
- clean up on alignment logic
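
The per-sentence diarization noted above can be pictured as giving each sentence the speaker whose turns overlap it the most. The sketch below illustrates that idea under assumed data shapes: dicts with `start`, `end`, and `speaker` keys. The max-overlap rule is itself an assumption, since the release notes do not describe the exact method.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Length in seconds of the intersection of two time intervals."""
    return max(0.0, min(a_end, b_end) - max(a_start, b_start))


def assign_speakers(sentences, turns):
    """Attach to each sentence the speaker whose turns overlap it most."""
    labeled = []
    for sent in sentences:
        totals = {}
        for turn in turns:
            dur = overlap(sent["start"], sent["end"], turn["start"], turn["end"])
            if dur > 0:
                totals[turn["speaker"]] = totals.get(turn["speaker"], 0.0) + dur
        # Sentences that no speaker turn overlaps are left unlabeled (None).
        speaker = max(totals, key=totals.get) if totals else None
        labeled.append({**sent, "speaker": speaker})
    return labeled
```

Summing overlap per speaker, rather than taking the single longest turn, keeps the result stable when one speaker's turn is split into several short pieces.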

3.0.2

torch2.0, python3.10

3.0.1

- fix pickling error (set num_workers=0 so data loading runs in the main process)
- add basic diarization
- pad language detection if less than 30s
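
The padding item above concerns clips shorter than the 30 s window. A minimal sketch of padding a short clip to that window follows, assuming 16 kHz mono samples held in a plain list and zero-padding at the end. The release note does not say how the padding is done, so both choices are assumptions, and `pad_or_trim` is an illustrative helper.

```python
SAMPLE_RATE = 16_000   # assumed: 16 kHz mono audio
WINDOW_SECONDS = 30    # window length named in the release note
N_SAMPLES = SAMPLE_RATE * WINDOW_SECONDS


def pad_or_trim(samples, length=N_SAMPLES):
    """Zero-pad short clips at the end, or cut long ones, to exactly `length` samples."""
    if len(samples) >= length:
        return list(samples[:length])
    return list(samples) + [0.0] * (length - len(samples))
```

A one-second clip, for example, keeps its 16,000 original samples and gains 464,000 trailing zeros.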
