Ohmeow-blurr

Latest version: v1.0.5


1.0.0

The official v1 release of `ohmeow-blurr`

This is a massive refactoring of the previous iterations of blurr, including namespace changes that will make it easier to add support for vision, audio, and other transformers in the future. If you've used any of the previous versions of `blurr` or the development build we covered in part 2 of the W&B study group, ***please make sure you read the docs and note the namespace changes***.

To get up to speed with how to use this library, check out the [W&B x fastai x Hugging Face](https://www.youtube.com/playlist?list=PLD80i8An1OEF8UOb9N9uSoidOGIMKW96t) study group. The [docs](https://ohmeow.github.io/blurr/) are your friend and full of examples as well. I'll be working on updating the other examples floating around the internet as I have time.

If you have any questions, please use the `hf-fastai` channel in the fastai Discord or GitHub issues. As always, any and all PRs are welcome.

0.0.26

Check out the README for more info.

This release fixes a couple of issues and also includes a few breaking changes. Make sure you update your version of fastai to >= 2.3.1 and your Hugging Face transformers to >= 4.5.x.
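Assuming a pip-based environment, the upgrade could look like this (package names as published on PyPI; the version pins match the minimums above):

```shell
# Upgrade to versions compatible with blurr 0.0.26
pip install --upgrade "fastai>=2.3.1" "transformers>=4.5" ohmeow-blurr
```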

0.0.22

* Updated the Seq2Seq models to use some of the latest Hugging Face bits like `tokenizer.prepare_seq2seq_batch`.
* Separated out the Seq2Seq and Token Classification metrics into metrics-specific callbacks for a better separation of concerns. As a best practice, you should now pass them as callbacks to `fit_one_cycle`, etc., rather than attaching them to your `Learner`.
* NEW: Translation tasks are now available in blurr, joining causal language modeling and summarization in our core Seq2Seq stack.
* NEW: Integration of Hugging Face's Seq2Seq metrics (ROUGE, BERTScore, METEOR, BLEU, and SacreBLEU). Plenty of info on how to set this up in the docs.
* NEW: Added `default_text_gen_kwargs`, a method that, given a Hugging Face config, model, and (optionally) a task, will return the default/recommended kwargs for any text generation models.
* A lot of code cleanup (e.g., refactored naming and consolidation of redundant code into shared classes/methods).
* More model support and more tests across the board! Check out the docs for more info.
* Misc. validation improvements and bug fixes.
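The callback pattern described above can be sketched in plain Python (no fastai installed here; `RougeMetricCallback`, the toy metric, and the stub `Learner` are all invented for illustration and are not blurr's actual classes):

```python
# Sketch of the pattern: metrics live in a callback passed to a single
# training call, rather than being attached to the Learner itself.

class RougeMetricCallback:
    """Hypothetical stand-in for blurr's Seq2Seq metrics callbacks."""
    def __init__(self):
        self.scores = []

    def after_batch(self, preds, targets):
        # Toy "metric": fraction of exact matches (real blurr computes
        # ROUGE/BLEU/etc. over generated text).
        self.scores.append(sum(p == t for p, t in zip(preds, targets)) / len(preds))

class Learner:
    """Stub training loop standing in for a fastai Learner."""
    def fit_one_cycle(self, n_epochs, cbs=()):
        for _ in range(n_epochs):
            preds, targets = ["a", "b"], ["a", "c"]  # fake model output
            for cb in cbs:
                cb.after_batch(preds, targets)

metric_cb = RougeMetricCallback()
# The metric is only active for this fit call, not attached to the Learner.
Learner().fit_one_cycle(2, cbs=[metric_cb])
```

Keeping metrics out of the `Learner` means the same `Learner` can be reused for fits that do not need (potentially expensive) text-generation metrics.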

See the docs for each task for more info!
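The idea behind `default_text_gen_kwargs` can be illustrated with a minimal stand-in (the function body, `FakeConfig`, and all default values below are invented for the sketch; they are not blurr's actual implementation or defaults):

```python
# Illustrative sketch: map a model's config (and optionally a task) to
# sensible text-generation kwargs, instead of hand-tuning them per model.

def default_text_gen_kwargs(hf_config, hf_model, task=None):
    kwargs = {
        "max_length": getattr(hf_config, "max_length", 20),
        "num_beams": getattr(hf_config, "num_beams", 1),
    }
    if task == "summarization":
        kwargs["min_length"] = getattr(hf_config, "min_length", 0)
    return kwargs

class FakeConfig:
    """Stands in for a Hugging Face PretrainedConfig."""
    max_length, num_beams, min_length = 142, 4, 56

gen_kwargs = default_text_gen_kwargs(FakeConfig(), hf_model=None, task="summarization")
```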

0.0.16

Makes blurr compatible with PyTorch 1.7 and fastai 2.1.x.

Added a new examples section.

Misc. improvements and fixes.

0.0.14

This release simplifies the API and introduces a new on-the-fly tokenization feature whereby all tokenization happens during mini-batch creation. There are several upsides to this approach. First, it gets you training faster. Second, it reduces RAM utilization while reading your raw data (especially nice with very large datasets that would give folks problems on platforms like Colab). And lastly, I believe the approach provides some flexibility to include data augmentation and/or build adversarial models, among other things.
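The on-the-fly approach can be sketched in plain Python (a toy whitespace tokenizer stands in for a Hugging Face tokenizer, and `collate` is an invented name, not blurr's API):

```python
# Minimal sketch of on-the-fly tokenization: the dataset holds raw strings,
# and token ids are produced only when a mini-batch is assembled.

def toy_tokenizer(texts, max_length=6):
    """Toy stand-in for a Hugging Face tokenizer: whitespace split + pad/truncate."""
    vocab, batch = {}, []
    for txt in texts:
        ids = [vocab.setdefault(tok, len(vocab) + 1) for tok in txt.split()]
        ids = ids[:max_length] + [0] * max(0, max_length - len(ids))  # pad with 0
        batch.append(ids)
    return batch

def collate(samples):
    # Tokenize per batch: only raw text lives in RAM between batches.
    return toy_tokenizer(list(samples))

raw_data = ["hello world", "blurr keeps raw text until batch time"]
batch = collate(raw_data)
```

Because nothing is pre-tokenized, the raw text is still available at batch time, which is what makes batch-level data augmentation possible.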

0.0.12

This pre-release does tokenization/numericalization the traditional way, as a fastai typed transform.
