Major Features and Improvements
* New FastBertNormalizer that improves speed for BERT normalization and is convertible to TF Lite.
* New FastBertTokenizer that combines FastBertNormalizer and FastWordpieceTokenizer.
* New ngrams kernel for handling STRING_JOIN reductions.
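
As a rough illustration of what a STRING_JOIN reduction computes, here is a plain-Python sketch. It is not the kernel's implementation, and the function name `ngrams_string_join` and its parameters are hypothetical, chosen only for this example.

```python
from typing import List


def ngrams_string_join(tokens: List[str], width: int, separator: str = " ") -> List[str]:
    """Return every contiguous run of `width` tokens, joined into one string."""
    if width <= 0:
        raise ValueError("width must be positive")
    # A sequence shorter than `width` yields no n-grams (the range is empty).
    return [
        separator.join(tokens[i:i + width])
        for i in range(len(tokens) - width + 1)
    ]


bigrams = ngrams_string_join(["the", "quick", "brown", "fox"], width=2)
print(bigrams)  # ['the quick', 'quick brown', 'brown fox']
```
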
Bug Fixes and Other Changes
* Fix NgramsStringJoin shape inference to handle unranked tensors.
* Upgrade pybind11 and re-enable tests that were broken.
* Rename a couple files to match the naming of the other tflite kernels. Also adds some deps to tflite_ops that were missing and causing an error when testing `:all`.
* Add to TF Lite documentation that ngrams is a convertible op.
* Fix public access and missing ICU data for build_fast_bert_normalizer_model and enable the disabled tests.
* Update the doc for FastWordpieceTokenizer.
* Refine the doc for FastWordpieceTokenizer.
* Bug fix: make BertTokenizer work for RaggedTensors with row_splits_dtype=int32
* Fix a typo in text.WordpieceTokenizer
* Add missing commas in the emoticons for the normalizer
* Refactor build and test scripts to use prepare_tf_dep.sh
* Fixes prepare_tf_dep.sh for OSX.
* Fixed bug in setup.py that was requiring the wrong version.
* Updated package with the correct versions of Python we release on.
* Update documentation on TF Lite convertible ops.
* Transition to use TF's version of bazel.
* Transition to use TF's bazel configuration.
* Add missing symbols for tokenization layers
* Fix typo in text_generation.ipynb
* Fix grammar typo
* Allow fast wordpiece tokenizer to take in external wordpiece model.
* Internal change
* Improvement to guide where mean call is redundant. See https://github.com/tensorflow/text/issues/810 for more info.
* Update broken link and fix typo in BERT-SNGP demo notebook
* Consolidate disparate test-related files into a single testing_infra folder.
* Pin the tf-text version in guides & tutorials.
* Fix bug in constrained sequence op. Added a check for the edge case num_steps = 0, which should do nothing, to prevent SIGSEGV crashes.
* Remove outdated Keras tests, since Keras no longer makes the testing utilities available.
* Update BERT preprocessing to pad the correct tensors
* Update tensorflow-text notebooks from 2.7 to 2.8
* Optimize FastWordPiece to only generate requested outputs.
* Add a note about byte-indexing vs character indexing.
* Add a MAX_TOKENS to the transformer tutorial.
* Only export tensorflow symbols from shared libs.
* (Generated change) Update tf.Text versions and/or docs.
* Do not run the prepare_tf_dep script for Apple M1 Macs.
* Update text_classification_rnn.ipynb
* Fix the exported symbols for the linker test. Adding them to the shared objects instead of the C++ code allows the code to be compiled together into one large shared lib.
* Implement FastBertNormalizer based on codepoint-wise mappings.
* Add pybind for fast_bert_normalizer_model_builder.
* Remove unused comments related to Python 2 compatibility.
* Update transformer.ipynb
* Update toolchain & temporarily disable tf lite tests.
* Define manylinux2014 for the new toolchain target, and have presubmits use it.
* Move tflite build deps to custom target.
* Add FastBertTokenizer.
* Update bazel version to 5.1.0
* Update TF Text to use new Ngrams kernel.
* Don't try to set dimension if shape is unknown for ngrams.
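
The distinction between byte indexing and character indexing mentioned in the list above can be shown in plain Python, without tf.Text. This is only a sketch: it assumes UTF-8 encoded text, and the helper name `utf8_byte_span` and the sample string are made up for this example.

```python
from typing import Tuple


def utf8_byte_span(text: str, token: str) -> Tuple[int, int]:
    """Return the (start, end) offsets of `token` in `text`, counted in UTF-8 bytes."""
    data = text.encode("utf-8")
    encoded_token = token.encode("utf-8")
    start = data.index(encoded_token)  # raises ValueError if the token is absent
    return start, start + len(encoded_token)


text = "naïve café"
char_start = text.index("café")                  # offset in Unicode code points
byte_start, byte_end = utf8_byte_span(text, "café")  # offsets in bytes

# "ï" occupies two bytes in UTF-8, so the byte offset runs one ahead.
print(char_start, byte_start, byte_end)  # 6 7 12

# Slicing the encoded bytes with byte offsets recovers the token.
print(text.encode("utf-8")[byte_start:byte_end].decode("utf-8"))  # café
```

Applying byte offsets to the Python `str` directly would select the wrong characters whenever non-ASCII text precedes the token, which is why the two kinds of offsets should not be mixed.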
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Aflah, Connor Brinton, devnev39, Janak Ramakrishnan, Martin, Nathan Luehr, Pierre Dulac, Rabin Adhikari, gadagashwini, mohantym, rtg0795