Highlights
1. Standardize the ModelOutput API:
- Remove the ambiguous flags `ignore_masking` and `hf_format`: https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/543
- Introduce the `testing` flag to differentiate between evaluation (=True) and inference (=False) modes: https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/543
- All prediction tasks now return the same output format: https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/546
- During training and evaluation, the output is a dictionary with three elements: `{"loss": torch.tensor, "labels": torch.tensor, "predictions": torch.tensor}`.
- During inference, the output is the tensor of predictions.
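As a rough illustration of the two output shapes described above, the helper below extracts predictions from either form. `extract_predictions` is a hypothetical name, not part of Transformers4Rec, and plain Python values stand in for `torch.tensor` objects so the sketch runs without the library.

```python
from typing import Any


def extract_predictions(output: Any) -> Any:
    """Return the predictions from a model output.

    Assumes the convention stated in these notes: a dict with "loss",
    "labels" and "predictions" keys during training and evaluation,
    and the bare predictions during inference.
    """
    if isinstance(output, dict):
        return output["predictions"]
    return output


# Stand-ins for tensors; real outputs would hold torch tensors.
eval_output = {"loss": 0.42, "labels": [1, 0], "predictions": [0.9, 0.2]}
inference_output = [0.9, 0.2]

print(extract_predictions(eval_output))       # [0.9, 0.2]
print(extract_predictions(inference_output))  # [0.9, 0.2]
```

Code that previously branched on `ignore_masking` or `hf_format` could, under these assumptions, branch on the output type instead.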
2. Extend the `Trainer` class to support all prediction tasks:
https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/564
- The `Trainer` class now accepts a T4Rec model defined with binary or regression tasks.
- Remove the `HFWrapper` class, as the `Trainer` now supports the base T4Rec `Model` class.
- Set the default of the trainer's argument `predict_top_k` to `0` instead of `10`.
* Note that getting the top-k predictions is specific to `NextItemPredictionTask` and the user should explicitly set the parameter in the `T4RecTrainingArguments` object. If not specified, the method `Trainer.predict()` returns unsorted predictions for the whole item catalog.
- Support multi-task learning in the `Trainer` class: it accepts any T4Rec model defined with multiple tasks and/or multiple heads.
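To make the effect of `predict_top_k` concrete, here is a minimal pure-Python sketch of the idea. It is not the library's implementation: `select_top_k` is a hypothetical helper, and the return shape (a tuple of item ids and scores) is an assumption made for illustration. With `k=0` the scores for the whole catalog come back unsorted; with `k>0` only the `k` highest-scoring items are kept.

```python
from typing import List, Tuple, Union


def select_top_k(
    scores: List[float], k: int = 0
) -> Union[List[float], Tuple[List[int], List[float]]]:
    """Return all scores when k == 0, else the top-k item ids and scores."""
    if k <= 0:
        # Mirrors the new default: unsorted predictions for the whole catalog.
        return scores
    # Rank item indices by descending score and keep the first k.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    top_ids = ranked[:k]
    return top_ids, [scores[i] for i in top_ids]


catalog_scores = [0.1, 0.7, 0.05, 0.15]
print(select_top_k(catalog_scores))       # [0.1, 0.7, 0.05, 0.15]
print(select_top_k(catalog_scores, k=2))  # ([1, 3], [0.7, 0.15])
```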
3. Fix the inference performance of the Transformer-based model trained with masked language modeling (MLM):
https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/551
* At inference, the input sequence is extended by a [MASK] embedding after the last non-padded position to take into account the target position. The hidden representation of the [MASK] position is used to get the next-item prediction scores.
* With this fix, the user no longer needs to add a dummy position to the input test data when calling `Trainer.predict()` or `model(test_batch, training=False, testing=False)`.
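The idea behind this fix can be sketched with plain lists of item ids. This is an illustration only: the library works on embeddings inside the model, and the names `PAD_ID`, `MASK_ID`, and `append_mask_position` are hypothetical. The sketch also assumes right-padded sequences that grow by one position.

```python
from typing import List

PAD_ID = 0    # assumed padding id
MASK_ID = -1  # hypothetical stand-in for the [MASK] position


def append_mask_position(sequence: List[int]) -> List[int]:
    """Extend a right-padded sequence by one position and place the mask
    immediately after the last non-padded item."""
    # Index just past the last non-padded item (0 if everything is padding).
    n_items = max(
        (i + 1 for i, item in enumerate(sequence) if item != PAD_ID),
        default=0,
    )
    extended = sequence + [PAD_ID]  # the sequence grows by one position
    extended[n_items] = MASK_ID     # the mask marks the target position
    return extended


print(append_mask_position([5, 8, 3, 0, 0]))  # [5, 8, 3, -1, 0, 0]
print(append_mask_position([5, 8, 3]))        # [5, 8, 3, -1]
```

In this picture, the score for the next item is read from the hidden representation at the masked position, which is why the test data no longer needs a dummy trailing item.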
4. Update Transformers4Rec to use the new merlin-dataloader package: https://github.com/NVIDIA-Merlin/Transformers4Rec/pull/547
* The `NVTabularDataLoader` class is renamed to `MerlinDataLoader` and now uses the loader from the merlin-dataloader package.
* Users can specify the argument `data_loader_engine="merlin"` in the `T4RecTrainingArguments` object to use the Merlin dataloader. It supports GPU and CPU environments. The alias `nvtabular` is kept for backward compatibility.
What's Changed
Breaking Changes
- Extend trainer class to support all T4Rec prediction tasks sararb (564)
- Standardize prediction tasks' outputs nzarif (546)
- Uses merlin-dataloader package edknv (547)
- Refactoring part1- flags modification nzarif (543)
Bug Fixes
- Fix error raised by latest Torchmetrics (0.11.0) sararb (576)
- Fix the test data path in Trainer.predict() sararb (571)
- Fix discrepancy between evaluation and inference modes sararb (551)
Features
- Support to pre-trained embeddings initializer (trainable or not) gabrielspmoreira (572)
- Extend trainer class to support all T4Rec prediction tasks sararb (564)
- Standardize prediction tasks' outputs nzarif (546)
- Add music-streaming synthetic data to test the support of all predictions tasks with the Trainer class sararb (540)
- Refactoring part1- flags modification nzarif (543)
Documentation
- Address review feedback mikemckiernan (562)
- Serving tfrec with pyt backend example rnyak (552)
- docs: Add basic SEO configuration mikemckiernan (518)
- docs: Add semver to calver banner mikemckiernan (520)
- Minor updates to notebook texts bbozkaya (548)
Maintenance
- Update mypy version from 0.971 to 0.991 oliverholworthy (574)
- Uses merlin-dataloader package edknv (547)
- fix drafter and update cpu ci to run on targeted branch jperez999 (549)
- Add lint workflow to run pre-commit on all files oliverholworthy (545)
- Specify packages to look for in setup.py to avoid publishing tests oliverholworthy (529)
- Cleanup tensorflow dependencies oliverholworthy (530)
- Add docs requirements to extras list in setup.py (533)
- Remove stale documentation reviews (531)
- Update branch name extraction for tag builds (608)
- run github action tests and lint via tox, with upstream deps installed (527)