Unbabel-comet

Latest version: v2.2.5

Safety actively analyzes 723650 Python packages for vulnerabilities to keep your Python projects secure.

Page 2 of 3

2.1.1

Minor bug fix on MBR default model.
Rollback to CometKiwi-22 as default QE model (due to being lightweight)
Minor dependencies update

2.1.0

Released of CometKiwi [-XL](https://huggingface.co/Unbabel/wmt23-cometkiwi-da-xl) and [-XXL](https://huggingface.co/Unbabel/wmt23-cometkiwi-da-xxl)

Bump torchmetrics ("^0.10.2") and Pytorch Lightning ("^2.0.0") (159)

Update on MODELS.md and LICENSE.models.md documentation.

2.0.2

Minor bug fix, update HF hub and released [CometKiwi model](https://huggingface.co/Unbabel/wmt22-cometkiwi-da):
- Bump Hugging Face Hub (^0.16.0)
- Added flexibility to checkpoint download (156 )
- [CometKiwi](https://huggingface.co/Unbabel/wmt22-cometkiwi-da) model is finally released. It's open-source with a non-commercial license.

2.0.1

2.0.0

- **New model architecture (UnifiedMetric)** inspired by [UniTE](https://aclanthology.org/2022.acl-long.558/).
- This model uses cross-encoding (similar to [BLEURT](https://aclanthology.org/2020.acl-main.704/)), works **with and without references** and can be trained in a multitask setting. This model is also implemented in a **very flexible** way where we can decide to train using just source and MT, reference and MT or source, MT and reference.

- New encoder models [RemBERT](https://arxiv.org/abs/2010.12821) and [XLM-RoBERTa-XL](https://arxiv.org/pdf/2105.00572.pdf)

- New training features:
- **System-level accuracy** [(Kocmi et al, 2021)](https://aclanthology.org/2021.wmt-1.57.pdf) reported during validation (only if validation files has a `system` column).
- **Support for multiple training files** (each file will be loaded at the end of the corresponding epoch): This is helpful to **train with large datasets** and to **train following a curriculum**.
- **Support for multiple validation files**: Before we were using 1 single validation file with all language pairs concatenated which has an impact in correlations. With this change we now can have 1 validation file for each language and correlations will be averaged over all validation sets. This also allows for the use of validation files where the ground truth scores are in different scales.
- **Support to HuggingFace Hub**: Models can now be easily added to HuggingFace Hub and used directly using the CLI

- With this release we also add **New models from WMT 22**:
1) We won the WMT 22 QE shared task: Using UnifiedMetric it should be easy to replicate our final system, nonetheless we are planning to release the system that was used: `wmt22-cometkiwi-da` which performs strongly both on data from the QE task (MLQE-PE corpus) and on data from metrics task (MQM annotations).
2) We were 2nd in the Metrics task (1st place was MetricXL a 6B parameter metric trained on top of mT5-XXL). Our new model `wmt22-comet-da` was part of the ensemble used to secure our result.

If you are interested in our work from this year please read the following paper:
- [CometKiwi: IST-Unbabel 2022 Submission for the Quality Estimation Shared Task](https://www.statmt.org/wmt22/pdf/2022.wmt-1.60.pdf)
- [COMET-22: Unbabel-IST 2022 Submission for the Metrics Shared Task](https://www.statmt.org/wmt22/pdf/2022.wmt-1.52.pdf)

And the corresponding findings papers:
- [Findings of the WMT 2022 Shared Task on Quality Estimation](https://www.statmt.org/wmt22/pdf/2022.wmt-1.3.pdf)
- [Results of WMT22 Metrics Shared Task: Stop Using BLEU – Neural Metrics Are Better and More Robust](https://www.statmt.org/wmt22/pdf/2022.wmt-1.2.pdf)

Special thanks to all the involved people: mtreviso nunonmg glushkovato chryssa-zrv jsouza DuarteMRAlves Catarinafarinha cmaroti

1.1.3

Same as v1.1.2 but we bumped some requirements in order to be easier to use COMET on Windows and Apple M1.

Page 2 of 3

Releases

Has known vulnerabilities

Previous Next

Unbabel-comet

Page 2 of 3

2.1.1

2.1.0

2.0.2

2.0.1

2.0.0

1.1.3

Page 2 of 3

Links

Releases