The v2.1 release adds the uncertainty quantification modules, covering estimation, calibration, and evaluation (#937). For more details on uncertainty quantification in Chemprop, please refer to the [documentation](https://chemprop.readthedocs.io/en/main/tutorial/cli/predict.html#uncertainty-quantification) and the [example notebook](https://chemprop.readthedocs.io/en/main/uncertainty.html). Additionally, we switched the loss functions and metrics to `torchmetrics` (#1022). As part of this change, the reported "val_loss" is now calculated the same way as the training loss so that the two are comparable (#1020). We also changed Chemprop to use replicates instead of cross validation (#994), and batch normalization is now disabled by default (#1058).
Core code changes
* The `validation_loss_function` is removed in #1023
* Batch norm is disabled by default in #1058
* A new predictor, `QuantileFFN`, is added in #963
* `BinaryDirichletLoss` and `MulticlassDirichletLoss` are integrated into `DirichletLoss` in #1066
* The split types `CV` and `CV_NO_VAL` are removed in #994
* A model's list of metrics is now registered as child modules in #1020
CLI changes
* Batch norm is disabled by default and can be turned on with `--batch-norm` #1058
* Many CLI flags related to uncertainty quantification are added #1010
* Quantile regression is now supported via `-t regression-quantile` #963
* Cross validation (CV) is replaced with replicates. The number of replicates can be specified via `--num-replicates`, and the flag `--num-folds` is deprecated #994
* `--tracking-metric` is added to select the metric tracked for early stopping and checkpointing #1020
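As a rough illustration of how the flags above combine, the following sketch assembles a training command as an argument list. It only builds and prints the command; it does not run Chemprop. The `chemprop train` subcommand, the `--data-path` option, and the file name `data.csv` are assumptions for illustration and are not taken from these notes. The helper `build_train_command` is hypothetical.

```python
import shlex


def build_train_command(data_path, num_replicates=1, batch_norm=False,
                        task_type=None, tracking_metric=None):
    """Assemble an argument list using the v2.1 flags listed above.

    Hypothetical helper: `chemprop train` and `--data-path` are assumed here.
    """
    cmd = ["chemprop", "train", "--data-path", data_path,
           "--num-replicates", str(num_replicates)]
    if batch_norm:
        # Batch norm is off by default in v2.1; this flag turns it back on.
        cmd.append("--batch-norm")
    if task_type is not None:
        cmd += ["-t", task_type]
    if tracking_metric is not None:
        cmd += ["--tracking-metric", tracking_metric]
    return cmd


cmd = build_train_command(
    "data.csv", num_replicates=5, batch_norm=True, task_type="regression-quantile"
)
print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each flag and value separate, which makes it straightforward to pass to `subprocess.run` later if desired.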
New notebooks
* A notebook showing interoperability of the Chemprop featurizer with other libraries (DGL and PyG) #1063
* Active learning #910
* Uncertainty quantification #1071
CI/CD
* Ray can be tested on Python 3.12 #1064
* `USE_LIBUV: 0` is added into the CI workflow #1065
Backwards Compatibility Note
Models trained with v2.0 will not load properly in v2.1 because the loss functions file was moved. A conversion script is provided to convert a v2.0 model into one compatible with v2.1. Its usage is `python chemprop/utils/v2_0_to_v2_1.py <v2_0.pt> <v2_1.pt>`.
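If several v2.0 checkpoints need converting, the script invocation above can be generated per file. The sketch below only constructs the commands; the helper `conversion_commands`, the `_v2_1` output suffix, and the example paths are hypothetical choices for illustration, and the script path is the one given above, relative to the repository root.

```python
import sys
from pathlib import Path

# Script path as given in the note above (relative to the repository root).
SCRIPT = "chemprop/utils/v2_0_to_v2_1.py"


def conversion_commands(checkpoint_paths, out_dir):
    """Return one conversion command per v2.0 checkpoint.

    Hypothetical helper: each output is written to out_dir as <name>_v2_1.pt.
    """
    out_dir = Path(out_dir)
    commands = []
    for src in map(Path, checkpoint_paths):
        dst = out_dir / f"{src.stem}_v2_1.pt"
        commands.append([sys.executable, SCRIPT, str(src), str(dst)])
    return commands


commands = conversion_commands(["models/a.pt", "models/b.pt"], "converted")
for command in commands:
    print(" ".join(command))
```

Each command could then be executed with `subprocess.run(command, check=True)` from the repository root, assuming the output directory already exists.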
`data.make_split_indices` now always returns a nested list. Previously it returned a nested list only for cross validation. We encourage you to use `data.make_split_indices(num_replicates=X)`, where `X` is some number greater than 1, to train on multiple splits of your data and get a better idea of the performance of your architecture. If you do use only one replicate, you will need to unnest the list like so:
```python
train_indices, val_indices, test_indices = data.make_split_indices(mols)
train_data, val_data, test_data = data.split_data_by_indices(
    all_data, train_indices, val_indices, test_indices
)
train_data, val_data, test_data = train_data[0], val_data[0], test_data[0]
```
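When more than one replicate is used, the nested lists can be iterated in parallel instead of unnested. The sketch below uses a toy stand-in, `toy_split_indices`, that mimics only the shape of the return value described above (three lists, each holding one index list per replicate). It is hypothetical and is not Chemprop's splitting logic; the per-replicate seeding is an assumption made for the toy.

```python
import random


def toy_split_indices(n_items, num_replicates=1, sizes=(0.8, 0.1, 0.1), seed=0):
    """Hypothetical stand-in that mimics the nested return shape only.

    Returns (train, val, test), where each is a list with one index list
    per replicate.
    """
    train, val, test = [], [], []
    n_train = int(sizes[0] * n_items)
    n_val = int(sizes[1] * n_items)
    for rep in range(num_replicates):
        idx = list(range(n_items))
        # Assumption for this toy: a different shuffle for each replicate.
        random.Random(seed + rep).shuffle(idx)
        train.append(idx[:n_train])
        val.append(idx[n_train:n_train + n_val])
        test.append(idx[n_train + n_val:])
    return train, val, test


train_indices, val_indices, test_indices = toy_split_indices(10, num_replicates=3)

# Walk the replicates in parallel rather than indexing with [0].
for rep, (tr, va, te) in enumerate(zip(train_indices, val_indices, test_indices)):
    print(f"replicate {rep}: {len(tr)} train / {len(va)} val / {len(te)} test")
```

The same `zip` pattern applies to the nested `train_data`, `val_data`, and `test_data` lists, so one model can be trained and evaluated per replicate.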
What's Changed
* change installed torch version on windows actions again by shihchengli in https://github.com/chemprop/chemprop/pull/1062
* .pt instead of .ckpt by twinbrian in https://github.com/chemprop/chemprop/pull/1060
* add ModelCheckpointing to training.ipynb so best model is used automatically by donerancl in https://github.com/chemprop/chemprop/pull/1059
* Add ray to tests on python 3.12 by KnathanM in https://github.com/chemprop/chemprop/pull/1064
* `v2.1` Feature: Replicates Instead of Cross Validation Folds by JacksonBurns in https://github.com/chemprop/chemprop/pull/994
* disable libuv with env var rather than avoiding latest torch by JacksonBurns in https://github.com/chemprop/chemprop/pull/1065
* Add new example notebook for active learning by joelnkn in https://github.com/chemprop/chemprop/pull/910
* Fix: splits column is a string not a list by KnathanM in https://github.com/chemprop/chemprop/pull/1074
* Update chemprop to v2.1 in https://github.com/chemprop/chemprop/pull/1038
  * This PR included the following PRs:
    * Rerun notebooks for v2.1 by KnathanM in #1067
    * Refactor with torchmetrics by KnathanM in #1022
    * update train docs for v2.1 by KnathanM in #1069
    * Disable batch norm by default by jonwzheng in #1058
    * Add notebook showing interoperability of Chemprop featurizer w/other libraries by jonwzheng in #1063
    * Add tracking metric options; make metrics ModuleList; other improvements by KnathanM in #1020
    * Remove old validate-loss-function function by KnathanM in #1023
* V2: Uncertainty implementation in https://github.com/chemprop/chemprop/pull/937
  * This PR included the following PRs:
    * Improve the docstring for uncertainty modules by shihchengli in #986
    * Add Platt calibrator by KnathanM in #961
    * Add dropout and ensemble predictors by joelnkn in #970
    * Add NLL and Spearman Uncertainty Evaluators by am2145 in #984
    * Add quantile regression by shihchengli in #963
    * Add miscalibration area and ence evaluators by shihchengli in #1012
    * Add isotonic calibrators by KnathanM in #1053
    * V2 conformal calibrators by shihchengli in #989
    * V2 conformal evaluators by shihchengli in #1005
    * Uncertainty regression calibrators (non-conformal) by shihchengli in #1055
    * Adding Evidential, MVE, and Binary Dirichlet Uncertainty Predictors by akshatzalte in #1061
    * Cleanup the uncertainty modules by shihchengli in #1072
    * Multiclass dirichlet give uncertainty by KnathanM in #1066
    * Rename uncertainty estimator by KnathanM in #1070
    * Update uncertainty notebook by shihchengli in #1071
    * Add uncertainty quantification to the predict CLI by shihchengli in #1010
**Full Changelog**: https://github.com/chemprop/chemprop/compare/v2.0.5...v2.1.0