Major change - Variant Scoring
- Experts can now be configured with a variant scoring strategy. The new `VariantScoring` class, defined in `evo_prot_grad.common.variant_scoring`, takes a `scoring_strategy` argument; the currently supported values are `attribute_value`, `pseudolikelihood_ratio`, and `mutant_marginal` (new). Each expert holds an instance of `VariantScoring`.
- The main entry point for instantiating an expert, `get_expert`, now has a `scoring_strategy` argument for configuring the expert.
- The `use_without_wildtype` argument of the `Expert` class has been removed. Each scoring strategy normalizes the score with respect to the wildtype score, so this argument was superfluous. To instantiate an expert and use it outside of the `DirectedEvolution` class, you must explicitly call `expert.init_wildtype(wt_seq)` before calling the expert, so that the wildtype score is cached (see below).
- The private `Expert` method `_model_output_to_scalar_score` has been removed in favor of the public method `get_model_output`, which can be used to get expert scores for sequences directly.
- The `Expert` class no longer has a `wt_score` attribute. The wildtype score is now stored in the `VariantScoring` class (`wt_score_cache`).
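
The call order described above (cache the wildtype score first, then score variants relative to it) can be illustrated with a small, self-contained sketch. This is not the library's implementation: `ToyVariantScoring`, `ToyExpert`, `cache_wt_score`, and `count_alanines` are hypothetical stand-ins, and simple subtraction is only an assumed form of normalization. Only the names `init_wildtype` and `wt_score_cache` mirror the notes above.

```python
from typing import Callable, List, Optional


class ToyVariantScoring:
    """Hypothetical stand-in: holds the cached wildtype score."""

    def __init__(self) -> None:
        self.wt_score_cache: Optional[float] = None

    def cache_wt_score(self, wt_score: float) -> None:
        self.wt_score_cache = wt_score

    def __call__(self, variant_score: float) -> float:
        # Refuse to score until the wildtype score has been cached.
        if self.wt_score_cache is None:
            raise RuntimeError("call init_wildtype(wt_seq) before scoring")
        # Assumed normalization: difference from the wildtype score.
        return variant_score - self.wt_score_cache


class ToyExpert:
    """Hypothetical expert wrapping an arbitrary sequence-scoring function."""

    def __init__(self, raw_score: Callable[[str], float]) -> None:
        self.raw_score = raw_score
        self.variant_scoring = ToyVariantScoring()

    def init_wildtype(self, wt_seq: str) -> None:
        # Score the wildtype once and cache the result.
        self.variant_scoring.cache_wt_score(self.raw_score(wt_seq))

    def __call__(self, seqs: List[str]) -> List[float]:
        return [self.variant_scoring(self.raw_score(s)) for s in seqs]


def count_alanines(seq: str) -> float:
    """Toy scoring function used in place of a real model."""
    return float(seq.count("A"))


expert = ToyExpert(count_alanines)
expert.init_wildtype("MKAV")
scores = expert(["MKAV", "MAAV", "MKGV"])
print(scores)
```

In this sketch the wildtype itself scores zero, and calling the expert before `init_wildtype` raises an error rather than silently returning unnormalized scores.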
Minor changes
- The `Expert` abstract class now publicly exposes the following methods: `init_wildtype`, which stores the wildtype sequence string and caches the wildtype score; `tokenize`, which tokenizes a sequence; and `get_model_output`, which accepts a list of protein sequence strings and returns the one-hot encoded sequences and the expert model's predictions.
- Renamed `experts.base_experts.HuggingFaceExpert` to `experts.base_experts.ProteinLMExpert`
- Improved error message reporting for `get_expert`
- Upgraded `transformers[torch]` to `4.38.0`