Breaking changes
- `molgraph.layers`
- `molgraph.layers.DotProductIncident` no longer takes `apply_sigmoid` as an argument. Instead, it takes `normalize`, which specifies whether the dot product should be normalized, yielding cosine similarities (values between -1 and 1).
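For reference, normalizing both vectors to unit length before taking the dot product yields the cosine similarity. A minimal sketch in plain Python (independent of MolGraph's actual implementation):

```python
import math

def dot(u, v):
    # Plain dot product: an unbounded edge score.
    return sum(a * b for a, b in zip(u, v))

def normalized_dot(u, v):
    # Normalize both vectors to unit length first, so the result
    # is the cosine similarity, bounded to [-1, 1].
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))

print(normalized_dot([1.0, 0.0], [1.0, 0.0]))   # same direction -> 1.0
print(normalized_dot([1.0, 0.0], [-2.0, 0.0]))  # opposite direction -> -1.0
```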
- `molgraph.models`
- `GraphAutoEncoder` (GAE) and `GraphVariationalAutoEncoder` (GVAE) have changed. The default `loss` is now `None`, in which case a default loss function is used; it simply maximizes the positive edge scores and minimizes the negative edge scores. `predict` now returns the (positive) edge scores corresponding to the input `GraphTensor` instance. `get_config` now returns a dictionary, as expected. The default decoder is `molgraph.layers.DotProductIncident(normalize=True)`. Note: there is still more work to be done on GAE/GVAE, e.g., improving the `NegativeGraphSampler` and (for GVAE) improving the `beta` schedule.
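One common way to realize "maximize positive edge scores, minimize negative edge scores" is a binary cross-entropy on sigmoid-transformed scores. The sketch below is a hypothetical illustration of that idea, not MolGraph's actual default loss:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def edge_reconstruction_loss(pos_scores, neg_scores):
    # Push positive (existing) edge scores up and negative
    # (sampled non-edge) scores down, via binary cross-entropy.
    pos_term = -sum(math.log(sigmoid(s)) for s in pos_scores)
    neg_term = -sum(math.log(1.0 - sigmoid(s)) for s in neg_scores)
    return (pos_term + neg_term) / (len(pos_scores) + len(neg_scores))

# Confident, correct scores give a small loss; reversed scores a large one.
good = edge_reconstruction_loss([4.0, 5.0], [-4.0, -5.0])
bad = edge_reconstruction_loss([-4.0, -5.0], [4.0, 5.0])
print(good < bad)  # True
```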
- `molgraph.tensors`
- `GraphTensor.propagate()` now removes the `edge_weight` data component, as it has already been used.
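The rationale: the edge weights are consumed during propagation (the weighted aggregation over neighbors), so keeping them afterwards would be misleading. A simplified, MolGraph-independent sketch of weighted propagation over scalar node features:

```python
def propagate(node_feats, edge_src, edge_dst, edge_weight=None):
    # Aggregate (weighted) source-node features into destination nodes.
    out = [0.0] * len(node_feats)
    for i, (src, dst) in enumerate(zip(edge_src, edge_dst)):
        w = 1.0 if edge_weight is None else edge_weight[i]
        out[dst] += w * node_feats[src]
    # edge_weight has now been used up; reusing it on the propagated
    # features would double-apply it.
    return out

feats = [1.0, 2.0, 3.0]
src, dst = [0, 1, 2], [1, 2, 0]
print(propagate(feats, src, dst, edge_weight=[0.5, 0.5, 0.5]))
# [1.5, 0.5, 1.0]
```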
Major features and improvements
- `molgraph.models`
- `GraphMasking` (alias: `MaskedGraphModeling`) is now implemented. Like the autoencoders, this model pretrains an encoder; but instead of predicting links between nodes, it predicts randomly masked node and edge features. (It currently only works with tokenized node and edge features, via `chemistry.Tokenizer`.) This pretraining strategy is inspired by BERT for language modeling.
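The BERT-style masking step can be sketched as follows. This is a hypothetical illustration (the `MASK_ID` constant and `mask_tokens` helper are assumptions, not MolGraph's API):

```python
import random

MASK_ID = 0  # hypothetical token id reserved for the mask token

def mask_tokens(token_ids, rate=0.15, seed=42):
    # Randomly replace a fraction of token ids with MASK_ID; the model
    # is then trained to predict the original ids at those positions.
    rng = random.Random(seed)
    masked, targets = [], {}
    for pos, tok in enumerate(token_ids):
        if rng.random() < rate:
            targets[pos] = tok      # ground truth for the loss
            masked.append(MASK_ID)
        else:
            masked.append(tok)
    return masked, targets

tokens = [7, 3, 9, 4, 8, 2, 6, 5]
masked, targets = mask_tokens(tokens, rate=0.5)
print(masked, targets)
```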
Bug fixes
- `molgraph.layers`
- `from_config` now works as expected for all GNN layers. Consequently, `gnn_model.from_config(gnn_model.get_config())` now works.
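The contract being fixed is the Keras config round trip: `get_config` must return a plain dictionary of constructor arguments so that `from_config` can rebuild an equivalent layer. A generic sketch of the pattern (not the actual layer code; `MyGNNLayer` is a made-up name):

```python
class Layer:
    @classmethod
    def from_config(cls, config):
        # Default Keras-style behavior: pass the config back
        # to the constructor as keyword arguments.
        return cls(**config)

class MyGNNLayer(Layer):
    def __init__(self, units=32, activation="relu"):
        self.units = units
        self.activation = activation

    def get_config(self):
        # Must be a plain dict of constructor kwargs so that
        # from_config(get_config()) reconstructs an equivalent layer.
        return {"units": self.units, "activation": self.activation}

layer = MyGNNLayer(units=64)
clone = MyGNNLayer.from_config(layer.get_config())
print(clone.units, clone.activation)  # 64 relu
```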
Minor features and improvements
- `molgraph.layers`
- `_build_from_vocabulary_size()` has been removed from `EmbeddingLookup`; `self.embedding` is now created in `adapt()` or `build()` instead.
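A simplified sketch of the new behavior: the embedding table is created lazily once the vocabulary size is known (e.g., after `adapt()`), with no separate build helper. This is an illustrative stand-in, not the real layer:

```python
import random

class EmbeddingLookup:
    def __init__(self, dim=4, seed=0):
        self.dim = dim
        self.seed = seed
        self.vocab = {}
        self.embedding = None  # created lazily in adapt()/build()

    def adapt(self, tokens):
        # Learn the vocabulary from data, then build the table.
        for tok in tokens:
            self.vocab.setdefault(tok, len(self.vocab))
        self.build()

    def build(self):
        # One random row per vocabulary entry; no separate
        # _build_from_vocabulary_size() helper needed.
        rng = random.Random(self.seed)
        self.embedding = [
            [rng.uniform(-1, 1) for _ in range(self.dim)]
            for _ in self.vocab
        ]

    def __call__(self, tokens):
        return [self.embedding[self.vocab[t]] for t in tokens]

lookup = EmbeddingLookup(dim=2)
lookup.adapt(["C", "N", "O", "C"])
print(len(lookup.embedding))  # 3 vocabulary entries
```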