PyDGN

Latest version: v1.6.0


1.2.3

At the moment, the entire graph must fit in CPU/GPU memory. Extending `DataProvider` to partition the graph using PyG should not be difficult.

Added

- New splitter, `SingleGraphSplitter`, which randomly splits nodes in a single graph (with optional stratification)
- New provider, `SingleGraphDataProvider`, which adds mask fields to the single DataBatch object (representing the graph)
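The node-splitting behavior described above can be sketched in plain Python. This is a minimal illustration of the idea (random train/validation split of node indices, with optional stratification by node label), not PyDGN's actual implementation; the function name and signature are assumptions for this example.

```python
import random
from collections import defaultdict

def split_nodes(node_targets, train_frac=0.8, stratify=False, seed=42):
    """Randomly split the node indices of a single graph into train/validation.

    Hypothetical sketch: when `stratify` is True, the split preserves the
    class proportions of `node_targets` (one label per node).
    """
    rng = random.Random(seed)
    indices = list(range(len(node_targets)))
    if not stratify:
        rng.shuffle(indices)
        cut = int(train_frac * len(indices))
        return indices[:cut], indices[cut:]
    # Stratified variant: shuffle and cut within each class, then merge.
    by_class = defaultdict(list)
    for idx in indices:
        by_class[node_targets[idx]].append(idx)
    train, val = [], []
    for cls_indices in by_class.values():
        rng.shuffle(cls_indices)
        cut = int(train_frac * len(cls_indices))
        train.extend(cls_indices[:cut])
        val.extend(cls_indices[cut:])
    return train, val
```

The resulting index lists would then be turned into boolean mask fields on the single `DataBatch` object by the provider.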

Changed

- Renamed `Splitter.get_graph_targets` to `get_targets` and generalized its behavior

1.2.2

Added

- Telegram bot support: specify a Telegram configuration in a YAML file and let the framework do the rest. Just remember not to push your Telegram config file to your repository!
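A config file of this kind might look as follows. Note this is an illustrative sketch only: the key names below are placeholders, and the actual schema is defined in the PyDGN documentation.

```yaml
# Illustrative placeholder keys -- consult the PyDGN docs for the real schema.
bot_token: "<YOUR_BOT_TOKEN>"   # issued by Telegram's @BotFather
chat_id: "<YOUR_CHAT_ID>"       # the chat that receives the notifications
```

Since the token grants control of the bot, keeping this file out of version control (e.g. via `.gitignore`) is the reason for the warning above.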

1.2.1

Fixed

- A change introduced in the splitter caused the seed to be reset whenever splits were loaded during each experiment. Fixed by setting the seed only when `split()` is called.
- Minor fix in `num_features` of `OGBGDatasetInterface`
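The seed fix above follows a common pattern: consume the stored seed only when new splits are generated, so that merely loading precomputed splits leaves the random state untouched. A minimal sketch (class and method names simplified from the real `Splitter`):

```python
import random

class Splitter:
    """Sketch of the fix: the seed is applied only when splits are
    generated, never when they are loaded from disk."""

    def __init__(self, seed):
        self.seed = seed  # stored here, but NOT applied here

    def split(self, num_samples):
        # Seeding happens only at split-generation time.
        rng = random.Random(self.seed)
        indices = list(range(num_samples))
        rng.shuffle(indices)
        return indices

    def load(self, precomputed_splits):
        # Loading must not touch any random state, or it would silently
        # reset the RNG for the rest of the experiment.
        return precomputed_splits
```

Using a local `random.Random(seed)` instead of the global `random.seed(...)` also avoids perturbing other components that share the global RNG.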

1.2.0

Changed

- Simplified metrics to either take the mean score over batches or to compute epoch-wise scores (default behavior).
In the former case, the result may depend on the batch size, especially for scores such as micro-AP.
Use the batch-wise mean only when computing the score over the whole epoch at once is too expensive in terms of RAM/GPU memory.
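The batch-size dependence is easy to see with a toy example. Below, micro-style accuracy (used here as a stand-in for micro-AP) is computed both ways over two batches of unequal size; the mean of per-batch scores weights each batch equally, while the epoch-wise score weights each sample equally.

```python
def accuracy(preds, targets):
    # Fraction of correct predictions (micro-averaged over samples).
    return sum(p == t for p, t in zip(preds, targets)) / len(preds)

# Two batches of unequal size: (predictions, targets).
batches = [
    ([1], [1]),              # 1 correct out of 1
    ([0, 0, 0], [1, 1, 1]),  # 0 correct out of 3
]

# Mean over per-batch scores (cheaper, but biased by batch size).
batch_mean = sum(accuracy(p, t) for p, t in batches) / len(batches)  # 0.5

# Epoch-wise score over all predictions (the default behavior).
all_preds = [p for batch in batches for p in batch[0]]
all_targets = [t for batch in batches for t in batch[1]]
epoch_score = accuracy(all_preds, all_targets)  # 0.25
```

Here the two strategies disagree (0.5 vs 0.25), which is why epoch-wise computation is the default.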

Fixed

- Bug in the splitter: the seed was not set properly, so different executions led to different splits. This is not a problem as long as the splits are publicly released after the experiments (which is always the case).
- Minor fix in the data loader workers for iterable datasets

1.1.0

Added

- Temporal learning routines (with documentation); works with single graph sequences
- Template showing how to use PyDGN on a cluster (see `cluster_slurm_example.sh`) - launch using `sbatch cluster_slurm_example.sh`. **Disclaimer**: you must have experience with Slurm; the script does not work out of the box and its settings must be adjusted to your system.

Fixed

- Removed a method from `OGBGDatasetInterface` that broke the data split generation phase
- Added `**kwargs` to all datasets

Changed

- Extended behavior of ``TrainingEngine`` to allow for null target values and some temporal bookkeeping (allows a lot of code reuse).
- Now ``batch_loss`` and ``batch_score`` in the ``State`` object are initialized to ``None`` before training/evaluation of a new batch starts. This could have been a problem in the temporal setting, where we want to accumulate results for different snapshots.
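The reset-to-``None`` bookkeeping described above can be sketched as follows. This is a simplified stand-in for the real ``State`` object (the method names here are hypothetical), showing why starting each batch from ``None`` prevents stale values from a previous batch leaking into the accumulation of temporal snapshots.

```python
class State:
    """Simplified sketch of per-batch bookkeeping in a training engine."""

    def __init__(self):
        self.batch_loss = None
        self.batch_score = None
        self.epoch_losses = []

    def start_batch(self):
        # Reset before each batch, so a snapshot loop always starts clean.
        self.batch_loss = None
        self.batch_score = None

    def accumulate(self, snapshot_loss):
        # In the temporal setting, several snapshots contribute to one batch.
        self.batch_loss = snapshot_loss if self.batch_loss is None \
            else self.batch_loss + snapshot_loss

    def end_batch(self):
        if self.batch_loss is not None:
            self.epoch_losses.append(self.batch_loss)
```

Without the reset in ``start_batch``, the second batch would keep accumulating on top of the first batch's loss.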

1.0.9

We provide an implementation of iterable-style datasets, for cases where the dataset does not fit into main memory and
is stored across multiple files on disk. If you don't overwrite the ``__iter__`` function, we assume data splitting is performed at
file level, rather than at sample level. Each file can in fact contain a list of ``Data`` objects, which will be streamed
sequentially. Variations are possible, depending on your application, but you can use this new dataset class as a good starting point.
If you do, be careful to test it together with the iterable versions of the data provider, engine, and engine callback.
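The file-level streaming idea can be sketched in a few lines. This is a dependency-free illustration, not PyDGN's actual class: the class name is hypothetical, and `load_fn` stands in for whatever deserializes one shard file (e.g. `torch.load` in practice) into a list of `Data` objects.

```python
class FileStreamingDataset:
    """Sketch of an iterable-style dataset: each shard file holds a list of
    samples, the split is taken at file level, and the samples of the files
    assigned to this split are streamed sequentially."""

    def __init__(self, shard_files, load_fn):
        self.shard_files = shard_files  # the files assigned to this split
        self.load_fn = load_fn          # deserializes one file into a list

    def __iter__(self):
        for path in self.shard_files:
            # Stream each file's samples one by one, never holding more
            # than one shard in memory at a time.
            for sample in self.load_fn(path):
                yield sample
```

Because splitting happens by assigning whole files to each fold, two splits never share a shard; overriding ``__iter__`` is what you would do to move to sample-level logic instead.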

Added

- Implemented an Iterable Dataset inspired by the [WebDataset](https://github.com/webdataset/webdataset) interface
- Similarly, added ``DataProvider``, ``Engine`` and ``EngineCallback`` classes for the Iterable-style datasets.

Changed

- Additional arguments can now be passed to the dataset at runtime
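Combined with the ``**kwargs`` added to all datasets, runtime argument forwarding might look like the following sketch. The names `ToyDataset` and `build_dataset` are illustrative, not part of PyDGN's API.

```python
class ToyDataset:
    """Hypothetical dataset: unknown keyword arguments are absorbed by
    **kwargs, so callers can pass extra options without breaking it."""

    def __init__(self, root, transform=None, **kwargs):
        self.root = root
        self.transform = transform
        self.extra = kwargs  # anything extra supplied at runtime

def build_dataset(dataset_cls, root, **runtime_kwargs):
    # A framework would forward runtime_kwargs to the dataset untouched.
    return dataset_cls(root, **runtime_kwargs)
```

This lets experiment configs tweak dataset behavior without changing the dataset class itself.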
