pyspi

Latest version: v1.0.3


1.0.3

SPI Reproducibility Fix
_pyspi_ `v1.0.3` is a patch update that addresses inconsistent results observed in several information-theoretic and Convergent Cross-Mapping (`ccm`) SPIs when running multiple trials on the same benchmarking dataset. As of this update, all 284 SPIs should produce identical results across repeated runs on the same dataset.

**Key Changes**:
- Upgraded [jidt](https://github.com/jlizier/jidt/) dependency from `v1.5` to `v1.6.1`. `jidt v1.6.1` includes a new `NOISE_SEED` property for all jidt calculators, enabling consistent results across multiple runs. For more information, see [here](https://github.com/jlizier/jidt/issues/99). Since jidt is self-contained within the _pyspi_ package, upgrading the jidt version should not introduce any breaking changes for users who have already installed pyspi.
- Added random seed support within _pyspi_ for jidt-based SPIs. All SPIs that rely on the jidt library now utilise a fixed random seed to ensure reproducibility across runs (see the sketch after this list).
- Introduced random seed for Convergent Cross-Mapping (`ccm`) SPI. The `ccm` SPI now uses a fixed random seed, addressing the previously observed stochasticity in its outputs.
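
For illustration only, this is roughly how a fixed noise seed can be set on a jidt estimator through JPype. The jar path, calculator class, and seed value here are assumptions for this sketch; _pyspi_ handles all of this internally:

```python
import jpype

# Start the JVM with the jidt jar on the classpath
# (the path is an assumption for this sketch)
jpype.startJVM(classpath=["infodynamics.jar"])

# Every jidt calculator exposes setProperty(); NOISE_SEED fixes the RNG
# used for the small noise added to observations (calculator class chosen
# purely for illustration)
CalcClass = jpype.JClass(
    "infodynamics.measures.continuous.kraskov.MutualInfoCalculatorMultiVariateKraskov1"
)
calc = CalcClass()
calc.setProperty("NOISE_SEED", "42")
```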

**Important Note to Users:**
The addition of fixed random seeds for the _affected SPIs_ may result in slightly different output values compared to previous versions of _pyspi_. This is an expected consequence of making the SPI outputs consistent and reproducible. Please keep this in mind when making exact numerical comparisons with previous versions of _pyspi_.

**Affected SPIs**:
The following SPIs, which previously produced varying outputs across multiple trials, should now yield consistent results (a quick verification sketch follows the list):
- `ccm` (all 9 estimators)
- `cce_kozachenko`
- `ce_kozachenko`
- `di_kozachenko`
- `je_kozachenko`
- `si_kozachenko_k-1`
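
As a quick sanity check on your own data, you can compute the same SPI twice and compare the outputs. A minimal sketch, assuming a small synthetic dataset and using `ce_kozachenko` as a stand-in for any of the SPIs above:

```python
import numpy as np
from pyspi.calculator import Calculator

# Small synthetic dataset: 3 processes, 100 observations each
data = np.random.randn(3, 100)

def run_once():
    # Default SPI set; computing all SPIs can take a while, so a
    # filtered config may be preferable in practice
    calc = Calculator(dataset=data)
    calc.compute()
    return calc.table['ce_kozachenko']

# With fixed seeds, repeated runs on identical data should match exactly
assert run_once().equals(run_once())
```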

1.0.2

New SPI - Gromov-Wasserstein Distance (GWτ)
This patch update introduces a new distance-based SPI, GWτ (called `gwtau` in _pyspi_). An in-depth tutorial for incorporating new SPIs into the existing _pyspi_ framework, using `gwtau` as a prototypical example, is now available in the [documentation](https://time-series-features.gitbook.io/pyspi/development/contributing-to-pyspi/adding-new-spis-to-pyspi).

What is it?
Based on the algorithm proposed by [Kravtsova et al. (2023)](https://doi.org/10.1007/s11538-023-01175-y), GWτ is a new distance measure for comparing time series data, especially suited for biological applications. It works by representing each time series as a metric space and computing the distances from the start of each time series to every point. These distance distributions are then compared using the [Wasserstein](https://en.wikipedia.org/wiki/Wasserstein_metric) distance, which finds the optimal way to match the distances between two time series, making it robust to shifts and perturbations. The "tau" in GWτ emphasises that this distance measure is based on comparing the distributions of distances from the root (i.e., the starting point) to all other points in each time series, which is analogous to comparing the branch lengths in two tree-like structures. GWτ can be computed efficiently and is scalable.
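
To make the idea concrete, here is an illustrative sketch of a GWτ-style comparison between two univariate series, using SciPy's one-dimensional Wasserstein distance on the root-to-point distance profiles. This is a sketch of the concept only, not _pyspi_'s implementation; see the paper for the exact algorithm:

```python
import numpy as np
from scipy.stats import wasserstein_distance

def gwtau_sketch(x: np.ndarray, y: np.ndarray) -> float:
    """Illustrative GWtau-style distance between two 1D time series:
    reduce each series to its distribution of distances from the first
    point (the 'root'), then compare the two distributions with the
    1D Wasserstein distance."""
    root_dists_x = np.abs(x - x[0])
    root_dists_y = np.abs(y - y[0])
    return wasserstein_distance(root_dists_x, root_dists_y)

# A vertically shifted copy of a signal has an identical root-distance
# profile, so the sketch distance is zero (robustness to shifts)
t = np.linspace(0, 2 * np.pi, 200)
print(gwtau_sketch(np.sin(t), np.sin(t) + 0.5))
```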

How can I use it?
Currently, both the default (`subset = all`) SPI set and the fast (`subset = fast`) subset include `gwtau`, so you do not have to do anything unless you would like to compute `gwtau` in isolation. Simply instantiate the calculator object and compute SPIs as usual. You can access the matrix of pairwise interactions for `gwtau` using its identifier in the results table:

```python
from pyspi.calculator import Calculator

calc = Calculator(dataset=...)  # supply your dataset here
calc.compute()

# Matrix of pairwise interactions for gwtau
gwtau_results = calc.table['gwtau']
```


For technical details about the specific implementation of `gwtau`, such as theoretical properties of this distance measure, see the original paper by [Kravtsova et al. (2023)](https://doi.org/10.1007/s11538-023-01175-y). You can also find the original implementation of the algorithm in MATLAB in this [GitHub repository](https://github.com/kravtsova2/GWtau).

1.0.1

Bug Fixes

File location handling improvement for the `filter_spis` function:
- Modified the `filter_spis` function to allow the user to specify the exact location of the source config YAML file.
- Implemented a default file mechanism: if the user does not specify a file, the function falls back to the pre-defined `config.yaml` located in the script's directory as the source file.
- Updated unit tests to reflect the changes.

1.0.0

This major release (1.0.0) brings several updates to _pyspi_ including optional dependency checks and the ability to filter SPIs based on keywords.

Highlights of this release
- **SPI Filtering**: A new `filter_spis` function has been added to the `pyspi.utils` module. This function allows users to create subsets of SPIs based on keywords (e.g., "linear", "non-linear"). It takes three arguments:
  - `keywords`: a list of one or more labels to filter the SPIs, e.g., `["linear", "signed"]`.
  - `output_name`: the name of the output YAML file, defaulting to `{random_string}_config.yaml` if no name is provided.
  - `configfile`: the path to the source config file. If no `configfile` is provided, it defaults to `config.yaml` in the pyspi directory.

**Example usage**:

```python
from pyspi.utils import filter_spis

# Using the default config.yaml as the source file;
# saves linear_signed.yaml in the current working directory
filter_spis(keywords=["linear", "signed"], output_name="linear_signed")

# Or using a user-specified config file as the source
filter_spis(keywords=["linear", "signed"], output_name="linear_signed",
            configfile="myconfig.yaml")
```

A new YAML file is saved in the current working directory with the filtered subset of SPIs. This filtered config file can be loaded into the Calculator object using the `configfile` argument, as would be the case for a typical custom YAML file (see the docs for more info):

```python
calc = Calculator(configfile="./linear_signed.yaml")
```

- **Optional Dependency Checks**: When instantiating a Calculator object, _pyspi_ now automatically checks for optional dependencies (Java and Octave). If any are missing, the user is notified of which SPIs will be excluded and which dependency each exclusion is due to. The user can then choose to proceed with a reduced set of SPIs or install the missing dependencies.
- **Restructured SPI Config File**: The SPI configuration YAML file has been restructured to include the following keys for each base SPI:
  - `labels`: base-SPI-specific labels (e.g., linear, non-linear, signed) that can be used by the filter function to create user-specified subsets of SPIs.
  - `dependencies`: external/system dependencies required by the base SPI (e.g., Octave for integrated information SPIs).
  - `configs`: estimator settings and configurations, e.g., `EmpiricalCovariance` for the Covariance base SPI.

**Example YAML**:
Here is an example of how the `phi_star_t1_norm-0` SPI would be specified:

```yaml
IntegratedInformation:
  labels:
    - undirected
    - nonlinear
    - unsigned
    - bivariate
    - time-dependent
  dependencies:
    - octave
  configs:
    - phitype: "star"
```

Breaking Changes
This major version release introduces breaking changes for users who rely on custom SPI subsets (i.e., custom YAML files). Users relying on the pyspi default and pre-defined subsets are unaffected by these changes.
- The `octaveless` subset has been removed, as it is no longer necessary due to the automatic dependency checks. Users without Octave installed can now run pyspi without specifying `octaveless` as a subset in the Calculator object.
- Users who want to define a custom subset of SPIs should follow the new guide in the documentation to ensure their custom YAML file conforms to the new structure, with `labels`, `dependencies`, and `configs` as keys.

Migration Guide
If you are an existing user of pyspi and have custom SPI subsets (custom YAML files), follow these steps to migrate to the new version:
1. Review the updated structure of the SPI configuration YAML file (see the example above), which now includes `labels`, `dependencies`, and `configs` keys for each base SPI.
2. Update your custom YAML files to match the new structure.
3. If you were previously using the `octaveless` subset, you no longer need to specify it when instantiating the Calculator object. The dependency checks will automatically exclude Octave-dependent SPIs if Octave is not installed.

For more detailed instructions and examples, refer to the updated [documentation](https://time-series-features.gitbook.io/pyspi/).

Documentation
The pyspi documentation has been updated to reflect the new features and changes introduced in this release. You can find the latest documentation [here](https://time-series-features.gitbook.io/pyspi/).

Testing
- Added unit tests for the new `filter_spis` function.
- Added unit tests for the CalculatorFrame and CorrelationFrame.
- Updated the GitHub Actions workflow file to use the latest checkout and Python setup actions.

0.4.2

Introduction
This patch release brings a few minor updates, including a new high-contrast logo for dark-mode users, improved SPI unit testing (with a new benchmarking dataset), and fixes for potential security vulnerabilities.

Highlights of this release
- New high-contrast logo for dark-mode users.
- Improved SPI unit testing with a z-scoring approach to flag SPIs with differing outputs.
- New coupled map lattice (CML) benchmarking dataset.
- Fix for potential security vulnerability issues in scikit-learn.

What's Changed
- Replaced the old `standard_normal.npy` benchmarking dataset with a coupled map lattice (`cml7.npy`), along with its associated .pkl file containing the benchmark values (`CML7_benchmark_tables.pkl`) generated in a fresh Ubuntu environment.
- Updated the README to automatically select either the regular or new dark mode logo based on the user's theme.
- Added new `conftest.py` file for pytest to customise the unit testing outputs.
- Added a new `pyproject.toml` file for configuring the package for publishing to PyPI.

New features
- Improved SPI unit testing with a new coupled map lattice benchmarking dataset (`cml7.npy`) consisting of 7 processes and 100 observations per process.
- Z-scoring approach in the unit testing pipeline to flag potential changes in SPI outputs resulting from algorithmic changes. SPIs with outputs differing from the benchmark by more than a specified threshold are "flagged" and summarised in a table (a conceptual sketch follows this list).
- Added a dark-mode _pyspi_ logo to the README, which is shown for users with the dark-mode GitHub theme.
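
Conceptually, the flagging logic is a z-score threshold on each SPI's output relative to the stored benchmark. A hedged sketch, where the function name, table layout, and threshold are assumptions rather than the actual test code:

```python
def flag_spis(new_outputs, benchmark_means, benchmark_stds, threshold=3.0):
    """Flag SPIs whose new output deviates from the benchmark mean by
    more than `threshold` standard deviations (all inputs are dicts
    keyed by SPI identifier)."""
    flagged = {}
    for spi, value in new_outputs.items():
        sigma = benchmark_stds[spi]
        z = 0.0 if sigma == 0 else (value - benchmark_means[spi]) / sigma
        if abs(z) > threshold:
            flagged[spi] = z
    return flagged
```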

Bug Fixes
- Fixed a scikit-learn security vulnerability issue with severity "high" (pertaining to denial of service) by upgrading scikit-learn from version `0.24.1` to version `1.0.1`.
- Fixed the `Int64Index` deprecation issue (cannot import name `Int64Index` from `pandas`) by pinning pandas to version `1.5.0`.
- Fixed an unknown-character issue for Windows users, which resulted from not specifying an encoding when loading the README in `setup.py`. The encoding is now fixed to `utf-8` for consistency across platforms.

0.4.1

Introduction
_pyspi_ `v0.4.1` introduces several minor changes to the existing README, as well as migrating the documentation from "readthedocs" to an all-new "GitBook" page. Simple unit testing has also been incorporated for each of the SPIs, using a benchmarking dataset to check the consistency of outputs.

What's Changed
- Removal of the old `/docs` directory.
- Addition of a `/tests` directory for unit testing.
- Updated README.
- Addition of `CODE_OF_CONDUCT.md` and `SECURITY.md`.

New features
- Basic unit testing incorporated into a GitHub Actions workflow.
- Updated README file with links to the new GitBooks hosted documentation to replace the old "readthedocs" documentation.
- Added a code of conduct markdown (`CODE_OF_CONDUCT.md`).
- Added a security policy markdown (`SECURITY.md`).

Bug Fixes
- Fixed a PyTorch security vulnerability issue with severity "critical" (pertaining to arbitrary code execution) by updating torch from version `1.10.0` to `1.13.1`.
