This major release (1.0.0) brings several updates to _pyspi_ including optional dependency checks and the ability to filter SPIs based on keywords.
Highlights of this release
- **SPI Filtering**: A new `filter_spis` function has been added to the `pyspi.utils` module. It lets users create subsets of SPIs based on keywords (e.g., "linear", "nonlinear"). It takes three arguments:
  - `keywords`: a list of one or more labels used to filter the SPIs, e.g., `["linear", "signed"]`.
  - `output_name`: the name of the output YAML file, defaulting to `{random_string}_config.yaml` if no name is provided.
  - `configfile`: the path to the source config file. If none is provided, the `config.yaml` in the pyspi directory is used.
**Example usage**:
```python
from pyspi.utils import filter_spis

# Using the default config.yaml as the source file
filter_spis(keywords=["linear", "signed"], output_name="linear_signed")  # writes `linear_signed.yaml` to the cwd

# Or using a user-specified configfile as the source file
filter_spis(keywords=["linear", "signed"], output_name="linear_signed", configfile="myconfig.yaml")
```
A new YAML file containing the filtered subset of SPIs is saved in the current working directory. It can be loaded into the Calculator object through the `configfile` argument, just like any other custom YAML file (see the docs for more info):
```python
from pyspi.calculator import Calculator

calc = Calculator(configfile="./linear_signed.yaml")
```
- **Optional Dependency Checks**: When a Calculator object is instantiated, _pyspi_ now automatically checks for the optional dependencies (Java and Octave). If any are missing, the user is told which SPIs will be excluded and which missing dependency each exclusion is due to. The user can then choose to proceed with the reduced set of SPIs or install the missing dependencies first.
- **Restructured SPI Config File**: The SPI configuration YAML file has been restructured to include the following keys for each base SPI:
  - `labels`: base-SPI-specific labels (e.g., linear, nonlinear, signed) that the filter function uses to create user-specified subsets of SPIs.
  - `dependencies`: external/system dependencies required by the base SPI (e.g., Octave for the integrated information SPIs).
  - `configs`: estimator settings and configurations, e.g., EmpiricalCovariance for the Covariance base SPI.
**Example YAML**:
Here is an example of how the `phi_star_t1_norm-0` SPI would be specified:

```yaml
IntegratedInformation:
  labels:
    - undirected
    - nonlinear
    - unsigned
    - bivariate
    - time-dependent
  dependencies:
    - octave
  configs:
    - phitype: "star"
```
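
To illustrate how the `labels` key supports keyword filtering, here is a minimal sketch that works on a plain dictionary rather than a YAML file. It is not the pyspi implementation: `select_by_labels` is a hypothetical helper, the dictionary is a simplified stand-in for a parsed config, the Covariance labels are illustrative, and the rule that an SPI must carry every keyword is an assumption.

```python
# Simplified, hypothetical stand-in for a parsed SPI config file.
# Only the IntegratedInformation entry is taken from the example above;
# the Covariance labels are illustrative assumptions.
config = {
    "IntegratedInformation": {
        "labels": ["undirected", "nonlinear", "unsigned", "bivariate", "time-dependent"],
        "dependencies": ["octave"],
        "configs": [{"phitype": "star"}],
    },
    "Covariance": {
        "labels": ["undirected", "linear", "signed"],
        "dependencies": [],
        "configs": [{"estimator": "EmpiricalCovariance"}],
    },
}


def select_by_labels(config, keywords):
    """Keep base SPIs whose labels include every keyword (assumed semantics)."""
    wanted = set(keywords)
    return {
        name: entry
        for name, entry in config.items()
        if wanted.issubset(entry.get("labels", []))
    }


subset = select_by_labels(config, ["linear", "signed"])
print(sorted(subset))
```

With these illustrative labels, only the Covariance entry carries both "linear" and "signed", so it is the only base SPI left in `subset`.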
Breaking Changes
This major version release introduces breaking changes for users who rely on custom SPI subsets (i.e., custom YAML files). Users relying on the pyspi default and pre-defined subsets are unaffected by these changes.
- The `octaveless` subset has been removed, as it is no longer necessary due to the automatic dependency checks. Users without Octave installed can now run pyspi without specifying `octaveless` as a subset in the Calculator object.
- Users who want to define a custom subset of SPIs should follow the new guide in the documentation to ensure their custom YAML file conforms to the new structure with labels, dependencies, and configs as keys.
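
The idea behind replacing `octaveless` with automatic checks can be sketched as follows. This is only an illustration under stated assumptions: it uses the standard library's `shutil.which` to look for executables assumed to be named `java` and `octave`, which may differ from how pyspi actually probes its dependencies, and `find_missing` and `split_by_dependencies` are hypothetical helpers.

```python
import shutil

# Simplified, hypothetical stand-in for a parsed config. Only the Octave
# dependency of IntegratedInformation comes from the release notes.
config = {
    "IntegratedInformation": {"dependencies": ["octave"]},
    "Covariance": {"dependencies": None},
}


def find_missing(required, which=shutil.which):
    """Return the executables from `required` that are not found on PATH."""
    return [exe for exe in required if which(exe) is None]


def split_by_dependencies(config, missing):
    """Split base SPIs into runnable ones and ones blocked by a missing dependency."""
    missing = set(missing)
    kept, skipped = {}, {}
    for name, entry in config.items():
        # An empty or absent `dependencies` key means nothing external is needed.
        blocked = missing.intersection(entry.get("dependencies") or [])
        if blocked:
            skipped[name] = sorted(blocked)
        else:
            kept[name] = entry
    return kept, skipped


# Executable names are assumptions; the result depends on the local machine.
missing = find_missing(["java", "octave"])
kept, skipped = split_by_dependencies(config, missing)
for name, deps in skipped.items():
    print(f"Skipping {name}: missing {', '.join(deps)}")
print("Running with:", sorted(kept))
```

Passing `which` as a parameter keeps the lookup swappable, so the splitting logic can be exercised without Java or Octave installed.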
Migration Guide
If you are an existing user of pyspi and have custom SPI subsets (custom YAML files), follow these steps to migrate to the new version:
1. Review the updated structure of the SPI configuration YAML file (see the above example), which now includes labels, dependencies, and configs keys for each base SPI.
2. Update your custom YAML files to match the new structure.
3. If you were previously using the octaveless subset, you no longer need to specify it when instantiating the Calculator object. The dependency checks will automatically exclude Octave-dependent SPIs if Octave is not installed.
For more detailed instructions and examples, refer to the updated [documentation](https://time-series-features.gitbook.io/pyspi/).
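
Before loading a migrated file, it can help to confirm that every base SPI has the three expected keys. The sketch below is a hypothetical check, not part of pyspi: `check_structure` is an illustrative name, it operates on an already-parsed dictionary, and the flat, list-style Covariance entry stands in for a pre-migration layout purely as an assumed example.

```python
REQUIRED_KEYS = ("labels", "dependencies", "configs")


def check_structure(config):
    """Return a list of problems; an empty list means the layout matches."""
    problems = []
    for name, entry in config.items():
        if not isinstance(entry, dict):
            problems.append(f"{name}: expected a mapping, got {type(entry).__name__}")
            continue
        for key in REQUIRED_KEYS:
            if key not in entry:
                problems.append(f"{name}: missing '{key}'")
    return problems


# Hypothetical parsed config mixing a migrated and an unmigrated entry.
custom = {
    "IntegratedInformation": {
        "labels": ["undirected", "nonlinear"],
        "dependencies": ["octave"],
        "configs": [{"phitype": "star"}],
    },
    # Assumed example of an entry that still lists settings directly.
    "Covariance": [{"estimator": "EmpiricalCovariance"}],
}

for problem in check_structure(custom):
    print(problem)
```

The check only tests for the presence of the keys, so an entry with an empty `dependencies` value still passes.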
Documentation
The pyspi documentation has been updated to reflect the new features and changes introduced in this release. You can find the latest documentation [here](https://time-series-features.gitbook.io/pyspi/).
Testing
- Added unit tests for the new `filter_spis` function.
- Added unit tests for the CalculatorFrame and CorrelationFrame.
- Updated the GitHub Actions workflow file to use the latest checkout and Python setup actions.