The Captum v0.6.0 release introduces a new feature, Stochastic Gates. This release also enhances Influential Examples and includes a series of other improvements and bug fixes.
## Stochastic Gates
Stochastic Gates is a technique to enforce sparsity by approximating L0 regularization. It can be used for network pruning and feature selection. Because directly optimizing the L0 norm is a non-differentiable combinatorial problem, Stochastic Gates approximates it by using continuous probability distributions (e.g., Concrete, Gaussian) as smoothed Bernoulli distributions, so the optimization can be reparameterized in terms of the distributions' parameters. Check the following papers for more details; a concrete sketch follows the paper list below:
- [Learning Sparse Neural Networks through L0 Regularization](https://arxiv.org/abs/1712.01312)
- [Feature Selection using Stochastic Gates](https://arxiv.org/abs/1810.04247)
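To make the relaxation concrete, here is a minimal sketch of the Gaussian variant, assuming a learnable mean per gate and a fixed noise scale. This is an illustration of the technique, not Captum's internal implementation:

```py
import torch
from torch.distributions import Normal

# Illustrative sketch of a Gaussian smoothed-Bernoulli gate: each gate is a
# learnable mean `mu` plus Gaussian noise, hard-clipped to [0, 1] so it acts
# as a soft 0/1 mask while remaining differentiable with respect to `mu`.
mu = torch.full((5,), 0.5, requires_grad=True)  # learnable gate means
std = 0.5                                       # noise scale (hyperparameter)
gates = torch.clamp(mu + std * torch.randn_like(mu), 0.0, 1.0)

# The expected L0 penalty is the probability that each gate is non-zero,
# P(mu + noise > 0) = Phi(mu / std), which is differentiable in `mu`
# even though the true L0 norm is not.
reg = Normal(0.0, 1.0).cdf(mu / std).sum()
```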
Captum provides two Stochastic Gates implementations that use different distributions as the smoothed Bernoulli: `BinaryConcreteStochasticGates` and `GaussianStochasticGates`. They are available under `captum.module`, a new subpackage collecting neural network building blocks useful for model understanding. A usage example:
```py
import torch
from captum.module import GaussianStochasticGates

n_gates = 5  # number of gates
stg = GaussianStochasticGates(n_gates, reg_weight=0.01)

inputs = torch.randn(3, n_gates)  # mock inputs with a batch size of 3
gated_inputs, reg = stg(inputs)  # gate the inputs
loss = model(gated_inputs)  # use the gated inputs in the downstream network
loss += reg  # optimize the sparsity regularization together with the model loss

...

# inspect the learned gate values to see how the model uses the inputs
print(stg.get_gate_values())
```
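After training, the learned gate values indicate which inputs the model relies on; for feature selection one can, for example, threshold them (the 0.5 cutoff below is an arbitrary illustrative choice, not a Captum default):

```py
# Keep only the features whose learned gate value exceeds a chosen cutoff
# (0.5 is an arbitrary illustrative threshold, not part of Captum's API).
gate_values = stg.get_gate_values()
selected = (gate_values > 0.5).nonzero(as_tuple=True)[0]
print(f"Selected feature indices: {selected.tolist()}")
```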
## Influential Examples
Influential Examples is a new functionality pillar introduced in the previous release. This release continues to build on it and brings many improvements to the existing `TracInCP` family. Some of the changes are incompatible with the previous version; a usage sketch of the revised interface follows the list. Details:
- Support loss functions with `mean` reduction in `TracInCPFast` and `TracInCPFastRandProj` (https://github.com/pytorch/captum/pull/913)
- `TracInCP` classes add a new argument `show_progress` to optionally display progress bars during the computation (https://github.com/pytorch/captum/pull/898, https://github.com/pytorch/captum/pull/1046)
- `TracInCP` provides a new public method `self_influence`, which computes the self-influence scores of the examples in the given data. `influence` can no longer compute self-influence scores, and its `inputs` argument can no longer be `None` (https://github.com/pytorch/captum/pull/994, https://github.com/pytorch/captum/pull/1069, https://github.com/pytorch/captum/pull/1087, https://github.com/pytorch/captum/pull/1072)
- The constructor argument `influence_src_dataset` in `TracInCP` has been renamed to `train_dataset` (https://github.com/pytorch/captum/pull/994)
- Add GPU support to `TracInCPFast` and `TracInCPFastRandProj` (https://github.com/pytorch/captum/pull/969)
- `TracInCP` and `TracInCPFastRandProj` provide a new public method `compute_intermediate_quantities`, which computes "embedding" vectors for examples in the given data (https://github.com/pytorch/captum/pull/1068)
- `TracInCP` classes support a new optional argument `test_loss_fn` for use cases where different losses are used for training and test examples (https://github.com/pytorch/captum/pull/1073)
- Revised the interface of the `influence` method: the arguments `unpack_inputs` and `target` have been removed, and `inputs` must now be a `tuple` whose last element is the label (https://github.com/pytorch/captum/pull/1072)
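The sketch below pulls these interface changes together, assuming an already-trained model; `net`, `train_ds`, the checkpoint path, and the test tensors are hypothetical placeholders, and the loss configuration may vary with `sample_wise_grads_per_batch`:

```py
import torch
from captum.influence import TracInCP

# Hedged sketch with placeholder names (`net`, `train_ds`, `test_x`, `test_y`).
tracin = TracInCP(
    net,
    train_dataset=train_ds,               # formerly `influence_src_dataset`
    checkpoints=["checkpoint-final.pt"],  # hypothetical checkpoint path
    loss_fn=torch.nn.CrossEntropyLoss(reduction="none"),  # per-example losses
)

# `inputs` is now a tuple whose last element is the label.
scores = tracin.influence((test_x, test_y), show_progress=True)

# Self-influence scores now come from the dedicated `self_influence` method.
self_scores = tracin.self_influence()
```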
## Notable Changes
- LRP now throws an error when it detects that the model reuses any modules (https://github.com/pytorch/captum/pull/911)
- Fixed a bug where the concept order changed in `TCAV`'s output (https://github.com/pytorch/captum/pull/915, https://github.com/pytorch/captum/issues/909)
- Fixed a data-type issue when using Captum's built-in SGD linear models in `Lime` (https://github.com/pytorch/captum/pull/938, https://github.com/pytorch/captum/issues/910)
- All submodules are now accessible under the top-level `captum` module, so users can `import captum` and access everything underneath it, e.g., `captum.attr` (https://github.com/pytorch/captum/pull/912, https://github.com/pytorch/captum/pull/992, https://github.com/pytorch/captum/issues/680)
- Added a new attribution visualization utility for time series data (https://github.com/pytorch/captum/pull/980)
- Improved version detection to fix compatibility issues caused by dependency versions (https://github.com/pytorch/captum/pull/940, https://github.com/pytorch/captum/pull/999)
- Fixed an index bug in the tutorial "Interpret regression models using Boston House Prices Dataset" (https://github.com/pytorch/captum/pull/1014, https://github.com/pytorch/captum/issues/1012)
- Refactored `FeatureAblation` and `FeaturePermutation` to verify the output type of `forward_func` and its shape when `perturbation_per_eval > 1` (https://github.com/pytorch/captum/pull/1047, https://github.com/pytorch/captum/pull/1049, https://github.com/pytorch/captum/pull/1091)
- Updated the [Housing Regression tutorial](https://captum.ai/tutorials/House_Prices_Regression_Interpret) to use the California housing dataset (https://github.com/pytorch/captum/pull/1041)
- Improved the error message for invalid input types when the required data type is `tensor` or `tuple[tensor]` (https://github.com/pytorch/captum/pull/1083)
- Switched from module `backward_hook` to tensor `forward_hook` for many attribution algorithms that need tensor gradients, such as `DeepLift` and `LayerLRP`, so these algorithms now support models with in-place modules (https://github.com/pytorch/captum/pull/979, https://github.com/pytorch/captum/issues/914)
- Added an optional `mask` argument to the `FGSM` and `PGD` adversarial attacks under `captum.robust` to specify which elements of the input are perturbed; see the sketch below (https://github.com/pytorch/captum/pull/1043)
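A minimal sketch of the new `mask` argument with `FGSM`, assuming a classification model; `net`, `inputs`, and `labels` are hypothetical placeholders:

```py
import torch
from captum.robust import FGSM

# Hedged sketch: `net`, `inputs`, and `labels` are placeholders. The mask
# marks which input elements may be perturbed (1) and which stay fixed (0).
fgsm = FGSM(net, loss_func=torch.nn.CrossEntropyLoss())
mask = torch.zeros_like(inputs)
mask[..., :10] = 1  # e.g., only allow perturbing the first 10 features
perturbed = fgsm.perturb(inputs, epsilon=0.1, target=labels, mask=mask)
```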