Slideflow

2.1.1

This is a *minor*, bug-fix release. See the [Version 2.1 release notes](https://github.com/jamesdolezal/slideflow/releases/tag/2.1.0) for details about the latest major release.

Changes
- Fix error when providing a feature extractor argument to `sf.mil.predict_slide()` [310] (thanks bplee)
- Fix Neptune logging with latest API [312, 313, 314] (thanks luiscarm9)
- Fix unknown argument `save_checkpoints` when performing SMAC hyperparameter search in PyTorch [309]
- Fix MIL inference support on macOS (MPS) devices
- Fix attention heatmap generation from MIL model `TransMIL`
- Fix feature generation during MIL inference from finetuned SimCLR model
- Opening a slide in Studio with `WSI.view()` will now load the slide at the same `tile_px`/`tile_um`
- `Dataset.rebuild_index()` will now remove old index files, fixing various `allow_pickle` errors when using old TFRecords
- Fix minor inconsistency in the Reinhard masking algorithm between the OpenCV/Numpy and Tensorflow/PyTorch implementations
- Fix "Random pixel interpolation not implemented" error when using `augment=True`
- Minor documentation typo fixes

2.1.0

Highlights
Slideflow 2.1 includes a number of new features and optimizations, with a focus on improving Multiple-Instance Learning (MIL) model development and deployment. Key improvements include an **MIL / Attention Heatmaps extension** for Slideflow Studio, improvements to both **feature extraction** and **MIL training**, **new QC algorithms**, and dozens of other enhancements and bug fixes.

Table of Contents
1. **Slideflow Studio: MIL & Attention Heatmaps**
2. **MIL Training Enhancements**
a. Rebuilding feature extractors used for MIL
b. Single-slide predictions, without feature bags
c. QOL improvements
3. **Streamlined Feature Extraction**
a. Features from layer activations of an ImageNet-pretrained model
b. Features from layer activations of a fine-tuned model
c. Features from a public pretrained network
d. Features from a SimCLR model (self-supervised learning)
e. Using feature extractors
4. **Slideflow Studio: Tile Extraction Preview & More**
5. **Slide Filtering / QC Updates**
a. DeepFocus
b. GaussianV2
6. **Smaller updates**
a. PyTorch Image Preprocessing Improvements
b. Mini-batch sample diversity for PyTorch dataloaders
c. TFRecord optimizations
d. Other new features
e. Other improvements
f. Bug fixes

Slideflow Studio: MIL & Attention Heatmaps
Slideflow Studio now includes an [MIL extension](https://slideflow.dev/studio/#multiple-instance-learning), allowing you to generate MIL predictions for slides and visualize attention as a heatmap.

Start by navigating to the Extensions tab in the bottom-left corner, and enable the "Multiple-instance Learning" extension.

![image](https://github.com/jamesdolezal/slideflow/assets/48372806/c94974ba-7ff6-4f30-85ed-490c1f3bd1ee)

A new icon will appear in the left-hand toolbar. Use this button to open the MIL widget. Models can be loaded by clicking the "Load MIL model" button, via "File -> Load MIL Model...", or by dragging and dropping an MIL model folder onto the window.

Information about the feature extractor and MIL model will be shown in the toolbar. MIL model architecture and hyperparameters can be viewed by clicking the "HP" button. Click "Predict Slide" to generate a whole-slide prediction. If applicable, attention will be displayed as a heatmap. The heatmap color and display can be customized in the Heatmap widget.

![image](https://github.com/jamesdolezal/slideflow/assets/48372806/d5a838df-a654-4538-bd94-1aa6a63de32d)

MIL Training Enhancements
Several changes in the MIL training process have been made to improve the user experience and facilitate deployment of trained MIL models on new slides.

Rebuilding feature extractors used for MIL
One of the previous challenges with MIL models was the reliance on generated feature "bags", even for model evaluation. Slideflow now includes tools to generate predictions from MIL models without manually generating feature bags, greatly simplifying evaluation and single-slide testing.

When image tile features are calculated and exported for a dataset (either with `Project.generate_feature_bags()` or `DatasetFeatures.to_torch()`), the feature extractor configuration is now saved as `bags_config.json` in the same directory as the exported feature bags. This configuration file contains all information necessary for rebuilding the feature extractor. An example file is shown below.

```json
{
  "extractor": {
    "class": "slideflow.model.extractors.retccl.RetCCLFeatures",
    "kwargs": {
      "center_crop": true
    }
  },
  "normalizer": {
    "method": "macenko",
    "fit": {
      "stain_matrix_target": [
        [0.5062568187713623, 0.22186939418315887],
        [0.7532230615615845, 0.8652154803276062],
        [0.4069173336029053, 0.42241501808166504]
      ],
      "target_concentrations": [
        1.7656903266906738,
        1.2797492742538452
      ]
    }
  },
  "num_features": 2048,
  "tile_px": 299,
  "tile_um": 302
}
```


The feature extractor can then be rebuilt with `sf.model.rebuild_extractor()`:

```python
from slideflow.model.extractors import rebuild_extractor

# Recreate the feature extractor
# and stain normalizer, if applicable
extractor, normalizer = rebuild_extractor("/path/to/bags_config.json")
```


Single-slide predictions, without feature bags
The new `sf.mil.predict_slide()` function allows you to generate a whole-slide prediction (and attention heatmap) from a saved MIL model, without requiring the user to manually generate feature bags.

This is accomplished by including feature extraction information in the `mil_params.json` file stored in MIL model folders. When performing single-slide inference, Slideflow will automatically rebuild the feature extractor, calculate features for all tiles in the given slide, and pass these features to the loaded MIL model.

You can generate single-slide predictions using a path to a slide:

```python
from slideflow.mil import predict_slide

slide = '/path/to/slide.svs'
model = '/path/to/mil_model'

# Calculate predictions and attention heatmap
y_pred, y_att = predict_slide(model, slide)
```


You can also generate single-slide predictions from a loaded `WSI` object, allowing you to customize slide processing or QC before generating predictions:

```python
import slideflow as sf
from slideflow.mil import predict_slide
from slideflow.slide import qc

# Load slide and apply Otsu thresholding
slide = '/path/to/slide.svs'
wsi = sf.WSI(slide, ...)
wsi.qc(qc.Otsu())

# Calculate predictions and attention heatmap
y_pred, y_att = predict_slide('/path/to/mil_model', wsi)
```


QOL improvements for MIL training
Several smaller quality of life improvements have been made for MIL training. In addition to the feature extraction configuration, the `mil_params.json` file now also includes information about the input and output shapes of the MIL network and outcome labels. An example file is shown below.

```json
{
  "trainer": "fastai",
  "params": {
    ...
  },
  "outcomes": "histology",
  "outcome_labels": {
    "0": "Adenocarcinoma",
    "1": "Squamous"
  },
  "bags": "/mnt/data/projects/example_project/bags/simclr-263510/",
  "input_shape": 1024,
  "output_shape": 2,
  "bags_encoder": {
    "extractor": {
      "class": "slideflow.model.extractors.simclr.SimCLR_Features",
      "kwargs": {
        "center_crop": false,
        "ckpt": "/mnt/data/projects/example_project/simclr/00001-EXAMPLE/ckpt-263510.ckpt"
      }
    },
    "normalizer": null,
    "num_features": 1024,
    "tile_px": 299,
    "tile_um": 302
  }
}
```


When exporting feature bags for MIL training with `Project.generate_feature_bags()`, memory consumption is reduced by performing the feature bag calculation in smaller batches of slides at a time. [261]
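
For reference, a minimal sketch of bag export with this method (paths are placeholders, and the exact keyword arguments may vary by version):

```python
import slideflow as sf
from slideflow.model import build_feature_extractor

# Load a project and dataset (paths are placeholders)
P = sf.load_project('/path/to/project')
dataset = P.dataset(tile_px=299, tile_um=302)

# Build a feature extractor and export feature bags; features are
# now calculated in smaller batches of slides to limit memory use
extractor = build_feature_extractor('retccl', tile_px=299)
P.generate_feature_bags(extractor, dataset, outdir='/path/to/bags')
```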

Finally, when validating or evaluating MIL models with a categorical outcome, accuracy within each class is reported separately. [265] (thank you andrewsris)


```
INFO Validation metrics for outcome histology:
INFO slide-level AUC (cat 0): 0.993 AP: 0.998 (opt. threshold: 0.565)
INFO slide-level AUC (cat 1): 0.993 AP: 0.974 (opt. threshold: 0.439)
INFO Category 0 acc: 97.3% (146/150)
INFO Category 1 acc: 92.3% (36/39)
```


Streamlined Feature Extraction
Extracting features from image tiles - commonly used for training [Multiple-instance Learning (MIL)](http://slideflow.dev/mil/) models - has been streamlined with `sf.model.build_feature_extractor()`, providing a common API for preparing many types of feature extractors.

Features from layer activations of an ImageNet-pretrained model
Generate features from a neural network pretrained on ImageNet simply by passing the name of the network to `sf.model.build_feature_extractor()`. If a tile size is specified, input tiles will be center cropped before calculating features.

```python
from slideflow.model import build_feature_extractor

resnet50_extractor = build_feature_extractor(
    'resnet50',
    tile_px=299
)
```


This will calculate features using activations from the post-convolutional layer of the network. You can also concatenate activations from multiple layers and apply pooling for layers with 2D output shapes.

```python
extractor = build_feature_extractor(
    'resnet50',
    layers=['conv1_relu', 'conv3_block1_2_relu'],
    pooling='avg',
    tile_px=299
)
```


Features from layer activations of a fine-tuned model
Generate features from a model fine-tuned in Slideflow by calculating activations at any number of arbitrary neural network layers.

```python
extractor = build_feature_extractor(
    '/path/to/trained_model.zip'
)
```


Features from a public pretrained network
Generate features from the pretrained CTransPath or RetCCL networks. Weights for these pretrained networks will be automatically downloaded from [HuggingFace](https://huggingface.co/jamesdolezal/retccl/).

```python
extractor = build_feature_extractor(
    'retccl',
    tile_px=299
)
```


Features from a SimCLR model (self-supervised learning)
Generate features from a model trained with [self-supervised learning](https://slideflow.dev/ssl) using SimCLR. Specify a saved model folder or path to a model checkpoint (`*.ckpt`).

```python
extractor = build_feature_extractor(
    'simclr',
    ckpt='/path/to/simclr.ckpt'
)
```


Using feature extractors

All feature extractors can then be used to calculate features from individual image tiles, [generate feature bags](https://slideflow.dev/mil/#exporting-features) for MIL training, or calculate features for an entire slide using a loaded `WSI` object.
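
For example, a minimal sketch of calculating features for an entire slide (the `extractor(wsi)` call pattern is an assumption based on the description above; paths are placeholders):

```python
import slideflow as sf
from slideflow.model import build_feature_extractor

# Prepare a feature extractor (RetCCL, on 299 px tiles)
extractor = build_feature_extractor('retccl', tile_px=299)

# Calculate features across a loaded WSI object
wsi = sf.WSI('/path/to/slide.svs', tile_px=299, tile_um=302)
features = extractor(wsi)  # features for all tiles in the slide
```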

Slideflow Studio: Tile Extraction Preview & More
Studio now supports quick previews of tile extraction. Tile extraction parameters - such as slide-level processing / QC, grayspace/whitespace filtering, and stride - can be customized in the "Slide Processing" section. The "Display" section allows users to preview tile extraction by displaying outlines around tiles. When generating whole-slide predictions from a loaded model, only the shown tiles will be used.

![image](https://github.com/jamesdolezal/slideflow/assets/48372806/a4911b16-9b5a-4289-9d46-41c95f31acda)

Additional updates to Studio include:
- Gracefully handle invalid/incompatible slides with an error message, instead of crashing
- Zoom to a specific MPP in a slide with `View -> Zoom to MPP (Ctrl +/)` [270] (thank you skochanny)
- Remove status bar when capturing main view [270]
- Add macOS M1 / MPS compatibility when generating StyleGAN images
- Fix ROI annotations on high-DPI devices
- Various stability improvements & bug fixes

Slide Filtering / QC Updates (DeepFocus, GaussianV2)
Slideflow includes two new slide filtering / QC algorithms: `DeepFocus` and `GaussianV2`.

DeepFocus
An official implementation of the DeepFocus QC algorithm is now included in Slideflow, and can be used like any other QC algorithm. By default, DeepFocus is applied to slides at 40X magnification, although this can be customized with the `tile_um` argument.

```python
from slideflow.slide import qc

# Apply DeepFocus at 20x magnification to a loaded sf.WSI object
deepfocus = qc.DeepFocus(tile_um='20x')
slide.qc(deepfocus)
```


You can also retrieve raw predictions from the DeepFocus model by passing the argument `threshold=False`:


```python
preds = deepfocus(slide, threshold=False)
```


GaussianV2
A new, optimized Gaussian ("blur") filter has been implemented as `sf.slide.qc.GaussianV2`. This method reduces computational time and memory consumption by first splitting the slide into smaller chunks, performing Gaussian filtering on each chunk separately (accelerated with multiprocessing), and then merging the chunks (eliminating areas of overlap to reduce stitching artifacts). `GaussianV2` will be used by default when using the QC methods `'blur'` or `'both'`.
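
For example, a short usage sketch on a loaded slide (the path is a placeholder):

```python
import slideflow as sf
from slideflow.slide import qc

wsi = sf.WSI('/path/to/slide.svs', tile_px=299, tile_um=302)

# Apply the chunked, multiprocessed Gaussian filter explicitly...
wsi.qc(qc.GaussianV2())

# ...or by name; 'blur' now maps to GaussianV2 by default
# wsi.qc('blur')
```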

Smaller updates

Slideflow includes a number of other new features and enhancements, as detailed below.

PyTorch Image Preprocessing Improvements
Image preprocessing and augmentations in the PyTorch backend have been refactored to use torchvision transformations, improving computational efficiency and making custom transformation pipelines easier to work with. This yields a 3-4x speed improvement in PyTorch Gaussian blur augmentation [145] and also improves PyTorch stain normalization speed.

Custom PyTorch transformations or augmentations can be used in any PyTorch dataloader by passing a callable function to `Dataset.torch(augment=...)` or `Dataset.torch(transform=...)`. For example, to apply a resize transformation on images:

```python
import slideflow as sf
from torchvision import transforms

# Load a project and dataset
P = sf.load_project(...)
dataset = P.dataset(tile_px=299, tile_um=302)

# Establish a resize transformation
resize = transforms.Resize(512)

# Create a PyTorch dataloader with this
# transformation applied to images
dl = dataset.torch(transform=resize)
```


Custom transformations can also be used in any Tensorflow dataset using the same API. Pass a callable function to the `transform` argument of `Dataset.tensorflow()`:

```python
import slideflow as sf
import tensorflow as tf

@tf.function
def custom_resize(image):
    return tf.image.resize(image, (512, 512))

# Load a project and dataset
P = sf.load_project(...)
dataset = P.dataset(tile_px=299, tile_um=302)

# Create a Tensorflow dataset with this
# resize transformation applied to images
dl = dataset.tensorflow(transform=custom_resize)
```


Mini-batch sample diversity for PyTorch dataloaders
This update addresses a long-standing issue where mini-batches assembled with PyTorch tended to contain tiles from repeated slides. PyTorch dataloaders now enforce greater sample diversity, reducing the chance that multiple tiles from the same slide will appear in a single batch (unless the number of slides is smaller than the batch size). Preliminary testing suggests this change may improve model generalizability.

TFRecord optimizations
TFRecord index files now store tile location information, greatly improving efficiency of reading TFRecords by tile location (which is performed for various internal functions, such as calculating dataset features). Existing TFRecord indices will be automatically updated with location information when used, but this process can be manually triggered with `Dataset.rebuild_index()`. Tile locations can be read from a TFRecord's index file with `sf.io.get_locations_from_tfrecord()`.
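
For instance, a minimal sketch (assuming a loaded project; paths are placeholders):

```python
import slideflow as sf

P = sf.load_project('/path/to/project')
dataset = P.dataset(tile_px=299, tile_um=302)

# Manually rebuild TFRecord indices, adding tile location information
dataset.rebuild_index()

# Read tile locations directly from a TFRecord's index file
locations = sf.io.get_locations_from_tfrecord('/path/to/slide.tfrecords')
```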

Other new features
- Add support for slide images that do not contain 'levels', such as multi-page TIFFs and Versa-scanned SVS files. (Thank you emmachancellor and skochanny)
- `Dataset.verify_slide_names()`: verify that TFRecord filenames match the slide names inside
- `sf.WSI.area()`: Calculate the area of a slide that has passed QC
- `sf.slide.backends.vips.vips_padded_crop()`: enables extracting tiles outside the bounds of a slide, padding out-of-bounds areas with a white or black background.
- New `use_edge_tiles` option for `sf.WSI`. If True, will allow extracting edge tiles from the slide. Empty areas are rendered as white, in both cuCIM and VIPS backends.
- Add optional `loc`, `ncol`, and `legend_kwargs` arguments (passed to `ax.legend()`) to `Slidemap.plot()`, for customizing the UMAP plot axes. [275] (Thank you emmachancellor)
- Add support for training SimCLR with stain augmentation

Other improvements
- Improve clarity of slide backend error messages [266] (thank you cswpy)
- Include Libvips version info in `sf.about()`
- Improve PyTorch training speed by using channels-last memory format.
- Improve handling of `linalg` errors during Macenko normalization. If an error is encountered with Macenko normalization, the original image is returned instead of raising the error. This behavior can be disabled by passing `StainNormalizer.transform(allow_errors=False)`.
- Improve quality of slide thumbnails in the PDF extraction report. Also adds the ability to provide thumbnail keyword arguments when extracting tiles via `thumb_kwargs` (thank you skochanny)
- Improved CPU core detection in Linux. All functions which detect the number of CPU cores now use `sf.util.num_cpu()` instead of `os.cpu_count()`. This will first check available cores with `os.sched_getaffinity(0)`, which reflects available CPU cores with OS-level scheduling. If this fails (e.g. on Windows and macOS systems), it will default to `os.cpu_count()`. See the sketch after this list.
- SimCLR default arguments have been updated to reflect the default parameters of the original paper:
  - `learning_rate`: 0.3 -> 0.075
  - `learning_rate_scaling`: 'linear' -> 'sqrt'
  - `weight_decay`: 1e-6 -> 1e-4
- Fix issue where Otsu's thresholding on MRXS files would occasionally fail to identify any foreground tissue. This was due to very small images in the MRXS pyramid. (thank you siddhir)
- Fix issue where MRXS slides could not be extracted when using a buffer, due to the presence of an associated folder with the MRXS file format. [300]
- Close file handles when deleting PyTorch dataloader
- Improve accuracy of mosaic map grid
- Deprecate `Project.generate_features_for_clam()`, replacing it with `Project.generate_feature_bags()`
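
As a rough sketch of the CPU-detection fallback described in the list above (not the actual `sf.util.num_cpu()` implementation):

```python
import os

def num_cpu(default=None):
    # Prefer scheduler affinity, which reflects the cores actually
    # available to this process under OS-level scheduling (Linux).
    try:
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # os.sched_getaffinity() is unavailable on Windows and macOS;
        # fall back to the total core count.
        return os.cpu_count() or default
```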

Bug fixes
- Fix reported concordance index for survival models, which was previously being incorrectly reported as `1 - c_index`
- Fix 'input Tensor too large' error with PyTorch GPU normalizers. Fix is applied by capping the batch size for normalization at 32.
- Fix `sf.DatasetFeatures.to_csv()` [260]
- Fix mixed precision training in PyTorch
- Improve protobuf dependency versioning. Slideflow requires protobuf version <=3.20.\*. Previously, setup.py listed protobuf requirements as <=3.20.2; this has been updated to <3.21 to include any additional 3.20.\* patch releases. This also specifies tensorflow_datasets<4.9.0 to prevent protobuf version >= 4. [289] (thank you sebp)
- Pin required version of cellpose to `<2.2`
- Pin required version of pandas to `<2`
- Pin required version of timm to `<0.9` (thank you quark412)

2.0.5

This is a *minor*, bug-fix release. See the [Version 2.0 release notes](https://github.com/jamesdolezal/slideflow/releases/tag/2.0.0) for details about the latest major release.

Changes
- Fix `ConcatOp` error when training multi-modal models with continuous variables [282]
- Minor docstring updates and typo fixes [281]

2.0.4

This is a *minor*, bug-fix and optimization release. See the [Version 2.0 release notes](https://github.com/jamesdolezal/slideflow/releases/tag/2.0.0) for details about the latest major release.

Bug fixes
- **Fix bug which caused the `predictions.parquet` file for MIL models to have incorrect ground-truth labels.** This also impacted accuracy of results calculated with `Project.evaluate_mil()`.
- Fix bug with `Dataset.kfold_split()`
- Fix bug with `DatasetFeatures.to_csv()` [260]
- Fix bug with loading JPG/PNG files in Slideflow Studio
- Fix `DatasetFeatures.remove_slide()` when activations are not generated

Other changes
- Allow using relative paths with `sf.create_project()` [272]
- Update `Project.extract_tiles()` docstring [273]
- Add libvips version info in `sf.about()`, for easier troubleshooting
- Improve clarity of slide backend error messages [266]
- Remove status bar when capturing main view in Slideflow Studio [270]
- Improve accuracy of mosaic map grid
- Return the original image when `linalg` errors are raised during Macenko stain normalization in the Tensorflow backend, instead of raising the error
- Improve documentation clarity regarding CPH backend support [276]

2.0.3.post1

This is a *minor*, bug-fix and optimization release. See the [Version 2.0 release notes](https://github.com/jamesdolezal/slideflow/releases/tag/2.0.0) for details about the latest major release.

Bug fixes
- **Fix bug with attention heatmaps**, which were not being displayed appropriately
- Fix 'input tensor too large' error when using PyTorch GPU accelerated normalizers. Fix is applied by capping the batch size for normalization at 32.

Optimizations
- Improve PyTorch training speed by using channels-last memory format
- Improve quality of slide thumbnails in tile extraction PDF reports by decreasing the rectangle line width.
- Optimize PyTorch real-time stain normalization speed by increasing the default number of threads.
- Minor optimizations for PyTorch dataloaders: file handles are closed when the dataloader is destroyed.
- Improve `Dataset.build_index()` speed when index files already exist.

Other changes
- Set required cellpose version to <2.2
- Set pandas version to <2.0 to avoid breaking issues with Seaborn
- Handle linalg errors during PyTorch Macenko normalization, returning the original image instead of raising an error.
- Improve error message for the `MIL_fc_mc` model if outcomes are not multi-categorical.
- Update QuPath script to improve compatibility with MRXS files
- Stability improvements in Slideflow Studio when using Libvips backend
- Add additional debug logs during SimCLR training

2.0.2.post1

This is a *minor*, bug-fix release. See the [Version 2.0 release notes](https://github.com/jamesdolezal/slideflow/releases/tag/2.0.0) for details about the latest major release.

Changelog
- Improve Databricks support by saving attention vectors as `*.npy` instead of `*.npz` when the environment variable `SF_ALLOW_ZIP=0` is set [250]
- Add Python 3.7 and Tensorflow 2.11+ support to SimCLR functions [251]
- Update type hints to support Python <3.9 [252]
- Tile iteration with `WSI.build_generator()` now defaults to using multiprocessing when `SF_BACKEND=libvips`, and threadpools when `SF_BACKEND=cucim`, due to poor performance with libvips and threadpools. This does not affect default behavior of `Dataset.extract_tiles()`, but does improve default speed of `WSI.preview()`.
