We are happy to announce the release of AICSImageIO 4.0.0!
AICSImageIO is a library for image reading, metadata conversion, and image writing for microscopy formats in pure Python. It aims to read microscopy images through a single unified API regardless of size, format, or location, while also writing images and converting metadata to a standard common format.
A lot has changed and this post will only have the highlights. If you are new to the library, please see our full [documentation](https://allencellmodeling.github.io/aicsimageio/) for always up-to-date usage and a quickstart README.
Mosaic Tile Stitching
For certain imaging formats we will stitch together the mosaic tiles stored in the image prior to returning the data. Currently the two formats supported by this functionality are `LIF` and `CZI`.
```python
from aicsimageio import AICSImage

stitched_image = AICSImage("tiled.lif")
stitched_image.dims  # very large Y and X
```
_If you don't want the tiles to be stitched back together you can turn this functionality off with `reconstruct_mosaic=False` in the `AICSImage` object init._
The entire library now rests on [xarray](http://xarray.pydata.org/en/stable/index.html). We store all metadata, coordinate planes, and imaging data in a single `xarray.DataArray` object. In addition to selecting by index with the existing `aicsimageio` methods `get_image_data` and `get_image_dask_data`, you can now select by coordinate planes.
Mosaic Images and Xarray Together
A prime example of where xarray can be incredibly powerful for microscopy images is in large mosaic images. For example, instead of selecting by index, with `xarray` you can select by spatial coordinates:
```python
from aicsimageio import AICSImage

# Read and stitch together a tiled image
large_stitched_image = AICSImage("tiled.lif")

# Only load in-memory the first 300x300 micrometers (or whatever unit) in Y and X dimensions
large_stitched_image.xarray_dask_data.loc[:, :, :, :300, :300].data.compute()
```
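To make the coordinate-based selection concrete, here is a minimal plain-Python sketch of what a label-based slice such as `:300` amounts to: the stop label is looked up in the axis coordinates and turned into an index range. The 0.5 micrometer pixel size and the helper name `index_stop_for_label` are assumptions made for illustration only; with `xarray` the equivalent lookup happens for you on the real coordinate arrays.

```python
from bisect import bisect_right
from typing import Sequence


def index_stop_for_label(coords: Sequence[float], stop_label: float) -> int:
    """Return the exclusive index stop matching an inclusive label-based slice.

    `coords` must be sorted ascending, e.g. the physical position of each
    pixel along one axis. Label-based slices include the stop label, so every
    coordinate <= stop_label is kept.
    """
    return bisect_right(coords, stop_label)


# Hypothetical axis: 2000 pixels spaced 0.5 micrometers apart.
pixel_size_um = 0.5
x_coords = [i * pixel_size_um for i in range(2000)]

# Equivalent of selecting `:300` micrometers along this axis.
stop = index_stop_for_label(x_coords, 300.0)
selected = x_coords[:stop]
print(len(selected), selected[-1])
```

The point of the sketch is that the number of pixels loaded depends on the physical pixel size, not on a hard-coded index, which is why coordinate selection is convenient for very large stitched images.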
While we are actively working on metadata translation functions, we are also setting ourselves up for the easiest possible format conversion. For now, we have added a simple `save` function to the `AICSImage` object that converts the pixel data and key pieces of metadata to OME-TIFF for all (or a selected set of) scenes in the file.
```python
from aicsimageio import AICSImage

# File names are placeholders
AICSImage("my_file.czi").save("my_file.ome.tiff")
```
For users who want greater flexibility and specificity in image writing, we have entirely reworked our writers module. This release contains three writers: `TwoDWriter`, `TimeseriesWriter`, and `OmeTiffWriter`. The objective of this rework was to make n-dimensional image writing much easier and, most importantly, to make metadata attachment as simple as possible. We also now support multi-scene / multi-image OME-TIFF writing when providing a `List[ArrayLike]`. See the full writer documentation [here](https://allencellmodeling.github.io/aicsimageio/aicsimageio.writers.html).
OME Metadata Validation
Our `OmeTiffReader` and `OmeTiffWriter` now validate the read or produced OME metadata. With the [ome-types](https://github.com/tlambert03/ome-types) library doing the heavy lifting, we can now ensure that all metadata produced by this library is valid against the OME specification.
Better Scene Management
Over time while working with 3.x, we found that many formats contain a `Scene` or `Image` dimension that can be entirely different from the other instances of that dimension in pixel type, channel naming, image shape, etc. To solve this we have changed the `AICSImage` and `Reader` objects to statefully manage `Scene` while all other dimensions are still available.
In practice, this means on the `AICSImage` and `Reader` objects the user no longer receives the `Scene` dimension back in the `data` or `dimensions` properties (or any other related function or property).
To change scene while operating on a file, call `AICSImage.set_scene(scene_id)`. The valid scene ids are available from `AICSImage.scenes`.
```python
from aicsimageio import AICSImage

many_scene_img = AICSImage("my_file.ome.tiff")
many_scene_img.current_scene  # the current operating scene
many_scene_img.scenes  # returns tuple of available scenes
many_scene_img.set_scene("Image:2")  # sets the current operating scene to "Image:2"
```
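Because scene is now managed statefully, a common pattern is to loop over the available scenes and switch to each one in turn. The following is a minimal sketch of that pattern, not part of the library: the helper `shape_per_scene` is hypothetical, it assumes the image object exposes a `shape` attribute alongside `scenes`, `current_scene`, and `set_scene`, and `FakeImage` is a stand-in so the example runs without a file on disk.

```python
from typing import Dict, Tuple


def shape_per_scene(img) -> Dict[str, Tuple[int, ...]]:
    """Collect the shape of every scene, then restore the original scene."""
    original_scene = img.current_scene
    shapes = {}
    for scene_id in img.scenes:
        img.set_scene(scene_id)  # switch the stateful "current scene"
        shapes[scene_id] = img.shape  # shape can differ between scenes
    img.set_scene(original_scene)
    return shapes


class FakeImage:
    """Hypothetical stand-in that mimics the scene-related attributes."""

    def __init__(self, shapes):
        self._shapes = dict(shapes)
        self.scenes = tuple(self._shapes)
        self.current_scene = self.scenes[0]

    def set_scene(self, scene_id):
        if scene_id not in self._shapes:
            raise IndexError(f"Unknown scene: {scene_id}")
        self.current_scene = scene_id

    @property
    def shape(self):
        return self._shapes[self.current_scene]


fake = FakeImage({"Image:0": (1, 3, 10, 512, 512), "Image:1": (5, 1, 1, 256, 256)})
shapes = shape_per_scene(fake)
print(shapes)
```

Restoring the original scene at the end keeps the helper free of side effects for the caller, which matters precisely because the scene selection is state held on the object.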
RGB / BGR Support
Due to the scene management changes, we no longer use the `"S"` dimension to represent "Scene". We use it to represent the "Samples" dimension (RGB / BGR) which means we now have an isolated dimension for color data. This is great because it allows us to directly support multi-channel RGB data, where previously we would expand RGB data into channels, even when the file had a channel dimension.
So if you encounter a file with `"S"` in the dimensions, you know that you are working with an RGB file.
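As a small illustration of the rule above, the check can be written against a dimension order string and a shape tuple. The helper name `samples_per_pixel` is hypothetical; in practice you would presumably pass it values such as `AICSImage.dims.order` and the shape of the returned array.

```python
from typing import Tuple


def samples_per_pixel(dim_order: str, shape: Tuple[int, ...]) -> int:
    """Return the size of the Samples ("S") dimension, or 1 if it is absent."""
    if len(dim_order) != len(shape):
        raise ValueError("dim_order and shape must have the same length")
    if "S" in dim_order:
        return shape[dim_order.index("S")]
    return 1  # single sample per pixel, i.e. greyscale


print(samples_per_pixel("TCZYXS", (1, 2, 1, 64, 64, 3)))
print(samples_per_pixel("TCZYX", (1, 2, 1, 64, 64)))
```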
FSSpec Adoption
Across the board we have adopted [fsspec](https://github.com/intake/filesystem_spec) for file handling. With 4.x you can now provide any URI supported by the `fsspec` library or any implementations of `fsspec` ([s3fs (AWS S3)](https://github.com/dask/s3fs), [gcsfs (Google Cloud Storage)](https://github.com/dask/gcsfs), [adlfs (Azure Data Lake)](https://github.com/dask/adlfs), etc.).
In many cases, this means we now support direct reading from local or remote data as a base part of our API. _As well as preparing us for supporting OME-Zarr!_
```python
from aicsimageio import AICSImage

wb_img = AICSImage("https://www.your-site.com/your-file.ome.tiff")
s3_img = AICSImage("s3://your-bucket/your-dir/your-file.ome.tiff")
gs_img = AICSImage("gs://your-bucket/your-dir/your-file.ome.tiff")
```
To read from remote storage, you must install the related `fsspec` implementation library. For example, to read from `s3` you must install `s3fs`.
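If it is unclear which extra package a given URI needs, the scheme of the URI is the deciding factor. The sketch below is illustrative only: the mapping is a hand-written, non-exhaustive assumption based on the implementations named above, and `required_package` is a hypothetical helper, not part of `aicsimageio` or `fsspec`.

```python
from typing import Optional
from urllib.parse import urlparse

# Illustrative, non-exhaustive mapping from URI scheme to the fsspec
# implementation package that typically handles it (an assumption, not an
# official table).
SCHEME_TO_PACKAGE = {
    "s3": "s3fs",
    "gs": "gcsfs",
    "gcs": "gcsfs",
    "abfs": "adlfs",
    "az": "adlfs",
}


def required_package(uri: str) -> Optional[str]:
    """Return the extra package suggested for `uri`, or None if not listed."""
    scheme = urlparse(uri).scheme.lower()
    return SCHEME_TO_PACKAGE.get(scheme)


print(required_package("s3://your-bucket/your-dir/your-file.ome.tiff"))
print(required_package("/local/path/your-file.ome.tiff"))
```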
Splitting up Dependencies
To reduce the size of AICSImageIO on fresh installs and to make it easier to manage environments, we have split up dependencies into specific format installations.
By default, `aicsimageio` supports `TIFF` and `OME-TIFF` reading and writing. If you would like to install support for reading CZI files, append `[czi]` to the `aicsimageio` pip install: `pip install aicsimageio[czi]`. For multiple formats, add them as a comma-separated string: `pip install aicsimageio[czi,lif]`. To install support for reading all format implementations: `pip install aicsimageio[all]`.
A full list of supported formats can be found [here](https://github.com/AllenCellModeling/aicsimageio/blob/main/setup.py#L11).
Roadmap and Motivation
After many discussions with the community we have finally written a roadmap and accompanying documentation for the library. If you are interested, please feel free to read them. [Roadmap and Accompanying Documentation](https://allencellmodeling.github.io/aicsimageio/ROADMAP.html)
Full List of Changes
* Added support for reading all image data (including metadata and coordinate information) into `xarray.DataArray` objects. This can be accessed with `AICSImage.xarray_data`, `AICSImage.xarray_dask_data`, or `Reader` equivalents. Where possible, we create and attach coordinate planes to the `xarray.DataArray` objects to support more options for indexed data selection such as timepoint selection by unit of time, or pixel selection by micrometers.
* Added support for reading multi-channel RGB imaging data utilizing a new `Samples` (`S`) dimension. This dimension will only appear in the `AICSImage`, `Reader`, and the produced `np.ndarray`, `da.Array`, and `xr.DataArray` if present in the file, and it will be the last dimension. For example, if provided an RGB image, `AICSImage` and related object dimensions will be `"...YXS"`; if provided a single sample / greyscale image, they will be `"...YX"`. (This change also applies to `DefaultReader` and PNG, JPG, and similar formats.)
* `OmeTiffReader` now validates the found XML metadata against the referenced specification. If your file has invalid OME XML, this reader will fail and roll back to the base `TiffReader`. (In the process of updating this we found many bugs in our 3.x series `OmeTiffWriter`. Our 4.x `OmeTiffReader` fixes these bugs at read time, but that doesn't mean the file contains _valid_ OME XML. It is recommended to upgrade and start using the new and improved, _and validated_, `OmeTiffWriter`.)
* `DefaultReader` now fully supports reading "many-image" formats such as GIF, MP4, etc.
* `OmeTiffReader.metadata` is now returned as the `OME` object from [ome-types](https://github.com/tlambert03/ome-types). This change additionally removes the `vendor.OMEXML` object.
* Dimensions received an overhaul: when you use `AICSImage.dims` or `Reader.dims` you will be returned a `Dimensions` object. Using this object you can get the native order of the dimensions and each dimension's size through attributes, i.e. `AICSImage.dims.X` returns the size of the `X` dimension, and `AICSImage.dims.order` returns the native order of the dimensions as a string such as `"TCZYX"`. Due to these changes we have removed all `size` functions and `size_{dim}` properties from various objects.
* Replaced function `get_physical_pixel_size` with attribute `physical_pixel_sizes` that returns a new `PhysicalPixelSizes` `NamedTuple` object. This now allows attribute access for each physical dimension, i.e. `PhysicalPixelSizes.X` returns the size of each `X` dimension pixel. This object can be cast to a base `tuple`, but be aware that we have reversed the order from `XYZ` to `ZYX` to match the rest of the library's standard dimension order. Additionally, `PhysicalPixelSizes` now defaults to `(None, None, None)` (was `(1.0, 1.0, 1.0)`) as pixel physical size is optional in the OME metadata schema.
* Replaced parameters named `known_dims` with `dim_order`.
* Replaced function `get_channel_names` with attribute `channel_names`.
* Replaced function `dtype` with attribute `dtype`.
* Renamed the `chunk_by_dims` parameter on all readers to just `chunk_dims`.
* Removed all `distributed` cluster and client spawning.
* Removed context manager usage from all objects.
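Two of the changes above are easy to trip over when migrating from 3.x: physical pixel sizes are now ordered `ZYX`, and any of them may be `None`. The sketch below uses a stand-in `NamedTuple` that mirrors the described behavior so it runs without the library; the class defined here and the helper `voxel_volume` are illustrative assumptions, not the library's own code.

```python
from typing import NamedTuple, Optional


class PhysicalPixelSizes(NamedTuple):
    """Stand-in mirroring the described object: Z, Y, X order, all optional."""

    Z: Optional[float] = None
    Y: Optional[float] = None
    X: Optional[float] = None


def voxel_volume(sizes: PhysicalPixelSizes) -> Optional[float]:
    """Return Z * Y * X, or None when any size is missing from the metadata."""
    if any(value is None for value in sizes):
        return None
    return sizes.Z * sizes.Y * sizes.X


sizes = PhysicalPixelSizes(Z=2.0, Y=0.5, X=0.5)

# Attribute access per physical dimension
print(sizes.X)

# Unpacking or casting to a tuple yields ZYX order (3.x returned XYZ)
z, y, x = sizes
print(tuple(sizes))

# Code that relied on the old (1.0, 1.0, 1.0) default must now handle None
print(voxel_volume(PhysicalPixelSizes()))
```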
Contributors and Reviewers this Release (alphabetical)
Jackson Maxfield Brown (JacksonMaxfield)
Ramón Casero (rcasero)
Julie Cass (jcass11)
Jianxu Chen (jxchen01)
Bryant Chhun (bryantChhun)
Rory Donovan-Maiye (donovanr)
Christoph Gohlke (cgohlke)
Josh Moore (joshmoore)
Sebastian Rhode (sebi06)
Jamie Sherman (heeler)
Madison Swain-Bowden (AetherUnbound)
Dan Toloudis (toloudis)
Matheus Viana (vianamp)