This is the first release in the AgML v0.3.x cycle, introducing the `agml.models` module.
API Changes
`agml.models`
- Introduction of the `agml.models` module. To load the module, include an `import agml.models` statement after the standard `import agml`.
- Construct inference models for the following tasks:
  - image classification: `agml.models.ClassificationModel` (using an *EfficientNetB4* model)
  - semantic segmentation: `agml.models.SegmentationModel` (using a *DeepLabV3* model)
  - object detection: `agml.models.DetectionModel` (using an *EfficientDetD4* model)
- Use `model.predict` on an input data sample to get the prediction results, or use `model.show_prediction` to visualize the prediction for semantic segmentation and object detection datasets.
- Use `model.preprocess_input` to format inputs in the required format for each model.
- Load a pretrained benchmark for an AgML public dataset using `model.load_benchmark`.
  - Currently available for semantic segmentation and object detection.
- Print information about the benchmark metric and training parameters using the `model.benchmark` property.
- Evaluate a model with standard metrics by calling `model.evaluate()` with an `AgMLDataLoader`.
- Standard metric containers using the `agml.models.metrics` module: currently with `Accuracy` for image classification and `MeanAveragePrecision` for object detection.
- Preprocessing pipelines using `agml.models.preprocessing`: currently with `EfficientDetPreprocessor` for object detection models.
- Standard loss functions with `agml.models.losses`: currently with `DiceLoss` for semantic segmentation.
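
As a rough illustration of what a Dice loss for semantic segmentation computes, the sketch below implements the standard formulation (one minus twice the overlap divided by the total mask area) in plain Python. It is not AgML's `DiceLoss` implementation; the function name `dice_loss`, the flat-list inputs, and the `eps` smoothing term are assumptions made for this example.

```python
def dice_loss(pred, target, eps=1e-6):
    """Return 1 - Dice coefficient for two flattened masks.

    pred, target: equal-length sequences of values in [0, 1]
    (hypothetical inputs for illustration; a real loss would take tensors).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of elements")
    # Overlap between the predicted and the true mask.
    intersection = sum(p * t for p, t in zip(pred, target))
    # Combined "area" of both masks.
    total = sum(pred) + sum(target)
    # eps keeps the ratio defined when both masks are empty.
    return 1.0 - (2.0 * intersection + eps) / (total + eps)


perfect = dice_loss([1, 0, 1, 0], [1, 0, 1, 0])   # identical masks
disjoint = dice_loss([1, 1, 0, 0], [0, 0, 1, 1])  # no overlap at all
```

Identical masks give a loss of zero, and masks with no overlap give a loss close to one.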
`agml.data`
- Load multiple datasets into a single `AgMLDataLoader` using the new Sequence API: `agml.data.AgMLDataLoader([<d1>, <d2>])`.
- Load custom datasets into an `AgMLDataLoader` using `agml.data.AgMLDataLoader.custom(<d>)`.
- Merge two loaders into a single `AgMLDataLoader` using `agml.data.AgMLDataLoader.merge(<l1>, <l2>)`.
- Copy the state of another `AgMLDataLoader` using `agml.data.AgMLDataLoader.copy_state(<d>)`. This copies the states of the internal managers, so the loader on which the method is called takes on the transforms, resizing, and training mode of the loader passed in.
- **New Dataset Reduction Methods**:
  - Use `loader.take_random` to select a random subset of data samples from the loader and return a new loader containing only those samples.
  - Use `loader.take_class` on a multi-dataset object detection `AgMLDataLoader` to return a reduced loader containing all samples belonging to a specific class (or multiple classes).
  - Use `loader.take_dataset` on a multi-dataset `AgMLDataLoader` to return one of the individual loaders in the collection.
- *Preprocessing Changes*:
  - Select a custom method for image resizing using the `method` keyword argument in `AgMLDataLoader.resize_images()`.
  - Control whether the loader shuffles its data internally through the `shuffle_data` property (set it to `True` or `False`).
  - Semantic segmentation loaders now have an option `loader.mask_to_channel_basis` (the parallel of `loader.labels_to_one_hot` for image classification), which converts one-channel masks to multi-channel masks with each channel representing a binary mask for a class.
  - For multi-dataset object detection loaders, the `loader.generalize_class_detections` method generalizes all classes, resulting in the loader having only one class (useful for localization-intensive tasks).
  - The `agml.data.convert_bbox_format` method now allows you to pass string formats, e.g., `pascal-voc` or `efficientdet`: these get auto-expanded to the full bounding box format in the method itself.
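
To make the one-channel to multi-channel mask conversion described for `loader.mask_to_channel_basis` concrete, here is a minimal plain-Python sketch of the idea. It is not AgML's code: the helper name `mask_to_channels`, the channel-first output layout, and the convention that label 0 is background while labels 1 to `num_classes` are classes are all assumptions for illustration.

```python
def mask_to_channels(mask, num_classes):
    """Convert a 2D label mask into one binary mask per class.

    mask: list of rows, each a list of integer labels. Label 0 is treated
    as background and labels 1..num_classes as classes (assumed convention).
    Returns a nested list indexed as [class][row][column] with 0/1 entries.
    """
    return [
        [[1 if label == class_id else 0 for label in row] for row in mask]
        for class_id in range(1, num_classes + 1)
    ]


mask = [[0, 1, 1],
        [2, 2, 0]]
channels = mask_to_channels(mask, num_classes=2)
# channels[0] marks the pixels of class 1, channels[1] those of class 2.
```

Each output channel is a binary mask for one class, which is the segmentation counterpart of one-hot labels in image classification.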
`agml.backend`
- Similar to the method for datasets, the path to which models are saved can now be set using `agml.backend.set_model_save_path()` and retrieved using `agml.backend.model_save_path()`.
- A new `agml.backend.experimental` module has been added for experimental features.
  - Currently, the first experimental feature is splitting already-split loaders, with `agml.backend.experimental.allow_nested_data_splitting()`.
`agml.viz`
- New methods have been introduced which allow visualization of real and predicted samples:
  - Visualize real and predicted bounding boxes with `agml.viz.visualize_real_and_predicted_boxes`.
  - Visualize an image with multiple sets of bounding boxes using `agml.viz.visualize_image_and_many_boxes`.
  - Visualize a real and predicted semantic segmentation mask with `agml.viz.visualize_image_mask_and_predicted`.
Behavior Changes
- The `AgMLDataLoader` now has a set of new properties (`num_images`, `num_classes`, etc.). These are similar to the properties `loader.info.num_images`, `loader.info.num_classes`, etc.; however, the properties directly on the loader reflect any changes made by reduction methods.
  - For instance, if you call `loader.split()`, then the `loader.train_data` loader will report a reduced `num_images`, while `loader.info.num_images` will continue to reflect the total number of images in the original dataset.
- Dataset transforms are now tracked by their time of insertion; if you call `loader.transform()` twice, the transforms which are passed in the first call will be applied before those in the second call.
- Any methods which use random operations (such as `loader.take_random`) now have an optional `random_state` keyword argument, which can be passed to control the random seed used for that method only.
- All transformation methods now have an optional argument `add` which allows them to be disabled.
- When converting an `AgMLDataLoader` into training mode, it now disables the `auto` mode so that no inadvertent resizing is done unless the user explicitly requests it.
- In all `agml.viz` methods, image arrays are auto-converted to 8-bit unsigned integer format before display. Additionally, the `format_image` method now takes an additional argument `mask` for preprocessing masks.
- Array check methods now check for NumPy arrays and lists first, then PyTorch tensors, and finally TensorFlow tensors: this avoids loading the full PyTorch/TensorFlow APIs when they are not needed.
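
The check ordering in the last item can be sketched as follows. This is one possible way to avoid importing heavy frameworks, not AgML's actual implementation: the helper name `array_kind` and the use of `sys.modules` to detect already-imported frameworks are assumptions for illustration.

```python
import sys


def array_kind(obj):
    """Classify `obj` without importing NumPy, PyTorch, or TensorFlow.

    A framework object can only exist if its module was already imported,
    so looking the module up in sys.modules never triggers a new import.
    """
    # Cheapest check first: plain Python lists need no library at all.
    if isinstance(obj, list):
        return "list"
    numpy = sys.modules.get("numpy")
    if numpy is not None and isinstance(obj, numpy.ndarray):
        return "numpy"
    # The heavier frameworks are only consulted after the cheap checks fail.
    torch = sys.modules.get("torch")
    if torch is not None and isinstance(obj, torch.Tensor):
        return "torch"
    tensorflow = sys.modules.get("tensorflow")
    if tensorflow is not None and isinstance(obj, tensorflow.Tensor):
        return "tensorflow"
    return "unknown"


kind = array_kind([1, 2, 3])
```

Because the cheap cases return early, a script that only ever passes lists or NumPy arrays never touches PyTorch or TensorFlow.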
Other Bugfixes
- Bounding boxes for object detection datasets are now clipped at [0, `image_size`) on loading.
- Exporting COCO datasets from a split `AgMLDataLoader` now returns the correct, reduced COCO JSON dictionary, rather than the entire dataset's COCO JSON.
- Copying a loader or splitting it will now update all internal states; previously there was a chance the `TrainingManager` would have different resize/transform managers than the upper-level `DataManager`.
- You can now download a list of datasets using `agml.data.download_public_dataset`.
- When visualizing semantic segmentation masks with no class labels (e.g., a prediction with only zeros), only the original image is now displayed, instead of an error being thrown.
- An error with namedtuples in local scopes which prevented `AgMLMetadata` from being serialized has been resolved.
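
For the bounding box clipping fix above, a minimal sketch of the general idea is shown below. It is not AgML's code: the helper name `clip_box` and the `[xmin, ymin, xmax, ymax]` box layout are assumptions, and for simplicity the sketch clamps to the closed range from 0 to the image dimension, which may differ from how the library treats the upper bound.

```python
def clip_box(box, image_width, image_height):
    """Clamp an [xmin, ymin, xmax, ymax] box to the image bounds.

    The box layout and the inclusive upper bound are assumptions made
    for this illustration.
    """
    xmin, ymin, xmax, ymax = box

    def clamp(value, upper):
        # Keep `value` inside the range 0..upper.
        return max(0, min(value, upper))

    return [
        clamp(xmin, image_width),
        clamp(ymin, image_height),
        clamp(xmax, image_width),
        clamp(ymax, image_height),
    ]


# A box that spills over the left, right, and bottom edges of a 100x80 image.
clipped = clip_box([-5, 10, 120, 90], image_width=100, image_height=80)
```

Coordinates already inside the image are left unchanged; only the out-of-range values are pulled back to the nearest edge.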
[*Read the Full Changelog Here.*](https://github.com/Project-AgML/AgML/compare/v0.2.9...v0.3.0)