Malpolon

Latest version: v2.1.1

Page 1 of 2

2.1.1

What's changed

Main changes

- [x] Updated link to glc24_pre_extracted model weights, addressing `pos_weight` loading issue following previous Malpolon updates.

2.1.0

What's changed

Main changes

- [x] Added possibility for users to choose their optimizer and scheduler via their config file:
- `malpolon.models.utils`: changed the behavior of `check_optimizer()` and added `check_scheduler()` so that users can specify one or several optimizers (each optionally paired with one scheduler, possibly with an `lr_scheduler_config` descriptor) in their config files.
- `malpolon.models.standard_prediction_systems`: changed how optimizer(s) and scheduler(s) are instantiated in class `GenericPredictionSystem`. The class attributes are now lists of instantiated optimizers and of `lr_scheduler_config` dictionaries, respectively. Method `configure_optimizers()` now returns a dictionary containing all the optimizers and schedulers (cf. https://lightning.ai/docs/pytorch/stable/api/lightning.pytorch.core.LightningModule.html#lightning.pytorch.core.LightningModule.configure_optimizers).
- Updated all examples and added the corresponding unit tests, covering both valid scenarios and edge cases of incorrect user input in the config file.
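
To illustrate the kind of normalization such a config check might perform, here is a minimal sketch in plain Python. The config layout (optimizer name mapped to `kwargs` and an optional single `scheduler`), the function name `parse_optimizer_config` and the sample values are all assumptions made for illustration; they are not Malpolon's actual schema or the real `check_optimizer()` / `check_scheduler()` implementations.

```python
from typing import Any


def parse_optimizer_config(cfg: dict[str, Any]) -> list[dict[str, Any]]:
    """Normalize a hypothetical {optimizer_name: {...}} config into a list.

    Each optimizer may declare at most one scheduler; anything else is
    rejected early so that a typo in the config fails loudly.
    """
    entries = []
    for opt_name, opt_cfg in cfg.items():
        opt_cfg = opt_cfg or {}
        entry = {"optimizer": opt_name, "kwargs": dict(opt_cfg.get("kwargs", {}))}
        scheduler = opt_cfg.get("scheduler")
        if scheduler is not None:
            if len(scheduler) != 1:
                raise ValueError(
                    f"{opt_name}: expected exactly one scheduler, got {len(scheduler)}"
                )
            # Unpack the single (name, settings) pair.
            ((sch_name, sch_cfg),) = scheduler.items()
            sch_cfg = sch_cfg or {}
            entry["scheduler"] = {
                "name": sch_name,
                "kwargs": dict(sch_cfg.get("kwargs", {})),
                "lr_scheduler_config": dict(sch_cfg.get("lr_scheduler_config", {})),
            }
        entries.append(entry)
    return entries


# Hypothetical config: two optimizers, only the first one has a scheduler.
example_cfg = {
    "adam": {
        "kwargs": {"lr": 1e-3},
        "scheduler": {
            "step_lr": {
                "kwargs": {"step_size": 10},
                "lr_scheduler_config": {"interval": "epoch"},
            }
        },
    },
    "sgd": {"kwargs": {"lr": 0.1}},
}
parsed = parse_optimizer_config(example_cfg)
```

Validating the structure before instantiating anything keeps incorrect user input (for example two schedulers under one optimizer) from surfacing later as an obscure training error.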

Others

- [x] In glc24_pre_extracted example: added habitat version dataset which consists of symbolic links to the species version. Running the habitat's main script will trigger the download of the data predictors (rasters, satellite, time-series).
- [x] Updated `split_obs_per_species_frequency()` to include more input arguments.

2.0.0

What's Changed
Main changes
- [x] Added GLC24 pre_extracted habitat dataset and example (see PR 58 in the Links section)
- [x] Checkpoints are now loaded through the `state_dict` of the LightningModule instead of the `state_dict` of the model object. This is a breaking change: examples had to be updated by **removing** the replacement of the "model." string in the loaded `state_dict`.
- [x] Added possibility to download model weights for any Malpolon model given a URL and a few file paths

- [x] Updated the way `checkpoint_path` is passed on to models. Added an attribute `checkpoint_path` for all Malpolon models
- Updated all examples accordingly

- [x] Added Malpolon as (local) model provider.
- Created new module `malpolon.models.custom_models` which will host custom models proposed by Malpolon
- Split classes from `geolifeclef2024_multimodal_ensemble.py` into `glc2024_multimodal_ensemble_model.py` and `glc2024_pre_extracted_prediction_system.py` in `custom_models` to prevent a circular import from `malpolon.models.model_builder` after adding Malpolon as (local) provider
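
The checkpoint-loading change listed above comes down to how keys are named in a saved `state_dict`. When a LightningModule stores its network under an attribute called `model`, the module's keys carry a `model.` prefix, which earlier examples stripped before loading into the bare model. A minimal sketch of that now-unneeded step, using a plain dictionary in place of real tensors (the helper name `strip_prefix` and the sample keys are illustrative, not Malpolon code):

```python
def strip_prefix(state_dict: dict, prefix: str = "model.") -> dict:
    """Return a copy of state_dict with `prefix` removed from matching keys."""
    return {
        (key[len(prefix):] if key.startswith(prefix) else key): value
        for key, value in state_dict.items()
    }


# Keys as a LightningModule wrapping `self.model` would typically name them.
lightning_style = {"model.fc.weight": [0.1, 0.2], "model.fc.bias": [0.0]}

# Keys as the bare model object expects them.
model_style = strip_prefix(lightning_style)
print(sorted(model_style))  # ['fc.bias', 'fc.weight']
```

Loading the LightningModule's `state_dict` directly makes this renaming unnecessary, which is why the examples had to drop it.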

Others
- [x] Updated `malpolon.data.data_module.export_predict_csv` to enable more flexibility when outputting the prediction CSV for a single data point.

- [x] Added GLC24 pre-extracted examples (habitat and species) using the MultiModalEnsemble (MME) model
- Automatic download of the dataset from Kaggle (depending on the value of boolean config parameter `data.download_data`)
- Automatic download of the model weights from Seafile if not already on disk, via a new `model.model_kwargs.pretrained` key in the config file. The weights enable users to directly run our MME model on our GLC24_pre_extracted test set and reach ~30% micro F1-score, with ~26% micro precision and ~36% micro recall, as well as ~96% micro AUC.

- [x] Added and updated unit tests for GLC24 pre-extracted examples (habitat and species)

- [x] Added new content in online documentation and tutorial files
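
For readers unfamiliar with the micro-averaged scores quoted above for the MME weights, the sketch below shows how micro precision, recall and F1 are computed for multi-label predictions: true positives, false positives and false negatives are pooled over all samples and labels before the ratios are taken. The function name and the toy data are illustrative assumptions, not part of Malpolon.

```python
def micro_scores(y_true, y_pred):
    """Micro-averaged precision, recall and F1 for binary multi-label rows."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if p and t:
                tp += 1  # predicted and present
            elif p and not t:
                fp += 1  # predicted but absent
            elif t and not p:
                fn += 1  # present but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy example: 2 samples, 3 labels each.
precision, recall, f1 = micro_scores(
    y_true=[[1, 0, 1], [0, 1, 0]],
    y_pred=[[1, 1, 0], [0, 1, 0]],
)

# Sanity check on the reported figures: F1 is the harmonic mean of
# precision and recall, so 26% and 36% give roughly 30%.
print(round(2 * 0.26 * 0.36 / (0.26 + 0.36), 2))  # 0.3
```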

**Full Changelog**: https://github.com/plantnet/malpolon/compare/v1.3.0...v2.0.0

1.3.0

What's Changed
Main changes
- Created new module `malpolon.models.custom_models` which will host custom models proposed by Malpolon
- Split datamodule and model from `geolifeclef2024_multimodal_ensemble.py` to `glc2024_multimodal_ensemble_model.py` and `glc2024_pre_extracted_prediction_system.py` in `custom_models`.

- Added `malpolon` as a model provider. Currently only the MultiModalEnsemble (MME) model is provided; it can be selected through the "model_name" key of a config file with the value **`glc24_multimodal_ensemble`** (see `examples/benchmark/geolifeclef/geolifeclef2024_pre_extracted/config/glc24_cnn_multimodal_ensemble.yaml` in the repository)

- Added possibility to download model weights for any Malpolon model given a URL and a few file paths via `malpolon.standard_prediction_system.download_weights`
- Added model weight download info for the MME model. The example experiment file of MME now automatically downloads the weights from Seafile if not already on disk, via `model.model_kwargs.pretrained` key in the config file

- Updated the way `checkpoint_path` is passed on to models. Added an attribute `checkpoint_path` for all Malpolon models
- Updated all examples accordingly
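
The "download the weights if not already on disk" behavior described above can be sketched with the standard library alone. This is an illustration under assumptions: the helper `fetch_if_missing` and its parameters are hypothetical and do not reflect the signature of Malpolon's own `download_weights`.

```python
from pathlib import Path
from urllib.request import urlretrieve


def fetch_if_missing(url: str, out_dir: str, filename: str) -> Path:
    """Download `url` to out_dir/filename unless that file already exists."""
    target = Path(out_dir) / filename
    if not target.exists():
        # Create the destination folder on first use, then download.
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(url, str(target))
    return target


# Hypothetical usage (URL and paths are placeholders):
# weights = fetch_if_missing("https://example.org/weights.ckpt", "weights", "mme.ckpt")
```

Checking for the file first means repeated runs of an example reuse the local copy instead of downloading it again.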


Others
- MME: changed the way loss parameter `loss.pos_weight` is used in the model's `_step()` method so that its _state\_dict_ object stays the same before and after running the model in train mode.

- GLC22 examples in `benchmark` and `custom_train` have been updated to include an inference run option. This changed the return values of the class getter for the `test` dataset: the class now always returns a `{data, label}` pair, with `label` set to `-1` for the `test` dataset (inference run)
- Updated `malpolon/tests/test_examples.py` accordingly
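
The getter behavior described above for the GLC22 examples, always returning a data/label pair with a placeholder label at inference time, can be sketched as follows. `ToyDataset` is a hypothetical stand-in written for illustration; it is not the Malpolon dataset class.

```python
class ToyDataset:
    """Minimal dataset whose getter always returns a (data, label) pair."""

    def __init__(self, samples, labels=None, subset="train"):
        self.samples = samples
        self.labels = labels
        self.subset = subset

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        data = self.samples[idx]
        # No ground truth exists for the test subset (inference run),
        # so a sentinel label of -1 keeps the return shape unchanged.
        label = -1 if self.subset == "test" else self.labels[idx]
        return data, label


train_set = ToyDataset(samples=[10, 20], labels=[0, 1], subset="train")
test_set = ToyDataset(samples=[10, 20], subset="test")
```

Keeping the return shape identical across subsets lets the same training and inference code unpack every item as `data, label`.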

1.2.1

Changes
- Fixed models import from `malpolon.models`

Other
- Purged unwanted `dev` files from the poisoned PyPI package

1.2.0

New features
- **Datasets**
- Added a new dataset `geolifeclef2024_pre_extracted` following 2024 edition of Kaggle challenge [GeoLifeCLEF](https://www.kaggle.com/competitions/geolifeclef-2024/data/)
- Computed rolling `mean` and rolling `std` values of GeoLifeCLEF2024 dataset for each modality. These values are stored in this dataset's transform functions

- **Models**
- Added a new model "MultimodalEnsemble" in `geolifeclef2024_multimodal_ensemble`, based on picekl's work on [GeoLifeCLEF2024](https://www.kaggle.com/code/picekl/sentinel-landsat-bioclim-baseline-0-31626)

- **Scripts**
- Added new scripts `split_obs_spatially.py`, `sort_files_glc_fashion.sh`
- `split_obs_spatially.py`: splits a CSV observation dataset into a _training_ and a _val_ subset where _val_ observation plots are spatially separated from _training_ ones. This script uses the new **`verde`** package.
- `sort_files_glc_fashion.sh`:
> This script re-organizes files in one folder into folders and sub-folders in the same way as for the GeoLifeCLEF challenge.
> That is to say in the following manner.
>
> Each file is re-arranged in folders and sub-folders in the following way:
> A file named 'ABCDWXYZ.pt' located at 'root_path/' will be moved to
> 'root_path/YZ/WX/ABCDWXYZ.pt'.
>
> Each file name must be at least 3 characters long. For instance:
> A file named 'XYZ.pt' located at 'root_path/' will be moved to
> 'root_path/YZ/X/XYZ.pt'.
- `split_obs_per_species_frequency`: splits a CSV observation dataset into a _training_ and a _val_ subset based on species frequency
- Added `split_obs_spatially.py` and `split_obs_per_species_frequency.py` scripts to Malpolon as modules in `malpolon.data.utils`
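
The folder layout described for `sort_files_glc_fashion.sh` can be expressed as a small path computation. The sketch below is a Python illustration of the rule only, not the shell script itself; the function name `glc_destination` is hypothetical, and it assumes the "at least 3 characters" requirement applies to the file name without its extension.

```python
from pathlib import Path


def glc_destination(root: Path, filename: str) -> Path:
    """Compute where `filename` would be moved under the GLC-style layout."""
    stem = Path(filename).stem  # file name without its extension
    if len(stem) < 3:
        raise ValueError(f"{filename!r}: name must be at least 3 characters long")
    last_two = stem[-2:]     # first-level folder, e.g. 'YZ'
    prev_two = stem[-4:-2]   # second-level folder, e.g. 'WX' (or 'X' for a 3-character name)
    return root / last_two / prev_two / filename


print(glc_destination(Path("root_path"), "ABCDWXYZ.pt").as_posix())  # root_path/YZ/WX/ABCDWXYZ.pt
print(glc_destination(Path("root_path"), "XYZ.pt").as_posix())       # root_path/YZ/X/XYZ.pt
```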

Changes
- Renamed `scripts` folder to `toolbox`
- Renamed scenarios from {"Ecologists", "Inference", "Kaggle"} to {"Custom_train", "Inference", "Benchmarks"} and re-organized experiments
- Fixed examples-related bugs, file links, duplicate files and cleaned config files
- Updated code documentation, repository READMEs and examples tutorial files
