We are excited to announce the release of Geodata-Harvester v1.0.0! This release brings several new features, performance improvements, and bug fixes.
🌟 New Features
- **Time-Series Extraction**: The long-awaited extraction of time-series data is now available and integrated into the automated `harvest.run` function! Users can now process image collections over multiple time intervals and automatically extract temporally aggregated data, including climate data (SILO), Digital Earth Australia (DEA) post-processed satellite data, and Google Earth Engine data sources.
- **Multi-band raster data queries**: Multi-band data can now be extracted from raster images into data tables, including automatic generation and labeling of data columns for each band or channel per image.
- **FAQ Chatbot**: The new FAQ chatbot on the [GitHub page](https://sydney-informatics-hub.github.io/geodata-harvester/#what-is-it) gives users a quick and easy way to query the Geodata-Harvester documentation. The FAQbot leverages OpenAI's GPT and uses vector embeddings of the Geodata-Harvester reference material and code documentation.
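
As an illustration of the per-band column labeling mentioned under multi-band raster data queries, here is a minimal, library-free sketch. The helper names `band_column_names` and `sample_to_row`, the `_band{n}` suffix convention, and the sample values are assumptions made for this example; they are not the Geodata-Harvester API.

```python
def band_column_names(layer_name, n_bands):
    """Return one column label per band (hypothetical naming scheme)."""
    if n_bands == 1:
        # Single-band images keep the plain layer name
        return [layer_name]
    return [f"{layer_name}_band{i}" for i in range(1, n_bands + 1)]


def sample_to_row(layer_name, band_values):
    """Map the values sampled at one point (one per band) to labeled columns."""
    names = band_column_names(layer_name, len(band_values))
    return dict(zip(names, band_values))


# Hypothetical values sampled from a 3-band image at two point locations
samples = [(0.12, 0.34, 0.56), (0.21, 0.43, 0.65)]
table = [sample_to_row("landsat", s) for s in samples]

for row in table:
    print(row)
```

Each dictionary corresponds to one row of the resulting data table, with one labeled column per band.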
🚀 Performance Improvements
- **Raster query optimisation**: Performance gains were achieved by leveraging (rio)xarray for data extraction from raster images, resulting in faster execution times for all users.
- **Temporal processing optimisation**: The temporal aggregation process for stats extraction has been optimised to reduce the time required to extract temporally aggregated data from large image collections.
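
For readers unfamiliar with temporal aggregation, the idea can be sketched in plain Python as grouping dated observations into time intervals and reducing each group to a single statistic. This is only a conceptual sketch: the function `aggregate_by_year`, the yearly interval, and the sample observations are assumptions for illustration, and the package itself operates on image collections via (rio)xarray rather than on lists.

```python
import math
from collections import defaultdict
from datetime import date
from statistics import mean, median


def aggregate_by_year(observations, reducer=mean):
    """Group (date, value) pairs by calendar year and reduce each group.

    NaN values are skipped; a year with no valid values is omitted.
    """
    groups = defaultdict(list)
    for day, value in observations:
        if not math.isnan(value):
            groups[day.year].append(value)
    return {year: reducer(values) for year, values in sorted(groups.items())}


# Hypothetical values of one pixel across four image dates
observations = [
    (date(2020, 1, 15), 10.0),
    (date(2020, 7, 15), 20.0),
    (date(2021, 1, 15), float("nan")),  # missing observation
    (date(2021, 7, 15), 30.0),
]

yearly_mean = aggregate_by_year(observations)
yearly_median = aggregate_by_year(observations, reducer=median)
print(yearly_mean)
```

Passing a different `reducer` (for example `median`, `min`, or `max`) changes the aggregated statistic without changing the grouping logic.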
🐛 Main Bug Fixes
- **Fix missing data values for aggregation**: Resolved an issue where temporal processing aggregated over missing data values. These values are now identified from the image header via a case-insensitive search for nodata value names and replaced with NaN, so that missing data values in images no longer corrupt the aggregated stats.
- **Fix name objects**: Fixed naming conventions in image labeling and data table generation to ensure that all objects are named correctly and consistently.
- **Fix data table generation**: Fixed an issue where data tables were not generated correctly for multi-band raster data queries.
- **Fix datetime labeling**: Fixed an issue where extracted image dates were not added to the metadata in images and labels.
- **Fix potential issue in geopackage writing**: Fixed a potential issue of duplicate column names when writing geopackages if identically named data already exists in the results folder.
- **Clean up results CSV file**: Reordered columns in the results CSV and removed the geometry column, since Lat and Lng columns already exist.
- **Fix xarray2tif function due to rioxarray upgrade**: Fixed an issue in `xarray2tif` where writing multi-channel xarray data as GeoTIFF no longer worked after the rioxarray upgrade (from version '0.13.1' to '0.13.3').
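
The missing-data fix listed above can be illustrated with a small, library-free sketch: look up the nodata value in an image header using a case-insensitive key match, replace matching pixels with NaN, and skip NaN when aggregating. The set of key names in `NODATA_KEYS`, the helper functions, and the example header are assumptions for illustration; the actual header fields and implementation in Geodata-Harvester may differ.

```python
import math

# Hypothetical set of header keys that may hold the nodata value (lower-case)
NODATA_KEYS = {"nodata", "nodatavals", "nodata_value", "_fillvalue", "missing_value"}


def find_nodata(header):
    """Return the nodata value from a header dict, matching keys case-insensitively."""
    for key, value in header.items():
        if key.lower() in NODATA_KEYS:
            return value
    return None


def mask_nodata(values, nodata):
    """Replace every occurrence of the nodata value with NaN."""
    if nodata is None:
        return [float(v) for v in values]
    return [math.nan if v == nodata else float(v) for v in values]


def nanmean(values):
    """Mean over the valid (non-NaN) values; NaN if none are valid."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Hypothetical header and pixel values across three image dates
header = {"NoData_Value": -9999, "crs": "EPSG:4326"}
pixels = [1.0, -9999, 3.0]

masked = mask_nodata(pixels, find_nodata(header))
naive_mean = sum(pixels) / len(pixels)  # corrupted by the -9999 placeholder
clean_mean = nanmean(masked)            # aggregates valid values only
print(naive_mean, clean_mean)
```

Without the masking step, the nodata placeholder is treated as a real measurement and drags the aggregated mean far from the true value.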
📚 Documentation and Notebooks
- Added new notebooks to demonstrate the new temporal processing features
- Updated documentation to reflect the new features
- Added [Geodata-Harvester summary paper](https://github.com/Sydney-Informatics-Hub/geodata-harvester/blob/main/paper/paper.md)
- Added contribution guidelines
📦 Download & Installation
You can find the source code and installation instructions for Geodata-Harvester v1.0.0 on our [GitHub repository](https://github.com/Sydney-Informatics-Hub/geodata-harvester).
For any questions or issues, please refer to our [issue tracker](https://github.com/Sydney-Informatics-Hub/geodata-harvester/issues).
Happy coding! 🎉