What's Changed
Another round of massive performance improvements for `method="cohorts"`. This should substantially improve many Xarray groupby workloads (which use cohorts by default). Resampling in particular should be much faster.
The benchmark improvements undersell the changes, since the core loop is approximately quadratic. Graph construction time for the example in https://xarray.dev/blog/flox with `method="cohorts"` specified now drops from 30s to 3s 😱
Larger cohorts, fewer tasks:

| Before [15324a7a] <v0.8.3> | After [666d45e9] <main> | Ratio | Benchmark (Parameter) |
|------------------------------|---------------------------|---------|---------------------------------------------------------|
| 5180 | 3600 | 0.69 | cohorts.NWMMidwest.track_num_tasks |
| 4891 | 3385 | 0.69 | cohorts.NWMMidwest.track_num_tasks_optimized |
| 505 | 345 | 0.68 | cohorts.NWMMidwest.track_num_layers |

A much faster algorithm for detecting cohorts, which should also scale better:

| Before [15324a7a] <v0.8.3> | After [666d45e9] <main> | Ratio | Benchmark (Parameter) |
|------------------------------|---------------------------|---------|---------------------------------------------------------|
| 3.19±0.07ms | 2.42±0.05ms | 0.76 | cohorts.ERA5Google.time_find_group_cohorts |
| 1.04±0.01ms | 782±70μs | 0.75 | cohorts.PerfectMonthly.time_find_group_cohorts |
| 1.06±0.01ms | 781±70μs | 0.73 | cohorts.PerfectMonthlyRechunked.time_find_group_cohorts |
| 29.7±2ms | 12.6±0.9ms | 0.43 | cohorts.ERA5DayOfYear.time_find_group_cohorts |
| 7.76±1ms | 2.90±0.2ms | 0.37 | cohorts.ERA5MonthHour.time_find_group_cohorts |
| 8.17±0.8ms | 2.75±0.2ms | 0.34 | cohorts.ERA5MonthHourRechunked.time_find_group_cohorts |
| 242±5ms | 47.3±2ms | 0.2 | cohorts.NWMMidwest.time_find_group_cohorts |
| 28.8±3ms | 4.11±0.3ms | 0.14 | cohorts.ERA5DayOfYearRechunked.time_find_group_cohorts |

Total time is not too different, since we still have some overhead in constructing the graphs:

| Before [15324a7a] <v0.8.3> | After [666d45e9] <main> | Ratio | Benchmark (Parameter) |
|------------------------------|---------------------------|---------|---------------------------------------------------------|
| 162±5ms | 144±9ms | 0.89 | cohorts.ERA5DayOfYearRechunked.time_graph_construct |
| 20.7±0.2ms | 18.3±0.4ms | 0.89 | cohorts.ERA5Google.time_graph_construct |
| 3.21±0.2ms | 2.40±0.04ms | 0.75 | cohorts.PerfectMonthly.time_graph_construct |
| 181±10ms | 129±10ms | 0.71 | cohorts.NWMMidwest.time_graph_construct |
Changes
* More cohorts speedups by dcherian in https://github.com/xarray-contrib/flox/pull/290
* Typing fixes by dcherian in https://github.com/xarray-contrib/flox/pull/292
* Use set containment instead of perfect subsets by dcherian in https://github.com/xarray-contrib/flox/pull/291
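
For readers unfamiliar with the term in the last item: set containment is commonly defined as |A ∩ B| / |A|, the fraction of A's elements that also appear in B. A perfect subset is the special case where that fraction is exactly 1. The sketch below only illustrates the distinction; it is not flox's implementation, and the group labels, chunk indices, and helper names are all hypothetical.

```python
def containment(a: set, b: set) -> float:
    """Fraction of a's elements that also appear in b: |a & b| / |a|."""
    if not a:
        return 0.0
    return len(a & b) / len(a)


def is_perfect_subset(a: set, b: set) -> bool:
    """True only when every element of a is in b (containment == 1)."""
    return a <= b


# Hypothetical mapping: group label -> indices of the chunks it appears in.
chunks_for_group = {
    "jan": {0, 1, 2, 3},
    "feb": {0, 1, 2},      # fully inside jan's chunks
    "apr": {1, 2, 3, 6},   # mostly, but not entirely, inside jan's chunks
}

jan = chunks_for_group["jan"]
feb = chunks_for_group["feb"]
apr = chunks_for_group["apr"]

# A strict subset test only recognizes feb as related to jan.
subset_results = {name: is_perfect_subset(chunks_for_group[name], jan)
                  for name in ("feb", "apr")}

# Containment gives a graded measure, so apr's large overlap is visible too.
containment_results = {name: containment(chunks_for_group[name], jan)
                       for name in ("feb", "apr")}

print(subset_results)
print(containment_results)
```

A graded measure like this lets groups with large but imperfect overlap be treated as related, which a strict subset test cannot express. Note that containment is asymmetric: `containment(a, b)` and `containment(b, a)` generally differ.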
**Full Changelog**: https://github.com/xarray-contrib/flox/compare/v0.8.3...v0.8.4