After a lengthy hiatus of development on numbagg, we're back with a big release:
- Lots of new grouping functions, in an attempt to be an engine for [flox](https://github.com/xarray-contrib/flox), a library from dcherian & others. The functions include:
  - `group_nancount`
  - `group_nanargmax`, `group_nanargmin`
  - `group_nanfirst`, `group_nanlast`
  - `group_nansum_of_squares`
  - `group_nanprod`
  - `group_nanall`, `group_nanany`
  - `group_nanvar`
  - `group_nanstd`
  - `group_nanmax`, `group_nanmin`
- Lots of performance improvements to existing grouping functions
  - Initial benchmarking shows 2-5x the performance of pandas' equivalent functions (though mostly towards the lower end, and the benchmarks are not as robust as I'd like; feedback and verification welcome).
- Large test coverage expansion of grouping functions
- Improvements to the exponentially weighted moving functions:
  - A new `move_exp_nanvar` function
  - Code simplification and modest performance improvements to existing functions
  - Benchmarks show 1-5x the performance of pandas' equivalent functions.
- A modest performance gain to existing moving functions.
- Internally, we've removed some of the original hacks that were initially required. Thanks to `numba` for supporting many of these natively!
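
For anyone who wants to verify the grouping results or benchmarks against pandas, here is a minimal sketch of a pandas reference computation. The helper name `pandas_group_reference` and the sample data are hypothetical, and the sketch deliberately does not call `numbagg` itself: check the library's docs for the exact signatures of the `group_nan*` functions, and for whether `group_nanvar` uses the same `ddof` as pandas, before comparing.

```python
import numpy as np
import pandas as pd


def pandas_group_reference(values, labels):
    """Per-group, NaN-skipping count, variance and max, computed with pandas.

    Hypothetical helper for illustration; not part of numbagg.
    """
    grouped = pd.Series(values).groupby(np.asarray(labels))
    return pd.DataFrame(
        {
            "count": grouped.count(),  # number of non-NaN values per group
            "var": grouped.var(),      # sample variance (pandas default ddof=1), NaNs skipped
            "max": grouped.max(),      # maximum per group, NaNs skipped
        }
    )


# Made-up sample data: two groups of three values, each containing one NaN.
values = np.array([1.0, np.nan, 3.0, 4.0, 5.0, np.nan])
labels = np.array([0, 0, 0, 1, 1, 1])

reference = pandas_group_reference(values, labels)
print(reference)
```

The resulting frame is indexed by group label, so each column can be compared (for example with `np.allclose`) against the output of the corresponding grouping function on the same `values` and `labels`.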
The documentation needs a pass: the README could be reorganized, and the benchmarks could be measured and reported more systematically. It's possible that these large changes have introduced small bugs, particularly around edge cases such as unfamiliar dtypes. That said, the main use cases are quite well-tested, and we have pandas & numpy to thank for excellent comparisons to test against.
Please report any issues or questions. I (max-sixty) am excited `numbagg` is back, and will gauge how much to add based on how useful folks find it. And of course, thanks to shoyer for writing the original library!