What's Changed
Features
Checkpointing
`Checkpointing` is an overhaul of the previous commit structure. It is meant to better synchronize processing progress (i.e. committing topic offsets) with state updates, so that the state stays consistent.
It should also increase processing speed by roughly 1.3x to 2.5x thanks to its new batched commit approach.
To adjust the commit frequency, users can set the new `commit_interval` setting (default: 5 seconds):
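```python
from quixstreams import Application

app = Application(
    broker_address="localhost:9092",  # assumed local broker; adjust for your setup
    commit_interval=5,  # seconds between checkpoint commits
)
```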
For more details, [see the `Checkpoint` docs](https://quix.io/docs/quix-streams/advanced/checkpointing.html).
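To make the batched commit idea concrete, here is a minimal sketch in plain Python. It does not use Quix Streams and is not how the library implements checkpointing; the `CheckpointSketch` class and all of its attributes are hypothetical names chosen for illustration. It only shows the general pattern described above: state updates and offsets are buffered, then flushed together once `commit_interval` seconds have passed.

```python
from dataclasses import dataclass, field


@dataclass
class CheckpointSketch:
    """Toy model: buffer state updates and offsets, then flush them together."""

    commit_interval: float = 5.0  # seconds, mirrors the documented default
    committed_offset: int = -1  # last offset that was "committed"
    committed_state: dict = field(default_factory=dict)
    pending_state: dict = field(default_factory=dict)
    pending_offset: int = -1
    last_commit: float = 0.0
    commits: int = 0

    def process(self, offset: int, key: str, value: int, now: float) -> None:
        # Buffer the state change (a running sum per key) and the offset.
        current = self.pending_state.get(key, self.committed_state.get(key, 0))
        self.pending_state[key] = current + value
        self.pending_offset = offset
        # Flush only once the interval has elapsed since the last commit.
        if now - self.last_commit >= self.commit_interval:
            self.commit(now)

    def commit(self, now: float) -> None:
        # In this toy model, state and offset advance in the same step.
        self.committed_state.update(self.pending_state)
        self.committed_offset = self.pending_offset
        self.pending_state.clear()
        self.last_commit = now
        self.commits += 1


checkpoint = CheckpointSketch(commit_interval=5.0)
# Ten messages arriving one second apart (timestamps 1.0 through 10.0).
for offset in range(10):
    checkpoint.process(offset, key="abc", value=1, now=float(offset + 1))
```

With a 5 second interval, the ten messages above are flushed in two batches instead of ten individual commits, which is the kind of saving a batched approach aims for.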
GroupBy
`GroupBy` enables users to "group" or "re-key" their messages based on the message value, typically to perform (stateful) aggregations on them, much like `GROUP BY` in SQL.
With the new `StreamingDataFrame.group_by()`, you can do this in the same pipeline as other `StreamingDataFrame` operations, placed before or after the grouping, so only one `Application` is needed:
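```python
from quixstreams import Application

# data: {"user_id": "abc", "int_field": 5}
app = Application(broker_address="localhost:9092")  # assumed local broker
topic = app.topic("input-topic")  # assumed topic name, for illustration only
sdf = app.dataframe(topic)
sdf["new_col"] = sdf["int_field"] + 1
sdf = sdf.group_by("user_id")  # re-key messages by the "user_id" column
sdf = sdf.apply(lambda r: r["new_col"])
sdf = sdf.tumbling_window(duration_ms=3600).sum().final()
# ...etc...
```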
Users can group by a column name, or provide a custom grouping function.
For more details, [see the `GroupBy` docs](https://quix.io/docs/quix-streams/groupby.html).
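As a rough illustration of the two grouping styles (column name versus custom function), the sketch below re-keys a few sample messages in plain Python and sums a field per new key. It does not call Quix Streams; `rekey` and `sum_by_key` are hypothetical helpers, and the sample data simply extends the message shape shown above.

```python
from collections import defaultdict
from typing import Any, Callable, Union

KeyFn = Union[str, Callable[[dict], Any]]


def rekey(messages: list, key: KeyFn) -> list:
    """Return (new_key, value) pairs, deriving each key from the message value."""
    # A column name is shorthand for "look this field up in the value".
    key_fn = (lambda value: value[key]) if isinstance(key, str) else key
    return [(key_fn(value), value) for value in messages]


def sum_by_key(pairs: list, field_name: str) -> dict:
    """Aggregate one numeric field per key, similar to SUM(...) GROUP BY key."""
    totals = defaultdict(int)
    for new_key, value in pairs:
        totals[new_key] += value[field_name]
    return dict(totals)


messages = [
    {"user_id": "abc", "int_field": 5},
    {"user_id": "xyz", "int_field": 2},
    {"user_id": "abc", "int_field": 1},
]

# Group by a column name.
by_column = sum_by_key(rekey(messages, "user_id"), "int_field")

# Group by a custom function: here, the upper-cased first letter of the user id.
by_function = sum_by_key(
    rekey(messages, lambda value: value["user_id"][0].upper()), "int_field"
)
```

Here `by_column` holds one total per `user_id`, while `by_function` holds one total per derived key, which is the difference between the two grouping options.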
Enhancements
* Docs updates by stereosky in https://github.com/quixio/quix-streams/pull/344, https://github.com/quixio/quix-streams/pull/352
* add default error cb to Admin by tim-quix in https://github.com/quixio/quix-streams/pull/343
**Full Changelog**: https://github.com/quixio/quix-streams/compare/v2.4.2...v2.5.0