- Multiple operators have been reworked to avoid taking and releasing
Python's global interpreter lock while iterating over multiple items.
Windowing operators, stateful operators and operators like `branch`
will see significant performance improvements.
Thanks to damiondoesthings for helping us track this down!
- *Breaking change* `FixedPartitionedSource.build_part`,
`DynamicSource.build`, `FixedPartitionedSink.build_part` and `DynamicSink.build`
now take an additional `step_id` argument. This argument can be used when
labeling custom Python metrics.
- Custom Python metrics can now be collected using the `prometheus-client`
library.
- *Breaking change* The schema registry interface has been removed.
You can still use schema registries, but you need to instantiate
the (de)serializers on your own. This allows for more flexibility.
See the `confluent_serde` and `redpanda_serde` examples for how
to use the new interface.
- Fixes a bug where items would be incorrectly marked as late in sliding
and tumbling windows in cases where the timestamps are very far from
the `align_to` parameter of the windower.
- Adds `stateful_flat_map` operator.
- *Breaking change* Removes `builder` argument from `stateful_map`.
Instead, the initial state value is always `None` and you can call
your previous builder by hand in the `mapper`.
- *Breaking change* Improves performance by removing the `now:
datetime` argument from `FixedPartitionedSource.build_part`,
`DynamicSource.build`, and `UnaryLogic.on_item`. If you need the
current time, use:
```python
from datetime import datetime, timezone
now = datetime.now(timezone.utc)
```
- *Breaking change* Improves performance by removing the `sched:
datetime` argument from `StatefulSourcePartition.next_batch`,
`StatelessSourcePartition.next_batch`, and `UnaryLogic.on_notify`. You
should already have the scheduled next awake time in whatever
instance variable you returned in
`{Stateful,Stateless}SourcePartition.next_awake` or
`UnaryLogic.notify_at`.
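For the `stateful_map` change above, a minimal migration sketch follows. It assumes the `mapper` receives `(state, value)` and returns `(updated_state, emitted_value)`; check the operator's documentation for the exact signature. `build_state` and `running_sum` are hypothetical names, and the loop only simulates the calls for a single key without importing Bytewax.

```python
from typing import Optional, Tuple


def build_state() -> int:
    # Hypothetical stand-in for whatever you previously passed as `builder`.
    return 0


def running_sum(state: Optional[int], value: int) -> Tuple[Optional[int], int]:
    # The state is `None` the first time a key is seen, so call the old
    # builder by hand here.
    if state is None:
        state = build_state()
    state += value
    # Return the updated state and the value to emit downstream.
    return (state, state)


# Simulate the sequence of mapper calls for one key (assumption: the
# operator passes back the state returned by the previous call).
state = None
emitted = []
for value in [1, 2, 3]:
    state, out = running_sum(state, value)
    emitted.append(out)
```

Under these assumptions, `running_sum` would then be passed as the `mapper` in place of the removed `builder` plus the old mapper.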
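For the `sched` removal above, one way to keep the scheduled time around is to store it on the instance and read it back in `next_batch`. The sketch below is a plain Python class shaped like a source partition, not a real Bytewax subclass; `TickerPartition`, `_interval`, and `_next_awake` are hypothetical names, and the argument-free `next_batch` signature is an assumption based on the entry above.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Optional


class TickerPartition:
    """Hypothetical partition-shaped class; does not subclass Bytewax."""

    def __init__(self, interval: timedelta):
        self._interval = interval
        # Remember when we next want to be woken up.
        self._next_awake = datetime.now(timezone.utc)
        self._count = 0

    def next_batch(self) -> List[int]:
        # No `sched` argument: the scheduled time is the stored value.
        scheduled = self._next_awake
        # Schedule the following wake-up relative to the scheduled time,
        # not the wall clock, so the interval does not drift.
        self._next_awake = scheduled + self._interval
        self._count += 1
        return [self._count]

    def next_awake(self) -> Optional[datetime]:
        return self._next_awake


part = TickerPartition(timedelta(seconds=5))
first_awake = part.next_awake()
batch = part.next_batch()
```

The same pattern should apply to `UnaryLogic`: keep the value you return from `notify_at` in an instance variable and read it in `on_notify`.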