Index structures
- New: `OnDiskIndex` is based on HDF5 and can be accessed on demand from disk
- Indexes can now grow dynamically in size
- Data is now represented using pandas data frames internally
- Many operations have been vectorized to improve performance
- Early stopping now works in batches rather than per query
- New: `Indexer` class for indexing corpora
- New: PyTerrier transformers are provided for scoring and interpolation using Fast-Forward indexes
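
To illustrate what vectorized interpolation over data frames can look like, here is a minimal sketch in plain pandas. It does not use this library's API: the function name `interpolate`, the column names (`q_id`, `doc_id`, `score`), and the formula `alpha * sparse + (1 - alpha) * dense` are assumptions made for the example.

```python
import pandas as pd


def interpolate(sparse: pd.DataFrame, dense: pd.DataFrame, alpha: float) -> pd.DataFrame:
    """Combine two score tables column-wise (hypothetical helper, not library API)."""
    # Align both score tables on (query, document) pairs in a single join.
    merged = sparse.merge(dense, on=["q_id", "doc_id"], suffixes=("_sparse", "_dense"))
    # Assumed interpolation formula, applied to whole columns at once (no Python loop).
    merged["score"] = alpha * merged["score_sparse"] + (1 - alpha) * merged["score_dense"]
    # Order each query's documents by descending interpolated score.
    return (
        merged[["q_id", "doc_id", "score"]]
        .sort_values(["q_id", "score"], ascending=[True, False])
        .reset_index(drop=True)
    )


# Toy data for illustration only.
sparse = pd.DataFrame(
    {"q_id": ["q1", "q1", "q2"], "doc_id": ["d1", "d2", "d1"], "score": [10.0, 8.0, 5.0]}
)
dense = pd.DataFrame(
    {"q_id": ["q1", "q1", "q2"], "doc_id": ["d1", "d2", "d1"], "score": [0.2, 0.9, 0.5]}
)
result = interpolate(sparse, dense, alpha=0.5)
```

Working on whole columns instead of iterating over individual query-document pairs is the kind of change that typically yields the performance gains mentioned above.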
API changes
Many parts of the API have changed. The most important breaking changes are:
- Scores are now computed using `Index.__call__`
- Queries are no longer provided explicitly; they are attached to the ranking instead
- `InMemoryIndex` objects can no longer be saved to or loaded from disk
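
The calling convention described by the first two changes can be pictured with a toy example. This is only a sketch of the pattern (a callable index that reads the queries from the ranking it receives). `ToyRanking`, `ToyIndex`, and the `encode_query` argument are hypothetical names and do not reflect the library's actual classes or signatures.

```python
from dataclasses import dataclass


@dataclass
class ToyRanking:
    """Hypothetical ranking that carries its own queries."""

    queries: dict[str, str]           # query ID -> query text
    candidates: dict[str, list[str]]  # query ID -> candidate document IDs


class ToyIndex:
    """Hypothetical index that is scored by calling it on a ranking."""

    def __init__(self, doc_vectors, encode_query):
        self.doc_vectors = doc_vectors    # document ID -> vector
        self.encode_query = encode_query  # query text -> vector

    def __call__(self, ranking: ToyRanking) -> dict[str, dict[str, float]]:
        scores = {}
        for q_id, doc_ids in ranking.candidates.items():
            # The query text is taken from the ranking, not passed separately.
            q_vec = self.encode_query(ranking.queries[q_id])
            # Score each candidate by the dot product with the query vector.
            scores[q_id] = {
                d: sum(a * b for a, b in zip(q_vec, self.doc_vectors[d]))
                for d in doc_ids
            }
        return scores


# Toy encoder and vectors for illustration only.
index = ToyIndex(
    {"d1": [1.0, 0.0], "d2": [0.0, 1.0]},
    encode_query=lambda text: [float(len(text)), 1.0],
)
ranking = ToyRanking(queries={"q1": "abc"}, candidates={"q1": ["d1", "d2"]})
scores = index(ranking)
```

Because the ranking carries its queries, a single call is enough to score every query's candidates; the caller passes no separate query argument.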