To quantify the changes, the code base grew by 50% and 36 issues were resolved, making this by far the biggest release of txtai. These changes were designed to be fully backward compatible, but keep in mind that this is a new major release.
[What's new in txtai 4.0](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/24_Whats_new_in_txtai_4_0.ipynb) covers all the changes with detailed examples. The [documentation site](https://neuml.github.io/txtai) has also been refreshed.

New Features
--------------------------
- Store text content (168)
- Add option to index dictionaries of content (169)
- Add SQL support for generating combined embeddings + database queries (170)
- Add reindex method to embeddings (171)
- Add index archive support (172)
- Add close method to embeddings (173)
- Update API to work with embeddings + database search (176)
- Add content option to tabular pipeline (177)
- Update workflow example to support embeddings content (179)
- Add index metadata to embeddings config (180)
- Add object storage (183)
- Aggregate partial query results when clustering (184)
- Add function parameter to embeddings reindex (185)
- Add support for user defined column aliases (186)
- Use SQL bracket notation to support multi word and more complex JSON path expressions (187)
- Support SQLite 3.22+ (190)
- Add pre-computed vector support (192)
- Change document/object inserts to only keep latest record (193)
- Update documentation with 4.0 changes (196)
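
Several of the features above work together: content storage, dictionary input, SQL queries, index archives and the close method. The following is a minimal sketch of how they might be combined, not a definitive reference. The helper names `build_documents` and `run_demo`, the model path, the `length` field and the archive file name are illustrative assumptions; the linked notebook remains the authoritative source for exact usage.

```python
def build_documents(texts):
    """Turn raw strings into (id, data, tags) tuples with dictionary content."""
    # "length" is a hypothetical extra field used to show filtering on stored content
    return [(uid, {"text": text, "length": len(text)}, None) for uid, text in enumerate(texts)]


def run_demo(texts, archive="index.tar.gz"):
    """Index texts with content storage, run a SQL query, then archive the index."""
    # Imported lazily so build_documents can be used without txtai installed
    from txtai.embeddings import Embeddings

    # content=True stores text alongside the vectors; the model path is an assumed choice
    embeddings = Embeddings({"path": "sentence-transformers/nli-mpnet-base-v2", "content": True})
    embeddings.index(build_documents(texts))

    # Combine a similarity clause with a filter on a stored field
    results = embeddings.search(
        "select id, text, length, score from txtai "
        "where similar('feel good story') and length > 20"
    )

    # Save as a compressed archive, then release underlying resources
    embeddings.save(archive)
    embeddings.close()
    return results
```

Calling `run_demo(["Maine man wins $1M from $25 lottery ticket", "Make huge profits without work"])` would download the model on first use, so it is kept out of import time here.
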

Improvements
--------------------------
- Modify workflow to select batches with slices (158)
- Add tensor support to workflows (159)
- Read YAML config if provided as a file path (162)
- Make adding pipelines to API easier (163)
- Process task actions concurrently (164)
- Add tensor workflow notebook (167)
- Update default ANN parameters (174)
- Require Python 3.7+ (175)
- Consistently name embeddings id fields (178)
- Add txtai `__version__` attribute (181)
- Refresh notebooks for 4.0 (188)
- Modify embeddings to only iterate over input documents once (189)
- Improve efficiency of vector transformations (191)
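
To illustrate the general idea behind batch selection with slices (158) and single-pass iteration (189), here is a small generic sketch. It is not txtai's implementation; the `batches` function is a hypothetical helper that uses slices for sequences and consumes any other iterable exactly once.

```python
from itertools import islice


def batches(items, size):
    """Yield lists of up to `size` elements from `items`."""
    if isinstance(items, (list, tuple)):
        # Sequences can be sliced directly without building an iterator
        for start in range(0, len(items), size):
            yield list(items[start:start + size])
    else:
        # Generators and other iterables are read once, one batch at a time
        iterator = iter(items)
        while True:
            batch = list(islice(iterator, size))
            if not batch:
                return
            yield batch
```
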

Bug Fixes
--------------------------
- Add thread lock around API write calls (160)
- Expose caption and objects pipeline via API (161)
- Change pickle calls to use protocol supporting lowest Python version (182)
- HFOnnx expects ORT provider bug (195)
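
The pickle fix (182) relates to the Python 3.7+ requirement: pickle protocol 5 was only added in Python 3.8, so data written with it cannot be read on 3.7. A minimal sketch of pinning the protocol follows. The constant name `PICKLE_PROTOCOL`, the helper names and the value 4 are assumptions for illustration, not necessarily what txtai uses internally.

```python
import pickle

# Protocol 4 is readable by Python 3.4 and later; assumed value for illustration
PICKLE_PROTOCOL = 4


def serialize(obj):
    """Pickle an object with a fixed protocol so older interpreters can load it."""
    return pickle.dumps(obj, protocol=PICKLE_PROTOCOL)


def deserialize(data):
    """Load pickled bytes; the protocol is detected from the data itself."""
    return pickle.loads(data)
```
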