This is the first beta of 2.4. While we believe it is feature complete, there is still some wider testing that needs to happen. The goal of this release is to support the full re-analysis of the CMS Run 1 Higgs.
New Features:
* You can specify a single `http://` or `root://` file as input for a single file dataset.
* You can specify a list of `http://` and/or `root://` files. They will be processed by ServiceX as long as it has permission to access the data.
* A title can be given to each transform
* Add the ability to query a dataset for the data types it will return. This enables automatic data type discovery (required to keep the interface sensible in `coffea` and other upstream libraries).
* Python 3.9 is now supported
* Add support for the CMS Run 1 AOD backend `type`.
* Caching
  * Analysis Cache - one can create and check in a `json` file that maps queries to backend `request-id`s. This means that others can re-run and just download the data, rather than having to re-transform the data for the same queries.
  * A user can delete a data file from the local cache and it will automatically be re-downloaded
  * If a query status cache file is removed, it will be automatically re-fetched
* Configuration:
  * Endpoints can now have names rather than just types, supporting more than one backend of a single type (e.g. two `uproot` backends)
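
The Analysis Cache idea can be illustrated with a minimal sketch. This is not the library's implementation: the function names (`query_key`, `lookup_request_id`, `record_request_id`), the key scheme, and the file layout are all assumptions made for illustration. It only shows the general pattern of mapping a query to a backend `request-id` in a `json` file that can be checked in alongside an analysis.

```python
import hashlib
import json
from pathlib import Path
from typing import Optional


def query_key(dataset: str, query: str) -> str:
    """Return a stable key for a (dataset, query) pair (hypothetical scheme)."""
    payload = json.dumps({"dataset": dataset, "query": query}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def lookup_request_id(cache_file: Path, dataset: str, query: str) -> Optional[str]:
    """Return the recorded request id for this query, or None if unknown."""
    if not cache_file.exists():
        return None
    cache = json.loads(cache_file.read_text())
    return cache.get(query_key(dataset, query))


def record_request_id(cache_file: Path, dataset: str, query: str, request_id: str) -> None:
    """Add or update the request id for this query in the json file."""
    cache = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    cache[query_key(dataset, query)] = request_id
    # Sorted keys keep the file diff-friendly when it is under version control.
    cache_file.write_text(json.dumps(cache, indent=2, sort_keys=True))
```

With a file like this shared between collaborators, a second user whose query produces the same key could look up the existing `request-id` and download the results instead of triggering a new transform.
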
Bug Fixes:
* If the backend has _lost_ the data, automatically resubmit the query. This was broken when streaming URLs or files.
* Transforms that are marked `Fatal` are now correctly cleared from the local cache, so they can be re-run
* When a transform with lots of files fails, the error report will be truncated to the result from 20 different files, rather than... all 3000.
* When a notebook is run under Visual Studio Code, the progress bars are correctly shown (for processing and download).
* `StreamInfoUrl` is now exported
* Protect against filenames that are so long that the OS can't handle them. In particular, fix the current implementation so it has a more robust hashing mechanism for the modified filename.
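
The long-filename protection can be sketched roughly as follows. This is one common approach, not necessarily what the library does: the name `safe_filename`, the 200-character limit, and the choice of SHA-256 are assumptions for illustration. The point is that a truncated name stays unique because a hash of the full original name is appended.

```python
import hashlib

# Assumed limit for illustration; many filesystems cap a name near 255 bytes.
MAX_NAME_LENGTH = 200
DIGEST_LENGTH = 16


def safe_filename(name: str, max_length: int = MAX_NAME_LENGTH) -> str:
    """Shorten an over-long filename, appending a hash of the full name.

    Lengths are counted in characters, so names with multi-byte characters
    may need a smaller max_length to fit a byte-based OS limit.
    """
    if max_length <= DIGEST_LENGTH + 1:
        raise ValueError("max_length is too small to hold the hash suffix")
    if len(name) <= max_length:
        return name
    # Hash the complete original name so that two long names sharing a
    # prefix still map to different shortened names.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:DIGEST_LENGTH]
    keep = max_length - DIGEST_LENGTH - 1
    return f"{name[:keep]}-{digest}"
```

Because the hash is computed from the full name, the same input always yields the same shortened name, which matters when the shortened name is used to find a file in a local cache.
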
In Progress:
* Added logging information to support debugging the local machine downloading. We aren't saturating good connections and it isn't clear why that is happening yet.