Release Notes
Most of these changes were made for [v23.0.0](https://github.com/ME-ICA/tedana/releases/tag/23.0.0), but the package did not build for pip so the descriptive release notes are stored with this version.
This release changes many internal aspects of the code, will make future improvements easier, and will hopefully make it easier for more people to understand their results and contribute. The denoising results should be identical. Right before releasing this new version, we released version 0.0.13, which is the last version of the older code.
User-facing changes
* **Breaking change**: `tedana` can no longer be used to manually change component classifications. A separate program, `ica_reclassify`, can be used for this. This makes it easier for programs like [Rica](https://github.com/ME-ICA/rica) to output a list of component numbers to change and to then change them with `ica_reclassify`. Internally a massive portion of the `tedana` workflow code was a mess of conditional statements that were designed just so that this functionality could be retained within tedana. By separating out `ica_reclassify` the `tedana` code is more comprehensible and adaptable.
* **Breaking change**: No components are classified as `ignored`. `Ignored` has long confused users. It was intended to identify components with such low variation that it was not worth deciding whether to lose a statistical degree of freedom by rejecting them. They were treated identically to `accepted` components. Now they are classified as `accepted` and tagged as `Low variance` or `Borderline Accept`. This `classification_tag` now appears on the html report of the results and the component table file.
* **Breaking change**: In the component table file `classification_tag` has replaced `rationale`. Since the tags use words and one can assign more than one tag to each component, these are both more informative and more flexible than the older `rationale` numerical codes.
* It is now possible to select different decision trees for component selection using the `--tree` option. The default tree is `kundu` and that should replicate the current outputs. We also include `minimal` which is a simpler tree that is intended to provide more consistent results across a study, but needs more testing and validation and may still change. [Flow charts for these two options are here.](https://tedana.readthedocs.io/en/stable/included_decision_trees.html)
* Anyone can create their own decision tree. If one is using metrics that are already calculated, like `kappa` and `rho`, and doing greater/less than comparisons, one can make a decision tree with a user-provided json file and the `--tree` option. More complex calculations might require editing the tedana python code. This change also means any metric that has one value per component can be used in a selection process. This makes it possible to combine the multi-echo metrics used in tedana with other selection metrics, such as correlations to head motion. The documentation includes [instructions on building and understanding this component selection process](https://tedana.readthedocs.io/en/stable/building_decision_trees.html).
* Additional files are saved which store key internal calculations and what steps changed the accept vs reject classifications for each component. The documentation includes [descriptions of the newly outputted files and file contents](https://tedana.readthedocs.io/en/stable/outputs.html#classification-output-descriptions). These include:
  * A registry of all files outputted by tedana. This allows for multiple file naming methods and means internal and external programs that want to interact with the tedana outputs just need to load this file.
  * A file of all the metrics calculated across components, such as the `kappa` and `rho` elbow thresholds.
  * A decision tree file which records the exact decision tree that was run on the data and includes metrics calculated and component classifications changed in each step of the process.
  * A component status table that summarizes each component's classification at each step of the decision tree.
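
As a rough illustration of the user-provided JSON decision tree mentioned in the list above, the sketch below builds a one-node tree in Python and writes it to a JSON file. Only `necessary_metrics`, `kappa`, `rho`, and the function name `dec_left_op_right` come from these notes; every other key name (`tree_id`, `info`, `nodes`, `functionname`, `parameters`, and the parameter names) is an assumption made for illustration, and a real tree file probably needs more fields. Check the linked instructions on building decision trees before using anything like this.

```python
import json
import os
import tempfile

# Hypothetical minimal tree. Key names other than `necessary_metrics` and the
# function name `dec_left_op_right` are assumptions, not a verified schema.
tree = {
    "tree_id": "example_rho_vs_kappa",
    "info": "Toy tree: reject components whose rho exceeds kappa.",
    "necessary_metrics": ["kappa", "rho"],
    "nodes": [
        {
            "functionname": "dec_left_op_right",
            "parameters": {
                "decide_comps": "all",  # which components this node looks at
                "left": "rho",          # metric on the left of the comparison
                "op": ">",              # greater/less than comparison
                "right": "kappa",       # metric on the right of the comparison
                "if_true": "rejected",
                "if_false": "accepted",
            },
        }
    ],
}


def save_and_reload(tree_dict, filename="my_tree.json"):
    """Write the tree to a temporary directory as JSON and read it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        path = os.path.join(tmp_dir, filename)
        with open(path, "w") as handle:
            json.dump(tree_dict, handle, indent=4)
        with open(path) as handle:
            return json.load(handle)


reloaded = save_and_reload(tree)
print(json.dumps(reloaded["nodes"][0], indent=4))
```

In practice the file would be saved somewhere permanent and its path passed to the `--tree` option.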
Under-the-hood changes
* The component classification process that designates components as "accepted" or "rejected" was completely rewritten so that every step in the process is modular and the inputs and outputs of every step are logged.
* Moved towards using the terminology of "Component Selection" rather than "Decision Tree" to refer to the code that's part of the selection process. "Decision Tree" is used more specifically to refer to the steps to classify components.
* A `ComponentSelector` object was created to hold common elements of the selection process, including the `component_table` and information about what happens at every step of the decision tree. Additional information that is stored in `ComponentSelector` and saved in files (as described above) includes `component_table`, `cross_component_metrics`, `component_status_table`, and `tree`.
* The new class is defined in `./selection/component_selector.py`, the functions that define each node of a decision tree are in `./selection/selection_nodes.py`, and some key common functions used by selection_nodes are in `./selection/selection_utils.py`.
* By convention, functions in selection_nodes.py that can change component classifications begin with `dec_` for decision, and functions that calculate cross_component_metrics begin with `calc_`.
* A key function in selection_nodes.py is `dec_left_op_right`, which can be used to change classifications based on the intersection of 1-3 boolean statements. This means most of the decision tree consists of modular functions that calculate cross_component_metrics followed by tests of boolean conditional statements.
* When defining a decision tree, a list of `necessary_metrics` is required and, when a tree is executed, the `used_metrics` are saved. This information is both a good internal check and can potentially be used to calculate metrics as defined in a `tree` rather than separately specifying the metrics to calculate and the tree to use.
* `io.py` is now used to output a registry (default is `desc-tedana_registry.json`) and can be used by other programs to read in files generated by `tedana` (e.g., load the optimally combined time series and ICA mixing matrix from the output of tedana rather than needing to input the names of each file separately).
* Some terminology changes, such as using `component_table` instead of `comptable` in code
* Integration tests now store testing data in `.testing_data_cache` and only download data if the data on OSF was updated more recently than the local data.
* Nearly 100% of the new code and 98% of all tedana code is covered by integration testing.
* Tedana python package management now uses pyproject.toml
* **Possible breaking change** Minimum python version is now 3.8 and minimum pandas version is now 2.0 (might cause problems if the same python environment is used for packages that require older versions of pandas)
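
The modular selection process described in the list above can be sketched with a toy example. This is not tedana's implementation: the functions `left_op_right` and `run_tree`, the parameter names, the `"unclassified"` and `"nochange"` labels, and the tag text are all hypothetical and chosen only to show the idea of a node that compares two per-component metrics, a check that the necessary metrics exist, a record of the metrics actually used, and a per-step status history.

```python
import operator

# Comparison operators a node may use.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}


def make_component(kappa, rho):
    """Create a toy component record (hypothetical layout)."""
    return {
        "kappa": kappa,
        "rho": rho,
        "classification": "unclassified",
        "classification_tags": [],
        "status_history": [],
    }


def left_op_right(components, left, op, right, if_true, if_false="nochange",
                  decide_comps="all", tag_if_true=None):
    """Toy decision node: compare `left` with `right` for selected components."""
    compare = OPS[op]
    used_metrics = {left}
    if isinstance(right, str):  # `right` may be a metric name or a constant
        used_metrics.add(right)
    for comp in components:
        if decide_comps == "all" or comp["classification"] == decide_comps:
            right_value = comp[right] if isinstance(right, str) else right
            is_true = compare(comp[left], right_value)
            outcome = if_true if is_true else if_false
            if outcome != "nochange":
                comp["classification"] = outcome
            if is_true and tag_if_true:
                comp["classification_tags"].append(tag_if_true)
        # Record the classification after this step, changed or not.
        comp["status_history"].append(comp["classification"])
    return used_metrics


def run_tree(components, necessary_metrics, nodes):
    """Check that the necessary metrics exist, run each node, return used metrics."""
    missing = [m for m in necessary_metrics if any(m not in c for c in components)]
    if missing:
        raise ValueError(f"Missing metrics: {missing}")
    used_metrics = set()
    for node in nodes:
        used_metrics |= left_op_right(components, **node)
    return used_metrics


components = [make_component(80.0, 20.0), make_component(15.0, 40.0),
              make_component(30.0, 25.0)]
nodes = [
    {"left": "rho", "op": ">", "right": "kappa", "if_true": "rejected",
     "tag_if_true": "rho exceeds kappa"},
    {"left": "kappa", "op": ">=", "right": 50.0, "if_true": "accepted",
     "decide_comps": "unclassified"},
]
used = run_tree(components, ["kappa", "rho"], nodes)
for comp in components:
    print(comp["classification"], comp["classification_tags"], comp["status_history"])
```

Keeping each node as a plain function with declarative arguments is what lets a tree be described as data and lets every step's effect be logged.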
Changes
* [REF] Decision Tree Modularization by jbteves handwerkerd n-reddy marco7877 tsalo in 756
* Update python-publish.yml by tsalo in 945
**Full Changelog**: <https://github.com/ME-ICA/tedana/compare/0.0.13...23.0.1>