Implemented in [PR29](https://github.com/biocore/DEICODE/pull/29).
Features
* The `--min-feature-count` and `--iterations` options are now usable when
running DEICODE outside of QIIME 2.
* When running DEICODE outside of QIIME 2, a nonexistent output directory no
longer raises an error. DEICODE now tries to create the output directory
automatically, including multiple levels of missing directories if
needed.
* Options' default values (if applicable) are now shown when running
`deicode --help`.
* The `--in-biom` and `--output-dir` options are now marked as required when
running DEICODE outside of QIIME 2 (so just running, e.g., `deicode` without
any options will give you a clearer error message than before).
* The RPCA functionality is now exposed via the `deicode.rpca` module,
which contains an `rpca()` function.
* Duplicate indices and columns now raise a `ValueError`. Previously,
the script `deicode.scripts.rpca.py` would just drop any duplicates.
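The command-line behavior described in this list (hyphenated option names, required options, defaults shown in the help text, and automatic creation of the output directory) can be sketched with the standard library. This is only an illustration: it uses `argparse` rather than DEICODE's own CLI code, and `build_parser`, `prepare_output_dir`, and the `deicode-sketch` program name are hypothetical. The only default value taken from this changelog is `10` for `--min-feature-count`.

```python
import argparse
import os


def build_parser():
    """Build a stand-in parser; this is NOT DEICODE's real CLI definition."""
    parser = argparse.ArgumentParser(prog="deicode-sketch")
    # required=True makes a bare invocation fail with a clear usage error.
    parser.add_argument("--in-biom", required=True,
                        help="Input table in BIOM format.")
    parser.add_argument("--output-dir", required=True,
                        help="Directory to write results to.")
    # "%(default)s" is expanded by argparse, so the default appears in --help.
    parser.add_argument("--min-feature-count", type=int, default=10,
                        help="Minimum feature count (default: %(default)s).")
    return parser


def prepare_output_dir(path):
    """Create the output directory, including missing parent directories."""
    # exist_ok=True means an already existing directory is not an error.
    os.makedirs(path, exist_ok=True)
    return path
```

Calling `build_parser().parse_args(["--in-biom", "table.biom", "--output-dir", "results/run1"])` yields a namespace with `in_biom`, `output_dir`, and `min_feature_count` attributes, since `argparse` converts the hyphens in option names to underscores.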
Backward-incompatible changes [stable]
* The following option names have changed when running DEICODE outside of QIIME
2:
| Original Name | New Name |
| -------------------- | -------------------- |
| `--in_biom` | `--in-biom` |
| `--output_dir` | `--output-dir` |
| `--min_sample_depth` | `--min-sample-count` |
* `deicode.scripts._rpca` has been replaced by `deicode.scripts._standalone_rpca`.
* Similarly, the `rpca()` function within `deicode.scripts._rpca` has been
replaced by the `standalone_rpca()` function in
`deicode.scripts._standalone_rpca`.
* `deicode.preprocessing.inverse_rclr` was removed along with its
tests. This code was redundant with `skbio.stats.composition.clr_inv`,
which can be applied to clr-transformed data. Furthermore,
this inverse is a holdover from old versions of DEICODE,
where we directly interpreted the imputation, and it is no
longer useful for the output.
* `deicode.ratios.py` was removed. This (untested) code was an unfinished
feature that will be replaced by rankratioviz. It was only used for the
visualizations in the manuscript, and it will be stored in the
DEICODE-benchmarking repository, where it is actually used.
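For readers who relied on `inverse_rclr`, the arithmetic behind a clr inverse is short: exponentiate the values, then rescale them to sum to 1. The following is a dependency-free sketch of that math only. It is not scikit-bio's implementation, and the function names `clr` and `clr_inverse` are hypothetical. In real code, `skbio.stats.composition.clr_inv` is the function to call.

```python
import math


def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    # Subtracting the mean log centers the values so that they sum to zero.
    return [value - mean_log for value in logs]


def clr_inverse(clr_values):
    """Invert clr: exponentiate, then rescale so the parts sum to 1."""
    exps = [math.exp(value) for value in clr_values]
    total = sum(exps)
    return [e / total for e in exps]
```

Applying `clr_inverse(clr(x))` to a composition `x` whose parts already sum to 1 recovers `x` up to floating-point error.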
Backward-incompatible changes [experimental]
Performance enhancements
* Removed scikit-learn dependency.
Bug fixes
* Some of the QIIME 2 RPCA behavior was not mirrored perfectly in the non-QIIME
2 RPCA code. Here is a list of the "new" things done by the non-Q2 RPCA code
that it didn't do before:
* Uses `--min-feature-count` with a default value of `10`. Previously, the
non-Q2 RPCA code didn't do this filtering step at all.
* Adds a PC3 containing zeros if the `rank` is set to `2` (to support
visualizing these biplots in Emperor).
* A minimum value of `2` is now enforced for the `--rank` option.
* A minimum value of `1` is now enforced for the `--iterations` option.
* Fixed the test in `deicode/scripts/tests/` to check the correct output
files produced by DEICODE (previously, this test was looking at the incorrect
files).
* Fixed a test in `deicode/q2/tests/` to correctly check for NaNs in the
ordination produced by DEICODE (previously, this test was using Python's
built-in `any()` function instead of pandas' `.any()` method, which
resulted in the test being incorrect).
* The iteration indexing in `deicode/_optspace.py` was off (see fedarko's
comment in PR 29). This caused the number of iterations run to be one less
than the input value. This should not have had an impact on any results.
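The `any()` fix in the list above is easy to misread, so here is a small sketch of the general pitfall. The original test code is not shown in this changelog, so this only illustrates one way such a mistake can arise. To stay dependency-free, a plain dict of columns stands in for a pandas `DataFrame` (an assumption made for illustration); iterating over either one yields the column labels, not the cell values. The name `has_nan` is hypothetical.

```python
import math

# Stand-in for an ordination table: column label -> values. It has no NaNs.
ordination = {"PC1": [0.1, -0.2], "PC2": [0.3, 0.0], "PC3": [0.0, 0.0]}

# Built-in any() iterates over the labels. Non-empty strings are truthy,
# so the result is True even though no value in the table is NaN.
labels_are_truthy = any(ordination)


def has_nan(table):
    """Return True only if some value in some column is actually NaN."""
    return any(math.isnan(value)
               for column in table.values()
               for value in column)
```

With an actual `DataFrame` named `df`, an elementwise check can be written as `df.isna().any().any()`.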
Deprecated functionality [stable]
Deprecated functionality [experimental]
Miscellaneous
* Since `deicode.rpca` is now used by both the QIIME 2 and non-QIIME 2 code,
the amount of redundant code has decreased. This should simplify DEICODE's
codebase.
* Shared RPCA parameter settings between the QIIME 2 and non-QIIME 2 code
(descriptions and default values) are now stored in `deicode._rpca_defaults`.
This further cuts down on redundancy: developers now only have to update the
description or default value of an option in this one place.
* The files produced by the non-QIIME 2 code have been renamed as follows to be
consistent with the data inside the artifacts produced by the QIIME 2 code:
| Original Name | New Name |
| --------------------- | --------------------- |
| `RPCA_Ordination.txt` | `ordination.txt` |
| `RPCA_distance.txt` | `distance-matrix.tsv` |
* The `skbio.OrdinationResults` short and long method names for ordinations
produced by the non-QIIME 2 code have changed as follows, in order to be
consistent with the ordinations produced by the QIIME 2 code:
| Original Name | New Name |
| ------------------------------- | -------------------------------- |
| `PCoA` | `rpca_biplot` |
| `Principal Coordinate Analysis` | `(Robust Aitchison) RPCA Biplot` |
* These changes shouldn't actually impact anything within the
ordination files, since as of writing the [scikit-bio ordination format](http://scikit-bio.org/docs/latest/generated/skbio.io.format.ordination.html) doesn't include either of the method names. Just listing this here to be safe.
* Various typo fixes
* Various RPCA test code enhancements
* A citation of DEICODE's
[published paper](https://msystems.asm.org/content/4/1/e00016-19) is now
included in the citations of QIIME 2 artifacts generated with DEICODE.
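Scripts that still expect the old standalone output file names may need a small migration step. The sketch below renames files in an existing output directory according to the file-name table in this Miscellaneous section. The helper name `rename_legacy_outputs` is hypothetical and is not part of DEICODE.

```python
from pathlib import Path

# Old -> new output file names, as listed in the Miscellaneous table.
RENAMED_OUTPUTS = {
    "RPCA_Ordination.txt": "ordination.txt",
    "RPCA_distance.txt": "distance-matrix.tsv",
}


def rename_legacy_outputs(output_dir):
    """Rename old-style output files in output_dir; return the new paths."""
    output_dir = Path(output_dir)
    renamed = []
    for old_name, new_name in RENAMED_OUTPUTS.items():
        old_path = output_dir / old_name
        new_path = output_dir / new_name
        # Skip files that are absent, and never overwrite an existing file.
        if old_path.is_file() and not new_path.exists():
            old_path.rename(new_path)
            renamed.append(new_path)
    return renamed
```

The helper is safe to run more than once: on a directory that has already been migrated it finds nothing to rename and returns an empty list.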