Major changes:
* GTDB-Tk now uses a **divide-and-conquer** approach where the bacterial reference tree is split into multiple class-level subtrees. This reduces the memory requirements of GTDB-Tk from **320 GB** of RAM when using the full GTDB R07-RS207 reference tree to approximately **35 GB**. A manuscript describing this approach is in preparation. If you wish to continue using the full GTDB reference tree, use the `--full_tree` flag.
* Archaeal classification now uses a refined set of 53 archaeal-specific marker genes based on the recent publication by [Dombrowski et al., 2020](https://www.nature.com/articles/s41467-020-17408-w). This set of archaeal marker genes is now used by GTDB for curating the archaeal taxonomy.
* All directories containing intermediate results are **now removed** by default at the end of the `classify_wf` and `de_novo_wf` pipelines. If you wish to retain these intermediate files, use the `--keep_intermediates` flag.
* All MSA files produced by the `align` step are now compressed with gzip.
* The classification summary and failed genomes files are now the only files linked in the root directory of `classify_wf`.
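
Because the MSA files from the `align` step are now gzip-compressed, downstream scripts that read them as plain text need to open them with a gzip-aware reader. The following is a minimal sketch using only the Python standard library. The helper `read_fasta_gz` and the file name `demo.user_msa.fasta.gz` are illustrative assumptions and are not part of GTDB-Tk; substitute the path of the file produced by your own run.

```python
import gzip
from pathlib import Path


def read_fasta_gz(path):
    """Parse a gzip-compressed FASTA file into a {sequence_id: sequence} dict.

    Hypothetical helper for illustration; it is not a GTDB-Tk function.
    """
    records = {}
    seq_id = None
    # "rt" makes gzip decode the compressed bytes to text lines.
    with gzip.open(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the ID.
                fields = line[1:].split()
                seq_id = fields[0] if fields else ""
                records[seq_id] = []
            elif seq_id is not None:
                # Sequences may be wrapped over several lines; collect the pieces.
                records[seq_id].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


if __name__ == "__main__":
    # Write a tiny stand-in alignment so the sketch runs without GTDB-Tk output.
    demo = Path("demo.user_msa.fasta.gz")  # assumed file name, for illustration only
    with gzip.open(demo, "wt") as out:
        out.write(">genome_1\nMKV-\nLA\n>genome_2\n--VT\nLA\n")

    msa = read_fasta_gz(demo)
    for name, seq in msa.items():
        print(name, len(seq))
```

Alternatively, the files can be decompressed on the command line with `gunzip` before being passed to tools that do not read gzip input directly.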
Features:
* `convert_to_itol` to convert trees into iTOL format (373)
* Output FASTA files are compressed by default (369)
* Intermediate files will be removed by default when using the `classify_wf` and `de_novo_wf` workflows unless `--keep_intermediates` is specified (369)
* Add `--genes` flag for Error (362)
* A warning will be displayed if pplacer fails to place a genome (360 / 356)
**Important**
* This version is not backwards compatible with GTDB release 202.
* This version requires a [new reference package](https://data.gtdb.ecogenomic.org/releases/release207/207.0/auxillary_files/gtdbtk_r207_data.tar.gz)