The custom_taxonomy_databases script contains a parser and a database handler for customizing taxonomy databases
built from NCBI, QIIME or CanSNPer sources. It can export to NCBI-formatted names.dmp and nodes.dmp files
as well as to a slimmed tab-separated file. Databases can be merged at selected
nodes (taxonomy IDs), and resolution can be added to certain subgroups (i.e. using a tab-separated file).
The script was initially written to allow the use of GTDB with some custom modifications to allow separation of
subgroups. GTDB was created by an Australian group that aimed to restructure the taxonomic relations of the NCBI
taxonomy tree to strictly follow a phylogenetic structure (http://gtdb.ecogenomic.org/). This script can use the
taxonomy.tsv files from the GTDB downloads page as input (with --taxonomy_type set to QIIME). By default,
the script reads a tab-separated file containing parent and child columns (identified by their column headers).
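
As a rough illustration of how a QIIME-style lineage can be reduced to parent/child pairs, here is a minimal
sketch. It is not the script's own parser. It assumes each input line holds an accession, a tab, and a
semicolon-separated lineage such as d__Bacteria;p__Proteobacteria. The function names and the "root" label are
hypothetical.

```python
def lineage_to_edges(lineage, root="root"):
    """Turn 'd__A;p__B;c__C' into (parent, child) pairs, starting from root."""
    names = [name.strip() for name in lineage.split(";") if name.strip()]
    edges = []
    parent = root
    for child in names:
        edges.append((parent, child))
        parent = child  # the current node is the parent of the next level
    return edges


def parse_taxonomy_lines(lines):
    """Collect unique (parent, child) pairs from 'accession<TAB>lineage' lines."""
    edges = []
    seen = set()
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip empty lines
        _accession, lineage = line.split("\t", 1)
        for edge in lineage_to_edges(lineage):
            if edge not in seen:  # shared ancestors are stored only once
                seen.add(edge)
                edges.append(edge)
    return edges
```

Each resulting pair corresponds to one row of a parent/child table, which is the shape the default
tab-separated input is described as having.
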
All data is kept in a sqlite3 database (.ctdb by default) and can be dumped at will to NCBI-formatted names.dmp
and nodes.dmp files. The export formats supported in version 0.2b are NCBI and TSV. The TSV dump format is similar to
the NCBI dump except that it contains a header (parent/child), has the parent on the left and uses only a tab to separate
the columns (not <tab>|<tab>).
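
The difference between the two dump formats can be sketched as follows. This is an illustration under stated
assumptions, not the script's actual export code: the table layout (nodes with child, parent and node_rank
columns), the demo IDs and the function names are all hypothetical. A real NCBI nodes.dmp carries more
fields per row than the three written here, and the exact TSV header names are assumed to be parent and child.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical nodes table and demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE nodes (child INTEGER PRIMARY KEY, parent INTEGER, node_rank TEXT)"
    )
    conn.executemany(
        "INSERT INTO nodes VALUES (?, ?, ?)",
        [(1, 1, "no rank"), (2, 1, "domain"), (3, 2, "phylum")],  # made-up IDs
    )
    conn.commit()
    return conn


def dump_ncbi_nodes(conn):
    """NCBI style: child first, fields separated by <tab>|<tab>, rows end in <tab>|."""
    rows = conn.execute("SELECT child, parent, node_rank FROM nodes ORDER BY child")
    return "".join(f"{child}\t|\t{parent}\t|\t{rank}\t|\n" for child, parent, rank in rows)


def dump_tsv(conn):
    """TSV style: header line, parent on the left, plain tab between columns."""
    rows = conn.execute("SELECT parent, child FROM nodes ORDER BY child")
    lines = ["parent\tchild\n"]
    lines.extend(f"{parent}\t{child}\n" for parent, child in rows)
    return "".join(lines)


if __name__ == "__main__":
    conn = build_demo_db()
    print(dump_ncbi_nodes(conn), end="")
    print(dump_tsv(conn), end="")
```

Keeping the parent on the left and using a single tab makes the TSV dump easy to read back with any generic
tab-separated reader, while the NCBI layout stays compatible with tools that expect nodes.dmp.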