-----------------
* Implemented allele ratio filtering for J gene discovery
* J genes are now discovered as part of the pipeline (previously, the
``discoverj`` script had to be run manually)
* In each iteration, dendrograms are now created not only for V genes, but
also for D and J genes. The file names are ``dendrogram_D.pdf`` and
``dendrogram_J.pdf``.
* The V dendrograms are now in ``dendrogram_V.pdf`` (no longer
``V_dendrogram.pdf``). This groups all dendrogram files together when the
iteration directory is listed.
* The ``V_usage.tab`` and ``V_usage.pdf`` files are no longer created.
Instead, ``expressed_V.tab`` and ``expressed_V.pdf`` are created. These
contain similar information, but an allele-ratio filter is used to
filter out artifacts.
* Similarly, ``expressed_D.tab`` and ``expressed_J.tab`` and their
``.pdf`` counterparts are created in each iteration.
* Removed ``parse`` subcommand (functionality is in the ``igblast`` subcommand)
* New CDR3 detection method (heavy chain sequences only): CDR3 start/end coordinates
are pre-computed using the database V and J sequences. This increases the detection
rate to 99% (previously less than 90%).
* Removed the ability to check discovered genes for required motifs. This never
worked well.
* Added a ``clonotypes`` column to ``candidates.tab`` that attempts to count how many
clonotypes are associated with a single candidate (using only exact occurrences).
This is intended to replace the ``CDR3s_exact`` column.
* Added an ``exact_ratio`` option to the germline filtering options. This checks the
ratio of exact V occurrence counts (``exact`` column) between alleles.
* Germline filtering option ``allele_ratio`` was renamed to ``clonotypes_ratio``
* Implemented a cache for IgBLAST results. When the same dataset is re-analyzed,
possibly with different parameters, the cached results are used instead of
re-running IgBLAST, which saves a lot of time. If the V/D/J database or the
IgBLAST version has changed, results are not re-used.
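
The cache invalidation rule in the last item can be illustrated with a minimal
sketch. This is not the actual implementation: the names ``cache_key`` and
``ResultCache`` are hypothetical, the store is an in-memory dictionary rather
than files on disk, and the inputs are plain strings. The idea shown is only
that a key derived from the sequences, the V/D/J database and the IgBLAST
version changes whenever any of the three changes, so stale results are never
re-used.

```python
import hashlib


def cache_key(sequences, database, igblast_version):
    """Return a hex digest that depends on all three inputs (hypothetical helper)."""
    h = hashlib.sha256()
    for part in (igblast_version, database, sequences):
        data = part.encode()
        # Length-prefix each part so that field boundaries are unambiguous
        h.update(str(len(data)).encode() + b":")
        h.update(data)
    return h.hexdigest()


class ResultCache:
    """In-memory stand-in for a result cache (assumption: a real one would persist to disk)."""

    def __init__(self):
        self._store = {}

    def get_or_compute(self, key, compute):
        # Only call the expensive function when no result is stored for this key
        if key not in self._store:
            self._store[key] = compute()
        return self._store[key]
```

Re-analyzing the same dataset with different downstream parameters yields the
same key, so the expensive step is skipped. Changing the database or the
IgBLAST version yields a different key and forces a fresh run.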