A long-overdue release covering some minor functionality updates and bugfixes.
Additional functionality:
- Write out reads failing regex matching with `extract`/`whitelist` (see options `--filtered-out`, `--filtered-out2`). See #328 for motivation.
- Ignore template length with paired-end `dedup`/`group` (see option `--ignore-tlen`). See #357 for motivation. Thanks @skitcattCRUKMI.
- Ignore read pair suffixes, e.g. `/1` or `/2`, with `extract`/`whitelist` (see option `--ignore-read-pair-suffixes`). See #325, #391, #418, PierreBSC/Viral-Track#9 for motivation.
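
As a rough illustration of what ignoring read pair suffixes means, the sketch below strips a trailing `/1` or `/2` from a read identifier so that the two mates of a pair compare equal by name. The helper `strip_pair_suffix` and the read identifiers are hypothetical and are not taken from the UMI-tools code base; the actual implementation may differ.

```python
def strip_pair_suffix(read_id: str) -> str:
    """Hypothetical helper: drop a trailing /1 or /2 from a read identifier."""
    if read_id.endswith(("/1", "/2")):
        return read_id[:-2]
    return read_id


# Made-up identifiers for the two mates of one read pair.
read1_id = "read_0001/1"
read2_id = "read_0001/2"

# Without the suffixes, the mates share the same name.
same_pair = strip_pair_suffix(read1_id) == strip_pair_suffix(read2_id)
```

Identifiers without a recognised suffix are returned unchanged, so the helper is safe to apply to every read.
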
Performance:
- Sped up error correction mapping for cell barcodes in `whitelist` by using a BK-tree. Thanks @redst4r. Note that this adds a new Python dependency (`pybktree`), which is available via `pip` and `conda-forge`.
- Very slight reduction in memory usage for `dedup`/`group` via a bugfix that reduces the number of reads retained in the buffer. Thanks to @mitrinh1 for spotting this (#428). The bug was equivalent to hardcoding the option `--buffer-whole-contig` on. That option ensures all reads with the same start position are grouped together for deduplication, but it does not yield reads until the end of each contig, which increases memory usage. The bug therefore did not affect the results output.
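
To show why a BK-tree speeds up barcode error correction, here is a minimal pure-Python sketch of the data structure. It is not the `pybktree` implementation and not the UMI-tools code. The `BKTree` class, the barcodes, and the choice of Hamming distance (which assumes equal-length barcodes) are all assumptions made for illustration.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length barcodes."""
    return sum(x != y for x, y in zip(a, b))


class BKTree:
    """Minimal BK-tree for a discrete metric (illustrative sketch only)."""

    def __init__(self, distance):
        self.distance = distance
        self.root = None  # each node is (item, {edge_distance: child_node})

    def add(self, item):
        if self.root is None:
            self.root = (item, {})
            return
        node = self.root
        while True:
            d = self.distance(item, node[0])
            if d == 0:
                return  # item already present
            child = node[1].get(d)
            if child is None:
                node[1][d] = (item, {})
                return
            node = child

    def find(self, query, threshold):
        """Return sorted (distance, item) pairs within `threshold` of `query`."""
        results = []
        stack = [self.root] if self.root is not None else []
        while stack:
            item, children = stack.pop()
            d = self.distance(query, item)
            if d <= threshold:
                results.append((d, item))
            # Triangle inequality: matches can only sit under edges whose
            # distance lies in [d - threshold, d + threshold].
            for edge, child in children.items():
                if d - threshold <= edge <= d + threshold:
                    stack.append(child)
        return sorted(results)


# Made-up whitelist of accepted cell barcodes.
tree = BKTree(hamming)
for barcode in ["AAAA", "AAAT", "CCCC", "GGGG"]:
    tree.add(barcode)

# Whitelisted barcodes within one mismatch of an observed barcode.
matches = tree.find("AAAC", 1)
```

The pruning step in `find` is what avoids comparing each observed barcode against every whitelisted barcode: subtrees whose edge distance falls outside the allowed window are never visited.
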
Bugfixes:
- Unmapped mates were not properly discarded with `dedup` and `group`. Thanks @Daniel-Liu-c0deb0t for rectifying this.