- CUDA Mapper
- Created a new interface to work with Indices. It allows grouping Indices together through the public API.
- Added support for Index from descriptor objects and use that throughout.
- Added API support for outputting sequences/alignments in SAM/BAM format
- Determining positions of reads' minimizers on device instead of on host
- Correctly skipping reads that are too short to fit in at least one window
- https://github.com/clara-parabricks/GenomeWorks/pull/557 and https://github.com/clara-parabricks/GenomeWorks/pull/562 fix errors in the evaluate_paf script that could produce incorrect counts of matched record starts / ends, or precision and recall values greater than 1.
- CUDA Aligner
- Fixed a bug in banded Myers which could lead to an out-of-bounds access for non-optimal alignments. If the backtrace in the Needleman-Wunsch matrix touched the border of the band at a specific point, it could cause an out-of-bounds access.
- Added support for the extended CIGAR format, which distinguishes between matches `=` and mismatches `X`.
- Improved performance for batches with widely varying alignment lengths.
- Added a FixedBandAligner base class (as a specialization of Aligner) for aligners that operate on a diagonal band. These aligners now provide an `aligner->reset_bandwidth(new_bandwidth)` function.
- Fixed the memory requirements of Hirschberg-Myers aligner. It can now process significantly larger batches at once.
- The default aligners returned by `create_aligner()` are the banded Myers aligner (a FixedBandAligner) for the `create_aligner()` function that specifies a bandwidth, and the Hirschberg-Myers aligner (an Aligner) for the `create_aligner()` call that does not specify a bandwidth. The API for the latter case is deprecated and will be replaced by a different `create_aligner()` function.
- CUDA Partial Order Aligner (CUDA POA)
- https://github.com/clara-parabricks/GenomeWorks/pull/551 Adds GFA output of the alignment graph generated by cudaPOA.
- Applied various changes to optimize performance of kernels for banded alignments.
- Added option `-s` to the CUDA POA API to allow managing the memory allocated for the adaptive score matrix.
- Introduced `static_band_traceback` as a new alignment mode in CUDA POA. This mode can potentially improve performance for processing long-read batches.
- Introduced `adaptive_band_traceback` for long-read batches. Different banded versions of Needleman-Wunsch kernels were unified.
- Added new CI tests to validate results of static/adaptive-band and static/adaptive-band with traceback against full-band Needleman-Wunsch kernel.
- Added description and hints to CUDA POA error codes.
- Added support for caching device allocations to reduce time spent allocating device memory.
- CUDA Extender
- ***New*** Added a new C++ module for a CUDA-accelerated ungapped seed-extension algorithm that uses seed positions in encoded input strands to extend and compute the alignment between the strands, adapted from SegAlign's Ungapped Extender - S. Goenka, Y. Turakhia, B. Paten and M. Horowitz, "SegAlign: A Scalable GPU-Based Whole Genome Aligner," in 2020 SC20: International Conference for High Performance Computing, Networking, Storage and Analysis (SC), Atlanta, GA, US, 2020 pp. 540-552.
- Pygenomeworks
- CIGAR strings for pairwise alignments can now be visualized.
- Other
- System requirements are updated.
- GenomeWorks semantic versioning is changed to calendar versioning.
- `DevicePreallocatedAllocator` allocates exactly the amount of memory requested
- Fixed silent execution errors, which could occur at GPU kernel launch under certain conditions.
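
For readers unfamiliar with the extended CIGAR format mentioned under CUDA Aligner, the following is a minimal illustrative sketch in Python. It is not the GenomeWorks implementation or API: the function name `extended_cigar` and the use of `-` as the gap character are assumptions made for this example. It shows how `=` (match) and `X` (mismatch) replace the single `M` operation of the basic CIGAR format.

```python
from itertools import groupby


def extended_cigar(query: str, target: str) -> str:
    """Build an extended CIGAR string from two gapped alignment rows.

    Hypothetical helper for illustration only; '-' is assumed to mark a gap.
    """
    if len(query) != len(target):
        raise ValueError("alignment rows must have equal length")

    ops = []
    for q, t in zip(query, target):
        if q == "-" and t == "-":
            raise ValueError("a column cannot be a gap in both rows")
        if q == "-":
            ops.append("D")  # base present in target only: deletion
        elif t == "-":
            ops.append("I")  # base present in query only: insertion
        elif q == t:
            ops.append("=")  # sequence match
        else:
            ops.append("X")  # sequence mismatch

    # Run-length encode consecutive identical operations.
    return "".join(f"{len(list(run))}{op}" for op, run in groupby(ops))


if __name__ == "__main__":
    print(extended_cigar("ACGT-A", "ACCTTA"))  # 2=1X1=1D1=
```

In the basic format the same alignment would read `4M1D1M`, which hides the mismatch at the third position.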