- Update overall project metrics summary to move to a flexible YAML format that
handles multiple analysis types. Re-include target, duplication and variant
metrics.
- Support disambiguation of mixed samples for RNA-seq pipelines. Handles alignment
to two genomes, running disambiguation and continuation of disambiguated samples
through the pipeline. Contributed by Miika Ahdesmaki and AstraZeneca.
- Handle specification of sex in metadata and correctly call X, Y and
mitochondrial chromosomes.
- Fix issues with open file handles for large population runs. Ensure ZeroMQ contexts
are closed and enable extension of ulimit soft file and user process limits within
user available hard limits.
- Avoid calling in regions with excessively deep coverage. Reduces variant calling
bottlenecks in repetitive regions with 25,000 or more reads.
- Improve `bcbio_nextgen.py upgrade` function to handle code, tools and data more
consistently. Each now requires an explicit specification, while other
options are remembered. Thanks to Jakub Nowacki.
- Generalize retrieval of RNA-seq resources (GTF files, transcriptome indexes) to use
genome-resources.yaml. Updates all genome resources files. Contributed by James Porter.
- Use sambamba for indexing, which allows multicore indexing to speed up index
creation for large BAM files. Falls back to samtools index if sambamba is not available.
- Remove custom Picard metrics runs and pdf generation. Eliminates dependencies on
pdflatex and R for QC metrics.
- Improve memory handling by providing fallbacks during common memory intensive steps.
Better handle memory on SLURM by explicitly allowing system memory in addition
to that required for processing.
- Update fastqc runs to use BAM files downsampled to 10 million reads to avoid
excessive run times. Part of a general speed up of the QC step.
- Add Qualimap to generate plots and metrics for BAM alignments. Off by default
due to speed issues.
- Improve handling of GATK version detection, including support for Appistry versions.
- Allow interruption of read_through trimming with Ctrl-C.
- Improve test suite: use system configuration instead of requiring test-specific setup.
Install and use a local version of nose using the installer-provided Python.
- Fix for crash with single-end reads in read_through trimming.
- Added a library complexity calculation for RNA-seq libraries as a QC metric.
- Added sorting via sambamba. Internally bcbio-nextgen now inspects the headers
of SAM/BAM files to find their sorting status, so make sure tools set it correctly.
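
For tool authors who want to check what the sorting-status entry above relies on, the following is a minimal sketch of reading the sort order from SAM header text. It assumes the standard `@HD` header line with an `SO:` tag as defined in the SAM specification; the function name `sort_order_from_header` and the example header are hypothetical, and this is not the code bcbio-nextgen itself uses.

```python
def sort_order_from_header(header_text):
    """Return the SO value from the @HD line of a SAM header, or None if absent.

    Hypothetical helper for illustration; expects plain-text header lines
    such as those printed by `samtools view -H`.
    """
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[len("SO:"):]
    return None


# Made-up header for demonstration purposes only.
example_header = "@HD\tVN:1.4\tSO:coordinate\n@SQ\tSN:chr1\tLN:1000\n"
print(sort_order_from_header(example_header))
```

A tool that sorts its output but leaves `SO:unsorted` (or omits the `@HD` line) would be reported here as unsorted or `None`, which is the situation the entry above asks tools to avoid.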