This long-awaited release includes major plotting enhancements in the `heatmap`, `scatter`, and `diagram` commands, as well as a new `export gistic` command, thanks to joint work by tetedange13 and tskir (see below).
There are also significant infrastructure improvements including bug fixes, modernized packaging, and build/test automation.
New features
------------
`diagram`:
- New options `--no-gene-labels` to not display gene labels on the plot, and `-c` / `--chromosome` to plot a single chromosome (628, 629, 634; thanks tetedange13)
`heatmap`:
New CLI options (35, 625, 632, 652; thanks tetedange13 and tskir):
- `--vertical`: Transpose the plot, displaying the genome axis vertically instead of horizontally
- `--delimit-samples`: Add a dividing line between sample rows (or columns, with `--vertical`)
- `--title`: Set the plot title
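As an illustration of how the new `heatmap` options combine, the following Python sketch assembles a command line and leaves execution to the caller. The helper name `heatmap_argv`, the file names, and the use of `-o` for the output path are assumptions made for this example; `cnvkit.py heatmap -h` remains the authoritative reference.

```python
import shlex


def heatmap_argv(filenames, output=None, vertical=False,
                 delimit_samples=False, title=None):
    """Build an argument list for `cnvkit.py heatmap` (hypothetical helper)."""
    argv = ["cnvkit.py", "heatmap", *filenames]
    if vertical:
        # Genome axis drawn vertically instead of horizontally
        argv.append("--vertical")
    if delimit_samples:
        # Separator line between samples
        argv.append("--delimit-samples")
    if title is not None:
        argv += ["--title", title]
    if output is not None:
        # Assumption: -o names the output image file
        argv += ["-o", output]
    return argv


# Placeholder file names, not shipped with CNVkit
argv = heatmap_argv(["tumor1.cns", "tumor2.cns"], output="cohort.pdf",
                    vertical=True, delimit_samples=True,
                    title="Cohort overview")
print(shlex.join(argv))  # shell-quoted form, requires Python 3.8+
```

The resulting list can be passed to `subprocess.run(argv, check=True)` once CNVkit is installed and the input files exist.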
`scatter`:
- New option `--fig-size`: Set the output image dimensions (600, 641; thanks tetedange13 and tskir)
- Show triangles at the bottom of the plot to indicate where segments are hidden below the plotted region by automatic pruning at 'ymin=-5'. Also log a warning when this happens. (385, 643, 645; thanks tetedange13, tskir, and micknudsen)
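The pruning indicator can be pictured with a small Python sketch. This is not CNVkit's plotting code: the function name `split_hidden_segments`, the tuple layout, the sample values, and the choice to place a marker at each segment's midpoint are all assumptions. Only the threshold of -5 comes from the note above.

```python
import logging

Y_MIN = -5.0  # pruning threshold named in the release note


def split_hidden_segments(segments, ymin=Y_MIN):
    """Separate segments whose log2 ratio falls below ymin.

    segments: iterable of (start, end, log2) tuples (hypothetical layout).
    Returns a (visible, hidden) pair of lists.
    """
    visible, hidden = [], []
    for seg in segments:
        if seg[2] < ymin:
            hidden.append(seg)
        else:
            visible.append(seg)
    if hidden:
        logging.warning("%d segment(s) below ymin=%g are hidden",
                        len(hidden), ymin)
    return visible, hidden


# Made-up example data
segments = [(0, 1000, -0.2), (1000, 2000, -6.5), (2000, 3000, 0.8)]
visible, hidden = split_hidden_segments(segments)

# Assumption: one triangle per hidden segment, at its midpoint on the x-axis
marker_positions = [(start + end) / 2 for start, end, _ in hidden]
```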
`export gistic`:
- New export command to generate an unsegmented "markers" file for use with GISTIC. GISTIC also takes a second input file with corresponding segments in SEG format, which CNVkit can generate with `export seg`. (622, 623, 776; thanks tetedange13, tskir, BioComSoftware)
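Since GISTIC needs both files, a two-step export might be scripted as in the Python sketch below. The helper name `gistic_input_commands`, the file names, the assumption that `export gistic` reads bin-level `.cnr` files while `export seg` reads `.cns` files, and the `-o` output option should all be checked against `cnvkit.py export gistic -h` and `cnvkit.py export seg -h`.

```python
def gistic_input_commands(cnr_files, cns_files, markers_out, seg_out):
    """Return the two command lines that prepare GISTIC inputs (hypothetical helper)."""
    # Markers file: unsegmented, bin-level data (assumed to come from .cnr files)
    markers_cmd = ["cnvkit.py", "export", "gistic", *cnr_files,
                   "-o", markers_out]
    # Segments file in SEG format (from .cns files)
    seg_cmd = ["cnvkit.py", "export", "seg", *cns_files, "-o", seg_out]
    return markers_cmd, seg_cmd


# Placeholder file names
markers_cmd, seg_cmd = gistic_input_commands(
    ["s1.cnr", "s2.cnr"], ["s1.cns", "s2.cns"],
    "cohort.markers.txt", "cohort.seg",
)
```

Each list can then be run with `subprocess.run(cmd, check=True)`.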
API and CLI changes
-------------------
- Running `cnvkit.py` without any arguments will now display the full help text instead of an error message.
- Supporting scripts (aside from `cnvkit.py`) are no longer installed automatically. They are still available in the source tree.
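For readers building similar command-line tools, the no-argument behavior described above is commonly implemented with `argparse` along the following lines. This is a generic sketch, not CNVkit's actual entry point; the subcommand stubs and the `main` signature are assumptions.

```python
import argparse
import sys


def build_parser():
    parser = argparse.ArgumentParser(prog="cnvkit.py")
    # Subcommands are optional here, so an empty argv parses without error
    subparsers = parser.add_subparsers(dest="command")
    subparsers.add_parser("heatmap", help="stub subcommand for illustration")
    subparsers.add_parser("scatter", help="stub subcommand for illustration")
    return parser


def main(argv, out=sys.stdout):
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.command is None:
        # No subcommand given: show the full help text instead of failing
        parser.print_help(out)
        return 0
    print(f"would run {args.command}", file=out)
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv[1:]))
```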
Documentation
-------------
- Clarified `bintest` usage, provided an example, and explained outputs. (646; thanks tetedange13 and tskir)
Bugfixes
--------
- Fixed several errors and warnings due to outdated usage of dependencies, e.g. pandas, pysam.
- Fixed the Dockerfile and Docker image to install R packages properly for CNVkit to use internally. (765; thanks 28rietd)
- Made the Makefile example/test workflow more portable across environments. (661, 666, 695, 699; thanks tetedange13)
- `batch`: Apply the `--drop-low-coverage` option in the `segmetrics` step. (694)
- `bintest`: Include the 'probes' column in the .cns output so that it is a valid .cns file. (closes 693)
- `fix`: Condense the error message when coordinate set contains duplicate values. (637, 638; thanks tskir)
- `fix`: Choose a smoothing window fraction based on the data size to help correct biases better at the extremes of the GC range, where previously some residual GC bias could still be present after correction. (379)
- BED inputs: Handle UCSC BED 'browser' header line, as used in Agilent BED files with a 2-line header. (closes 696, 618)
Internal
--------
- Modernized the packaging configuration with pyproject.toml, leaving a stub setup.py for legacy setuptools compatibility. (790)
- Set up automated testing through GitHub Actions (GHA) to verify Python versions 3.7 through 3.10 using pytest and tox. These tools also make local testing with multiple Python versions more reliable. (792, 793, 794)
- Updated minimum dependency versions to roughly match Ubuntu 22.04 LTS packages; these are used in CI, too.
- Applied black and pylint to reformat the codebase consistently and replace deprecated calls to libraries. (795)
- Removed joblib pinning. (589, 770; thanks DavidCain and risicle)
- Removed networkx pinning. (606, 771; thanks DavidCain)
- Made the extreme-GC filters more easily configurable via `params.py`. (738, 752, 753, 764; thanks tetedange13 and tsivaarumugam)