Cnvkit

Latest version: v0.9.11

0.7.5

Global speedups, friendlier error handling and miscellaneous bug fixes.
Documentation updates (thanks kyleabeauchamp; 67).
Expanded unit tests & restored continuous integration (TravisCI).
Raised the minimum pandas version to 0.17.1, the latest.

`rescale` (new command; 64):
- Adjust .cnr or .cns files for normal contamination or subclone fraction.
- Re-center log2 values by median (the usual), mode, mean, or biweight location.
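
As a rough illustration of what re-centering and contamination adjustment do to log2 ratios, here is a minimal Python sketch. It is not CNVkit's implementation: the function names `recenter` and `rescale_for_purity` are hypothetical, only median and mean centering are shown, and the purity model assumes the contaminating normal cells contribute a copy ratio of exactly 1.

```python
import math
import statistics


def recenter(log2_values, method="median"):
    """Shift log2 ratios so the chosen center estimate becomes 0."""
    if method == "median":
        center = statistics.median(log2_values)
    elif method == "mean":
        center = statistics.mean(log2_values)
    else:
        raise ValueError(f"unsupported method: {method!r}")
    return [value - center for value in log2_values]


def rescale_for_purity(log2_value, purity, min_ratio=1e-3):
    """Estimate the tumor-only log2 ratio from a mixed sample.

    Assumed model (not taken from CNVkit):
        observed_ratio = purity * tumor_ratio + (1 - purity) * 1
    """
    if not 0 < purity <= 1:
        raise ValueError("purity must be in (0, 1]")
    observed = 2 ** log2_value
    tumor = (observed - (1 - purity)) / purity
    # Clip to a small positive ratio so the logarithm stays defined.
    return math.log2(max(tumor, min_ratio))
```

Under this model, a single-copy loss in a 50% pure tumor is observed as a ratio of 0.75, and rescaling recovers a ratio of 0.5.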

`segment`:
- Detect outlier bins and ignore them during segmentation using a method similar to BIC-seq. Command line option: `--drop-outliers`; any outlier bins found will be logged.

`coverage`:
- If the given target BED file is missing the 4th column (gene names), fill in the dummy name "-" instead of crashing.
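
The fallback described above could look roughly like the following sketch. The helper `parse_bed_line` is hypothetical and assumes tab-separated BED input; it only shows the idea of substituting "-" when the name column is absent.

```python
def parse_bed_line(line, default_name="-"):
    """Parse one BED line, filling in a dummy name if column 4 is missing."""
    fields = line.rstrip("\n").split("\t")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    # Use the 4th column when present and non-empty, else the dummy name.
    name = fields[3] if len(fields) > 3 and fields[3] else default_name
    return chrom, start, end, name
```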

`segmetrics`:
- Expose alpha and number of bootstraps as command-line options `-a`/`--alpha` and `-b`/`--bootstrap` for calculating confidence intervals.
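
To show how alpha and the number of bootstraps interact, here is a generic percentile-bootstrap sketch for the mean of a segment's bin values. It is an assumption-laden illustration, not CNVkit's code: `bootstrap_ci` and its default values are hypothetical, and the statistic CNVkit bootstraps may differ.

```python
import random
import statistics


def bootstrap_ci(values, alpha=0.05, n_boot=100, seed=None):
    """Percentile bootstrap confidence interval for the mean of `values`."""
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    boot_means = sorted(
        statistics.mean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    # Take the alpha/2 and 1 - alpha/2 percentiles of the bootstrap means.
    lo_index = int((alpha / 2) * n_boot)
    hi_index = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return boot_means[lo_index], boot_means[hi_index]
```

A smaller alpha widens the interval, while more bootstraps make its endpoints more stable at the cost of run time.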

`antitarget`:
- Reduce default bin size from 150kb to 100kb.

`fix`:
- Speed improvements: now about 20 times faster on exomes.

API changes:
- Gene names to treat as meaningless and to ignore in reporting (by default "-", ".", "CGH") can be globally configured in `cnvlib/params.py` (params.IGNORE_GENE_NAMES).
- vary.VariantArray (used in `scatter`) can now parse VCF files with no samples (genotypes) as a table of plain loci.
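
For the first item, a filter over placeholder gene names might be sketched as below. The tuple mirrors the defaults quoted above; the helper `reportable_genes` is hypothetical, and the comma-separated gene field is an assumption about the file layout.

```python
# Default placeholder names, as listed in the release note above.
IGNORE_GENE_NAMES = ("-", ".", "CGH")


def reportable_genes(gene_field, ignored=IGNORE_GENE_NAMES):
    """Return gene names from a comma-separated field, minus placeholders."""
    return [name for name in gene_field.split(",") if name and name not in ignored]
```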

0.7.4

This is primarily a bugfix release.

`export`:
- `bed --show variant` now filters CNAs on sex chromosomes correctly, taking reference and sample genders into account.
- `nexus-ogt` format now emits BAFs more similar to the original VCF allele frequencies. Previously, if multiple SNVs fell into a single CNVkit genomic bin, the allele frequencies of those SNVs would all be "mirrored" above 0.5 before taking the median. Now the SNVs are mirrored in the direction of the majority of the SNVs in the bin, whether above or below 0.5, so that the output looks more balanced and low-frequency SNVs are more apparent.
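
The change in BAF mirroring for `nexus-ogt` can be illustrated with a small sketch. Both function names are hypothetical, and the tie-breaking rule (mirror upward when the bin is evenly split) is an assumption, not something stated in the release note.

```python
import statistics


def bin_baf_mirror_up(freqs):
    """Old behavior: mirror every allele frequency above 0.5, then take the median."""
    return statistics.median(max(f, 1 - f) for f in freqs)


def bin_baf_mirror_majority(freqs):
    """New behavior: mirror toward whichever side of 0.5 holds most SNVs."""
    n_above = sum(f > 0.5 for f in freqs)
    n_below = sum(f < 0.5 for f in freqs)
    if n_above >= n_below:  # assumed tie-break: mirror upward
        mirrored = [max(f, 1 - f) for f in freqs]
    else:
        mirrored = [min(f, 1 - f) for f in freqs]
    return statistics.median(mirrored)
```

For a bin whose SNVs sit mostly below 0.5, the first function reports a value above 0.5 while the second stays below it, which is what keeps low-frequency SNVs visible.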

`heatmap`:
- Sub-chromosomal regions can now be selected for display with the `-c` option, e.g. `-c chr7:125000000-145000000`, just like the same option in `scatter`.
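
A region string of the form shown above can be split into its parts with a short parser. This is a sketch only: `parse_region` is a hypothetical helper, and CNVkit's own parsing may accept more variants (for example a bare chromosome name).

```python
import re


def parse_region(text):
    """Parse 'chrom:start-end' into (chrom, start, end) with integer coordinates."""
    match = re.fullmatch(r"([^:]+):(\d+)-(\d+)", text)
    if match is None:
        raise ValueError(f"expected 'chrom:start-end', got {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return chrom, start, end
```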

`segment`:
- Fix the listing of gene names in each segment in the output .cns file. For a brief period, each gene's name was truncated to 1 character.

0.7.3

`access`:
- New command equivalent to the now-deprecated `genome2access.py` script.

`target`, `antitarget`:
- Always write output files in 4-column BED format.

`scatter`:
- Copy ratios (.cnr) are no longer required. Without this input file, behavior is similar to the now-deprecated `loh` command, but still more flexible.
- VCF input file can include multiple tumor samples and PEDIGREE tags; if a tumor sample ID is specified, all PEDIGREE tags will be checked to find the matching normal sample.
- VCFs processed by CLC Genomics Server are now parsed correctly.

`loh`:
- Deprecated. Use `scatter` with `-v` and no .cnr file instead.

`segment`:
- Preliminary support for segmenting SNP allele frequencies from a VCF in addition to total copy number (`-v` option). Details are likely to change in a later release. (34)
- In the `weight` column of the output file, values are now the sum, not the mean, of the weights of the probes covered by that segment.
- The `haar` segmentation method is improved to avoid duplicate breakpoints and run much faster.

`export bed`:
- Deprecate `--show-all` in favor of `--show` with possible arguments `all` (like `--show-all`), `ploidy` (default behavior), or `variant` (show the same regions as `export vcf`).

`export vcf`:
- Fix a typo in the SVLEN tag definition in the VCF header: Number should be 1, not -1, which caused GATK parsing to fail. (57; thanks chapmanb)

Python library `cnvlib`:
- Logging is now done with the Python standard library's `logging` module, making it easier to silence or redirect status messages. In particular, unit tests run more quietly. (52)
- Internal refactoring (including new features in GenomicArray, RegionArray, VariantArray) resulting in changes to the `cnvlib` API, as well as some performance improvements.
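
Because status messages go through the standard `logging` module, a calling script can adjust their verbosity with ordinary logging calls. The sketch below assumes the messages propagate to the root logger; the logger names CNVkit uses are not stated here, and `set_status_verbosity` is a hypothetical helper.

```python
import logging


def set_status_verbosity(quiet):
    """Raise or lower the root logger threshold to hide or show INFO messages."""
    root = logging.getLogger()
    # WARNING hides routine status messages; INFO shows them again.
    root.setLevel(logging.WARNING if quiet else logging.INFO)
    return root
```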

0.7.2

A variety of mostly minor improvements and bug fixes over v0.7.1.

`segment`, `gainloss`, `segmetrics`:
- Don't exclude very-low-coverage bins from calculations by default; instead, expose this option as `--drop-low-coverage`. (This option usually helps on tumor samples with some normal contamination, but leads to problems on germline samples with homozygous deletions.)

`segment`:
- Output .cns files now have a "weight" column which is the mean of the weights of the bins it covers.
- Output of the 'haar' segmentation method now has each segment's gene names listed, as with the other methods.
- Fixed a bug where every segment's probe count (the "probes" column) could be overwritten with the `_` character. (53; thanks chapmanb)

`segmetrics`:
- Each statistic is now printed in its own column, instead of squeezing all stats into the "gene" column. The confidence/prediction interval stats get two columns, `_lo` and `_hi` (lower and upper bound).

`loh`, `scatter`:
- Given a VCF called on a tumor-normal pair, use the paired normal to select appropriate germline SNPs for plotting.

`export`:
- New format "nexus-ogt" combines bin-level copy number ratios with b-allele frequencies given a VCF and a .cnr file. This replaces "nexus-basic" with the `-v` option that was introduced in v0.7.1; "nexus-ogt" stores the same info but can be viewed in BioDiscovery Nexus Copy Number without any special configuration (load it as the "Custom-OGT" data format).
- Renamed `bed` option `--show-neutral` to `--show-all`.
- `vcf` option `-g`/`--gender` now works properly for identifying CNVs on sex chromosomes.

`call`:
- Fixed the `threshold` method to calculate absolute copy number on sex chromosomes correctly. (49; thanks tskir)

0.7.1

This is primarily a bugfix release. Many more unit test cases were added to the automated test suite. Code coverage is now monitored at [Codecov](https://codecov.io/github/etal/cnvkit/commits) (thanks stevepeak).

`export nexus-basic`:
- New optional argument `-v`/`--vcf` extracts SNV b-allele frequencies from the given VCF file, matches them to the bins in the .cnr file, and prints an additional "baf" column in the output table. These allele frequencies can then be viewed in Nexus Copy Number, similar to a SNP array.

`call`:
- Fixed a bug in the `threshold` method where the copy number of haploid chromosomes was twice what it should be. The `clonal` method already handled these chromosomes properly. (49)
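
To see why haploid chromosomes need special handling, consider a simplified conversion from log2 ratio to absolute copy number. This sketch uses plain rounding rather than the `threshold` method's cutoffs, and `absolute_copy_number` with its `ref_copies` parameter is hypothetical.

```python
def absolute_copy_number(log2_ratio, ref_copies=2):
    """Convert a log2 ratio to an integer copy number.

    `ref_copies` is the expected copy number of the chromosome in the
    reference: 2 for autosomes, 1 for a haploid chromosome.
    """
    return max(0, round(ref_copies * 2 ** log2_ratio))
```

With a neutral log2 ratio of 0, passing `ref_copies=2` for a haploid chromosome would report 2 copies instead of 1, which is the kind of doubling the fix addresses.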

`reference`:
- Handle blank/empty antitarget BED and coverage (.cnn) files. This was a regression in v0.7.0 relative to earlier releases. (51)
- When calculating GC and RepeatMasker values, catch invalid BED ranges that extend beyond the length of the chromosome and raise an informative error. This would error before, too (in ngfrills.faidx), but the message would be baffling.
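
The range check in the second item might be sketched as follows. `check_bed_ranges` is a hypothetical helper that assumes 0-based, half-open BED coordinates and a dict of chromosome lengths; the error wording is illustrative only.

```python
def check_bed_ranges(regions, chrom_lengths):
    """Raise ValueError naming the first region that falls outside its chromosome."""
    for chrom, start, end in regions:
        if chrom not in chrom_lengths:
            raise ValueError(f"Unknown chromosome in region {chrom}:{start}-{end}")
        length = chrom_lengths[chrom]
        # BED intervals are half-open, so `end` may equal the chromosome length.
        if not 0 <= start <= end <= length:
            raise ValueError(
                f"Region {chrom}:{start}-{end} is outside chromosome "
                f"{chrom} (length {length})"
            )
```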

`fix`:
- Catch duplicated target ranges, e.g. the exact same bait labeled with two different gene names, and report those ranges in the error message. The `target` command's `--split` option should usually fix these, but sometimes it's not used.
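
Detecting the duplicated ranges described above amounts to grouping targets by coordinates. The following sketch uses a hypothetical helper, `find_duplicate_ranges`, over `(chrom, start, end, name)` tuples; it is not the check CNVkit performs internally.

```python
from collections import defaultdict


def find_duplicate_ranges(targets):
    """Map each (chrom, start, end) that occurs more than once to its names."""
    names_by_range = defaultdict(list)
    for chrom, start, end, name in targets:
        names_by_range[(chrom, start, end)].append(name)
    return {
        coords: names for coords, names in names_by_range.items() if len(names) > 1
    }
```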

0.7.0

CNVkit now depends on [pandas](http://pandas.pydata.org/), [SciPy](http://scipy.org/), and [PyVCF](https://github.com/jamescasbon/PyVCF). The internals were largely rewritten, so please report any bugs or other regressions you find.

[Documentation](https://cnvkit.readthedocs.org/) is much improved.

`export`:
- VCF format is supported (5, 41). The generated VCFs are compatible with many third-party tools, including development versions of [MetaSV](https://github.com/bioinform/metasv). (Thanks chapmanb)
- Removed the "freebayes" sub-command; use "export bed" instead.

`segment`:
- The names of genes (or other targeted loci) covered by each segment are now included in the output .cns file.
- The p-value or q-value threshold (depending on the method) can now be specified with `-t`/`--threshold`.
- The "haar" method works properly now (6). This segmentation algorithm is implemented in Python and does not require R to run. It is a bit faster than CBS, but not as accurate.

`loh`:
- Plot variant allele frequencies (VAFs) as their actual values, 0 to 1, instead of the mirrored b-allele frequency (0.5 to 1). Draw segment mean allele frequencies separately above and below 0.5. This matches how the equivalent SNP array data are typically viewed.

`antitarget`:
- Generate off-target bins for all chromosomes present in the "access" BED file, not just those where targeted regions occur. (37)

`coverage`:
- A minimum read mapping quality (MAPQ) value can now be specified with `-q`/`--min-mapq`. The default value is 0, i.e. reads are no longer excluded for low MAPQ or ambiguous mapping location. This should generally improve calling accuracy and avoid some spurious deletion calls.
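
The effect of the MAPQ threshold can be shown with a toy read counter. `count_usable_reads` is a hypothetical helper working on a plain list of MAPQ values; real coverage calculation reads these from a BAM file.

```python
def count_usable_reads(mapq_values, min_mapq=0):
    """Count reads whose mapping quality meets the minimum threshold."""
    return sum(1 for mapq in mapq_values if mapq >= min_mapq)
```

With the default of 0, every read is counted, including those with ambiguous mapping locations.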
