Somaticseq

Latest version: v3.7.4


3.7.4

In xgboost >=1.4, `ntree_limit` is replaced with `iteration_range` in xgboost's `predict`. This release uses `iteration_range=(0, iterations)` instead of `ntree_limit=iterations`.
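As a rough illustration of the change, the sketch below picks the keyword argument to pass to `predict` based on the installed xgboost version. The helper name `predict_kwargs` is hypothetical and is not part of SomaticSeq, which per the note above simply uses `iteration_range`.

```python
def predict_kwargs(xgboost_version: str, iterations: int) -> dict:
    """Return the keyword argument that limits prediction to `iterations` trees.

    Hypothetical helper for illustration only; not SomaticSeq code.
    """
    # Compare only the major and minor components, e.g. "1.7.6" -> (1, 7).
    major, minor = (int(part) for part in xgboost_version.split(".")[:2])
    if (major, minor) >= (1, 4):
        # Newer API: half-open range of boosting rounds to use.
        return {"iteration_range": (0, iterations)}
    # Older API: number of trees to use.
    return {"ntree_limit": iterations}
```

With a trained booster, this could be used as `booster.predict(dmatrix, **predict_kwargs(xgboost.__version__, iterations))`.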

3.7.3

Allow xgboost hyperparameters to be passed into `somaticseq_parallel.py`, e.g., `somaticseq_parallel.py --somaticseq-train --extra-hyperparameters scale_pos_weight:0.1 seed:100`. Previously, they could only be passed into `somatic_xgboost.py`. Beware, however, that multi-argument options like `--extra-hyperparameters` and `--features-excluded` cannot be placed immediately before `paired` or `single`. Otherwise, the parser will treat `paired` or `single` as one more value for that option instead of invoking `paired` or `single` mode.
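The ordering caveat can be reproduced with a small stand-alone `argparse` parser. This is a sketch under assumptions: the parser below is a toy, not the real `somaticseq_parallel.py` parser, and `--threads` is used here only as an example of a single-value option that ends the multi-argument list.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Toy parser with a multi-argument option and two modes (illustration only)."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--extra-hyperparameters", nargs="*", default=[])
    parser.add_argument("--threads", type=int, default=1)  # hypothetical option
    modes = parser.add_subparsers(dest="mode")
    modes.add_parser("paired")
    modes.add_parser("single")
    return parser


parser = build_parser()

# The mode name directly follows the multi-argument option, so it is
# consumed as one more hyperparameter and no mode is selected.
swallowed = parser.parse_args(
    ["--extra-hyperparameters", "scale_pos_weight:0.1", "seed:100", "paired"]
)

# Another option sits between the multi-argument option and the mode,
# which ends the hyperparameter list before the mode name.
separated = parser.parse_args(
    ["--extra-hyperparameters", "scale_pos_weight:0.1", "seed:100",
     "--threads", "2", "paired"]
)
```

In the first call `swallowed.mode` is `None`, while in the second `separated.mode` is `"paired"`.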

3.7.2

- More robustly check sorting order when VCF files are being read. Raise Exception when they are not sorted according to the reference file.
- Change `-u $UID` to `-u $(id -u):$(id -g)` when invoking docker command in `somaticseq.utilities.dockered_pipelines.container_option`.
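
The sorting check in the first item can be pictured roughly as follows. This is a minimal sketch, assuming records are reduced to `(contig, position)` pairs and the reference contig order is given as a list. The function name `check_sorted` and the use of `ValueError` are illustrative choices, not the actual SomaticSeq implementation.

```python
def check_sorted(records, contig_order):
    """Raise ValueError if (contig, position) records are not sorted
    according to the contig order of the reference. Illustration only."""
    rank = {contig: index for index, contig in enumerate(contig_order)}
    previous = None
    checked = 0
    for contig, position in records:
        if contig not in rank:
            raise ValueError(f"{contig} is not in the reference")
        current = (rank[contig], position)
        # Tuples compare by contig rank first, then by position.
        if previous is not None and current < previous:
            raise ValueError(f"{contig}:{position} is out of order")
        previous = current
        checked += 1
    return checked
```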

3.7.1

- Fixed three bugs where the dbSNP and COSMIC VCF files and the exclusion-region BED file were not passed properly in `makeSomaticScripts.py`.
- No change in SomaticSeq code otherwise.

3.7.0

Major feature upgrade: SomaticSeq now supports the input of any arbitrary VCF files in addition to the callers we have explicitly incorporated, e.g., via `--arbitrary-snvs callerX_snv.vcf callerY_snv.vcf` and `--arbitrary-indels callerA_indel.vcf callerB_indel.vcf` options for the `somaticseq_parallel.py` command.

You must separate SNVs and indels into separate VCF files before using them as input to SomaticSeq. If you have a VCF file that combines SNVs and indels, you may use the script included in our repo, i.e., `splitVcf.py -infile combined_variants.vcf -snv snvs.vcf -indel indels.vcf`. Input can be either `.vcf` or `.vcf.gz`. Output will be `.vcf`.
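
To make the SNV/indel distinction concrete, here is a minimal sketch that classifies a single REF/ALT pair by allele length. It is an assumption-laden illustration, not the logic of `splitVcf.py`: the function name `variant_type` is hypothetical, and how the real script treats multi-allelic, multi-nucleotide, or symbolic records is not stated here.

```python
def variant_type(ref: str, alt: str) -> str:
    """Classify one REF/ALT pair by allele length (illustration only)."""
    if len(ref) == 1 and len(alt) == 1:
        return "snv"      # single-base substitution
    if len(ref) != len(alt):
        return "indel"    # insertion or deletion
    return "other"        # e.g. multi-nucleotide substitution
```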

For the "arbitrary input VCF files," calls labeled as REJECT in the FILTER field will not be counted and will be assigned a value of _0_ in the `if_Caller_X` fields. Calls labeled as LowQual will be assigned a value of _0.5_. Calls without any filter label will be counted as a bona fide call for that particular VCF file and assigned a value of _1_, i.e., as though it is a PASS call. So modify your VCF files accordingly if needed.
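
The scoring rule above can be sketched as a small function. The name `caller_score` is hypothetical. Splitting the FILTER field on `;`, checking REJECT before LowQual, and giving any other label a value of 1 are assumptions made for this illustration.

```python
def caller_score(filter_field: str) -> float:
    """Map a VCF FILTER value to the value described for `if_Caller_X`.

    Illustration only; not the SomaticSeq implementation.
    """
    labels = set(filter_field.split(";"))
    if "REJECT" in labels:
        return 0.0   # not counted
    if "LowQual" in labels:
        return 0.5
    return 1.0       # treated as though it were a PASS call
```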


seqc2_v1.2
* This is a special release for the Somatic Mutation Working Group of the SEQC2 Consortium to establish v1.2 of the somatic reference call set, i.e., [Fang, L.T., Zhu, B., Zhao, Y. _et al_. Establishing community reference samples, data and call sets for benchmarking cancer mutation detection using whole-genome sequencing. _Nat Biotechnol_ **39**, 1151-1160 (2021)](https://doi.org/10.1038/s41587-021-00993-6 "Fang LT, et al. Nat Biotechnol (2021)") / [PMID:34504347](http://identifiers.org/pubmed:34504347) / [SharedIt Link](https://rdcu.be/cxs3D) / [Youtube presentation](https://youtu.be/nn0BOAONRe8).
* This release is based on the older SomaticSeq v2.8.1. It contains many custom scripts specifically designed to complete SEQC2's somatic reference samples project.
* This release is **not** intended for general use.
* No code change, but updated NCBI's new FTP address in the README over the original [commit](https://github.com/bioinform/somaticseq/tree/2c642fc2849d7a9360bd4b1d166b90b3e63ec429). The FTP address for the SEQC2 Somatic Mutation Working Group can be found [here](https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/seqc/Somatic_Mutation_WG), so navigate there if a file has moved from its original location.

3.6.3

- Fixed the `--somaticseq-algorithm` option in `makeSomaticScripts.py`. The default is `xgboost` but you can use `ada` (requires R).
- Used the `shell=True` option in many `subprocess.call` instances to allow more complicated arguments to be passed into the command line, e.g., `--action 'qsub -l walltime=100:00:00'`.
- Moved `utilities`, `genomicFileHandler`, and `vcfModifier` into `somaticseq` to prevent potential package conflicts.
- Modified `somaticseq/annotate_caller.py` to handle cases where Strelka's VCF file does not have the SomaticEVS field.
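
The effect of `shell=True` mentioned in the list above can be seen in a small experiment. This is a sketch that assumes a POSIX shell. It uses `subprocess.run` rather than `subprocess.call` so that the output can be captured, and the child process is just a Python one-liner that prints the arguments it received, not a SomaticSeq script.

```python
import shlex
import subprocess
import sys


def run_in_shell(command: str) -> str:
    """Run a full command string through the shell and return its stdout."""
    completed = subprocess.run(
        command, shell=True, capture_output=True, text=True, check=True
    )
    return completed.stdout.strip()


# A child process that prints the arguments it was given.
show_args = (
    f"{shlex.quote(sys.executable)} -c "
    f"{shlex.quote('import sys; print(sys.argv[1:])')}"
)

# The shell keeps the quoted value together as a single argument.
output = run_in_shell(show_args + " --action 'qsub -l walltime=100:00:00'")
```

Because the shell interprets the whole string, only trusted input should be passed this way.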

