Dajin2

Latest version: v0.5.5.1



0.5.5.1

This is a patch for version v0.5.5.

An unfinished inversion detection program had mistakenly been included in the production code.

Since the inversion detection program is scheduled for implementation in version v0.5.6 or later, the code in question has been removed.

0.5.5

📝 Documentation

+ Add `FAQ.md` and `FAQ_JP.md` to address the question: "Why is the read count of the Control sample lower in the output BAM file?". [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/b238d21fbb7cd3330a147bdde65b726278447649)]

🔧 Maintenance

+ Integrate insertion and inversion detection: Issue 31
  + Add `sv_handler` [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/d994d845b0b8ed0fa8affed7992f1d95bf163073)]
  + Rename the `is_insertion` arguments to `is_sv` [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/f2d3dc4ca2dff60fc869fb1f5b6b08f54490b564)]
  + Rename `insertions_to_fasta.generate_insertions_fasta` to `insertion_detector.detect_insertions`, because the function generates CSV tags as well as FASTA files. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/63c9d63bad627f529f272ea90c035e236f9dd1fb)]

+ Remove unused dependencies
  + `networkx`: Issue 49 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/524186bdce9e28d6357378d0baeb45670d2e22ed)]

0.5.4

💥 Breaking

+ Use simulated annealing to optimize cluster assignments in `clustering.constrained_kmeans` [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/b07b626c1def93022e79840e1e6e393fa400cefb)]
  + Since `ortools` is not installable on osx-arm64 in Bioconda, simulated annealing was implemented as an alternative method to solve min_cost_flow.

+ Change the criteria for terminating clustering. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/db6ec7245d0d1a7ff7204574cffdfd945ee5e854)]
  + The following termination criteria have been added:
    - Minimum cluster size is less than or equal to 0.5% of the sample's read number.
    - Decrease in the proportion of samples with a silhouette score of 0.25 or higher.
  + The following termination criterion has been removed:
    - Adjusted Rand Index >= 0.95, as it led to early termination when minor clusters were generated.
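The two added termination criteria can be illustrated with a minimal sketch. This is not DAJIN2's actual code: the function names and the way scores are passed in are hypothetical, and only the 0.5% and 0.25 thresholds come from the notes above.

```python
from typing import Sequence

MIN_CLUSTER_FRACTION = 0.005  # 0.5% of the sample's reads (from the changelog)
SILHOUETTE_THRESHOLD = 0.25   # silhouette cutoff (from the changelog)


def smallest_cluster_too_small(cluster_sizes: Sequence[int], total_reads: int) -> bool:
    """True when the smallest cluster is at or below 0.5% of the reads."""
    return min(cluster_sizes) <= total_reads * MIN_CLUSTER_FRACTION


def proportion_well_clustered(silhouette_scores: Sequence[float]) -> float:
    """Fraction of samples whose silhouette score is 0.25 or higher."""
    hits = sum(score >= SILHOUETTE_THRESHOLD for score in silhouette_scores)
    return hits / len(silhouette_scores)


def should_terminate(
    cluster_sizes: Sequence[int],
    total_reads: int,
    previous_scores: Sequence[float],
    current_scores: Sequence[float],
) -> bool:
    """Hypothetical stop rule: either added criterion ends the clustering."""
    proportion_decreased = (
        proportion_well_clustered(current_scores)
        < proportion_well_clustered(previous_scores)
    )
    return smallest_cluster_too_small(cluster_sizes, total_reads) or proportion_decreased
```

In this sketch either criterion alone is enough to stop; whether DAJIN2 combines them this way is an assumption.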

+ The threshold for `clustering.strand bias` determination has been loosened. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/5bbaa7d363bce03d6fbd4ba7fdf1c00e938d9809)]
  + This adjustment addresses cases like `+:13, -:2` (0.87) observed in `example_flox/flox-1nt-deletion`.
  + Since the minor allele is particularly susceptible, further adjustments may be necessary in the future.
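The value 0.87 in the example above is consistent with the share of forward-strand reads, 13 / (13 + 2). A small sketch of that arithmetic follows; the function name is hypothetical, and the actual threshold DAJIN2 applies is not stated here, so none is assumed.

```python
def forward_strand_ratio(plus_reads: int, minus_reads: int) -> float:
    """Share of reads on the + strand (hypothetical helper, not DAJIN2's API)."""
    return plus_reads / (plus_reads + minus_reads)


ratio = forward_strand_ratio(13, 2)
print(round(ratio, 2))  # 0.87
```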

🌟 New Features

+ Support for Apple Silicon (osx-arm64) in Bioconda🍎 Issue: 46

0.5.3

💥 Breaking

- Update `clustering.clustering`: Use Constrained Kmeans clustering to address the issue of cluster imbalance where extremely minor clusters were preferentially separated. Set `min_cluster_size` to 0.5% of the sample read count. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/c1b14e73d8a95fdb39e510a7a90e501d596b7f3a)]
  - As a result, `clustering.label_merger.py` is no longer needed and has been removed.

- Update `consensus.call_consensus`: For mutations determined to be sequence errors, we previously replaced them with unknown (`N`), but this `N` had low interpretability. Therefore, mutations that DAJIN2 determines to be sequence errors will now be assigned the same base as the reference genome. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/1f46215ae7054c4da088c638ad82e41dd0dc7227)]

🐛 Bug Fixes

- Remove the division by sequence length in `classifier.calc_match`, because it introduced a bias that prioritized alleles with shorter sequences. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/fa6fbd5a7f9693df3b067a3041df42198a0d65b7)]

- Fix `preprocess.mapping.generate_sam` to perform alignments with `map-ont` and `splice` in addition to `sr` for sequence lengths of 500 bp or less, and select the optimal preset from these alignments. Issue: 45 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/9e7fb93f3c7b74095d2afd08bf3fa0bc00e6f367)]

0.5.2

📝 Documentation

+ Add `FAQ.md` and `FAQ_JP.md` to provide answers to questions. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/c2217b006494ae73fda422a17edaf39fb97e8898)]

🌟 New Features

- Update `mutation_extractor` [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/9444ee701ee52adeb6271552eff70667fb49b854)]
  - Simplified the logic of the `is_dissimilar_loci` if statement. Additionally, changed the threshold for determining a mutation in Consensus from 75% to 50% (to accommodate the insertion allele in Cas3 Tyr Barcode10).
  - Updated `detect_anomalies` to use MLPClassifier to detect mutations more flexibly and accurately compared to the previous threshold setting with MiniBatchKMeans.

🔧 Maintenance

+ Make DAJIN2 compatible with Python 3.11 and 3.12. Issue: 43 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/8da9118f5c0f584ed1ab12541d5e410d1b9f0da8)]
  + pysam and mappy builds with Python 3.11 and 3.12 are now available on Bioconda.

+ Update GitHub Actions to test with Python 3.11 and 3.12. Issue: 43 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/54df79e60b484da429c1cbf6f12b0c19196452cc)]

+ Resolve the B023 warning (function definition does not bind loop variable `alignment_lengths`). [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/9c85d2f0410494a9b71d9905fad2f9e4efe30ed7)]

+ Add `question.yml` in GitHub Issue template. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/1172fddd34c382f92b6778d6f30fd733b458cc04)]
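For readers unfamiliar with the B023 warning mentioned in the maintenance list above, here is a generic illustration of the underlying Python behavior. It is not DAJIN2's code: the variable names are made up, and the default-argument fix shown is only one common remedy.

```python
# A function defined inside a loop looks up the loop variable when it is
# called, not when it is defined, so every closure sees the final value.
late_bound = []
for n in range(3):
    late_bound.append(lambda: n)
late_results = [func() for func in late_bound]  # [2, 2, 2]

# Binding the current value as a default argument captures it per iteration.
early_bound = []
for n in range(3):
    early_bound.append(lambda n=n: n)
early_results = [func() for func in early_bound]  # [0, 1, 2]

print(late_results, early_results)
```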


🐛 Bug Fixes

+ Update `cssplits_handler._get_index_of_large_deletions`: Modified to split large deletions when a match of 10 or more bases is found within the identified large deletion. Issue: 42 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/0c97a9b5fb8cad2ebdaf91b796eed3ce80f5eeee)]
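The splitting rule described above can be sketched as follows. This is a hypothetical helper, not the real `_get_index_of_large_deletions`: it assumes each position is a tag starting with `-` for a deleted base and `=` for a matched base, and only the 10-base threshold is taken from the note.

```python
from typing import List, Tuple


def split_deletion_ranges(tags: List[str], min_match_run: int = 10) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of deletions.

    A range is split wherever the gap between two deleted positions is a
    run of at least `min_match_run` matched bases; shorter gaps stay
    inside one range.
    """
    deleted = [i for i, tag in enumerate(tags) if tag.startswith("-")]
    if not deleted:
        return []
    ranges = []
    start = previous = deleted[0]
    for index in deleted[1:]:
        gap = tags[previous + 1:index]
        if len(gap) >= min_match_run and all(tag.startswith("=") for tag in gap):
            ranges.append((start, previous))  # close the range before the match run
            start = index
        previous = index
    ranges.append((start, previous))
    return ranges


# Ten matched bases inside the deletion split it into two ranges.
example = ["=A"] * 2 + ["-A"] * 5 + ["=A"] * 10 + ["-A"] * 5 + ["=A"] * 2
print(split_deletion_ranges(example))  # [(2, 6), (17, 21)]
```

With only nine matched bases in the middle, the same input would remain a single range under this sketch.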

0.5.1

🚀 New Features

+ Accept additional file formats as input. Issue: 37
  + FASTA [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/ee6d392cd51649c928bd604acafbab4b9d28feb1)]
  + BAM [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/1f3a9812756f0a2607ece3551740e4c67955324c)]

📝 Documentation

+ Add a description of the procedure for accepting files generated by Dorado basecaller as input. Issue: 37 [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/c9ebc020fa60980ba7aaaf9295975775ec07da6d)]


🔧 Maintenance

+ Specify the Python version to be between 3.8 and 3.10. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/5fae947eff7da0f7e1ed5e4ff3f95c911fd9f646)]

+ Change `mutation_exporter.report_mutations` to return `list[list[str]]`. Update the tests accordingly. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/7153cb143d621e136ca94bfe6b391f1d7b61d438)]

+ Apply formatting with Ruff [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/aec9b697863ef06b4e86e248bebde6616f4eb54e)]

🐛 Bug Fixes

+ Add `reallocate_insertion_within_deletion` to `report.mutation_exporter` and reflect its result in the mutation info. [[Commit Detail](https://github.com/akikuno/DAJIN2/commit/ed6a96e01bb40c77df9cd3a17a4c29524684b6f1)]


© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.