Hybracter

Latest version: v0.11.2


Page 2 of 4

0.9.0

**`--auto` for automatic estimation of chromosome size**

* Thanks to an [issue](https://github.com/gbouras13/hybracter/issues/90) and code from [richardstoeckl](https://github.com/richardstoeckl), Hybracter can now estimate the chromosome size for each sample when you pass `--auto`.
* The implementation uses [kmc](https://github.com/refresh-bio/KMC). Specifically, Hybracter uses kmc to count the number of unique 21-mers that appear at least 10 times in your long-read FASTQ file. For an assembly of length L and a k-mer size of k, the total number of unique possible k-mers is (L - k) + 1, so if L >> k, this count suffices as an estimate of total assembly size.
* The estimated chromosome size used by Hybracter is actually 80% of the number of 21-mers found at least 10 times, to leave room for plasmids.
* If you aren't sure whether you have enough data for assembly (i.e. coverage may be lower than 20x), be careful using `--auto`: at low coverage, the actual assembly size will tend to be larger than the number of unique 21-mers found at least 10 times. The estimated chromosome size will then almost certainly be an underestimate, which may lead Hybracter to consider your assembly "complete" when in fact it isn't.

* If you use `--auto`, you do not need to specify the chromosome length in the input. This means you don't need to pass `-c` with `long-single` or `hybrid-single`, and the input CSV sample sheet does not need a chromosome length column.

e.g. for `hybracter long` you only need 2 columns with sample name and long-read FASTQ file path:

```csv
s_aureus_sample1,sample1_long_read.fastq.gz
p_aeruginosa_sample2,sample2_long_read.fastq.gz
```


and for `hybracter hybrid` you only need 4 columns with sample name, long-read FASTQ, and R1 and R2 short-read FASTQ file paths:

```csv
s_aureus_sample1,sample1_long_read.fastq.gz,sample1_SR_R1.fastq.gz,sample1_SR_R2.fastq.gz
p_aeruginosa_sample2,sample2_long_read.fastq.gz,sample2_SR_R1.fastq.gz,sample2_SR_R2.fastq.gz
```
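
The k-mer reasoning behind `--auto` can be illustrated with a minimal pure-Python sketch. This is not Hybracter's code (Hybracter delegates the counting to kmc); the function names `canonical` and `estimate_chromosome_size` are hypothetical, and counting canonical k-mers (a k-mer and its reverse complement treated as one) is an assumption made here so that reads from both strands are not counted twice.

```python
from collections import Counter

# Translation table used to build reverse complements.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def estimate_chromosome_size(reads, k=21, min_count=10, chromosome_fraction=0.8):
    """Estimate chromosome size from k-mers seen at least `min_count` times.

    Illustrative only: the defaults mirror the values in the release notes
    (21-mers, at least 10 occurrences, 80% attributed to the chromosome).
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # A read of length L contributes (L - k) + 1 k-mers.
        for i in range(len(seq) - k + 1):
            counts[canonical(seq[i:i + k])] += 1
    # Distinct k-mers seen often enough to be trusted as genomic.
    solid_kmers = sum(1 for c in counts.values() if c >= min_count)
    return round(solid_kmers * chromosome_fraction)
```

With too few reads, many genuine k-mers never reach the `min_count` threshold, which is why the estimate falls short at low coverage.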


**Other changes**

* Hybracter v0.9.0 automatically supports the reorientation of archaeal chromosomes to begin with the COG1474 Orc1/cdc6 gene (thanks [richardstoeckl](https://github.com/richardstoeckl)).
* `--datadir` can now also accept 2 paths separated by a comma, if you have long reads and short reads in separate directories e.g. `--datadir "long_read_dir,short_read_dir"` (https://github.com/gbouras13/hybracter/issues/76).
* `--min_depth` parameter added. Hybracter will error out if your QC'd long reads have a coverage lower than `min_depth` for a sample (https://github.com/gbouras13/hybracter/issues/89).
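
As a rough illustration of the two options above, the sketch below splits a `--datadir` value into long-read and short-read directories and applies a minimum-depth check. Both function names are hypothetical, and computing coverage as total read bases divided by chromosome size is an assumption; Hybracter's own calculation may differ.

```python
def split_datadir(datadir):
    """Split a --datadir value into (long_read_dir, short_read_dir).

    A single path is used for both read types; two comma-separated
    paths are taken as long reads first, then short reads.
    """
    parts = [p.strip() for p in datadir.split(",")]
    if len(parts) == 1:
        return parts[0], parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError("expected one path or two comma-separated paths")


def check_min_depth(read_lengths, chromosome_size, min_depth):
    """Return estimated coverage, or raise if it is below `min_depth`.

    Assumption: coverage = total bases in the QC'd long reads / chromosome size.
    """
    if chromosome_size <= 0:
        raise ValueError("chromosome_size must be positive")
    coverage = sum(read_lengths) / chromosome_size
    if coverage < min_depth:
        raise ValueError(
            f"estimated coverage {coverage:.1f}x is below min_depth {min_depth}x"
        )
    return coverage
```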

0.8.0

* Add `--datadir` that removes the need to add full paths in sample sheet (thanks oschwengers)
* Update medaka to v1.12.1 to support the newest models (#84)
* New default medaka model is `r1041_e82_400bps_sup_v5.0.0`
* Adds `--mac` flag if you are running Hybracter on macOS. It is now recommended to run Hybracter on Linux if you want the latest Medaka models.
* This is because ONT no longer supports the bioconda install, and the latest version (v1.12.1) from pip doesn't work on Mac
* `--mac` will install and run Medaka v1.8.0 as in previous versions and use `r1041_e82_400bps_sup_v4.2.0` as default

0.7.3

* Enforce spades>=v3.15.2 in the `plassembler.yaml` environment
* For some reason, the environment on Linux was being solved for SPAdes v3.14.1, which caused an error with Unicycler within Plassembler for some samples, as described in https://github.com/rrwick/Unicycler/issues/318

0.7.2

* Adds 'circular=True' to chromosome contig headers where Flye has marked the contigs as circular. This fixes a bug introduced in v0.7.0.
* Thanks Nicole Lerminiaux for spotting this

0.7.1

* Fixes bug where `hybracter install -d db_dir` would not work as the `-f` parameter was not being passed to Plassembler. Thanks npbhavya

0.7.0

**Bug fixes**

* Fixes bug where `--configfile` wasn't being passed to Hybracter.
* Fixes bug where `hybracter` would crash if the input long reads were not gzipped (#51), thanks wanyuac.

**Changes to short read polishing.**

* Logic added to run `polypolish` v0.6.0 with `--careful` and skip pypolca if the SR coverage estimate is below 5x (note: FASTA files for pypolca will be generated in the processing directory to play nice with Snakemake, but these will be identical to the polypolish output).
* For 5-25x coverage, `polypolish --careful` and `pypolca` with `--careful` will be run.
* For >25x coverage, `polypolish` default and `pypolca` with `--careful` will be run.
* A preprint justifying these changes will be available soon.
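
The coverage thresholds above amount to a small decision rule, sketched below. The function name is hypothetical and the returned strings are descriptive labels, not exact command lines. Treating exactly 5x and exactly 25x as part of the middle band is an assumption, since the notes only say "below 5x", "5-25x" and ">25x".

```python
def short_read_polishing_plan(sr_coverage):
    """Return the polishing steps to run for a given short-read coverage estimate."""
    if sr_coverage < 5:
        # Very low coverage: polypolish in careful mode only, pypolca skipped.
        return ["polypolish --careful"]
    if sr_coverage <= 25:
        # Intermediate coverage: both polishers in careful mode.
        return ["polypolish --careful", "pypolca --careful"]
    # High coverage: polypolish defaults, pypolca still in careful mode.
    return ["polypolish", "pypolca --careful"]
```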

**`--logic` changes**
* `--logic` now defaults to `last` for `hybracter hybrid`, as we have found that the polishing strategy implemented above never makes the assembly worse. We suggest never using `--logic best` with `hybracter hybrid`.

**Changes for chromosome contigs and circularity.**
* If hybracter assembles a contig that is longer than the minimum chromosome length but not marked as circular by Flye, it will now be denoted as a chromosome, but not circular. The genome will also be marked as complete.
* These will usually be assemblies with some issue (e.g. prophages, circularisation issues, heterogeneity) and probably require some more attention.
* For example, with the _Vibrio cholerae_ larger chromosome described [here](https://rrwick.github.io/2024/02/15/misassemblies.html), the genome will be marked as 'complete' but the contig will not be marked as 'circular' in the `hybracter` output.
* Such contigs will be polished and be in the final `_chromosome.fasta` output, but they will not be rotated by `dnaapler`.
* Such contigs were previously excluded, which meant missing assemblies with structural heterogeneity (causing the chromosome not to completely circularise) or even bacteria with linear chromosomes like [_Borrelia_](https://www.nature.com/articles/37551).
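
The rules in this section can be summarised in a short sketch. The `Contig` class, the `summarise_assembly` function and the output keys are all hypothetical names chosen for illustration; only the behaviour (length decides chromosome status, Flye's circular flag decides rotation) comes from the notes above.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length: int
    circular: bool  # circularity as reported by Flye


def summarise_assembly(contigs, min_chromosome_length):
    """Classify contigs following the v0.7.0 chromosome and circularity rules."""
    # Any contig longer than the minimum length counts as a chromosome,
    # whether or not Flye marked it as circular.
    chromosomes = [c for c in contigs if c.length > min_chromosome_length]
    return {
        "complete": bool(chromosomes),
        "chromosomes": [c.name for c in chromosomes],
        # Only circular chromosomes are rotated by dnaapler.
        "rotated": [c.name for c in chromosomes if c.circular],
    }
```

A linear chromosome-sized contig therefore yields a "complete" genome with an empty rotation list, which is a cue to inspect the assembly more closely.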

**Adds `--depth_filter`**
* This is passed to [Plassembler](https://github.com/gbouras13/plassembler) and will filter out all putative plasmid contigs that are lower than this depth fraction compared to the chromosome.
* Defaults to 0.25 like Unicycler's implementation.
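
A minimal sketch of this kind of depth-fraction filter is shown below. It is not Plassembler's implementation; the function name and the dictionary input are hypothetical, and keeping contigs whose depth equals the threshold exactly is an assumption.

```python
def filter_plasmid_contigs(plasmid_depths, chromosome_depth, depth_filter=0.25):
    """Keep putative plasmid contigs at or above a fraction of chromosome depth.

    plasmid_depths maps contig name -> mean read depth.
    """
    threshold = depth_filter * chromosome_depth
    return {
        name: depth
        for name, depth in plasmid_depths.items()
        if depth >= threshold
    }
```

For example, with a chromosome depth of 40x and the default fraction of 0.25, contigs below 10x would be dropped.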
