Seismic-rna

Latest version: v0.24.2

Page 1 of 13

0.24.2

Bug Fixes

- Fixed serious bug where reads with a zero-length segment would erroneously have their 5' ends set to 1. This affected reads after the `mask` step in which one mate overlapped the selected region and the other mate did not overlap the region, as well as unspliced reads within a batch containing at least one spliced read. This bug was introduced in v0.24.0 when the way that segments are masked out was changed to be compatible with NumPy v2.1, and was not caught by the unit tests. The unit tests have since been updated to catch this bug.
- Fixed bug where `draw` would include the `fold` branch when searching for table files.

Performance

- Calculating tables now iterates over only positions that have ≥1 read, which speeds up this step for long, sparsely covered reference sequences.
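
As a rough illustration of why skipping uncovered positions helps, the sketch below visits only positions with at least one read. The function name and the per-position count lists are hypothetical and are not taken from the SEISMIC-RNA code base.

```python
def covered_positions(coverage):
    """Return the 0-based indices of positions covered by at least one read."""
    return [pos for pos, n_reads in enumerate(coverage) if n_reads >= 1]


# Hypothetical per-position read and mutation counts for a sparse reference.
coverage = [0, 0, 3, 1, 0, 0, 0, 2, 0]
mutations = [0, 0, 1, 0, 0, 0, 0, 2, 0]

# Only covered positions are visited, so no division by zero can occur
# and the loop length scales with coverage rather than reference length.
positions = covered_positions(coverage)
rates = {pos: mutations[pos] / coverage[pos] for pos in positions}
```

For a long reference where most positions have no reads, the loop body runs once per covered position instead of once per nucleotide.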

**Full Changelog**: https://github.com/rouskinlab/seismic-rna/compare/v0.24.1...v0.24.2

0.24.1

New Features

- New command `seismic sim abstract` calculates the parameters for simulation from table and/or SEISMICgraph files.
- In `dispatch`, `args` can now be any iterable (rather than only a `list`), the function can now optionally return an iterator rather than always returning a `list`, and the elements can optionally be returned as soon as they are available rather than all at once.
- To improve efficiency, the clustering algorithm now runs EM fewer times initially (`--min-em-iter`), then checks if the runs are sufficiently similar, and only if they are not will it run additional trials of EM, up to `--max-em-iter`.
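
The `dispatch` change above (any iterable of arguments, results returned as an iterator, optionally as soon as each is ready) can be pictured with the standard library. This is only a sketch of the general pattern: `dispatch_sketch` and its parameters `num_workers` and `ordered` are hypothetical names, not the actual signature of `dispatch`.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def dispatch_sketch(func, args, num_workers=2, ordered=True):
    """Apply func to each item of any iterable and yield the results.

    Hypothetical stand-in, not the SEISMIC-RNA implementation.
    With ordered=False, results are yielded as soon as they finish.
    """
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        # The iterable is consumed once here; it need not be a list.
        futures = [pool.submit(func, arg) for arg in args]
        pending = futures if ordered else as_completed(futures)
        for future in pending:
            yield future.result()


def square(x):
    return x * x


# A caller that still wants a list can wrap the iterator.
in_order = list(dispatch_sketch(square, range(4)))
as_ready = sorted(dispatch_sketch(square, iter([3, 1, 2]), ordered=False))
```

Returning an iterator lets the caller start processing early results while later tasks are still running.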

Updated Parameters

- `--min-mut-gap` now defaults to 4 (rather than 3) because new evidence suggests that the observer bias extends 1 nt further than we thought.
- C99 is now the default standard in `meson.build` to improve portability.
- The parameter `max_procs` has been renamed to `num_cpus` and merged with `n_procs`.

Bug Fixes

- Fix bug causing `seismic ensembles` to crash due to missing argument `count_mut` in `ensembles.py`.
- Fix bug causing `seismic graph` to crash if using the option `--struct-file`.
- Catch `ZeroDivisionError` and return NaN in `unbias.py`.
- Fix bug where `seismic fold` would still delete temporary files with the option `--keep-tmp`.
- Fix bug causing `seismic graph mutdist` to crash if there were no reads of non-zero length.
- Fix bug where `seismic splitbam` required input BAM files to be in a directory called `align`.
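
The `ZeroDivisionError` item above follows a common pattern, shown here as a minimal sketch. The helper `safe_ratio` is a hypothetical name and is not the code in `unbias.py`.

```python
import math


def safe_ratio(numerator, denominator):
    """Return numerator / denominator, or NaN if the denominator is zero."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        # Propagate "undefined" as NaN rather than crashing the run.
        return math.nan
```

Callers can then test the result with `math.isnan` instead of wrapping every division in their own error handling.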

**Full Changelog**: https://github.com/rouskinlab/seismic-rna/compare/v0.24.0...v0.24.1

0.24.0

New Features

- Branches: `align`, `splitbam`, `relate`, `list`, `mask`, `cluster`, `fold`, and `sim relate` now accept the option `--branch/-b`, which creates a new branch of the workflow. This enables you to run the same step multiple times on the same inputs with different parameters, and keep and compare the results of different parameters. The branch name is appended to the step; e.g. running `cluster -b k3` will write its outputs into a directory called `cluster_k3` rather than `cluster`. Branches are cumulative so that you can track them through the whole workflow; e.g. if you run `align -b mapq10` (which writes to `align_mapq10`) and then input the results of align into `relate -b noambindel`, the results of relate will be in `relate_mapq10_noambindel`.
- Introns: `relate` can now handle reads with introns (`N` in the SAM/BAM file CIGAR string). This is useful for when you generate SAM/BAM files using a splice-aware aligner, rather than the default aligner (Bowtie 2) that SEISMIC-RNA uses (which does not detect introns). Handling introns (if any exist) in `relate` is always active; it is not enabled/disabled with a flag.
- `align` and `splitbam` now write the SHA-512 checksum of their FASTA and FASTQ files into the report to make it easier to keep track of which input files were used.
- `splitbam` now accepts directories in addition to SAM/BAM files, ensures no files have the same sample name, and writes reports.
- `draw` now supports SVG and PNG formats.
- `draw` can now install RNArtistCore using `jgo`.
- `mask` now accepts `--count-mut` option to make it easier to count a single type of mutation.
- `mask` now accepts multiple files (and directories) for `--mask-pos-file` and `--mask-read-file`.
- `migrate` now enables converting from v0.23 to v0.24 format, and it copies the files before reformatting them to prevent data loss if any errors occur.
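
The cumulative branch naming described above can be mimicked in a few lines. The function `branch_dir` is hypothetical and only reproduces the naming examples given in these notes; it says nothing about how SEISMIC-RNA builds paths internally.

```python
def branch_dir(step, branches):
    """Join a step name with its cumulative branch names using underscores.

    Hypothetical helper; empty branch names are skipped.
    """
    return "_".join([step, *(name for name in branches if name)])


# Examples taken from the release notes above.
print(branch_dir("cluster", ["k3"]))
print(branch_dir("align", ["mapq10"]))
print(branch_dir("relate", ["mapq10", "noambindel"]))
```

With no branches, the sketch returns the bare step name, matching the default directory names such as `cluster`.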

Bug Fixes

- Fixed bug causing `align` to crash if there are 0 reads.
- Fixed bug where `cached_property` objects (e.g. `RelateBatch.read_nums`) would be stored inside batch files, increasing their size.

Format Changes

- Each brickle file now stores a `dict` of instance attributes rather than an instance itself, so that class names and module structures can be modified between versions without causing backwards incompatibilities.
- File checksums now use SHA-512 instead of MD5 for better security.
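
To verify an input file against a SHA-512 checksum recorded in a report, a digest can be computed with Python's `hashlib`. This is a generic sketch: the helper name `sha512_checksum` and the chunk size are assumptions, and the exact format in which SEISMIC-RNA stores the checksum is not shown here.

```python
import hashlib


def sha512_checksum(path, chunk_size=1 << 20):
    """Return the hexadecimal SHA-512 digest of the file at path."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read in chunks so that large FASTA/FASTQ files fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

A SHA-512 hex digest is 128 characters long, compared with 32 characters for MD5.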

API Changes

- New `interface.py` module makes it easier to load datasets and tables.
- Run functions and path module functions now accept both `str` and `Path` instances.
- Path building/parsing functions now accept `tuple`/`list` and `dict` arguments instead of `*args` and `**kwargs` arguments, respectively.
- Classes where each instance has a file path and attributes that correspond to fields in the path now all inherit from `path.HasFilePath`; namely `Report`, `BrickleIO`, `List`, and `Table`. This refactoring consolidates several path handling methods that had been implemented multiple times for the different classes.
- Reads are now permitted to have 0 segments.
- NumPy `MaskedArray` objects are no longer used due to bugs when arrays are unsigned and fill values are signed integers.
- NumPy 2.0 and 2.1 are now supported.
- Python 3.10 is no longer supported (only 3.11 and 3.12).

What's Changed
* Relate introns by justinaruda in https://github.com/rouskinlab/seismic-rna/pull/21
* Merge 0.24.0 by matthewfallan in https://github.com/rouskinlab/seismic-rna/pull/22

**Full Changelog**: https://github.com/rouskinlab/seismic-rna/compare/v0.23.1...v0.24.0

0.23.1

Not secure

New Features

- New global command line option `--exit-on-error` makes SEISMIC-RNA exit immediately with code 1 if any uncaught exceptions occur rather than logging a message and exiting with code 0. It is most useful for automating the test suite, i.e. `seismic --exit-on-error test`, which will cause the test suite to exit with code 1 if any test fails (`seismic test` will still exit with code 0 even if tests fail, making it harder to detect failure automatically).
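
In an automated pipeline, the exit code can be checked as sketched below. The helper `command_succeeded` is a hypothetical name; the `seismic --exit-on-error test` invocation is the one quoted above and requires SEISMIC-RNA to be installed, so it is left commented out.

```python
import subprocess
import sys


def command_succeeded(command):
    """Run a command and report whether it exited with code 0."""
    return subprocess.run(command).returncode == 0


# Intended use, assuming the `seismic` executable is on the PATH:
# if not command_succeeded(["seismic", "--exit-on-error", "test"]):
#     sys.exit(1)

# Self-contained demonstration using the Python interpreter itself.
passed = command_succeeded([sys.executable, "-c", "import sys; sys.exit(0)"])
failed = command_succeeded([sys.executable, "-c", "import sys; sys.exit(1)"])
```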

Bug Fixes

- Fixed the wrong default value of `--max-marcd-join`; it was 1.5 and is now 0.0175. A new unit test makes sure it has the same default value as `--min-marcd-run`.
- The global command line option `--profile` has been removed, since it had been nonfunctional for a long time.

**Full Changelog**: https://github.com/rouskinlab/seismic-rna/compare/v0.23.0...v0.23.1

0.23.0

Not secure

New Features

- `seismic ensembles` is a new command that performs scanning clustering, similar to DRACO (https://doi.org/10.1038/s41592-021-01075-w).
  - A region of choice (default: full) is divided into overlapping subregions of identical lengths. By default, the length is chosen so that the average read has 2 mutations (the minimum for clustering).
  - Each subregion is clustered to determine how many structures it forms.
  - Consecutive subregions that form the same number of clusters, with similar mutation rates, are joined automatically.
  - Each joined region suggests the existence of an RNA module that folds into one or more structures independently of the surrounding sequences.
- `seismic join` can now automatically determine the best way to match clusters from multiple regions, without needing to provide a `--join-clusts` file. The regions must still have the same numbers of clusters (or none, if coming from the Mask step).
- Three new types of graph have been added (and are also now available through `seismic wf`):
  - `seismic graph abundance`: Abundance of each cluster, either as a fraction of the ensemble or as a number of reads in the cluster.
  - `seismic graph poscorr`: Phi correlation of the mutations between every pair of positions.
  - `seismic graph mutdist`: Histogram of the distance between the two closest mutations in each read (or 0 for each read with fewer than two mutations).
- Both `pool` and `join` now tabulate the pooled/joined datasets automatically (with the option to turn this off).
- Five new commands provide easy access to online resources:
  - `seismic biorxiv`: the preprint on bioRxiv
  - `seismic docs`: the documentation on GitHub pages
  - `seismic github`: this GitHub repository
  - `seismic pypi`: the Python Package Index page for SEISMIC-RNA
  - `seismic conda`: the Conda page for SEISMIC-RNA
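
The quantity plotted by `seismic graph mutdist` (the distance between the two closest mutations in each read, or 0 for reads with fewer than two mutations) can be computed as in the sketch below. The function name and the example reads are hypothetical, and the real command works on SEISMIC-RNA datasets rather than plain lists.

```python
from collections import Counter


def min_mutation_distance(positions):
    """Return the smallest gap between any two mutated positions in a read.

    Reads with fewer than two mutations give 0, as described above.
    """
    ordered = sorted(positions)
    if len(ordered) < 2:
        return 0
    return min(b - a for a, b in zip(ordered, ordered[1:]))


# Hypothetical reads, each given as a list of mutated positions.
reads = [[3, 10, 20], [5], [1, 2, 9], []]
histogram = Counter(min_mutation_distance(read) for read in reads)
```

Sorting first means only adjacent positions need to be compared, since the closest pair is always adjacent in sorted order.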

Bug Fixes

- In `seismic relate`, the algorithm that finds ambiguous indels has been redesigned to make it non-recursive, so that it will no longer take extreme amounts of time to process reads with indels in long stretches of low-quality base calls.
- Fixed a bug in `seismic align` and `seismic relate` where processing multiple FASTQ/BAM files with the same sample and reference names could cause crashes or files to be overwritten. Now, if this situation is detected at the beginning, an error is raised to protect the data.
- Fixed a bug in `seismic mask` and `seismic cluster` where processing datasets in multiple output directories with the same sample, reference, and region names could cause crashes or files to be overwritten.
- Fixed a bug where running `seismic mask` with multiple regions of the same Relate dataset simultaneously would cause all but one of those regions to crash.
- Fixed a bug in `seismic mask` where if 0 positions remained at the end of one iteration, it would fail to mask out the remaining reads.
- In `seismic mask`, `-s` has been renamed to `-i`.
- In `seismic join` for clustered datasets, if the joined mask report already exists, then it now checks that the joined regions in the mask report match the regions that will be joined in the cluster report (and raises an error if they do not match).
- When correcting observer bias, the algorithm now issues a warning for `ValueError: Jacobian inversion yielded zero vector` instead of crashing.
- In `seismic cluster`, errors that occur when calculating the jackpot quotient and creating the graphs now also trigger warning messages rather than crashes.
- Graphs that take two tables (`corroll`, `delprof`, and `scatter`) now always sort the names of their two samples alphabetically, so that they don't generate multiple sample directories for the same graphs depending on the order of their arguments.
- Replaced `static const char` variables with macros for backwards compatibility with older versions of C.

Logging

- There are now eight levels, including a new level ACTION (for writes to the filesystem and shell commands).
- The verbosity arguments now range from `-vvvv` (log everything to console) to `-qqqq` (log nothing to console).
- Many of the logging messages have been made more concise for easier comprehension.
- SEVERE has been renamed to FATAL.
- COMMAND has been renamed to STATUS.

Other Changes

- To speed up `seismic relate`, batches of read names and the table of relationships per read are no longer written by default, since they are rarely needed but writing them takes a relatively long time.
- The `meson.build` file now includes compiler flags `-O2` and `-DNDEBUG` to make the machine code for `seismic relate` more efficient.
- The `pool` and `join` commands now require specifying the name of the pooled sample and joined region, respectively, to avoid confusion over what the new sample/region is named.
- When calculating the BIC, the threshold for the reads-to-parameters ratio has been relaxed to cause fewer warning messages.
- In the API, run functions now accept generic `Iterable[str | Path]` arguments where they previously expected only `tuple[str, ...]` arguments.
- Unit tests now test all types of graphs, and run in double verbose mode on the command line.

**Full Changelog**: https://github.com/rouskinlab/seismic-rna/compare/v0.22.3...v0.23.0

0.22.3

Not secure

PyPI

- Wheels for macOS
- Source distribution

**Full Changelog**: https://github.com/rouskinlab/seismic-rna/compare/v0.22.2...v0.22.3
