Ms2pip

Latest version: v4.0.0

3.11.0

Added
- `fasta2speclib`: Improved workflow for generating spectral libraries starting from a FASTA file, with new configuration options. ⚠️ These changes break compatibility with the previous configuration files. ⚠️ (PR 193, fixes 188)
- Support for C-terminal modifications
- Differentiate between peptide and protein termini for variable modifications
- Allow filtering of peptides based on precursor m/z
- Allow semi-specific cleavage
- Allow non-specific cleavage
- Allow setting of a maximum of variable modifications per peptide
- Add tests for modification assignment
- Add figures for 2023 manuscript (PR 194)
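
As a rough illustration of what semi-specific cleavage means, the sketch below first performs a fully specific tryptic digest and then enumerates peptides that keep at least one enzymatic terminus. The function names, the trypsin rule, and the `min_length` parameter are assumptions made for this example; they are not taken from the `fasta2speclib` code, and missed cleavages are ignored for brevity.

```python
import re
from typing import List, Set


def tryptic_digest(protein: str) -> List[str]:
    """Fully specific digest: cleave after K or R, but not before P."""
    # Zero-width split points; requires Python 3.7+.
    return [p for p in re.split(r"(?<=[KR])(?!P)", protein) if p]


def semi_specific(peptide: str, min_length: int = 6) -> Set[str]:
    """All sub-peptides that share at least one terminus with `peptide`."""
    # Keep the N-terminus fixed and shorten from the C-terminal side.
    prefixes = {peptide[:i] for i in range(min_length, len(peptide) + 1)}
    # Keep the C-terminus fixed and shorten from the N-terminal side.
    suffixes = {peptide[i:] for i in range(0, len(peptide) - min_length + 1)}
    return prefixes | suffixes


if __name__ == "__main__":
    for peptide in tryptic_digest("AKRPCKPEPTIDEK"):
        print(peptide, sorted(semi_specific(peptide, min_length=6)))
```

Non-specific cleavage would drop the terminus constraint entirely and enumerate every substring within the length limits, which is why it produces a much larger search space.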

Changed
- Change logging of model configuration to debug level (PR 193)

Removed
- `fasta2speclib`: Removed support for Elude-based RT predictions, RT predictions file, PEPREC filter, saving temporary PEPREC files (PR 193)

Fixed
- Remove unsupported argument for `mzml.read` (PR 193)
- `spectrum_output`: Fix CSV output to always use `\n` line terminators (PR 193)
- `spectrum_output`: Use semi-colon for spectronaut CSV output (PR 193)
- DeepLC integration: Disable PyGAM for default calibration on iRT peptides (led to poor calibration) (PR 193)
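
The two CSV fixes above can be illustrated with Python's standard `csv` module, whose writer defaults to `\r\n` line terminators. This is a minimal sketch, not the `spectrum_output` implementation; the column names and the `write_csv` helper are hypothetical.

```python
import csv
import io

# Hypothetical rows; the real output columns may differ.
ROWS = [
    ("spec_id", "charge", "ion", "mz", "prediction"),
    ("spec1", 2, "b1", 98.06, 0.12),
]


def write_csv(rows, delimiter=","):
    """Serialize rows with an explicit '\\n' line terminator."""
    buffer = io.StringIO(newline="")
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(repr(write_csv(ROWS)))                 # comma-separated
    print(repr(write_csv(ROWS, delimiter=";")))  # semicolon-separated
```

When writing to a real file instead of a `StringIO` buffer, opening it with `newline=""` keeps the platform from translating the terminator again.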

3.10.1

Added
- If the precursor charge is not found in the MGF file, the charge from the PeptideRecord file is used instead. (189)
- Added tests for fasta2speclib modification generation. (190)

Fixed
- Fixed issue in fasta2speclib where fixed modifications were added one residue to the left of the actual site. This bug was introduced in commit https://github.com/compomics/ms2pip_c/commit/6f41d4404765ea0f7a0622571c97dd22be6e5b62 and released in v3.10.0. (#190)

3.10.0

Added
- Added support for mzML spectrum files (both for evaluating models and for extracting feature vectors).
- New argument `spectrum_id_pattern`: Regular expression pattern to apply to spectrum titles before matching to peptide file entries.
- When using MS²PIP as a class instance, the resulting `pred_and_emp` dataframe can also be returned (instead of being written to a file) by setting `return_results` to `True`.
- If requested, retention time prediction with DeepLC is now also enabled if a spectrum file is given. This feature was previously only enabled if only a peptide file was given.
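
To make the `spectrum_id_pattern` argument more concrete, the sketch below shows how a regular expression could reduce a verbose spectrum title to the identifier used in the peptide file. It assumes that the first capture group is taken as the identifier; that convention, the `extract_spectrum_id` helper, and the example title are assumptions for illustration and should be checked against the MS²PIP documentation.

```python
import re


def extract_spectrum_id(title: str, pattern: str) -> str:
    """Return the first capture group of `pattern` found in `title`."""
    match = re.search(pattern, title)
    if match is None:
        raise ValueError(f"Pattern {pattern!r} does not match title {title!r}")
    return match.group(1)


if __name__ == "__main__":
    # Hypothetical title as it might appear in an MGF or mzML file.
    title = "controllerType=0 controllerNumber=1 scan=2345"
    print(extract_spectrum_id(title, r"scan=(\d+)"))
```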

Changed
- Improved logging: Use Rich library for logging, show time stamps and message log levels.
- MS²PIP now shows a progress bar instead of a wall of text to display prediction progress.
- `fasta2speclib`: Improved algorithm for variable modification assignment. Combinatorial explosion from variable modifications is now reduced by setting a maximum of modified residues per peptide, instead of arbitrarily selecting a maximum of potentially modified sites per peptide.
- Update README.md (Switch from BadGen to Shields.io).
- Switch to Pyteomics MGF reader.
- Avoid SciPy dependency.
- More efficient use of NumPy in `calc_correlations`.
- Remove `poetry.lock` (not used, avoid unneeded Dependabot PRs).
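
The idea behind the revised variable modification assignment in `fasta2speclib` can be sketched as follows: every candidate site stays eligible, and only the number of simultaneously modified residues is capped. This is an illustrative sketch, not the actual implementation; the `modified_site_sets` function and its parameters are hypothetical.

```python
from itertools import combinations
from typing import List, Tuple


def modified_site_sets(
    peptide: str, residue: str, max_mods: int
) -> List[Tuple[int, ...]]:
    """Enumerate sets of 0-based positions carrying the modification.

    All occurrences of `residue` are candidate sites; at most `max_mods`
    of them are modified in any single peptide form.
    """
    sites = [i for i, aa in enumerate(peptide) if aa == residue]
    forms = []
    for n_mods in range(min(max_mods, len(sites)) + 1):
        forms.extend(combinations(sites, n_mods))
    return forms


if __name__ == "__main__":
    # Three methionines: 8 forms without a cap, fewer with max_mods=2.
    for form in modified_site_sets("MAMMK", "M", max_mods=2):
        print(form)
```

With `k` candidate sites, the uncapped number of forms is `2**k`; capping the number of modified residues keeps the count manageable without excluding any individual site from consideration.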

Fixed
- Vastly improved computational speed and reduced memory usage when using XGBoost model files for prediction in combination with providing a spectrum file (XGB prediction step is now moved out of multiprocessing).
- For optimal performance, feature vectors for predictions from XGBoost model files now also use the traditional `ms2pipC.py` multiprocessing system.
- `fasta2speclib`: Fixed issue where modified versions of peptide were duplicated.
- `spectrum_output`: Various fixes in MSP spectral library file writing for DIA-NN compatibility: Write m/z error of 0.0 for each predicted peak in peak annotation string, ensure modifications in MSP `Mods` field are sorted by position, use `RetentionTime` instead of `RTINSECONDS` in comments field.
- Fixed double spectrum_utils entry in requirements.
- Updated `python_requires` to a minimum of 3.7, following the previously updated test grid.
- Fix spectrum_utils modification off-by-one bug (had no consequences except for plot annotations).
- Fixes 170
- Fix typo in `write_amino_acid_masses` function name.
- Fix missing comma in the setup.py.

Removed
- Removed unsupported Tableau output file option

3.9.0

New and improved 🚀
- New prediction model for CID-TMT: TMT-labelled peptide spectra acquired on ion trap (trap-type CID), often used for "MultiNotch MS3" (https://dx.doi.org/10.1021/ac502040v) (PR #157)
- Support for Python 3.9 and 3.10; dropped support for end-of-life Python 3.6 (PR 156, fixes 126)
- Support for alternative cleavage rules (digestion enzymes) in `fasta2speclib` (PR 166, fixes 96)


Bugfixes 🐛
- Fixed missing support for XGBoost models in single-prediction mode (PR 157, fixes 155)
- Use oldest-supported-numpy for build in CI testing (PR 157)


Refactoring and minor changes 🔧
- Replaced C models files with their XGBoost counterpart (except for HCD2019 and TMT): Faster compilation, smaller Python package (PR 157)
- Add `model_dir` option to set custom directory for model downloads (CLI, single-prediction CLI, Python API) (PR 169, fixes 165)
- Add docstring to `MS2PIP` class and add example to `README.md` (PR 167, fixes 131)
- Relaxed click version requirements (PR 157, fixes 158)
- Removed XGBoost warnings from the CLI output (PR 157)
- Various fasta2speclib improvements (PR 166)
- Add deeplc option to default config
- Suppress tensorflow warnings
- Replace deprecated pandas append with concat
- Add missing `sptm` and `gptm` to example config.toml (167)


New prediction models

| Model | Current version | Train-test dataset (unique peptides) | Evaluation dataset (unique peptides) | Median Pearson correlation on evaluation dataset |
| - | - | - | - | - |
| CID-TMT | v20220104 | [in-house dataset] (72 138) | [PXD005890](https://doi.org/10.1021/acs.jproteome.7b00091) (69 768) | 0.851085 |
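
For readers unfamiliar with the metric in the last column, the sketch below computes a Pearson correlation per spectrum between predicted and observed intensities and then takes the median over all spectra. It assumes this per-spectrum definition; the function names and toy intensities are hypothetical, and any intensity transformation applied before the correlation is not shown.

```python
from math import sqrt
from statistics import median
from typing import Iterable, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def median_pearson(
    spectra: Iterable[Tuple[Sequence[float], Sequence[float]]]
) -> float:
    """Median of per-spectrum correlations over (predicted, observed) pairs."""
    return median(pearson(pred, obs) for pred, obs in spectra)


if __name__ == "__main__":
    toy = [
        ([0.1, 0.5, 0.9], [0.2, 0.6, 1.0]),
        ([0.3, 0.2, 0.8], [0.1, 0.4, 0.7]),
        ([0.9, 0.1, 0.4], [0.8, 0.2, 0.5]),
    ]
    print(median_pearson(toy))
```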

3.8.0

New and improved 🚀
- New models for non-tryptic peptides and immunopeptides! (PR 137)
  Check out our preprint for more info: https://doi.org/10.1101/2021.11.02.466886
- Support for Windows! Just run `pip install ms2pip` in your Windows terminal, and start predicting. (PR 151)

Bugfixes 🐛
- In DLIB output, a value is now written to the `isDecoy` column. Fixes downstream readout of protein information. (140, PR 152)

Refactoring and minor changes 🔧
- Direct use of `.xgboost` model files is now supported; no dump to C and compilation required. (PR 137)

New prediction models

| Model | Current version | Train-test dataset (unique peptides) | Evaluation dataset (unique peptides) | Median Pearson correlation on evaluation dataset |
| - | - | - | - | - |
| HCD2021 | v20210416 | [Combined dataset] (520 579) | [PXD008034](https://doi.org/10.1016/j.jprot.2017.12.006) (35 269) | 0.932361 |
| Immuno-HCD | v20210316 | [Combined dataset] (460 191) | [PXD005231 (HLA-I)](https://doi.org/10.1101/098780) (46 753) <br>[PXD020011 (HLA-II)](https://doi.org/10.3389/fimmu.2020.01981) (23 941) | 0.963736<br>0.942383 |

3.7.1

Fixed
- Pin NumPy version used during build to fix compatibility with older NumPy versions (PR 148)
