Switch to symusic
This major version marks the switch from the [miditoolkit](https://github.com/YatingMusic/miditoolkit) MIDI reading/writing library to [**symusic**](https://github.com/Yikai-Liao/symusic), and a large optimisation of the MIDI preprocessing steps.
Symusic is a MIDI reading/writing library written in C++ with Python bindings, offering unmatched speeds, [**up to 500 times faster than native Python libraries**](https://github.com/Natooz/MidiTok/issues/112#issuecomment-1895948962). It is based on [minimidi](https://github.com/lzqlzzq/minimidi). The two libraries are created and maintained by Yikai-Liao and lzqlzzq, who have done amazing work, which is still ongoing as many useful features are on the roadmap! 🫶
**Tokenizers from previous versions are compatible with this new version, but there might be some time variations if you compare how MIDIs are tokenized and tokens decoded.**
Performance boost
These changes result in much faster MIDI loading/writing and tokenization times! **The overall tokenization (loading a MIDI and tokenizing it) is** [**between 5 and 12 times faster**](https://github.com/Natooz/MidiTok/issues/112#issuecomment-1896286910) depending on the tokenizer and data. You can find other benchmarks [here](https://github.com/Natooz/MidiTok/issues/112#issuecomment-1895948962).
This huge speed gain makes it possible to drop the previously recommended step of pre-tokenizing MIDI files as JSON tokens, and to **directly tokenize the MIDIs on the fly while training/using a model**! We updated the [usage examples of the docs](https://miditok.readthedocs.io/en/latest/examples.html) accordingly; the code is now simpler.
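As a rough illustration of what tokenizing on the fly means, here is a minimal sketch of a lazy dataset that only tokenizes a file when it is requested. The `OnTheFlyDataset` class and the `dummy_tokenize` function are hypothetical stand-ins written for this example, not part of MidiTok; in real use the callable would load the MIDI and run a MidiTok tokenizer on it.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable, Sequence


class OnTheFlyDataset:
    """Hypothetical dataset: tokenizes a file when it is indexed, with no JSON cache."""

    def __init__(
        self, paths: Sequence[Path], tokenize: Callable[[Path], list[int]]
    ) -> None:
        self.paths = list(paths)
        # Assumption: `tokenize` wraps the real loading + tokenization call.
        self.tokenize = tokenize

    def __len__(self) -> int:
        return len(self.paths)

    def __getitem__(self, idx: int) -> list[int]:
        # Tokenization happens here, at access time, not in a preprocessing pass.
        return self.tokenize(self.paths[idx])


def dummy_tokenize(path: Path) -> list[int]:
    # Placeholder that derives "token ids" from the file name, so the sketch
    # runs without any MIDI file or third-party library.
    return [ord(char) for char in path.stem]


dataset = OnTheFlyDataset([Path("a.mid"), Path("bc.mid")], dummy_tokenize)
first_sample = dataset[0]
```

The design choice is simply to move the tokenization call from a one-off preprocessing script into the item accessor, which is only practical when that call is fast enough not to stall training.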
Other major changes
* When using time signatures, time tokens are now computed in ticks per beat, as opposed to ticks per quarter note as done previously. This change is in line with the definition of time and duration tokens, which was not handled following the MIDI norm for note values other than the quarter note until now (https://github.com/Natooz/MidiTok/pull/124);
* New ruff rules were enabled and the code was fixed to comply with them, improving code quality (https://github.com/Natooz/MidiTok/pull/115);
* MidiTok still supports `miditoolkit.MidiFile` objects, but those will be converted on the fly to a `symusic.Score` object and a deprecation warning will be thrown;
* The token-level data augmentation methods have been removed in favour of better data augmentation that operates directly on MIDIs, which is much faster, simplifies the process and now handles durations;
* The docs are fixed;
* The tokenization test workflows have been unified and considerably simplified, leading to more robust test assertions. We also increased the number of test cases and configurations while decreasing the test time.
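To make the ticks-per-beat change in the time signature item above more concrete, here is a small sketch of the arithmetic, assuming the beat is the note value given by the time signature denominator. The `ticks_per_beat` helper is written for this example and is not MidiTok's actual function.

```python
def ticks_per_beat(ticks_per_quarter: int, denominator: int) -> int:
    """Number of ticks in one beat for a given time signature denominator.

    A quarter note is 1/4 of a whole note, so a whole note lasts
    4 * ticks_per_quarter ticks, and a beat of value 1/denominator
    lasts that amount divided by the denominator.
    """
    return ticks_per_quarter * 4 // denominator


tpq = 480  # example resolution, in ticks per quarter note
beats = {
    "4/4": ticks_per_beat(tpq, 4),  # beat = quarter note
    "6/8": ticks_per_beat(tpq, 8),  # beat = eighth note
    "2/2": ticks_per_beat(tpq, 2),  # beat = half note
}
print(beats)
```

With a resolution of 480 ticks per quarter note, a beat lasts 480 ticks in 4/4, 240 ticks in 6/8 and 960 ticks in 2/2, so the two units only coincide when the denominator is 4.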
Other minor changes
* Setting special tokens values in TokenizerConf in https://github.com/Natooz/MidiTok/pull/114
* Update README.md by kalyani2003 in https://github.com/Natooz/MidiTok/pull/120
* Readthedocs preview action for PRs in https://github.com/Natooz/MidiTok/pull/125
New Contributors
* kalyani2003 made their first contribution in https://github.com/Natooz/MidiTok/pull/120
**Full Changelog**: https://github.com/Natooz/MidiTok/compare/v2.1.8...v3.0.0