Improves the audio transcription and sentiment extraction workflows. Refactors the voice feature extraction workflow and adds several new voice features.
Added
- Docker containers are now versioned via tags, and the container components automatically fetch the container that matches the installed version of mexca; the container with the `:latest` tag can be fetched by passing the argument `get_latest_tag=True` (65)
- Classes for extracting voice features (66):
  - `AudioSignal`, `BaseSignal` for loading and storing signals in the `mexca.audio.features` module
  - `BaseFrames`, `FormantFrames`, `FormantAmplitudeFrames`, `HnrFrames`, `JitterFrames`, `PitchFrames`, `PitchHarmonicsFrames`, `PitchPeriodFrames`, `PitchPulseFrames`, `ShimmerFrames`, `SpecFrames` for computing and storing formant features, glottal pulse features, and pitch features in the `mexca.audio.features` module
  - `BaseFeature`, `FeaturePitchF0`, `FeatureJitter`, `FeatureShimmer`, `FeatureHnr`, `FeatureFormantFreq`, `FeatureFormantBandwidth`, `FeatureFormantAmplitude` for extracting and interpolating voice features in the `mexca.audio.extraction` module
- An `all` extra requirements group that installs the requirements for all of mexca's components (i.e., `pip install mexca[all]`) (64)
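
The version-matched container tags described in this list can be pictured with a minimal sketch. This is not mexca's implementation: the function names, the `v` tag prefix, and the `example/component` repository name are assumptions made for illustration, and only the `get_latest_tag` argument name comes from the entry above.

```python
def resolve_image_tag(installed_version: str, get_latest_tag: bool = False) -> str:
    """Pick a Docker tag for a container component (hypothetical helper).

    Assumption: images are tagged as 'v<package version>'; the real tag
    scheme may differ.
    """
    if get_latest_tag:
        # Explicit opt-in to the most recent image, ignoring the installed version.
        return "latest"
    return f"v{installed_version}"


def image_reference(repository: str, installed_version: str,
                    get_latest_tag: bool = False) -> str:
    """Build a full '<repository>:<tag>' image reference (hypothetical helper)."""
    tag = resolve_image_tag(installed_version, get_latest_tag)
    return f"{repository}:{tag}"


# 'example/component' is a placeholder repository name, not a real mexca image.
pinned = image_reference("example/component", "0.3.0")
latest = image_reference("example/component", "0.3.0", get_latest_tag=True)
```

Pinning the tag to the installed version by default keeps the client code and the container in step, while `get_latest_tag=True` remains an explicit opt-in.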
Changed
- The `SentimentData` class now has a `text` instead of an `index` attribute, which is used for matching sentiment to transcriptions (63)
- The sentence sentiment is merged separately from the transcription in `Multimodal._merge_audio_text_features()` (63)
- librosa (version 0.9) replaces parselmouth as a requirement for the VoiceExtractor component; the voice feature extraction now relies on librosa instead of Praat (66)
- stable-ts is required to be version 1.1.5 for compatibility with Python 3.7; in a future version, we might remove stable-ts as a dependency (67)
- transformers is added as a requirement for the AudioTranscriber component (67)
- scipy is moved to the general requirements for all components (66)
- The `VoiceExtractor` class and component are refactored with new default features (66)
- Tests make better use of fixtures for cleaner and more reusable code (63)
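
The switch from an `index` to a `text` attribute in `SentimentData` amounts to matching sentiment scores to transcribed sentences by their text rather than by position. The sketch below illustrates that idea under stated assumptions: the `Sentence` and `Sentiment` classes, their fields, and `match_sentiment()` are hypothetical stand-ins, not mexca's data structures or the logic of `Multimodal._merge_audio_text_features()`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Sentence:
    """Hypothetical transcribed sentence with start and end time in seconds."""
    text: str
    start: float
    end: float


@dataclass
class Sentiment:
    """Hypothetical sentiment scores, keyed by the sentence text."""
    text: str
    pos: float
    neg: float
    neu: float


def match_sentiment(sentences: List[Sentence],
                    sentiments: List[Sentiment]) -> List[Optional[Sentiment]]:
    """Return one sentiment (or None) per sentence, matched on identical text."""
    by_text: Dict[str, Sentiment] = {s.text: s for s in sentiments}
    return [by_text.get(sentence.text) for sentence in sentences]


sentences = [
    Sentence("Hello there.", 0.0, 1.2),
    Sentence("How are you?", 1.5, 2.4),
    Sentence("Fine.", 2.6, 3.0),
]
# The sentiment list is in a different order and lacks one sentence on purpose.
sentiments = [
    Sentiment("How are you?", pos=0.2, neg=0.1, neu=0.7),
    Sentiment("Hello there.", pos=0.6, neg=0.1, neu=0.3),
]
matched = match_sentiment(sentences, sentiments)
```

Matching on text does not depend on both lists sharing the same order or length, which a positional index does. One limitation of this sketch is that repeated identical sentences would all map to the same sentiment entry.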
Fixed
- An error in the audio transcription that occurred for extremely short speech segments below the precision of whisper and stable-ts (63)
Removed
- The `toml` extra for the coverage requirement in the `dev` group (67)