Added
- Add unit tests for `csv.has_unlabeled`
[541](https://github.com/vocalpy/vak/pull/541).
Fixes [102](https://github.com/vocalpy/vak/issues/102).
- Add unit tests for `__main__`
[542](https://github.com/vocalpy/vak/pull/542).
Fixes [337](https://github.com/vocalpy/vak/issues/337).
- Add validation of `labels` argument to `vak.split.algorithms.brute_force`,
to prevent conditions where algorithm can fail to converge
because of bad input
[562](https://github.com/vocalpy/vak/pull/562).
Fixes [288](https://github.com/vocalpy/vak/issues/288).
- Add a "Frequently Asked Questions" page to the documentation,
and a page to the "Reference" section on file naming conventions
[564](https://github.com/vocalpy/vak/pull/564).
Fixes [524](https://github.com/vocalpy/vak/issues/524)
and [424](https://github.com/vocalpy/vak/issues/424).
- Add a new way for vak to map annotation files to annotated files
when preparing datasets, e.g. for training models.
For annotation formats that have one annotation file per
annotated file, vak can now recognize when
the annotation files are named by removing the
annotated file extension (e.g., .wav or .npz)
and replacing it with the annotation format extension,
e.g. .txt or .csv. (Other ways of relating annotations
and annotated files are still valid, e.g. by including
the original source audio file in both filenames.)
[572](https://github.com/vocalpy/vak/pull/572).
Fixes [563](https://github.com/vocalpy/vak/issues/563).
- Log version to the logfile for runs started from the command-line interface
[587](https://github.com/vocalpy/vak/pull/587).
Fixes [216](https://github.com/vocalpy/vak/issues/216).
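
The extension-swapping convention described above for annotation files can be sketched as follows. This is a minimal illustration, not vak's implementation: `annotation_path_for` and `map_annotated_to_annot` are hypothetical names, and the sketch assumes an annotation format with one annotation file per annotated file.

```python
from pathlib import Path


def annotation_path_for(annotated_path, annot_ext):
    """Hypothetical helper: replace the annotated file's extension
    (e.g. .wav) with the annotation format extension (e.g. .csv)."""
    return Path(annotated_path).with_suffix(annot_ext)


def map_annotated_to_annot(annotated_paths, annot_paths, annot_ext):
    """Hypothetical helper: pair each annotated file with the annotation
    file whose name differs only in its extension."""
    annot_by_name = {Path(p).name: Path(p) for p in annot_paths}
    pairs = {}
    for annotated in annotated_paths:
        expected_name = annotation_path_for(annotated, annot_ext).name
        if expected_name not in annot_by_name:
            # Fail loudly instead of silently skipping an unannotated file.
            raise ValueError(
                f"no annotation file named {expected_name!r} for {annotated}"
            )
        pairs[Path(annotated)] = annot_by_name[expected_name]
    return pairs
```

With these assumed helpers, `data/bird1.wav` would be paired with `annot/bird1.csv`, and a missing annotation file raises an error rather than being ignored.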
Changed
- Rewrite unit tests in `tests/test_cli/` to use mocks for `vak.core` functions
[544](https://github.com/vocalpy/vak/pull/544).
Fixes [543](https://github.com/vocalpy/vak/issues/543).
- It is now possible to load configuration files
and work with them programmatically even if the paths
they point to do not exist.
The `core` functions handle validation instead.
E.g., the `PrepConfig` class does not check whether
`output_dir` exists and is a directory, but `vak.core.prep` does.
[550](https://github.com/vocalpy/vak/pull/550).
Fixes [459](https://github.com/vocalpy/vak/issues/459).
- Refactor and speed up logic for determining whether a
dataset with sequence annotations has unlabeled segments
that should be assigned a "background" label
[559](https://github.com/vocalpy/vak/pull/559).
Fixes [243](https://github.com/vocalpy/vak/issues/243).
- Adds a new sub-sub-package, `datasets.seq`,
with a `validators` module, which is where the
re-written `has_unlabeled` function now lives.
This replaces the `vak.csv` module, which was not well named.
- Also adds a `has_unlabeled` function to `vak.annotation`
that is used by `vak.datasets.seq.validators.has_unlabeled`;
this function handles edge cases outlined in
[243](https://github.com/vocalpy/vak/issues/243).
- Rename and refactor functions in `vak.annotation`
that map annotations to the files that they annotate,
so that the purpose of the functions is clearer,
and add clearer error messages with links to documentation
about file naming conventions
[566](https://github.com/vocalpy/vak/pull/566).
Fixes [525](https://github.com/vocalpy/vak/issues/525).
- Revise "autoannotate" tutorial to use .wav audio and .csv
annotation files from new release of Bengalese Finch Song
Repository, and to suggest that Windows users unpack
archives with tar, not other programs such as WinZip
[578](https://github.com/vocalpy/vak/pull/578).
Fixes [560](https://github.com/vocalpy/vak/issues/560)
and [576](https://github.com/vocalpy/vak/issues/576).
- Change `vak.files.find_fname` and `vak.files.spect.find_audio_fname`
so they work when spaces are in filename and/or path
[594](https://github.com/vocalpy/vak/pull/594).
Fixes [589](https://github.com/vocalpy/vak/issues/589).
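
The check for unlabeled segments mentioned in this section can be illustrated with a short sketch. It is not the code in `vak.annotation`; the function signature, the assumption that segments are sorted and given as onset and offset times in seconds, and the choice to treat an annotation with no segments as entirely unlabeled are all assumptions made for illustration.

```python
def has_unlabeled(onsets_s, offsets_s, duration_s):
    """Hypothetical sketch: return True if any part of a file of length
    ``duration_s`` is not covered by an annotated segment.

    Assumes segments are sorted by onset and do not overlap.
    """
    if len(onsets_s) != len(offsets_s):
        raise ValueError("onsets_s and offsets_s must have the same length")
    if len(onsets_s) == 0:
        # Edge case: no annotated segments, so the whole file is unlabeled.
        return duration_s > 0
    if onsets_s[0] > 0:
        # Unlabeled period before the first segment.
        return True
    if offsets_s[-1] < duration_s:
        # Unlabeled period after the last segment.
        return True
    # Unlabeled gap between any two consecutive segments.
    return any(
        next_on > prev_off
        for prev_off, next_on in zip(offsets_s[:-1], onsets_s[1:])
    )
```

A dataset-level validator could then report that a "background" label is needed as soon as this returns `True` for any file.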
Fixed
- Fix how `vak.core.prep` handles `labelset` parameter.
Add pre-condition that raises a ValueError
when `labelset` is `None` but the .toml config is one of
{'train', 'learncurve', 'eval'}
[545](https://github.com/vocalpy/vak/pull/545).
Avoids running computationally expensive step of generating
and validating spectrograms *before* crashing when trying to
split the dataset using `labelset`. Also avoids silent
failures for datasets that do not require splitting,
e.g., an 'eval' set that could contain labels not in the
training set.
Fixes [468](https://github.com/vocalpy/vak/issues/468).
- Fix how `cli` and `core` functions that have the `csv_path` parameter
handle it. The parameter points to a dataset .csv generated by `vak prep`
that other `core`/`cli` functions use: `train`, `learncurve`, `eval`, `predict`.
They now validate that it exists, and if it doesn't, the `cli` functions
politely suggest running `vak prep` first; the `core` functions
raise a FileNotFoundError.
[546](https://github.com/vocalpy/vak/pull/546).
Fixes [469](https://github.com/vocalpy/vak/issues/469).
- Fix bug where `labelmap_path` parameter was ignored by `core.train`.
Change function so that either `labelmap_path` or `labelset` must
be passed in, but passing in both will raise an error.
Also change `cli.train` to only pass in one of those and set the other
to `None`.
[552](https://github.com/vocalpy/vak/pull/552).
Fixes [547](https://github.com/vocalpy/vak/issues/547).
- Fix `vak.annotation.has_unlabeled` to handle the edge case where an
annotation file has no annotated segments
[583](https://github.com/vocalpy/vak/pull/583).
Fixes [378](https://github.com/vocalpy/vak/issues/378).
- Fix `StandardizeSpect` method `fit_df` so that it computes
parameters for standardization from a specific
split of the dataset--the training split, by default--instead
of using the entire dataset, which could technically give rise
to data leakage
[584](https://github.com/vocalpy/vak/pull/584).
Fixes [575](https://github.com/vocalpy/vak/issues/575).
- Fix error message in `vak.core.eval`
[589](https://github.com/vocalpy/vak/pull/589).
Fixes [588](https://github.com/vocalpy/vak/issues/588).
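
Several fixes in this section add pre-condition checks that fail early, before any expensive work starts. The sketch below shows the general pattern under stated assumptions: the function names are hypothetical, the error messages are invented for illustration, and none of this is copied from `vak.core`.

```python
from pathlib import Path

# Config types for which `labelset` is required, per the first entry above.
PURPOSES_REQUIRING_LABELSET = {"train", "learncurve", "eval"}


def check_prep_labelset(purpose, labelset):
    """Hypothetical check: raise before generating spectrograms."""
    if purpose in PURPOSES_REQUIRING_LABELSET and labelset is None:
        raise ValueError(
            f"labelset must be specified when purpose is {purpose!r}"
        )


def check_train_label_args(labelset=None, labelmap_path=None):
    """Hypothetical check: exactly one of the two arguments is allowed."""
    if labelset is None and labelmap_path is None:
        raise ValueError("pass in either labelset or labelmap_path")
    if labelset is not None and labelmap_path is not None:
        raise ValueError("pass in labelset or labelmap_path, not both")


def check_csv_path(csv_path):
    """Hypothetical check: the dataset .csv must already exist."""
    csv_path = Path(csv_path)
    if not csv_path.exists():
        raise FileNotFoundError(f"csv_path not found: {csv_path}")
    return csv_path
```

In this sketch a command-line wrapper could catch the `FileNotFoundError` and print a suggestion to run `vak prep` first, while library callers see the exception directly.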