Vak

Latest version: v1.0.3

0.4.0dev1

Note: this version was "yanked" from PyPI because of issues with how dependencies were specified.
Added
- automate generation of test data.
[274](https://github.com/vocalpy/vak/pull/274)
This pull request also adds concept of 'source' and 'generated' test data,
and decouples them from the source code in other ways, e.g. adding
a Makefile command that downloads them as .tar.gz files from an
Open Science Framework project.
See details in comment on pull request:
https://github.com/vocalpy/vak/pull/274#issue-538992350
- make it possible to specify `spect_output_dir` when `prep`ing datasets,
the directory where array files containing spectrograms are saved
[290](https://github.com/vocalpy/vak/pull/290).
Addresses issue [289](https://github.com/vocalpy/vak/issues/289).
- add ability to specify `previous_run_path` when running `learncurve`,
so that training data subsets generated by a previous run are used
instead of generating new subsets. Controls for any effect of
changing training data across experiments, and makes things faster
[291](https://github.com/vocalpy/vak/pull/291)

Changed
- make it possible for labels in `labelset` to be multiple characters
[278](https://github.com/vocalpy/vak/pull/278)
- switch to `crowsetta` version 3.0.0, making it possible to specify
`csv` as an annotation format
[279](https://github.com/vocalpy/vak/pull/279)
- switch to using `soundfile` to load audio files
[281](https://github.com/vocalpy/vak/pull/281)
- switch to using `poetry` for development
[283](https://github.com/vocalpy/vak/pull/283)
- move `converters` module out of `config` sub-package up to top level
[4ad9b93](https://github.com/vocalpy/vak/commit/4ad9b9390be6ac97b3dbe2b459e94d12d35ff051)
- rename `converters.labelset_from_toml_value` to `labelset_to_set`
since it will be used throughout package (not just with .toml config files)
[4ad9b93](https://github.com/vocalpy/vak/commit/4ad9b9390be6ac97b3dbe2b459e94d12d35ff051)
- make other functions use `converters.labelset_to_set` for `labelset` argument
[35a67d8](https://github.com/vocalpy/vak/commit/35a67d87aabe82b8485162573777d06ff5571409)
[902d840](https://github.com/vocalpy/vak/commit/902d8405610e54da4645732353118439e2349946)
[062902e](https://github.com/vocalpy/vak/commit/062902ed101c8bf5ed6552c2c055a0c15d019396)
[d4e673c](https://github.com/vocalpy/vak/commit/d4e673c792532e311dfb44118e513a615377b2fb)
- rename `vak/validation.py` -> `vak/validators.py`
[9df32e2](https://github.com/vocalpy/vak/commit/9df32e24c650057fc34dd7e53c159bae24192f25)
- raise minimum versions for `crowsetta`, at least 3.1.0, and `tweetynet`, at least 0.5.0
[e1a6fbb](https://github.com/vocalpy/vak/commit/e1a6fbb9d3ccdb63167446684a8aecb3e667fd8a)
- make `vak.io.audio.to_spect` use `vak.logging.log_or_print` function
so that logger messages actually appear in terminal and in log files
[af719b3](https://github.com/vocalpy/vak/commit/af719b30faa4484f2f27a0e0a236310576e8ecb0)
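
The "log or print" behavior mentioned above can be sketched as follows. This is a minimal illustration, not the `vak` implementation: the signature, including the `level` parameter, is an assumption made for this example.

```python
import logging


def log_or_print(msg, logger=None, level="info"):
    """Send ``msg`` to ``logger`` if one is given, otherwise print it.

    Hypothetical signature; the real ``vak`` function may differ.
    """
    if logger is None:
        # no logger was passed in, so fall back to standard output
        print(msg)
    else:
        # look up the logging method by name, e.g. ``logger.info``
        getattr(logger, level)(msg)
```

A function written this way can be called from library code whether or not a command-line entry point has configured a logger.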

Fixed
- add missing import of `eval` module to `vak.cli.__init__` and organize import statements
[6341c8d](https://github.com/vocalpy/vak/commit/6341c8d4991a4e51565953f8e15d40f13419e6d5)
- fix `vak.files.from_dir` function, which returns a list of all files
from a directory with a specified extension, so that it is case-insensitive
[276](https://github.com/vocalpy/vak/pull/276)
- fix `vak.annotation.recursive_stem` function so it is case-insensitive
[c02bd8a](https://github.com/vocalpy/vak/commit/c02bd8a8d33eadeb5ce04725d63f1d2e520de737)
- fix `vak.io.audio.to_spect` so validation of `audio_files` is case-insensitive
[cbd08f6](https://github.com/vocalpy/vak/commit/cbd08f6deab7a26fbbb1814fbe6349c578dae20f)
- fix `find_audio_fname` to work with str and Path
[1480b01](https://github.com/vocalpy/vak/commit/1480b01ebc623a64a5c077c26ffdcaa242f29f3e)
- fix how `labelset_to_set` handles set, and add type-checking as pre-condition,
so that the function doesn't just return `None`
[6c454cd](https://github.com/vocalpy/vak/commit/6c454cda3aded7c0cf7ac19a6eef6f6831220033)
- use `poetry` in Makefile to run scripts that generate test data,
so that development version of `vak` is used,
not some other version that might be installed into an environment
(e.g. a `conda` environment the developer had activated)
[090c205](https://github.com/vocalpy/vak/commit/090c205e227824eda7c1b156f5320129a4809b6b)
- make `source_annot_map` have no side effects, fixes [287](https://github.com/vocalpy/vak/issues/287)
[d1cbe82](https://github.com/vocalpy/vak/commit/d1cbe82132f46f5cc400524dfefdc94de55c430b)
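
Several of the fixes above make file-extension matching case-insensitive. A minimal sketch of that idea, using only the standard library, is shown below. The function name `files_with_ext` is hypothetical and this is not the code of `vak.files.from_dir`.

```python
from pathlib import Path


def files_with_ext(dir_path, ext):
    """Return sorted paths of files in ``dir_path`` whose extension is ``ext``, ignoring case.

    Hypothetical helper for illustration only.
    """
    # normalize the requested extension: lower-case, no leading dot
    ext = ext.lower().lstrip(".")
    return sorted(
        str(path)
        for path in Path(dir_path).iterdir()
        # compare lower-cased suffixes so 'song.WAV' matches 'wav'
        if path.is_file() and path.suffix.lower() == f".{ext}"
    )
```

With this comparison, a directory containing both `a.wav` and `b.WAV` yields both files when `ext="wav"`.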

Removed
- remove `tweetynet` as a core dependency, since this creates a
circular dependency (`tweetynet` definitely depends on `vak`)
that prevents using `conda-forge`. Instead declare `tweetynet` as
a test dependency.
[74350a7](https://github.com/vocalpy/vak/commit/c26ad08bfd4057324ba55a1902f7dc2845bc6e40)
- remove `output_dir` parameter from `dataframe.from_files` -- not used
[286](https://github.com/vocalpy/vak/pull/286)
- remove filtering by `labelset` in `dataframe.from_files`
[7dbdc23](https://github.com/vocalpy/vak/commit/7dbdc233a0776e6c205a65ee062f2dce9d479af8)

0.3.3

Fixed
- remove out-of-date install instructions that were confusing people
[268](https://github.com/vocalpy/vak/pull/268)

0.3.2

Fixed
- fix wrong argument value in call to imshow in `plot.spect_annot` function
[648b675](https://github.com/vocalpy/vak/commit/648b675221472f6bcd2750262c57dd8a761099e0)
- fix bug that caused `vak.config.parse` to silently fail when parsing the
`[SPECT_PARAMS]` section of config.toml files
[266](https://github.com/vocalpy/vak/pull/266)

0.3.1

Fixed
- fix `RuntimeError` under torch 1.6 caused by
dividing a tensor by an integer in `Model._eval()` method
[250](https://github.com/vocalpy/vak/pull/250).
Fixes [249](https://github.com/vocalpy/vak/issues/249).

0.3.0

Added
- add functionality to `WindowDataset` that enables training with datasets
of specified durations [188](https://github.com/vocalpy/vak/pull/186)
- add transforms for post-hoc clean up of predicted labels for time bins,
that are applied before converting into segments with labels, onsets, and offsets
+ `majority_vote_transform` that finds the most frequently occurring label in a segment
and assigns it to the entire segment [227](https://github.com/vocalpy/vak/pull/227)
+ `remove_short_segments` that removes any segments shorter than a specified duration
[229](https://github.com/vocalpy/vak/pull/229)
- add logic to `WindowDataset.crop_spect_vectors_keep_classes` method so that it tries
to crop a third way, by removing unlabeled segments within vocalizations, if cropping
the specified duration from the end or beginning fails
[224](https://github.com/vocalpy/vak/pull/224)
- add ability to specify name of .csv file containing annotations produced by
`vak.core.predict` [232](https://github.com/vocalpy/vak/pull/232)
- make it so that ItemTransforms (optionally) return path to array files
containing spectrograms, so user can easily link train/test/predict data
returned by `DataLoader` to the source file
[236](https://github.com/vocalpy/vak/pull/236)
- add functions for plotting spectrograms and annotation to `plot` sub-package
[245](https://github.com/vocalpy/vak/pull/245)
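
The two clean-up transforms listed above can be illustrated with a small sketch. It assumes a segment is given either as a list of per-time-bin labels (for the majority vote) or as a `(label, onset, offset)` tuple in seconds (for the duration filter). Those data layouts and both function signatures are assumptions for this example, not the `vak` API.

```python
from collections import Counter


def majority_vote(segment_labels):
    """Replace every time-bin label in one segment with the segment's most frequent label."""
    most_common_label, _ = Counter(segment_labels).most_common(1)[0]
    return [most_common_label] * len(segment_labels)


def remove_short_segments(segments, min_dur):
    """Keep only (label, onset, offset) segments that last at least ``min_dur`` seconds."""
    return [
        (label, onset, offset)
        for label, onset, offset in segments
        if offset - onset >= min_dur
    ]
```

For example, `majority_vote(["a", "a", "b"])` gives `["a", "a", "a"]`, which smooths over a single mislabeled time bin inside a segment.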

Changed
- refactor to remove `util`s modules [196](https://github.com/vocalpy/vak/pull/196)
- add `core.predict` module and rewrite `cli.predict` to use it
[210](https://github.com/vocalpy/vak/pull/210)
- modify `vak.split.algorithms.brute_force` so that it
starts by seeding each split with one instance of each
label in the label set. Quick tests found that this
improves success rate of splits on one dataset
with many (30) classes.
[218](https://github.com/vocalpy/vak/pull/218)
- change `core.predict` so that it always saves
predicted annotations as a .csv file
[222](https://github.com/vocalpy/vak/pull/222).
Removed functionality for converting to other formats.
See discussion in [212](https://github.com/vocalpy/vak/issues/211)
- change warning issued by `split.train_test_dur_split_inds` to a log
statement [231](https://github.com/vocalpy/vak/pull/231)
- use `VocalDataset` in `core.predict`,
see discussion in issue [206](https://github.com/vocalpy/vak/issues/206)
[242](https://github.com/vocalpy/vak/pull/242)
- revise README [248](https://github.com/vocalpy/vak/pull/248)

Fixed
- change references to `config.ini` in docstrings to `config.toml`
[190](https://github.com/vocalpy/vak/pull/190)
- fix error type in `config.predict` [197](https://github.com/vocalpy/vak/pull/197)
- add missing `to_format_kwargs` attribute to `PredictConfig` docstring
[210](https://github.com/vocalpy/vak/pull/210)
- add missing parameter in `transforms.default.get_defaults`
[210](https://github.com/vocalpy/vak/pull/210)
- add missing import in `cli.predict`
[210](https://github.com/vocalpy/vak/pull/210)
- revise `autoannotate` tutorial to include missing steps in `predict`
[210](https://github.com/vocalpy/vak/pull/210)
- fix up `config.toml` files that are used with `autoannotate` tutorial
[210](https://github.com/vocalpy/vak/pull/210)
- fix variable name error in `WindowDataset.crop_spect_vectors_keep_classes` method
[215](https://github.com/vocalpy/vak/pull/215)
- fix bug in `WindowDataset.crop_spect_vectors_keep_classes`
[217](https://github.com/vocalpy/vak/issues/217)
that caused `x_inds` to have invalid values when the
`WindowDataset.crop_spect_vectors_keep_classes` function
cropped the vectors to a specified duration "from the front"
[219](https://github.com/vocalpy/vak/pull/219)
- remove line that caused `vak predict` to crash
[211](https://github.com/vocalpy/vak/issues/211)
when model was trained without a `SpectStandardizer` transform
[221](https://github.com/vocalpy/vak/pull/221)
- fix bugs that prevented `vak eval` cli command from working
[238](https://github.com/vocalpy/vak/pull/238)
- fix bug in `labels.lbl_tb2labels` (https://github.com/vocalpy/vak/issues/239)
that resulted from lack of input validation and an indentation error
[240](https://github.com/vocalpy/vak/pull/240)
- fix how segment onsets and offsets are converted from time bin "units"
back to seconds [246](https://github.com/vocalpy/vak/pull/246).
Fixes [237](https://github.com/vocalpy/vak/issues/237).
- fix .toml config file used with "autoannotate" tutorial,
and revise related section of tutorial on prediction
[247](https://github.com/vocalpy/vak/pull/247).
Fixes [223](https://github.com/vocalpy/vak/issues/223).
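
As a rough illustration of converting from time bin "units" back to seconds, the sketch below groups runs of identical per-time-bin labels into segments and scales bin indices by the time bin duration. It assumes an onset is the left edge of a segment's first bin and an offset is the right edge of its last bin. That convention and the function name are assumptions for this example and are not taken from pull request 246.

```python
def labels_to_segments(timebin_labels, timebin_dur):
    """Group runs of identical time-bin labels into (label, onset_s, offset_s) segments.

    Hypothetical helper; onsets and offsets are bin edges scaled by ``timebin_dur``.
    """
    segments = []
    start = 0
    for i in range(1, len(timebin_labels) + 1):
        # close the current run at the end of the vector or when the label changes
        if i == len(timebin_labels) or timebin_labels[i] != timebin_labels[start]:
            segments.append(
                (timebin_labels[start], start * timebin_dur, i * timebin_dur)
            )
            start = i
    return segments
```

With a time bin duration of 0.5 s, the labels `["a", "a", "b"]` become one segment for `"a"` from 0.0 s to 1.0 s and one for `"b"` from 1.0 s to 1.5 s.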

Removed
- remove `bin/` that contained scripts used with previous version of `vak`
[226](https://github.com/vocalpy/vak/pull/226)
- remove mentions of `.ini` config files from documentation
[248](https://github.com/vocalpy/vak/pull/248)

0.3.0a5

Added
- add functions `format_from_df` and `from_df` to `vak.util.annotation`
[107](https://github.com/vocalpy/vak/pull/107)
+ `vak.util.annotation.format_from_df` returns annotation format associated with a
dataset. Raises an error if there is more than one annotation format or if the format is None.
+ `vak.util.annotation.from_df` function returns list of annotations
(i.e. `crowsetta.Annotation` instances), one corresponding to each row in the dataframe `df`.
- encapsulates control flow logic for getting all labels from a dataset of
annotated vocalizations represented as a Pandas DataFrame
+ handles case where each vocalization has a separate annotation file
+ and the case where all vocalizations have annotations in a single file
- `vak.util.labels.from_df` function [103](https://github.com/vocalpy/vak/pull/103)
+ checks for a single annotation type, loads all annotations, and then gets just the labels from those
+ modified to use `util.annotation.from_df` and `vak.util.annotation.format_from_df`
in [107](https://github.com/vocalpy/vak/pull/107)
- logic in `vak.cli.prep` that raises an informative error message when config.toml file specifies
a duration for training set, but durations for validation and test sets are zero or None
[108](https://github.com/vocalpy/vak/pull/108)
+ since there's no functionality for making only one dataset of a specified duration
- 3 transform classes, and `vak.transforms.util` module [112](https://github.com/vocalpy/vak/pull/112)
+ with `get_defaults` function
- encapsulates logic for building transforms, to make `train`, `predict` etc. less verbose
+ obeys DRY, avoids declaring the same utility transforms like to_floattensor and add_channel in
multiple functions
- add `labelset_from_toml_value` to converters [115](https://github.com/vocalpy/vak/pull/115)
+ casts any value for the `labelset` option in a .toml config file to a set of characters
[127](https://github.com/vocalpy/vak/pull/127)
+ uses `vak.util.general.range_str` so that user can specify
set of labels with a "range string", e.g. `range: 1-27, 29` [115](https://github.com/vocalpy/vak/pull/115)
- add logging module in `vak.util` [132](https://github.com/vocalpy/vak/pull/132)
- add converters and validators for dataset split durations [143](https://github.com/vocalpy/vak/pull/143)
- add `logger` parameters to `io` sub-package functions, so they can use logger created by `cli` functions
[145](https://github.com/vocalpy/vak/pull/145)
- add `log_or_print` function to `util.logging` that either writes message to logger,
or simply prints the message if there is no logger [147](https://github.com/vocalpy/vak/pull/147)
- add `logger` attribute to `vak.Model` class, used to log if not None
[148](https://github.com/vocalpy/vak/pull/148)
- add Tensorboard `SummaryWriter` to `vak.Model` class so there is an `events` file recording each
model's training history [149](https://github.com/vocalpy/vak/pull/149)
+ and add Tensorboard as a dependency in [162](https://github.com/vocalpy/vak/pull/162)
- add additional logging to `Model` class [153](https://github.com/vocalpy/vak/pull/153)
- add initial tutorial on using `vak` for automated annotation of vocalizations
[156](https://github.com/vocalpy/vak/pull/156)
- add `VocalDataset`, more generalized form of a dataset where the input to a network is contained in a source
file, e.g. a .npz array file with a spectrogram, and the optional target is the annotation
[165](https://github.com/vocalpy/vak/pull/165)
- add `transforms.defaults` with `ItemTransforms` that return dictionaries. Decouples logic for
what will be in returned "items" from the different dataset classes [165](https://github.com/vocalpy/vak/pull/165)
- add `eval` command to command-line interface [179](https://github.com/vocalpy/vak/pull/179)
- add `vak.core` sub-package with "core" functions that are called by corresponding functions in
`vak.cli`, e.g. `vak.cli.train` calls `vak.core.train`; de-couples high-level functionality from
command-line interface, and makes it possible for one high-level function to call another, i.e.,
`vak.core.learncurve` calls `vak.core.train` and `vak.core.eval`
[183](https://github.com/vocalpy/vak/pull/183)
- add computation of distance metrics to `Model._eval` method
[185](https://github.com/vocalpy/vak/pull/185)
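
The "range string" for `labelset` described above, e.g. `range: 1-27, 29`, could be expanded into a set of labels along the following lines. This is a sketch under stated assumptions: the function name is hypothetical, range endpoints are treated as inclusive, and labels are returned as strings. It does not reproduce `vak.util.general.range_str`.

```python
def labelset_from_range_str(value):
    """Expand a string such as 'range: 1-3, 5' into the set {'1', '2', '3', '5'}.

    Hypothetical helper; assumes inclusive endpoints and string labels.
    """
    prefix = "range:"
    if not value.startswith(prefix):
        raise ValueError(f"expected a string starting with {prefix!r}, got {value!r}")
    labels = set()
    for part in value[len(prefix):].split(","):
        part = part.strip()
        if "-" in part:
            # a span like '1-27' contributes every integer from 1 through 27
            start, stop = part.split("-")
            labels.update(str(n) for n in range(int(start), int(stop) + 1))
        else:
            # a single number like '29' contributes just that label
            labels.add(str(int(part)))
    return labels
```

Returning strings keeps the labels comparable with annotation labels that are loaded as text rather than as integers.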

Changed
- rewrite `vak.util.dataset.has_unlabeled` to use `annotation.from_df`
[107](https://github.com/vocalpy/vak/pull/107)
- bump minimum version of `TweetyNet` to 0.3.1 in [120](https://github.com/vocalpy/vak/pull/120)
+ so that `yarden2annot` function from `TweetyNet` will return annotation labels as string, not int
- rewrite `vak.util.annotation.source_annot_map` so that it maps annotations *to* source files, not
vice versa [130](https://github.com/vocalpy/vak/pull/130)
+ more specifically, it no longer crashes if it can't map every annotation to a source file
+ instead it crashes if it can't map every source file to an annotation
- change `vak.annotation.from_df` to better handle single annotation files
[131](https://github.com/vocalpy/vak/pull/131)
+ no longer crashes if the number of annotations from the file does not exactly match the number of source files
+ instead only requires that there be at least as many annotations as there are source files
- rewrite `vak.util.labels.from_df` to use `vak.util.annotation.from_df`
[131](https://github.com/vocalpy/vak/pull/131)
- rewrite `WindowDataset` to use `annotation.from_df` function [113](https://github.com/vocalpy/vak/pull/113)
- change default value for util.general.timebin_dur_from_vec parameter n_decimals_trunc from 3 to 5
[136](https://github.com/vocalpy/vak/pull/136)
- rewrite + rename `splitalgos.validate.durs` [143](https://github.com/vocalpy/vak/pull/143)
- parallelize validation of spectrogram files, so it's faster on large datasets
[144](https://github.com/vocalpy/vak/pull/144)
- bump minimum version of `TweetyNet` to 0.4.0 in [155](https://github.com/vocalpy/vak/pull/155)
+ so `TweetyNetModel.from_class` method accepts `logger` argument
- change checkpointing and validation so that they occur on specific steps, not epochs.
[161](https://github.com/vocalpy/vak/pull/161)
This way models with very large training sets that may run for only 1-2 epochs still intermittently save
checkpoints as backups and measure performance on the validation set.
- change names of `TrainConfig` attributes `val_error_step` and `checkpoint_step` to `val_step` and `ckpt_step`
for brevity + clarity. [161](https://github.com/vocalpy/vak/pull/161) Also changed the names of the
corresponding `vak.Model.fit` method parameters to match.
- change `vak.Model._eval` method to work like `vak.cli.predict` does, feeding models non-overlapping
windows from spectrograms [165](https://github.com/vocalpy/vak/pull/165)
- change `reshape_to_window` transform to `view_as_window_batch` because it was not working as intended
[165](https://github.com/vocalpy/vak/pull/165)
- bump minimum version of `TweetyNet` to 0.4.1 in [172](https://github.com/vocalpy/vak/pull/172)
+ version that changes optimizer back to `Adam`
- raise lower bound on `crowsetta` version to 2.2.0, to get fixes for `koumura2annot`
and avoid errors when `annot_file` is provided as a `pathlib.Path` instead of a `str`
[175](https://github.com/vocalpy/vak/pull/175)
- change `Model._eval` method so it returns metrics averaged across batches, in addition to
the value for each batch
[185](https://github.com/vocalpy/vak/pull/185)
- raise minimum version of `TweetyNet` to 0.4.2, adds distance metrics to `TweetyNetModel`
[9626385](https://github.com/vocalpy/vak/commit/96263858efe880f94dc782cd8a66ec1c051f2ea1)
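
The switch to step-based validation and checkpointing described above can be sketched with a plain loop that counts global steps across epochs. The parameter names `val_step` and `ckpt_step` come from the entry above. The function itself and the event tuples it returns are assumptions for illustration, not the `vak.Model.fit` implementation.

```python
def run_training(n_epochs, batches_per_epoch, val_step, ckpt_step):
    """Record when validation and checkpointing would run, counted in global steps."""
    global_step = 0
    events = []
    for _ in range(n_epochs):
        for _ in range(batches_per_epoch):
            global_step += 1  # one optimizer update would happen here
            if global_step % val_step == 0:
                events.append(("validate", global_step))
            if global_step % ckpt_step == 0:
                events.append(("checkpoint", global_step))
    return events
```

Because the counter is not reset at epoch boundaries, a run of a single long epoch still validates and saves checkpoints at regular intervals.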

Fixed
- add missing `shuffle` option to [TRAIN] and [LEARNCURVE] sections in `valid.toml`
[109](https://github.com/vocalpy/vak/pull/109)
- fix bug that prevented filtering out a vocalization from a dataset when it contains labels
that are not in the specified labelset [118](https://github.com/vocalpy/vak/pull/118)
- fix logging for `vak.prep` command [132](https://github.com/vocalpy/vak/pull/132)
- fix how dataset duration splits are validated [143](https://github.com/vocalpy/vak/pull/143),
see issue [140](https://github.com/vocalpy/vak/issues/140) for details.
- fix error due to calling a Path attribute on a string [144](https://github.com/vocalpy/vak/pull/144)
as identified in issue [123](https://github.com/vocalpy/vak/issues/123)
- fix indent error in `Model.fit` method (see issue [151](https://github.com/vocalpy/vak/issues/151))
that stopped training early [153](https://github.com/vocalpy/vak/pull/153)
- fix bug [166](https://github.com/vocalpy/vak/issues/166)
that let training continue even after `patience` number of validation steps had elapsed
without an increase in accuracy [168](https://github.com/vocalpy/vak/pull/168)
- fix `learncurve` functionality so it will work in version `0.3.0`
[183](https://github.com/vocalpy/vak/pull/183)
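
The `patience` behavior referred to above, stopping once a given number of validation steps pass without an increase in accuracy, can be sketched as follows. The function name and the use of a plain list of accuracies are assumptions for this example. It shows the general early-stopping rule, not the code in `Model.fit`.

```python
def steps_until_stop(val_accuracies, patience):
    """Return how many validation steps run before early stopping triggers.

    If accuracy never stalls for ``patience`` consecutive steps, all steps run.
    """
    best = float("-inf")
    steps_without_gain = 0
    for n, accuracy in enumerate(val_accuracies, start=1):
        if accuracy > best:
            # a new best accuracy resets the counter
            best = accuracy
            steps_without_gain = 0
        else:
            steps_without_gain += 1
            if steps_without_gain >= patience:
                return n
    return len(val_accuracies)
```

Note that the counter must be checked at every validation step. Skipping that check is the kind of mistake that would let training continue past the patience limit.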

Removed
- remove `vak.util.general.safe_truncate` function, no longer used
[137](https://github.com/vocalpy/vak/issues/137)
- remove redundant validation of split durations in `util.split`
[143](https://github.com/vocalpy/vak/pull/143)
- remove `save_only_single_checkpoint_file` option and functionality
[161](https://github.com/vocalpy/vak/pull/161).
Now save only one checkpoint as backup, and another for best performance on validation set if provided.
See discussion in pull request and the issues it fixes for more detail.
