Fixes
- MPI3D was not loaded correctly; the first few factors were misaligned
+ Recomputed statistics for new datasets and updated configs
Additions
- Added `disent.dataset.data.DataFileSprites`, a **custom** version of [sprites](https://paperswithcode.com/dataset/sprites)
+ Added experiment configs and computed dataset statistics
- Multiple versions of `disent.dataset.data.Mpi3dData` now exist for different use cases, because the dataset is so large
+ Added `Mpi3dHdf5Data` -- converts the files to hdf5 so they can be streamed from disk, but very slow to load into memory directly
+ Added `Mpi3dNumpyData` -- loads the files directly into memory (quick), but cannot read from disk
+ changed: `Mpi3dData` is now a wrapper around both of the above, and the mode can be specified with `in_memory`
- `disent.dataset.util.state_space.StateSpace`
+ Added init checks
+ Added helper method `invert_factor_idxs` that returns the factor indices that were not specified, i.e. the inverse set.
+ Added helper method `sample_indices` that samples valid indices in the range of the dataset.
+ Sampling and other methods that take in factors now first call `normalise_factor_idxs`, so factor names can be passed to these functions instead of indices.
+ Added helper method `sample_random_factor_traversal_grid` that samples a grid of traversals, one for each ground-truth factor.
- Added `disent.util.inout.paths.modify_ext(...)` that modifies the extension of a path
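
As a rough illustration of what the `StateSpace` helpers listed above are described as doing, here is a minimal, self-contained sketch. `TinyStateSpace` is a hypothetical stand-in written for these notes, not the disent implementation; the constructor arguments, return types and the `seed` parameter are assumptions, and only the method names come from the list above.

```python
import math
import random


class TinyStateSpace:
    """Hypothetical stand-in for a state space; NOT the disent implementation."""

    def __init__(self, factor_sizes, factor_names):
        # Init checks: one name per factor, and every factor needs at least one value.
        if len(factor_sizes) != len(factor_names):
            raise ValueError("factor_sizes and factor_names must have the same length")
        if any(size < 1 for size in factor_sizes):
            raise ValueError("every factor must have at least one value")
        self.factor_sizes = tuple(factor_sizes)
        self.factor_names = tuple(factor_names)

    def __len__(self):
        # The number of observations is the product of the factor sizes.
        return math.prod(self.factor_sizes)

    def normalise_factor_idxs(self, factors):
        # Accept factor names or integer indices; always return integer indices.
        return [
            self.factor_names.index(f) if isinstance(f, str) else int(f)
            for f in factors
        ]

    def invert_factor_idxs(self, factors):
        # Return the indices of every factor that was NOT specified.
        given = set(self.normalise_factor_idxs(factors))
        return [i for i in range(len(self.factor_sizes)) if i not in given]

    def sample_indices(self, size, seed=None):
        # Sample valid dataset indices in the range [0, len(self)).
        rng = random.Random(seed)
        return [rng.randrange(len(self)) for _ in range(size)]


space = TinyStateSpace(factor_sizes=(3, 4, 5), factor_names=("shape", "scale", "angle"))
inverse = space.invert_factor_idxs(["scale"])  # factor given by name, not index
indices = space.sample_indices(8, seed=0)
```

Normalising names to indices in one place keeps every other method free of string handling, which is presumably why the sampling methods now call `normalise_factor_idxs` first.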
Breaking changes
- Moved `disent.dataset.util.npz` to `disent.dataset.util.formats.npz`
- Moved `disent.dataset.util.hdf5` to `disent.dataset.util.formats.hdf5`
- `disent.util.inout.hashing.hash_file` now has `missing_ok=False` by default
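
To show what a `missing_ok=False` default means for callers, here is a hedged sketch of a file-hashing helper with that flag. `hash_file_sketch` is a hypothetical function written for these notes. The `hash_type` parameter, the chunk size and the choice to return `None` for a missing file when `missing_ok=True` are all assumptions, not the confirmed behaviour of `hash_file`.

```python
import hashlib
import os
import tempfile


def hash_file_sketch(path, hash_type="md5", missing_ok=False):
    """Hypothetical helper illustrating a `missing_ok` flag; NOT disent's `hash_file`."""
    if not os.path.isfile(path):
        if missing_ok:
            # Assumption: a missing file is reported as `None` rather than raising.
            return None
        raise FileNotFoundError(f"cannot hash missing file: {path!r}")
    hasher = hashlib.new(hash_type)
    with open(path, "rb") as f:
        # Read in chunks so large dataset files are never fully loaded into memory.
        for chunk in iter(lambda: f.read(65536), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


with tempfile.TemporaryDirectory() as tmp_dir:
    existing = os.path.join(tmp_dir, "data.bin")
    missing = os.path.join(tmp_dir, "missing.bin")
    with open(existing, "wb") as f:
        f.write(b"hello")

    digest = hash_file_sketch(existing)
    tolerated = hash_file_sketch(missing, missing_ok=True)
    try:
        hash_file_sketch(missing)  # the strict default raises
        raised = False
    except FileNotFoundError:
        raised = True
```

Code that relied on missing files being tolerated silently would need to pass `missing_ok=True` explicitly after this change.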
Minor Fixes
- `stalefile` now correctly handles missing files
- Various plotting fixes; plotting functions now support RGBA images, not just greyscale or RGB images.
New Tests
- Added some new tests for both dataset formats and state spaces
TODO:
- Added `Teapots3dData`, but it is not complete. It needs to be converted to a "random" dataset, because it does not have valid ground-truth factors in the form of a state space; its factors are randomly sampled instead.