Feat
- pass all tests with weighted activity selection implementation.
- lazily open Fasta file handle, keeping it open after first read.
- generic type annotation for random_chain.
- automatically infer torch DDP usage.
- add transform to map-style dataset.
- fix: remove region from GVL.sizes, should only have batch_dims excluding region.
- better protocol typing for Reader.
- random_chain utility function to facilitate randomly chaining GVL loaders.
- experimental map-style torch dataset.
- support torch DDP by specifying distributed framework to GVL.
- fix: work-in-progress on proper max_end calculation.
- minimum batch dim sizes when shuffle=True (i.e. for training).
- parallel processing of query regions in pgen and fastavariants.
- fix: compute max deletion lengths with weighted activity selection, remark on intractable aspects of problem and when heuristic fails. Handle failure in construct haplotypes function.
- [wip] optionally converting PGEN genotypes to an N5 store, currently segfaults for unknown reasons. Gets further with longer sleep cycles.
- add chunked attribute to readers so that GVL can attempt to respect chunked layouts.
- fix: negative indices when slicing VLenAlleles.
- concat VLenAlleles.
Fix
- reset partition counters on iteration start.
- randomly sample keys in random_chain.
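
Note on "weighted activity selection" (referenced in the Feat entries above): the sketch below is the textbook dynamic program for weighted interval scheduling, not the project's actual implementation. It assumes each deletion can be modelled as a half-open interval `[start, end)` whose weight is the number of bases it removes; the function name `max_weight_non_overlapping` and the tuple layout are hypothetical.

```python
from bisect import bisect_right


def max_weight_non_overlapping(intervals):
    """Return the largest total weight of pairwise non-overlapping intervals.

    `intervals` is an iterable of (start, end, weight) tuples, treated as
    half-open ranges [start, end). This layout is an assumption made for
    illustration only.
    """
    # Sort by end coordinate so every compatible predecessor comes earlier.
    ivs = sorted(intervals, key=lambda iv: iv[1])
    ends = [iv[1] for iv in ivs]

    # best[k] = optimum using only the first k intervals in sorted order.
    best = [0] * (len(ivs) + 1)
    for i, (start, _end, weight) in enumerate(ivs):
        # p = number of earlier intervals that end at or before `start`,
        # i.e. the ones that do not overlap interval i.
        p = bisect_right(ends, start, 0, i)
        # Either skip interval i, or take it plus the best compatible prefix.
        best[i + 1] = max(best[i], best[p] + weight)
    return best[-1]
```

Under these assumptions, the result for the deletions overlapping a query region would bound how many bases a single haplotype can lose there, since overlapping deletions cannot co-occur on one haplotype.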