The bugfix in https://github.com/kundajelab/tfmodisco/pull/47 broke backward compatibility with some earlier versions of numpy. This tagged release incorporates a fix to restore backward compatibility (commit https://github.com/kundajelab/tfmodisco/commit/6be7ea5084589eec15b79c805317b65bff5573d9) and also makes a minor adjustment to the gapped kmer embedding calculation such that forward and reverse-complement versions of a seqlet now give exactly symmetrical embeddings within numerical precision (commit https://github.com/kundajelab/tfmodisco/commit/19461fab8047617604d97551c3e26d49a49d68cd).
To elaborate on why the forward and reverse-complement versions of a seqlet did not give perfectly symmetrical embeddings prior to this fix: consider gapped kmers with a word length of 3 and one gap. Previously, I was treating \*NN and NN\* (e.g. \*AA and AA\*) as though they were redundant with each other, so I only used one of them when computing the embedding. However, \*AA and AA\* can produce different results due to the difference in padding. Concretely, a seqlet with the sequence AAGGG contains the AA\* gapped kmer but does NOT contain the \*AA gapped kmer. Thus, when I was including only the AA\* and TT\* gapped kmers in the embedding and NOT the \*AA and \*TT gapped kmers, a seqlet with the sequence AAGGG would be recorded as containing the AA\* gapped kmer, but its reverse complement CCCTT would NOT be recorded as having any TT-containing gapped kmer (it contains \*TT but not TT\*), so symmetry was broken. With this fix, I now include BOTH AA\* and \*AA as well as BOTH TT\* and \*TT as features in the gapped kmer embedding. An AAGGG seqlet is therefore recorded as having a match to AA\*, while its reverse complement CCCTT is recorded as having a match to \*TT, and symmetry is preserved.
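
The AAGGG / CCCTT example can be reproduced with a small toy script. This is only an illustrative sketch and not the code from the commits above: the helper names `reverse_complement` and `gapped_kmers` and the `include_leading_gap` flag are hypothetical, the features are collected as a plain set rather than an embedding vector, and only windows that lie fully inside the seqlet are considered (no padding).

```python
# Toy illustration only; names and structure are assumptions, not tfmodisco code.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "*": "*"}


def reverse_complement(s):
    """Reverse-complement a sequence or a gapped kmer ('*' marks the gap)."""
    return "".join(COMPLEMENT[ch] for ch in reversed(s))


def gapped_kmers(seq, include_leading_gap):
    """Collect gapped kmers of word length 3 with one gap.

    Only windows fully inside `seq` are used. When `include_leading_gap`
    is False, features of the form *NN are dropped, mimicking the old
    behaviour where *NN was assumed redundant with NN*.
    """
    features = set()
    for start in range(len(seq) - 2):
        window = seq[start:start + 3]
        for gap_pos in range(3):
            if gap_pos == 0 and not include_leading_gap:
                continue
            features.add(window[:gap_pos] + "*" + window[gap_pos + 1:])
    return features


seqlet = "AAGGG"
rc_seqlet = reverse_complement(seqlet)

# Old behaviour: *NN features are left out.
old_fwd = gapped_kmers(seqlet, include_leading_gap=False)
old_rev = gapped_kmers(rc_seqlet, include_leading_gap=False)

# New behaviour: both NN* and *NN are kept.
new_fwd = gapped_kmers(seqlet, include_leading_gap=True)
new_rev = gapped_kmers(rc_seqlet, include_leading_gap=True)

# Symmetry holds when the features of the reverse-complemented seqlet are
# exactly the reverse complements of the features of the forward seqlet.
is_symmetric = new_rev == {reverse_complement(f) for f in new_fwd}
```

In this sketch, `old_fwd` contains `AA*` while `old_rev` has no feature containing `TT`, which is the asymmetry described above. With leading-gap features included, `new_rev` contains `*TT`, and the feature set of the reverse complement mirrors that of the forward seqlet.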