A few tweaks to accompany a pip release.
1. Changed the cache to use an LRU cache, which should speed things up a bit and make smaller cache sizes practical.
2. Changed the generation to use a lookup table for the random projection rather than recasting the binary bits each time. This should be faster, although I admit I haven't benchmarked it. The idea is that for each integer between 0 and `2**16 - 1`, we can store the unique sequence of SRP signs (e.g., `[1, -1, 1, 1, ...]`) that its 16 bits correspond to.
Then, rather than futzing around with the hexdigest, the binary sha1 digest can be cast directly into a series of uint16 integers (ten of them, since the digest is 20 bytes), and these can index the lookup table directly. Each table entry is 16 floats long, so flattening the looked-up entries gives the final array in one step.
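
For reviewers, here is a minimal sketch of the lookup-table idea, not the code in this PR. The names `build_sign_table`, `SIGN_TABLE`, and `token_signs` are hypothetical, and the bit order (most significant bit first), the byte order (big-endian), the mapping of bit 0 to -1 and bit 1 to +1, and the cache size are all assumptions made for illustration. `functools.lru_cache` stands in for whatever LRU cache the package actually uses.

```python
import hashlib
from functools import lru_cache

import numpy as np

BITS = 16  # each lookup-table row covers one uint16


def build_sign_table(bits: int = BITS) -> np.ndarray:
    """Row i holds the +1/-1 pattern for the bits of integer i, MSB first (assumed order)."""
    values = np.arange(2 ** bits, dtype=np.uint32)
    shifts = np.arange(bits - 1, -1, -1, dtype=np.uint32)
    # Shape (2**bits, bits): entry [i, j] is the j-th bit of i, counting from the MSB.
    bit_matrix = (values[:, None] >> shifts) & 1
    # Map bit 0 -> -1.0 and bit 1 -> +1.0 (assumed sign convention).
    return (bit_matrix * 2.0 - 1.0).astype(np.float32)


SIGN_TABLE = build_sign_table()  # 65536 x 16 float32, about 4 MB


@lru_cache(maxsize=4096)  # illustrative cache size, not a recommendation
def token_signs(token: str) -> np.ndarray:
    """Return the 160 signs derived from the sha1 digest of `token`."""
    digest = hashlib.sha1(token.encode("utf-8")).digest()  # 20 raw bytes
    # Reinterpret the digest as ten big-endian uint16 values (assumed byte order).
    indices = np.frombuffer(digest, dtype=">u2").astype(np.intp)
    # Ten rows of 16 floats each, flattened into one vector of length 160.
    return SIGN_TABLE[indices].ravel()
```

Because the cached function returns the same array object on a cache hit, callers in this sketch should treat the result as read-only, or copy it before modifying it in place.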