New features:
- Add a WebAssembly version, for single queries (#40).
- Allow random to be turned on/off in the Jaccard distance bindings (#41).
Bug fixes:
- Fix an intermittent stall in GPU distance calculations caused by a misplaced `__syncwarp()` call.
- Check for sketches that exceed the GPU shared memory size, and fall back to global memory rather than failing when they do not fit (#43).
- Check all CUDA API calls, including kernel launches.
- Remove `-march=native` from CMakeLists.txt, which may have been causing Illegal Instruction errors.
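
The shared-memory fallback listed under bug fixes can be pictured with a minimal host-side sketch. This is an illustration only, not the project's code: `SketchStorage` and `choose_sketch_storage` are hypothetical names, and the limit is assumed to come from something like the `sharedMemPerBlock` field of `cudaDeviceProp`.

```cpp
#include <cstddef>

// Hypothetical names for illustration; not taken from the project source.
enum class SketchStorage { Shared, Global };

// Decide where a query sketch should live on the device.
// Shared memory is used only when the whole sketch fits within the
// per-block limit; otherwise fall back to global memory instead of failing.
SketchStorage choose_sketch_storage(std::size_t sketch_bytes,
                                    std::size_t shared_limit_bytes) {
  return sketch_bytes <= shared_limit_bytes ? SketchStorage::Shared
                                            : SketchStorage::Global;
}
```

With a 48 KiB limit, for example, a 1 KiB sketch would be placed in shared memory, while a 64 KiB sketch would be read from global memory at some cost in speed.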