Feature
* feat: pass atoms dataframe to C++ via numpy
The previous commit introduced code to pass atoms to C++ via Arrow
tables. This would've been a better solution, because it would never
require copying data. Unfortunately, it requires linking to the arrow
and pyarrow libraries, and this turned out to be prohibitively
difficult. I did get it to work on my laptop, but (i) it didn't work on
the cluster due to CXX11 ABI mismatches and (ii) it made it impossible
to distribute binary wheels.
Passing atom data with numpy arrays does often require copying. The
reason is that the initial filtering step often causes the atoms
dataframe to be split into multiple chunks. I assume this is a
side-effect of some sort of parallel processing. Even so, the numpy
approach should still be better than having the whole loop in Python,
which is what I had before. For one thing, it should cause far fewer
heap allocations. ([`6ac2f3e`](https://github.com/kalekundert/macromol_voxelize/commit/6ac2f3e32179148cca80a2f723a03138f000b8c8))
* feat: pass atoms dataframe to C++ via arrow
- This commit also replaces the regular-expression-based channel
assignment algorithm with a much more efficient one. ([`3642a2a`](https://github.com/kalekundert/macromol_voxelize/commit/3642a2a2cee2e09b92032ca49b2319a42476014e))
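
The copying described in the numpy entry above can be illustrated with a minimal sketch. It assumes that a filtered column arrives as several separate chunks, modeled here as plain numpy arrays. The helper name `to_contiguous` is hypothetical and is not part of macromol_voxelize.

```python
import numpy as np

def to_contiguous(chunks):
    """Return one C-contiguous array holding all the values in `chunks`.

    Hypothetical helper: `chunks` stands in for the pieces a column may be
    split into after filtering.
    """
    if len(chunks) == 1:
        # A single contiguous chunk can be handed over without copying.
        return np.ascontiguousarray(chunks[0])
    # Several chunks live in separate buffers, so joining them allocates
    # a new buffer and copies every value into it.
    return np.concatenate(chunks)

chunks = [np.array([0.0, 1.0, 2.0]), np.array([3.0, 4.0])]
merged = to_contiguous(chunks)

# The merged array shares no memory with the inputs, i.e. it is a copy.
print(any(np.shares_memory(merged, c) for c in chunks))
```

Even with the copy, there is one allocation per column rather than one per atom, which is consistent with the expectation of fewer heap allocations stated above.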