---------------------
Added
^^^^^
- Support for orthorhombic and triclinic periodic boxes.

Changed
^^^^^^^
- Greatly improved the performance of the CPU kernel (pydh) by
  introducing a cache-blocking scheme adapted to the L2 cache
  for problems larger than 100k^2.
- Rewrote the interfaces of the CPU (pydh) and the GPU (cudh)
  kernels using Cython to enable Python 3 compatibility.