Interpolation Improvements
Implements the bisection algorithm without the GIL, which removes a lot of Python overhead and speeds things up. Since this code is already in Cython, it might as well be written down to C-like code.
All of the `for` loops now run without any calls to the Python interpreter, and the bisection algorithm is equivalent to `np.searchsorted` without the Python overhead. Also, since the sizes of the `inds` and `vals` arrays are known in advance, they are pre-allocated.
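For reviewers, here is a rough pure-Python sketch of the logic described above. It is not the Cython code in this PR: the function names are hypothetical, and the 1D linear interpolation layout (two entries per location in `inds` and `vals`) is an assumption made for illustration. It assumes `nodes` is strictly increasing with at least two entries.

```python
def bisect_left_sketch(nodes, x):
    """Return the first index i with nodes[i] >= x (nodes sorted ascending).

    Same result as np.searchsorted(nodes, x) with the default side='left'.
    """
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes[mid] < x:
            lo = mid + 1
        else:
            hi = mid
    return lo


def linear_interp_weights(nodes, locs):
    """Hypothetical helper: indices and weights for 1D linear interpolation.

    The output size is known up front (two entries per location), so
    both outputs are allocated once instead of being grown in the loop.
    """
    n = len(locs)
    inds = [0] * (2 * n)    # pre-allocated node indices
    vals = [0.0] * (2 * n)  # pre-allocated interpolation weights
    last = len(nodes) - 1
    for i in range(n):
        x = locs[i]
        # Clamp so that (right - 1, right) is always a valid interval.
        right = min(max(bisect_left_sketch(nodes, x), 1), last)
        left = right - 1
        w = (x - nodes[left]) / (nodes[right] - nodes[left])
        # Clamp the weight so points outside the nodes use the end value.
        w = min(max(w, 0.0), 1.0)
        inds[2 * i], inds[2 * i + 1] = left, right
        vals[2 * i], vals[2 * i + 1] = 1.0 - w, w
    return inds, vals
```

In Cython, the same loops can be given typed indices and run inside a `nogil` block, which is where the speed-up comes from.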
At least a 2x-3x speed-up on reasonably sized arrays.
Should address 23
Ints
- Bug fix to ensure that ints are used as indices and that numpy arrays are created
Utils
- `Zero` and `Identity` now have a transpose