This version adds:
- many improvements to `__getitem__`, including more fancy indexing cases
- partial support for `__setitem__`, including integer and slice indexing
- a Cython-optimized way to perform slice-based indexing
- offset arguments for `diagonal()` and `setdiag()`
- miscellaneous bug fixes and additional test coverage
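
To illustrate what an offset argument to `diagonal()` and `setdiag()` means, here is a minimal pure-Python sketch. It is not this library's implementation: the dict-of-keys storage, the free-function form, the keyword name `k`, and the sign convention (positive `k` selects diagonals above the main one, as in NumPy) are all assumptions made for the example.

```python
# Hypothetical toy storage, NOT the library's implementation: entries live in
# a dict keyed by (row, col), and missing keys are treated as zero.

def _diag_start_and_length(shape, k):
    """Return the first (row, col) of the k-th diagonal and its length."""
    rows, cols = shape
    # Assumed convention: k > 0 is above the main diagonal, k < 0 is below.
    r0, c0 = (0, k) if k >= 0 else (-k, 0)
    length = max(0, min(rows - r0, cols - c0))
    return r0, c0, length


def diagonal(entries, shape, k=0):
    """Return the k-th diagonal as a list."""
    r0, c0, length = _diag_start_and_length(shape, k)
    return [entries.get((r0 + i, c0 + i), 0) for i in range(length)]


def setdiag(entries, shape, values, k=0):
    """Write values onto the k-th diagonal in place, ignoring any excess."""
    r0, c0, length = _diag_start_and_length(shape, k)
    for i, value in zip(range(length), values):
        entries[(r0 + i, c0 + i)] = value


shape = (3, 4)
entries = {}
setdiag(entries, shape, [1, 2, 3], k=1)   # first superdiagonal
setdiag(entries, shape, [7, 8], k=-1)     # first subdiagonal

upper = diagonal(entries, shape, k=1)
lower = diagonal(entries, shape, k=-1)
main = diagonal(entries, shape)
out_of_range = diagonal(entries, shape, k=5)
```

In this sketch, an offset that falls outside the matrix yields an empty diagonal rather than an error; how the library itself handles that case is not stated here.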