Unifying Python/C++/CUDA memory: Python buffered array <-> C++11 `std::vector` <-> CUDA managed memory.
- Provides creation functions
  + Python object
    * `zeros`
    * `from_numpy`
  + C++ vector
    * `CuVec<T>` with the same interface as `std::vector<T>`
  + CPython API (castable to `PyObject *`)
    * `PyCuVec<T> *PyCuVec_zeros(std::vector<Py_ssize_t> shape);`
    * `PyCuVec<T> *PyCuVec_zeros_like(PyCuVec<T> *other);`
    * `PyCuVec<T> *PyCuVec_deepcopy(PyCuVec<T> *other);`
- Quick start documentation
- 100% test coverage
- Allows CUDA-free installation for `sdist`

Requires a C++11 compiler, a CUDA compiler, and an NVIDIA GPU with compute capability 3.5 or greater.
Python functionality requires Python 3.6 or greater.
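
The "Python buffered array" idea can be illustrated without CUDA. The following is a minimal sketch using only the standard library, not CuVec's implementation: it assumes the Python side relies on the buffer protocol, and uses `array.array` as a stand-in for managed memory. The name `zeros_sketch` is hypothetical and is not part of this package.

```python
from array import array


def zeros_sketch(shape, typecode="f"):
    """Hypothetical stand-in for a `zeros` creation function.

    Allocates one flat, zero-filled buffer and returns it together with
    a shaped view onto the same memory (no copy is made).
    """
    size = 1
    for dim in shape:
        size *= dim
    # A zero-filled byte string of the right length initialises the array.
    flat = array(typecode, bytes(size * array(typecode).itemsize))
    # Casting 1D -> ND must go through a byte format first.
    shaped = memoryview(flat).cast("B").cast(typecode, tuple(shape))
    return flat, shaped


flat, shaped = zeros_sketch((2, 3))
flat[5] = 7.0  # write through the owning buffer ...
print(shaped.shape)     # (2, 3)
print(shaped.tolist())  # ... and the shaped view sees the change
```

Both objects refer to a single allocation, which is the property that lets one block of memory be shared across language boundaries instead of being copied at each one.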