Cupy

Latest version: v13.4.0


Page 13 of 26

8.0.0b5

Not secure
This is the release note of v8.0.0b5. See [here](https://github.com/cupy/cupy/milestone/76?closed=1) for the complete list of solved issues and merged PRs.

Highlights

CUB is now bundled with CuPy so that everyone can use it out-of-the-box (thanks leofang!). This release also introduces the [`CUPY_ACCELERATORS` environment variable](https://docs.cupy.dev/en/latest/reference/environment.html), a mechanism for enabling acceleration through different libraries. You can enable CUB and cuTENSOR by setting `export CUPY_ACCELERATORS=cub,cutensor`.
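When a shell `export` is inconvenient (for example in a notebook), the same setting can be applied from Python. The following is a minimal sketch, assuming that CuPy reads `CUPY_ACCELERATORS` when it is first imported; the helper name `set_cupy_accelerators` is hypothetical and not part of CuPy.

```python
import os


def set_cupy_accelerators(names):
    """Store a comma-separated accelerator list in CUPY_ACCELERATORS.

    `names` is an iterable such as ["cub", "cutensor"]. This helper is
    illustrative only; it just writes the environment variable.
    """
    value = ",".join(names)
    os.environ["CUPY_ACCELERATORS"] = value
    return value


# Equivalent to `export CUPY_ACCELERATORS=cub,cutensor` for this process.
set_cupy_accelerators(["cub", "cutensor"])

# Assumption: import CuPy only after the variable is set.
# import cupy
```

Setting the variable in the launching shell remains the simplest option; the sketch is only useful when the process environment has to be prepared from Python.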

The new features include an implementation of the SciPy ndimage filters contributed by coderforlife and the introduction of the `cupy_backends` library, used to decouple the CUDA ecosystem APIs from CuPy itself.
Currently, `cupy_backends` is considered an undocumented API and it is subject to further refactoring. In the meantime, you can still continue to use [`cupy.cuda.*` APIs](https://docs.cupy.dev/en/latest/reference/cuda.html).

Changes without compatibility

Supported Platform (3670)

As announced previously, we dropped support for CUDA 8.0 and 9.1. We are also going to drop support for [NumPy 1.15](https://github.com/cupy/cupy/issues/3643) and [SciPy 1.2 or earlier](https://github.com/cupy/cupy/issues/3631) in the upcoming release.

CUB (2584, 3461, 3562)

CUB is now bundled in the source tree. As a consequence, gcc-6 or later is required for the CuPy v8 build. If you are building CuPy from source on systems with legacy gcc, follow the instructions below. *These steps are not necessary for general users using wheel packages.*


Ubuntu 16:
$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test
$ sudo apt-get update
$ sudo apt-get install g++-6
$ export NVCC="nvcc --compiler-bindir gcc-6"

CentOS 6 and 7:
$ sudo yum install centos-release-scl
$ sudo yum install devtoolset-7-gcc-c++
$ source /opt/rh/devtoolset-7/enable
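
To decide whether the steps above apply to a build machine, the compiler version can be checked first. This is a rough sketch that parses the output of `gcc -dumpversion`; the function names and the default threshold of 6 (taken from the gcc-6 requirement stated above) are illustrative assumptions, not part of the CuPy build system.

```python
import re


def gcc_major_version(version_text):
    """Return the major version from `gcc -dumpversion` style output.

    Accepts strings such as "5.4.0" or "7\n".
    """
    match = re.match(r"\s*(\d+)", version_text)
    if match is None:
        raise ValueError("unrecognized gcc version: %r" % version_text)
    return int(match.group(1))


def needs_newer_gcc(version_text, minimum=6):
    """Return True when the reported gcc is older than `minimum`."""
    return gcc_major_version(version_text) < minimum


# Example with literal version strings; on a real system the text could
# come from: subprocess.check_output(["gcc", "-dumpversion"], text=True)
print(needs_newer_gcc("5.4.0"))
print(needs_newer_gcc("7"))
```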


CUB-related environment variables (`CUB_PATH`, `CUB_DISABLED`) are no longer effective. To boost reduction kernels and several functions such as `min`, `max`, `sum`, and `scan`, you need to enable CUB by setting the `CUPY_ACCELERATORS=cub` environment variable.
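
When migrating an existing setup, it may help to detect leftover legacy variables and confirm that CUB is requested through the new one. The sketch below only inspects an environment mapping; the helper names are hypothetical, and the comma-splitting mirrors the format shown in these notes rather than CuPy's own parser.

```python
import os

# Variables that these release notes describe as no longer effective.
LEGACY_CUB_VARS = ("CUB_PATH", "CUB_DISABLED")


def legacy_cub_vars(environ=os.environ):
    """Return the legacy CUB variables that are still set."""
    return [name for name in LEGACY_CUB_VARS if name in environ]


def cub_requested(environ=os.environ):
    """Return True when 'cub' is listed in CUPY_ACCELERATORS."""
    value = environ.get("CUPY_ACCELERATORS", "")
    return "cub" in [item.strip() for item in value.split(",")]


# Example with an explicit mapping instead of the real environment.
env = {"CUB_DISABLED": "1", "CUPY_ACCELERATORS": "cub,cutensor"}
print(legacy_cub_vars(env))
print(cub_requested(env))
```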

cuTENSOR (3592)

In response to the introduction of `CUPY_ACCELERATORS`, you need to explicitly specify the option `CUPY_ACCELERATORS=cutensor` to enable cuTENSOR.

Others

- Avoid early compilation when initializing a `RawModule` instance (3534)
- Remove `CHAINER_SEED` (3674)
- Remove `sum_duplicate` parameter in sparse `min`/`max`/`argmin`/`argmax` (3676)

New Features

- Support multistage reduction and indexing in `cupy.fuse` (2734, thanks xuzijian629!)
- Implementation of ndimage filters (3184, thanks coderforlife!)
- Add `cupy.convolve` (3371, thanks Dahlia-Chehata!)
- Move CUDA low-level API to `cupy_backends` namespace (3386)
- Add `choose_conv_method` (3464, thanks Dahlia-Chehata!)
- Add `cupy.poly1d` (3466, thanks Dahlia-Chehata!)
- Sparse mean (3487, thanks cjnolet!)
- Add support for `cusolverDn<t>syevj` and `cusolverDn<t>syevjBatched` (3488, thanks dmargala!)
- `ndimage` rank-based filters (3500, thanks coderforlife!)
- `ndimage` common linear filters (3505, thanks coderforlife!)
- Implement `flatiter.__iter__()` (3508)
- Implement `has_sorted_indices`, `has_canonical_format`, `sort(ed)_indices()` for sparse matrices (3509)
- Add multi-gpu support to time (3519)
- Add `cupy.correlate` (3525, thanks Dahlia-Chehata!)
- Add `cupyx.scipy.sparse.kron()` (3528)
- Support `ncclSend` / `ncclRecv` from NCCL 2.7 (3567)
- Add `cupyx.scipy.fft.next_fast_len` (3571)
- `ndimage` generic filters (3614, thanks coderforlife!)
- Support CSR matrix multiply (3647)
- Support CSR matrix division (3680)

Enhancements

- Build the `cupy.cuda.cub` module by default (2584)
- Expose cuda IPC runtime calls (3290)
- Merge `CUPY_CUB_BLOCK_REDUCTION_DISABLED` and `CUB_DISABLED` (3461)
- Support CUB histogram (3473)
- Support cuTENSOR 1.1 (3477)
- Added functionality to print nvcc and nvrtc output (3485, thanks mnicely!)
- Support `axis=None` in sparse `min`/`max` (3515)
- Small fixes for CUB block reduction kernels (3520)
- Avoid early compilation when initializing a RawModule instance (3534)
- Improve `_prepare_mask_indexing_single` (3539)
- Support batched slogdet with complex numbers (3551, thanks yoshipon!)
- Fix hip header files (3566)
- Remove `compute_30` when CUDA 11 (3578)
- Change `einsum` not to use cuTENSOR when accelerator is not set (3592)
- Update CUDA 11.0 FP16 header to production release version (11.0.2) (3668)
- Drop support for CUDA 8.0 and 9.1 (3670)
- Remove `CHAINER_SEED` (3674)

Performance Improvements

- Use cuTENSOR in `cupy.sum` (2939)
- Reduce `numpy.ndarray` creation in cuTENSOR operation preparation (3393)
- Improve scan operation (3540)
- Improve `_ArgInfo` init (3549)
- Fix small performance issue (3550)
- Improve `_fft_convolve` (3560)
- Reduce device synchronization in `poly1d` instantiation (3563, thanks Dahlia-Chehata!)
- Reuse FFT plan for `convolve`/`correlate` (3587)
- Improve efficiency of `cupy.fft.fftfreq` and `cupy.fft.rfftfreq` (3653, thanks grlee77!)

Bug Fixes

- Fix `cupyx.scipy.ndimage.sum` taking zero-dimensional input (3425)
- Use `CUSPARSE_VERSION` instead of `CUDA_VERSION` (3491)
- Fix sparse `min`/`max` to return sparse matrix (3536)
- Fix boolean indexing (3538)
- Support 0-size `ndarray` and fix possible error in `__del__` at `fft` (3543)
- Fix `cupy.percentile` type assignment in `asarray` (3570)
- Fix array creation for ndarray list of arrays of different dtypes (3605)
- Change sorting order of COO sparse matrix for cuSPARSE (3620)
- Add `__name__` to custom kernels (3626)
- Fix sparse `argmin`/`argmax` return shape (3639)
- Fix missing imports and `cupy.show_config` (3642)
- Fix sparse matrix related test failures on CUDA 11 (3649)
- Fix error message broken (3669)
- Remove `sum_duplicate` parameter in sparse `min`/`max`/`argmin`/`argmax` (3676)
- Fix broken imports for `cupy.cuda.*` (3685)
- Fix Windows build failure of cuSparse generic API (3690)
- Fix compile option on HIP environment (3604)

Code Fixes

- Use `.data()` for `std::vector` (3022)
- Add short comments for the internals (3475)
- Use absolute import (3496)
- Make type dispatcher from `cupy.cuda.cub` reusable (3546)
- Clean up CUB-related stuff (3562)
- Suppress compile warnings (3573)
- Remove unused descriptor definition (3594)

Documentation

- Add sample code for image resizing (3559, thanks pmixer!)
- Update documentation of `CUPY_ACCELERATORS` (3596)
- Update url and email (3608)
- Add a warning for `sum_duplicates` (3624)
- Remove Chainer related docs (3673)

Installation

- Add missing `cupy_cub.cu` in package data (3572)
- Fix rpath for wheel build (3689)

Tests

- Test against `scipy.fft` when available (3032)
- Add tests for `_cub_reduction` (3462)
- Add mock tests to ensure `cupy.cuda.cub` is used (3467)
- Fix to set `testing.slow` correctly (3501)
- Check NumPy compatibility in `flatiter` tests (3514)
- Fix `slogdet` tests to check dtypes of return values (3577)
- Fix negative value test in `test_helper` (3579)
- Deprecate `numpy_cupy_array_list_equal` (3582)
- Use `numpy_cupy_array_equal` instead of `numpy_cupy_array_list_equal` (3599)
- Checks return types in `testing.numpy_cupy_*` (3621)
- Add tests for sparse max with `axis=None` (3638)
- Parameterize sparse `min`/`max`/`argmin`/`argmax` tests (3656)
- Expose accelerator internal API to one level up (3664)

Others

- Fix to raise `ValueError` for invalid `order` (3498)
- Fix to raise `ValueError` for invalid clipmode (3499)
- Fix to raise `TypeError` for invalid subscripts in `einsum` (3502)
- Use builtins directly (3651, thanks larsoner!)
- Add link to Twitter account (3529)
- Update style checker version for Python 3.7 (3585)
- Bump version to v8.0.0b5 (3687)

8.0.0b4

Not secure
This is the release note of v8.0.0b4. See [here](https://github.com/cupy/cupy/milestone/74?closed=1) for the complete list of solved issues and merged PRs.

Highlights

CuPy v8.0.0b4 focuses on performance improvements by adding a general CUB-based reduction kernel contributed by leofang (3244). We also introduce support for the upcoming CUDA 11 (3405), although we don’t provide wheels for it yet. Last but not least, several new routines are added to improve coverage of NumPy and SciPy functions.

Changes without compatibility

Change the behavior of `dia_matrix.diagonal` to follow SciPy 1.5.0 specification. It does not raise `ValueError` for invalid values anymore. Now an empty array is returned instead. (3469)



New Features

- Add `cupy.shape` (3229)
- CUB-backed `_SimpleReductionKernel` (3244, thanks leofang!)
- Add `cupyx.scipy.ndimage` sum, mean, standard deviation and variance (3259, thanks niteya-shah!)
- Support C++ template code in `cupy.RawModule` (3319, thanks leofang!)
- Add `cupy.piecewise` (3329, thanks Dahlia-Chehata!)
- Add `cupy.trim_zeros` (3340, thanks Dahlia-Chehata!)
- Add `cupy.sort_complex` (3348, thanks Dahlia-Chehata!)
- Add `cupy.who` (3361)
- Support `cudaDeviceGetLimit` / `cudaDeviceSetLimit` (3387, thanks leofang!)
- Add `polycompanion` (3398, thanks Dahlia-Chehata!)
- Add wrappers for `cusolverDn<t>potrfBatched` and `cusolverDn<t>potrsBatched` (3399, thanks IvanYashchuk!)
- Add `polyvander` (3404, thanks Dahlia-Chehata!)
- Support CUDA 11.0 (3405)
- Add `cupy.shares_memory` (3432)
- Detect and show Thrust build version (3444, thanks leofang!)

Enhancements

- Refactor cuTENSOR handle initialization (2772)
- Deprecate `testing.numpy_cupy_raises` (3098)
- Align vector access with 3020 3022 (3228)
- Get arch per device and support CUDA 9.2+ (3366, thanks leofang!)
- Fix cuTENSOR routines to raise `ValueError` for invalid arguments (3374)
- Support `ignore_error` in kernel optimization (3410)
- Support boolean in `cupyx.scipy.ndimage` stats functions (3419)
- Raise `TypeError` in `cupy.ndarray.__array__` (3421)
- Make Optuna optional to allow import (3427)
- Implement `flatiter.copy()` (3442)

Performance Improvements

- Speed up CSR SpMV by orders of magnitude (3430, thanks leofang!)
- Index `CArray` using 32-bit indexes (3448)

Bug Fixes

- Assert that all the pointers are in the same device in `concatenate` (3285)
- Fix `_count_non_nan` datatype for windows (3350)
- Fix `cupyx.time.repeat` to accumulate duration after GPU synchronization (3375)
- Fix `PerfCaseResult` changing `_ts` (3400)
- Fix intermediate dtypes for float16 inputs in `cupyx.scipy.ndimage` stats functions (3402)
- Properly reset current stream in case null stream is destroyed (3423)
- Fix `cupy.power(0j, 0j)` (3449)
- Fix `TypeError` in parameterize test catching `CUDADriverError` (3451)
- Fix `scipy.dia_matrix.diagonal` for `scipy==1.5.0` (3469)

Code Fixes

- Fix array() for readability (2935)
- Remove unnecessary comparison in `cupy.linalg.svd` (3373)
- Fix initial values in `cupy._environment` (3413, thanks leofang!)
- Use `find_packages` in `setup.py` (3424)
- Refactor CUB-backed `_SimpleReductionKernel` (3443)

Documentation

- Add documentation for `cupyx.optimizing.optimize` (3397)
- Fix sphinx version for travis (3416)
- Document `cupy.fromfile` (3439, thanks jakirkham!)
- Fix typos in `cupy.linalg.det` docstring (3456, thanks grlee77!)
- Fix docstring of `tofile()` (3460, thanks leofang!)

Installation

- Add optuna and remove theano for doctest requirement (3446)

Tests

- Add tests for `cupy.cuda.cub` (2598, thanks leofang!)
- Remove chainercv CI configs (3055)
- Add a test to cover accepting large-size arrays via `__cuda_array_interface__` (3297, thanks leofang!)
- Add `__init__.py` to allow importing test packages (3395)
- Fix ChainerCV tests failing in master branch (3411)
- Test CUB SpMV (3428, thanks leofang!)
- Deprecate `testing.empty` (3438)
- Skip some `RawModule` tests for wrong condition (3453)
- Use `unittest.mock` (3468)

Others

- Bump version to v8.0.0b4 (3481)

8.0.0b3

Not secure
This is the release note of v8.0.0b3. See [here](https://github.com/cupy/cupy/milestone/72?closed=1) for the complete list of solved issues and merged PRs.

As announced in [the previous release](https://github.com/cupy/cupy/releases/tag/v8.0.0b2), we are dropping support for CUDA 8.0 / 9.1 in v8 releases (3301). Based on the feedback from users, we will continue to provide cuDNN support (3303).

Highlights

CuPy v8.0.0b3 introduces a mechanism for optimizing internal parameters when launching reduction kernels using [Optuna](https://optuna.org/). Depending on your GPU and the kernels you execute, you can improve the performance of your code by letting Optuna automatically find the best parameters for your GPU.
To take advantage of this, wrap calls to functions that perform reductions as follows:

    import cupy
    import cupyx.optimizing

    with cupyx.optimizing.optimize(key=None):
        # Any CuPy reduction function can be called here.
        y = cupy.sum(x)


CuPy is also taking part in GSoC 2020 and we keep adding new functions to improve our compatibility with NumPy.

New Features

- Optimize kernel launch parameters using Optuna (2731)
- Support cuSPARSE generic API (3242)
- Implement `flatiter.base` property (3250)
- Implement `flatiter.__len__()` special method (3251)
- Implement `flatiter.__next__()` special method (3252)
- Implement `putmask` function (3261, thanks rushabh-v!)
- Show versions of CUB and cuTENSOR on `cupy.show_config` (3271)
- Enable getting R2C/C2R FFT plans from `get_fft_plan()` (3293, thanks leofang!)
- Support surface memory in `RawKernel` (3294, thanks leofang!)
- Add `cupy.bartlett` (3307, thanks niteya-shah!)
- Add `mean` for sparse matrices (3333)
- Support `max_duration` argument in `cupyx.time.repeat` (3357)
- Support `OptimizeContext` serialization (3367)

Enhancements

- Support primitive complex scalar in `RawKernel` (2606)
- Fix the internal streams in multi-GPU Plan1d (3260, thanks leofang!)
- Support additional dtypes and axis sequences in `cupy.median` (3280, thanks grlee77!)
- Support multiple architectures in `CUPY_NVCC_GENERATE_CODE` (3330, thanks leofang!)
- Fix too small `max_total_time_per_trial` (3365)

Performance Improvements

- Rewrite `cupyx.scipy.ndimage.interpolation` using `ElementwiseKernel` (3166, thanks grlee77!)
- Improve `ElementwiseKernel` cpu time (3298)
- Performance improvements to `blackman`, `hanning` and `hamming` methods (3312, thanks niteya-shah!)
- Use local cache in `cupy.RawKernel` (3341, thanks leofang!)
- Reduce memory usage of `cupy.linalg.svd` (3347)

Bug Fixes

- Fix SciPy version check in `cupyx.scipy.fft` (3311, thanks grlee77!)
- Ensure runtime context on a per-device basis (3321, thanks leofang!)
- Fix `put` when using scalars (3328)
- Assign a work space to `ormqr` functions in `_solve` (3331)
- Fix `linalg.svd` for 0-sized matrices (3354)
- Fix wrong parameter names in kernel launch optimizers (3364)
- `cupy.around` behaves differently from NumPy for EVEN_NUMBER+0.5 (3335)

Code Fixes

- Add alias of shape type (3310)
- Use `shape_t` instead of `tuple` (3315)

Documentation

- Add PFN to the README (3276)
- Remove upper restrictions for numpy and scipy in doc build (3337)

Tests

- Add tests for optimizer for kernel launch parameters (3363)

Others

- Bump version to v8.0.0b3 (3376)

8.0.0b2

Not secure
This is the release note of v8.0.0b2. See [here](https://github.com/cupy/cupy/milestone/70?closed=1) for the complete list of solved issues and merged PRs.

We are planning to drop support for CUDA 8.0 / 9.1 (3301) and cuDNN (3303) in future v8 releases. If you have any concerns, please feel free to leave a comment in these issues.


New Features

- Add notification support for `fallback_mode` (2279, thanks Piyush-555!)
- Support multi-GPU `cupy.cuda.cufft.Plan1d` (2644, thanks leofang!)
- Add `cupy.median` (3134, thanks Harshan01!)
- Add `cupy.flatiter` (3165)
- Add `cupy.gcd` and `cupy.lcm` (3190, thanks niteya-shah!)
- Support `cusolverDn<t>gesvdj` and `cusolverDn<t>gesvdaStridedBatched` (3192)
- Add `cupyx.scipy.ndimage.label` (3210)
- Add `cupyx.scipy.ndimage.grey_erosion` and `cupyx.scipy.ndimage.grey_dilation` (3216)
- Add `cupy.diag_indices` and `cupy.diag_indices_from` (3217, thanks rushabh-v!)
- Support `cusparse<t>csrgeam2` and `cusparse<t>csrgemm2` (3220)
- Add `minimum_filter`, `maximum_filter`, `grey_closing`, `grey_opening` to `scipy.ndimage` (3239)
- Support `cusolverDn<t>gesvdjBatched` (3247)
- Add `cupy.kaiser` (3268, thanks niteya-shah!)
- Support all dtypes in every sorting function in `cupy.cuda.thrust` (3286, thanks leofang!)

Enhancements

- Add R2C/C2R support to `cupy.cuda.cufft.PlanNd` (3102, thanks leofang!)
- Make `RawKernel` and `RawModule` aware of CUDA context (alt) (3201, thanks leofang!)
- Make `diff` return `AxisError` for an invalid axis (3231, thanks grlee77!)
- Improve the efficiency of `cupy.pad` for some simple cases (3281, thanks grlee77!)

- HIP
  - Support `einsum` with complex in HIP (3203)
  - Add complex support to HIP Blas (3206)

Performance Improvements

- Reduce list and tuple creation in `_kernel` and `reduction` (2702)
- Remove unnecessary `Arg` instantiation in `cuda/function.pyx` (3253)
- Improve `norm` (3278)

Bug Fixes

- Fix: n-dimensional FFTs must preserve array contiguity when copying a view (3034, thanks grlee77!)
- Use larger type to represent index range in `cupy.take` (3118)
- Fix byte buffer handling to support PyPy (3225)
- Fix `_reduce_dims` call in reduction (3262)
- Raise `IndexError` for R2C/C2R FFT with `axes=()` (3264, thanks leofang!)
- Code fix + bug fix for `cupy.cuda.thrust` (3291, thanks leofang!)

Code Fixes

- Remove `cupy/cuda/_environment.py` (3145, thanks leofang!)
- Fix `cupy.fill_diagonal` to implement with `cupy.flatiter` (3207)
- Remove unreachable code (3235)
- Refactor `__array_function__` (3236)
- Simplify `TestEigenvalue` (3288)

Documentation

- Small typo/spelling fixes (3243, thanks svlandeg!)
- Use Sphinx 2.x on Read the Docs (3272)

Tests

- Fix overflow in `matmul` test (2403)
- Add cuTENSOR test (3037)
- Rewrite some tests not to use `numpy_cupy_raises` (3155)
- Rewrite tests not to use `numpy_cupy_raises` (3256)

8.0.0b1

Not secure
This is the release note of v8.0.0b1. See [here](https://github.com/cupy/cupy/milestone/68?closed=1) for the complete list of solved issues and merged PRs.

Known packaging issues:
* CuPy build fails when using CUDA 8.0 on Windows (3076). Due to this issue, `cupy-cuda80` wheel packages for Windows are unavailable for this version. Linux or CUDA 9.0+ users are unaffected.

Highlights

CuPy gets faster and more stable towards its v8.0.0 release. This version adds a handful of new routines, brings library-wide performance improvements, and fixes several bugs.

Changes without compatibility

- Removed `cupy.scatter_add`, which had been deprecated since CuPy v4. Use `cupyx.scatter_add` instead.


New Features

- Add `get_global()` to `cupy.RawModule` (2510, thanks leofang!)
- Support multi-GPU in `cupy.cuda.cufft.Plan1d` (2644, thanks leofang!)
- Add `hstack`, `vstack`, and `bmat` to `cupyx.scipy.sparse` (2665, thanks cjnolet!)
- Add `cupy.require` (3083, thanks niteya-shah!)
- Add `cupy.compress` (3103, thanks Harshan01!)
- Add `cupy.ravel_multi_index` (3104, thanks grlee77!)
- Add `cupy.extract` (3109, thanks Harshan01!)
- Add `cupy.bitwise_not` as alias to `invert` (3120, thanks Harshan01!)
- Add `cupy.argwhere` (3135, thanks rushabh-v!)
- Add `cupy.select` (3138, thanks niteya-shah!)
- Add `cupy.cuda.ExternalStream` (3141)
- Add `cupy.array_equal` (3189, thanks rushabh-v!)

Enhancements

- Add `ndarray` variants AND inplace support in `fallback_mode` (2391, thanks Piyush-555!)
- Support array-like start/stop and add `axis` argument to `linspace` (2461, thanks grlee77!)
- Add fp16 support of CUB (2600, thanks y1r!)
- Raise errors instead of assertion on array type checks (2795)
- Drop support for NumPy 1.15 or earlier (2938)
- Import `using_allocator` in `cupy.cuda` (2951, thanks jakirkham!)
- Remove `__future__` imports (2995)
- Support CUB `prod` (3067, thanks leofang!)
- Remove deprecated `cupy.scatter_add` (3074)
- Update `cupy.pad` to use `cupy.linspace` instead of `numpy.linspace` internally (3101, thanks grlee77!)
- Histogram update: support `range`, `weights` and `density` (3124, thanks grlee77!)
- Add support for `ord` = 2, -2, and 'nuc' in `cupy.linalg.norm` (3130, thanks rushabh-v!)
- Use `ElementwiseKernel` in `cupy.fill_diagonal` (3139)
- Allow `dia_matrix` creation from SciPy equivalent (3160, thanks jakirkham!)
- Add `labels` in the benchmark and add `kwargs` to `repeat` (3172, thanks rushabh-v!)
- Add `out` parameter to `cupy.concatenate` and `cupy.stack` (2983)
- Fix `reshape` to raise `ValueError` for order 'K' (3123)

Performance Improvements

- Improve cuDNN performance when using deterministic mode (1380)
- Improve performance of `cumsum` and `cumprod` (2907)
- Improve `ndimage` `convolve` and `correlate` (3179)
- Add check of `c_contiguous` when indexing `CArray` (3191)

Bug Fixes

- Fix an issue using non-existing attribute in cub.pyx (2985)
- Use `size_t nbytes` in `__cuda_array_interface__` (3009, thanks jakirkham!)
- Fix `fill_diagonal` (3011)
- Fix `cupy.random.multivariate_normal` (3018, thanks espg!)
- Use Python scalar as random seed (3054)
- Properly decrement total bytes in memory pool (3068)
- Fix condition to use slice copy in `ndarray.__setitem__` (3088)
- Fix compiler bug when building `cupy.cuda.cub` with CUDA < 9.2 (3089, thanks leofang!)
- Fix `cub_reduction` for `CUPY_CUB_MIN` and float16 arrays (3100)
- Use `time.process_time` instead of `time.clock` (3128, thanks rushabh-v!)
- Add support for 0 sized matrices in `svd` (3140, thanks rushabh-v!)
- Fix CUB-based `cupy.prod` for half precision (3148, thanks leofang!)
- Fix error type and message in `coo_matrix` (3150)
- Allow `MatDescriptor` to be pickle-able (3157, thanks jakirkham!)
- Fix `erfinv` & `erfcinv` in `cupyx.scipy.special` (3159, thanks leofang!)
- Remove some xfails in sorting tests (3167)
- Fix `Event.__del__` behavior on shutdown (3176)
- Add the missing initialization value in the reduction test (3194, thanks leofang!)

Code Fixes

- Clean up `internal.pyx` (`get_contiguous_strides`) (1950)
- Remove custom `tempdir` context manager (3003)
- Use `intptr_t` instead of `size_t` for cuSPARSE and cuBLAS handles (3081, thanks Harshan01!)
- Use `intptr_t` for cuDNN handles (3082, thanks Harshan01!)
- Minor fix to `using_allocator` (3094)
- Remove `IndexOrValueError` (3096)
- Remove unused argument in `fill_diagonal` (3171)
- Silence sign comparison warning (cont'd) (3181, thanks leofang!)
- Avoid enum comparison (-Wenum-compare) (3182, thanks leofang!)
- Remove unused, deprecated fields from `cudaPointerAttributes` (3183, thanks leofang!)

Documentation

- Update installation guide for conda-forge (3052, thanks leofang!)
- Include `UnownedMemory` in the API docs (3086, thanks jakirkham!)
- Fix gencode example in the doc (3147, thanks leofang!)
- Document `convolve` and `correlate` (3161, thanks jakirkham!)

Examples

- Add mpi4py examples (3049, thanks leofang!)

Tests

- Added a compute capability check for testing grid sync (3051)
- Fix tolerance of fft tests (3056)
- Skip `irfft` tests for compute capability != 7 (3084)
- Rewrite `cupyx.*` tests not to use `numpy_cupy_raises` (3099)
- Rewrite manipulation tests not to use `numpy_cupy_raises` (3122)
- Remove python 2.7 builds (3162)

Others

- Add copyright notice for Random Kit (3107)
- Bump version to v8.0.0b1 (3204)

8.0.0a1

Not secure
This is the release note of v8.0.0a1. See [here](https://github.com/cupy/cupy/milestone/63?closed=1) for the complete list of solved issues and merged PRs.

Known packaging issues:
* CuPy build fails when using CUDA 8.0 on Windows (3076). Due to this issue, `cupy-cuda80` wheel packages for Windows are unavailable for this version. Linux or CUDA 9.0+ users are unaffected.
* ~Wheel packages for CUDA 10.2 (`cupy-cuda102`) are currently unavailable on PyPI. Packages will be published after getting [approval of the file size limit increase](https://github.com/pypa/pypi-support/issues/191).~ (resolved on 2020-02-21)

Highlights

This release adds support for CUDA 10.2 and NumPy 1.18.
CuPy 8.0.0a1 comes with several exciting new features such as better sparse matrix support. Users who write their own CUDA kernels can now use grid synchronization in `RawKernel` and `RawModule` and tune the block size of `ElementwiseKernel`. There are also some noticeable performance improvements thanks to the extended support of CUB in several CuPy functions.


Changes without compatibility
- Update slicing of CSR and CSC matrices for compatibility with SciPy 1.4.0 (2776)
  - Fixed to follow SciPy: empty slices are returned for such cases.
- Separate code and path arguments in `RawModule` (2784)
- Avoid device synchronization in `cupy.allclose` (2799)
  - Changed `cupy.allclose` to return a 0-dim `cupy.ndarray` instead of a float value to avoid device synchronization.
- Remove `dtype` argument from `min`/`max` (2875)
- Rename arg of `isscalar` (2974)
  - Renamed the argument of `cupy.isscalar` from `num` to `element`.

New Features

- Added min, max, argmin, argmax to sparse csr and csc matrices (2711, thanks dloney!)
- Add helpers to measure execution times (2740)
- Add `digitize` (2758)
- Support loading PTX in `cupy.RawModule` (2782, thanks leofang!)
- Fix `cupyx.scipy.ndimage.map_coordinates` for cases with coords > 2d (2813, thanks grlee77!)
- Detect synchronization (2819)
- Add `ptp` ndarray method and function (2859, thanks grlee77!)
- Add convex analysis ufuncs to `cupyx.scipy.special` (2861, thanks grlee77!)
- Allow `ElementwiseKernel` to set the block_size (2914)
- Support grid synchronization in `RawKernel` and `RawModule` (2925)
- Add `cupy.conjugate` and make `cupy.conj` its alias (2982)
- Add a keyword-only `plan` argument to `cupyx.scipy.fft.*` (2998, thanks leofang!)

Enhancements

- Support sorting complex arrays (2745, thanks leofang!)
- Fix slow import of cupy (2759, thanks cgohlke!)
- Update slicing of CSR and CSC matrices for compatibility with SciPy 1.4.0 (2776, thanks grlee77!)
- Add `nogil` to CUB (2787, thanks y1r!)
- Avoid device synchronization in `cupy.allclose` (2799)
- Skip zero-valued coefficients in `cupyx.scipy.ndimage.convolve` (2846, thanks grlee77!)
- Add CUB reduction support to `mean` (2860, thanks grlee77!)
- Sort type map in `_kernel.pyx` (2881)
- Make test helper decorators pdb-friendly (2888)
- Declare device synchronization at `runtime.free()` (2898)
- Ignore error when peer access is already enabled (2901, thanks leofang!)
- Add CUDA 10.2 support (2910, thanks ksangeek!)
- Show warning for cuFFT bug in `irfftn` (2922)
- Use cuTENSOR for `einsum` (2928)
- Improve error message for wrong number of arguments in elementwise kernels (2932)
- Use asynchronous copy in `cupy.copyto` (2942)
- `MemoryPointer.__repr__` (2981)
- Allow multiple axes in `expand_dims` (2992)
- Check size before accessing an empty vector's data pointer (3025)
- Improve compatibility of `random.randint` (2828)
- Support 64 bit extent `randint` (2829)
- Disallow boolean subtraction (2874)
- Remove `dtype` argument from `min`/`max` (2875)
- Fix handling of dtypes in `cupy.mean` (2903, thanks grlee77!)
- Disallow boolean `negative` (2973)
- Rename arg of `isscalar` (2974)
- Fix `linspace(..., num=1, endpoint=False, retstep=True)` (2975)

Performance Improvements

- Avoid `numpy.can_cast` call to improve guess routine (2673)
- Improve caching in `ElementwiseKernel` (2688)
- Remove memory copy to improve memory range checking (2699)
- Avoid `can_cast` calling to reduce overhead (2704)
- Use `getrfBatched` in `linalg.slogdet` (2735)
- Reduce overhead in calls to multi-dimensional FFTs (2746, thanks grlee77!)
- Allow squashing f-contiguous axes for faster reduction (2822)
- Support CUB prefix sum & product (2919, thanks leofang!)
- Improve performance of element-wise `einsum` where no contraction is necessary (2960)

Bug Fixes

- Fix `true_divide` with dtype argument (2076)
- `keepdims` should always preserve all dimensions in CUB-based reductions (2725, thanks grlee77!)
- Update thrust::complex headers with a bug fix (2741, thanks leofang!)
- Separate code and path arguments in `RawModule` (2784)
- Avoid looking up null pointers' attributes (2802, thanks leofang!)
- Fix range used in `cupyx.scipy.ndimage` filter origin check (2805, thanks grlee77!)
- Detect interpreter shutdown for proper `__del__` behavior (2809)
- Fix `split` and `array_split` with indices overrun (2814)
- Fix `split` and `array_split` with unordered indices supplied (2815)
- Fix compilation error caused when Thrust is enabled (2838)
- Fix `testing.shaped_random` for shape `()` (2870)
- Fix `argmin`/`argmax` `dtype` argument (2872)
- Fix `imag` for 0-size array (2886)
- Fix logic to check explicit `size` argument in `ElementwiseKernel` (2909)
- Sets the default value for `thread_local.linalg` if not defined (2915)
- Fix `cupy.cuda.cub.device_segmented_reduce()` not being used (2921, thanks leofang!)
- Fix complex type checks in `_correlate_or_convolve` (2923)
- Fix `ParameterInfo` as a cache key (2941)
- Avoid invalid in-place division in CUB-based mean (2943, thanks grlee77!)
- Fix empty vector access (3020)
- Fix `nvcc` command lookup (3028)

Code Fixes
- Use `intptr_t` for cuSOLVER handles (2718)
- Merge reduction implementations (2732)
- Rename and reorder private functions in `reduction.pxi` (2767)
- Avoid using PyThread API (2769)
- Remove unused `cuParamSetTexRef()` (2770, thanks leofang!)
- Separate reduction code from `_kernel.pyx` (2785)
- Refactor reduction code (2801)
- Refactor ops (2817)
- Separate `CArray` and family from `core.pyx` (2831)
- Add missing blank lines (2887)
- Readability fix in `memory.pyx` (2899)
- Clean up `_scalar.pyx` (2917)
- Enhance type and argument manipulation in elementwise and reduction kernels (2940)
- Remove intermediate aliases of `cupy.sort` (2944, thanks rushabh-v!)
- Silence sign comparison warnings (2949, thanks leofang!)
- Fix typos in comments (2978)
- Remove dependency to six (2980)
- A nit-picking code fix (2988)
- Rename `_op` variable in cub.pyx (3002)
- Remove code paths for unsupported Python versions (3004)

Documentation
- Fix docs of options argument in `RawKernel` and `RawModule` (2643)
- Document device synchronization (2798)
- Fix typo in scipy.fft docs (2804, thanks grlee77!)
- Fix the docstring format of `cupy.asarray` (2821, thanks leofang!)
- Update cuTENSOR version in docs (2948)
- Document `get_allocator` function (2953, thanks jakirkham!)
- Add NumPy 1.18 to installation guide (3005)
- Fix typo in note (3012, thanks Schoyen!)
- Add `cupy-cuda102` (3057)

Installation

- Do not let Python 2 users build CuPy v7+ (2766, thanks leofang!)
- Fix an issue that `cuComplex_bridge.h` is not installed (2984)
- Fix ROCm build errors (3071)

Examples
- Fix GMM example for matplotlib 3 (2996)
- Use `cupy.random` in kmeans example (3026)

Tests
- Test cuTENSOR v1.0.0 (2727)
- Use more stable input to test `linalg.matrix_power` (2788)
- Remove Python 3.4 matrix from Travis CI (2794)
- Drop ChainerCV's test in master branch. (2803)
- Refactor array testing decorators (2818)
- Fix decorator usage in tests (2820)
- Add f-contiguous reduction tests (2830)
- Test `ifloordiv` with numpy 1.18 (2852)
- Fix `test_helper.py` for NumPy 1.18 (2883)
- Avoid 0s in the diagonal of `TestSolveTriangular` inputs (2927)
- Add tests for size argument with no input (2931)
- Print installed packages in pytest (2979)
- Make `testing.parameterize` pdb-friendly (3024)
- Require `scipy` in `test_gmm` (3048)

Others

- Allow install without thrust (2730)
- Add Mergify configuration file (2894)
- Make `cupyx.time.repeat` experimental (2897)
- Make `cupyx.allow_synchronize` experimental (2947)
- Some fixes to `.pfnci/script.sh` (3041)
- Set `CUPY_CI` environment variable in Travis CI and AppVeyor (3058)
- Bump version to v8.0.0a1 (3069)
