Cupy

Latest version: v13.3.0

11.0.0rc1

Not secure
This is the release note of v11.0.0rc1. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av11.0.0rc1) for the complete list of solved issues and merged PRs.

**We are going to release v11.0.0 on July 28th. Please start testing your workload with this release candidate (`pip install --pre cupy-cuda11x -f https://pip.cupy.dev/pre`). See the [Upgrade Guide](https://docs.cupy.dev/en/latest/upgrade.html#cupy-v11) for the list of possible breaking changes.**

We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!

Highlights

Support CUDA 11.7 (6767)

Full support for CUDA 11.7 has been added as of this release. Binary packages can be installed with the following command: `pip install --pre cupy-cuda11x -f https://pip.cupy.dev/pre`

Unified Binary Package for CUDA 11.2 or later (6730)

CuPy v11 provides a unified binary package named `cupy-cuda11x` that supports all CUDA 11.2+ releases. This replaces per-CUDA version binary packages (`cupy-cuda112`, `cupy-cuda113`, …, `cupy-cuda117`) provided in CuPy v10 or earlier.

Note that CUDA 11.1 or earlier still requires per-CUDA version binary packages. `cupy-cuda102`, `cupy-cuda110`, and `cupy-cuda111` will be provided for CUDA 10.2, 11.0, and 11.1, respectively.
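The package selection rule described above can be summarised in a few lines. This is a minimal sketch and not part of CuPy: the helper name `wheel_for_cuda` is hypothetical, and it encodes only the CUDA versions named in this release note.

```python
def wheel_for_cuda(major: int, minor: int) -> str:
    """Return the CuPy v11 binary package name for a CUDA toolkit version.

    Hypothetical helper: it covers only the versions listed in the release note.
    """
    if major == 11 and minor >= 2:
        # CUDA 11.2 or later is served by the unified package.
        return "cupy-cuda11x"
    # CUDA 11.1 or earlier still uses per-CUDA version packages.
    per_version = {
        (10, 2): "cupy-cuda102",
        (11, 0): "cupy-cuda110",
        (11, 1): "cupy-cuda111",
    }
    if (major, minor) not in per_version:
        raise ValueError(f"no CuPy v11 package listed for CUDA {major}.{minor}")
    return per_version[(major, minor)]
```

For instance, `wheel_for_cuda(11, 7)` selects the unified package, while `wheel_for_cuda(11, 1)` selects `cupy-cuda111`.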

Binary Package for Arm Platform (6705)

CuPy v11 provides a `cupy-cuda11x` binary package built for aarch64, which supports CUDA 11.2+ Arm SBSA and JetPack 5.
These wheels are available through our Pip index: `pip install --pre cupy-cuda11x -f https://pip.cupy.dev/aarch64`

Support for `ndarray` subclassing (6720, 6755)

This release allows users to subclass `cupy.ndarray`, using the same protocol as NumPy:

python
import cupy


class C(cupy.ndarray):

    def __new__(cls, *args, info=None, **kwargs):
        # Explicit constructor call: create the array, then attach the extra attribute.
        obj = super().__new__(cls, *args, **kwargs)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        # obj is None when called from the explicit constructor above.
        if obj is None:
            return
        self.info = getattr(obj, 'info', None)


a = C([0, 1, 2, 3], info='information')
assert type(a) is C
assert issubclass(type(a), cupy.ndarray)
assert a.info == 'information'


Note that view casting and new from template mechanisms are also supported as described by the NumPy [documentation](https://numpy.org/doc/stable/user/basics.subclassing.html).
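The two additional mechanisms can be illustrated as follows. This sketch uses NumPy so that it runs without a GPU; since the release note states that CuPy follows the same protocol, replacing `numpy` with `cupy` is expected to behave the same way, although that is an assumption not verified here. The class name `InfoArray` is illustrative.

```python
import numpy as np


class InfoArray(np.ndarray):

    def __new__(cls, *args, info=None, **kwargs):
        obj = super().__new__(cls, *args, **kwargs)
        obj.info = info
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        # Reached for view casting and new-from-template.
        self.info = getattr(obj, "info", None)


# View casting: reinterpret an existing base array as the subclass.
base = np.arange(4)
viewed = base.view(InfoArray)

# New from template: slicing an instance creates a new instance from it.
a = InfoArray((4,), info="information")
tail = a[1:]
```

In the view-casting case the source array has no `info` attribute, so `viewed.info` falls back to `None`; in the new-from-template case `tail.info` is copied from `a`.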

Add Collective Communication APIs in `cupyx.distributed` for Sparse Matrices

All the collective calls implemented for dense matrices now support sparse matrices. Users interested in this feature should install `mpi4py` in order to perform an efficient metadata exchange.

Google Summer of Code 2022

We would like to give a warm welcome to khushi-411, who will be working on adding support for the `cupyx.scipy.interpolate` APIs as part of her GSoC internship!

Changes without compatibility

Bump base Docker image to the latest supported one (6802)

CuPy official Docker images have been upgraded. Users relying on these images may suffer from compatibility issues with preinstalled tools or libraries.

Changes

New Features

- Add `cupy.setxor1d` (6582)
- Add initial `cupyx.spatial.distance` support from pylibraft (6690)
- Support `cupy.ndarray` subclassing - Part 2 - View casting (6720)
- Add sparse `broadcast` (6758)
- Add sparse `reduce` (6761)
- Add sparse `all_reduce` and minor fixes (6762)
- Add sparse `all_to_all`, `reduce_scatter`, `send_recv` (6765)
- Subclass `cupy.ndarray` subclassing - Part 3 - New from template (ufunc) (6775)
- Add `cupyx.scipy.special.log_ndtr` (6776)
- Add `cupyx.scipy.special.expn` (6790)

Enhancements

- Utilize CUDA Enhanced Compatibility (6730)
- Fix to return correct CUDA version when in CUDA Python mode (6736)
- Support CUDA 11.7 (6767)
- Make the warning for cupy.array_api say "cupy" instead of "numpy" (6791)
- Utilize CUDA Enhanced Compatibility in all wrappers (6799)
- Add support for `cupy-cuda11x` wheel (6800)
- Bump base Docker image to the latest supported one (6802)
- Remove `CUPY_CUDA_VERSION` as much as possible (6810)
- Raise UserWarning in `cupy.cuda.compile_with_cache` (6818)
- cupy-wheel: Use NVRTC to infer the toolkit version (6819)
- Support NumPy 1.23 (6820)
- Fix for NumPy 1.23 (6807)

Performance Improvements

- Improved integer matrix multiplication performance by modifying tuning parameters (6703)
- Use fast convolution algorithm in `cupy.poly1d.__pow__` (6770)

Bug Fixes

- Fix polynomial tests (6721)
- Fix batched matmul for integral numbers (6725)
- Fix `cupy.median` for NaN inputs (6759)
- Fix required cusparse symbol not loaded in CUDA 11.1.1 (6806)

Code Fixes

- Add type annotation in `_cuda_types.py` (6726)
- Subclass rename (6746)
- Add type annotation to JIT internal types (6778)

Documentation

- Add CUDA 11.7 on documents (6768)
- Improved NVTX documentation (6774)
- Fix docs to hide `ndarray_base` (6782)
- Update docs for `cupy-cuda11x` wheel (6803)
- Bump NumPy version used in docs (6824)
- Add upgrade guide for CuPy v11 (6826)

Tests

- Fix mempool tests (6591)
- CI: Fix prep script to show build failure details (6781)
- Fix a potential variable misuse bug (6786)
- Fix CI Docker image build failing in head test (6804)
- Tiny clean up in CI script (6809)

Others

- Fix docker workflow to push to latest image (6832)

Contributors

The CuPy Team would like to thank all those who contributed to this release!

andoorve asi1024 asmeurer cjnolet emcastillo khushi-411 kmaehashi leofang LostBenjamin pri1311 rietmann-nv takagi

11.0.0b3

Not secure
This is the release note of v11.0.0b3. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av11.0.0b3) for the complete list of solved issues and merged PRs.

We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!

Highlights


Support cuTensorNet as an `einsum` backend (6677) (thanks leofang!)

A new accelerator for CuPy has been added (`CUPY_ACCELERATORS=cutensornet`).
This feature requires `cuquantum-python >= 22.03` and `cuTENSOR >= 1.5.0`. It is used to accelerate the `cupy.linalg.einsum` API and to support large array sizes in it.

Changes without compatibility

Drop Support for ROCm 4.2 (6734)

CuPy v11 will drop support for ROCm 4.2. We recommend that users switch to ROCm 4.3 or 5.0 instead.

Drop Support for NumPy 1.18/1.19 and SciPy 1.4/1.5 (6735)

As per [NEP29](https://numpy.org/neps/nep-0029-deprecation_policy.html#support-table), support for NumPy 1.18/1.19 was dropped in 2021. The supported SciPy versions are those released close to the supported NumPy versions.
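A project that depends on CuPy v11 may want to check its environment against these bounds. The following is a rough sketch using only the standard library. The minimum versions (`MIN_NUMPY`, `MIN_SCIPY`) are inferred from the versions dropped above and are an assumption, and the helper names are hypothetical.

```python
import re

# Lower bounds inferred from the dropped versions (assumption, not an official list).
MIN_NUMPY = (1, 20)
MIN_SCIPY = (1, 6)


def release_tuple(version: str) -> tuple:
    """Extract the leading (major, minor) pair from a version string."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def is_supported(version: str, minimum: tuple) -> bool:
    """Return True if `version` is at least `minimum` (major, minor)."""
    return release_tuple(version) >= minimum
```

In practice this would be called as `is_supported(numpy.__version__, MIN_NUMPY)` and `is_supported(scipy.__version__, MIN_SCIPY)`.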

Changes

New Features
- Support cuTensorNet (from cuQuantum) as an `einsum` backend (6677)
- Add `cupy.poly` (6697)
- Support cupy.ndarray subclassing - Part 1 - Direct constructor call (6716)

Enhancements
- Support cuDNN 8.4 (6641)
- Support cuTENSOR 1.5.0 (6665)
- JIT: Use C++14 (6670)
- Support cuTENSOR 1.5.0 (6722)
- Drop support for ROCm 4.2 in CuPy v11 (6734)
- Drop support for NumPy 1.18/1.19 and SciPy 1.4/1.5 in CuPy v11 (6735)
- Fix compilation warning caused by `ifdef` (6739)

Performance Improvements
- Accelerate `bincount`, `histogram2d`, `histogramdd` with CUB (6701)

Bug Fixes
- Fix memory leak in the FFT plan cache during multi-threading (6704)
- Fix `ifdef` for ROCm >= 4.2 (6750)

Code Fixes
- JIT: Cosmetic change of `Dim3` class (6644)

Documentation
- Fix imports of `scatter_add` example (6696)
- Minor improvement on the array API docs (6706)
- Document the returned benchmark object (6712)
- Use exposed name in user guide (6718)

Tests
- Xfail a test of `LOBPCG` on ROCm 5.0+ (6603)
- CI: Update repo for libcudnn7 in cuda10.2 (6708)
- Bump pinned mypy version (6710)
- Follow `scipy==1.8.1` sparse dot bugfix (6727)
- Support testing CUDA 11.6+ in FlexCI (6731)
- Fix GPG key issue in FlexCI base image (6738)

Contributors

The CuPy Team would like to thank all those who contributed to this release!

asi1024 Dahlia-Chehata emcastillo kmaehashi leofang takagi

11.0.0b2

Not secure
This is the release note of v11.0.0b2. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av11.0.0b2) for the complete list of solved issues and merged PRs.

We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!

Highlights

JIT Improvements (6620, 6640, 6649, 6668)

CuPy JIT has been further enhanced thanks to leofang and eternalphane!
It is now possible to use [CUDA cooperative groups](https://developer.nvidia.com/blog/cooperative-groups/) and access `.shape` and `.strides` attributes of ndarrays.

py
import cupy
from cupyx import jit

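@jit.rawkernel()
def kernel(x, y):
    size = x.shape[0]
    # Total number of threads in the grid and this thread's global index.
    ntid = jit.gridDim.x * jit.blockDim.x
    tid = jit.blockIdx.x * jit.blockDim.x + jit.threadIdx.x
    # Grid-stride loop: each thread copies every ntid-th element.
    for i in range(tid, size, ntid):
        y[i] = x[i]
    # Synchronize the thread block through cooperative groups.
    g = jit.cg.this_thread_block()
    g.sync()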

x = cupy.arange(200, dtype=cupy.int64)
y = cupy.zeros((200,), dtype=cupy.int64)
kernel[2, 32](x, y)

print(kernel.cached_code)


The above program emits the CUDA code as follows:

cpp
#include <cooperative_groups.h>
namespace cg = cooperative_groups;

extern "C" __global__ void kernel(CArray<long long, 1, true, true> x, CArray<long long, 1, true, true> y) {
  ptrdiff_t i;
  ptrdiff_t size = thrust::get<0>(x.get_shape());
  unsigned int ntid = (gridDim.x * blockDim.x);
  unsigned int tid = ((blockIdx.x * blockDim.x) + threadIdx.x);
  for (ptrdiff_t __it = tid, __stop = size, __step = ntid; __it < __stop; __it += __step) {
    i = __it;
    y[i] = x[i];
  }
  cg::thread_block g = cg::this_thread_block();
  g.sync();
}
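To see why the loop `for i in range(tid, size, ntid)` in the kernel above copies all 200 elements with only 2 × 32 = 64 threads, the index pattern can be emulated on the CPU. This is an illustrative sketch in plain Python, not CuPy code; the helper name `grid_stride_indices` is hypothetical.

```python
def grid_stride_indices(tid: int, size: int, ntid: int) -> range:
    """Indices visited by thread `tid` in `for i in range(tid, size, ntid)`."""
    return range(tid, size, ntid)


grid, block = 2, 32   # launch configuration from kernel[2, 32]
size = 200            # length of x and y
ntid = grid * block   # total number of threads (64)

# Emulate every thread in turn and record which thread copies which element.
owner = {}
for tid in range(ntid):
    for i in grid_stride_indices(tid, size, ntid):
        assert i not in owner  # no element is written by two threads
        owner[i] = tid

# Every index from 0 to size - 1 is handled by exactly one thread.
covered_all = sorted(owner) == list(range(size))
```

Thread 0 handles indices 0, 64, 128 and 192, so each thread processes three or four elements and the array size does not have to match the launch configuration.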


Initial MPI and sparse matrix support in `cupyx.distributed` (6628, 6658)

CuPy v10 added the `cupyx.distributed` API to perform interprocess communication using NCCL in a way similar to MPI. In CuPy v11 we are extending this API to support sparse matrices as defined in `cupyx.scipy.sparse`. Currently only `send`/`recv` primitives are supported but we will be adding support for collective calls in the following releases.

Additionally, it is now possible to use MPI (through the `mpi4py` Python package) to initialize the NCCL communicator. This avoids launching the TCP server otherwise used to exchange CPU values. Moreover, we recommend enabling MPI for sparse matrix communication, because each communication call must exchange metadata, which leads to device synchronization if MPI is not enabled.

python
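# run with mpiexec -n N python …

from mpi4py import MPI
import cupyx.distributed

comm = MPI.COMM_WORLD
workers = comm.Get_size()
rank = comm.Get_rank()

# Initialize the NCCL communicator, using MPI for the management exchange.
comm = cupyx.distributed.init_process_group(workers, rank, use_mpi=True)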


Announcements

Introduction of generic `cupy-wheel` (EXPERIMENTAL) (6012)

We have added a new package on PyPI called `cupy-wheel`. This meta package allows other libraries to declare a dependency on CuPy and have the CuPy binary wheel matching the user's environment installed transparently. Users can also install CuPy using this package instead of manually specifying a CUDA/ROCm version.


pip install cupy-wheel


This package is only available for the stable release as the current pre-release wheels are not hosted in PyPI.

This feature is currently experimental and subject to change, so we recommend that users do not distribute packages relying on it for now. Suggestions and comments are very welcome (please visit 6688).

Changes

New Features
- Support cooperative group in JIT compiler (6620)
- Add support for sparse matrices in `cupyx.distributed` (6628)
- JIT: Support compile-time for-loop unrolling (6649)
- JIT: Support `.shape` and `.strides` (6668)

Enhancements
- Add a few driver/runtime/nvrtc API wrappers (6604)
- Implement `flatten(order)` (6613)
- Implemented a `__repr__` for `cupyx.profiler._time._PerfCaseResult` (6617)
- JIT: Avoid calling default constructor if possible (6619)
- Add missing `cudaDevAttrMemoryPoolsSupported` to hip (6621)
- Add CC 3.2 to Tegra arch list (6631)
- JIT: Add more cooperative group APIs (6640)
- JIT: Add `kernel.cached_code` test (6643)
- Use MPI for management in `cupyx.distributed` (6658)
- Improve warning message in sparse (6669)

Performance Improvements
- Improve copy and assign operation (6181)
- Performance improvement of `cupy.intersect1d` (6586)

Bug Fixes
- Define `float16::operator-()` only for ROCm 5.0+ (6624)
- JIT: fix access to cached codes (6639)
- Fix cuda python CI (6652)
- Fix int64 overflow in `cupy.polyval` (6664)
- JIT: Disable `memcpy_async` on CUDA 11.0 (6671)

Documentation
- Add `--pre` option to instructions installing pre-releases (6612)
- JIT: fix function signatures in the docs (6648)
- Fix typo in performance guide (6657)

Installation
- Add universal CuPy package (6012)

Tests
- Run daily benchmark with head branch against latest release (6598)
- CI: Trigger FlexCI for hotfix branches (6625)
- Remove `jenkins` requirements (6632)
- Fix `TestIncludesCompileCUDA` for HEAD tests (6646)
- Trigger CUDA Python tests with `/test mini` (6653)
- Fix missing f prefix on f-strings fix (6674)

Contributors
The CuPy Team would like to thank all those who contributed to this release!

asi1024 code-review-doctor danielg1111 davidegavio emcastillo eternalphane kmaehashi leofang okuta takagi toslunar

11.0.0b1

Not secure
This is the release note of v11.0.0b1. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av11.0.0b1) for the complete list of solved issues and merged PRs.

We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!

Notice (2022-04-05)

**We have identified that this release contains a regression that prevents CuPy from working in older CUDA GPUs (Maxwell or earlier). We are planning to fix this issue in the next pre-release. See 6615 for the details.**

Highlights

Increase coverage of `cupyx.scipy.special` APIs (6461, 6582, 6571)

A series of `scipy.special` routines have been added to `cupyx` with optimized CUDA raw kernel implementations. `loggamma`, `multigammaln`, fast Hankel transformations and several other utility special functions were added in this series of PRs by grlee77 and khushi-411.

11.0.0a2

Not secure
This is the release note of v11.0.0a2. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av11.0.0a2) for the complete list of solved issues and merged PRs.

We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!

Highlights

Improved NumPy functions coverage (6078)

A series of NumPy routines has been proposed as good-first-issue tasks and, as a result, an increasing number of contributors have sent pull requests to help increase the number of available APIs. A tracking issue listing the routines implemented so far is available at 6078.

Initial support for `cupy.typing` (6251)

An API equivalent to [`numpy.typing`](https://numpy.org/devdocs/reference/typing.html) has been added so that type annotations can be used in CuPy and in user code.
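As a rough illustration of how such annotations might look in user code, the sketch below annotates a function with an array type. It assumes that `cupy.typing` exposes an `NDArray` alias analogous to `numpy.typing.NDArray`; the function `scale` is hypothetical. The import is guarded by `TYPE_CHECKING` and the annotations are written as strings, so the module can be imported even where CuPy is not installed.

```python
from typing import TYPE_CHECKING, Any

if TYPE_CHECKING:
    # Only evaluated by static type checkers.
    # Assumption: cupy.typing provides NDArray in the same way numpy.typing does.
    from cupy.typing import NDArray


def scale(x: "NDArray[Any]", factor: float) -> "NDArray[Any]":
    """Multiply an array by a scalar; the annotations document the expected types."""
    return x * factor
```

A static type checker such as mypy can then report calls that pass something other than an array, while the function behaves identically at runtime.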

Support for CUDA 11.6 (6349)

Initial support for CUDA 11.6 has been added as of this release. However, binary wheels are not yet distributed, and users are expected to build CuPy from source in the meantime.

Support for ROCm 5.0 (6466)

Initial support for ROCm 5.0 has been added as of this release. However, binary wheels are not yet distributed, and users are expected to build CuPy from source in the meantime.

Changes without compatibility

Drop support for ROCm 4.0 (6420)

CuPy v11 will drop support for ROCm 4.0. We recommend that users switch to ROCm 4.2/4.3 instead.

Changes

New Features

- Add `cupy.isneginf` and `cupy.isposinf` (6089)
- Add `cupy.typing` (6251)
- Add `asarray_chkfinite` API. (6275)
- Add Box-Cox transformations to `cupyx.scipy.special` (6302)
- Use CUDA's `log1p` for `cupyx.scipy.special.log1p` (6315)
- Add special functions from the CUDA Math API (6317)
- Add `beta` functions to `cupyx.scipy.special` (6318)
- Add `cupy.union1d` API. (6357)
- Add `cupy.float_power` (6371)
- Add `cupy.intersect1d` API. (6402)
- Add `cupy.setdiff1d` api. (6433)
- Add `cupy.format_float_scientific` API (6474)
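
Several of the routines listed above mirror NumPy functions of the same name, so a short usage sketch may help. The example tries CuPy first and falls back to NumPy when CuPy or a usable GPU is not available; this fallback is an illustration device and not something the release prescribes.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # fails if no usable GPU is present
except Exception:
    xp = np  # NumPy provides functions with the same names

a = xp.asarray([1, 2, 3, 4])
b = xp.asarray([3, 4, 5])

union = xp.union1d(a, b)        # sorted unique values found in either array
common = xp.intersect1d(a, b)   # sorted unique values found in both arrays
only_a = xp.setdiff1d(a, b)     # values of a that are not in b

values = xp.asarray([-float("inf"), 0.0, float("inf")])
neg = xp.isneginf(values)       # elementwise test for negative infinity
pos = xp.isposinf(values)       # elementwise test for positive infinity
```

Results live on the GPU when CuPy is used; `.tolist()` or `cupy.asnumpy` copies them back to the host.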

Enhancements

- First step of `mypy` introduction (4955)
- Fix CI failure to support SciPy 1.8.0 (6249)
- implement overwrite_input in cupy.{percentile,quantile} (6298)
- avoid DeprecationWarning from SciPy 1.8 (`cupyx.scipy.sparse`) (6321)
- Support NumPy 1.22 (6323)
- Remove batched QR solver's experimental mark (6327)
- Make scipy.special ufuncs work with CuPy inputs (6341)
- Fix thrust related build issue with CUDA 11.6 (6346)
- Support CUDA 11.6 (6349)
- Fix CI failure to support SciPy 1.8.0 (6362)
- Fix type annotations in installer (6382)
- Add `__cupy_get_ndarray__` dunder method to transform objects to arrays (6414)
- Bump Jitify version to fix memory leak (6430)
- Support cuSPARSELt 0.2.0 (repost) (6436)
- Support ROCm 5.0 (6466)
- Warn if unexpectedly failed to detect device count in `cupy.show_config()` (6472)
- Fix verbose LOBPCG for SciPy 1.8 (6388)

Performance Improvements

- Reduce memory usage in `cupy.sort` (6392)

Bug Fixes

- Fix JIT to support notebook environment (6329)
- Fix `cupyx.ndimage.spline_filter1d` for HIP (6406)
- Fix `cupy.nan_to_num` (6408)
- Fix `cupyx.special.gammainc`, `lpmv` and `sph_harm` for hip (6409)
- Fix boolean views for HIP (6412)
- Fix reduction contiguous size calculation (6457)

Code Fixes

- Remove global `use_hip` flag in setup (6391)
- Hide private names in `cupyx.scipy.linalg` (6449)
- Hide private names in `cupyx.scipy.ndimage` (6450)
- Hide private names in `cupyx.scipy.signal` (6451)
- Hide private names in `cupyx.scipy.sparse` (6454)
- Hide private names in `cupyx.scipy.stats` (6456)

Documentation

- Use `cupy.__version__` instead of `pkg_resources` (6332)
- Tentatively pin intersphinx to SciPy 1.7.1 docs (6440)
- Revert "Tentatively pin intersphinx to SciPy 1.7.1 docs" (6479)

Installation

- Avoid monkeypatching distutils (6273)
- Eliminate unnecessary configuration pass in setup (6389)
- Remove `CUPY_SETUP_ENABLE_THRUST=0` environment variable (6390)
- Drop support for ROCm 4.0 (6420)
- Bump version to v11.0.0a2 (6501)

Tests

- CI: allow discarding docker image cache manually (6269)
- Add slow tests for stable branch (6340)
- Parameterize library installer tests (6343)
- Fix tests for `eigh()` for CUDA 11.6 (6347)
- Avoid empty notification message for scheduled tests (6363)
- Support SciPy 1.8 (6365)
- Add `cupy.testing.installed` (6381)
- Mark XFAIL for SciPy 1.8 release candidate (6385)
- CI: Bump ROCm version from 4.3 to 4.3.1 (6415)
- CI: build docs in parallel (6416)
- CI: Add HEAD tests for stable branch (6423)
- CI: Use default schema/matrix path in `generate.py` (6424)
- Skip hfft related tests in HIP (6427)
- CI: Manage test tags in yaml (6429)
- CI: coverage in reST (6445)
- CI: fix NCCL 2.10 unit test not covered (6448)
- CI: Fix CUDA 11.6 driver update steps (6467)
- Ignore warnings from Optuna 3.0 pre-releases (6470)
- Fix failing tests in ROCm (6482)

Others

- CI: allow specifying special `skip` tag (6468)


Contributors

The CuPy Team would like to thank all those who contributed to this release!

amanchhaparia anaruse asi1024 emcastillo grlee77 IvanYashchuk khushi-411 kmaehashi pri1311 saswatpp takagi

11.0.0a1

Not secure
This is the release note of v11.0.0a1. See [here](https://github.com/cupy/cupy/pulls?q=is%3Apr+is%3Aclosed+milestone%3Av11.0.0a1) for the complete list of solved issues and merged PRs.

We are running a [Gitter chat](https://gitter.im/cupy/community) for general discussions and quick questions. Feel free to join the channel to talk with developers and users!

Highlights

Improved NumPy functions coverage (6078)

A series of NumPy routines has been proposed as good-first-issue tasks and, as a result, an increasing number of contributors have sent pull requests to help increase the number of available APIs. A tracking issue listing the routines implemented so far is available at 6078.

Add `cupyx.scipy.special` functions (5687)

Spherical harmonics, Legendre and Gamma functions are implemented using highly performant specific CUDA kernels. Thanks to grlee77!

Initial support for CUDA Graph API by means of stream capture API (4567)


This PR adds the ability to use the CUDA Graph API to greatly reduce kernel launch overhead. This is done through the stream capture API; an example follows.
Thanks to leofang!

py
import cupy as cp

a = cp.random.randint(0, 10, 100, dtype=cp.int32)
s = cp.cuda.Stream(non_blocking=True)

with s:
    s.begin_capture()
    a += 3
    a = cp.abs(a)
    g = s.end_capture()  # work is queued, but not yet launched
    g.launch()
    s.synchronize()


Support `__device__` function in CuPy JIT (6265)

The new interface `cupyx.jit.rawkernel(device=True)` is supported to define a CUDA device function.

py
from cupyx import jit

# Compiled as a CUDA __device__ function, callable from other JIT kernels.
@jit.rawkernel(device=True)
def getitem(x, tid):
    return x[tid]

@jit.rawkernel()
def elementwise_copy(x, y):
    tid = jit.threadIdx.x + jit.blockDim.x * jit.blockIdx.x
    y[tid] = getitem(x, tid)


The following CUDA code is generated from the above Python code.

cpp
__device__ int getitem_1(CArray<int, 1, true, true> x, unsigned int tid) {
  return x[tid];
}
extern "C" __global__ void elementwise_copy(CArray<int, 1, true, true> x, CArray<int, 1, true, true> y) {
  unsigned int tid;
  tid = (threadIdx.x + (blockDim.x * blockIdx.x));
  y[tid] = getitem_1(x, tid);
}


Changes

New Features
- Support stream capture (4567)
- Add additional special functions (spherical harmonics, Legendre, Gamma functions) (5687)
- Add `cupy.asfarray` (6085)
- Add `cupy.trapz` (6107)
- Add `cupy.array_api.linalg` (6131)
- Add `cupy.mask_indices` (6156)
- Add `cupy.array_equiv` API. (6254)
- Add `cupy.cublas.syrk` and `cupy.cublas.sbmv` (6278)
- Add `cupy.vander` API. (6279)
- Add `cupy.ediff1d` API. (6280)
- Add `cupy.fabs` API. (6282)
- Add discrete cosine and sine transforms to `cupyx.scipy.fft` (6288)
- Add `logit`, `expit` and `log_expit` to `cupyx.scipy.special` (6300)
- Add `xlogy` and `xlog1py` to `cupyx.scipy.special` (6301)
- Add `tril_indices` and `tril_indices_from` API. (6305)
- Add `cupy.format_float_positional` (6308)
- Add `cupy.row_stack` API. (6312)
- Add `triu_indices` and `triu_indices_from` API. (6316)
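
A few of the routines in the list above can be tried out as follows. This is a minimal sketch that assumes the CuPy functions accept the same basic arguments as their NumPy namesakes; it falls back to NumPy when CuPy or a usable GPU is not available, purely so that the example runs anywhere.

```python
import numpy as np

try:
    import cupy as xp
    xp.zeros(1)  # fails if no usable GPU is present
except Exception:
    xp = np  # NumPy provides functions with the same names

# Differences between consecutive elements.
steps = xp.ediff1d(xp.asarray([1, 2, 4, 7]))

# Vandermonde matrix with columns x**2, x**1, x**0.
v = xp.vander(xp.asarray([1, 2, 3]), 3)

# Row and column indices of the lower triangle of a 3x3 matrix.
rows, cols = xp.tril_indices(3)

# True if the arrays are equal after broadcasting to a common shape.
same = xp.array_equiv(xp.asarray([1, 2]), xp.asarray([[1, 2], [1, 2]]))
```

`rows` and `cols` can be used directly for indexing, for example `m[rows, cols]` to extract the lower triangle of a 3x3 array `m`.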

Enhancements
- Raise better message when importing CPU array via DLPack (6051)
- Borrow more non-GPU APIs from NumPy (6074)
- Add more aliases for compatibility with NumPy (6075)
- Import more dtype aliases from NumPy (6076)
- Borrow indexing APIs from NumPy (6077)
- Apply upstream patch to `cupy.array_api` (6086)
- Compile cub/thrust with no unique symbol (6106)
- Support cuDNN 8.3.0 (6108)
- Support all advanced indexing (6127)
- Support CUDA 11.5.1 (6166)
- Support lambda function in `cupy.vectorize` (6170)
- Support eigenvalue solver 64bit API (6178)
- Support cuTENSOR 1.4.0 (6187)
- Make `matmul` support ufunc kwargs (6195)
- Alias NumPy error classes (6212)
- Support comparison to `None` and `Ellipsis` (6222)
- JIT: Fix if expr typing rule (6234)
- Support comparison with more objects (6250)
- JIT: Support `__device__` function (6265)
- More clear warning message (6283)
- Make streams hashable (6285)
- Check isinstance before comparison in `__eq__` (6287)
- Support cuDNN 8.3.2 (6314)
- Deprecate MachAr (support NumPy 1.22) (6188)
- Fix `cupy.linalg.qr` to align with NumPy 1.22 (6225)
- Change a parameter name in `percentile` and `quantile` to support NumPy 1.22 (6228)

Performance Improvements
- Avoid 64bit division for reduce register consumption (6019)
- Remove memory copy in matmul (6179)

Bug Fixes
- Detect repeated axis in reduction (5964)
- Fix `__all__` in `cupyx.scipy.fft` (6071)
- Fix `__getitem__` on Ellipsis and advanced indexing dimension (6081)
- Allow leading unit dimensions in copy source (6118)
- Always test broadcast in `copyto` (6121)
- Fix overloading ambiguity in ndimage filters (6162)
- Fix empty Cholesky (6164)
- Fix empty `solve` (6167)
- Allow `flip` ()-shaped array (6169)
- Handles infinities of the same sign in `logaddexp` and `logaddexp2` (6172)
- Fix 4675 on resolving TODO in 4198 (6197)
- Eigenvalue solver 64bit API on CUDA 11.1 (6201)
- Fix edge case compatibility in `cupy.eye()` (6208)
- Fix `linalg.eigh` and `linalg.eigvalsh` on empty inputs (6210)
- Fix overlapping `out` in `matmul` and `(tensor)dot` (6216)
- Fix `compile_with_cache` returning None (6232)
- Fixing index calculation for random constructor (6257)
- BUG: Fix the .T attribute in the `array_api` namespace (6289)
- Fix stream capture in ROCm (6296)
- Fix cuDNN installer not working (6337)

Code Fixes
- Remove `__all__` from `cupyx/scipy/*` (6149)
- Delete `from os import path` (6152)
- Remove legacy `cp.linalg.solve()` implementation (6161)

Documentation
- Add link to compatibility matrix (6055)
- Update upgrade guide (6058)
- Add v11 to compatibility matrix (6067)
- Exclude `kernel_version` from comparison table (6072)
- Doc: Add more footnotes to comparison table (6073)
- Add polynomial modules to comparison table (6082)
- Add CITATION.bib and update README (6091)
- Remove LLVM_PATH note on document (6093)
- Docs: Update linkcode implementation (6126)
- Update footnotes in comparison table (6142)
- Update conda-forge installation guide (6186)
- Revise Overview for CuPy v10 (6209)
- Docs: CentOS installation from source (6218)
- Fix `cupy.trapz` docstring (6239)
- Fix `eigsh` doc (6266)
- Add `cupy.positive` in API Reference (6274)

Installation
- Replace `distutils` with `setuptools` in Windows `cl.exe` detection (6025)
- Fix for cuDNN directory structure in Windows (6342)

Tests
- Fix `testing.multi_gpu` to add pytest marker (6015)
- CI: add link to ROCm projects in CI coverage matrix (6037)
- CI: use separate project for multi-GPU tests (6050)
- Fix CI result notification message format (6066)
- Fix CI cannot override cuSPARSELt/cuTENSOR version preinstalled (6084)
- Workaround DeprecationWarning raised from pkg_resources (6094)
- Fix missing `multi_gpu` annotation in tests (6098)
- Fix exception handling in cupyx.distributed (6114)
- Improve FlexCI test scripts (6117)
- CI: Add timeout to show_config (6120)
- Trigger FlexCI from GitHub Actions (6130)
- CI: Fix package override sometimes fails in CentOS (6141)
- CI: Need to update CUDA driver in cuda115.multi (6144)
- Add tests for `convolve2d` (6171)
- CI: Update limits to reduce cache size (6174)
- CI: Fix unquoted specifiers (6175)
- Support pre-release NumPy version in tests (6190)
- Remove XFAIL for XPASS tests on ROCm (6259)
- Tentatively pin to `setuptools<60` in Windows CI (6260)
- Fix cache key for github actions (6281)
- Use NVIDIA docker images for CUDA 11.5 (6303)
- Tentatively pin to CUDA Driver 495 (6310)
- Remove unused dtype parameterizing in `tril_indices` test (6322)
- Use `get_include` instead of `array_equiv` for fallback test (6333)
- CI: Add `cuda-slow` test in FlexCI (6335)
- CI: use CUDA docker images for CUDA Python CI (6336)

Others
- Add doc issue template (6294)
- Bump version to v11.0.0a1 (6344)

Contributors

The CuPy Team would like to thank all those who contributed to this release!

akochepasov amanchhaparia asi1024 ColmTalbot emcastillo eternalphane grlee77 haesleinhuepf khushi-411 kmaehashi leofang okuta ptim0626 SauravMaheshkar shwina takagi thomasjpfan tom24d toslunar twmht WiseroOrb Yutaro-Sanada
