CUDA Quantum is now available [on PyPI](https://pypi.org/project/cuda-quantum/)!
The initial PyPI release does not yet include the NVIDIA multi-GPU and tensornet backends. For the fully featured version, use our Docker images [on NGC](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/cuda-quantum) or build from source using the release assets.
Version 0.4.0 adds support for executing quantum kernels on Quantinuum and IonQ backends. For more information, see our [docs](https://nvidia.github.io/cuda-quantum/latest/using/hardware.html).
As always, we welcome questions and feedback in the form of [issues](https://github.com/NVIDIA/cuda-quantum/issues/new/choose) and [discussions](https://github.com/NVIDIA/cuda-quantum/discussions) on this repository.
What's Changed
Features and Enhancements 🎉
* Implement cudaq::control() taking a free function as argument by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/35
* Add reset to kernel_builder in C++ and python. by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/18
* Add for_loop to cudaq::kernel_builder and cudaq.Kernel by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/19
* Optimization: do not add control qubits to compute/uncompute steps of compute_action idiom. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/63
* Expose for_each_term and for_each_pauli to python by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/56
* Implement spin_op::to_matrix() by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/31
* Add support for negate operator (operator!) to cudaq::control. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/81
* Improve the ExecutionManager Extension Point by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/33
* Performance enhancements: observe_n and sample_n broadcast functions by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/74
* spin_op performance enhancement by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/115
* Implement chemistry domain sub-package. by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/112
* [optimizer] Decomposition pass by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/143
* Increase performance of quantum allocation and deallocation in simulation by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/167
* Basis translation pass by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/144
* Implement runtime quantum operation tracing by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/92
* [optimizer] Multicontrol decomposition by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/194
* Add Server Helper for Quantinuum backends by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/176
* Expose SWAP gate to c++ and python builder by anthony-santana in https://github.com/NVIDIA/cuda-quantum/pull/200
* [opt] Add memtoreg and regtomem passes. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/233
* Added forward difference gradient evaluation. by poojarao8 in https://github.com/NVIDIA/cuda-quantum/pull/107
* Local emulation of remote qpu execution by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/262
* Implement MPI support in CUDA Quantum by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/237
* [pass] Add a loop normalization pass. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/313
* Linux support for pip installation by anthony-santana in https://github.com/NVIDIA/cuda-quantum/pull/304
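
For readers unfamiliar with the forward difference gradient evaluation mentioned in the list above, the underlying numerical idea can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the CUDA Quantum implementation: the names `forward_difference_gradient` and `toy_cost` and the default `step` value are hypothetical and chosen only for this example.

```python
import math
from typing import Callable, List, Sequence


def forward_difference_gradient(
    f: Callable[[Sequence[float]], float],
    x: Sequence[float],
    step: float = 1e-6,  # assumed step size, not taken from the release
) -> List[float]:
    """Approximate the gradient of f at x with forward differences.

    Component i is (f(x + step * e_i) - f(x)) / step, so the total cost is
    one baseline evaluation plus one extra evaluation per parameter.
    """
    base = f(x)
    grad = []
    for i in range(len(x)):
        shifted = list(x)
        shifted[i] += step  # perturb only the i-th parameter
        grad.append((f(shifted) - base) / step)
    return grad


def toy_cost(params: Sequence[float]) -> float:
    # Hypothetical stand-in for an expectation value of a parameterized circuit.
    return math.cos(params[0]) + math.sin(params[1])


gradient = forward_difference_gradient(toy_cost, [0.3, 1.2])
```

Compared with a central difference, the forward variant needs roughly half as many function evaluations, at the price of a truncation error that shrinks only linearly with the step size.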
Bug Fixes 🐛
* Add overload to handle the case when the user writes: by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/32
* Do not allow operator! on target qubits. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/6
* Temporary fix for AST visitor reentrancy bugs. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/22
* Fix bug 69, no expval attached to `observe_result` when shots aren't provided by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/70
* Fix `QuakeValue` Lifetime Bug and `kernel_builder::to_quake` Handling by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/67
* Adding r-val overload (QuakeValue &&) for kernel_builder adjoint modifier by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/99
* [CircuitCheck] Handle qvec as controls by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/128
* [common-ops] Fixes some matrices (row-major vs col-major) issues by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/133
* [quake] Fixes some matrices (row-major vs col-major) issues by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/140
* Fix 129, Make kernel_builder::qalloc(1) explicitly return a qvec. by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/136
* `QuakeBridgeVisitor` to handle `cphase` by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/132
* cudaq-ensmallen should not export BLAS symbols by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/185
* Fixed a subtle bug in `QppCircuitSimulator::observe` by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/189
* Bind r1 gate to python by anthony-santana in https://github.com/NVIDIA/cuda-quantum/pull/198
* Fixed `QppCircuitSimulator` shots `ExecutionResult.expectationValue` by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/208
* Fixing segfault crashes when using measure/reset ops. by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/217
* Fix 250: linkage of top-level (global) function. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/267
* Reset and parametric gates in `kernel_builder` by 1tnguyen in https://github.com/NVIDIA/cuda-quantum/pull/269
* Fix 251: Base profile should handle single qubit allocations. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/273
* Fix 281: let canonicalization pattern work with IndexType. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/282
* Fix 286: Add canonicalization to hoist invariants cc.loop arguments. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/289
* Fix 296: issue processing if-statements in JIT compilation by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/298
* [pass] Fix 291: don't erase non-controlled ops by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/292
* Fix 325: Bridge had some bugs with callable instances. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/326
* Fix issue with qreg of dynamic size and disappearing instructions by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/358
* Fix 344 - add support for std::uint8_t kernel argument by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/356
* Fix kernel_builder nested function call bug, 332 by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/334
* Fix 338: Work on implementation of cudaq::adjoint. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/374
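
Two of the fixes above concern row-major versus column-major matrix layouts. As a rough illustration of why the distinction matters for gate matrices, the following plain Python sketch reads the same flat buffer under both conventions. It is a generic example, not code from the CUDA Quantum sources; the helper names `from_row_major` and `from_col_major` are hypothetical.

```python
def from_row_major(flat, rows, cols):
    """Interpret a flat buffer as a matrix stored row by row."""
    return [[flat[r * cols + c] for c in range(cols)] for r in range(rows)]


def from_col_major(flat, rows, cols):
    """Interpret a flat buffer as a matrix stored column by column."""
    return [[flat[c * rows + r] for c in range(cols)] for r in range(rows)]


# Pauli Y written out row by row: [[0, -i], [i, 0]].
pauli_y_flat = [0, -1j, 1j, 0]

as_rows = from_row_major(pauli_y_flat, 2, 2)  # the intended matrix Y
as_cols = from_col_major(pauli_y_flat, 2, 2)  # the transpose, which equals -Y

# Pauli X is symmetric, so both readings agree and a layout bug stays hidden.
pauli_x_flat = [0, 1, 1, 0]
x_same = from_row_major(pauli_x_flat, 2, 2) == from_col_major(pauli_x_flat, 2, 2)
```

For a square matrix the two readings are transposes of each other, so a layout mix-up only shows up for non-symmetric matrices such as Y.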
Breaking Changes 🛠
* Remove qpud by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/91
* Delete qtx-translate and references to same. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/102
* Remove outdated functions from cudaq.py by anthony-santana in https://github.com/NVIDIA/cuda-quantum/pull/105
* [CircuitCheck] Remove QTX support by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/118
* [quake] Universally replace the QRef type name with Ref. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/160
* [quake] Universal conversion of QVec to Veq. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/163
* Removes the QTX dialect by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/157
* NVQ++ Targets by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/147
* Update python command line flags by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/187
* [nfc] Remove the raise to affine (stub) pass. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/235
Documentation Updates ✏️
* Refer to GitHub for building from source instructions by bettinaheim in https://github.com/NVIDIA/cuda-quantum/pull/40
* Documentation update for control qubit negation by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/86
* Fixes to common operations definitions by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/120
* Update circuit simulator documentation to reflect the latest refactoring by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/90
* Update the documentation to reflect the new unified dialect by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/153
* Make the CC ops documentation more uniform. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/150
* [docs] Small fixes to quake dialect example by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/165
Other Changes
* Revert change to disable jump branching. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/5
* Replace the old deallocation pass. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/30
* Make use of the various math dialect power operations. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/71
* [nvq++] Add RPATH flags only to the final binary by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/52
* CircuitSimulator Refactor by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/12
* Add a wire type to the quake dialect. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/101
* Add new ops to quake. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/109
* Link to correct version of custatvec libs by hamidelmaazouz in https://github.com/NVIDIA/cuda-quantum/pull/88
* [CircuitCheck] Run canonicalizer before checking by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/122
* Convert the quake dialect ops to support both memory and register forms. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/124
* Change qextract to extract_ref everywhere. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/138
* Add types for structs and arrays to CC by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/146
* [quake] OperatorInterface method to get negated controls by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/154
* [CircuitCheck] Handle negated controls by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/159
* Fold constant into extract_ref op by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/164
* [CircuitCheck] Handle raw index in ExtractRefOp by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/168
* [cc] Start adding some LLVM-like operations. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/192
* [CircuitCheck] Handle local qubits by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/190
* [quake, cc] Improved alloca ops by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/193
* Fixes for macOS by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/213
* [bridge, runtime] Replace use of LLVM-IR dialect in the bridge and python interface with CC dialect operations. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/201
* [cc] Add a folder to compute_ptr op. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/220
* [cg] Move GenKernelExecution to CC dialect. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/225
* Remove the use of the MLIR Memref dialect. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/226
* Load available simulators and platforms lazily in Python by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/239
* [opt] Expand loop unrolling to autodetect counted loops. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/265
* [pass] Decompose aggregate quantum allocations into multiple individual qubit allocations by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/280
* [opt] Flag to raise an error if cannot unroll a loop by boschmitt in https://github.com/NVIDIA/cuda-quantum/pull/295
* Preserve the line information in the source code for JIT compilation. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/299
* Fix 324: Update the old aggressive early inlining pass. by schweitzpgi in https://github.com/NVIDIA/cuda-quantum/pull/331
* Callable Kernel Arguments Passed QuakeSynthesizer. by amccaskey in https://github.com/NVIDIA/cuda-quantum/pull/355
New Contributors
* 1tnguyen made their first contribution in https://github.com/NVIDIA/cuda-quantum/pull/73
* hamidelmaazouz made their first contribution in https://github.com/NVIDIA/cuda-quantum/pull/88
* poojarao8 made their first contribution in https://github.com/NVIDIA/cuda-quantum/pull/107
* tlubowe made their first contribution in https://github.com/NVIDIA/cuda-quantum/pull/59
* Gistbatch made their first contribution in https://github.com/NVIDIA/cuda-quantum/pull/110
* splch made their first contribution in https://github.com/NVIDIA/cuda-quantum/pull/134
**Full Changelog**: https://github.com/NVIDIA/cuda-quantum/compare/0.3.0...841d7332db45d716aa18d5d33eaeb116b748068efb506bdfa0e7ed018bc9b553