Pennylane

Latest version: v0.39.0


0.8.1

<h3>New features</h3>

* The `catalyst.mitigate_with_zne` error mitigation compilation pass now supports folding gates locally, in addition to the existing global folding method. [(1006)](https://github.com/PennyLaneAI/catalyst/pull/1006) [(#1129)](https://github.com/PennyLaneAI/catalyst/pull/1129)

While global folding applies the scale factor by forming the inverse of the entire quantum circuit (without measurements) and repeating the circuit with its inverse, local folding instead inserts per-gate folding sequences directly in place of each gate in the original circuit.

For example,

python
import jax
import pennylane as qml
from catalyst import qjit, mitigate_with_zne
from pennylane.transforms import exponential_extrapolate

dev = qml.device("lightning.qubit", wires=4, shots=5)

@qml.qnode(dev)
def circuit():
    qml.Hadamard(wires=0)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliY(wires=0))

@qjit(keep_intermediate=True)
def mitigated_circuit():
    s = jax.numpy.array([1, 2, 3])
    return mitigate_with_zne(
        circuit,
        scale_factors=s,
        extrapolate=exponential_extrapolate,
        folding="local-all",  # "local-all" folds every gate locally; "global" (the default) is the original method
    )()


pycon
>>> circuit()
>>> mitigated_circuit()
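
As a rough illustration of the two folding strategies described above, the following plain-Python sketch folds a list of gate labels. It is a toy model and not Catalyst's implementation: the helper names (`inverse`, `fold_global`, `fold_local`) and the `_dg` suffix marking an adjoint are assumptions made for this example, and it assumes an odd integer scale factor of the form `2n + 1`.

```python
def inverse(gate):
    """Return the label of the adjoint gate; a "_dg" suffix marks an adjoint."""
    return gate[:-3] if gate.endswith("_dg") else gate + "_dg"


def fold_global(circuit, scale_factor):
    """Append n copies of (circuit adjoint, circuit) after the whole circuit."""
    n = (scale_factor - 1) // 2
    adjoint = [inverse(g) for g in reversed(circuit)]
    return circuit + (adjoint + circuit) * n


def fold_local(circuit, scale_factor):
    """Replace every gate G with G followed by n copies of (G adjoint, G)."""
    n = (scale_factor - 1) // 2
    folded = []
    for gate in circuit:
        folded += [gate] + [inverse(gate), gate] * n
    return folded


circuit = ["H", "CNOT"]
global_folded = fold_global(circuit, 3)
local_folded = fold_local(circuit, 3)
print(global_folded)
print(local_folded)
```

Both strategies produce the same number of gates for a given scale factor; they differ only in where the folded pairs are placed.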


<h3>Improvements</h3>

* Fixes an issue where certain JAX linear algebra functions from `jax.scipy.linalg` gave incorrect results when invoked from within a qjit block, and adds full support for other `jax.scipy.linalg` functions. [(1097)](https://github.com/PennyLaneAI/catalyst/pull/1097)

The supported linear algebra functions include, but are not limited to:

- [`jax.scipy.linalg.cholesky`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.cholesky.html)
- [`jax.scipy.linalg.expm`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.expm.html)
- [`jax.scipy.linalg.funm`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.funm.html)
- [`jax.scipy.linalg.hessenberg`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.hessenberg.html)
- [`jax.scipy.linalg.lu`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.lu.html)
- [`jax.scipy.linalg.lu_solve`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.lu_solve.html)
- [`jax.scipy.linalg.polar`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.polar.html)
- [`jax.scipy.linalg.qr`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.qr.html)
- [`jax.scipy.linalg.schur`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.schur.html)
- [`jax.scipy.linalg.solve`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.solve.html)
- [`jax.scipy.linalg.sqrtm`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.sqrtm.html)
- [`jax.scipy.linalg.svd`](https://jax.readthedocs.io/en/latest/_autosummary/jax.scipy.linalg.svd.html)

<h3>Breaking changes</h3>

* The `scale_factors` argument of the `mitigate_with_zne` function now follows the definition used in the literature: it must be a list of positive odd integers, since fractional scale factors are not supported. [(1120)](https://github.com/PennyLaneAI/catalyst/pull/1120)
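
The requirement above can be checked on the caller's side before invoking `mitigate_with_zne`. The helper below is a hypothetical sketch (`check_scale_factors` is not part of Catalyst) that simply encodes the "positive odd integers" rule in plain Python.

```python
def check_scale_factors(scale_factors):
    """Return the scale factors as a list, or raise ValueError if any entry
    is not a positive odd integer (hypothetical helper, not a Catalyst API)."""
    for s in scale_factors:
        if isinstance(s, bool) or not isinstance(s, int):
            raise ValueError(f"scale factor {s!r} is not an integer")
        if s < 1 or s % 2 == 0:
            raise ValueError(f"scale factor {s!r} is not a positive odd integer")
    return list(scale_factors)


valid = check_scale_factors([1, 3, 5])

try:
    check_scale_factors([1, 4])
    rejected = False
except ValueError:
    rejected = True
```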

<h3>Bug fixes</h3>

* Functions that call the `gather_p` primitive (such as `jax.scipy.linalg.expm`) can now be used in multiple qjits in a single program. [(1096)](https://github.com/PennyLaneAI/catalyst/pull/1096)

<h3>Contributors</h3>

This release contains contributions from (in alphabetical order):

Joey Carter,
Alessandro Cosentino,
Paul Haochen Wang,
David Ittah,
Romain Moyard,
Daniel Strano,
Raul Torres.

0.8.0

<h3>New features</h3>

* JAX-compatible functions that run on classical accelerators, such as GPUs, via `catalyst.accelerate` now support autodifferentiation. [(920)](https://github.com/PennyLaneAI/catalyst/pull/920)

For example,

python
import catalyst
import jax
import jax.numpy as jnp
from catalyst import qjit, grad

@qjit
@grad
def f(x):
    expm = catalyst.accelerate(jax.scipy.linalg.expm)
    return jnp.sum(expm(jnp.sin(x)) ** 2)


pycon
>>> x = jnp.array([[0.1, 0.2], [0.3, 0.4]])
>>> f(x)
Array([[2.80120452, 1.67518663],
[1.61605839, 4.42856163]], dtype=float64)


* Assertions can now be raised at runtime via the `catalyst.debug_assert` function. [(925)](https://github.com/PennyLaneAI/catalyst/pull/925)

Python-based exceptions (via `raise`) and assertions (via `assert`) will always be evaluated at program capture time, before certain runtime information may be available.

Use `debug_assert` to instead raise assertions at runtime, including assertions that depend on values of dynamic variables.

For example,

python
from catalyst import debug_assert, qjit

@qjit
def f(x):
    debug_assert(x < 5, "x was greater than 5")
    return x * 8


pycon
>>> f(4)
Array(32, dtype=int64)
>>> f(6)
RuntimeError: x was greater than 5


Assertions can be disabled globally for a qjit-compiled function via the `disable_assertions` keyword argument:

python
@qjit(disable_assertions=True)
def g(x):
    debug_assert(x < 5, "x was greater than 5")
    return x * 8


pycon
>>> g(6)
Array(48, dtype=int64)


* Mid-circuit measurement results when using `lightning.qubit` and `lightning.kokkos` can now be seeded via the new `seed` argument of the `qjit` decorator. [(936)](https://github.com/PennyLaneAI/catalyst/pull/936)

The seed argument accepts an unsigned 32-bit integer, which is used to initialize the pseudo-random state at the beginning of each execution of the compiled function. Therefore, different `qjit` objects with the same seed (including repeated calls to the same `qjit`) will always return the same sequence of mid-circuit measurement results.

python
import jax.numpy as jnp
import pennylane as qml
from catalyst import measure, qjit

dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    m = measure(0)

    if m:
        qml.Hadamard(0)

    return qml.probs()

@qjit(seed=37, autograph=True)
def workflow(x):
    return jnp.stack([circuit(x) for i in range(4)])


Repeatedly calling the `workflow` function above will always result in the same values:

pycon
>>> workflow(1.8)
Array([[1. , 0. ],
[1. , 0. ],
[1. , 0. ],
[0.5, 0.5]], dtype=float64)
>>> workflow(1.8)
Array([[1. , 0. ],
[1. , 0. ],
[1. , 0. ],
[0.5, 0.5]], dtype=float64)
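
The reproducibility described above follows from re-initializing the pseudo-random state with the same seed at the start of every execution. The sketch below is only an analogy using Python's standard `random` module; `run_measurements` is a made-up stand-in for a seeded execution and says nothing about the generator Catalyst actually uses.

```python
import random


def run_measurements(seed, n=4):
    """Simulate n binary 'measurement' outcomes from a freshly seeded generator."""
    rng = random.Random(seed)  # new generator state for every execution
    return [rng.randint(0, 1) for _ in range(n)]


first = run_measurements(37)
second = run_measurements(37)
print(first == second)
```

Because the generator is rebuilt from the seed on each call, the two runs yield the same sequence, just as repeated calls to the seeded `workflow` do.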


Note that setting the seed will *not* avoid shot-noise stochasticity in terminal measurement statistics such as `sample` or `expval`:

python
dev = qml.device("lightning.qubit", wires=1, shots=10)

@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    m = measure(0)

    if m:
        qml.Hadamard(0)

    return qml.expval(qml.PauliZ(0))

@qjit(seed=37, autograph=True)
def workflow(x):
    return jnp.stack([circuit(x) for i in range(4)])

pycon
>>> workflow(1.8)
Array([1. , 1. , 1. , 0.4], dtype=float64)
>>> workflow(1.8)
Array([ 1. , 1. , 1. , -0.2], dtype=float64)


* Exponential fitting is now a supported method of zero-noise extrapolation when performing error mitigation in Catalyst using `mitigate_with_zne`. [(953)](https://github.com/PennyLaneAI/catalyst/pull/953)

This new functionality fits the data from noise-scaled circuits with an exponential function, and returns the zero-noise value:

py
import jax.numpy as jnp
import pennylane as qml
from pennylane.transforms import exponential_extrapolate
from catalyst import mitigate_with_zne, qjit

dev = qml.device("lightning.qubit", wires=2, shots=100000)

@qml.qnode(dev)
def circuit(weights):
    qml.StronglyEntanglingLayers(weights, wires=[0, 1])
    return qml.expval(qml.PauliZ(0) @ qml.PauliZ(1))

@qjit
def workflow(weights, s):
    zne_circuit = mitigate_with_zne(circuit, scale_factors=s, extrapolate=exponential_extrapolate)
    return zne_circuit(weights)


pycon
>>> weights = jnp.ones([3, 2, 3])
>>> scale_factors = jnp.array([1, 2, 3])
>>> workflow(weights, scale_factors)
Array(-0.19946598, dtype=float64)
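
To make the idea of exponential extrapolation concrete, here is a minimal pure-Python sketch that fits `y = a * exp(b * s)` by a least-squares line through `log(y)` and returns the value at `s = 0`. It is a simplification under stated assumptions: it requires strictly positive values, the function name `exponential_zero_noise` and the synthetic data are invented for this example, and it is not claimed to match `exponential_extrapolate` step for step.

```python
import math


def exponential_zero_noise(scale_factors, values):
    """Fit log(values) linearly against scale_factors and return exp(intercept),
    which is the fitted value at scale factor 0. Assumes all values are > 0."""
    logs = [math.log(v) for v in values]
    n = len(scale_factors)
    mean_s = sum(scale_factors) / n
    mean_l = sum(logs) / n
    cov = sum((s - mean_s) * (l - mean_l) for s, l in zip(scale_factors, logs))
    var = sum((s - mean_s) ** 2 for s in scale_factors)
    slope = cov / var
    intercept = mean_l - slope * mean_s
    return math.exp(intercept)


# Synthetic noisy expectation values that decay exponentially with the scale factor.
scale_factors = [1, 3, 5]
noisy_values = [0.8 * math.exp(-0.1 * s) for s in scale_factors]
estimate = exponential_zero_noise(scale_factors, noisy_values)
print(estimate)
```

On data that is exactly exponential, the fit recovers the noiseless amplitude (here `0.8`) up to floating-point error.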


* A new module is available, `catalyst.passes`, which provides Python decorators for enabling and configuring Catalyst MLIR compiler passes. [(911)](https://github.com/PennyLaneAI/catalyst/pull/911) [(#1037)](https://github.com/PennyLaneAI/catalyst/pull/1037)

The first pass available is `catalyst.passes.cancel_inverses`, which enables the `-remove-chained-self-inverse` MLIR pass that cancels two neighbouring Hadamard gates.

python
from catalyst.debug import get_compilation_stage
from catalyst.passes import cancel_inverses

dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev)
def circuit(x: float):
    qml.RX(x, wires=0)
    qml.Hadamard(wires=0)
    qml.Hadamard(wires=0)
    return qml.expval(qml.PauliZ(0))

@qjit(keep_intermediate=True)
def workflow(x):
    optimized_circuit = cancel_inverses(circuit)
    return circuit(x), optimized_circuit(x)
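
Conceptually, the cancellation performed on the circuit above can be pictured as a peephole optimization over a list of operations. The following plain-Python sketch is a toy model only: the `SELF_INVERSE` set, the `(name, wires)` tuples, and `cancel_adjacent_self_inverses` are assumptions for illustration and do not reflect how the MLIR pass is implemented.

```python
# Gate names treated as self-inverse in this toy model.
SELF_INVERSE = {"Hadamard", "PauliX", "PauliY", "PauliZ", "CNOT"}


def cancel_adjacent_self_inverses(ops):
    """Remove pairs of identical self-inverse operations that end up next to
    each other in the list, using a stack so that nested pairs also cancel."""
    out = []
    for op in ops:
        name, _wires = op
        if out and out[-1] == op and name in SELF_INVERSE:
            out.pop()  # the pair multiplies to the identity
        else:
            out.append(op)
    return out


ops = [("RX", (0,)), ("Hadamard", (0,)), ("Hadamard", (0,))]
optimized = cancel_adjacent_self_inverses(ops)
print(optimized)
```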


* Catalyst now has debug functions `get_compilation_stage` and `replace_ir` to acquire and recompile the IR from a given pipeline pass for functions compiled with `keep_intermediate=True`. [(981)](https://github.com/PennyLaneAI/catalyst/pull/981)

For example, consider the following function:

python
@qjit(keep_intermediate=True)
def f(x):
    return x**2


pycon
>>> f(2.0)
4.0


Here we use `get_compilation_stage` to acquire the IR, and then modify `%2 = arith.mulf %in, %in_0 : f64` to turn the square function into a cubic one via `replace_ir`:

python
from catalyst.debug import get_compilation_stage, replace_ir

old_ir = get_compilation_stage(f, "HLOLoweringPass")
new_ir = old_ir.replace(
    "%2 = arith.mulf %in, %in_0 : f64\n",
    "%t = arith.mulf %in, %in_0 : f64\n %2 = arith.mulf %t, %in_0 : f64\n"
)
replace_ir(f, "HLOLoweringPass", new_ir)


The recompilation starts after the given checkpoint stage:

pycon
>>> f(2.0)
8.0


Each function can also be used independently of the other. Note that `get_compilation_stage` replaces the `print_compilation_stage` function; please see the Breaking Changes section for more details.

* Catalyst now supports generating executables from compiled functions for the native host architecture using `catalyst.debug.compile_executable`. [(1003)](https://github.com/PennyLaneAI/catalyst/pull/1003)

pycon
>>> @qjit
... def f(x):
...     y = x * x
...     catalyst.debug.print_memref(y)
...     return y
>>> f(5)
MemRef: base = 0x31ac22580 rank = 0 offset = 0 sizes = [] strides = [] data =
25
Array(25, dtype=int64)


We can use `compile_executable` to compile this function to a binary:

pycon
>>> from catalyst.debug import compile_executable
>>> binary = compile_executable(f, 5)
>>> print(binary)
/path/to/executable


Executing this function from a shell environment:

console
$ /path/to/executable
MemRef: base = 0x64fc9dd5ffc0 rank = 0 offset = 0 sizes = [] strides = [] data =
25


<h3>Improvements</h3>

* Catalyst has been updated to work with JAX v0.4.28 (exact version match required). [(931)](https://github.com/PennyLaneAI/catalyst/pull/931) [(#995)](https://github.com/PennyLaneAI/catalyst/pull/995)

* Catalyst now supports keyword arguments for qjit-compiled functions. [(1004)](https://github.com/PennyLaneAI/catalyst/pull/1004)

pycon
>>> @qjit
... @grad
... def f(x, y):
...     return x * y
>>> f(3., y=2.)
Array(2., dtype=float64)


Note that the `static_argnums` argument to the `qjit` decorator is not supported when passing argument values as keyword arguments.

* Support has been added for the `jax.numpy.argsort` function within qjit-compiled functions. [(901)](https://github.com/PennyLaneAI/catalyst/pull/901)

* Autograph now supports in-place array assignments with static slices. [(843)](https://github.com/PennyLaneAI/catalyst/pull/843)

For example,

python
@qjit(autograph=True)
def f(x, y):
    y[1:10:2] = x
    return y


pycon
>>> f(jnp.ones(5), jnp.zeros(10))
Array([0., 1., 0., 1., 0., 1., 0., 1., 0., 1.], dtype=float64)


* Autograph now works when `qjit` is applied to a function decorated with `vmap`, `cond`, `for_loop` or `while_loop`. Previously, stacking the autograph-enabled qjit decorator directly on top of other Catalyst decorators would lead to errors. [(835)](https://github.com/PennyLaneAI/catalyst/pull/835) [(#938)](https://github.com/PennyLaneAI/catalyst/pull/938) [(#942)](https://github.com/PennyLaneAI/catalyst/pull/942)

python
from catalyst import vmap, qjit

dev = qml.device("lightning.qubit", wires=2)

@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    return qml.expval(qml.PauliZ(0))


pycon
>>> x = jnp.array([0.1, 0.2, 0.3])
>>> qjit(vmap(circuit), autograph=True)(x)
Array([0.99500417, 0.98006658, 0.95533649], dtype=float64)
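
The values printed above can be cross-checked by hand: for a qubit starting in the zero state, the expectation value of Pauli-Z after `RX(x)` is `cos(x)`, so mapping the circuit over an array should agree elementwise with the cosine. The short sketch below performs that check in plain Python; the variable names are chosen for this example only.

```python
import math

x = [0.1, 0.2, 0.3]

# Analytic expectation value of PauliZ after RX(x) applied to |0>.
expected = [math.cos(v) for v in x]

# Values quoted in the example output above.
reported = [0.99500417, 0.98006658, 0.95533649]

max_error = max(abs(e - r) for e, r in zip(expected, reported))
print(max_error < 1e-7)
```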


* Runtime memory usage and compilation complexity have been reduced by eliminating some scalar tensors from the IR. This was done by adding a `linalg-detensorize` pass at the end of the HLO lowering pipeline. [(1010)](https://github.com/PennyLaneAI/catalyst/pull/1010)

* Program verification is extended to confirm that the measurements included in QNodes are compatible with the specified device and settings. [(945)](https://github.com/PennyLaneAI/catalyst/pull/945) [(#962)](https://github.com/PennyLaneAI/catalyst/pull/962)

pycon
>>> dev = qml.device("lightning.qubit", wires=2, shots=None)
>>> @qjit
... @qml.qnode(dev)
... def circuit(params):
...     qml.RX(params[0], wires=0)
...     qml.RX(params[1], wires=1)
...     return {
...         "sample": qml.sample(wires=[0, 1]),
...         "expval": qml.expval(qml.PauliZ(0))
...     }
>>> circuit([0.1, 0.2])
CompileError: Sample-based measurements like sample(wires=[0, 1])
cannot work with shots=None. Please specify a finite number of shots.


* On devices that support it, initial state preparation routines `qml.StatePrep` and `qml.BasisState` are no longer decomposed when using Catalyst, improving compilation and runtime performance. [(955)](https://github.com/PennyLaneAI/catalyst/pull/955) [(#1047)](https://github.com/PennyLaneAI/catalyst/pull/1047) [(#1062)](https://github.com/PennyLaneAI/catalyst/pull/1062) [(#1073)](https://github.com/PennyLaneAI/catalyst/pull/1073)

* Improved type validation and error messaging has been added to both the `catalyst.jvp` and `catalyst.vjp` functions to ensure that the (co)tangent and parameter types are compatible. [(1020)](https://github.com/PennyLaneAI/catalyst/pull/1020) [(#1030)](https://github.com/PennyLaneAI/catalyst/pull/1030) [(#1031)](https://github.com/PennyLaneAI/catalyst/pull/1031)

For example, providing an integer tangent for a function with float64 parameters will result in an error:

pycon
>>> f = lambda x: (2 * x, x * x)
>>> f_jvp = lambda x: catalyst.jvp(f, params=(x,), tangents=(1,))
>>> qjit(f_jvp)(0.5)
TypeError: function params and tangents arguments to catalyst.jvp do not match;
dtypes must be equal. Got function params dtype float64 and so expected tangent
dtype float64, but got tangent dtype int64 instead.


Ensuring that the types match will resolve the error:

pycon
>>> f_jvp = lambda x: catalyst.jvp(f, params=(x,), tangents=(1.0,))
>>> qjit(f_jvp)(0.5)
((Array(1., dtype=float64), Array(0.25, dtype=float64)),
(Array(2., dtype=float64), Array(1., dtype=float64)))


* Add a script for setting up a Frontend-Only Development Environment that does not require compilation, as it uses the TestPyPI wheel shared libraries. [(1022)](https://github.com/PennyLaneAI/catalyst/pull/1022)

<h3>Breaking changes</h3>

* The `argnum` keyword argument in the `grad`, `jacobian`, `value_and_grad`, `vjp`, and `jvp` functions has been renamed to `argnums` to better match JAX. [(1036)](https://github.com/PennyLaneAI/catalyst/pull/1036)

* Return values of qjit-compiled functions that were previously `numpy.ndarray` are now of type `jax.Array` instead. This should have minimal impact, but code that depends on the output of qjit-compiled function being NumPy arrays will need to be updated. [(895)](https://github.com/PennyLaneAI/catalyst/pull/895)

* The `print_compilation_stage` function has been renamed to `get_compilation_stage`. It no longer prints the IR to standard output; instead, it returns the IR as a string. [(981)](https://github.com/PennyLaneAI/catalyst/pull/981)

pycon
>>> @qjit(keep_intermediate=True)
... def func(x: float):
...     return x
>>> print(get_compilation_stage(func, "HLOLoweringPass"))
module @func {
  func.func public @jit_func(%arg0: tensor<f64>)
  -> tensor<f64> attributes {llvm.emit_c_interface} {
    return %arg0 : tensor<f64>
  }
  func.func @setup() {
    quantum.init
    return
  }
  func.func @teardown() {
    quantum.finalize
    return
  }
}


* Support for TOML files in Schema 1 has been disabled. [(960)](https://github.com/PennyLaneAI/catalyst/pull/960)

* The `mitigate_with_zne` function no longer accepts a `degree` parameter for polynomial fitting and instead accepts a callable to perform extrapolation. Any qjit-compatible extrapolation function is valid. Keyword arguments can be passed to this function using the `extrapolate_kwargs` keyword argument in `mitigate_with_zne`. [(806)](https://github.com/PennyLaneAI/catalyst/pull/806)

* The QuantumDevice API has now added the functions `SetState` and `SetBasisState` for simulators that may benefit from instructions that directly set the state. Implementing these methods is optional, and device support can be indicated via the `initial_state_prep` flag in the TOML configuration file. [(955)](https://github.com/PennyLaneAI/catalyst/pull/955)

<h3>Bug fixes</h3>

* Catalyst no longer silently converts complex parameters to floats where floats are expected; an error is raised instead. [(1008)](https://github.com/PennyLaneAI/catalyst/pull/1008)

* Fixes a bug where dynamic one-shot did not work when no mid-circuit measurements are present and when the return type is an iterable. [(1060)](https://github.com/PennyLaneAI/catalyst/pull/1060)

* Fixes a bug in finding the quantum function jaxpr when using quantum primitives with dynamic one-shot. [(1041)](https://github.com/PennyLaneAI/catalyst/pull/1041)

* Fixes a bug where the number of shots of a `LegacyDevice` was not correctly extracted when using the `LegacyDeviceFacade`. [(1035)](https://github.com/PennyLaneAI/catalyst/pull/1035)

* Catalyst no longer generates a `QubitUnitary` operation during decomposition if a device doesn't support it. Instead, the operation that would lead to a `QubitUnitary` is either decomposed or raises an error. [(1002)](https://github.com/PennyLaneAI/catalyst/pull/1002)

* Catalyst now preserves output PyTrees in QNodes executed with `mcm_method="one-shot"`. [(957)](https://github.com/PennyLaneAI/catalyst/pull/957)

For example:

python
dev = qml.device("lightning.qubit", wires=1, shots=20)

@qml.qjit
@qml.qnode(dev, mcm_method="one-shot")
def func(x):
    qml.RX(x, wires=0)
    m_0 = catalyst.measure(0, postselect=1)
    return {"hi": qml.expval(qml.Z(0))}


pycon
>>> func(0.9)
{'hi': Array(-1., dtype=float64)}


* Fixes a bug where scatter did not work correctly with list indices. [(982)](https://github.com/PennyLaneAI/catalyst/pull/982)

python
A = jnp.ones([3, 3]) * 2

def update(A):
    A = A.at[[0, 1], :].set(jnp.ones([2, 3]), indices_are_sorted=True, unique_indices=True)
    return A


pycon
>>> print(qjit(update)(A))
[[1. 1. 1.]
[1. 1. 1.]
[2. 2. 2.]]


* Static arguments can now be passed through a QNode when specified with the `static_argnums` keyword argument. [(932)](https://github.com/PennyLaneAI/catalyst/pull/932)

python
dev = qml.device("lightning.qubit", wires=1)

@qjit(static_argnums=(1,))
@qml.qnode(dev)
def circuit(x, c):
    print("Inside QNode:", c)
    qml.RY(c, 0)
    qml.RX(x, 0)
    return qml.expval(qml.PauliZ(0))


When executing the qjit-compiled function above, `c` will be a static variable with value known at compile time:

pycon
>>> circuit(0.5, 0.5)
"Inside QNode: 0.5"
Array(0.77015115, dtype=float64)


Changing the value of `c` will result in re-compilation:

pycon
>>> circuit(0.5, 0.8)
"Inside QNode: 0.8"
Array(0.61141766, dtype=float64)
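
The re-compilation behaviour shown above is what one would expect from a cache keyed on the static argument values. The class below is a deliberately simplified, hypothetical model (`ToyJit` is not a Catalyst class and performs no real compilation); it only counts how often a new static value forces a fresh cache entry.

```python
class ToyJit:
    """Toy stand-in for a JIT wrapper that keeps one cache entry per
    combination of static argument values."""

    def __init__(self, fn, static_argnums=()):
        self.fn = fn
        self.static_argnums = tuple(static_argnums)
        self.cache = {}
        self.compile_count = 0

    def __call__(self, *args):
        key = tuple(args[i] for i in self.static_argnums)
        if key not in self.cache:
            # A previously unseen static value triggers a new "compilation".
            self.compile_count += 1
            self.cache[key] = self.fn
        return self.cache[key](*args)


def scaled(x, c):
    return x * c


jitted = ToyJit(scaled, static_argnums=(1,))
jitted(0.5, 0.5)  # first call: "compiles" for c = 0.5
jitted(0.7, 0.5)  # same static value: cache hit, no new compilation
jitted(0.5, 0.8)  # new static value: "compiles" again
compiles = jitted.compile_count
print(compiles)
```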


* Fixes a bug where Catalyst would fail to apply quantum transforms and preserve QNode configuration settings when Autograph was enabled. [(900)](https://github.com/PennyLaneAI/catalyst/pull/900)

* `pure_callback` will no longer cause a crash in the compiler if the return type signature is declared incorrectly and the callback function is differentiated. [(916)](https://github.com/PennyLaneAI/catalyst/pull/916)

Instead, this is caught early and a useful error message returned:

python
@catalyst.pure_callback
def callback_fn(x) -> jax.ShapeDtypeStruct((2,), jnp.float32):
    return np.array([np.sin(x), np.cos(x)])

callback_fn.fwd(lambda x: (callback_fn(x), x))
callback_fn.bwd(lambda x, dy: (jnp.array([jnp.cos(x), -jnp.sin(x)]) @ dy,))

@qjit
@catalyst.grad
def f(x):
    return jnp.sum(callback_fn(jnp.sin(x)))


pycon
>>> f(0.54)
TypeError: Callback callback_fn expected type ShapedArray(float32[2]) but observed ShapedArray(float64[2]) in its return value


* AutoGraph will now correctly convert conditional statements where the condition is a non-boolean static value. [(944)](https://github.com/PennyLaneAI/catalyst/pull/944)

Internally, statically known non-boolean predicates (such as `1`) will be converted to `bool`:

python
@qml.qjit(autograph=True)
def workflow(x):
    n = 1

    if n:
        y = x ** 2
    else:
        y = x

    return y


* `value_and_grad` will now correctly differentiate functions with multiple arguments. Previously, attempting to differentiate functions with multiple arguments, or pass the `argnums` argument, would result in an error. [(1034)](https://github.com/PennyLaneAI/catalyst/pull/1034)

python
@qjit
def g(x, y, z):
    def f(x, y, z):
        return x * y ** 2 * jnp.sin(z)
    return catalyst.value_and_grad(f, argnums=[1, 2])(x, y, z)


pycon
>>> g(0.4, 0.2, 0.6)
(Array(0.00903428, dtype=float64),
(Array(0.0903428, dtype=float64), Array(0.01320537, dtype=float64)))
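
The numbers in this output can be verified analytically. With `argnums=[1, 2]` the gradient is taken with respect to `y` and `z`, giving `2 * x * y * sin(z)` and `x * y**2 * cos(z)`. The sketch below evaluates these closed-form expressions with the standard library; the helper name `grad_f_wrt_y_and_z` is introduced purely for this check.

```python
import math


def f(x, y, z):
    return x * y ** 2 * math.sin(z)


def grad_f_wrt_y_and_z(x, y, z):
    """Closed-form partial derivatives of f with respect to y and z."""
    df_dy = 2 * x * y * math.sin(z)
    df_dz = x * y ** 2 * math.cos(z)
    return df_dy, df_dz


value = f(0.4, 0.2, 0.6)
df_dy, df_dz = grad_f_wrt_y_and_z(0.4, 0.2, 0.6)
print(value, df_dy, df_dz)
```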


* A bug is fixed in `catalyst.debug.get_cmain` to support multi-dimensional arrays as function inputs. [(1003)](https://github.com/PennyLaneAI/catalyst/pull/1003)

* Bug fixed when parameter annotations return strings. [(1078)](https://github.com/PennyLaneAI/catalyst/pull/1078)

* In certain cases, `jax.scipy.linalg.expm` [may return incorrect numerical results](https://github.com/PennyLaneAI/catalyst/issues/1071) when used within a qjit-compiled function. A warning will now be raised when `jax.scipy.linalg.expm` is used to inform of this issue.

In the meantime, we strongly recommend using the [catalyst.accelerate](https://docs.pennylane.ai/projects/catalyst/en/latest/code/api/catalyst.accelerate.html) function within qjit-compiled functions to call `jax.scipy.linalg.expm` directly.

python
@qjit
def f(A):
    B = catalyst.accelerate(jax.scipy.linalg.expm)(A)
    return B


Note that this PR doesn't actually fix the aforementioned numerical errors, and just raises a warning. [(1082)](https://github.com/PennyLaneAI/catalyst/pull/1082)

<h3>Documentation</h3>

* A page has been added to the documentation, listing devices that are Catalyst compatible. [(966)](https://github.com/PennyLaneAI/catalyst/pull/966)

<h3>Internal changes</h3>

* Adds `catalyst.from_plxpr.from_plxpr` for converting a PennyLane variant jaxpr into a Catalyst variant jaxpr. [(837)](https://github.com/PennyLaneAI/catalyst/pull/837)

* Catalyst now uses Enzyme `v0.0.130`. [(898)](https://github.com/PennyLaneAI/catalyst/pull/898)

* When memrefs do not have an identity layout, memref copy operations are replaced by the linalg copy operation. This does not use a runtime function but instead lowers to the scf and standard dialects. It also ensures better compatibility with Enzyme. [(917)](https://github.com/PennyLaneAI/catalyst/pull/917)

* LLVM's O2 optimization pipeline and Enzyme's AD transformations are now only run in the presence of gradients, significantly improving compilation times for programs without derivatives. Similarly, LLVM's coroutine lowering passes only run when `async_qnodes` is enabled in the QJIT decorator. [(968)](https://github.com/PennyLaneAI/catalyst/pull/968)

* The function `inactive_callback` was renamed `__catalyst_inactive_callback`. [(899)](https://github.com/PennyLaneAI/catalyst/pull/899)

* The function `__catalyst_inactive_callback` has the nofree attribute. [(898)](https://github.com/PennyLaneAI/catalyst/pull/898)

* `catalyst.dynamic_one_shot` uses `postselect_mode="pad-invalid-samples"` in favour of `interface="jax"` when processing results. [(956)](https://github.com/PennyLaneAI/catalyst/pull/956)

* Callbacks now have nicer identifiers in their MLIR representation. The identifiers include the name of the Python function being called back into. [(919)](https://github.com/PennyLaneAI/catalyst/pull/919)

* Fix tracing of `SProd` operations to bring Catalyst in line with PennyLane v0.38. [(935)](https://github.com/PennyLaneAI/catalyst/pull/935)

After some changes in PennyLane, `SProd.terms()` returns the terms as leaves instead of a tree. This means that we need to manually trace each term and finally multiply it with the coefficients to create a Hamiltonian.

* The function `mitigate_with_zne` accommodates a `folding` input argument for specifying the type of circuit folding technique to be used by the error-mitigation routine (only the `global` value is supported to date). [(946)](https://github.com/PennyLaneAI/catalyst/pull/946)

* Catalyst's implementation of the Lightning Kokkos plugin has been removed in favor of Lightning's own. [(974)](https://github.com/PennyLaneAI/catalyst/pull/974)

* The `validate_device_capabilities` function is considered obsolete. Hence, it has been removed. [(1045)](https://github.com/PennyLaneAI/catalyst/pull/1045)

<h3>Contributors</h3>

This release contains contributions from (in alphabetical order):

Joey Carter,
Alessandro Cosentino,
Lillian M. A. Frederiksen,
David Ittah,
Josh Izaac,
Christina Lee,
Kunwar Maheep Singh,
Mehrdad Malekmohammadi,
Romain Moyard,
Erick Ochoa Lopez,
Mudit Pandey,
Nate Stemen,
Raul Torres,
Tzung-Han Juang,
Paul Haochen Wang.

0.7.0

<h3>New features</h3>

* Add support for accelerating classical processing via JAX with `catalyst.accelerate`. [(805)](https://github.com/PennyLaneAI/catalyst/pull/805)

Classical code that can be just-in-time compiled with JAX can now be seamlessly executed on GPUs or other accelerators with `catalyst.accelerate`, right inside of QJIT-compiled functions.

python
@accelerate(dev=jax.devices("gpu")[0])
def classical_fn(x):
    return jnp.sin(x) ** 2

@qjit
def hybrid_fn(x):
    y = classical_fn(jnp.sqrt(x))  # will be executed on a GPU
    return jnp.cos(y)


Available devices can be retrieved via `jax.devices()`. If not provided, the default value of `jax.devices()[0]` as determined by JAX will be used.

* Catalyst callback functions, such as `pure_callback`, `debug.callback`, and `debug.print`, now all support auto-differentiation. [(706)](https://github.com/PennyLaneAI/catalyst/pull/706) [(#782)](https://github.com/PennyLaneAI/catalyst/pull/782) [(#822)](https://github.com/PennyLaneAI/catalyst/pull/822) [(#834)](https://github.com/PennyLaneAI/catalyst/pull/834) [(#882)](https://github.com/PennyLaneAI/catalyst/pull/882) [(#907)](https://github.com/PennyLaneAI/catalyst/pull/907)

- When using callbacks that do not return any values, such as `catalyst.debug.callback` and `catalyst.debug.print`, these functions are marked as 'inactive' and do not contribute to or affect the derivative of the function:

python
import logging

log = logging.getLogger(__name__)
log.setLevel(logging.INFO)

@qml.qjit
@catalyst.grad
def f(x):
    y = jnp.cos(x)
    catalyst.debug.print("Debug print: y = {0:.4f}", y)
    catalyst.debug.callback(lambda _: log.info("Value of y = %s", _))(y)
    return y ** 2


pycon
>>> f(0.54)
INFO:__main__:Value of y = 0.8577086813638242
Debug print: y = 0.8577
array(-0.88195781)


- Callbacks that *do* return values and may affect the qjit-compiled function's computation, such as `pure_callback`, may have custom derivatives manually registered with the Catalyst compiler in order to support differentiation.

This can be done via the `pure_callback.fwd` and `pure_callback.bwd` methods, to specify how the forwards and backwards pass (the vector-Jacobian product) of the callback should be computed:

python
@catalyst.pure_callback
def callback_fn(x) -> float:
    return np.sin(x[0]) * x[1]

@callback_fn.fwd
def callback_fn_fwd(x):
    # returns the evaluated function as well as residual
    # values that may be useful for the backwards pass
    return callback_fn(x), x

@callback_fn.bwd
def callback_fn_vjp(res, dy):
    # Accepts residuals from the forward pass, as well
    # as (one or more) cotangent vectors dy, and returns
    # a tuple of VJPs corresponding to each input parameter.

    def vjp(x, dy) -> (jax.ShapeDtypeStruct((2,), jnp.float64),):
        return (np.array([np.cos(x[0]) * dy * x[1], np.sin(x[0]) * dy]),)

    # The VJP function can also be a pure callback
    return catalyst.pure_callback(vjp)(res, dy)

@qml.qjit
@catalyst.grad
def f(x):
    y = jnp.array([jnp.cos(x[0]), x[1]])
    return jnp.sin(callback_fn(y))


pycon
>>> x = jnp.array([0.1, 0.2])
>>> f(x)
array([-0.01071923, 0.82698717])
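
The gradient reported above can be reproduced with the chain rule alone, which is a useful sanity check when registering custom derivatives. The sketch below assumes only the mathematical definition of the example, `f(x) = sin(sin(cos(x[0])) * x[1])`, and evaluates its partial derivatives by hand; `grad_f` is a name invented for this check and involves no Catalyst code.

```python
import math


def grad_f(x0, x1):
    """Chain-rule gradient of f(x) = sin(sin(cos(x0)) * x1)."""
    y0 = math.cos(x0)           # first component passed to the callback
    c = math.sin(y0) * x1       # value returned by the callback
    outer = math.cos(c)         # derivative of the outer sin
    # d/dx0: outer * x1 * cos(y0) * d(cos(x0))/dx0
    df_dx0 = outer * x1 * math.cos(y0) * (-math.sin(x0))
    # d/dx1: outer * sin(y0)
    df_dx1 = outer * math.sin(y0)
    return df_dx0, df_dx1


df_dx0, df_dx1 = grad_f(0.1, 0.2)
print(df_dx0, df_dx1)
```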


* Catalyst now supports the 'dynamic one shot' method for simulating circuits with mid-circuit measurements, which compared to other methods, may be advantageous for circuits with many mid-circuit measurements executed for few shots. [(5617)](https://github.com/PennyLaneAI/pennylane/pull/5617) [(#798)](https://github.com/PennyLaneAI/catalyst/pull/798)

The dynamic one shot method evaluates dynamic circuits by executing them one shot at a time via `catalyst.vmap`, sampling a dynamic execution path for each shot. This method only works for a QNode executing with finite shots, and it requires the device to support mid-circuit measurements natively.

This new mode can be specified by using the `mcm_method` argument of the QNode:

python
dev = qml.device("lightning.qubit", wires=5, shots=20)

@qml.qjit(autograph=True)
@qml.qnode(dev, mcm_method="one-shot")
def circuit(x):

    for i in range(10):
        qml.RX(x, 0)
        m = catalyst.measure(0)

        if m:
            qml.RY(x ** 2, 1)

        x = jnp.sin(x)

    return qml.expval(qml.Z(1))


Catalyst's existing method for simulating mid-circuit measurements remains available via `mcm_method="single-branch-statistics"`.

When using `mcm_method="one-shot"`, the `postselect_mode` keyword argument can also be used to specify whether the returned result should include `shots`-number of postselected measurements (`"fill-shots"`), or whether results should include all results, including invalid postselections (`"hw-like"`):

python
@qml.qjit
@qml.qnode(dev, mcm_method="one-shot", postselect_mode="hw-like")
def func(x):
    qml.RX(x, wires=0)
    m_0 = catalyst.measure(0, postselect=1)
    return qml.sample(wires=0)


pycon
>>> res = func(0.9)
>>> res
array([-2147483648, -2147483648, 1, -2147483648, -2147483648,
-2147483648, -2147483648, 1, -2147483648, -2147483648,
-2147483648, -2147483648, 1, -2147483648, -2147483648,
-2147483648, -2147483648, -2147483648, -2147483648, -2147483648])
>>> jnp.delete(res, jnp.where(res == np.iinfo(np.int32).min)[0])
Array([1, 1, 1], dtype=int64)


Note that invalid shots will not be discarded, but will be replaced by `np.iinfo(np.int32).min`. They will not be used for processing final results (like expectation values), but they will appear in the output of QNodes that return samples directly.
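
For callers who post-process raw samples outside of JAX, the sentinel value can also be filtered with ordinary Python. The snippet below is an illustrative sketch: `INVALID_SHOT` and `drop_invalid_shots` are names made up here, and the sample list is hypothetical data rather than output from a real run.

```python
# The smallest 32-bit signed integer, i.e. the same value as np.iinfo(np.int32).min.
INVALID_SHOT = -(2 ** 31)


def drop_invalid_shots(samples):
    """Keep only the samples that are not marked with the sentinel value."""
    return [s for s in samples if s != INVALID_SHOT]


samples = [INVALID_SHOT, INVALID_SHOT, 1, INVALID_SHOT, 0]
valid = drop_invalid_shots(samples)
print(valid)
```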

For more details, see the [dynamic quantum circuit documentation](https://docs.pennylane.ai/en/latest/introduction/dynamic_quantum_circuits.html).

* Catalyst now has support for returning `qml.sample(m)` where `m` is the result of a mid-circuit measurement. [(731)](https://github.com/PennyLaneAI/catalyst/pull/731)

When used with `mcm_method="one-shot"`, this will return an array with one measurement result for each shot:

python
dev = qml.device("lightning.qubit", wires=2, shots=10)

@qml.qjit
@qml.qnode(dev, mcm_method="one-shot")
def func(x):
    qml.RX(x, wires=0)
    m = catalyst.measure(0)
    qml.RX(x ** 2, wires=0)
    return qml.sample(m), qml.expval(qml.PauliZ(0))


pycon
>>> func(0.9)
(array([0, 1, 0, 0, 0, 0, 1, 0, 0, 0]), array(0.4))


In `mcm_method="single-branch-statistics"` mode, it will be equivalent to returning `m` directly from the quantum function --- that is, it will return a single boolean corresponding to the measurement in the branch selected:

python
@qml.qjit
@qml.qnode(dev, mcm_method="single-branch-statistics")
def func(x):
    qml.RX(x, wires=0)
    m = catalyst.measure(0)
    qml.RX(x ** 2, wires=0)
    return qml.sample(m), qml.expval(qml.PauliZ(0))


pycon
>>> func(0.9)
(array(False), array(0.8))


* A new function, `catalyst.value_and_grad`, returns both the result of a function and its gradient with a single forward and backwards pass. [(804)](https://github.com/PennyLaneAI/catalyst/pull/804) [(#859)](https://github.com/PennyLaneAI/catalyst/pull/859)

This can be more efficient, and reduce overall quantum executions, compared to separately executing the function and then computing its gradient.

For example:

py
dev = qml.device("lightning.qubit", wires=3)

@qml.qnode(dev)
def circuit(x):
    qml.RX(x, wires=0)
    qml.CNOT(wires=[0, 1])
    qml.RX(x, wires=2)
    return qml.probs()

@qml.qjit
@catalyst.value_and_grad
def cost(x):
    return jnp.sum(jnp.cos(circuit(x)))


pycon
>>> cost(0.543)
(array(7.64695856), array(0.33413963))


* Autograph now supports single index JAX array assignment [(717)](https://github.com/PennyLaneAI/catalyst/pull/717)

When using Autograph, syntax of the form `x[i] = y`, where `i` is a single integer, will now be automatically converted to the JAX equivalent `x = x.at[i].set(y)`:

python
@qml.qjit(autograph=True)
def f(array):
    result = jnp.ones(array.shape, dtype=array.dtype)

    for i, x in enumerate(array):
        result[i] = result[i] + x * 3

    return result


pycon
>>> f(jnp.array([-0.1, 0.12, 0.43, 0.54]))
array([0.7 , 1.36, 2.29, 2.62])
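
To make the semantics of the converted form concrete, here is a minimal pure-Python sketch of a functional (copy-on-update) array. `FrozenArray`, `_At` and `_Ref` are hypothetical names invented for this illustration; they only mimic the `.at[i].set(y)` calling convention and are not how JAX or Autograph implement it.

```python
class FrozenArray:
    """Hypothetical immutable 1D array with a JAX-like `.at[i].set(y)` update."""

    def __init__(self, values):
        self._values = tuple(values)

    def __getitem__(self, index):
        return self._values[index]

    def __len__(self):
        return len(self._values)

    @property
    def at(self):
        return _At(self)

    def tolist(self):
        return list(self._values)


class _At:
    """Returned by `FrozenArray.at`; indexing it selects the element to update."""

    def __init__(self, array):
        self._array = array

    def __getitem__(self, index):
        return _Ref(self._array, index)


class _Ref:
    def __init__(self, array, index):
        self._array = array
        self._index = index

    def set(self, value):
        # Copy, update the copy, and return a new array; the original is untouched.
        values = self._array.tolist()
        values[self._index] = value
        return FrozenArray(values)


def f(array):
    result = FrozenArray([1.0] * len(array))
    for i, x in enumerate(array):
        # Functional form of `result[i] = result[i] + x * 3`.
        result = result.at[i].set(result[i] + x * 3)
    return result


out = f([-0.1, 0.12, 0.43, 0.54])
print([round(v, 2) for v in out.tolist()])  # [0.7, 1.36, 2.29, 2.62]
```

The key point is that each assignment rebinds `result` to a new array rather than mutating the old one, which is what makes the pattern compatible with traced, immutable JAX arrays.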


* Catalyst now supports dynamically-shaped arrays in control-flow primitives. Arrays with dynamic shapes can now be used with `for_loop`, `while_loop`, and `cond` primitives. [(775)](https://github.com/PennyLaneAI/catalyst/pull/775) [(#777)](https://github.com/PennyLaneAI/catalyst/pull/777) [(#830)](https://github.com/PennyLaneAI/catalyst/pull/830)

python
@qjit
def f(shape):
    a = jnp.ones([shape], dtype=float)

    @for_loop(0, 10, 2)
    def loop(i, a):
        return a + i

    return loop(a)

pycon
>>> f(3)
array([21., 21., 21.])


* Support has been added for disabling Autograph for specific functions. [(705)](https://github.com/PennyLaneAI/catalyst/pull/705) [(#710)](https://github.com/PennyLaneAI/catalyst/pull/710)

The decorator `catalyst.disable_autograph` allows one to disable Autograph from auto-converting specific external functions when called within a qjit-compiled function with `autograph=True`:

python
def approximate_e(n):
    num = 1.
    fac = 1.
    for i in range(1, n + 1):
        fac *= i
        num += 1. / fac
    return num

@qml.qjit(autograph=True)
def g(x: float, N: int):

    for i in range(N):
        x = x + catalyst.disable_autograph(approximate_e)(10) / x ** i

    return x


pycon
>>> g(0.1, 10)
array(4.02997319)


Note that for Autograph to be disabled, the decorated function must be defined **outside** the qjit-compiled function. If it is defined within the qjit-compiled function, it will continue to be converted with Autograph.

In addition, Autograph can also be disabled for all externally defined functions within a qjit-compiled function via the context manager syntax:

python
@qml.qjit(autograph=True)
def g(x: float, N: int):

    for i in range(N):
        with catalyst.disable_autograph:
            x = x + approximate_e(10) / x ** i

    return x


* Support has been added for including a list of (sub)modules to be allowlisted for Autograph conversion. [(#725)](https://github.com/PennyLaneAI/catalyst/pull/725)

Although library code is not meant to be targeted by Autograph conversion, it sometimes makes sense to enable it for specific submodules that might benefit from such conversion:

py
@qjit(autograph=True, autograph_include=["excluded_module.submodule"])
def f(x):
    return excluded_module.submodule.func(x)



For example, this might be useful when importing functionality from PennyLane (such as a transform or decomposition) whose control flow you would like Autograph to capture and convert.

* Controlled operations that do not have a matrix representation defined are now supported via applying PennyLane's decomposition. [(831)](https://github.com/PennyLaneAI/catalyst/pull/831)

python
@qjit
@qml.qnode(qml.device("lightning.qubit", wires=2))
def circuit():
    qml.Hadamard(0)
    qml.ctrl(qml.TrotterProduct(H, time=2.4, order=2), control=[1])
    return qml.state()


* Catalyst is now officially supported on Linux aarch64, with pre-built binaries available on PyPI; simply `pip install pennylane-catalyst` on Linux aarch64 systems. [(#767)](https://github.com/PennyLaneAI/catalyst/pull/767)

<h3>Improvements</h3>

* Validation is now performed for observables and operations to ensure that provided circuits are compatible with the devices for execution.
[(626)](https://github.com/PennyLaneAI/catalyst/pull/626) [(#783)](https://github.com/PennyLaneAI/catalyst/pull/783)

python
dev = qml.device("lightning.qubit", wires=2, shots=10000)

@qjit
@qml.qnode(dev)
def circuit(x):
    qml.Hadamard(wires=0)
    qml.CRX(x, wires=[0, 1])
    return qml.var(qml.PauliZ(1))


pycon
>>> circuit(0.43)
DifferentiableCompileError: Variance returns are forbidden in gradients


* Catalyst's adjoint and ctrl methods are now fully compatible with the PennyLane equivalents when applied to a single Operator. This should lead to improved compatibility with PennyLane library code, as well as when reusing quantum functions with both Catalyst and PennyLane. [(#768)](https://github.com/PennyLaneAI/catalyst/pull/768) [(#771)](https://github.com/PennyLaneAI/catalyst/pull/771) [(#802)](https://github.com/PennyLaneAI/catalyst/pull/802)

* Controlled operations defined via specialized classes (like `Toffoli` or `ControlledQubitUnitary`) are now implemented as controlled versions of their base operation if the device supports it.
In particular, `MultiControlledX` is no longer executed as a `QubitUnitary` with Lightning. [(792)](https://github.com/PennyLaneAI/catalyst/pull/792)

* The Catalyst frontend now supports Python logging through PennyLane's `qml.logging` module. For more details, please see the [logging documentation](https://docs.pennylane.ai/en/stable/introduction/logging.html). [(#660)](https://github.com/PennyLaneAI/catalyst/pull/660)

* Catalyst now performs a stricter validation of the wire requirements for devices. In particular, only integer, continuous wire labels starting at 0 are allowed. [(784)](https://github.com/PennyLaneAI/catalyst/pull/784)

* Catalyst no longer disallows quantum circuits with 0 qubits. [(784)](https://github.com/PennyLaneAI/catalyst/pull/784)

* Added support for `IsingZZ` as a native gate in Catalyst. Previously, the `IsingZZ` gate would be decomposed into CNOT and RZ gates, even if a device supported it. [(#730)](https://github.com/PennyLaneAI/catalyst/pull/730)

* All decorators in Catalyst, including `vmap`, `qjit`, `mitigate_with_zne`, as well as gradient decorators `grad`, `jacobian`, `jvp`, and `vjp`, can now be used both with and without keyword arguments as a decorator without the need for `functools.partial`: [(758)](https://github.com/PennyLaneAI/catalyst/pull/758) [(#761)](https://github.com/PennyLaneAI/catalyst/pull/761) [(#762)](https://github.com/PennyLaneAI/catalyst/pull/762) [(#763)](https://github.com/PennyLaneAI/catalyst/pull/763)

python
@qjit
@grad(method="fd")
def fn1(x):
    return x ** 2

@qjit(autograph=True)
@grad
def fn2(x):
    return jnp.sin(x)


pycon
>>> fn1(0.43)
array(0.8600001)
>>> fn2(0.12)
array(0.99280864)


* The built-in instrumentation with `detailed` output will no longer report the cumulative time for MLIR pipelines, since the cumulative time was being reported as just another step alongside individual timings for each pipeline. [(772)](https://github.com/PennyLaneAI/catalyst/pull/772)

* Raise a better error message when no shots are specified and `qml.sample` or `qml.counts` is used. [(786)](https://github.com/PennyLaneAI/catalyst/pull/786)

* The finite difference method for differentiation is now always allowed, even on functions with mid-circuit measurements, callbacks without custom derivatives, or other operations that cannot be differentiated via traditional autodiff. [(#789)](https://github.com/PennyLaneAI/catalyst/pull/789)
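
Finite differences work on such functions because they only need function evaluations, never a traced derivative rule. A minimal sketch of a forward difference follows; the helper name `fd_grad` and the step size `h=1e-7` are assumptions chosen for illustration and are not taken from Catalyst.

```python
import math


def fd_grad(fn, h=1e-7):
    """Return a function approximating d(fn)/dx with a forward difference."""

    def grad_fn(x):
        return (fn(x + h) - fn(x)) / h

    return grad_fn


def black_box(x):
    # floor() gives autodiff nothing useful to work with, but the function
    # can still be evaluated, which is all a finite difference requires.
    return math.floor(x) + x ** 2


print(fd_grad(lambda x: x ** 2)(0.43))  # close to 0.86
print(fd_grad(black_box)(0.43))         # also close to 0.86
```

The trade-off is accuracy: the result carries a truncation error proportional to `h` and is sensitive to noise, for example from finite-shot sampling.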

* A `non_commuting_observables` flag has been added to the device TOML schema, indicating whether or not the device supports measuring non-commuting observables. If `false`, non-commuting measurements will be split into multiple executions. [(821)](https://github.com/PennyLaneAI/catalyst/pull/821)

* The underlying PennyLane `Operation` objects for `cond`, `for_loop`, and `while_loop` can now be accessed directly via `body_function.operation`. [(#711)](https://github.com/PennyLaneAI/catalyst/pull/711)

This can be beneficial when, among other things, writing transforms without using the queuing mechanism:

python
@qml.transform
def my_quantum_transform(tape):
    ops = tape.operations.copy()

    @for_loop(0, 4, 1)
    def f(i, sum):
        qml.Hadamard(0)
        return sum+1

    res = f(0)
    ops.append(f.operation)   # This is now supported!

    def post_processing_fn(results):
        return results
    modified_tape = qml.tape.QuantumTape(ops, tape.measurements)
    print(res)
    print(modified_tape.operations)
    return [modified_tape], post_processing_fn

@qml.qjit
@my_quantum_transform
@qml.qnode(qml.device("lightning.qubit", wires=2))
def main():
    qml.Hadamard(0)
    return qml.probs()


pycon
>>> main()
Traced<ShapedArray(int64[], weak_type=True)>with<DynamicJaxprTrace(level=2/1)>
[Hadamard(wires=[0]), ForLoop(tapes=[[Hadamard(wires=[0])]])]
(array([0.5, 0. , 0.5, 0. ]),)


<h3>Breaking changes</h3>

* Binary distributions for Linux are now based on `manylinux_2_28` instead of `manylinux_2014`. As a result, Catalyst will only be compatible with systems that have `glibc` version `2.28` or above (e.g., Ubuntu 20.04 and above). [(#663)](https://github.com/PennyLaneAI/catalyst/pull/663)

<h3>Bug fixes</h3>

* Functions that have been annotated with return type annotations will now correctly compile with `qjit`. [(751)](https://github.com/PennyLaneAI/catalyst/pull/751)

* An issue in the Lightning backend for the Catalyst runtime has been fixed that would only compute approximate probabilities when implementing mid-circuit measurements. As a result, low shot numbers would lead to unexpected behaviours or projections on zero probability states. Probabilities for mid-circuit measurements are now always computed analytically. [(801)](https://github.com/PennyLaneAI/catalyst/pull/801)

* The Catalyst runtime now raises an error if a qubit is accessed out of bounds from the allocated register. [(784)](https://github.com/PennyLaneAI/catalyst/pull/784)

* `jax.scipy.linalg.expm` is now supported within qjit-compiled functions. [(733)](https://github.com/PennyLaneAI/catalyst/pull/733) [(#752)](https://github.com/PennyLaneAI/catalyst/pull/752)

This required correctly linking openblas routines necessary for `jax.scipy.linalg.expm`. In this bug fix, four openblas routines were newly linked and are now discoverable by `stablehlo.custom_call<blas_routine>`. They are `blas_dtrsm`, `blas_ztrsm`, `lapack_dgetrf`, `lapack_zgetrf`.

* Fixes a bug where QNodes that contained `QubitUnitary` with a complex matrix would error during gradient computation. [(778)](https://github.com/PennyLaneAI/catalyst/pull/778)

* Callbacks can now return types which can be flattened and unflattened. [(812)](https://github.com/PennyLaneAI/catalyst/pull/812)

* `catalyst.qjit` and `catalyst.grad` now work correctly on functions that have been wrapped with `functools.partial`. [(820)](https://github.com/PennyLaneAI/catalyst/pull/820)

<h3>Internal changes</h3>

* Catalyst uses the `collapse` method of Lightning simulators in `Measure` to select a state vector branch and normalize. [(801)](https://github.com/PennyLaneAI/catalyst/pull/801)
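
As an illustration of what "select a state vector branch and normalize" means, here is a minimal pure-Python sketch of a projective measurement. It is not Lightning's `collapse` implementation: the function name `measure_and_collapse`, its signature, and the convention that wire 0 is the most significant bit are all assumptions made for this example.

```python
import math
import random


def measure_and_collapse(state, wire, num_wires, rng=random):
    """Measure `wire` of a state vector and return (outcome, collapsed_state).

    `state` is a list of 2**num_wires complex amplitudes; wire 0 is the most
    significant bit of the basis-state index (an assumed convention).
    """
    shift = num_wires - 1 - wire
    # Exact probability of outcome 1, computed from the amplitudes (no sampling noise).
    prob_one = sum(abs(amp) ** 2 for idx, amp in enumerate(state) if (idx >> shift) & 1)
    # A branch with probability exactly 0 can never be selected here.
    outcome = 1 if rng.random() < prob_one else 0
    norm = math.sqrt(prob_one if outcome else 1.0 - prob_one)
    # Keep only the amplitudes consistent with the outcome, then renormalize.
    collapsed = [
        amp / norm if ((idx >> shift) & 1) == outcome else 0j
        for idx, amp in enumerate(state)
    ]
    return outcome, collapsed


amp = 1 / math.sqrt(2)
bell = [amp, 0j, 0j, amp]  # (|00> + |11>) / sqrt(2)
outcome, collapsed = measure_and_collapse(bell, wire=0, num_wires=2)
print(outcome, collapsed)
```

Measuring the first qubit of the Bell state leaves the register in either `|00>` or `|11>`, each with unit norm.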

* Measurement process primitives for Catalyst's JAXPR representation now have a standardized call signature so that `shots` and `shape` can both be provided as keyword arguments. [(790)](https://github.com/PennyLaneAI/catalyst/pull/790)

* The `QCtrl` class in Catalyst has been renamed to `HybridCtrl`, indicating its capability to contain a nested scope of both quantum and classical operations. Using `ctrl` on a single operation will now directly dispatch to the equivalent PennyLane class. [(771)](https://github.com/PennyLaneAI/catalyst/pull/771)

* The `Adjoint` class in Catalyst has been renamed to `HybridAdjoint`, indicating its capability to contain a nested scope of both quantum and classical operations. Using `adjoint` on a single operation will now directly dispatch to the equivalent PennyLane class. [(768)](https://github.com/PennyLaneAI/catalyst/pull/768) [(#802)](https://github.com/PennyLaneAI/catalyst/pull/802)

* Add support to use a locally cloned PennyLane Lightning repository with the runtime. [(732)](https://github.com/PennyLaneAI/catalyst/pull/732)

* The `qjit_device.py` and `preprocessing.py` modules have been refactored into the sub-package `catalyst.device`. [(721)](https://github.com/PennyLaneAI/catalyst/pull/721)

* The `ag_autograph.py` and `autograph.py` modules have been refactored into the sub-package `catalyst.autograph`. [(722)](https://github.com/PennyLaneAI/catalyst/pull/722)

* Callback refactoring. This refactoring creates the classes `FlatCallable` and `MemrefCallable`. [(742)](https://github.com/PennyLaneAI/catalyst/pull/742)

The `FlatCallable` class is a `Callable` that is initialized by providing some positional and keyword parameters that match the expected shapes that will be received at the call site. Instead of taking shaped `*args` and `**kwargs`, it receives flattened arguments. The flattened arguments are unflattened with the shapes with which the function was initialized. The `FlatCallable` return values will always be flattened before returning to the caller.

The `MemrefCallable` is a subclass of `FlatCallable`. It takes a result type parameter during initialization that corresponds to the expected return type. This class is expected to be called only from the Catalyst runtime. It expects all arguments to be `void*` to memrefs. These `void*` are cast to MemrefStructDescriptors using ctypes, then converted to numpy arrays, and finally to jax arrays. These flat jax arrays are then sent to the `FlatCallable`. The return values match those expected by the Catalyst runtime.

This separation allows for a better separation of concerns, provides a nicer interface and allows for multiple `MemrefCallable` to be defined for a single callback, which is necessary for custom gradient of `pure_callbacks`.
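
The flatten/unflatten idea behind `FlatCallable` can be sketched in pure Python as follows. This is an illustration of the concept only: `flatten`, `unflatten` and `FlatCallableSketch` are hypothetical names, and the real class works with JAX pytrees and shapes rather than the nested lists, tuples and dicts used here.

```python
def flatten(tree):
    """Flatten nested lists/tuples/dicts into (leaves, structure)."""
    if isinstance(tree, (list, tuple)):
        leaves, children = [], []
        for item in tree:
            sub_leaves, sub_struct = flatten(item)
            leaves.extend(sub_leaves)
            children.append(sub_struct)
        return leaves, (type(tree), children)
    if isinstance(tree, dict):
        leaves, children = [], []
        for key in sorted(tree):  # fixed key order keeps flattening deterministic
            sub_leaves, sub_struct = flatten(tree[key])
            leaves.extend(sub_leaves)
            children.append((key, sub_struct))
        return leaves, (dict, children)
    return [tree], None  # a leaf


def unflatten(structure, leaves):
    """Rebuild a tree from `structure`, consuming values from the `leaves` iterator."""
    if structure is None:
        return next(leaves)
    kind, children = structure
    if kind is dict:
        return {key: unflatten(child, leaves) for key, child in children}
    return kind(unflatten(child, leaves) for child in children)


class FlatCallableSketch:
    """Hypothetical stand-in: wraps `fn` so it takes and returns flat lists."""

    def __init__(self, fn, *example_args, **example_kwargs):
        self._fn = fn
        # Record the argument structure once, from example values.
        _, self._structure = flatten((example_args, example_kwargs))

    def __call__(self, flat_args):
        args, kwargs = unflatten(self._structure, iter(flat_args))
        flat_result, _ = flatten(self._fn(*args, **kwargs))
        return flat_result


def scale_pair(pair, factor=1.0):
    a, b = pair
    return {"sum": (a + b) * factor, "scaled": [a * factor, b * factor]}


flat_fn = FlatCallableSketch(scale_pair, (0.0, 0.0), factor=0.0)
print(flat_fn([1.0, 2.0, 10.0]))  # [10.0, 20.0, 30.0]
```

The caller only ever sees flat lists, while `scale_pair` still receives its arguments in their original nested form.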

* A new `catalyst::gradient::GradientOpInterface` is available when querying the gradient method in the MLIR C++ API. [(#800)](https://github.com/PennyLaneAI/catalyst/pull/800)

`catalyst::gradient::GradOp`, `ValueAndGradOp`, `JVPOp`, and `VJPOp` now inherit traits from this new `GradientOpInterface`. The supported attributes are now `getMethod()`, `getCallee()`, `getDiffArgIndices()`, `getDiffArgIndicesAttr()`, `getFiniteDiffParam()`, and `getFiniteDiffParamAttr()`.

- There are operations that could potentially be used as `GradOp`, `ValueAndGradOp`, `JVPOp` or `VJPOp`. When trying to get the gradient method, instead of doing
C++
auto gradOp = dyn_cast<GradOp>(op);
auto jvpOp = dyn_cast<JVPOp>(op);
auto vjpOp = dyn_cast<VJPOp>(op);

llvm::StringRef MethodName;
if (gradOp)
    MethodName = gradOp.getMethod();
else if (jvpOp)
    MethodName = jvpOp.getMethod();
else if (vjpOp)
    MethodName = vjpOp.getMethod();

to identify which op it actually is and protect against segfaults (calling `nullptr.getMethod()`), in the new interface we just do
C++
auto gradOpInterface = cast<GradientOpInterface>(op);
llvm::StringRef MethodName = gradOpInterface.getMethod();


- Another advantage is that any concrete gradient operation object can behave like a `GradientOpInterface`:
C++
GradOp op; // or ValueAndGradOp op, ...
auto foo = [](GradientOpInterface op){
    llvm::errs() << op.getCallee();
};
foo(op); // this works!


- Finally, concrete op specific methods can still be called by "reinterpret"-casting the interface back to a concrete op (provided the concrete op type is correct):
C++
auto foo = [](GradientOpInterface op){
    size_t numGradients = cast<ValueAndGradOp>(&op)->getGradients().size();
};
ValueAndGradOp op;
foo(op); // this works!


<h3>Contributors</h3>

This release contains contributions from (in alphabetical order):

Ali Asadi,
Lillian M.A. Frederiksen,
David Ittah,
Christina Lee,
Erick Ochoa,
Haochen Paul Wang,
Lee James O'Riordan,
Mehrdad Malekmohammadi,
Vincent Michaud-Rioux,
Mudit Pandey,
Raul Torres,
Sergei Mironov,
Tzung-Han Juang.

0.6.1

New features since last release

* Added a `print_applied` method to QNodes, allowing the operation and observable queue to be printed as last constructed. [378](https://github.com/XanaduAI/pennylane/pull/378)

Improvements

* A new `Operator` base class is introduced, which is inherited by both the `Observable` class and the `Operation` class. [355](https://github.com/XanaduAI/pennylane/pull/355)

* Removed deprecated `abstractproperty` decorators in `_device.py`. [374](https://github.com/XanaduAI/pennylane/pull/374)

* Comprehensive gradient tests have been added for the interfaces. [381](https://github.com/XanaduAI/pennylane/pull/381)

Documentation

* The new restructured documentation has been polished and updated. [387](https://github.com/XanaduAI/pennylane/pull/387) [#375](https://github.com/XanaduAI/pennylane/pull/375) [#372](https://github.com/XanaduAI/pennylane/pull/372) [#370](https://github.com/XanaduAI/pennylane/pull/370) [#369](https://github.com/XanaduAI/pennylane/pull/369) [#367](https://github.com/XanaduAI/pennylane/pull/367) [#364](https://github.com/XanaduAI/pennylane/pull/364)

* Updated the development guides. [382](https://github.com/XanaduAI/pennylane/pull/382) [#379](https://github.com/XanaduAI/pennylane/pull/379)

* Added all modules, classes, and functions to the API section in the documentation. [373](https://github.com/XanaduAI/pennylane/pull/373)

Bug fixes

* Replaces the existing `np.linalg.norm` normalization with hand-coded normalization, allowing `AmplitudeEmbedding` to be used with differentiable parameters. `AmplitudeEmbedding` tests have been added and improved. [#376](https://github.com/XanaduAI/pennylane/pull/376)

Contributors

This release contains contributions from (in alphabetical order):

Josh Izaac, Nathan Killoran, Maria Schuld, Antal Száva

0.6.0

<h3>New features</h3>

* Catalyst now supports externally hosted callbacks with parameters and return values within qjit-compiled code. This provides the ability to insert native Python code into any qjit-compiled function, allowing for the capability to include subroutines that do not yet support qjit-compilation and enhancing the debugging experience. [(540)](https://github.com/PennyLaneAI/catalyst/pull/540) [(#596)](https://github.com/PennyLaneAI/catalyst/pull/596) [(#610)](https://github.com/PennyLaneAI/catalyst/pull/610) [(#650)](https://github.com/PennyLaneAI/catalyst/pull/650) [(#649)](https://github.com/PennyLaneAI/catalyst/pull/649) [(#661)](https://github.com/PennyLaneAI/catalyst/pull/661) [(#686)](https://github.com/PennyLaneAI/catalyst/pull/686) [(#689)](https://github.com/PennyLaneAI/catalyst/pull/689)

The following two callback functions are available:

- `catalyst.pure_callback` supports callbacks of **pure** functions. That is, functions with no [side-effects](https://runestone.academy/ns/books/published/fopp/Functions/SideEffects.html) that accept parameters and return values. However, the return type and shape of the function must be known in advance, and are provided as a type signature.

python
@pure_callback
def callback_fn(x) -> float:
    # here we call non-JAX compatible code, such
    # as standard NumPy
    return np.sin(x)

@qjit
def fn(x):
    return jnp.cos(callback_fn(x ** 2))

pycon
>>> fn(0.654)
array(0.9151995)


- `catalyst.debug.callback` supports callbacks of functions with **no** return values. This makes it an easy entry point for debugging, for example via printing or logging at runtime.

python
@catalyst.debug.callback
def callback_fn(y):
    print("Value of y =", y)

@qjit
def fn(x):
    y = jnp.sin(x)
    callback_fn(y)
    return y ** 2

pycon
>>> fn(0.54)
Value of y = 0.5141359916531132
array(0.26433582)
>>> fn(1.52)
Value of y = 0.998710143975583
array(0.99742195)


Note that callbacks do not currently support differentiation, and cannot be used inside functions that `catalyst.grad` is applied to.

* More flexible runtime printing through support for format strings. [(621)](https://github.com/PennyLaneAI/catalyst/pull/621)

The `catalyst.debug.print` function has been updated to support Python-like format strings:

python
@qjit
def cir(a, b, c):
    debug.print("{c} {b} {a}", a=a, b=b, c=c)


pycon
>>> cir(1, 2, 3)
3 2 1


Note that previous functionality of the print function to print out memory reference information of variables has been moved to `catalyst.debug.print_memref`.

* Catalyst now supports QNodes that execute on [Oxford Quantum Circuits (OQC)](https://www.oqc.tech/) superconducting hardware, via [OQC Cloud](https://docs.oqc.app). [(#578)](https://github.com/PennyLaneAI/catalyst/pull/578) [(#579)](https://github.com/PennyLaneAI/catalyst/pull/579) [(#691)](https://github.com/PennyLaneAI/catalyst/pull/691)

To use OQC Cloud with Catalyst, simply ensure your credentials are set as environment variables, and load the `oqc.cloud` device to be used within your qjit-compiled workflows.

python
import os
os.environ["OQC_EMAIL"] = "your_email"
os.environ["OQC_PASSWORD"] = "your_password"
os.environ["OQC_URL"] = "oqc_url"

dev = qml.device("oqc.cloud", backend="lucy", shots=2012, wires=2)

@qjit
@qml.qnode(dev)
def circuit(a: float):
    qml.Hadamard(0)
    qml.CNOT(wires=[0, 1])
    qml.RX(a, wires=0)
    return qml.counts(wires=[0, 1])

print(circuit(0.2))


* Catalyst now ships with an instrumentation feature that makes it possible to explore which steps are run during compilation and execution, and for how long. [(#528)](https://github.com/PennyLaneAI/catalyst/pull/528) [(#597)](https://github.com/PennyLaneAI/catalyst/pull/597)

Instrumentation can be enabled from the frontend with the `catalyst.debug.instrumentation` context manager:

pycon
>>> @qjit
... def expensive_function(a, b):
...     return a + b
>>> with debug.instrumentation("session_name", detailed=False):
...     expensive_function(1, 2)
[DIAGNOSTICS] Running capture walltime: 3.299 ms cputime: 3.294 ms programsize: 0 lines
[DIAGNOSTICS] Running generate_ir walltime: 4.228 ms cputime: 4.225 ms programsize: 14 lines
[DIAGNOSTICS] Running compile walltime: 57.182 ms cputime: 12.109 ms programsize: 121 lines
[DIAGNOSTICS] Running run walltime: 1.075 ms cputime: 1.072 ms


The results will be appended to the provided file if the `filename` attribute is set, and printed to the console otherwise. The flag `detailed` determines whether individual steps in the compiler and runtime are instrumented, or whether only high-level steps like "program capture" and "compilation" are reported.

Measurements currently include wall time, CPU time, and (intermediate) program size.
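
The difference between wall time and CPU time can be seen with a small standard-library sketch. This is not Catalyst's instrumentation code: the `timed` context manager and the `[SKETCH]` output format are hypothetical and only mirror the kind of measurements listed above.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(step, results):
    """Record wall time and CPU time (in ms) for the enclosed block."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    try:
        yield
    finally:
        results.append(
            {
                "step": step,
                "walltime_ms": (time.perf_counter() - wall_start) * 1e3,
                "cputime_ms": (time.process_time() - cpu_start) * 1e3,
            }
        )


results = []
with timed("compute", results):
    total = sum(i * i for i in range(100_000))
with timed("sleep", results):
    time.sleep(0.01)  # wall time passes, but almost no CPU time is used

for entry in results:
    print(
        f"[SKETCH] Running {entry['step']:<10} "
        f"walltime: {entry['walltime_ms']:.3f} ms "
        f"cputime: {entry['cputime_ms']:.3f} ms"
    )
```

A large gap between the two numbers, as in the `compile` step of the output above, typically indicates time spent waiting (for example on subprocesses or I/O) rather than computing in the measured process.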

<h3>Improvements</h3>

* AutoGraph now supports return statements inside conditionals in qjit-compiled functions. [(583)](https://github.com/PennyLaneAI/catalyst/pull/583)

For example, the following pattern is now supported, as long as all return values have the same type:

python
@qjit(autograph=True)
def fn(x):
    if x > 0:
        return jnp.sin(x)
    return jnp.cos(x)


pycon
>>> fn(0.1)
array(0.09983342)
>>> fn(-0.1)
array(0.99500417)


This support extends to quantum circuits:

python
dev = qml.device("lightning.qubit", wires=1)

@qjit(autograph=True)
@qml.qnode(dev)
def f(x: float):
    qml.RX(x, wires=0)

    m = catalyst.measure(0)

    if not m:
        return m, qml.expval(qml.PauliZ(0))

    qml.RX(x ** 2, wires=0)

    return m, qml.expval(qml.PauliZ(0))


pycon
>>> f(1.4)
(array(False), array(1.))
>>> f(1.4)
(array(True), array(0.37945176))


Note that returning results with different types or shapes within the same function, such as different observables or differently shaped arrays, is not possible.

* Errors are now raised at compile time if the gradient of an unsupported function is requested. [(204)](https://github.com/PennyLaneAI/catalyst/pull/204)

At the moment, `CompileError` exceptions will be raised if at compile time it is found that code reachable from the gradient operation contains either a mid-circuit measurement, a callback, or a JAX-style custom call (which happens through the mitigation operation as well as certain JAX operations).

* Catalyst now supports devices built from the [new PennyLane device API](https://docs.pennylane.ai/en/stable/code/api/pennylane.devices.Device.html). [(#565)](https://github.com/PennyLaneAI/catalyst/pull/565) [(#598)](https://github.com/PennyLaneAI/catalyst/pull/598) [(#599)](https://github.com/PennyLaneAI/catalyst/pull/599) [(#636)](https://github.com/PennyLaneAI/catalyst/pull/636) [(#638)](https://github.com/PennyLaneAI/catalyst/pull/638) [(#664)](https://github.com/PennyLaneAI/catalyst/pull/664) [(#687)](https://github.com/PennyLaneAI/catalyst/pull/687)

When using the new device API, Catalyst will discard the preprocessing from the original device, replacing it with Catalyst-specific preprocessing based on the TOML file provided by the device. Catalyst also requires that provided devices specify their wires upfront.

* A new compiler optimization that removes redundant chains of self-inverse operations has been added. This is done within a new MLIR pass called `remove-chained-self-inverse`. Currently, only redundant Hadamard operations are matched, but the list of supported operations can be expanded. [(#630)](https://github.com/PennyLaneAI/catalyst/pull/630)
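
The idea behind this optimization can be sketched on a flat list of operations. The code below is a simplified illustration, not the MLIR pass: the `(gate_name, wires)` tuple representation and the Python function are assumptions, and unlike a compiler pass working on value chains, this sketch only cancels gates that are directly adjacent in the list.

```python
SELF_INVERSE = {"Hadamard"}  # per the note above, only Hadamard is matched for now


def remove_chained_self_inverse(ops):
    """Cancel adjacent pairs of identical self-inverse gates on the same wires.

    `ops` is a list of (gate_name, wires) tuples in program order.
    """
    out = []
    for op in ops:
        name, _ = op
        if out and name in SELF_INVERSE and out[-1] == op:
            out.pop()  # H followed by H on the same wire is the identity
        else:
            out.append(op)
    return out


circuit = [
    ("Hadamard", (0,)),
    ("Hadamard", (0,)),
    ("RX", (0,)),
    ("Hadamard", (1,)),
    ("Hadamard", (1,)),
    ("Hadamard", (1,)),
]
print(remove_chained_self_inverse(circuit))
# [('RX', (0,)), ('Hadamard', (1,))]
```

Using the output list as a stack means longer chains reduce correctly: an even number of repeated gates disappears entirely, and an odd number leaves a single gate.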

* The `catalyst.measure` operation is now more lenient in the accepted type for the `wires` parameter. In addition to a scalar, a 1D array is also accepted as long as it only contains one element. [(623)](https://github.com/PennyLaneAI/catalyst/pull/623)

For example, the following is now supported:

python
catalyst.measure(wires=jnp.array([0]))


* The compilation & execution of `qjit` compiled functions can now be aborted using an interrupt signal (SIGINT). This includes using `CTRL-C` from a command line and the `Interrupt` button in a Jupyter Notebook. [(642)](https://github.com/PennyLaneAI/catalyst/pull/642)

* The Catalyst Amazon Braket support has been updated to work with the latest version of the Amazon Braket PennyLane plugin (v1.25.0) and Amazon Braket Python SDK (v1.73.3). [(#620)](https://github.com/PennyLaneAI/catalyst/pull/620) [(#672)](https://github.com/PennyLaneAI/catalyst/pull/672) [(#673)](https://github.com/PennyLaneAI/catalyst/pull/673)

Note that with this update, all declared qubits in a submitted program will always be measured, even if specific qubits were never used.

* An updated quantum device specification format, TOML schema v2, is now supported by Catalyst. This allows device authors to specify properties such as native quantum control support, gate invertibility, and differentiability on a per-operation level. [(554)](https://github.com/PennyLaneAI/catalyst/pull/554)

For more details on the new TOML schema, please refer to the [custom devices documentation](https://docs.pennylane.ai/projects/catalyst/en/latest/dev/custom_devices.html).

* An exception is now raised when OpenBLAS cannot be found by Catalyst during compilation. [(643)](https://github.com/PennyLaneAI/catalyst/pull/643)

<h3>Breaking changes</h3>

* `qml.sample` and `qml.counts` now produce integer arrays for the sample array and basis state array when used without observables. [(648)](https://github.com/PennyLaneAI/catalyst/pull/648)

* The endianness of counts in Catalyst now matches the convention of PennyLane. [(601)](https://github.com/PennyLaneAI/catalyst/pull/601)

* `catalyst.debug.print` no longer supports the `memref` keyword argument. Please use `catalyst.debug.print_memref` instead. [(621)](https://github.com/PennyLaneAI/catalyst/pull/621)

<h3>Bug fixes</h3>

* The QNode argument `diff_method=None` is now supported for QNodes within a qjit-compiled function. [(658)](https://github.com/PennyLaneAI/catalyst/pull/658)

* A bug has been fixed where the C++ compiler driver was incorrectly being triggered twice. [(594)](https://github.com/PennyLaneAI/catalyst/pull/594)

* Programs with `jnp.reshape` no longer fail. [(592)](https://github.com/PennyLaneAI/catalyst/pull/592)

* A bug in the quantum adjoint routine in the compiler has been fixed, which didn't take into account control wires on operations in all instances. [(591)](https://github.com/PennyLaneAI/catalyst/pull/591)

* A bug in the test suite causing stochastic autograph test failures has been fixed. [(652)](https://github.com/PennyLaneAI/catalyst/pull/652)

* Running Catalyst tests should no longer raise `ResourceWarning` from the use of `tempfile.TemporaryDirectory`. [(676)](https://github.com/PennyLaneAI/catalyst/pull/676)

* Raises an exception if the user has an incompatible CUDA Quantum version installed. [(707)](https://github.com/PennyLaneAI/catalyst/pull/707)

<h3>Internal changes</h3>

* The deprecated `qfunc` decorator, in use mainly by the LIT test suite, has been removed. [(679)](https://github.com/PennyLaneAI/catalyst/pull/679)

* Catalyst now publishes a revision string under `catalyst.__revision__`, in addition to the existing `catalyst.__version__` string. The revision contains the Git commit hash of the repository at the time of packaging, or for editable installations the active commit hash at the time of package import. [(560)](https://github.com/PennyLaneAI/catalyst/pull/560)

* The Python interpreter is now a shared resource across the runtime. [(615)](https://github.com/PennyLaneAI/catalyst/pull/615)

This change allows any part of the runtime to start executing Python code through pybind.

<h3>Contributors</h3>

This release contains contributions from (in alphabetical order):

Ali Asadi,
David Ittah,
Romain Moyard,
Sergei Mironov,
Erick Ochoa Lopez,
Lee James O'Riordan,
Muzammiluddin Syed.

0.5.0

<h3>New features</h3>

* Catalyst now provides a QJIT compatible `catalyst.vmap` function, which makes it even easier to modify functions to map over inputs with additional batch dimensions. [(497)](https://github.com/PennyLaneAI/catalyst/pull/497) [(#569)](https://github.com/PennyLaneAI/catalyst/pull/569)

When working with tensor/array frameworks in Python, it can be important to ensure that code is written to minimize usage of Python for loops (which can be slow and inefficient), and instead push as much of the computation through to the array manipulation library, by taking advantage of extra batch dimensions.

For example, consider the following QNode:

python
dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev)
def circuit(x, y):
    qml.RX(jnp.pi * x[0] + y, wires=0)
    qml.RY(x[1] ** 2, wires=0)
    qml.RX(x[1] * x[2], wires=0)
    return qml.expval(qml.PauliZ(0))


pycon
>>> circuit(jnp.array([0.1, 0.2, 0.3]), jnp.pi)
Array(-0.93005586, dtype=float64)


We can use `catalyst.vmap` to introduce additional batch dimensions to our input arguments, without needing to use a Python for loop:

pycon
>>> x = jnp.array([[0.1, 0.2, 0.3],
...                [0.4, 0.5, 0.6],
...                [0.7, 0.8, 0.9]])
>>> y = jnp.array([jnp.pi, jnp.pi / 2, jnp.pi / 4])
>>> qjit(vmap(circuit))(x, y)
array([-0.93005586, -0.97165424, -0.6987465 ])


`catalyst.vmap()` has been implemented to match the behaviour of `jax.vmap`, so it should be a drop-in replacement in most cases. Under the hood, it automatically inserts Catalyst-compatible for loops, which will be compiled and executed outside of Python for increased performance.
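
Conceptually, mapping over a batch dimension with a loop looks like the following pure-Python sketch. `naive_vmap` and `affine` are hypothetical names for illustration; the sketch maps every argument over its leading axis and does not model options such as `in_axes` or `out_axes`, nor the compilation of the loop.

```python
def naive_vmap(fn):
    """Map `fn` over the leading axis of every argument using a plain loop."""

    def batched(*args):
        batch_size = len(args[0])
        if any(len(arg) != batch_size for arg in args):
            raise ValueError("all arguments must share the same batch size")
        # One call per batch entry; a compiler would emit a native loop instead.
        return [fn(*(arg[i] for arg in args)) for i in range(batch_size)]

    return batched


def affine(x, y):
    # Stand-in classical function; it is written for a single (unbatched) input.
    return x[0] * y + x[1]


xs = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
ys = [10.0, 20.0, 30.0]
print(naive_vmap(affine)(xs, ys))  # [12.0, 64.0, 156.0]
```

The function being mapped never needs to know about the batch dimension, which is what makes this a convenient replacement for hand-written loops.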

* Catalyst now supports compiling and executing QJIT-compiled QNodes using the CUDA Quantum compiler toolchain. [(477)](https://github.com/PennyLaneAI/catalyst/pull/477) [(#536)](https://github.com/PennyLaneAI/catalyst/pull/536) [(#547)](https://github.com/PennyLaneAI/catalyst/pull/547)

Simply import the CUDA Quantum `cudaqjit` decorator to use this functionality:

```python
from catalyst.cuda import cudaqjit
```


Or, if using Catalyst from PennyLane, simply specify `qml.qjit(compiler="cuda_quantum")`.

The following devices are available when compiling with CUDA Quantum:

* `softwareq.qpp`: a modern C++ statevector simulator
* `nvidia.custatevec`: the NVIDIA CuStateVec GPU simulator (with support for multi-GPU)
* `nvidia.cutensornet`: the NVIDIA CuTensorNet GPU simulator (with support for matrix product state)

For example:

```python
dev = qml.device("softwareq.qpp", wires=2)

@cudaqjit
@qml.qnode(dev)
def circuit(x):
    qml.RX(x[0], wires=0)
    qml.RY(x[1], wires=1)
    qml.CNOT(wires=[0, 1])
    return qml.expval(qml.PauliY(0))
```


```pycon
>>> circuit(jnp.array([0.5, 1.4]))
-0.47244976756708373
```


Note that CUDA Quantum compilation currently does not have feature parity with Catalyst compilation; in particular, AutoGraph, control flow, differentiation, and various measurement statistics (such as probabilities and variance) are not yet supported. Classical code support is also limited.

* Catalyst now supports just-in-time compilation of static (compile-time constant) arguments. [(#476)](https://github.com/PennyLaneAI/catalyst/pull/476) [(#550)](https://github.com/PennyLaneAI/catalyst/pull/550)

The `qjit` decorator takes a new argument `static_argnums`, which specifies which positional arguments of the decorated function should be treated as compile-time static arguments.

This allows any hashable Python object to be passed to the function during compilation; the function will only be re-compiled if the hash value of the static arguments changes. Re-using previous static argument values does not trigger re-compilation.

```python
@qjit(static_argnums=(1,))
def f(x, y):
    # Printed only while the function is being traced and compiled.
    print(f"Compiling with y={y}")
    return x + y
```


```pycon
>>> f(0.5, 0.3)
Compiling with y=0.3
array(0.8)
>>> f(0.1, 0.3)  # no re-compilation occurs
array(0.4)
>>> f(0.1, 0.4)  # y changes, re-compilation
Compiling with y=0.4
array(0.5)
```
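
The caching rule described above can be pictured with a small plain-Python model: a dictionary keyed by the static argument values, where a new entry is created only when the key has not been seen before. `StaticArgCache` is a hypothetical class written for this illustration; it is not Catalyst's implementation and performs no real compilation.

```python
class StaticArgCache:
    """Toy model: one 'compiled' artifact per distinct tuple of static arguments."""

    def __init__(self, fn, static_argnums):
        self.fn = fn
        self.static_argnums = tuple(static_argnums)
        self.compiled = {}        # static-argument key -> specialised callable
        self.compile_count = 0

    def __call__(self, *args):
        # Dictionary lookup relies on hash() and == of the static values.
        key = tuple(args[i] for i in self.static_argnums)
        if key not in self.compiled:
            self.compile_count += 1
            self.compiled[key] = self.fn   # stand-in for real compilation
        return self.compiled[key](*args)

f = StaticArgCache(lambda x, y: x + y, static_argnums=(1,))
f(0.5, 0.3)   # first call: "compiles" for y=0.3
f(0.1, 0.3)   # same static value: cached entry is reused
f(0.1, 0.4)   # new static value: "compiles" again
```

After these three calls the model has "compiled" twice, mirroring the two `Compiling with y=...` messages in the session above.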


This functionality can be used to support passing arbitrary Python objects to QJIT-compiled functions, as long as they are hashable:

```py
from dataclasses import dataclass

@dataclass
class MyClass:
    val: int

    def __hash__(self):
        # Hash the dataclass repr, so instances with equal fields share a hash.
        return hash(str(self))

@qjit(static_argnums=(1,))
def f(x: int, y: MyClass):
    return x + y.val
```


```pycon
>>> f(1, MyClass(5))
array(6)
>>> f(1, MyClass(6))  # re-compilation
array(7)
>>> f(2, MyClass(5))  # no re-compilation
array(7)
```


* Mid-circuit measurements now support post-selection and qubit reset when used with the Lightning simulators. [(#491)](https://github.com/PennyLaneAI/catalyst/pull/491) [(#507)](https://github.com/PennyLaneAI/catalyst/pull/507)

To specify post-selection, simply pass the `postselect` argument to the `catalyst.measure` function:

```python
from catalyst import measure

dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def f():
    qml.Hadamard(0)
    m = measure(0, postselect=1)  # keep only the branch where the outcome is 1
    return qml.expval(qml.PauliZ(0))
```


Likewise, to reset a wire after mid-circuit measurement, simply specify `reset=True`:

```python
dev = qml.device("lightning.qubit", wires=1)

@qjit
@qml.qnode(dev)
def f():
    qml.Hadamard(0)
    m = measure(0, reset=True)  # return the wire to |0> after measuring
    return qml.expval(qml.PauliZ(0))
```
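
To see what these two options do to the state, the following plain-Python sketch tracks a single-qubit state vector by hand. All helper names are hypothetical and the model is deliberately minimal (one qubit, ideal operations); it only illustrates the expected values, not how the Lightning simulators work internally.

```python
import math

def hadamard(state):
    a, b = state
    s = 1 / math.sqrt(2)
    return (s * (a + b), s * (a - b))

def postselect(state, outcome):
    """Project onto `outcome` (0 or 1) and renormalise."""
    amp = state[outcome]
    norm = abs(amp)
    if norm == 0:
        raise ValueError("post-selected outcome has zero probability")
    projected = [0j, 0j]
    projected[outcome] = amp / norm
    return tuple(projected)

def reset(state):
    """After a measurement with reset, the qubit is back in |0> whatever the outcome."""
    return (1 + 0j, 0j)

def expval_z(state):
    a, b = state
    return abs(a) ** 2 - abs(b) ** 2

plus = hadamard((1 + 0j, 0j))                   # state after qml.Hadamard(0)
z_postselected = expval_z(postselect(plus, 1))  # collapsed onto |1>
z_reset = expval_z(reset(plus))                 # wire returned to |0>
```

In this model, post-selecting on outcome 1 leaves the qubit in |1>, so the Pauli-Z expectation is -1, while resetting the wire gives +1.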


<h3>Improvements</h3>

* Catalyst now supports Python 3.12. [(#532)](https://github.com/PennyLaneAI/catalyst/pull/532)

* The JAX version used by Catalyst has been updated to `v0.4.23`. [(#428)](https://github.com/PennyLaneAI/catalyst/pull/428)

* Catalyst now supports the `qml.GlobalPhase` operation. [(#563)](https://github.com/PennyLaneAI/catalyst/pull/563)

* Native support for `qml.PSWAP` and `qml.ISWAP` gates on Amazon Braket devices has been added. [(#458)](https://github.com/PennyLaneAI/catalyst/pull/458)

Specifically, a circuit like

```py
dev = qml.device("braket.local.qubit", wires=2, shots=100)

@qjit
@qml.qnode(dev)
def f(x: float):
    qml.Hadamard(0)
    qml.PSWAP(x, wires=[0, 1])
    qml.ISWAP(wires=[1, 0])
    return qml.probs()
```


would no longer decompose the `PSWAP` and `ISWAP` gates.

* The `qml.BlockEncode` operator is now supported with Catalyst. [(#483)](https://github.com/PennyLaneAI/catalyst/pull/483)

* Catalyst no longer relies on a TensorFlow installation for its AutoGraph functionality. Instead, the standalone `diastatic-malt` package is used and automatically installed as a dependency. [(#401)](https://github.com/PennyLaneAI/catalyst/pull/401)

* The `qjit` decorator will remember previously compiled functions when the PyTree metadata of arguments changes, in addition to remembering compiled functions when static arguments change. [(#522)](https://github.com/PennyLaneAI/catalyst/pull/531)

The following example will no longer trigger a third compilation:
```py
@qjit
def func(x):
    print("compiling")
    return x
```

```pycon
>>> func([1,]);  # list
compiling
>>> func((2,));  # tuple
compiling
>>> func([3,]);  # list
```


Note, however, that in order to keep overheads low, changing the argument *type* or *shape* (in a promotion-incompatible way) may override a previously stored function (with identical PyTree metadata and static argument values):

```py
@qjit
def func(x):
    print("compiling")
    return x
```

```pycon
>>> func(jnp.array(1));  # scalar
compiling
>>> func(jnp.array([2.]));  # 1-D array
compiling
>>> func(jnp.array(3));  # scalar
compiling
```
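
The two behaviours above (one remembered function per container structure, but only one leaf signature kept per structure) can be mimicked with a small plain-Python model. Everything here is hypothetical: `Array` is a stand-in for an array leaf, `structure` is a crude substitute for PyTree metadata, and `TreeKeyedCache` ignores type promotion and static arguments. It is a sketch of the described behaviour, not of Catalyst's cache.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Array:
    """Stand-in for an array leaf; only its shape matters here."""
    shape: tuple

def structure(obj):
    """Container structure with leaves replaced by '*' (a rough 'tree definition')."""
    if isinstance(obj, (list, tuple)):
        return (type(obj).__name__, tuple(structure(o) for o in obj))
    return "*"

def signature(obj):
    """Shapes of all leaves, flattened in order (plain numbers count as scalars)."""
    if isinstance(obj, (list, tuple)):
        return tuple(s for o in obj for s in signature(o))
    return (getattr(obj, "shape", ()),)

class TreeKeyedCache:
    """One slot per tree structure; a new leaf signature overwrites that slot."""

    def __init__(self):
        self.slots = {}
        self.compile_count = 0

    def call(self, arg):
        key, sig = structure(arg), signature(arg)
        if self.slots.get(key) != sig:
            self.compile_count += 1   # stand-in for compilation
            self.slots[key] = sig     # replaces any previous entry for this key
        return arg

containers = TreeKeyedCache()
for arg in ([1], (2,), [3]):                      # list, tuple, list
    containers.call(arg)

shapes = TreeKeyedCache()
for arg in (Array(()), Array((1,)), Array(())):   # scalar, 1-D array, scalar
    shapes.call(arg)
```

In this model the list/tuple/list sequence compiles twice, while the scalar/1-D/scalar sequence compiles three times because each shape change evicts the previous entry.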


* Catalyst gradient functions (`grad`, `jacobian`, `vjp`, and `jvp`) now support being applied to functions that use (nested) container types as inputs and outputs. This includes lists and dictionaries, as well as any data structure implementing the [PyTree protocol](https://jax.readthedocs.io/en/latest/pytrees.html). [(#500)](https://github.com/PennyLaneAI/catalyst/pull/500) [(#501)](https://github.com/PennyLaneAI/catalyst/pull/501) [(#508)](https://github.com/PennyLaneAI/catalyst/pull/508) [(#549)](https://github.com/PennyLaneAI/catalyst/pull/549)

```py
from catalyst import jacobian

dev = qml.device("lightning.qubit", wires=1)

@qml.qnode(dev)
def circuit(phi, psi):
    qml.RY(phi, wires=0)
    qml.RX(psi, wires=0)
    # Nested containers (a list holding a dict) are valid return values.
    return [{"expval0": qml.expval(qml.PauliZ(0))}, qml.expval(qml.PauliZ(0))]

psi = 0.1
phi = 0.2
```

```pycon
>>> qjit(jacobian(circuit, argnum=[0, 1]))(psi, phi)
[{'expval0': (array(-0.0978434), array(-0.19767681))}, (array(-0.0978434), array(-0.19767681))]
```
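
The printed values can be cross-checked by hand. For this single-qubit circuit the expectation value is cos(phi) * cos(psi), so both partial derivatives have a closed form. The sketch below evaluates them with the standard library only; the helper names are hypothetical. Note that the example call passes its arguments positionally, so `0.1` binds to the circuit's first parameter `phi` and `0.2` to `psi`, despite the variable names used at the call site.

```python
import math

def expval_z(phi, psi):
    # <Z> after RY(phi) then RX(psi) on |0>: the z-component of the Bloch vector.
    return math.cos(phi) * math.cos(psi)

def jacobian_expval_z(phi, psi):
    d_phi = -math.sin(phi) * math.cos(psi)
    d_psi = -math.cos(phi) * math.sin(psi)
    return (d_phi, d_psi)

def circuit_jacobian(phi, psi):
    # Same nested layout as the circuit's return value: a list holding a dict.
    jac = jacobian_expval_z(phi, psi)
    return [{"expval0": jac}, jac]

# Positional call mirroring the example: 0.1 binds to `phi`, 0.2 to `psi`.
result = circuit_jacobian(0.1, 0.2)
```

The Jacobian keeps the container structure of the output, with one tuple entry per differentiated argument at each leaf.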


* Support has been added for linear algebra functions which depend on computing the eigenvalues of symmetric matrices, such as `qml.math.sqrt_matrix()`. [(#488)](https://github.com/PennyLaneAI/catalyst/pull/488)

For example, you can compile `qml.math.sqrt_matrix`:

```python
@qml.qjit
def workflow(A):
    B = qml.math.sqrt_matrix(A)
    return B @ A
```


Internally, this involves support for lowering the LAPACK routine `lapack_dsyevd`, which computes eigenvalues and eigenvectors, via `stablehlo.custom_call`.
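
To illustrate why a matrix square root depends on a symmetric eigendecomposition, here is a minimal plain-Python sketch for the 2x2 case: the matrix is diagonalised, the square root is taken of each eigenvalue, and the result is rotated back. The function names are hypothetical and the closed-form 2x2 diagonalisation stands in for a LAPACK call; it is not the code path Catalyst uses.

```python
import math

def sqrt_symmetric_2x2(a, b, d):
    """Square root of the symmetric PSD matrix [[a, b], [b, d]] via its eigendecomposition."""
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    lam_plus = mean + radius
    lam_minus = max(mean - radius, 0.0)   # guard against tiny negative rounding errors
    # Rotation angle of the eigenvector v1 = (c, s) belonging to lam_plus;
    # the second eigenvector is v2 = (-s, c).
    theta = 0.5 * math.atan2(2 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    rp, rm = math.sqrt(lam_plus), math.sqrt(lam_minus)
    # sqrt(A) = sqrt(lam_plus) * v1 v1^T + sqrt(lam_minus) * v2 v2^T
    off_diagonal = (rp - rm) * c * s
    return [
        [rp * c * c + rm * s * s, off_diagonal],
        [off_diagonal, rp * s * s + rm * c * c],
    ]

def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

A = [[2.0, 1.0], [1.0, 2.0]]
B = sqrt_symmetric_2x2(2.0, 1.0, 2.0)
BB = matmul(B, B)   # reproduces A up to rounding
```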

* Additional debugging functions are now available in the `catalyst.debug` module. [(#529)](https://github.com/PennyLaneAI/catalyst/pull/529) [(#522)](https://github.com/PennyLaneAI/catalyst/pull/531)

This includes:

- `filter_static_args(args, static_argnums)` to remove static values from arguments using the provided index list.

- `get_cmain(fn, *args)` to return a C program that calls a jitted function with the provided arguments.

- `print_compilation_stage(fn, stage)` to print one of the recorded compilation stages for a JIT-compiled function.

For more details, please see the `catalyst.debug` documentation.
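
As an illustration of what removing static values by index means, the sketch below reimplements the idea in plain Python. `drop_static_args` is a hypothetical helper written for this note; the real `filter_static_args` may differ in return type and edge-case handling, so please check the `catalyst.debug` documentation before relying on its exact behaviour.

```python
def drop_static_args(args, static_argnums):
    """Return `args` without the positions listed in `static_argnums`."""
    static = set(static_argnums)
    return tuple(arg for i, arg in enumerate(args) if i not in static)

# Position 1 is static, so only the dynamic arguments remain.
dynamic = drop_static_args((0.5, "config", 3), static_argnums=(1,))
```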

* Remove redundant copies of TOML files for `lightning.kokkos` and `lightning.qubit`. [(#472)](https://github.com/PennyLaneAI/catalyst/pull/472)

`lightning.kokkos` and `lightning.qubit` now ship with their own TOML file. As such, we use the TOML file provided by them.

* Capturing quantum circuits with many gates prior to compilation is now quadratically faster (up to a factor), by removing `qextract_p` and `qinst_p` from forced-order primitives. [(#469)](https://github.com/PennyLaneAI/catalyst/pull/469)

* Update `AllocateQubit` and `AllocateQubits` in `LightningKokkosSimulator` to preserve the current state-vector before qubit re-allocations in the runtime dynamic qubits management. [(#479)](https://github.com/PennyLaneAI/catalyst/pull/479)

* The [PennyLane custom compiler entry point name convention has changed](https://github.com/PennyLaneAI/pennylane/pull/5140), necessitating a change to the Catalyst entry points. [(#493)](https://github.com/PennyLaneAI/catalyst/pull/493)

<h3>Breaking changes</h3>

* Catalyst gradient functions now match the JAX convention for the returned axes of gradients, Jacobians, VJPs, and JVPs. As a result, the returned tensor shape from various Catalyst gradient functions may differ compared to previous versions of Catalyst. [(#500)](https://github.com/PennyLaneAI/catalyst/pull/500) [(#501)](https://github.com/PennyLaneAI/catalyst/pull/501) [(#508)](https://github.com/PennyLaneAI/catalyst/pull/508)

* The Catalyst Python frontend has been partially refactored. The impact on user-facing functionality is minimal, but the location of certain classes and methods used by the package may have changed. [(#529)](https://github.com/PennyLaneAI/catalyst/pull/529) [(#522)](https://github.com/PennyLaneAI/catalyst/pull/531)

The following changes have been made:

* Some debug methods and features on the QJIT class have been turned into free functions and moved to the `catalyst.debug` module, which will now appear in the public documentation. This includes compiling a program from IR, obtaining a C program to invoke a compiled function from, and printing fine-grained MLIR compilation stages.

* The `compilation_pipelines.py` module has been renamed to `jit.py`, and certain functionality has been moved out (see following items).

* A new module `compiled_functions.py` now manages low-level access to compiled functions.

* A new module `tracing/type_signatures.py` handles functionality related to managing arguments and type signatures during the tracing process.

* The `contexts.py` module has been moved from `utils` to the new `tracing` sub-module.

<h3>Internal changes</h3>

* Changes to the runtime QIR API and dependencies, to avoid symbol conflicts with other libraries that utilize QIR. [(#464)](https://github.com/PennyLaneAI/catalyst/pull/464) [(#470)](https://github.com/PennyLaneAI/catalyst/pull/470)

The existing Catalyst runtime implements QIR as a library that can be linked against a QIR module. This works well when Catalyst is the only implementor of QIR; however, it may generate symbol conflicts when used alongside other QIR implementations.

To avoid this, two changes were necessary:

* The Catalyst runtime now has a different API from QIR instructions.

The runtime has been modified such that QIR instructions are lowered to functions where the `__quantum__` part of the function name is replaced with `__catalyst__`. This prevents the possibility of symbol conflicts with other libraries that implement QIR as a library.

* The Catalyst runtime no longer depends on QIR runner's stdlib.

We no longer depend on or link against QIR runner's stdlib. Linking against it left some definitions in place that may differ from those used by third-party implementors. To prevent symbol conflicts, QIR runner's stdlib was removed and is no longer linked against. As a result, the following functions are now defined and implemented in Catalyst's runtime:

* `int64_t __catalyst__rt__array_get_size_1d(QirArray *)`
* `int8_t *__catalyst__rt__array_get_element_ptr_1d(QirArray *, int64_t)`

and the following functions were removed, since the frontend does not generate them:

* `QirString *__catalyst__rt__qubit_to_string(QUBIT *)`
* `QirString *__catalyst__rt__result_to_string(RESULT *)`

* Fix an issue when no qubit number was specified for the `qinst` primitive. The primitive now correctly deduces the number of qubits when no gate parameters are present. This change is not user facing. [(#496)](https://github.com/PennyLaneAI/catalyst/pull/496)

<h3>Bug fixes</h3>

* Fixed a bug where differentiation of sliced arrays would result in an error. [(#552)](https://github.com/PennyLaneAI/catalyst/pull/552)

```py
import jax
import catalyst

def f(x):
    # Sum every second element of the input.
    return jax.numpy.sum(x[::2])

x = jax.numpy.array([0.1, 0.2, 0.3, 0.4])
```

```pycon
>>> catalyst.qjit(catalyst.grad(f))(x)
[1. 0. 1. 0.]
```
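
The expected gradient is easy to confirm independently: only the elements picked out by the slice contribute to the sum, so their partial derivatives are 1 and the rest are 0. The following standard-library sketch checks this with central finite differences; `finite_difference_grad` is a hypothetical helper, not part of Catalyst or JAX.

```python
def f(x):
    # Sum of every second element, as selected by x[::2].
    return sum(x[::2])

def finite_difference_grad(fn, x, eps=1e-6):
    """Central-difference estimate of the gradient of `fn` at `x`."""
    grad = []
    for i in range(len(x)):
        up = list(x)
        up[i] += eps
        down = list(x)
        down[i] -= eps
        grad.append((fn(up) - fn(down)) / (2 * eps))
    return grad

approx_grad = finite_difference_grad(f, [0.1, 0.2, 0.3, 0.4])
```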


* Fixed a bug where quantum control applied to a subcircuit was not correctly mapping wires, and the wires in the nested region remained unchanged. [(#555)](https://github.com/PennyLaneAI/catalyst/pull/555)

* Catalyst will no longer print a warning that recompilation is triggered when a `qjit`-decorated function with no arguments is invoked without having been compiled first, for example via the use of `target="mlir"`. [(#522)](https://github.com/PennyLaneAI/catalyst/pull/531)

* Fixes a bug in the configuration of dynamic shaped arrays that would cause certain programs to error with `TypeError: cannot unpack non-iterable ShapedArray object`. [(#526)](https://github.com/PennyLaneAI/catalyst/pull/526)

This is fixed by replacing the code which updates the `JAX_DYNAMIC_SHAPES` option with a `transient_jax_config()` context manager which temporarily sets the value of `JAX_DYNAMIC_SHAPES` to True and then restores the original configuration value following the yield. The context manager is used by `trace_to_jaxpr()` and `lower_jaxpr_to_mlir()`.
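
The set-then-restore pattern described here is the usual shape of a context manager built with `contextlib`. The sketch below shows that pattern against a plain dictionary; `CONFIG` and `transient_config` are hypothetical stand-ins chosen for this illustration and do not reproduce Catalyst's `transient_jax_config()` or the JAX configuration API.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a global configuration object.
CONFIG = {"dynamic_shapes": False}

@contextmanager
def transient_config(key, value):
    """Temporarily set CONFIG[key] to `value`, restoring the old value on exit."""
    previous = CONFIG[key]
    CONFIG[key] = value
    try:
        yield
    finally:
        CONFIG[key] = previous   # runs even if the body raises

with transient_config("dynamic_shapes", True):
    inside = CONFIG["dynamic_shapes"]   # option is enabled only inside the block
after = CONFIG["dynamic_shapes"]        # original value is back afterwards
```

Restoring the value in a `finally` clause means the original setting comes back even when the wrapped code fails, which avoids leaking the temporary option into later calls.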

* Exceptions encountered in the runtime when using the `qjit` option `async_qnodes=True` will now be properly propagated to the frontend. [(#447)](https://github.com/PennyLaneAI/catalyst/pull/447) [(#510)](https://github.com/PennyLaneAI/catalyst/pull/510)

This is done by:
* changing `llvm.call` to `llvm.invoke`
* setting async runtime tokens and values to be errors
* deallocating live tokens and values

* Fixes a bug when computing gradients with indexing/slicing, by fixing the scatter operation lowering when `updatedWindowsDim` is empty. [(#475)](https://github.com/PennyLaneAI/catalyst/pull/475)

* Fix the issue in `LightningKokkos::AllocateQubits` with allocating too many qubit IDs on qubit re-allocation. [(#473)](https://github.com/PennyLaneAI/catalyst/pull/473)

* Fixed an issue where `wires` was incorrectly set to `<Wires = [<WiresEnum.AnyWires: -1>]>` when using `catalyst.adjoint` and `catalyst.ctrl`, by adding a `wires` property to these operations. [(#480)](https://github.com/PennyLaneAI/catalyst/pull/480)

* Fix the issue with multiple LAPACK symbol definitions in the compiled program by updating the `stablehlo.custom_call` conversion pass. [(#488)](https://github.com/PennyLaneAI/catalyst/pull/488)

<h3>Contributors</h3>

This release contains contributions from (in alphabetical order):

Mikhail Andrenkov,
Ali Asadi,
David Ittah,
Tzung-Han Juang,
Erick Ochoa Lopez,
Romain Moyard,
Raul Torres,
Haochen Paul Wang.

