Stannum

Latest version: v0.9.1


0.6.3

0.6.2

Introduced a configuration option `enable_backward` in `Tube`. When `enable_backward` is `False`, `Tube` eagerly recycles Taichi memory by destroying the SNodeTree right after the forward calculation. This should improve the performance of forward-only calculations and mitigate Taichi's memory problem in **forward-only mode**.
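
As an illustration, here is a minimal forward-only sketch. It assumes `enable_backward` is passed to `Tube`'s constructor; the kernel, shapes, and tensor names are made up for the example.

```python
import taichi as ti
import torch
from stannum import Tube


@ti.kernel
def square(x: ti.template(), y: ti.template()):
    for i in x:
        y[i] = x[i] * x[i]


ti.init(ti.cpu)
# With enable_backward=False, Tube can recycle Taichi memory (destroy the
# SNodeTree) right after the forward calculation instead of keeping the
# fields alive for a backward pass.
tube = Tube(torch.device("cpu"), enable_backward=False) \
    .register_input_tensor((16,), torch.float32, "x", False) \
    .register_output_tensor((16,), torch.float32, "y", False) \
    .register_kernel(square, ["x", "y"]) \
    .finish()
out = tube(torch.ones(16))
```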

0.6.1

* #7 is fixed because upstream Taichi has fixed the uninitialized memory problem in 0.9.1
* Intermediate fields are now required to be batched if any input tensors are batched

0.5.0

Persistent mode and Eager mode of Tube
Before v0.5.0, the Taichi fields created in `Tube` were persistent, and their lifetime was:
PyTorch upstream tensors -> Tube -> create fields -> forward pass -> copy values to downstream tensors -> compute graph of Autograd completes -> optional backward pass -> compute graph destroyed -> destroy fields

They are so-called persistent fields because they persist while the compute graph is being constructed.

Now in v0.5.0, we introduce an eager mode of `Tube`. Passing `persistent_fields=False` when instantiating a `Tube` turns on eager mode, in which the lifetime of fields becomes:
PyTorch upstream tensors -> Tube -> create fields -> forward pass -> copy values to downstream tensors -> destroy fields -> compute graph of Autograd completes -> optional backward pass -> compute graph destroyed

Zooming in on the optional backward pass: since the fields that stored values in the forward pass have already been destroyed, new fields must be allocated when calculating gradients, so the backward pass becomes:
Downstream gradients -> Tube -> create fields and load values -> load downstream gradients into fields -> backward pass -> copy gradients to tensors -> destroy fields -> upstream PyTorch gradient calculation

This introduces some overhead but may be faster on "old" Taichi (any version that has not merged https://github.com/taichi-dev/taichi/pull/4356). For details, please see that PR. At the time of the v0.5.0 release, stable Taichi had not merged it.
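
A minimal sketch of turning eager mode on, assuming only the `persistent_fields` flag described above; the registration calls follow the `Tube` API shown in the 0.4.0 notes below, and the kernel and names are illustrative.

```python
import taichi as ti
import torch
from stannum import Tube


@ti.kernel
def scale(src: ti.template(), dst: ti.template()):
    for i in src:
        dst[i] = 2.0 * src[i]


ti.init(ti.cpu)
# persistent_fields=False -> eager mode: fields are destroyed as soon as their
# values have been copied to the downstream tensors and are re-created on
# demand for the backward pass.
tube = Tube(torch.device("cpu"), persistent_fields=False) \
    .register_input_tensor((8,), torch.float32, "src", True) \
    .register_output_tensor((8,), torch.float32, "dst", True) \
    .register_kernel(scale, ["src", "dst"]) \
    .finish()
x = torch.ones(8, requires_grad=True)
out = tube(x)
out.sum().backward()
```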

Compatibility issue fixes
At the time of the v0.5.0 release, Taichi was undergoing heavy refactoring, so we introduced many small fixes to deal with the incompatibilities caused by that refactoring. If you find compatibility issues, feel free to submit issues and PRs.

0.4.4

Fixed many problems caused by Taichi changes and bugs:
* API import problems due to Taichi API changes
* Uninitialized memory problems due to https://github.com/taichi-dev/taichi/issues/4334 and https://github.com/taichi-dev/taichi/issues/4016

0.4.0

`Tube`
`Tube` is more flexible than `Tin` but slower: it creates the necessary fields for you and performs automatic batching.

Registrations
All you need to do is to register:
* Input/intermediate/output **tensor shapes** instead of fields
* At least one kernel that takes the following as arguments:
  * Taichi fields: correspond to tensors (may or may not require gradients)
  * (Optional) extra arguments: these will NOT receive gradients

Acceptable dimensions of tensors to be registered:
* `None`: the flexible batch dimension, which must be the first dimension, e.g. `(None, 2, 3, 4)`
* Positive integers: fixed dimensions with the indicated dimensionality
* Negative integers:
  * `-1`: any number in `[1, +inf)`; only usable in the registration of input tensors
  * Negative integers smaller than `-1`: indices marking dimensions that must have the same dimensionality
    * Restriction: negative indices must be "declared" in the registration of input tensors first, and only then used in the registration of intermediate and output tensors (see the sketch after this list)
    * Example 1: tensors `a` and `b` registered with shapes `a: (2, -2, 3)` and `b: (-2, 5, 6)` mean that the dimensions marked `-2` must match
    * Example 2: tensors `a` and `b` registered with shapes `a: (-1, 2, 3)` and `b: (-1, 5, 6)` place no restriction on their first dimensions
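
A hedged sketch combining the batch dimension and a shared negative index; the kernel and names here are illustrative only. Judging by the batched example further below, `Tube` strips the `None` dimension off before creating fields, so the kernel sees one 1-D row at a time.

```python
import taichi as ti
import torch
from stannum import Tube


@ti.kernel
def copy_row(src: ti.template(), dst: ti.template()):
    # The batch dimension (None) is handled by Tube, so this kernel only
    # ever sees a single 1-D row.
    for i in src:
        dst[i] = src[i]


ti.init(ti.cpu)
# None marks the batch dimension; -2 ties the second dimension of the input
# to the second dimension of the output, whatever size is fed at call time.
tube = Tube(torch.device("cpu")) \
    .register_input_tensor((None, -2), torch.float32, "src", False) \
    .register_output_tensor((None, -2), torch.float32, "dst", False) \
    .register_kernel(copy_row, ["src", "dst"]) \
    .finish()
out = tube(torch.ones(4, 7))  # batch of 4 rows, free row length 7
```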

Registration order:
Input tensors, intermediate fields, and output tensors must be registered first, and then the kernel:
```python
import taichi as ti
import torch
from stannum import Tube


@ti.kernel
def ti_add(arr_a: ti.template(), arr_b: ti.template(), output_arr: ti.template()):
    for i in arr_a:
        output_arr[i] = arr_a[i] + arr_b[i]


ti.init(ti.cpu)
cpu = torch.device("cpu")
a = torch.ones(10)
b = torch.ones(10)
tube = Tube(cpu) \
    .register_input_tensor((10,), torch.float32, "arr_a", False) \
    .register_input_tensor((10,), torch.float32, "arr_b", False) \
    .register_output_tensor((10,), torch.float32, "output_arr", False) \
    .register_kernel(ti_add, ["arr_a", "arr_b", "output_arr"]) \
    .finish()
out = tube(a, b)
```

When registering a kernel, a list of field/tensor names is required, for example `["arr_a", "arr_b", "output_arr"]` above. This list must correspond to the fields in the arguments of the kernel (here, `ti_add()`).

The order of the input tensors passed to `Tube` should match the order of the kernel's input fields.

Automatic batching
Automatic batching is done simply by running the kernels `batch` times, where the batch number is determined by the leading dimension of tensors registered with shape `(None, ...)`.

If any input tensors or intermediate fields are batched (that is, their first registered dimension is `None`), then all output tensors must be registered as batched; see the "With batch dimension" example below.

Examples
A simple example without negative indices or a batch dimension:
```python
@ti.kernel
def ti_add(arr_a: ti.template(), arr_b: ti.template(), output_arr: ti.template()):
    for i in arr_a:
        output_arr[i] = arr_a[i] + arr_b[i]


ti.init(ti.cpu)
cpu = torch.device("cpu")
a = torch.ones(10)
b = torch.ones(10)
tube = Tube(cpu) \
    .register_input_tensor((10,), torch.float32, "arr_a", False) \
    .register_input_tensor((10,), torch.float32, "arr_b", False) \
    .register_output_tensor((10,), torch.float32, "output_arr", False) \
    .register_kernel(ti_add, ["arr_a", "arr_b", "output_arr"]) \
    .finish()
out = tube(a, b)
```


With negative dimension index:

```python
ti.init(ti.cpu)
cpu = torch.device("cpu")
tube = Tube(cpu) \
    .register_input_tensor((-2,), torch.float32, "arr_a", False) \
    .register_input_tensor((-2,), torch.float32, "arr_b", False) \
    .register_output_tensor((-2,), torch.float32, "output_arr", False) \
    .register_kernel(ti_add, ["arr_a", "arr_b", "output_arr"]) \
    .finish()
dim = 10
a = torch.ones(dim)
b = torch.ones(dim)
out = tube(a, b)
assert torch.allclose(out, torch.full((dim,), 2.))
dim = 100
a = torch.ones(dim)
b = torch.ones(dim)
out = tube(a, b)
assert torch.allclose(out, torch.full((dim,), 2.))
```


With batch dimension:
```python
@ti.kernel
def int_add(a: ti.template(), b: ti.template(), out: ti.template()):
    out[None] = a[None] + b[None]


ti.init(ti.cpu)
b = torch.tensor(1., requires_grad=True)
batched_a = torch.ones(10, requires_grad=True)
tube = Tube() \
    .register_input_tensor((None,), torch.float32, "a") \
    .register_input_tensor((), torch.float32, "b") \
    .register_output_tensor((None,), torch.float32, "out", True) \
    .register_kernel(int_add, ["a", "b", "out"]) \
    .finish()
out = tube(batched_a, b)
loss = out.sum()
loss.backward()
assert torch.allclose(torch.ones_like(batched_a) + 1, out)
assert b.grad == 10.
assert torch.allclose(torch.ones_like(batched_a), batched_a.grad)
```


For more examples of invalid uses, please see the tests in `tests/test_tube`.

Advanced field construction with `FieldManager`
There is a way to tweak how fields are constructed in order to improve the performance of kernel calculations.

By supplying a customized `FieldManager` when registering a field, you can construct a field however you want.

Please refer to the `FieldManager` code in `src/stannum/auxiliary.py` for more information.

If you don't know why constructing fields differently can improve performance, don't use this feature.

If you don't know how to construct fields differently, please refer to [Taichi field documentation](https://docs.taichi.graphics/lang/articles/advanced/layout).
