Taichi

Latest version: v1.7.2


2.1

Dynamic VRAM Allocation:
- In our latest update, the CUDA backend has been optimized to dynamically allocate Video RAM (VRAM), significantly reducing the initial preallocation requirement. Now, less than 50MB is preallocated upon `ti.init`.

Changes in `device_memory_GB` and `device_memory_fraction` Usage:
- These settings now apply only to preallocating memory for **sparse** data structures, such as `ti.pointer`. The preallocation happens only once a sparse data structure is detected in your code.
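
As a rough illustration of the setting described above, the sketch below passes `device_memory_GB` to `ti.init` and then declares a field under a `ti.pointer` layout. The helper name `build_sparse_field`, the 1 GB budget, and the field dimensions are assumptions chosen for the example, not values from the release notes. The guarded import is only there so the snippet still loads on a machine without Taichi.

```python
try:
    import taichi as ti
except ImportError:  # lets the sketch load where Taichi is not installed
    ti = None


def build_sparse_field(budget_gb=1.0):
    """Initialise Taichi with a sparse-memory budget and declare a pointer-based field.

    `budget_gb` is a hypothetical value; pick one that suits your workload.
    """
    # Per the notes above, device_memory_GB now sizes only the pool used
    # for sparse data structures rather than a large up-front allocation.
    ti.init(arch=ti.cuda, device_memory_GB=budget_gb)

    x = ti.field(ti.f32)
    # A pointer SNode makes this layout sparse.
    ti.root.pointer(ti.i, 64).dense(ti.i, 16).place(x)
    return x


if ti is not None:
    field = build_sparse_field(budget_gb=1.0)
```

Programs that only use dense fields should not need either setting, since the release notes state that the preallocation is skipped until a sparse structure appears.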

Impact on VRAM Consumption:
- Users can expect a noticeable decrease in VRAM usage with these enhancements. For instance:
  - `diffmpm3d`: 3866 MB --> 3190 MB
  - `nerf_train_deploy`: 5618 MB --> 4664 MB
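
To put the two figures above in relative terms, the short calculation below works out the absolute and percentage savings. Only the four MB values come from the release notes; the helper name `reduction` is made up for this sketch.

```python
# Figures quoted in the release notes above: (before MB, after MB).
vram_mb = {
    "diffmpm3d": (3866, 3190),
    "nerf_train_deploy": (5618, 4664),
}


def reduction(before_mb, after_mb):
    """Return (absolute saving in MB, relative saving in percent)."""
    saved = before_mb - after_mb
    return saved, 100.0 * saved / before_mb


for name, (before, after) in vram_mb.items():
    saved, pct = reduction(before, after)
    print(f"{name}: {saved} MB saved ({pct:.1f}%)")
```

Both examples come out at roughly a 17% reduction, although the saving for other programs will depend on how much of their memory was preallocated rather than actually used.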

1.7.2

Highlights:
- **Bug fixes**
- Fix Loop-Invariant-Cache for dynamic indexed pointers (8577) (by **Zhanlue Yang**)
- Fix bug to disable taichi header print (8517) (by **Yong-Chao Wu**)
- **Build system**
- Lift macOS min compat version to Big Sur (8583) (by **Proton**)
- Drop manylinux2014 wheel support (8581) (by **Proton**)
- **Language and syntax**
- Migrate irpass::force_scalarize_matrix() beforehand (8532) (by **Zhanlue Yang**)
- Add config.force_scalarize_matrix to avoid perf-regression in certain scenario (8509) (by **Zhanlue Yang**)

Full changelog:
- [misc] Bump version to v1.7.2 (by **Proton**)
- [aot] Add stream_ variable for CUDAContext to use a specific CUDA stream to launch CUDA kernel (8579) (by **Sichao He**)
- [Build] Lift macOS min compat version to Big Sur (8583) (by **Proton**)
- [ci] Drop nvidia driver 510 support (already EOL'd) (8582) (by **Proton**)
- [Build] Drop manylinux2014 wheel support (8581) (by **Proton**)
- [Bug] Fix Loop-Invariant-Cache for dynamic indexed pointers (8577) (by **Zhanlue Yang**)
- [bug] Fix assign may lose precision warning & improve related logging (8553) (by **Bob Cao**)
- [bug] Fixes for numpy 2.0 (unblocking python 3.12 release build on mac) (8552) (by **Bob Cao**)
- [doc] Fix typo: 'inheritence' -> 'inheritance' (8551) (by **3n3l**)
- [misc] Ensure succeeded variable is properly initialized in matrix-free solvers (8484) (by **liblaf**)
- [Bug] Fix bug to disable taichi header print (8517) (by **Yong-Chao Wu**)
- [misc] Add conversions for unsigned types, torch > 2.3.0 (8528) (by **Oliver Batchelor**)
- [doc] Fix typo & missing typedef in math/math_module.md (8541) (by **Jingwei Xu**)
- [ci] Remove driver470, add driver 550 (8546) (by **Proton**)
- [misc] Bump spdlog version and fix unformattable error (8543) (by **Bob Cao**)
- [build] Fix build.py bootstrap corner cases (8544) (by **Proton**)
- [ci] Force TI_USE_GIT_CACHE on (8545) (by **Proton**)
- [Lang] Migrate irpass::force_scalarize_matrix() beforehand (8532) (by **Zhanlue Yang**)
- [bug] Fix offline cache emit dependencies (8510) (by **Mingrui Zhang**)
- [Lang] Add config.force_scalarize_matrix to avoid perf-regression in certain scenario (8509) (by **Zhanlue Yang**)

1.7.1

Highlights:
- **Bug fixes**
- Fix CFG aliasing error with matrix of matrix (8445) (by **Zhanlue Yang**)
- **Documentation**
- Update offset.md (8470) (by **Kenshi Takayama**)
- Update math_module.md (8471) (by **Kenshi Takayama**)
- Update accelerate_pytorch.md | Fix typo in recap: Eeasy -> Easy (8475) (by **Aryan Garg**)
- **Miscellaneous**
- Bump version to 1.7.1 (by **Haidong Lan**)
- Bump taichi version to v1.8.0 (8458) (by **Zhanlue Yang**)

Full changelog:
- [Misc] Bump version to 1.7.1 (by **Haidong Lan**)
- [bug] Fix abs on unsigned types (8476) (by **Lin Jiang**)
- [Doc] Update offset.md (8470) (by **Kenshi Takayama**)
- [Doc] Update math_module.md (8471) (by **Kenshi Takayama**)
- [Doc] Update accelerate_pytorch.md | Fix typo in recap: Eeasy -> Easy (8475) (by **Aryan Garg**)
- [Misc] Bump taichi version to v1.8.0 (8458) (by **Zhanlue Yang**)
- [lang] Warn about non-contiguous gradient tensors (8450) (by **Bob Cao**)
- [autodiff] Fix the type of cmp statements in autodiff (8452) (by **Lin Jiang**)
- [Bug] Fix CFG aliasing error with matrix of matrix (8445) (by **Zhanlue Yang**)
- [misc] Add flag to disable taichi header print (8413) (by **Chaoming Wang**)

1.7.0

1. New features

1.6.0

Deprecation Notice
- We removed some APIs that were deprecated a long time ago. See the table below:

| Removed API | Replace with |
| --- | --- |
| Using atomic operations like a.atomic_add(b) | ti.atomic_add(a, b) or a += b |
| Using is and is not inside Taichi kernel and Taichi function | Not supported |
| Ndrange for loop with the number of the loop variables not equal to the dimension of the ndrange | Not supported |
| ti.ui.make_camera() | ti.ui.Camera() |
| ti.ui.Window.write_image() | ti.ui.Window.save_image() |
| ti.SOA | ti.Layout.SOA |
| ti.AOS | ti.Layout.AOS |
| ti.print_profile_info | ti.profiler.print_scoped_profiler_info |
| ti.clear_profile_info | ti.profiler.clear_scoped_profiler_info |
| ti.print_memory_profile_info | ti.profiler.print_memory_profiler_info |
| ti.CuptiMetric | ti.profiler.CuptiMetric |
| ti.get_predefined_cupti_metrics | ti.profiler.get_predefined_cupti_metrics |
| ti.print_kernel_profile_info | ti.profiler.print_kernel_profiler_info |
| ti.query_kernel_profile_info | ti.profiler.query_kernel_profiler_info |
| ti.clear_kernel_profile_info | ti.profiler.clear_kernel_profiler_info |
| ti.kernel_profiler_total_time | ti.profiler.get_kernel_profiler_total_time |
| ti.set_kernel_profiler_toolkit | ti.profiler.set_kernel_profiler_toolkit |
| ti.set_kernel_profile_metrics | ti.profiler.set_kernel_profiler_metrics |
| ti.collect_kernel_profile_metrics | ti.profiler.collect_kernel_profiler_metrics |
| ti.VideoManager | ti.tools.VideoManager |
| ti.PLYWriter | ti.tools.PLYWriter |
| ti.imread | ti.tools.imread |
| ti.imresize | ti.tools.imresize |
| ti.imshow | ti.tools.imshow |
| ti.imwrite | ti.tools.imwrite |
| ti.ext_arr | ti.types.ndarray |
| ti.any_arr | ti.types.ndarray |
| ti.Tape | ti.ad.Tape |
| ti.clear_all_gradients | ti.ad.clear_all_gradients |
| ti.linalg.sparse_matrix_builder | ti.types.sparse_matrix_builder |
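
For codebases that still use the removed names, a mechanical rename covers most of the table above. The sketch below is one possible approach using only the standard library; the `migrate` helper and the subset of entries in `REPLACEMENTS` are illustrative assumptions, not a tool shipped with Taichi.

```python
import re

# A subset of the "Removed API -> Replace with" table above.
REPLACEMENTS = {
    "ti.ui.make_camera": "ti.ui.Camera",
    "ti.SOA": "ti.Layout.SOA",
    "ti.AOS": "ti.Layout.AOS",
    "ti.imread": "ti.tools.imread",
    "ti.imwrite": "ti.tools.imwrite",
    "ti.ext_arr": "ti.types.ndarray",
    "ti.any_arr": "ti.types.ndarray",
    "ti.Tape": "ti.ad.Tape",
    "ti.clear_all_gradients": "ti.ad.clear_all_gradients",
}

# Match whole dotted names only: not preceded by a word character or a dot,
# and followed by a word boundary. Longest names are tried first.
_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(name) for name in sorted(REPLACEMENTS, key=len, reverse=True))
    + r")\b"
)


def migrate(source: str) -> str:
    """Rewrite removed Taichi names in `source` to their replacements."""
    return _PATTERN.sub(lambda m: REPLACEMENTS[m.group(1)], source)


print(migrate("with ti.Tape(loss=loss):"))  # with ti.ad.Tape(loss=loss):
```

A text-level rename like this cannot handle rows marked "Not supported", method renames on instances such as `write_image` to `save_image`, or code that imports Taichi under a different alias, so the result still needs a manual review.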

- The built-in `min`/`max` functions are no longer deprecated inside Taichi kernels.
- Some arguments used when declaring compute graph arguments are deprecated and will be removed in v1.7.0:
  - the `element_shape` argument for scalar and ndarray
  - the `shape`, `channel_format` and `num_channels` arguments for texture
- The `cc` backend will be removed in the next release (`v1.7.0`).

New features

Struct arguments
You can now use struct arguments in all backends. Structs can be nested, and they can contain matrices and vectors. Here's an example:

```python
import taichi as ti

ti.init()

# A struct type can hold matrices and vectors, and can be nested in another struct.
transform_type = ti.types.struct(R=ti.math.mat3, T=ti.math.vec3)
pos_type = ti.types.struct(x=ti.math.vec3, trans=transform_type)

@ti.kernel
def kernel_with_nested_struct_arg(p: pos_type) -> ti.math.vec3:
    return p.trans.R @ p.x + p.trans.T

trans = transform_type(ti.math.mat3(1), [1, 1, 1])
p = pos_type(x=[1, 1, 1], trans=trans)
print(kernel_with_nested_struct_arg(p))  # [4., 4., 4.]
```


Ndarray
- Support 0 dim ndarray read & write in python scope
- Fixed a bug when writing into ndarray from Python scope

Improvements
- Support rsqrt operator in autodiff
- Added assembly printer for CPU backend by **Zhanlue Yang**
- Support CUDA shared array allocation over 48 KiB
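
To check whether a given shared array crosses the 48 KiB threshold mentioned above, the size can be estimated from its shape and element size. This is a back-of-the-envelope sketch; the helper names are hypothetical, and whether an allocation above the threshold actually succeeds still depends on the GPU.

```python
from math import prod

KIB = 1024
LEGACY_LIMIT_BYTES = 48 * KIB  # threshold named in the release note


def shared_array_bytes(shape, itemsize):
    """Total bytes occupied by an array of `shape` with `itemsize`-byte elements."""
    return prod(shape) * itemsize


def needs_large_shared_memory(shape, itemsize):
    """True if the array is larger than the 48 KiB threshold."""
    return shared_array_bytes(shape, itemsize) > LEGACY_LIMIT_BYTES


print(needs_large_shared_memory((12288,), 4))    # False: exactly 48 KiB of f32
print(needs_large_shared_memory((128, 128), 4))  # True: 64 KiB of f32
```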

Performance
- Improved vectorization support on CPU backend, with significant performance gains for specific applications

New Examples
- 2D Euler fluid simulation example by **Lee-abcde**

Misc
- Python 3.11 support
- `ti.frexp` is supported on CUDA, Vulkan, Metal, OpenGL backends.
- `ti.math.popcnt` intrinsic by **Garry Ling**
- Fixed a memory leak issue during SNodeTree destruction by **Zhanlue Yang**
- Added validation and improved error reporting for ti.Field finalization by **Zhanlue Yang**
- Fixed a memory leak issue with the CUDA backend in C-API by **Zhanlue Yang**
- Added support for formatted printing with str.format() and f-strings by **Tianyi Liu**
- Changed Python code formatter from `yapf` to `black`

Developer Experience
- build.py script for preparing build & testing environment


Full changelog

Highlights:
- **Bug fixes**
- Fix wrong datatype size when writing to ndarray from Python scope (by **Ailing Zhang**)
- **CUDA backend**
- Warn driver version if it doesn't support memory pool. (7912) (by **Haidong Lan**)
- Better handling shared array shape check (7818) (by **Haidong Lan**)
- Support large shared memory for CUDA backend (7452) (by **Haidong Lan**)
- **Documentation**
- Add doc about struct arguments (7959) (by **Lin Jiang**)
- Fix docstring of mix function (7922) (by **Zhao Liang**)
- Update faq and ggui, and add them to CI (7861) (by **Zhao Liang**)
- Update doc for dynamic snode (7804) (by **Zhao Liang**)
- Update field.md (7819) (by **zhoooou**)
- Update readme (7808) (by **yanqingzhang**)
- Update write_test.md (7745) (by **Qian Bao**)
- Update performance.md (7720) (by **Zhao Liang**)
- Update readme (7673) (by **Zhao Liang**)
- Update tutorial.md (7512) (by **Chenzhan Shang**)
- Update gui_system.md (7628) (by **Qian Bao**)
- Remove deprecated api docstrings (7596) (by **pengyu**)
- Fix the cexp docstring (7588) (by **Zhao Liang**)
- Add doc about returning struct (7556) (by **Lin Jiang**)
- **Error messages**
- Update deprecation warning of the graph arguments (7965) (by **Lin Jiang**)
- **Language and syntax**
- Remove deprecated funcs in __init__.py (7941) (by **Lin Jiang**)
- Remove deprecated sparse_matrix_builder function (7942) (by **Lin Jiang**)
- Remove deprecated funcs in ti.ui (7940) (by **Lin Jiang**)
- Remove the support for 'is' (7930) (by **Lin Jiang**)
- Raise error when the dimension of the ndrange does not equal to the number of the loop variable (7933) (by **Lin Jiang**)
- Remove a.atomic(b) (7925) (by **Lin Jiang**)
- Cancel deprecating native min/max (7928) (by **Lin Jiang**)
- Let nested data classes have methods (7909) (by **Lin Jiang**)
- Let kernel argument support matrix nested in a struct (by **lin-hitonami**)
- Support the functions of dataclass as kernel argument and return value (7865) (by **Lin Jiang**)
- Fix a bug on PosixPath (7860) (by **Zhao Liang**)
- Seprate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (7803) (by **Zhanlue Yang**)
- Fix pylance warning (7805) (by **Zhao Liang**)
- Support taking structs as kernel arguments (by **lin-hitonami**)
- Fix math module circular import bugs (7762) (by **Zhao Liang**)
- Support formatted printing in str.format() and f-strings (7686) (by **魔法少女赵志辉**)
- Replace internal representation of Python-scope ti.Matrix with numpy arrays (7559) (by **Yi Xu**)
- Stop letting ti.Struct inherit from TaichiOperations (7474) (by **Yi Xu**)
- Support writing sparse matrix as matrix market file (7529) (by **pengyu**)
- **Vulkan backend**
- Fix repeated generation of array ranges in spirv codegen. (7625) (by **Haidong Lan**)

Full changelog:
- [CUDA] Warn driver version if it doesn't support memory pool. (7912) (by **Haidong Lan**)
- [Doc] Add doc about struct arguments (7959) (by **Lin Jiang**)
- [Error] Update deprecation warning of the graph arguments (7965) (by **Lin Jiang**)
- [windows] Workaround C++ mangling special chars (7964) (by **Ailing**)
- [Lang] Remove deprecated funcs in __init__.py (7941) (by **Lin Jiang**)
- [build] Remove redundant C-API shared object in wheel (7950) (by **Proton**)
- [test] Do not test cc backend (by **Proton**)
- [Lang] Remove deprecated sparse_matrix_builder function (7942) (by **Lin Jiang**)
- [Lang] Remove deprecated funcs in ti.ui (7940) (by **Lin Jiang**)
- [Lang] Remove the support for 'is' (7930) (by **Lin Jiang**)
- [Lang] Raise error when the dimension of the ndrange does not equal to the number of the loop variable (7933) (by **Lin Jiang**)
- [Lang] Remove a.atomic(b) (7925) (by **Lin Jiang**)
- [Lang] Cancel deprecating native min/max (7928) (by **Lin Jiang**)
- [Doc] Fix docstring of mix function (7922) (by **Zhao Liang**)
- [example] Fix ti example bugs (7903) (by **Zhao Liang**)
- [ci] Build.py: Source generated env in new spawned shell (by **Proton**)
- [misc] Fix changelog commit extract code (by **Proton**)
- [ci] More robust build.py bootstrapping (7920) (by **Proton**)
- [Lang] [bug] Let nested data classes have methods (7909) (by **Lin Jiang**)
- [cuda] Only set CU_LIMIT_STACK_SIZE when necessary (7906) (by **Ailing**)
- [Lang] Let kernel argument support matrix nested in a struct (by **lin-hitonami**)
- [Bug] Fix wrong datatype size when writing to ndarray from Python scope (by **Ailing Zhang**)
- [lang] Support 0 dim ndarray read & write in python scope (by **Ailing Zhang**)
- [Lang] Support the functions of dataclass as kernel argument and return value (7865) (by **Lin Jiang**)
- [spirv] Support struct as kernel argument (by **Lin Jiang**)
- [spirv] Fix the ret type of frexp (by **lin-hitonami**)
- [ci] Build.py: Do not try to bootstrap pip (too many issues) (7897) (by **Proton**)
- [ci] Build.py quirks fix (7894) (by **Proton**)
- [Doc] Update faq and ggui, and add them to CI (7861) (by **Zhao Liang**)
- [build] Remove unused apt pkg 'libmirclient-dev' to make 'build.py' run properly on ubuntu 22.04 (7871) (by **Yu Zhang**)
- [Lang] Fix a bug on PosixPath (7860) (by **Zhao Liang**)
- [ci] Polishing build.py, wave 4 (7857) (by **Proton**)
- [build] Use LLVM without zstd dependency on M1 Macs (7856) (by **Proton**)
- [doc] Update dev_install.md to reflect build.py usage (7848) (by **Proton**)
- [ci] Polishing build.py, wave 3 (7845) (by **Proton**)
- [lang] Add popcnt to llvm intrinsic support (7772) (by **Garry Ling**)
- [Doc] Update doc for dynamic snode (7804) (by **Zhao Liang**)
- [ci] Fix release build failure (7834) (by **Proton**)
- [ci] More robust build.py bootstrapping (7833) (by **Proton**)
- [Doc] Update field.md (7819) (by **zhoooou**)
- [autodiff] Remove redundant autodiff mode in kernel name (7829) (by **Ailing**)
- [lang] Migrate Caching Allocation logics from CudaDevice/AmdgpuDevice to DeviceMemoryPool (7793) (by **Zhanlue Yang**)
- [misc] Resolve code formatter frictions (7828) (by **Proton**)
- [Lang] Seprate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (7803) (by **Zhanlue Yang**)
- [bug] Fix imgui_context in destroying multiple GGUI windows (7812) (by **Ailing**)
- [misc] Update git-blame-ignore-revs (7825) (by **Proton**)
- [ci] Complete doc test list, remove redundant default prelude (7823) (by **Proton**)
- [misc] Relax Black formatter line length limit to 120 (7824) (by **Proton**)
- [Doc] Update readme (7808) (by **yanqingzhang**)
- [misc] Switch code formatter from `yapf` to `black` (7785) (by **Proton**)
- [CUDA] Better handling shared array shape check (7818) (by **Haidong Lan**)
- [misc] Improve ::liong::json::deserialize() (by **PGZXB**)
- [bug] Fix gen_offline_cache_key (7810) (by **PGZXB**)
- [ci] Fix build.py ensurepip (7811) (by **Proton**)
- [Lang] Fix pylance warning (7805) (by **Zhao Liang**)
- [lang] Support frexp on spirv-based backends (7770) (by **Ailing**)
- [lang] Split MemoryPool into DeviceMemoryPool and HostMemoryPool (7786) (by **Zhanlue Yang**)
- [misc] Optimize import overhead: pytorch and get_clangpp (7797) (by **Haidong Lan**)
- [ci] [doc] Tighten up document testing (7801) (by **Proton**)
- [ci] Polishing build.py, wave 2 (7800) (by **Proton**)
- [aot] Remove unused AotDataConverter (7799) (by **Lin Jiang**)
- [perf] Fix Taichi CPU backend compile parameter to pair performance with Numba. (7731) (by **zhengxianli**)
- [ci] Polishing build.py (7794) (by **Proton**)
- [bug] Returning nan for ti.sym_eig on identity matrix (7443) (by **Yimin Tang**)
- [Lang] Support taking structs as kernel arguments (by **lin-hitonami**)
- [ir] Add 'create_load' to ArgLoadStmt (by **lin-hitonami**)
- [ir] Let the src of GetElementStmt be a pointer (by **lin-hitonami**)
- [lang] Clean up runtime allocation functions (7773) (by **Zhanlue Yang**)
- [lang] Migrate CUDA preallocation logic to CudaMemoryPool (7746) (by **Zhanlue Yang**)
- [gfx] Fix runtime buffer/image copy barrier semantics (7781) (by **Bob Cao**)
- [misc] Remove unnecessary TaskCodeGenLLVM::task_counter (7777) (by **PGZXB**)
- [ci] Temporarily force Windows release builds to run on sm70 nodes (7767) (by **Proton**)
- [refactor] Remove Kernel::lowered_ (7765) (by **PGZXB**)
- [gui] Fluid visualization utilities (7682) (by **Qian Bao**)
- [Lang] Fix math module circular import bugs (7762) (by **Zhao Liang**)
- [misc] Make pre-commit happy (7768) (by **Proton**)
- [ci] Build iOS AOT static library (by **Proton**)
- [misc] Wrap path with std::filesystem::path (7754) (by **Bob Cao**)
- [lang] Support vector and matrix dtypes in ti.field (7761) (by **Ailing**)
- [ir] Remove unnecessary field_dims_ in ArgLoadStmt (7755) (by **Ailing**)
- [refactor] Remove Kernel::task_counter_ (7751) (by **PGZXB**)
- [ci] Build.py: Introduce TAICHI_CMAKE_ARGS manager for better log readability (by **Proton**)
- [ci] Reorganize build.py code (by **Proton**)
- [refactor] Let KernelCompilationManager manage kernel compilation in gfx::AotModuleBuilderImpl (7715) (by **PGZXB**)
- [misc] Remove unused FullSimplifyPass::Args::program (7750) (by **PGZXB**)
- [refactor] Re-impl LlvmAotModule using LLVM::KernelLauncher (7744) (by **PGZXB**)
- [lang] Implement experimental CG(Conjugate Gradient) solver in Taichi-lang (7690) (by **Qian Bao**)
- [lang] Transform bit_shr to bit_sar for uint (7757) (by **Ailing**)
- [ir] Postpone scalarize and lower_matrix_ptr to after bit loop vectorization (7726) (by **魔法少女赵志辉**)
- [ci] Isolate post sm70 tests (7740) (by **Proton**)
- [cuda] Suppport using SparseMatrix on more CUDA versions (7724) (by **Yu Zhang**)
- [cuda] Update the data layout of CUDA (7748) (by **Lin Jiang**)
- [ci] Ignore dup benchmark data points (7749) (by **Proton**)
- [bug] Fix reduction of atomic max (7747) (by **Lin Jiang**)
- [Doc] Update write_test.md (7745) (by **Qian Bao**)
- [refactor] Remove 'args' from 'RuntimeContext' (by **lin-hitonami**)
- [gfx] Let gfx backends use LaunchContextBuilder to build arguments in struct type (by **lin-hitonami**)
- [gfx] [refactor] Convert f16 in LaunchContextBuilder (by **lin-hitonami**)
- [gfx] Record the struct type of arguments and results in KernelContextAttributes (by **lin-hitonami**)
- [gfx] Compile struct type of result and arguments in gfx backends (by **lin-hitonami**)
- [refactor] Implement CompiledKernelData::check() (7743) (by **PGZXB**)
- [doc] [test] Update docs for printing with f-strings and formatted strings (7733) (by **魔法少女赵志辉**)
- [lang] Improve error message for mismatched index for ndarrays in python scope (7737) (by **Ailing**)
- [bug] Avoid redundant cache loading (7741) (by **PGZXB**)
- [refactor] Let KernelCompilationManager manage kernel compilation in LlvmAotModuleBuilder (7714) (by **PGZXB**)
- [ci] Skip large shared memory test for Turing GPUs. (7739) (by **Haidong Lan**)
- [cuda] Remove deprecated cusparse functions (7725) (by **Yu Zhang**)
- [misc] Update pull_request_template.md (7738) (by **Ailing**)
- [misc] Remove TI_WARN for cuda in memory_pool.cpp (7734) (by **Ailing**)
- [CUDA] Support large shared memory for CUDA backend (7452) (by **Haidong Lan**)
- [vulkan] Update SPIR-V codegen to emit FP16 consts (7676) (by **Bob Cao**)
- [lang] Support frexp on cuda backend (7721) (by **Ailing**)
- [refactor] Unify implementation of ProgramImpl::compile() (by **PGZXB**)
- [refactor] Introduce LLVM::KernelLauncher (by **PGZXB**)
- [refactor] Introduce gfx::KernelLauncher (by **PGZXB**)
- [test] Enable test offline cache on amdgpu and dx11 (7703) (by **PGZXB**)
- [lang] Refactor ownership and inheritance of allocators (7685) (by **Zhanlue Yang**)
- [ci] Fix git cache quirks (7722) (by **Proton**)
- [lang] Improve error msg in create ndarray (7709) (by **Garry Ling**)
- [Doc] Update performance.md (7720) (by **Zhao Liang**)
- [bug] Switch the gallery image used by README. (7716) (by **Chengchen(Rex) Wang**)
- [lang] Merge AMDGPUCachingAllocator to the generic CachingAllocator (7717) (by **Zhanlue Yang**)
- [bug] Invalid Field cache, RWAccessors cache, and Kernel cache upon SNodeTree destruction (7704) (by **Zhanlue Yang**)
- [ci] [test] Enable cc test on CI (by **lin-hitonami**)
- [test] [cc] Skip tests that cc backend doesn't support (by **lin-hitonami**)
- [test] Exclude the cc backend from tests that involve dynamic indexing (7705) (by **魔法少女赵志辉**)
- [bug] Fix camera controls (7681) (by **liblaf**)
- [bug] [cc] Fix comparison op in cc backend (by **Lin Jiang**)
- [bug] [cc] Set external ptr for cc backend (by **lin-hitonami**)
- [lang] Merged VirtualMemoryAllocator into MemoryPool for LLVM-CPU backend (7671) (by **Zhanlue Yang**)
- [misc] Remove useless JITEvaluatorId (7700) (by **PGZXB**)
- [bug] Fixed building with clang on Windows failed (7699) (by **PGZXB**)
- [Lang] Support formatted printing in str.format() and f-strings (7686) (by **魔法少女赵志辉**)
- [ci] Git caching proxy in CI (7692) (by **Proton**)
- [build] Let msvc generate pdb for cpp & c_api tests (by **lin-hitonami**)
- [refactor] Stop storing pointers to array devallocs in kernel args (by **lin-hitonami**)
- [aot] Implement bin2c in AOT cppgen (7687) (by **PENGUINLIONG**)
- [cpu] Remove atomics demotion for single-thread CPU targets. (7631) (by **Haidong Lan**)
- [aot] Export templated kernels (7683) (by **PENGUINLIONG**)
- [ci] Revive /benchmark (7680) (by **Proton**)
- [Doc] Update readme (7673) (by **Zhao Liang**)
- [misc] Device API public headers and CMake rework part 1 (7624) (by **Bob Cao**)
- [misc] Move optimize cpu module to KernelCodeGen (7667) (by **PGZXB**)
- [lang] [ir] Extract and save the format specifiers in str.format() (7660) (by **魔法少女赵志辉**)
- [example] Add 2D euler fluid simulation example (7568) (by **Lee-abcde**)
- [wasm] Remove WASM backend (by **lin-hitonami**)
- [build] Fix ssize_t type undefined errors when building with TI_WITH_LLVM=OFF on windows (7665) (by **Yu Zhang**)
- [misc] Remove unused Kernel::is_evaluator (7669) (by **PGZXB**)
- [misc] Remove unused Program::jit_evaluator_cache and Program::jit_evaluator_cache_mut (7668) (by **PGZXB**)
- [misc] Simplify test_offline_cache.py (7663) (by **PGZXB**)
- [lang] Improve error reporting for FieldsBuilder finalization (7640) (by **Zhanlue Yang**)
- [misc] Rename taichi::lang::llvm to taichi::lang::LLVM (7659) (by **PGZXB**)
- [refactor] Remove MemoryPool daemon in LLVM runtime (7648) (by **Zhanlue Yang**)
- [opt] Cleanup unncessary options in constant fold pass (7661) (by **Ailing**)
- [ci] Use build.py to prepare testing environment on Windows (7658) (by **Proton**)
- [opt] Move binary jit evaluator to host (by **Ailing Zhang**)
- [test] Update C++ constant fold tests to test operator one by one (by **Ailing Zhang**)
- [aot] Avoid shared library file being packaged into wheel data (7652) (by **Chenzhan Shang**)
- [ci] Fix scipy install (7649) (by **Proton**)
- [misc] Remove an unnecessary parameter of KernelCompilationManager::make_filename (by **PGZXB**)
- [refactor] Remove some unnecessary functions of KernelCodeGen (by **PGZXB**)
- [refactor] Re-impl JIT and Offline Cache on LLVM backends (by **PGZXB**)
- [refactor] Implement llvm::KernelCompiler (by **PGZXB**)
- [refactor] Gen code for KernelCodeGen::ir instead of KernelCodeGen::kernel->ir (by **PGZXB**)
- [Doc] Update tutorial.md (7512) (by **Chenzhan Shang**)
- [ci] Test manylinux2014 build on PR (7647) (by **Proton**)
- [bug] Fix logical comparison returns -1 (7641) (by **Ailing**)
- [doc] Fix gui_system.md tests (7646) (by **Proton**)
- [Doc] Update gui_system.md (7628) (by **Qian Bao**)
- [aot] Hand-written CMake target script (7644) (by **PENGUINLIONG**)
- [ci] Do not use Android toolchain for perf testing (7642) (by **Proton**)
- [ci] Support Python 3.11 (7627) (by **Proton**)
- [build] Setup Android SDK environment for performance bot (7635) (by **Zhanlue Yang**)
- [ci] Update perf mon image (7639) (by **Proton**)
- [ci] Fix perf mon break (7638) (by **Proton**)
- [doc] Add documentation on using ghstack (7632) (by **Proton**)
- [build] Static linking libstdc++ on Linux (by **Proton**)
- [ci] Rewrite Dockerfiles (by **Proton**)
- [ci] Resolve "Needed single revision" workaround failure when the repo directory is empty (7633) (by **Proton**)
- [Vulkan] Fix repeated generation of array ranges in spirv codegen. (7625) (by **Haidong Lan**)
- [build] Switch to use docker with Android-SDK for performance bot (7630) (by **Zhanlue Yang**)
- [opengl] glfw finalize crash fix (by **Proton**)
- [ci] build.py: Android support, entering shell, export env (by **Proton**)
- [ci] Do not run tests with mixed backends (by **Proton**)
- [refactor] Use f16 function from external lib (by **lin-hitonami**)
- [refactor] Migrate members from RuntimeContext to LaunchContextBuilder (by **lin-hitonami**)
- [bug] Fix setting arguments exceeding the max arg num (by **lin-hitonami**)
- [cpu] Explicitly make cpu multithreading loop for range-fors. (7593) (by **Haidong Lan**)
- [aot] Fixed generator for compute graph (7626) (by **PENGUINLIONG**)
- [ir] Postpone scalarize and lower_matrix_ptr to after typecheck (7589) (by **魔法少女赵志辉**)
- [aot] Header generator completed (7609) (by **PENGUINLIONG**)
- [amdgpu] Initialize AMDGPUContext with defaults (by **Proton**)
- [build] Remove libSPIRV-Tools-shared.(so|dll) in wheel (by **Proton**)
- [lang] Removed cpu_device(), cuda_device(), and amdgpu_device() from LlvmRuntimeExecutor (7544) (by **Zhanlue Yang**)
- [refactor] Remove the get/set functions in RuntimeContext (by **lin-hitonami**)
- [aot] Pass LaunchContextBuilder to CompiledGraph::init_runtime_context (by **lin-hitonami**)
- [gfx] Let GfxRuntime use LaunchContextBuilder (by **lin-hitonami**)
- Let LaunchContextBuilder be the argument of the kernel launch function (by **lin-hitonami**)
- [llvm] [refactor] Set the llvm runtime when executing (by **lin-hitonami**)
- [refactor] Migrate {set, get}_{arg, ret} functions from RuntimeContext (by **lin-hitonami**)
- [bug] Fix compilation error (7606) (by **PGZXB**)
- [aot] Hide map memory failure (7604) (by **PENGUINLIONG**)
- [refactor] Fix KernelCodeGen::kernel from Kernel * to const Kernel * (by **PGZXB**)
- [refactor] Remove legacy implementation of llvm offline cache (by **PGZXB**)
- [refactor] Impl llvm::CompiledKernelData (by **PGZXB**)
- [bug] Type check for logical not op with real type inputs (7600) (by **Ailing**)
- [bug] Improve ndarray creation to fix segmentation fault (7577) (by **pengyu**)
- [lang] Add assembly printer for CPU backend (7590) (by **Zhanlue Yang**)
- [misc] Update docker filer (7598) (by **Zeyu Li**)
- [aot] Fix absolute path in generated TaichiTargets.cmake (7597) (by **Chenzhan Shang**)
- [Doc] Remove deprecated api docstrings (7596) (by **pengyu**)
- [llvm] Compile the kernel arguments to a StructType (by **Lin Jiang**)
- [lang] Fix issue with llvm opaque pointer (7557) (by **Zhanlue Yang**)
- [opt] Constant folding for unary ops on host (7573) (by **Ailing**)
- [bug] Type check for bit_not op with real type inputs (7592) (by **Ailing**)
- [Doc] Fix the cexp docstring (7588) (by **Zhao Liang**)
- [Lang] Replace internal representation of Python-scope ti.Matrix with numpy arrays (7559) (by **Yi Xu**)
- [bug] Avoid cuda compilation via clang and ship pre-compiled .bc file instead (7570) (by **Zhanlue Yang**)
- [aot] Taichi kernel AOT command (7565) (by **PENGUINLIONG**)
- [bug] Fix struct members registered to StructField class (7574) (by **Ailing**)
- [aot] Mobile platform AOT build scripts (7567) (by **PENGUINLIONG**)
- [misc] Revert "Security upgrade ipython from 7.34.0 to 8.10.0 (7341)" (7571) (by **Proton**)
- [test] Add cpp tests for constant folding pass (7566) (by **Ailing**)
- [misc] Security upgrade ipython from 7.34.0 to 8.10.0 (7341) (by **Chengchen(Rex) Wang**)
- [lang] Refactor CudaCachingAllocator into a more generic caching allocator (7531) (by **Zhanlue Yang**)
- [aot] Load GfxRuntime140 module from TCM (7539) (by **PENGUINLIONG**)
- [lang] Fixed useless serial shader to blit ExternalTensorShapeAlongAxisStmt on Metal (7562) (by **PENGUINLIONG**)
- [aot] Enable Vulkan 8bit storage (7564) (by **PENGUINLIONG**)
- [bug] Fix crashing on printing FrontendFuncCallStmt with no return value (by **lin-hitonami**)
- [refactor] Remove LaunchContextBuilder::set_arg_raw (by **lin-hitonami**)
- [llvm] Generalize TaskCodeGenLLVM::create_return to set_struct_to_buffer (by **lin-hitonami**)
- [bug] Fix Cuda memory leak during TiRuntime destruction (7345) (by **Zhanlue Yang**)
- [ir] Let void struct type represent void type (by **lin-hitonami**)
- [aot] Let C-API use LaunchContextBuilder to manage RuntimeContext (by **lin-hitonami**)
- [ir] Let the reference type declare a pointer argument (by **lin-hitonami**)
- [Doc] Add doc about returning struct (7556) (by **Lin Jiang**)
- [bug] Fix returning struct containing vec3 (7552) (by **Lin Jiang**)
- [lang] [ir] Extract and save the format specifiers in the f-string (7514) (by **魔法少女赵志辉**)
- [Lang] Stop letting ti.Struct inherit from TaichiOperations (7474) (by **Yi Xu**)
- [aot] Recover AOT CI branch names (7543) (by **PENGUINLIONG**)
- [aot] Put TiRT in Python wheel and CMake script to find it in wheel (7537) (by **PENGUINLIONG**)
- [refactor] Remove the difficult-to-implement CompiledKernelData::size() (7540) (by **PGZXB**)
- [bug] Implement the missing clone function for FrontendFuncCallStmt (7538) (by **PGZXB**)
- [misc] Bump version to v1.6.0 (7536) (by **Haidong Lan**)
- [doc] Handle 2 digit minor versions correctly (7535) (by **Ritoban Roy-Chowdhury**)
- [aot] GfxRuntime140 convention docs (7527) (by **PENGUINLIONG**)
- [rhi] Refactor allocate_memory API to use RhiResult (7463) (by **Bob Cao**)
- [metal] Choose the proper msl version according to the device capability (7506) (by **Yu Zhang**)
- [Lang] Support writing sparse matrix as matrix market file (7529) (by **pengyu**)

1.5.0

Deprecation Notice
- `ndarray` no longer accepts `field_dim`; use the `ndim` argument instead.
- [RFC] Deprecate the `ti.cc` backend in favor of TiRT and its C API. If you have any concerns, please let us know at https://github.com/taichi-dev/taichi/issues/7629

New features

AOT
- Taichi Runtime (TiRT) now supports Apple's Metal API and OpenGL ES for compatibility with older mobile platforms. Taichi programs can now be deployed to any mainstream consumer device.
  NOTE: Taichi program deployment on mobile platforms is experimental. Please contact us at contact@taichi.graphics for long-term services.
- Taichi AOT now fully supports the float16 dtype.

Ndarray
- Out-of-bounds checks are now supported on ndarrays.

Improvements

Python Frontend
Returning a struct is now supported on LLVM-based backends (CPU and CUDA). The struct can contain vectors and matrices, and structs can be nested. Here's an example:
```python
import taichi as ti

ti.init(arch=ti.cpu)  # returning structs requires an LLVM-based backend

s0 = ti.types.struct(a=ti.math.vec3, b=ti.i16)
s1 = ti.types.struct(a=ti.f32, b=s0)

@ti.kernel
def foo() -> s1:
    return s1(a=1, b=s0(a=ti.math.vec3(100, 0.2, 3), b=1))

print(foo())  # {'a': 1.0, 'b': {'a': [100.0, 0.2, 3.0], 'b': 1}}
```

Performance
- Support atomic operations on half2 for the CUDA backend (with compute capability > 60). Enable this with `ti.init(half2_vectorization=True)`. This feature can effectively accelerate NeRF training; see [this repo](https://github.com/taichi-dev/taichi-nerfs) for details.

GGUI
- GGUI no longer has computing backend restrictions. You can now use Metal, OpenGL, AMDGPU, or DirectX 11, in addition to CPU, CUDA, and Vulkan, which GGUI already supported.
- GGUI has been validated on lavapipe, Mesa's software rasterizer. You can use it for headless server visualization, or on servers with no graphics capabilities (such as A100).
- Add the `fps_limit` option, which adjusts the maximum frame rate in GGUI.

Full changelog:

Highlights:
- **AMDGPU backend**
- Enable shared array on amdgpu backend (7403) (by **Zeyu Li**)
- Add print kernel amdgcn (7357) (by **Zeyu Li**)
- Add amdgpu backend profiler (7330) (by **Zeyu Li**)
- **Aot module**
- Let AOT kernel inherit CallableBase and use LaunchContextBuilder (by **lin-hitonami**)
- Deprecate element shape and field dim for AOT symbolic args (7100) (by **Haidong Lan**)
- **Bug fixes**
- Fix copy_from() of StructField (7294) (by **Yi Xu**)
- Fix caching same loop invariant global vars inside nested fors (7285) (by **Lin Jiang**)
- Fix num_splits in parallel_struct_for (7121) (by **Yi Xu**)
- Fix ret_type and cast_type of UnaryOpStmt in Scalarize (7082) (by **Yi Xu**)
- **Documentation**
- Update GGUI docs with correct API (7525) (by **pengyu**)
- Fix typos and improve example code in data_oriented_class.md (7520) (by **pengyu**)
- Update gui_system.md, remove unnecessary example (7487) (by **NextoneX**)
- Fix typo in API doc (7511) (by **pengyu**)
- Update math_module (7405) (by **Zhao Liang**)
- Update hello_world.md (7400) (by **Zhao Liang**)
- Update debugging.md (7401) (by **Zhao Liang**)
- Update hello_world.md (7380) (by **Zhao Liang**)
- Update type.md (7376) (by **Zhao Liang**)
- Update kernel_function.md (7375) (by **Zhao Liang**)
- Update hello_world.md (7369) (by **Zhao Liang**)
- Update hello_world.md (7368) (by **Zhao Liang**)
- Update data_oriented_class.md (6790) (by **Zhao Liang**)
- Update hello_world.md (7367) (by **Zhao Liang**)
- Update kernel_function.md (7364) (by **Zhao Liang**)
- Update hello_world.md (7354) (by **Zhao Liang**)
- Update llvm_sparse_runtime.md (7323) (by **Gabriel Vainer**)
- Update profiler.md (7358) (by **Zhao Liang**)
- Update kernel_function.md (7356) (by **Zhao Liang**)
- Update tut.md (7352) (by **Gabriel Vainer**)
- Update type.md (7350) (by **Zhao Liang**)
- Update hello_world.md (7337) (by **Zhao Liang**)
- Update append docstring (7265) (by **Zhao Liang**)
- Update ndarray.md (7236) (by **Gabriel Vainer**)
- Update llvm_sparse_runtime.md (7215) (by **Zhao Liang**)
- Remove doc tutorial (7198) (by **Olinaaaloompa**)
- Rename tutorial doc (7186) (by **Zhao Liang**)
- Update tutorial.md (7176) (by **Zhao Liang**)
- Update math_module.md (7175) (by **Zhao Liang**)
- Update debugging.md (7173) (by **Zhao Liang**)
- Fix C++ tutorial does not display on doc site (7174) (by **Zhao Liang**)
- Update doc regarding dynamic index (7148) (by **Yi Xu**)
- Move glossary to top level (7118) (by **Zhao Liang**)
- Update type.md (7038) (by **Zhao Liang**)
- Fix docstring (7065) (by **Zhao Liang**)
- **Error messages**
- Allow IfExp on matrices when the condition is scalar (7241) (by **Lin Jiang**)
- Remove deprecations in ti.ui in 1.6.0 (7229) (by **Lin Jiang**)
- Remove deprecated ti.linalg.sparse_matrix_builder in 1.6.0 (7228) (by **Lin Jiang**)
- Remove deprecations in ASTTransformer in 1.6.0 (7226) (by **Lin Jiang**)
- Remove deprecated a.atomic_op(b) in Taichi v1.6.0 (7225) (by **Lin Jiang**)
- Remove deprecations in taichi/__init__.py in v1.6.0 (7222) (by **Lin Jiang**)
- Raise error when using deprecated ifexp on matrices (7224) (by **Lin Jiang**)
- Better error message when creating sparse snodes on backends that do not support sparse (7191) (by **Lin Jiang**)
- Raise errors when using metal sparse (7113) (by **Lin Jiang**)
- **GUI**
- GGUI use shader "factory" (GGUI rework n/N) (7271) (by **Bob Cao**)
- **Intermediate representation**
- Unified type system for internal operations (6337) (by **daylily**)
- **Language and syntax**
- Keep ti.pyfunc (7530) (by **Lin Jiang**)
- Type check assignments between tensors (7480) (by **Yi Xu**)
- Fix pylance warnings raised by ti.static (7437) (by **Zhao Liang**)
- Deprecate arithmetic operations and fill() on ti.Struct (7456) (by **Yi Xu**)
- Fix pylance warnnings by ti.random (7439) (by **Zhao Liang**)
- Fix pylance types warning (7417) (by **Zhao Liang**)
- Add better error message for dynamic snode (7238) (by **Zhao Liang**)
- Simplify the swizzle generator (7216) (by **Zhao Liang**)
- Remove the deprecated dynamic_index switch (7195) (by **Yi Xu**)
- Remove deprecated packed switch (7104) (by **Yi Xu**)
- Raise errors when using the packed switch (7125) (by **Yi Xu**)
- Fix cannot use taichi in REPL (7114) (by **Zhao Liang**)
- Remove deprecated ti.Matrix.rotation2d() (7098) (by **Yi Xu**)
- Remove filename kwarg in aot Module save() (7085) (by **Ailing**)
- Remove sourceinspect deprecation warning message (7081) (by **Zhao Liang**)
- Make slicing a single row/column of a matrix return a vector (7068) (by **Yi Xu**)
- **Miscellaneous**
- Strictly check ndim with external array (7126) (by **Haidong Lan**)

Full changelog:
- [cc] Add deprecation notice for cc backend (7651) (by **Ailing**)
- [misc] Cherry pick struct return related commits (7575) (by **Haidong Lan**)
- [Lang] Keep ti.pyfunc (7530) (by **Lin Jiang**)
- [bug] Fix symbol conflicts with taichi_cpp_tests (7528) (by **Zhanlue Yang**)
- [bug] Fix numerical issue with TensorType'd arithmetics (7526) (by **Zhanlue Yang**)
- [aot] Enable Metal AOT test (7461) (by **PENGUINLIONG**)
- [Doc] Update GGUI docs with correct API (7525) (by **pengyu**)
- [misc] Implement KernelCompialtionManager::clean_offline_cache (7515) (by **PGZXB**)
- [ir] Except shared array from demote atomics pass. (7513) (by **Haidong Lan**)
- [bug] Fix error with windows-clang compilation for cuda_runtime.cu (7519) (by **Zhanlue Yang**)
- [misc] Deprecate field dim and update deprecation warnings (7491) (by **Haidong Lan**)
- [build] Fix build failure without nvcc (7521) (by **Ailing**)
- [Doc] Fix typos and improve example code in data_oriented_class.md (7520) (by **pengyu**)
- [aot] Kernel argument count limit (7518) (by **PENGUINLIONG**)
- [Doc] Update gui_system.md, remove unnecessary example (7487) (by **NextoneX**)
- [AOT] [llvm] Let AOT kernel inherit CallableBase and use LaunchContextBuilder (by **lin-hitonami**)
- [llvm] Let the offline cache record the type info of arguments and return values (by **lin-hitonami**)
- [ir] Separate LaunchContextBuilder from Kernel (by **lin-hitonami**)
- [Doc] Fix typo in API doc (7511) (by **pengyu**)
- [aot] Build Runtime C-API by default (7508) (by **PENGUINLIONG**)
- [bug] Fix run_tests.py --with-offline-cache (7507) (by **PGZXB**)
- [vulkan] Support printing constant strings containing % (7499) (by **魔法少女赵志辉**)
- [ci] Fix nightly version number, 2nd try (7501) (by **Proton**)
- [aot] Fixed memory leak in metal backend (7500) (by **PENGUINLIONG**)
- [ci] Fix nightly version number issue (7498) (by **Proton**)
- [example] Remove cv2, cairo dependency (7496) (by **Zhao Liang**)
- [type] Let Type * be serializable (by **lin-hitonami**)
- [ci] Second attempt at permission check for ghstack landing (7490) (by **Proton**)
- [docs] Reword words of warning about building from source (7488) (by **Anselm Schüler**)
- [lang] Fixed double release of Metal command buffer (7484) (by **PENGUINLIONG**)
- [ci] Switch Android bots lock redis to bot-master (7482) (by **Proton**)
- [ci] Status check of ghstack CI bot (7479) (by **Proton**)
- [Lang] Type check assignments between tensors (7480) (by **Yi Xu**)
- [doc] Fix typo in ndarray.md (7476) (by **Chenzhan Shang**)
- [opt] Enable half2 optimization for atomic_add operations on CUDA backend (7465) (by **Zhanlue Yang**)
- [Lang] Fix pylance warnings raised by ti.static (7437) (by **Zhao Liang**)
- Let the LaunchContextBuilder manage the result buffer (by **lin-hitonami**)
- [ci] Fix nightly build failure, and minor improvements (7475) (by **Proton**)
- [ci] Fix duplicated names in aot tests (7471) (by **Ailing**)
- [lang] Improve float16 support from Taichi type system (7402) (by **Zhanlue Yang**)
- [Lang] Deprecate arithmetic operations and fill() on ti.Struct (7456) (by **Yi Xu**)
- [misc] Add out of bound check for ndarray (7458) (by **Ailing**)
- [aot] Remove graph kernel interfaces (7466) (by **PENGUINLIONG**)
- [llvm] Let the RuntimeContext use the host result buffer (by **lin-hitonami**)
- [gui] Fix 3d line drawing & add test (7454) (by **Bob Cao**)
- [lang] Fixed texture assertions (7450) (by **PENGUINLIONG**)
- [aot] Fixed header generator (7455) (by **PENGUINLIONG**)
- [aot] AOT module convention GfxRuntime140 (7440) (by **PENGUINLIONG**)
- [misc] Add an explicit error in cc backend codegen for dynamic indexing (7449) (by **Ailing**)
- [ci] Lower C++ tests concurrency (7451) (by **Proton**)
- [aot] Properly handle texture attributes (7433) (by **PENGUINLIONG**)
- [Lang] Fix pylance warnnings by ti.random (7439) (by **Zhao Liang**)
- [ir] Get the StructType of the kernel parameters (by **lin-hitonami**)
- [ci] Report failure (not throwing exception) when C++ tests fail (7435) (by **Proton**)
- [llvm] Allocate the result buffer from preallocated memory (by **lin-hitonami**)
- [vulkan] Fix GGUI and vulkan swapchain on AMD drivers (7382) (by **Bob Cao**)
- [autodiff] Handle return statement (7389) (by **Mingrui Zhang**)
- [misc] Remove unnecessary functions of gfx::AotModuleBuilderImpl (7425) (by **PGZXB**)
- [bug] Fix offline_cache::clean_offline_cache_files (ti cache clean) (7426) (by **PGZXB**)
- [test] Refactor C++ tests runner (7421) (by **Proton**)
- [ci] Adjust perfmon GPU freq (7429) (by **Proton**)
- [misc] Remove AotModuleParams::enable_lazy_loading (7424) (by **PGZXB**)
- [aot] Use graphs.json instead of TCB (7392) (by **PENGUINLIONG**)
- [refactor] Introduce KernelCompilationManager (7409) (by **PGZXB**)
- [IR] Unified type system for internal operations (6337) (by **daylily**)
- [lang] Add is_lvalue() to Expr to check writeback_binary operand (7414) (by **魔法少女赵志辉**)
- [bug] Fix get_error_string ret type typo (7418) (by **Zeyu Li**)
- [aot] Reorganize graph argument creation process (7412) (by **PENGUINLIONG**)
- [Amdgpu] Enable shared array on amdgpu backend (7403) (by **Zeyu Li**)
- [Lang] Fix pylance types warning (7417) (by **Zhao Liang**)
- [aot] Simplify device capability assignment (7407) (by **PENGUINLIONG**)
- [Doc] Update math_module (7405) (by **Zhao Liang**)
- [ci] Lock GPU frequency in perf benchmarking (7413) (by **Proton**)
- [ci] Add 'Needed single revision' workaround to all tasks (7408) (by **Proton**)
- [Doc] Update hello_world.md (7400) (by **Zhao Liang**)
- [refactor] Introduce KernelCompiler and implement spirv::KernelCompiler (7371) (by **PGZXB**)
- [Amdgpu] Add print kernel amdgcn (7357) (by **Zeyu Li**)
- [Doc] Update debugging.md (7401) (by **Zhao Liang**)
- [refactor] Disable ASTSerializer::allow_undefined_visitor (7391) (by **PGZXB**)
- [amdgpu] Enable llvm FpOpFusion option on AMDGPU backend (7398) (by **Zeyu Li**)
- [aot] Add test for shared array (7387) (by **Ailing**)
- [vulkan] Change command list submit error message & misc device API cleanups (7395) (by **Bob Cao**)
- [bug] Fix arch_uses_spirv (7399) (by **PGZXB**)
- [gui] Fix ggui & vulkan swapchain sizes on HiDPI displays (7394) (by **Bob Cao**)
- [Doc] Update hello_world.md (7380) (by **Zhao Liang**)
- [aot] Remove support for depth24stencil8 format on Metal (7377) (by **PENGUINLIONG**)
- [bug] Add DeviceCapabilityConfig to offline cache key (7384) (by **PGZXB**)
- [Doc] Update type.md (7376) (by **Zhao Liang**)
- [refactor] Remove dependencies on Callable::program in cpp tests (7373) (by **PGZXB**)
- [lang] Experimental support of conjugate gradient solver (7035) (by **pengyu**)
- [aot] Metal interop APIs (7366) (by **PENGUINLIONG**)
- [Doc] Update kernel_function.md (7375) (by **Zhao Liang**)
- [gui] Add `fps_limit` for GGUI (7374) (by **Bob Cao**)
- [Doc] Update hello_world.md (7369) (by **Zhao Liang**)
- [aot] Fix blockers in static library build with XCode (7365) (by **PENGUINLIONG**)
- [vulkan] Remove GLFW from Vulkan rhi dependency (7351) (by **Bob Cao**)
- [misc] Remove useless semicolon in llvm_program.h (7372) (by **PGZXB**)
- [Doc] Update hello_world.md (7368) (by **Zhao Liang**)
- [Amdgpu] Add amdgpu backend profiler (7330) (by **Zeyu Li**)
- [lang] Stop broadcasting scalar cond in select statements (7344) (by **魔法少女赵志辉**)
- [bug] Fix validation erros due to inactive VK_KHR_16bit_storage (7360) (by **Zhanlue Yang**)
- [aot] Support texture in Metal (7363) (by **PENGUINLIONG**)
- [Doc] Update data_oriented_class.md (6790) (by **Zhao Liang**)
- [Doc] Update hello_world.md (7367) (by **Zhao Liang**)
- [refactor] Introduce lang::CompiledKernelData (7340) (by **PGZXB**)
- [bug] Fix matrix initialization error with numpy.floating data (7362) (by **Zhanlue Yang**)
- [Doc] Update kernel_function.md (7364) (by **Zhao Liang**)
- [test] [amdgpu] Fix bug with allocs bb in function body (7308) (by **Zeyu Li**)
- [Doc] Update hello_world.md (7354) (by **Zhao Liang**)
- [aot] Fixed C-API docs (7361) (by **PENGUINLIONG**)
- [refactor] Remove dependencies on Callable::program in lang::CompiledGraph::run (7288) (by **PGZXB**)
- [DOC] Update llvm_sparse_runtime.md (7323) (by **Gabriel Vainer**)
- [Doc] Update profiler.md (7358) (by **Zhao Liang**)
- [Doc] Update kernel_function.md (7356) (by **Zhao Liang**)
- [aot] Improve Taichi C++ wrapper implementation (7347) (by **PENGUINLIONG**)
- [Doc] Update tut.md (7352) (by **Gabriel Vainer**)
- [ci] Add doc snippet CI requirements (7355) (by **Proton**)
- [amdgpu] Update device memory free (7346) (by **Zeyu Li**)
- [Doc] Update type.md (7350) (by **Zhao Liang**)
- [aot] Enable 16-bit dtype support for Taichi AOT (7315) (by **Zhanlue Yang**)
- [example] Re-implement the Cornell Box demo with shorter lines of code (7252) (by **HK-SHAO**)
- [aot] AOT CI refactorization (7339) (by **PENGUINLIONG**)
- [llvm] Let the kernel return struct (by **lin-hitonami**)
- [Doc] Update hello_world.md (7337) (by **Zhao Liang**)
- [ci] Reduce doc test concurrency (7336) (by **Proton**)
- [ir] Refactor result fetching (by **lin-hitonami**)
- [ir] Get the offsets of elements in StructType (by **lin-hitonami**)
- [misc] Delete test.py (7332) (by **Bob Cao**)
- [vulkan] More subgroup operations (7328) (by **Bob Cao**)
- [vulkan] Add vulkan profiler (7295) (by **Haidong Lan**)
- [refactor] Move TaichiLLVMContext::runtime_jit_module and TaichiLLVMContext::create_jit_module() to LlvmRuntimeExecutor (7320) (by **PGZXB**)
- [refactor] Remove dependencies on LlvmProgramImpl::get_llvm_context() in TaskCodeGenLLVM (7321) (by **PGZXB**)
- [ci] Checkout with privileged token when landing ghstack PRs (7331) (by **Proton**)
- [ir] Add fields to StructType (by **lin-hitonami**)
- [gui] Remove renderable reuse & make renderable immediate (7327) (by **Bob Cao**)
- [Gui] GGUI use shader "factory" (GGUI rework n/N) (7271) (by **Bob Cao**)
- [bug] Fix u64 field cannot be assigned value >= 2 ** 63 (7319) (by **Lin Jiang**)
- [type] Let the compute type of quant uint be unsigned int (by **lin-hitonami**)
- [doc] Replace slack with discord (7318) (by **yanqingzhang**)
- [refactor] Change print statement to warnings.warn in taichi.lang.util.warning (7301) (by **Jett Chen**)
- [ci] ChatOps: ghstack land (7314) (by **Proton**)
- [refactor] Remove TaichiLLVMContext::lookup_function_pointer() (7312) (by **PGZXB**)
- [misc] Update MSVC flags (7254) (by **Bob Cao**)
- [doc] [ci] Cover code snippets in docs (7309) (by **Proton**)
- [refactor] Remove dependencies on LlvmProgramImpl::get_llvm_context() in KernelCodeGen (7289) (by **PGZXB**)
- [rhi] Device upload readback functions (7278) (by **Bob Cao**)
- [aot] Fixed external project inclusion (7297) (by **PENGUINLIONG**)
- [Doc] Update append docstring (7265) (by **Zhao Liang**)
- [refactor] Remove dependencies on Callable::program in lang::get_hashed_offline_cache_key (7287) (by **PGZXB**)
- [ci] [amdgpu] Enable amdgpu backend python unit tests (7293) (by **Zeyu Li**)
- [Bug] Fix copy_from() of StructField (7294) (by **Yi Xu**)
- [ci] Adapt new Android phone behavior (7306) (by **Proton**)
- [Bug] Fix caching same loop invariant global vars inside nested fors (7285) (by **Lin Jiang**)
- [amdgpu] Part5 enable the api of amdgpu (7202) (by **Zeyu Li**)
- [amdgpu] Enable struct for on amdgpu backend (7247) (by **Zeyu Li**)
- [misc] Update external/asset which was accidentally downgraded in 7248 (7284) (by **Lin Jiang**)
- [amdgpu] Update runtime module (7248) (by **Zeyu Li**)
- [llvm] Remove unused argument 'arch' in LlvmProgramImpl::get_llvm_context (7282) (by **Lin Jiang**)
- [misc] Remove deprecated kwarg in rw_texture type annotations (7267) (by **Ailing**)
- [ci] Tolerate duplicates when registering version (7281) (by **Proton**)
- [gui] Fix GGUI destruction order (7279) (by **Bob Cao**)
- [doc] Rename /doc/ndarray_android to /doc/tutorial (7273) (by **Lin Jiang**)
- [llvm] Unify the llvm context of host and device (7249) (by **Lin Jiang**)
- [misc] Fix manylinux2014 warning not printing (7270) (by **Proton**)
- [ci] Building: add complete PATH set for conda (7268) (by **Proton**)
- [autodiff] Support rsqrt operator (7259) (by **Mingrui Zhang**)
- [ci] Update pre-commit repos version (7257) (by **Proton**)
- [refactor] Fix "const CompileConfig *" to "const CompileConfig &" (Part2) (7253) (by **PGZXB**)
- [refactor] Fix "const CompileConfig *" to "const CompileConfig &" (7243) (by **PGZXB**)
- [aot] Added third-party render thread task injection for Unity (7151) (by **PENGUINLIONG**)
- [aot] Support statically linked C-API library on MacOS (7207) (by **Zhanlue Yang**)
- [gui] Force GGUI to go through host memory (nuking interops) (7218) (by **Bob Cao**)
- [Error] Allow IfExp on matrices when the condition is scalar (7241) (by **Lin Jiang**)
- [bug] Fix the parity of the RNG (7239) (by **Lin Jiang**)
- [Lang] Add better error message for dynamic snode (7238) (by **Zhao Liang**)
- [DOC] Update ndarray.md (7236) (by **Gabriel Vainer**)
- [Error] Remove deprecations in ti.ui in 1.6.0 (7229) (by **Lin Jiang**)
- [Doc] Update llvm_sparse_runtime.md (7215) (by **Zhao Liang**)
- [lang] Add validation checks for subscripts to reject negative indices (7212) (by **Zhanlue Yang**)
- [refactor] Remove legacy num_bits and acc_offsets from AxisExtractor (7227) (by **Yi Xu**)
- [Error] Remove deprecated ti.linalg.sparse_matrix_builder in 1.6.0 (7228) (by **Lin Jiang**)
- [Error] Remove deprecations in ASTTransformer in 1.6.0 (7226) (by **Lin Jiang**)
- [misc] Export DeviceAllocation into Python & support devalloc in field_info (7233) (by **Bob Cao**)
- [gui] Use templated bulk copy to simplify VBO preperation (7234) (by **Bob Cao**)
- [rhi] Add create_image_unique stub & misc RHI bug fixes (7232) (by **Bob Cao**)
- [opengl] Fix GLFW global context issue (7230) (by **Bob Cao**)
- [examples] Remove dependency on `ti.u8` compute type for ngp (7220) (by **Bob Cao**)
- [refactor] Remove Kernel::offload_to_executable (7210) (by **PGZXB**)
- [opengl] RW image binding & FP16 support (7219) (by **Bob Cao**)
- [Error] Remove deprecated a.atomic_op(b) in Taichi v1.6.0 (7225) (by **Lin Jiang**)
- [Error] Remove deprecations in taichi/__init__.py in v1.6.0 (7222) (by **Lin Jiang**)
- [Error] Raise error when using deprecated ifexp on matrices (7224) (by **Lin Jiang**)
- [refactor] Remove legacy BitExtractStmt (7221) (by **Yi Xu**)
- [amdgpu] Part4 link bitcode file (7180) (by **Zeyu Li**)
- [example] Reorganize example oit_renderer (7208) (by **Lin Jiang**)
- [aot] Fix ndarray aot with information from type hints (7214) (by **Ailing**)
- [gui] Fix wide line support on macOS (7205) (by **Bob Cao**)
- [Lang] Simplify the swizzle generator (7216) (by **Zhao Liang**)
- [refactor] Split constructing and compilation of lang::Function (7209) (by **PGZXB**)
- [doc] Fix netlify build command (7217) (by **Ailing**)
- [ci] M1 buildbot release tag (7213) (by **Proton**)
- [misc] Remove unused task_funcs (7211) (by **PGZXB**)
- [refactor] Program::this_thread_config() -> Program::compile_config() (7199) (by **PGZXB**)
- [doc] Fix format issues of windows debugging (7197) (by **Olinaaaloompa**)
- [aot] More OpenGL interop in C-API (7204) (by **PENGUINLIONG**)
- [metal] Disable a kernel test in offline cache to unblock CI (7154) (by **Ailing**)
- [ci] Switch Windows build script to build.py (6993) (by **Proton**)
- [misc] Update submodule taichi_assets (7203) (by **Lin Jiang**)
- [mac] Use ObjectLinkingLayer instead of RTDyldObjectLinkingLayer for aarch64 mac (7201) (by **Ailing**)
- [misc] Remove unused Program::jit_evaluator_id (7200) (by **PGZXB**)
- [misc] Remove legacy latex generation (7196) (by **Yi Xu**)
- [Lang] Remove the deprecated dynamic_index switch (7195) (by **Yi Xu**)
- [bug] Fix check_matched() failure with Ndarray holding TensorType'd element (7178) (by **Zhanlue Yang**)
- [Doc] Remove doc tutorial (7198) (by **Olinaaaloompa**)
- [bug] Fix example circle-packing (7194) (by **Lin Jiang**)
- [aot] C-API opengl runtime interop (7120) (by **damnkk**)
- [Error] Better error message when creating sparse snodes on backends that do not support sparse (7191) (by **Lin Jiang**)
- [example] Fix ti gallery close warning (7187) (by **Zhao Liang**)
- [lang] Interface refactors for MatrixType and VectorType (7143) (by **Zhanlue Yang**)
- [aot] Find Taichi in python wheel (7181) (by **PENGUINLIONG**)
- [gui] Update circles rendering to use quads (7163) (by **Bob Cao**)
- [Doc] Rename tutorial doc (7186) (by **Zhao Liang**)
- [ir] Fix gcc cannot compile inline template specialization (7179) (by **Lin Jiang**)
- [Doc] Update tutorial.md (7176) (by **Zhao Liang**)
- [aot] Replace std::exchange with local implementation for C++11 (7170) (by **PENGUINLIONG**)
- [ci] Fix near cache urls (missing comma) (7158) (by **Proton**)
- [docs] Create windows_debug.md (7164) (by **Bob Cao**)
- [Doc] Update math_module.md (7175) (by **Zhao Liang**)
- [aot] FindTaichi CMake module to help outside project integration (7168) (by **PENGUINLIONG**)
- [aot] Removed unused archs in C-API (7167) (by **PENGUINLIONG**)
- [Doc] Update debugging.md (7173) (by **Zhao Liang**)
- [refactor] Remove dependencies on Program::this_thread_config() in irpass::constant_fold (7159) (by **PGZXB**)
- [Doc] Fix C++ tutorial does not display on doc site (7174) (by **Zhao Liang**)
- [aot] C++ wrapper for memory slice and memory allocation with host access (7171) (by **PENGUINLIONG**)
- [aot] Fixed ti_get_last_error signature (7165) (by **PENGUINLIONG**)
- [misc] Log to stderr instead of stdout (7166) (by **PENGUINLIONG**)
- [aot] C-API get version wrapper (7169) (by **PENGUINLIONG**)
- [doc] Fix spelling of "paticle_field" (7024) (by **Xiang (Kevin) Li**)
- [misc] Remove useless Program::sync (7160) (by **PGZXB**)
- [doc] Update accelerate_python.md to use ti.max (7161) (by **Tao Jin**)
- [doc] Add doc ndarray (7157) (by **Olinaaaloompa**)
- [mac] Add .dylib and .cmake to built wheel (7156) (by **Ailing**)
- [refactor] Remove dependencies on Program::this_thread_config() in some tests (7155) (by **PGZXB**)
- [refactor] Remove dependencies on Program::this_thread_config() in llvm backends codegen (7153) (by **PGZXB**)
- [Lang] Remove deprecated packed switch (7104) (by **Yi Xu**)
- [example] Update quaternion arithmetics in fractal_3d_ggui (7139) (by **Zhao Liang**)
- [doc] Update field.md (Fields advanced) (6867) (by **Gabriel Vainer**)
- [ci] Use make_changelog.py to generate the full changelog (7152) (by **Lin Jiang**)
- [refactor] Rename Callable::*arg* to Callable::*param* (7133) (by **PGZXB**)
- [aot] Introduce new AOT deployment tutorial (7144) (by **PENGUINLIONG**)
- [bug] Unify error message matching with/without validation layers for CapiTest.FailMapDeviceOnlyMemory (7110) (by **Zhanlue Yang**)
- [lang] Remove redundant TensorType expansion for function returns (7124) (by **Zhanlue Yang**)
- [lang] Sign python library for Apple M1 (7138) (by **PENGUINLIONG**)
- [gui] Fix particle size limits (7149) (by **Bob Cao**)
- [lang] Migrate TensorType expansion in MatrixType/VectorType from Python code to Frontend IR (7127) (by **Zhanlue Yang**)
- [aot] Support texture arguments for AOT kernels (7142) (by **Zhanlue Yang**)
- [metal] Retain Metal commandBuffers & build command buffers directly (7137) (by **Bob Cao**)
- [rhi] Update `create_pipeline` API and add support of VkPipelineCache (7091) (by **Bob Cao**)
- [autodiff] Support grad in ndarray (6906) (by **PhrygianGates**)
- [Doc] Update doc regarding dynamic index (7148) (by **Yi Xu**)
- [refactor] Remove dependencies on Program::this_thread_config() in spirv::lower (7134) (by **PGZXB**)
- [Misc] Strictly check ndim with external array (7126) (by **Haidong Lan**)
- [ci] Run test when pushing to rc branches (7146) (by **Lin Jiang**)
- [refactor] Remove dependencies on Program::this_thread_config() in KernelCodeGen (7086) (by **PGZXB**)
- [ci] Disable backward_cpp on macOS (7145) (by **Proton**)
- [gui] Fix scene line renderable (7131) (by **Bob Cao**)
- [refactor] Remove useless Kernel::from_cache_ (7132) (by **PGZXB**)
- [cpu] Reuse VirtualMemoryAllocator for CPU ndarray memory allocation (7128) (by **Ailing**)
- [Lang] Raise errors when using the packed switch (7125) (by **Yi Xu**)
- [ci] Temporarily disable ad_external_array on Metal (7136) (by **Bob Cao**)
- [Error] Raise errors when using metal sparse (7113) (by **Lin Jiang**)
- [aot] AOT compat test in workflow (7033) (by **damnkk**)
- [Lang] Fix cannot use taichi in REPL (7114) (by **Zhao Liang**)
- [lang] Free ndarray memory when it's GC-ed in Python (7072) (by **Ailing**)
- [lang] Migrate TensorType expansion for FuncCallExpression from Python code to Frontend IR (6980) (by **Zhanlue Yang**)
- [amdgpu] Part2 add runtime (6482) (by **Zeyu Li**)
- [refactor] Remove dependencies on Program::this_thread_config() in codegen_cc.cpp (7088) (by **PGZXB**)
- [refactor] Remove dependencies on Program::this_thread_config() in gfx::run_codegen (7089) (by **PGZXB**)
- [Bug] Fix num_splits in parallel_struct_for (7121) (by **Yi Xu**)
- [Doc] Move glossary to top level (7118) (by **Zhao Liang**)
- [metal] Update Metal RHI impl & add support for shared arrays (7107) (by **Bob Cao**)
- [ci] Update amdgpu ci (7117) (by **Zeyu Li**)
- [refactor] Move Kernel::lower() outside the taichi::lang::Kernel (7048) (by **PGZXB**)
- [amdgpu] Part1 add codegen (6469) (by **Zeyu Li**)
- [Aot] Deprecate element shape and field dim for AOT symbolic args (7100) (by **Haidong Lan**)
- [refactor] Remove Program::current_ast_builder() (7075) (by **PGZXB**)
- [aot] Switch Metal to SPIR-V codegen (7093) (by **PENGUINLIONG**)
- [Lang] Remove deprecated ti.Matrix.rotation2d() (7098) (by **Yi Xu**)
- [doc] Modified some errors in the function examples (7094) (by **welann**)
- [ci] More Windows git hacks (7102) (by **Proton**)
- [Lang] Remove filename kwarg in aot Module save() (7085) (by **Ailing**)
- [aot] Rename device capability atomic_i64 to atomic_int64 for consistency (7095) (by **PENGUINLIONG**)
- [Lang] Remove sourceinspect deprecation warning message (7081) (by **Zhao Liang**)
- [example] Remove gui warning message (7090) (by **Zhao Liang**)
- [refactor] Remove unnecessary Kernel::arch (7074) (by **PGZXB**)
- [refactor] Remove unnecessary parameter of irpass::scalarize (7087) (by **PGZXB**)
- [Bug] Fix ret_type and cast_type of UnaryOpStmt in Scalarize (7082) (by **Yi Xu**)
- [lang] Migrate TensorType expansion for TextureOpExpression from Python code to Frontend IR (6968) (by **Zhanlue Yang**)
- [lang] Migrate TensorType expansion for ReturnStmt from Python code to Frontend IR (6946) (by **Zhanlue Yang**)
- [doc] Update ndarray deprecation warning to 1.5.0 (7083) (by **Haidong Lan**)
- [amdgpu] Update amdgpu module call (7022) (by **Zeyu Li**)
- [amdgpu] Add convert addressspace pass related unit test (7023) (by **Zeyu Li**)
- [ir] Let real function return nested StructType (by **lin-hitonami**)
- [ir] Replace FuncCallExpression with FrontendFuncCallStmt (by **lin-hitonami**)
- [example] Update gallery images (7053) (by **Zhao Liang**)
- [Doc] Update type.md (7038) (by **Zhao Liang**)
- [misc] Bump version to v1.5.0 (7077) (by **Lin Jiang**)
- [rhi] Update Stream `new_command_list` API (7073) (by **Bob Cao**)
- [Doc] Fix docstring (7065) (by **Zhao Liang**)
- [ci] Workaround windows checkout 'Needed a single revision' issue (7078) (by **Proton**)
- [Lang] Make slicing a single row/column of a matrix return a vector (7068) (by **Yi Xu**)


© 2024 Safety CLI Cybersecurity Inc. All Rights Reserved.