Deprecation Notice
- We removed some APIs that were deprecated a long time ago. See the table below:
| Removed API | Replace with |
| --- | --- |
| Using atomic operations like `a.atomic_add(b)` | `ti.atomic_add(a, b)` or `a += b` |
| Using `is` and `is not` inside Taichi kernels and Taichi functions | Not supported |
| Ndrange for-loop whose number of loop variables does not equal the dimension of the ndrange | Not supported |
| `ti.ui.make_camera()` | `ti.ui.Camera()` |
| `ti.ui.Window.write_image()` | `ti.ui.Window.save_image()` |
| `ti.SOA` | `ti.Layout.SOA` |
| `ti.AOS` | `ti.Layout.AOS` |
| `ti.print_profile_info` | `ti.profiler.print_scoped_profiler_info` |
| `ti.clear_profile_info` | `ti.profiler.clear_scoped_profiler_info` |
| `ti.print_memory_profile_info` | `ti.profiler.print_memory_profiler_info` |
| `ti.CuptiMetric` | `ti.profiler.CuptiMetric` |
| `ti.get_predefined_cupti_metrics` | `ti.profiler.get_predefined_cupti_metrics` |
| `ti.print_kernel_profile_info` | `ti.profiler.print_kernel_profiler_info` |
| `ti.query_kernel_profile_info` | `ti.profiler.query_kernel_profiler_info` |
| `ti.clear_kernel_profile_info` | `ti.profiler.clear_kernel_profiler_info` |
| `ti.kernel_profiler_total_time` | `ti.profiler.get_kernel_profiler_total_time` |
| `ti.set_kernel_profiler_toolkit` | `ti.profiler.set_kernel_profiler_toolkit` |
| `ti.set_kernel_profile_metrics` | `ti.profiler.set_kernel_profiler_metrics` |
| `ti.collect_kernel_profile_metrics` | `ti.profiler.collect_kernel_profiler_metrics` |
| `ti.VideoManager` | `ti.tools.VideoManager` |
| `ti.PLYWriter` | `ti.tools.PLYWriter` |
| `ti.imread` | `ti.tools.imread` |
| `ti.imresize` | `ti.tools.imresize` |
| `ti.imshow` | `ti.tools.imshow` |
| `ti.imwrite` | `ti.tools.imwrite` |
| `ti.ext_arr` | `ti.types.ndarray` |
| `ti.any_arr` | `ti.types.ndarray` |
| `ti.Tape` | `ti.ad.Tape` |
| `ti.clear_all_gradients` | `ti.ad.clear_all_gradients` |
| `ti.linalg.sparse_matrix_builder` | `ti.types.sparse_matrix_builder` |
- We no longer deprecate the built-in `min`/`max` functions in Taichi kernels.
- We have deprecated some arguments used when declaring compute graph arguments; they will be removed in v1.7.0:
  - `element_shape` argument for scalar and ndarray
  - `shape`, `channel_format` and `num_channels` arguments for texture
- The `cc` backend will be removed in the next release (`v1.7.0`).
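The straight renames in the removed-API table above can be applied mechanically when porting older code. The following is a minimal sketch, not a tool shipped with Taichi: the `RENAMES` mapping copies only a few rows of the table, and the helper name `migrate_source` is hypothetical. It does not handle the rows marked "Not supported", the `a.atomic_add(b)` method form, or methods called on instances such as `write_image()`.

```python
import re

# A subset of the straight renames from the table (old name -> new name).
RENAMES = {
    "ti.ui.make_camera": "ti.ui.Camera",
    "ti.SOA": "ti.Layout.SOA",
    "ti.AOS": "ti.Layout.AOS",
    "ti.print_kernel_profile_info": "ti.profiler.print_kernel_profiler_info",
    "ti.imread": "ti.tools.imread",
    "ti.imwrite": "ti.tools.imwrite",
    "ti.ext_arr": "ti.types.ndarray",
    "ti.any_arr": "ti.types.ndarray",
    "ti.Tape": "ti.ad.Tape",
    "ti.clear_all_gradients": "ti.ad.clear_all_gradients",
    "ti.linalg.sparse_matrix_builder": "ti.types.sparse_matrix_builder",
}

# Try longer names first and match whole dotted names only, so that
# "ti.Tape" is not rewritten inside "multi.Tape" or "ti.Tapestry".
_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
    + r")(?!\w)"
)


def migrate_source(source: str) -> str:
    """Return `source` with removed API names replaced by their successors."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], source)


old_code = "with ti.Tape(loss):\n    run()\nimg = ti.imread('a.png')\n"
print(migrate_source(old_code))
```

Already-migrated names such as `ti.ad.Tape` do not match any old name, so running the helper twice leaves the source unchanged. Review the result by hand, since a plain text substitution cannot tell whether `ti` really refers to the Taichi module.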
New features
Struct arguments
You can now use struct arguments in all backends. Structs can be nested, and they can contain matrices and vectors. Here's an example:

```python
import taichi as ti

ti.init()

transform_type = ti.types.struct(R=ti.math.mat3, T=ti.math.vec3)
pos_type = ti.types.struct(x=ti.math.vec3, trans=transform_type)


@ti.kernel
def kernel_with_nested_struct_arg(p: pos_type) -> ti.math.vec3:
    # Apply the rotation matrix R to x, then add the translation T.
    return p.trans.R @ p.x + p.trans.T


trans = transform_type(ti.math.mat3(1), [1, 1, 1])
p = pos_type(x=[1, 1, 1], trans=trans)
print(kernel_with_nested_struct_arg(p))  # [4., 4., 4.]
```
Ndarray
- Support reading and writing 0-dim ndarrays in Python scope
- Fixed a bug when writing into ndarray from Python scope
Improvements
- Support rsqrt operator in autodiff
- Added assembly printer for CPU backend by **Zhanlue Yang**
- Support CUDA shared array allocation over 48 KiB
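For reference, the derivative that differentiating through `rsqrt` relies on is d/dx x^(-1/2) = -(1/2) x^(-3/2). The sketch below checks that formula against a central finite difference in plain Python. It does not use Taichi or its autodiff machinery, and the helper names are illustrative only.

```python
import math


def rsqrt(x: float) -> float:
    """Reciprocal square root, 1 / sqrt(x)."""
    return 1.0 / math.sqrt(x)


def rsqrt_grad(x: float) -> float:
    """Analytic derivative of rsqrt: -0.5 * x ** -1.5."""
    return -0.5 * x ** -1.5


def central_difference(f, x: float, h: float = 1e-6) -> float:
    """Numerical derivative of f at x using a symmetric difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


for x in (0.25, 1.0, 4.0):
    analytic = rsqrt_grad(x)
    numeric = central_difference(rsqrt, x)
    print(f"x={x}: analytic={analytic:.6f}, numeric={numeric:.6f}")
```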
Performance
- Improved vectorization support on CPU backend, with significant performance gains for specific applications
New Examples
- 2D Euler fluid simulation example by **Lee-abcde**
Misc
- Python 3.11 support
- `ti.frexp` is supported on CUDA, Vulkan, Metal, OpenGL backends.
- `ti.math.popcnt` intrinsic by **Garry Ling**
- Fixed a memory leak issue during SNodeTree destruction by **Zhanlue Yang**
- Added validation and improved error reporting for `ti.Field` finalization by **Zhanlue Yang**
- Fixed a memory leak issue with the CUDA backend in C-API by **Zhanlue Yang**
- Added support for formatted printing with `str.format()` and f-strings by **Tianyi Liu**
- Changed Python code formatter from `yapf` to `black`
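For readers unfamiliar with the two intrinsics listed above, here is a plain-Python reference for what they compute, using only the standard library. It assumes that `ti.frexp` follows the usual C `frexp` convention (a mantissa in [0.5, 1) and an integer exponent) and that `ti.math.popcnt` counts set bits. `popcnt_reference` is an illustrative name, not a Taichi function.

```python
import math


def popcnt_reference(x: int, bits: int = 32) -> int:
    """Count the set bits of x, viewed as an unsigned `bits`-wide integer."""
    return bin(x & ((1 << bits) - 1)).count("1")


# frexp splits a float into (mantissa, exponent) with x == mantissa * 2**exponent.
mantissa, exponent = math.frexp(8.0)
print(mantissa, exponent)               # 0.5 4
print(math.ldexp(mantissa, exponent))   # 8.0

print(popcnt_reference(0b1011))         # 3
print(popcnt_reference(-1))             # 32 (all bits set in two's complement)
```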
Developer Experience
- build.py script for preparing build & testing environment
Full changelog
Highlights:
- **Bug fixes**
- Fix wrong datatype size when writing to ndarray from Python scope (by **Ailing Zhang**)
- **CUDA backend**
- Warn driver version if it doesn't support memory pool. (7912) (by **Haidong Lan**)
- Better handling shared array shape check (7818) (by **Haidong Lan**)
- Support large shared memory for CUDA backend (7452) (by **Haidong Lan**)
- **Documentation**
- Add doc about struct arguments (7959) (by **Lin Jiang**)
- Fix docstring of mix function (7922) (by **Zhao Liang**)
- Update faq and ggui, and add them to CI (7861) (by **Zhao Liang**)
- Update doc for dynamic snode (7804) (by **Zhao Liang**)
- Update field.md (7819) (by **zhoooou**)
- Update readme (7808) (by **yanqingzhang**)
- Update write_test.md (7745) (by **Qian Bao**)
- Update performance.md (7720) (by **Zhao Liang**)
- Update readme (7673) (by **Zhao Liang**)
- Update tutorial.md (7512) (by **Chenzhan Shang**)
- Update gui_system.md (7628) (by **Qian Bao**)
- Remove deprecated api docstrings (7596) (by **pengyu**)
- Fix the cexp docstring (7588) (by **Zhao Liang**)
- Add doc about returning struct (7556) (by **Lin Jiang**)
- **Error messages**
- Update deprecation warning of the graph arguments (7965) (by **Lin Jiang**)
- **Language and syntax**
- Remove deprecated funcs in __init__.py (7941) (by **Lin Jiang**)
- Remove deprecated sparse_matrix_builder function (7942) (by **Lin Jiang**)
- Remove deprecated funcs in ti.ui (7940) (by **Lin Jiang**)
- Remove the support for 'is' (7930) (by **Lin Jiang**)
- Raise error when the dimension of the ndrange does not equal to the number of the loop variable (7933) (by **Lin Jiang**)
- Remove a.atomic(b) (7925) (by **Lin Jiang**)
- Cancel deprecating native min/max (7928) (by **Lin Jiang**)
- Let nested data classes have methods (7909) (by **Lin Jiang**)
- Let kernel argument support matrix nested in a struct (by **lin-hitonami**)
- Support the functions of dataclass as kernel argument and return value (7865) (by **Lin Jiang**)
- Fix a bug on PosixPath (7860) (by **Zhao Liang**)
- Separate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (7803) (by **Zhanlue Yang**)
- Fix pylance warning (7805) (by **Zhao Liang**)
- Support taking structs as kernel arguments (by **lin-hitonami**)
- Fix math module circular import bugs (7762) (by **Zhao Liang**)
- Support formatted printing in str.format() and f-strings (7686) (by **魔法少女赵志辉**)
- Replace internal representation of Python-scope ti.Matrix with numpy arrays (7559) (by **Yi Xu**)
- Stop letting ti.Struct inherit from TaichiOperations (7474) (by **Yi Xu**)
- Support writing sparse matrix as matrix market file (7529) (by **pengyu**)
- **Vulkan backend**
- Fix repeated generation of array ranges in spirv codegen. (7625) (by **Haidong Lan**)
Full changelog:
- [CUDA] Warn driver version if it doesn't support memory pool. (7912) (by **Haidong Lan**)
- [Doc] Add doc about struct arguments (7959) (by **Lin Jiang**)
- [Error] Update deprecation warning of the graph arguments (7965) (by **Lin Jiang**)
- [windows] Workaround C++ mangling special chars (7964) (by **Ailing**)
- [Lang] Remove deprecated funcs in __init__.py (7941) (by **Lin Jiang**)
- [build] Remove redundant C-API shared object in wheel (7950) (by **Proton**)
- [test] Do not test cc backend (by **Proton**)
- [Lang] Remove deprecated sparse_matrix_builder function (7942) (by **Lin Jiang**)
- [Lang] Remove deprecated funcs in ti.ui (7940) (by **Lin Jiang**)
- [Lang] Remove the support for 'is' (7930) (by **Lin Jiang**)
- [Lang] Raise error when the dimension of the ndrange does not equal to the number of the loop variable (7933) (by **Lin Jiang**)
- [Lang] Remove a.atomic(b) (7925) (by **Lin Jiang**)
- [Lang] Cancel deprecating native min/max (7928) (by **Lin Jiang**)
- [Doc] Fix docstring of mix function (7922) (by **Zhao Liang**)
- [example] Fix ti example bugs (7903) (by **Zhao Liang**)
- [ci] Build.py: Source generated env in new spawned shell (by **Proton**)
- [misc] Fix changelog commit extract code (by **Proton**)
- [ci] More robust build.py bootstrapping (7920) (by **Proton**)
- [Lang] [bug] Let nested data classes have methods (7909) (by **Lin Jiang**)
- [cuda] Only set CU_LIMIT_STACK_SIZE when necessary (7906) (by **Ailing**)
- [Lang] Let kernel argument support matrix nested in a struct (by **lin-hitonami**)
- [Bug] Fix wrong datatype size when writing to ndarray from Python scope (by **Ailing Zhang**)
- [lang] Support 0 dim ndarray read & write in python scope (by **Ailing Zhang**)
- [Lang] Support the functions of dataclass as kernel argument and return value (7865) (by **Lin Jiang**)
- [spirv] Support struct as kernel argument (by **Lin Jiang**)
- [spirv] Fix the ret type of frexp (by **lin-hitonami**)
- [ci] Build.py: Do not try to bootstrap pip (too many issues) (7897) (by **Proton**)
- [ci] Build.py quirks fix (7894) (by **Proton**)
- [Doc] Update faq and ggui, and add them to CI (7861) (by **Zhao Liang**)
- [build] Remove unused apt pkg 'libmirclient-dev' to make 'build.py' run properly on ubuntu 22.04 (7871) (by **Yu Zhang**)
- [Lang] Fix a bug on PosixPath (7860) (by **Zhao Liang**)
- [ci] Polishing build.py, wave 4 (7857) (by **Proton**)
- [build] Use LLVM without zstd dependency on M1 Macs (7856) (by **Proton**)
- [doc] Update dev_install.md to reflect build.py usage (7848) (by **Proton**)
- [ci] Polishing build.py, wave 3 (7845) (by **Proton**)
- [lang] Add popcnt to llvm intrinsic support (7772) (by **Garry Ling**)
- [Doc] Update doc for dynamic snode (7804) (by **Zhao Liang**)
- [ci] Fix release build failure (7834) (by **Proton**)
- [ci] More robust build.py bootstrapping (7833) (by **Proton**)
- [Doc] Update field.md (7819) (by **zhoooou**)
- [autodiff] Remove redundant autodiff mode in kernel name (7829) (by **Ailing**)
- [lang] Migrate Caching Allocation logics from CudaDevice/AmdgpuDevice to DeviceMemoryPool (7793) (by **Zhanlue Yang**)
- [misc] Resolve code formatter frictions (7828) (by **Proton**)
- [Lang] Separate out the scalarization for MatrixOfMatrixPtrStmt and MatrixOfGlobalPtrStmt (7803) (by **Zhanlue Yang**)
- [bug] Fix imgui_context in destroying multiple GGUI windows (7812) (by **Ailing**)
- [misc] Update git-blame-ignore-revs (7825) (by **Proton**)
- [ci] Complete doc test list, remove redundant default prelude (7823) (by **Proton**)
- [misc] Relax Black formatter line length limit to 120 (7824) (by **Proton**)
- [Doc] Update readme (7808) (by **yanqingzhang**)
- [misc] Switch code formatter from `yapf` to `black` (7785) (by **Proton**)
- [CUDA] Better handling shared array shape check (7818) (by **Haidong Lan**)
- [misc] Improve ::liong::json::deserialize() (by **PGZXB**)
- [bug] Fix gen_offline_cache_key (7810) (by **PGZXB**)
- [ci] Fix build.py ensurepip (7811) (by **Proton**)
- [Lang] Fix pylance warning (7805) (by **Zhao Liang**)
- [lang] Support frexp on spirv-based backends (7770) (by **Ailing**)
- [lang] Split MemoryPool into DeviceMemoryPool and HostMemoryPool (7786) (by **Zhanlue Yang**)
- [misc] Optimize import overhead: pytorch and get_clangpp (7797) (by **Haidong Lan**)
- [ci] [doc] Tighten up document testing (7801) (by **Proton**)
- [ci] Polishing build.py, wave 2 (7800) (by **Proton**)
- [aot] Remove unused AotDataConverter (7799) (by **Lin Jiang**)
- [perf] Fix Taichi CPU backend compile parameter to pair performance with Numba. (7731) (by **zhengxianli**)
- [ci] Polishing build.py (7794) (by **Proton**)
- [bug] Returning nan for ti.sym_eig on identity matrix (7443) (by **Yimin Tang**)
- [Lang] Support taking structs as kernel arguments (by **lin-hitonami**)
- [ir] Add 'create_load' to ArgLoadStmt (by **lin-hitonami**)
- [ir] Let the src of GetElementStmt be a pointer (by **lin-hitonami**)
- [lang] Clean up runtime allocation functions (7773) (by **Zhanlue Yang**)
- [lang] Migrate CUDA preallocation logic to CudaMemoryPool (7746) (by **Zhanlue Yang**)
- [gfx] Fix runtime buffer/image copy barrier semantics (7781) (by **Bob Cao**)
- [misc] Remove unnecessary TaskCodeGenLLVM::task_counter (7777) (by **PGZXB**)
- [ci] Temporarily force Windows release builds to run on sm70 nodes (7767) (by **Proton**)
- [refactor] Remove Kernel::lowered_ (7765) (by **PGZXB**)
- [gui] Fluid visualization utilities (7682) (by **Qian Bao**)
- [Lang] Fix math module circular import bugs (7762) (by **Zhao Liang**)
- [misc] Make pre-commit happy (7768) (by **Proton**)
- [ci] Build iOS AOT static library (by **Proton**)
- [misc] Wrap path with std::filesystem::path (7754) (by **Bob Cao**)
- [lang] Support vector and matrix dtypes in ti.field (7761) (by **Ailing**)
- [ir] Remove unnecessary field_dims_ in ArgLoadStmt (7755) (by **Ailing**)
- [refactor] Remove Kernel::task_counter_ (7751) (by **PGZXB**)
- [ci] Build.py: Introduce TAICHI_CMAKE_ARGS manager for better log readability (by **Proton**)
- [ci] Reorganize build.py code (by **Proton**)
- [refactor] Let KernelCompilationManager manage kernel compilation in gfx::AotModuleBuilderImpl (7715) (by **PGZXB**)
- [misc] Remove unused FullSimplifyPass::Args::program (7750) (by **PGZXB**)
- [refactor] Re-impl LlvmAotModule using LLVM::KernelLauncher (7744) (by **PGZXB**)
- [lang] Implement experimental CG(Conjugate Gradient) solver in Taichi-lang (7690) (by **Qian Bao**)
- [lang] Transform bit_shr to bit_sar for uint (7757) (by **Ailing**)
- [ir] Postpone scalarize and lower_matrix_ptr to after bit loop vectorization (7726) (by **魔法少女赵志辉**)
- [ci] Isolate post sm70 tests (7740) (by **Proton**)
- [cuda] Support using SparseMatrix on more CUDA versions (7724) (by **Yu Zhang**)
- [cuda] Update the data layout of CUDA (7748) (by **Lin Jiang**)
- [ci] Ignore dup benchmark data points (7749) (by **Proton**)
- [bug] Fix reduction of atomic max (7747) (by **Lin Jiang**)
- [Doc] Update write_test.md (7745) (by **Qian Bao**)
- [refactor] Remove 'args' from 'RuntimeContext' (by **lin-hitonami**)
- [gfx] Let gfx backends use LaunchContextBuilder to build arguments in struct type (by **lin-hitonami**)
- [gfx] [refactor] Convert f16 in LaunchContextBuilder (by **lin-hitonami**)
- [gfx] Record the struct type of arguments and results in KernelContextAttributes (by **lin-hitonami**)
- [gfx] Compile struct type of result and arguments in gfx backends (by **lin-hitonami**)
- [refactor] Implement CompiledKernelData::check() (7743) (by **PGZXB**)
- [doc] [test] Update docs for printing with f-strings and formatted strings (7733) (by **魔法少女赵志辉**)
- [lang] Improve error message for mismatched index for ndarrays in python scope (7737) (by **Ailing**)
- [bug] Avoid redundant cache loading (7741) (by **PGZXB**)
- [refactor] Let KernelCompilationManager manage kernel compilation in LlvmAotModuleBuilder (7714) (by **PGZXB**)
- [ci] Skip large shared memory test for Turing GPUs. (7739) (by **Haidong Lan**)
- [cuda] Remove deprecated cusparse functions (7725) (by **Yu Zhang**)
- [misc] Update pull_request_template.md (7738) (by **Ailing**)
- [misc] Remove TI_WARN for cuda in memory_pool.cpp (7734) (by **Ailing**)
- [CUDA] Support large shared memory for CUDA backend (7452) (by **Haidong Lan**)
- [vulkan] Update SPIR-V codegen to emit FP16 consts (7676) (by **Bob Cao**)
- [lang] Support frexp on cuda backend (7721) (by **Ailing**)
- [refactor] Unify implementation of ProgramImpl::compile() (by **PGZXB**)
- [refactor] Introduce LLVM::KernelLauncher (by **PGZXB**)
- [refactor] Introduce gfx::KernelLauncher (by **PGZXB**)
- [test] Enable test offline cache on amdgpu and dx11 (7703) (by **PGZXB**)
- [lang] Refactor ownership and inheritance of allocators (7685) (by **Zhanlue Yang**)
- [ci] Fix git cache quirks (7722) (by **Proton**)
- [lang] Improve error msg in create ndarray (7709) (by **Garry Ling**)
- [Doc] Update performance.md (7720) (by **Zhao Liang**)
- [bug] Switch the gallery image used by README. (7716) (by **Chengchen(Rex) Wang**)
- [lang] Merge AMDGPUCachingAllocator to the generic CachingAllocator (7717) (by **Zhanlue Yang**)
- [bug] Invalid Field cache, RWAccessors cache, and Kernel cache upon SNodeTree destruction (7704) (by **Zhanlue Yang**)
- [ci] [test] Enable cc test on CI (by **lin-hitonami**)
- [test] [cc] Skip tests that cc backend doesn't support (by **lin-hitonami**)
- [test] Exclude the cc backend from tests that involve dynamic indexing (7705) (by **魔法少女赵志辉**)
- [bug] Fix camera controls (7681) (by **liblaf**)
- [bug] [cc] Fix comparison op in cc backend (by **Lin Jiang**)
- [bug] [cc] Set external ptr for cc backend (by **lin-hitonami**)
- [lang] Merged VirtualMemoryAllocator into MemoryPool for LLVM-CPU backend (7671) (by **Zhanlue Yang**)
- [misc] Remove useless JITEvaluatorId (7700) (by **PGZXB**)
- [bug] Fixed building with clang on Windows failed (7699) (by **PGZXB**)
- [Lang] Support formatted printing in str.format() and f-strings (7686) (by **魔法少女赵志辉**)
- [ci] Git caching proxy in CI (7692) (by **Proton**)
- [build] Let msvc generate pdb for cpp & c_api tests (by **lin-hitonami**)
- [refactor] Stop storing pointers to array devallocs in kernel args (by **lin-hitonami**)
- [aot] Implement bin2c in AOT cppgen (7687) (by **PENGUINLIONG**)
- [cpu] Remove atomics demotion for single-thread CPU targets. (7631) (by **Haidong Lan**)
- [aot] Export templated kernels (7683) (by **PENGUINLIONG**)
- [ci] Revive /benchmark (7680) (by **Proton**)
- [Doc] Update readme (7673) (by **Zhao Liang**)
- [misc] Device API public headers and CMake rework part 1 (7624) (by **Bob Cao**)
- [misc] Move optimize cpu module to KernelCodeGen (7667) (by **PGZXB**)
- [lang] [ir] Extract and save the format specifiers in str.format() (7660) (by **魔法少女赵志辉**)
- [example] Add 2D euler fluid simulation example (7568) (by **Lee-abcde**)
- [wasm] Remove WASM backend (by **lin-hitonami**)
- [build] Fix ssize_t type undefined errors when building with TI_WITH_LLVM=OFF on windows (7665) (by **Yu Zhang**)
- [misc] Remove unused Kernel::is_evaluator (7669) (by **PGZXB**)
- [misc] Remove unused Program::jit_evaluator_cache and Program::jit_evaluator_cache_mut (7668) (by **PGZXB**)
- [misc] Simplify test_offline_cache.py (7663) (by **PGZXB**)
- [lang] Improve error reporting for FieldsBuilder finalization (7640) (by **Zhanlue Yang**)
- [misc] Rename taichi::lang::llvm to taichi::lang::LLVM (7659) (by **PGZXB**)
- [refactor] Remove MemoryPool daemon in LLVM runtime (7648) (by **Zhanlue Yang**)
- [opt] Cleanup unnecessary options in constant fold pass (7661) (by **Ailing**)
- [ci] Use build.py to prepare testing environment on Windows (7658) (by **Proton**)
- [opt] Move binary jit evaluator to host (by **Ailing Zhang**)
- [test] Update C++ constant fold tests to test operator one by one (by **Ailing Zhang**)
- [aot] Avoid shared library file being packaged into wheel data (7652) (by **Chenzhan Shang**)
- [ci] Fix scipy install (7649) (by **Proton**)
- [misc] Remove an unnecessary parameter of KernelCompilationManager::make_filename (by **PGZXB**)
- [refactor] Remove some unnecessary functions of KernelCodeGen (by **PGZXB**)
- [refactor] Re-impl JIT and Offline Cache on LLVM backends (by **PGZXB**)
- [refactor] Implement llvm::KernelCompiler (by **PGZXB**)
- [refactor] Gen code for KernelCodeGen::ir instead of KernelCodeGen::kernel->ir (by **PGZXB**)
- [Doc] Update tutorial.md (7512) (by **Chenzhan Shang**)
- [ci] Test manylinux2014 build on PR (7647) (by **Proton**)
- [bug] Fix logical comparison returns -1 (7641) (by **Ailing**)
- [doc] Fix gui_system.md tests (7646) (by **Proton**)
- [Doc] Update gui_system.md (7628) (by **Qian Bao**)
- [aot] Hand-written CMake target script (7644) (by **PENGUINLIONG**)
- [ci] Do not use Android toolchain for perf testing (7642) (by **Proton**)
- [ci] Support Python 3.11 (7627) (by **Proton**)
- [build] Setup Android SDK environment for performance bot (7635) (by **Zhanlue Yang**)
- [ci] Update perf mon image (7639) (by **Proton**)
- [ci] Fix perf mon break (7638) (by **Proton**)
- [doc] Add documentation on using ghstack (7632) (by **Proton**)
- [build] Static linking libstdc++ on Linux (by **Proton**)
- [ci] Rewrite Dockerfiles (by **Proton**)
- [ci] Resolve "Needed single revision" workaround failure when the repo directory is empty (7633) (by **Proton**)
- [Vulkan] Fix repeated generation of array ranges in spirv codegen. (7625) (by **Haidong Lan**)
- [build] Switch to use docker with Android-SDK for performance bot (7630) (by **Zhanlue Yang**)
- [opengl] glfw finalize crash fix (by **Proton**)
- [ci] build.py: Android support, entering shell, export env (by **Proton**)
- [ci] Do not run tests with mixed backends (by **Proton**)
- [refactor] Use f16 function from external lib (by **lin-hitonami**)
- [refactor] Migrate members from RuntimeContext to LaunchContextBuilder (by **lin-hitonami**)
- [bug] Fix setting arguments exceeding the max arg num (by **lin-hitonami**)
- [cpu] Explicitly make cpu multithreading loop for range-fors. (7593) (by **Haidong Lan**)
- [aot] Fixed generator for compute graph (7626) (by **PENGUINLIONG**)
- [ir] Postpone scalarize and lower_matrix_ptr to after typecheck (7589) (by **魔法少女赵志辉**)
- [aot] Header generator completed (7609) (by **PENGUINLIONG**)
- [amdgpu] Initialize AMDGPUContext with defaults (by **Proton**)
- [build] Remove libSPIRV-Tools-shared.(so|dll) in wheel (by **Proton**)
- [lang] Removed cpu_device(), cuda_device(), and amdgpu_device() from LlvmRuntimeExecutor (7544) (by **Zhanlue Yang**)
- [refactor] Remove the get/set functions in RuntimeContext (by **lin-hitonami**)
- [aot] Pass LaunchContextBuilder to CompiledGraph::init_runtime_context (by **lin-hitonami**)
- [gfx] Let GfxRuntime use LaunchContextBuilder (by **lin-hitonami**)
- Let LaunchContextBuilder be the argument of the kernel launch function (by **lin-hitonami**)
- [llvm] [refactor] Set the llvm runtime when executing (by **lin-hitonami**)
- [refactor] Migrate {set, get}_{arg, ret} functions from RuntimeContext (by **lin-hitonami**)
- [bug] Fix compilation error (7606) (by **PGZXB**)
- [aot] Hide map memory failure (7604) (by **PENGUINLIONG**)
- [refactor] Fix KernelCodeGen::kernel from Kernel * to const Kernel * (by **PGZXB**)
- [refactor] Remove legacy implementation of llvm offline cache (by **PGZXB**)
- [refactor] Impl llvm::CompiledKernelData (by **PGZXB**)
- [bug] Type check for logical not op with real type inputs (7600) (by **Ailing**)
- [bug] Improve ndarray creation to fix segmentation fault (7577) (by **pengyu**)
- [lang] Add assembly printer for CPU backend (7590) (by **Zhanlue Yang**)
- [misc] Update docker filer (7598) (by **Zeyu Li**)
- [aot] Fix absolute path in generated TaichiTargets.cmake (7597) (by **Chenzhan Shang**)
- [Doc] Remove deprecated api docstrings (7596) (by **pengyu**)
- [llvm] Compile the kernel arguments to a StructType (by **Lin Jiang**)
- [lang] Fix issue with llvm opaque pointer (7557) (by **Zhanlue Yang**)
- [opt] Constant folding for unary ops on host (7573) (by **Ailing**)
- [bug] Type check for bit_not op with real type inputs (7592) (by **Ailing**)
- [Doc] Fix the cexp docstring (7588) (by **Zhao Liang**)
- [Lang] Replace internal representation of Python-scope ti.Matrix with numpy arrays (7559) (by **Yi Xu**)
- [bug] Avoid cuda compilation via clang and ship pre-compiled .bc file instead (7570) (by **Zhanlue Yang**)
- [aot] Taichi kernel AOT command (7565) (by **PENGUINLIONG**)
- [bug] Fix struct members registered to StructField class (7574) (by **Ailing**)
- [aot] Mobile platform AOT build scripts (7567) (by **PENGUINLIONG**)
- [misc] Revert "Security upgrade ipython from 7.34.0 to 8.10.0 (7341)" (7571) (by **Proton**)
- [test] Add cpp tests for constant folding pass (7566) (by **Ailing**)
- [misc] Security upgrade ipython from 7.34.0 to 8.10.0 (7341) (by **Chengchen(Rex) Wang**)
- [lang] Refactor CudaCachingAllocator into a more generic caching allocator (7531) (by **Zhanlue Yang**)
- [aot] Load GfxRuntime140 module from TCM (7539) (by **PENGUINLIONG**)
- [lang] Fixed useless serial shader to blit ExternalTensorShapeAlongAxisStmt on Metal (7562) (by **PENGUINLIONG**)
- [aot] Enable Vulkan 8bit storage (7564) (by **PENGUINLIONG**)
- [bug] Fix crashing on printing FrontendFuncCallStmt with no return value (by **lin-hitonami**)
- [refactor] Remove LaunchContextBuilder::set_arg_raw (by **lin-hitonami**)
- [llvm] Generalize TaskCodeGenLLVM::create_return to set_struct_to_buffer (by **lin-hitonami**)
- [bug] Fix Cuda memory leak during TiRuntime destruction (7345) (by **Zhanlue Yang**)
- [ir] Let void struct type represent void type (by **lin-hitonami**)
- [aot] Let C-API use LaunchContextBuilder to manage RuntimeContext (by **lin-hitonami**)
- [ir] Let the reference type declare a pointer argument (by **lin-hitonami**)
- [Doc] Add doc about returning struct (7556) (by **Lin Jiang**)
- [bug] Fix returning struct containing vec3 (7552) (by **Lin Jiang**)
- [lang] [ir] Extract and save the format specifiers in the f-string (7514) (by **魔法少女赵志辉**)
- [Lang] Stop letting ti.Struct inherit from TaichiOperations (7474) (by **Yi Xu**)
- [aot] Recover AOT CI branch names (7543) (by **PENGUINLIONG**)
- [aot] Put TiRT in Python wheel and CMake script to find it in wheel (7537) (by **PENGUINLIONG**)
- [refactor] Remove the difficult-to-implement CompiledKernelData::size() (7540) (by **PGZXB**)
- [bug] Implement the missing clone function for FrontendFuncCallStmt (7538) (by **PGZXB**)
- [misc] Bump version to v1.6.0 (7536) (by **Haidong Lan**)
- [doc] Handle 2 digit minor versions correctly (7535) (by **Ritoban Roy-Chowdhury**)
- [aot] GfxRuntime140 convention docs (7527) (by **PENGUINLIONG**)
- [rhi] Refactor allocate_memory API to use RhiResult (7463) (by **Bob Cao**)
- [metal] Choose the proper msl version according to the device capability (7506) (by **Yu Zhang**)
- [Lang] Support writing sparse matrix as matrix market file (7529) (by **pengyu**)