Taichi

Latest version: v1.7.2

1.0.4

Highlights:
- **Documentation**
  - Fix typos (5283) (by **Kian-Meng Ang**)
  - Update dev_install.md (5266) (by **Vissidarte-Herman**)
  - Updated README command lines (5199) (by **Vissidarte-Herman**)
  - Modify compilation warnings (5180) (by **Olinaaaloompa**)
  - Updated odop.md, removing obsolete information (5163) (by **Vissidarte-Herman**)
- **Language and syntax**
  - Refine SNode with quant 7/n: Support placing QuantFixedType under quant_array (5386) (by **Yi Xu**)
  - Add determinant for 1d case (5375) (by **Zhao Liang**)
  - Make floor, ceil and round accept a dtype optional argument (5307) (by **Zhao Liang**)
  - Rename struct_class to dataclass (5365) (by **Zhao Liang**)
  - Improve ti example so that users can choose which example to run by entering numbers. (5265) (by **Zhao Liang**)
  - Refine SNode with quant 5/n: Rename bit_array to quant_array (5344) (by **Yi Xu**)
  - Make bit_vectorize a parameter of ti.loop_config (5334) (by **Yi Xu**)
  - Refine SNode with quant 3/n: Turn bit_vectorize into an on/off switch (5331) (by **Yi Xu**)
  - Add error message for missing init call (5280) (by **Zhao Liang**)
  - Fix fractal gui close warning (5281) (by **Zhao Liang**)
  - Refine SNode with quant 2/n: Enable struct for on bit_array with bit_vectorize off (5253) (by **Yi Xu**)
  - Refactor indexing expressions in AST & enforce integer indices (5138) (by **daylily**)

Full changelog:
- Revert "[llvm] (Decomp of 5251 11/n) Enable parallel compilation on CPU backend (5394)" (by **Proton**)
- [refactor] Default dtype of ndarray type should be None instead of f32 (5391) (by **Ailing**)
- [llvm] (Decomp of 5251 11/n) Enable parallel compilation on CPU backend (5394) (by **Lin Jiang**)
- [gui] [vulkan] Support for Python users to control the start index and count of particle and mesh data. (5388) (by **Mocki**)
- [autodiff] Support binary operators for forward mode (5389) (by **Mingrui Zhang**)
- [llvm] (Decomp of 5251 10/n) Make SNode tree compatible with parallel compilation (5390) (by **Lin Jiang**)
- [llvm] [refactor] (Decomp of 5251 9/n) Refactor CodeGen to support parallel compilation on LLVM backend (5387) (by **Lin Jiang**)
- [Lang] [type] Refine SNode with quant 7/n: Support placing QuantFixedType under quant_array (5386) (by **Yi Xu**)
- [llvm] [refactor] (Decomp of 5251 8/n) Refactor KernelCacheData (5383) (by **Lin Jiang**)
- [cuda] [type] Refine SNode with quant 6/n: Support __ldg for loading QuantFixedType and QuantFloatType (5374) (by **Yi Xu**)
- [doc] Add simt functions in operators (5333) (by **Bo Qiao**)
- [Lang] Add determinant for 1d case (5375) (by **Zhao Liang**)
- [lang] Texture image load store support (5317) (by **Bob Cao**)
- [bug] Cast scalar to right type before converting to uint64 (by **Ailing Zhang**)
- [refactor] Check dtype mismatch in cgraph compilation and runtime (by **Ailing Zhang**)
- [refactor] Check field_dim mismatch in cgraph compilation and runtime (by **Ailing Zhang**)
- [test] Check repeated arg names in cgraph (by **Ailing Zhang**)
- [llvm] [refactor] (Decomp of 5251 6/n) Let ModuleToFunctionConverter support multiple modules (5372) (by **Lin Jiang**)
- [Lang] Make floor, ceil and round accept a dtype optional argument (5307) (by **Zhao Liang**)
- [refactor] Rename the confused needs_grad (5359) (by **Mingrui Zhang**)
- [autodiff] Support unary ops for forward mode (5366) (by **Mingrui Zhang**)
- [llvm] (Decomp of 5251 7/n) Change the way to record the time of offline cache (5373) (by **Lin Jiang**)
- [llvm] (Decomp of 5251 5/n) Add the parallel compilation worker to LlvmProgramImpl (5364) (by **Lin Jiang**)
- [gui] [test] Fix bug in test_ggui.py when some PC environments do not support GGUI (5370) (by **Mocki**)
- [Lang] Rename struct_class to dataclass (5365) (by **Zhao Liang**)
- [llvm] Drop code for llvm 15. (5313) (by **Xiang Li**)
- [llvm] [aot] Rewrite LLVM AOT tests with LlvmRuntimeExecutor (5358) (by **Zhanlue Yang**)
- [example] Avoid f64 type in simulation/initial_value_problem.py (5355) (by **Proton**)
- [ci] testing: add retention-days for broken wheels (5326) (by **Proton**)
- [test] (Decomp of 5251 4/n) Delete tests for AsyncTaichi (5357) (by **Lin Jiang**)
- [llvm] [refactor] (Decomp of 5251 2/n) Make modulegen a virtual function and let LLVMCompiledData replace ModuleGenValue (5353) (by **Lin Jiang**)
- [gui] Support exporting gif && video in GGUI (5354) (by **Mocki**)
- [autodiff] Handle field accessing by zero for forward mode (5339) (by **Mingrui Zhang**)
- [llvm] [refactor] (Decomp of 5251 3/n) Remove codegen from OffloadedTask and let it replace OffloadedTaskCacheData (5356) (by **Lin Jiang**)
- [refactor] Turn off stack traceback info by default (5347) (by **Ailing**)
- [refactor] (Decomp of 5251 1/n) Move ParallelExecutor out of async engine (5351) (by **Lin Jiang**)
- [Lang] Improve ti example so that users can choose which example to run by entering numbers. (5265) (by **Zhao Liang**)
- [gui] Add get_view_matrix() and get_projection_matrix() APIs for camera (5345) (by **Mocki**)
- [bug] Added warning messages for implicit type conversion for RangeFor boundaries (5322) (by **Zhanlue Yang**)
- [example] Fix simulation/waterwave.py:update race condition (5346) (by **Proton**)
- [Lang] [type] Refine SNode with quant 5/n: Rename bit_array to quant_array (5344) (by **Yi Xu**)
- [llvm] [aot] Added CGraph tests for LLVM backend (5305) (by **Zhanlue Yang**)
- [autodiff] [test] Add for-loop tests for forward mode (5336) (by **Mingrui Zhang**)
- [example] Lower example GUI resolution to fit buildbot display (5337) (by **Proton**)
- [build] [bug] Fix building on macOS 10.14 failed (5332) (by **PGZXB**)
- [llvm] [aot] Replaced LlvmProgramImpl with LlvmRuntimeExecutor for LlvmAotModuleLoader (5330) (by **Zhanlue Yang**)
- [AOT] Fixed certain crashes in C-API (5335) (by **PENGUINLIONG**)
- [Lang] [type] Make bit_vectorize a parameter of ti.loop_config (5334) (by **Yi Xu**)
- [autodiff] Skip store forwarding to keep the GlobalLoadStmt alive (5315) (by **Mingrui Zhang**)
- [llvm] [aot] Modified ModuleToFunctionConverter to use LlvmRuntimeExecutor instead of LlvmProgramImpl (5328) (by **Zhanlue Yang**)
- [llvm] Changed LlvmProgramImpl to save cache_data_ with unique_ptr instead of raw object (5329) (by **Zhanlue Yang**)
- [Lang] [type] Refine SNode with quant 3/n: Turn bit_vectorize into an on/off switch (5331) (by **Yi Xu**)
- [misc] Fix a few compilation warnings (5325) (by **yekuang**)
- [bug] Accept numpy integers in ndrange (5245) (5323) (by **Proton**)
- [misc] Implement cache file cleaning (5310) (by **PGZXB**)
- Fixed C-API build on Android (5321) (by **PENGUINLIONG**)
- [AOT] Save AOT module artifacts as zip archive (5316) (by **PENGUINLIONG**)
- [llvm] [aot] Added LLVM backend support for Compute Graph (5294) (by **Zhanlue Yang**)
- [AOT] Unity native plugin interfaces (5273) (by **PENGUINLIONG**)
- [autodiff] Check not placed field.grad when needs_grad = True (5295) (by **Mingrui Zhang**)
- [autodiff] Fix alloca block and add control flow test case for forward mode (5301) (by **Mingrui Zhang**)
- [refactor] Synchronize should always be called in non-async mode (5302) (by **Ailing**)
- [Lang] Add error message for missing init call (5280) (by **Zhao Liang**)
- Update prtags.json (5304) (by **Bob Cao**)
- [refactor] Get rid ndarray host accessor kernels (by **Ailing Zhang**)
- [refactor] Use device api for CPU/CUDA ndarray (by **Ailing Zhang**)
- [refactor] Switch to using staging buffer for metal/vulkan/opengl (by **Ailing Zhang**)
- [llvm] Use LlvmProgramImpl::cache_data_ to store compiled kernel info (5290) (by **Zhanlue Yang**)
- [opengl] Texture support in OpenGL (5296) (by **Bob Cao**)
- [build] [refactor] Cleanup backends folder and rename to RHI (5288) (by **Bo Qiao**)
- [Lang] Fix fractal gui close warning (5281) (by **Zhao Liang**)
- [autodiff] [test] Add atomic test for forward autodiff (5286) (by **Mingrui Zhang**)
- [dx11] Fix DX backend with new runtime & Better D3D11 buffer handling (5244) (by **Bob Cao**)
- [autodiff] Set default seed only for scalar parameter to avoid silent unexpected results (5287) (by **Mingrui Zhang**)
- test (5292) (by **Ailing**)
- [AOT] Added C-API for on-device memory copy (5271) (by **PENGUINLIONG**)
- [Doc] Fix typos (5283) (by **Kian-Meng Ang**)
- [autodiff] Support control flow for forward mode (by **mingrui**)
- [autodiff] Support for-loop and mutation for forward mode (by **mingrui**)
- [autodiff] Refactor dual field allocation (by **mingrui**)
- [AOT] Refactor C-API codegen (5272) (by **PENGUINLIONG**)
- Update README.md (5279) (by **Taichi contributor**)
- [metal] Support memcpy_internal via buffer_copy (5268) (by **Ailing**)
- [bug] Fix missing old but useful metadata in offline cache (5267) (by **PGZXB**)
- [Lang] [type] Refine SNode with quant 2/n: Enable struct for on bit_array with bit_vectorize off (5253) (by **Yi Xu**)
- [Doc] Update dev_install.md (5266) (by **Vissidarte-Herman**)
- [build] [bug] Fix dependency for opengl_rhi target (by **Bo Qiao**)
- Update fallback order, move opengl behind Vulkan (5257) (by **Bob Cao**)
- [opengl] Move OpenGL backend onto Gfx runtime (5246) (by **Bob Cao**)
- [build] [refactor] Move LLVM source files to target locations (5254) (by **Bo Qiao**)
- [bug] Fixed misuse of std::forward (5237) (by **Zhanlue Yang**)
- [AOT] Added safety checks to prevent hard crashes on failure (5249) (by **PENGUINLIONG**)
- [build] [refactor] Move shaders source files to runtime (5247) (by **Bo Qiao**)
- [example] Fix diff_sph example with --train (5242) (by **Mingrui Zhang**)
- [misc] Add filename option to ti.tools.VideoManager. (5219) (by **Qian Bao**)
- [bug] Throw exceptions when ndrange gets non-integral arguments (5245) (by **Mike He**)
- [build] [refactor] Move wasm and dx11 source files to target locations (5235) (by **Bo Qiao**)
- [type] [bug] Refine SNode with quant 1/n: Fix (atomic_)set_mask_bN (5238) (by **Yi Xu**)
- [lang] 1d/3d texture support (5233) (by **Bob Cao**)
- [vulkan] Fix OpBranch for reversed RangeForStmt (5241) (by **Mingrui Zhang**)
- [build] Fix -Werror errors for TI_WITH_CUDA_TOOLKIT=ON (5133) (5216) (by **Proton**)
- [ci] Enable pylint on examples (5222) (by **Proton**)
- [llvm] [aot] Split LlvmRuntimeExecutor from LlvmProgramImpl (5207) (by **Zhanlue Yang**)
- [type] [refactor] Decouple quant from SNode 3/n: Extend bit pointers (5232) (by **Yi Xu**)
- [vulkan] Codegen & runtime improvements (5213) (by **Bob Cao**)
- [gui] Fix the device memory leak when GGUI terminates (by **Ailing Zhang**)
- [gui] Let gui and renderer manage the resource they own (by **Ailing Zhang**)
- [AOT] Unity language binding generator (5204) (by **PENGUINLIONG**)
- [type] [refactor] Decouple quant from SNode 2/n: Remove physical_type from QuantIntType (5223) (by **Yi Xu**)
- [type] [refactor] Decouple quant from SNode 1/n: Add BitStructTypeBuilder (5209) (by **Yi Xu**)
- [build] [refactor] Move metal source files to target locations (5208) (by **Bo Qiao**)
- [lang] Export a few types from the share library (5220) (by **yekuang**)
- [llvm] [refactor] LLVMProgramImpl code clean up: part-5 (5197) (by **Zhanlue Yang**)
- [spirv] Fixed `OpLoad` with physical address (5212) (by **PENGUINLIONG**)
- [wip] Enable full wheel build when TI_EXPORT_CORE is on (5211) (by **Ailing**)
- [llvm] [refactor] LLVMProgramImpl code clean up: part-4 (5189) (by **Zhanlue Yang**)
- Move spdlog include to profiler.cpp (5210) (by **Ailing**)
- Fix ti gallery command bug (5196) (by **Zhao Liang**)
- [misc] Improve TI_STATIC_ASSERT compatibility (5205) (by **Yuanming Hu**)
- [llvm] [refactor] LLVMProgramImpl code clean up: part-3 (5188) (by **Zhanlue Yang**)
- Fixed C-API provision (5203) (by **PENGUINLIONG**)
- [lang] Improve error message when literal val is out of range of default dtype (5191) (by **Ailing**)
- [Lang] [ir] Refactor indexing expressions in AST & enforce integer indices (5138) (by **daylily**)
- Remove stale coverage from README.md (5202) (by **yekuang**)
- [ci] Slim cpu build image (5198) (by **Proton**)
- [build] [refactor] Move opengl source files to target locations (5200) (by **Bo Qiao**)
- [example] Fix dtype for metal backend and enforce vulkan (5201) (by **Mingrui Zhang**)
- [Doc] Updated README command lines (5199) (by **Vissidarte-Herman**)
- [llvm] [refactor] LLVMProgramImpl code clean up: part-2 (5187) (by **Zhanlue Yang**)
- [AOT] Support Matrix/Vector as graph arguments (5165) (by **Haidong Lan**)
- [refactor] Enable adaptive block_dim selection for CPU backend (5190) (by **Bo Qiao**)
- [Doc] Modify compilation warnings (5180) (by **Olinaaaloompa**)
- [ci] Save wheel to artifact when test fails (5186) (by **Proton**)
- [gui] Detailed error message when GGUI is not available (5164) (by **Proton**)
- [ci] Run C++ tests on Windows (5176) (by **Proton**)
- [lang] Texture support 3/n (Python changes) (5174) (by **Bob Cao**)
- [llvm] [refactor] LLVMProgramImpl code clean up: part-1 (5181) (by **Zhanlue Yang**)
- [AOT] Implementation of Taichi Runtime C-API (5168) (by **PENGUINLIONG**)
- [refactor] [autodiff] Clean redundant compiled functions and refactor kernel key (5178) (by **Mingrui Zhang**)
- [doc] Add badge on README.md (5177) (by **yanqingzhang**)
- [lang] Texture support 2/n (SPIR-V backend & runtime changes) (5159) (by **Bob Cao**)
- [build] Export cmake config to ease clients usage in Cmake (5162) (by **Bo Qiao**)
- [refactor] [autodiff] Refactor autodiff api and add corresponding tests (5175) (by **Mingrui Zhang**)
- [aot] [llvm] LLVM AOT Field part-4: Added AOT tests for Fields - CUDA backend (5124) (by **Zhanlue Yang**)
- [type] [refactor] Consistently use quant_xxx in quant-related names (5166) (by **Yi Xu**)
- [cuda] Disable reduction in non-full warps (5161) (by **Bob Cao**)
- [autodiff] Support basic operations for forward mode autodiff (by **mingrui**)
- [autodiff] Add a context manager for forward mode autodiff (by **mingrui**)
- [AOT] C-APIs for Taichi runtime distribution (5150) (by **PENGUINLIONG**)
- [cli] Improve user interface for CLI command ti example (5153) (by **Zhao Liang**)
- [Doc] Updated odop.md, removing obsolete information (5163) (by **Vissidarte-Herman**)
- [autodiff] [refactor] Refactor autodiff tape api and TapeImpl (5154) (by **Mingrui Zhang**)
- [type] [refactor] Separate CustomFixedType from CustomFloatType (5149) (by **Yi Xu**)
- [ui] Properly fix UTF-8 title string by converting to UTF16 (5155) (by **Bob Cao**)
- [aot] [llvm] LLVM AOT Field 3: Added AOT tests for Fields - CPU backend (5121) (by **Zhanlue Yang**)
- Bump version to v1.0.4 (5157) (by **Taichi Gardener**)
- [lang] Texture support 1/n (Context & Programs) (5139) (by **Bob Cao**)

1.0.3

Highlights:
- **Aot module**
  - Support importing external Vulkan buffers (5020) (by **PENGUINLIONG**)
  - Supported inclusion of taichi as subdirectory for AOT modules (5007) (by **PENGUINLIONG**)
- **Bug fixes**
  - Fix frontend type check for reading a whole bit_struct (5027) (by **Yi Xu**)
  - Remove redundant AllocStmt when lowering FrontendWhileStmt (4870) (by **Zhanlue Yang**)
- **Build system**
  - Improve Windows build script (4955) (by **PENGUINLIONG**)
  - Improved building on Windows (4925) (by **PENGUINLIONG**)
  - Define Cmake OpenGL runtime target (4887) (by **Bo Qiao**)
  - Use keywords instead of plain target_link_libraries CMake (4864) (by **Bo Qiao**)
  - Define runtime build target (4838) (by **Bo Qiao**)
  - Switch to scikit-build as the build backend (4624) (by **Frost Ming**)
- **Documentation**
  - Improve ODOP doc structure (5089) (by **Yi Xu**)
  - Add documentation of Taichi Struct Classes. (5075) (by **bsavery**)
  - Updated type system (5054) (by **Vissidarte-Herman**)
  - Branding updates. Also tests netlify. (4994) (by **Vissidarte-Herman**)
  - Fix netlify cache & sync doc without pr content (5003) (by **Justin**)
  - Update trouble shooting URL in bug report template (4988) (by **Haidong Lan**)
  - Updated URL (4990) (by **Vissidarte-Herman**)
  - Fix docs deploy netlify test configuration (4991) (by **Justin**)
  - Updated relative path (4929) (by **Vissidarte-Herman**)
  - Updated broken links (4912) (by **Vissidarte-Herman**)
  - Updated links that may break. (4874) (by **Vissidarte-Herman**)
  - Add limitation about TLS optimization (4877) (by **Ailing**)
- **Examples**
  - Fix block_dim warning in ggui (5128) (by **Zhao Liang**)
  - Update visual effects of mass_spring_3d_ggui.py (5081) (by **Zhao Liang**)
  - Update mass_spring_3d_ggui.py to v2 (3879) (by **Alex Brown**)
- **Language and syntax**
  - Add more initialization routines for glsl matrix types (5069) (by **Zhao Liang**)
  - Support constructing vector and matrix ndarray from ti.ndarray() (by **ailzhang**)
  - Disallow reading a whole bit_struct (5061) (by **Yi Xu**)
  - Struct Classes implementation (4989) (by **bsavery**)
  - Add short-circuit if-then-else operator (5022) (by **daylily**)
  - Build sparse matrix from ndarray (4841) (by **pengyu**)
  - Fix potential precision bug when using math vector and matrix types (5032) (by **Zhao Liang**)
  - Refactor quant type definition APIs (5036) (by **Yi Xu**)
  - Fix parameter name 'range' for ti.types.quant.fixed (5006) (by **Yi Xu**)
  - Refactor quantized_types module and make quant APIs public (4985) (by **Yi Xu**)
  - Add more functions to math module (4939) (by **Zhao Liang**)
  - Support sparse matrix datatype and storage format configuration (4673) (by **pengyu**)
  - Copy-free interaction between Taichi and PaddlePaddle (4886) (by **0xzhang**)
- **LLVM backend (CPU and CUDA)**
  - Add AOT builder and loader (5013) (by **yekuang**)
- **Metal backend**
  - Support Ndarray (4720) (by **yekuang**)
- **RFC**
  - AOT for all SNodes (4806) (by **yekuang**)
- **SIMT programming**
  - Add match_all warp intrinsics (4961) (by **Zeyu Li**)
  - Add match_any warp intrinsics (4921) (by **Zeyu Li**)
  - Add uni_sync warp intrinsics (4927) (by **0xzhang**)
  - Add activemask warp intrinsics (4918) (by **Zeyu Li**)
  - Add syncwarp warp intrinsics (4917) (by **Zeyu Li**)
- **Vulkan backend**
  - Fixed vulkan backend crash on AOT examples (5047) (by **PENGUINLIONG**)
- **GitHub Actions/Workflows**
  - Update release_test.sh (4960) (by **Chuandong Yan**)

Full changelog:
- [aot] [llvm] LLVM AOT Field 2: Updated LLVM AOTModuleLoader & AOTModuleBuilder to support Fields (5120) (by **Zhanlue Yang**)
- [type] [refactor] Misc improvements to quant codegen (5129) (by **Yi Xu**)
- [ci] Enable yapf and isort on example files (5140) (by **Ailing**)
- [Example] Fix block_dim warning in ggui (5128) (by **Zhao Liang**)
- fix mass_spring_3d_ggui backend (5127) (by **Zhao Liang**)
- [lang] Texture support 0/n: IR changes (5134) (by **Bob Cao**)
- Editorial update (5119) (by **Olinaaaloompa**)
- [aot] [llvm] LLVM AOT Field 1: Adjust serialization/deserialization logics for FieldCacheData (5111) (by **Zhanlue Yang**)
- [aot][bug] Use cached compiled kernel pointer when it's added to graph (5122) (by **Ailing**)
- [aot] [llvm] LLVM AOT Field 0: Implemented FieldCacheData & refactored initialize_llvm_runtime_snodes() (5108) (by **Zhanlue Yang**)
- [autodiff] Add forward mode pipeline for autodiff pass (5098) (by **Mingrui Zhang**)
- [build] [refactor] Move Vulkan runtime out of backends dir (5106) (by **Bo Qiao**)
- [bug] Fix build without llvm backend crash (5113) (by **Bo Qiao**)
- [type] [llvm] [refactor] Fix function names in codegen_llvm_quant (5115) (by **Yi Xu**)
- [llvm] [refactor] Replace cast_int() with LLVM native integer cast (5110) (by **Yi Xu**)
- [type] [refactor] Remove redundant promotion for custom int in type_check (5102) (by **Yi Xu**)
- [Example] Update visual effects of mass_spring_3d_ggui.py (5081) (by **Zhao Liang**)
- [test] Save mpm88 graph in python and load in C++ test. (5104) (by **Ailing**)
- [llvm] [refactor] Move load_bit_pointer() to CodeGenLLVM (5099) (by **Yi Xu**)
- [refactor] Remove ndarray element shape from extra arg buffer (5100) (by **Haidong Lan**)
- [refactor] Update Ndarray constructor used in AOT runtime. (5095) (by **Ailing**)
- clean hidden override functions (5097) (by **Mingrui Zhang**)
- [llvm] [aot] CUDA-AOT PR 2: Implemented AOTModuleLoader & AOTModuleBuilder for LLVM-CUDA backend (5087) (by **Zhanlue Yang**)
- [Doc] Improve ODOP doc structure (5089) (by **Yi Xu**)
- Use pre-calculated runtime size array for gfx runtime. (5094) (by **Haidong Lan**)
- [bug] Minor fix for ndarray element_shape in graph mode (5093) (by **Ailing**)
- [llvm] [refactor] Use LLVM native atomic ops if possible (5091) (by **Yi Xu**)
- [autodiff] Extract shared components for reverse and forward mode (5088) (by **Mingrui Zhang**)
- [llvm] [aot] Add LLVM-CPU AOT tests (5079) (by **Zhanlue Yang**)
- [Doc] Add documentation of Taichi Struct Classes. (5075) (by **bsavery**)
- [build] [refactor] Change CMake global include_directories to target based function (5082) (by **Bo Qiao**)
- [autodiff] Allocate dual and adjoint snode (5083) (by **Mingrui Zhang**)
- [refactor] Make sure Ndarray shape is field shape (5085) (by **Ailing**)
- [llvm] [refactor] Merge AtomicOpStmt codegen in CPU and CUDA backends (5086) (by **Yi Xu**)
- [llvm] [aot] CUDA-AOT PR 1: Extracted common logics from CPUAotModuleImpl into LLVMAotModule (5072) (by **Zhanlue Yang**)
- [infra] Refactor Vulkan runtime into true Common Runtime (5058) (by **Bob Cao**)
- [refactor] Correctly set ndarray element_size and nelement (5080) (by **Ailing**)
- [cuda] [simt] Add assertions for warp intrinsics on old GPUs (5077) (by **Bo Qiao**)
- [Lang] Add more initialization routines for glsl matrix types (5069) (by **Zhao Liang**)
- [spirv] Specialize element shape for spirv codegen. (5068) (by **Haidong Lan**)
- [llvm] Specialize element shape for LLVM backend (5071) (by **Haidong Lan**)
- [doc] Fix broken link for github action status badge (5076) (by **Ailing**)
- [Example] Update mass_spring_3d_ggui.py to v2 (3879) (by **Alex Brown**)
- [refactor] Resolve comments from 5065 (5074) (by **Ailing**)
- [Lang] Support constructing vector and matrix ndarray from ti.ndarray() (by **ailzhang**)
- [refactor] Pass element_shape and layout to C++ Ndarray (by **ailzhang**)
- [refactor] Specialized Ndarray Type is (element_type, shape, layout) (by **ailzhang**)
- [aot] [CUDA-AOT PR 0] Refactored compile_module_to_executable() to CUDAModuleToFunctionConverter (5070) (by **Zhanlue Yang**)
- [refactor] Split GraphBuilder out of Graph class (5064) (by **Ailing**)
- [build] [bug] Ensure the assets folder is copied to the project directory (5063) (by **Frost Ming**)
- [bug] Remove operator ! for Expr (5062) (by **Yi Xu**)
- [Lang] [type] Disallow reading a whole bit_struct (5061) (by **Yi Xu**)
- [Lang] Struct Classes implementation (4989) (by **bsavery**)
- [Lang] [ir] Add short-circuit if-then-else operator (5022) (by **daylily**)
- [bug] Ndarray type should include primitive dtype as well (5052) (by **Ailing**)
- [Doc] Updated type system (5054) (by **Vissidarte-Herman**)
- [bug] Added type promotion support for atan2 (5037) (by **Zhanlue Yang**)
- [Lang] Build sparse matrix from ndarray (4841) (by **pengyu**)
- Set host_write to false for opengl ndarray (5038) (by **Ailing**)
- [ci] Run cpp tests via run_tests.py (5035) (by **yekuang**)
- Exit CI builds when download of prebuilt packages fails (5043) (by **PENGUINLIONG**)
- [Vulkan] Fixed vulkan backend crash on AOT examples (5047) (by **PENGUINLIONG**)
- [Lang] Fix potential precision bug when using math vector and matrix types (5032) (by **Zhao Liang**)
- [Metal] Support Ndarray (4720) (by **yekuang**)
- [Lang] [type] Refactor quant type definition APIs (5036) (by **Yi Xu**)
- [aot] Bind graph APIs to python and add mpm88 example (5034) (by **Ailing**)
- [aot] Move ArgKind as first argument in Arg class (by **ailzhang**)
- [aot] Serialize built graph, deserialize and run. (by **ailzhang**)
- [ci] Disable win cpu docker job test (5033) (by **Bo Qiao**)
- [doc] Update OS names (5030) (by **Bo Qiao**)
- fix fast_gui rgba bug (5031) (by **Zhao Liang**)
- [Bug] [type] Fix frontend type check for reading a whole bit_struct (5027) (by **Yi Xu**)
- [AOT] Support importing external Vulkan buffers (5020) (by **PENGUINLIONG**)
- [SIMT] Add match_all warp intrinsics (4961) (by **Zeyu Li**)
- [bug] Revert freeing ndarray memory when python GC triggers (5019) (by **Ailing**)
- [ci] Fix nightly macos (5018) (by **Bo Qiao**)
- [Llvm] Add AOT builder and loader (5013) (by **yekuang**)
- [aot] Build and run graph without serialization (by **Ailing Zhang**)
- [test] Unify kernel setup for ndarray related tests (by **Ailing Zhang**)
- [ci] [build] Enable ccache for windows docker (5001) (by **Frost Ming**)
- [refactor] Move get ndarray data ptr to program (5012) (by **pengyu**)
- [bug] Fixed numerical error for Atomic-Sub between unsigned values with different number of bits (5011) (by **Zhanlue Yang**)
- [llvm] Add serializable LlvmLaunchArgInfo (4992) (by **yekuang**)
- [doc] Update community section (4943) (by **yanqingzhang**)
- [SIMT] Add match_any warp intrinsics (4921) (by **Zeyu Li**)
- [Lang] [type] Fix parameter name 'range' for ti.types.quant.fixed (5006) (by **Yi Xu**)
- [misc] Version bump: v1.0.2 -> v1.0.3 (5008) (by **Haidong Lan**)
- [AOT] Supported inclusion of taichi as subdirectory for AOT modules (5007) (by **PENGUINLIONG**)
- [Doc] Branding updates. Also tests netlify. (4994) (by **Vissidarte-Herman**)
- [refactor] Get rid of data_ptr_ in Ndarray (by **Ailing Zhang**)
- [refactor] Move ndarray fast fill methods to Program (by **Ailing Zhang**)
- [refactor] Free ndarray's memory when python GC triggers (by **Ailing Zhang**)
- [refactor] Construct ndarray from existing DeviceAllocation. (by **Ailing Zhang**)
- [test] Add test for Ndarray from DeviceAllocation (by **Ailing Zhang**)
- [refactor] Program owns allocated ndarrays. (by **Ailing Zhang**)
- [Doc] Fix netlify cache & sync doc without pr content (5003) (by **Justin**)
- [test] Fix a few mis-configured ndarray tests (5000) (by **Ailing**)
- Update README.md (by **Vissidarte-Herman**)
- [Lang] [type] Refactor quantized_types module and make quant APIs public (4985) (by **Yi Xu**)
- [Doc] Update trouble shooting URL in bug report template (4988) (by **Haidong Lan**)
- [Doc] Updated URL (4990) (by **Vissidarte-Herman**)
- [Doc] Fix docs deploy netlify test configuration (4991) (by **Justin**)
- [llvm] Use serializer for LLVM cache (4982) (by **yekuang**)
- Provision of prebuilt LLVM 10 for VS2022 (4987) (by **PENGUINLIONG**)
- [Workflow] Update release_test.sh (4960) (by **Chuandong Yan**)
- [cuda] Add block and grid level intrinsic for cuda backend (4977) (by **YuZhang**)
- [bug] Fix infinite recursion of get_offline_cache_key_of_snode_impl() (4983) (by **PGZXB**)
- [misc] Add ASTSerializer::visit(ReferenceExpression *) (4984) (by **PGZXB**)
- [llvm] Support both BC and LL cache format (4979) (by **yekuang**)
- [refactor] Improve serializer and cleanup utils (4980) (by **yekuang**)
- [Build] Improve Windows build script (4955) (by **PENGUINLIONG**)
- [llvm] Make cache writer support BC format (4978) (by **yekuang**)
- [ci] [build] Containerize Windows CPU build and test (4933) (by **Bo Qiao**)
- [llvm] Make codegen produce static llvm::Module (4975) (by **yekuang**)
- [test] Add an ndarray test in C++. (4972) (by **Ailing**)
- [build] Fixed Illegal Instruction Error when importing PaddlePaddle module (4969) (by **Zhanlue Yang**)
- [llvm] Create ModuleToFunctionConverter (4962) (by **yekuang**)
- [bug] [simt] Fix the problem that some intrinsics are never called (4957) (by **Yi Xu**)
- [vulkan] Set kApiVersion to VK_API_VERSION_1_3 (4970) (by **Haidong Lan**)
- [ci] Add new buildbot with latest driver for Linux/Vulkan test (4953) (by **Bo Qiao**)
- [RFC] AOT for all SNodes (4806) (by **yekuang**)
- [llvm] Move cache directory to dump() (4963) (by **yekuang**)
- [lang] Add reference type support on real functions (4889) (by **Lin Jiang**)
- [refactor] Some renamings (4959) (by **yekuang**)
- [refactor] Add ArrayMetadata to store the array runtime size (4950) (by **yekuang**)
- [lang] [bug] Implement Expression serializing and fix some bugs (4931) (by **PGZXB**)
- [Lang] Add more functions to math module (4939) (by **Zhao Liang**)
- [Build] Improved building on Windows (4925) (by **PENGUINLIONG**)
- [ci] Fix Nightly (4948) (by **Bo Qiao**)
- [build] Limit -Werror to Clang-compiler only (4947) (by **Zhanlue Yang**)
- [refactor] [llvm] Remove struct_compiler_ as a member variable (4945) (by **yekuang**)
- [build] Turned off -Werror temporarily for issues with performance-bot (4946) (by **Zhanlue Yang**)
- [refactor] Remove unused snode_trees in ProgramImpl interface (4942) (by **yekuang**)
- [doc] Updated documentations for implicit type casting rules (4885) (by **Zhanlue Yang**)
- [build] Turn on -Werror on Linux and Mac platforms (4928) (by **Zhanlue Yang**)
- [build] Enable -Werror on Linux & Mac (4941) (by **Zhanlue Yang**)
- [SIMT] Add uni_sync warp intrinsics (4927) (by **0xzhang**)
- [lang] Fix type check warnings for ti.Mesh (4930) (by **Chang Yu**)
- [Lang] Support sparse matrix datatype and storage format configuration (4673) (by **pengyu**)
- [Doc] Updated relative path (4929) (by **Vissidarte-Herman**)
- [refactor] Simplify Matrix's initializer (4923) (by **yekuang**)
- [build] Warning Suppression PR 4: Fixed warnings with MacOS (4926) (by **Zhanlue Yang**)
- [build] Warning Suppression PR 3: Eliminate warnings from third-party headers (4920) (by **Zhanlue Yang**)
- [SIMT] Add activemask warp intrinsics (4918) (by **Zeyu Li**)
- [build] Warning Suppression PR 1: Turned on -Wno-ignored-attributes & Removed unused functions (4916) (by **Zhanlue Yang**)
- [refactor] Create MatrixImpl to differentiate Taichi and Python scopes (4853) (by **yekuang**)
- [SIMT] Add syncwarp warp intrinsics (4917) (by **Zeyu Li**)
- [build] Warning Suppression PR 2: Fixed codebase warnings (4909) (by **Zhanlue Yang**)
- [test] Exit on error during Paddle windows test (4910) (by **Bo Qiao**)
- [Doc] Updated broken links (4912) (by **Vissidarte-Herman**)
- remove debug print (4883) (by **yixu**)
- [test] Cancel tests for Paddle on GPU (4914) (by **0xzhang**)
- [Lang] [test] Copy-free interaction between Taichi and PaddlePaddle (4886) (by **0xzhang**)
- Use Ninja generator on Windows and skip generator test (4896) (by **Frost Ming**)
- [vulkan] Add new VMA vulkan functions. (4893) (by **Bob Cao**)
- [vulkan] Fix typo for waitSemaphoreCount (4892) (by **Gabriel H**)
- [Build] [refactor] Define Cmake OpenGL runtime target (4887) (by **Bo Qiao**)
- [aot] [vulkan] Expose symbols for AOT (4879) (by **yekuang**)
- [bug] Fixed type promotion rule for bit-shift operations (4884) (by **Zhanlue Yang**)
- [Build] [refactor] Use keywords instead of plain target_link_libraries CMake (4864) (by **Bo Qiao**)
- [metal] Migrate runtime's MTLBuffer allocation to unified device API (4865) (by **yekuang**)
- [error] [lang] Improved error messages for illegal slicing or indexing to ti.field (4873) (by **Zhanlue Yang**)
- [Doc] Updated links that may break. (4874) (by **Vissidarte-Herman**)
- [metal] Complete Device API (4862) (by **yekuang**)
- [vulkan] Device API explicit semaphores (4852) (by **Bob Cao**)
- [build] Change the library output dir for export core (4880) (by **Frost Ming**)
- [refactor] Add ASTSerializer and use it to generate offline-cache-key (4863) (by **PGZXB**)
- [ci] Use the updated docker image for libtaichi_export_core (4881) (by **Bo Qiao**)
- [Doc] Add limitation about TLS optimization (4877) (by **Ailing**)
- [Build] [refactor] Define runtime build target (4838) (by **Bo Qiao**)
- [ci] Add libtaichi_export_core build for desktop in CI (4871) (by **Ailing**)
- [build] [bug] Fix a bug of skbuild that loses the root package_dir (4875) (by **Frost Ming**)
- [Bug] Remove redundant AllocStmt when lowering FrontendWhileStmt (4870) (by **Zhanlue Yang**)
- [misc] Bump version to v1.0.2 (4867) (by **Taichi Gardener**)
- [build] Install export core library to build dir (4866) (by **Frost Ming**)
- [Build] Switch to scikit-build as the build backend (4624) (by **Frost Ming**)

1.0.2

Highlights:

The v1.0.2 release is a patch fix that improves Taichi's stability on multiple platforms, especially for GGUI and the Vulkan backend.
- **Bug fixes**
- Remove redundant AllocStmt when lowering FrontendWhileStmt (4870) (by **Zhanlue Yang**)
- **Build system**
- Define Cmake OpenGL runtime target (4887) (by **Bo Qiao**)
- Use keywords instead of plain target_link_libraries CMake (4864) (by **Bo Qiao**)
- Define runtime build target (4838) (by **Bo Qiao**)
- Switch to scikit-build as the build backend (4624) (by **Frost Ming**)
- **Documentation**
- Add limitation about TLS optimization (4877) (by **Ailing**)

Full changelog:
- [ci] Fix Nightly (4948) (by **Bo Qiao**)
- [ci] [build] Containerize Windows CPU build and test (4933) (by **Bo Qiao**)
- [vulkan] Set kApiVersion to VK_API_VERSION_1_3 (4970) (by **Haidong Lan**)
- [ci] Add new buildbot with latest driver for Linux/Vulkan test (4953) (by **Bo Qiao**)
- [vulkan] Add new VMA vulkan functions. (4893) (by **Bob Cao**)
- [vulkan] Fix typo for waitSemaphoreCount (4892) (by **Gabriel H**)
- [Build] [refactor] Define Cmake OpenGL runtime target (4887) (by **Bo Qiao**)
- [Build] [refactor] Use keywords instead of plain target_link_libraries CMake (4864) (by **Bo Qiao**)
- [vulkan] Device API explicit semaphores (4852) (by **Bob Cao**)
- [build] Change the library output dir for export core (4880) (by **Frost Ming**)
- [ci] Use the updated docker image for libtaichi_export_core (4881) (by **Bo Qiao**)
- [Doc] Add limitation about TLS optimization (4877) (by **Ailing**)
- [Build] [refactor] Define runtime build target (4838) (by **Bo Qiao**)
- [ci] Add libtaichi_export_core build for desktop in CI (4871) (by **Ailing**)
- [build] [bug] Fix a bug of skbuild that loses the root package_dir (4875) (by **Frost Ming**)
- [Bug] Remove redundant AllocStmt when lowering FrontendWhileStmt (4870) (by **Zhanlue Yang**)
- [misc] Bump version to v1.0.2 (4867) (by **Taichi Gardener**)
- [build] Install export core library to build dir (4866) (by **Frost Ming**)
- [Build] Switch to scikit-build as the build backend (4624) (by **Frost Ming**)

1.0.1

Highlights:
- **Automatic differentiation**
- Implement ti.ad.no_grad to skip autograd (4751) (by **Shawn Yao**)
- **Bug fixes**
- Fix and refactor type check for atomic ops (4858) (by **Yi Xu**)
- Fix and refactor type check for local stores (4843) (by **Yi Xu**)
- Fix implicit cast warning for global stores (4834) (by **Yi Xu**)
- **Documentation**
- Updated URL (4847) (by **Vissidarte-Herman**)
- LLVM sparse runtime design doc (4790) (by **yekuang**)
- Proofread Getting started (4682) (by **Vissidarte-Herman**)
- Editorial review to fields (advanced) (4686) (by **Vissidarte-Herman**)
- Update docstring for ti.Mesh (4818) (by **Chang Yu**)
- Remove redundant semicolon in path (4801) (by **gaoxinge**)
- **Error messages**
- Show warning when serialize=True is set on a struct for (4844) (by **Lin Jiang**)
- Provide source code info in warnings (4840) (by **Yi Xu**)
- **Language and syntax**
- Add single character property for vector swizzle && test (4845) (by **Zhao Liang**)
- Remove obsolete vectypes class (4831) (by **LiangZhao**)
- Add support for keyword arguments (4794) (by **Lin Jiang**)
- Support swizzles on all Matrix/Vector types (4828) (by **yekuang**)
- Add 2d and 3d rotation functions to math module (4822) (by **Zhao Liang**)
- Walkaround Vulkan backend behavior which changes cwd on Mac (4812) (by **TiGeekMan**)
- Add mod function to math module (4809) (by **Zhao Liang**)
- Support in-place operator of ti.Matrix in python scope (4799) (by **Lin Jiang**)
- Move short-circuit boolean logic into AST-to-IR passes (4580) (by **daylily**)
- Promote output type of log, exp, and sqrt ops (4622) (by **Andrew Sun**)
- Fix integral type promotion rules (e.g., u8 + u8 now leads to u8 instead of i32) (4789) (by **Yuanming Hu**)
- Add basic complex arithmetic and add a mandelbrot example (4780) (by **Zhao Liang**)
- **SIMT programming**
- Add shfl_down_f32 intrinsic. (4819) (by **Chun Cai**)

Full changelog:
- [gui] Avoid implicit type casts in staging_buffer (4861) (by **Yi Xu**)
- [lang] Add better error detection for swizzle patterens (4860) (by **yekuang**)
- [Bug] [ir] Fix and refactor type check for atomic ops (4858) (by **Yi Xu**)
- [Doc] Updated URL (4847) (by **Vissidarte-Herman**)
- [bug] Fix bug that building with TI_EXPORT_CORE:BOOL=ON failed (4850) (by **PGZXB**)
- [Error] Show warning when serialize=True is set on a struct for (4844) (by **Lin Jiang**)
- [lang] Group related Matrix methods closer (4836) (by **yekuang**)
- [Lang] Add single character property for vector swizzle && test (4845) (by **Zhao Liang**)
- [Bug] [ir] Fix and refactor type check for local stores (4843) (by **Yi Xu**)
- [Error] Provide source code info in warnings (4840) (by **Yi Xu**)
- [misc] Update pre-commit hooks (4713) (by **pre-commit-ci[bot]**)
- [Bug] [ir] Fix implicit cast warning for global stores (4834) (by **Yi Xu**)
- [mesh] Remove link hints from ti.Mesh (4825) (by **yixu**)
- [Lang] Remove obsolete vectypes class (4831) (by **LiangZhao**)
- [doc] Fix doc link (4835) (by **yekuang**)
- [Doc] LLVM sparse runtime design doc (4790) (by **yekuang**)
- [Lang] Add support for keyword arguments (4794) (by **Lin Jiang**)
- [Lang] Support swizzles on all Matrix/Vector types (4828) (by **yekuang**)
- [test] Add simple test for offline-cache-key of compile-config (4805) (by **PGZXB**)
- [vulkan] Device API blending (4815) (by **Bob Cao**)
- [spirv] Fix int casts (4814) (by **Bob Cao**)
- [gui] Only call ImGui_ImplVulkan_Shutdown if it's initialized (4827) (by **Ailing**)
- [ci] Use a new PAT for project with org permission (4826) (by **Frost Ming**)
- [Lang] Add 2d and 3d rotation functions to math module (4822) (by **Zhao Liang**)
- [Doc] Proofread Getting started (4682) (by **Vissidarte-Herman**)
- [Doc] Editorial review to fields (advanced) (4686) (by **Vissidarte-Herman**)
- [bug] Fix bug that building with gcc9.4 will fail (4823) (by **PGZXB**)
- [SIMT] Add shfl_down_f32 intrinsic. (4819) (by **Chun Cai**)
- [workflow] Add issues to project when issue opened (4816) (by **Frost Ming**)
- [vulkan] Fix vulkan initialization on macOS with cpu backend (4813) (by **Bob Cao**)
- [Doc] [mesh] Update docstring for ti.Mesh (4818) (by **Chang Yu**)
- [vulkan] Fix Vulkan device score bug (4803) (by **Andrew Sun**)
- [Lang] Walkaround Vulkan backend behavior which changes cwd on Mac (4812) (by **TiGeekMan**)
- [misc] Add SNode to offline-cache key (4716) (by **PGZXB**)
- [Lang] Add mod function to math module (4809) (by **Zhao Liang**)
- [doc] Fix doc of running C++ tests (4798) (by **Yi Xu**)
- [Lang] Support in-place operator of ti.Matrix in python scope (4799) (by **Lin Jiang**)
- [Lang] [ir] Move short-circuit boolean logic into AST-to-IR passes (4580) (by **daylily**)
- [lang] Fix frontend type check for sqrt, log, exp (4797) (by **Yi Xu**)
- [Doc] Remove redundant semicolon in path (4801) (by **gaoxinge**)
- [Lang] [ir] Promote output type of log, exp, and sqrt ops (4622) (by **Andrew Sun**)
- [ci] Update ci images to use latest git (4792) (by **Bo Qiao**)
- [Lang] Fix integral type promotion rules (e.g., u8 + u8 now leads to u8 instead of i32) (4789) (by **Yuanming Hu**)
- [Lang] Add basic complex arithmetic and add a mandelbrot example (4780) (by **Zhao Liang**)
- Update index.md (4791) (by **Bob Cao**)
- [spirv] Add 16 bit float immediate number (4787) (by **Bob Cao**)
- [ci] Update ubuntu 18.04 image to use latest git (4785) (by **Frost Ming**)
- [lang] Store relations with 16-bit type (4779) (by **Chang Yu**)
- [Autodiff] Implement ti.ad.no_grad to skip autograd (4751) (by **Shawn Yao**)
- [misc] Remove some unnecessary attributes from offline-cache key of compile-config (4770) (by **PGZXB**)
- [doc] Update install instruction with "--upgrade" (4775) (by **Yuanming Hu**)
- Expose VboHelpers class (4773) (by **Ailing**)
- Bump version to v1.0.1 (4774) (by **Taichi Gardener**)
- [refactor] Merge Kernel.argument_names and argument_annotations (4753) (by **dongqi shen**)
- [dx11] Constant buffer binding and AtomicIncrement in RAND_STATE (4650) (by **quadpixels**)

1.0

Supported Data Types
Argument Packs are currently compatible with a variety of data types, including `scalar`, `matrix`, `vector`, `Ndarray`, and `Struct`.

Limitations
Please note that Argument Packs currently do not support the following features and data types:
- Ahead-of-Time (AOT) Compilation and Compute Graph
- `ti.template`
- `ti.data_oriented`

2. Improvements

1.0.0

v1.0.0 was released on April 13, 2022.
Compatibility changes
License change
Taichi's license is changed from MIT to Apache-2.0 after a public vote in [4607](https://github.com/taichi-dev/taichi/discussions/4607).
Python 3.10 support
This release supports Python 3.10 on all supported operating systems (Windows, macOS, and Linux).
Manylinux2014-compatible wheels
Before v1.0.0, Taichi worked only on Linux distributions that support glibc 2.27+ (for example Ubuntu 18.04+). As of v1.0.0, in addition to the normal Taichi wheels, Taichi provides manylinux2014-compatible wheels that work on most modern Linux distributions, including CentOS 7.
- The normal wheels support all backends; the incoming manylinux2014-compatible wheels support the CPU and CUDA backends only. Choose the wheels that work best for you.
- If you encounter any issue when installing the wheels, try upgrading your **pip** to the latest version first.
Deprecations
- This release deprecates `ti.ext_arr()` and uses `ti.types.ndarray()` instead. `ti.types.ndarray()` supports both Taichi Ndarrays and external arrays, for example NumPy arrays.
- Taichi plans to drop support for Python 3.6 in the next minor release (v1.1.0). If you have any questions or concerns, please let us know at [4772](https://github.com/taichi-dev/taichi/discussions/4772).
New features
Non-Python deployment solution
By working together with OPPO US Research Center, Taichi delivers Taichi AOT, a solution for deploying kernels in non-Python environments, such as on mobile devices.

Compiled Taichi kernels can be saved from a Python process, then loaded and run by the [provided C++ runtime library](https://github.com/taichi-dev/taichi/releases/download/v1.0.0/libtaichi_export_core.so). With a set of APIs, your Python/Taichi code can be easily deployed in any C++ environment. We demonstrate the simplicity of this workflow by porting [the implicit FEM (finite element method) demo](https://github.com/taichi-dev/taichi/blob/master/python/taichi/examples/simulation/implicit_fem.py) released in v0.9.0 to an Android application. Download the [Android package](https://github.com/taichi-dev/taichi/releases/download/v1.0.0/TaichiAOT.apk) and find out what Taichi AOT has to offer! If you want to try out this solution, please also check out [the `taichi-aot-demo` repo](https://github.com/taichi-dev/taichi-aot-demo).

<p align="center">
<img width=35% src=https://github.com/taichi-dev/taichi/releases/download/v1.0.0/taichi-aot-demo.gif>
</p>

```python
# In Python app.py
module = ti.aot.Module(ti.vulkan)
module.add_kernel(my_kernel, template_args={'x': x})
module.save('my_app')
```

The following code snippet shows the C++ workflow for loading the compiled AOT modules.
```cpp
// Initialize Vulkan program pipeline
taichi::lang::vulkan::VulkanDeviceCreator::Params evd_params;
evd_params.api_version = VK_API_VERSION_1_2;
auto embedded_device =
    std::make_unique<taichi::lang::vulkan::VulkanDeviceCreator>(evd_params);

std::vector<uint64_t> host_result_buffer;
host_result_buffer.resize(taichi_result_buffer_entries);
taichi::lang::vulkan::VkRuntime::Params params;
params.host_result_buffer = host_result_buffer.data();
params.device = embedded_device->device();
auto vulkan_runtime = std::make_unique<taichi::lang::vulkan::VkRuntime>(std::move(params));

// Load AOT module saved from Python
taichi::lang::vulkan::AotModuleParams aot_params{"my_app", vulkan_runtime.get()};
auto module = taichi::lang::aot::Module::load(taichi::Arch::vulkan, aot_params);
auto my_kernel = module->get_kernel("my_kernel");

// Allocate device buffer
taichi::lang::Device::AllocParams alloc_params;
alloc_params.host_write = true;
alloc_params.size = /*Ndarray size for `x`*/;
alloc_params.usage = taichi::lang::AllocUsage::Storage;
auto devalloc_x = embedded_device->device()->allocate_memory(alloc_params);

// Execute my_kernel without Python environment
taichi::lang::RuntimeContext host_ctx;
host_ctx.set_arg_devalloc(/*arg_id=*/0, devalloc_x, /*shape=*/{128}, /*element_shape=*/{3, 1});
my_kernel->launch(&host_ctx);
```

Note that Taichi only supports the Vulkan backend in the C++ runtime library. The Taichi team is working on supporting more backends.
Real functions (experimental)
All Taichi functions are inlined into the Taichi kernel during compile time. However, the kernel becomes lengthy and requires longer compile time if it has too many Taichi function calls. This becomes especially obvious if a Taichi function involves [compile-time recursion](https://docs.taichi-lang.org/lang/articles/meta#compile-time-recursion-of-tifunc). For example, the following code calculates the Fibonacci numbers recursively:
```python
@ti.func
def fib_impl(n: ti.template()):
    if ti.static(n <= 0):
        return 0
    if ti.static(n == 1):
        return 1
    return fib_impl(n - 1) + fib_impl(n - 2)

@ti.kernel
def fibonacci(n: ti.template()):
    print(fib_impl(n))
```

In this code, `fib_impl()` recursively calls itself until `n` reaches `1` or `0`. The total number of calls to `fib_impl()` grows exponentially with `n`, so the length of the kernel also grows exponentially. When `n` reaches `25`, it takes more than a minute to compile the kernel.
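
To get a feel for this growth, the calls can be counted directly. The following is a plain-Python sketch, not Taichi code: `count_calls` is a hypothetical helper introduced here, and it assumes that each invocation of the naive recursion corresponds to one inlined copy of the function body.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_calls(n: int) -> int:
    """Number of invocations made by the naive Fibonacci recursion for input n."""
    if n <= 1:
        # Base cases (n <= 0 or n == 1): a single call, no further recursion.
        return 1
    # One call for this level plus every call made by the two sub-problems.
    return 1 + count_calls(n - 1) + count_calls(n - 2)


for n in (5, 10, 15, 20, 25):
    print(n, count_calls(n))
```

The count roughly multiplies by 1.6 each time `n` increases by one, which is consistent with the compile time becoming impractical around `n = 25`.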

This release introduces the "real function", a new type of Taichi function that compiles independently instead of being inlined into the kernel. It is an experimental feature and supports only scalar arguments and scalar return values for now.

You can use it by decorating the function with `ti.experimental.real_func`. For example, the following is the real function version of the code above.
```python
@ti.experimental.real_func
def fib_impl(n: ti.i32) -> ti.i32:
    if n <= 0:
        return 0
    if n == 1:
        return 1
    return fib_impl(n - 1) + fib_impl(n - 2)

@ti.kernel
def fibonacci(n: ti.i32):
    print(fib_impl(n))
```

The length of the kernel does not increase as `n` grows because the kernel only makes a call to the function instead of inlining the whole function. As a result, the code takes far less than a second to compile regardless of the value of `n`.

The main differences between a normal Taichi function and a real function are listed below:
- You can write return statements in any part of a real function, while you cannot write return statements inside the scope of non-static `if` / `for` / `while` statements in a normal Taichi function.
- A real function can be called recursively at runtime, while a normal Taichi function only supports compile-time recursion.
- The return value and arguments of a real function must be type hinted, while the type hints are optional in a normal Taichi function.
Type annotations for literals
Previously, you could not explicitly give a type to a literal. For example,
```python
@ti.kernel
def foo():
    a = 2891336453  # i32 overflow (>2^31-1)
```

In the code snippet above, `2891336453` is first converted to the default integer type (`ti.i32` unless changed), which causes an overflow. Starting from v1.0.0, you can write type annotations for literals:
```python
@ti.kernel
def foo():
    a = ti.u32(2891336453)  # similar to 2891336453u in C
```
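
The size of the problem can be checked outside Taichi. The sketch below uses only Python's standard `ctypes` module to reinterpret the literal as a signed and as an unsigned 32-bit integer. It illustrates two's-complement wrap-around in general; it is an assumption, not something stated in these notes, that a given Taichi backend produces exactly this value on overflow.

```python
import ctypes

literal = 2891336453

# Largest value a signed 32-bit integer can hold.
I32_MAX = 2**31 - 1
print(literal > I32_MAX)  # the literal does not fit in a signed 32-bit integer

# ctypes performs no overflow checking, so the same bit pattern is simply
# reinterpreted as signed or unsigned 32-bit.
as_i32 = ctypes.c_int32(literal).value
as_u32 = ctypes.c_uint32(literal).value
print(as_i32, as_u32)
```

The unsigned interpretation preserves the literal, which is why annotating it with `ti.u32` avoids the overflow.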

Top-level loop configurations
You can use `ti.loop_config` to control the behavior of the subsequent top-level for-loop. Available parameters are:
- `block_dim`: Sets the number of threads in a block on GPU.
- `parallelize`: Sets the number of threads to use on CPU.
- `serialize`: If you set `serialize` to `True`, the for-loop runs serially, and you can write break statements inside it (this applies only to range/ndrange for-loops). Setting `serialize` to `True` is equivalent to setting `parallelize` to `1`.

Here are two examples:
```python
@ti.kernel
def break_in_serial_for() -> ti.i32:
    a = 0
    ti.loop_config(serialize=True)
    for i in range(100):  # This loop runs serially
        a += i
        if i == 10:
            break
    return a

break_in_serial_for()  # returns 55
```

```python
n = 128
val = ti.field(ti.i32, shape=n)

@ti.kernel
def fill():
    ti.loop_config(parallelize=8, block_dim=16)
    # If the kernel is run on the CPU backend, 8 threads will be used to run it
    # If the kernel is run on the CUDA backend, each block will have 16 threads
    for i in range(n):
        val[i] = i
```

`math` module
This release adds a `math` module to support GLSL-standard vector operations and to make it easier to port GLSL shader code to Taichi. For example, vector types, including `vec2`, `vec3`, `vec4`, `mat2`, `mat3`, and `mat4`, and functions, including `mix()`, `clamp()`, and `smoothstep()`, act similarly to their counterparts in GLSL. See the following examples:
Vector initialization and swizzling
You can use the `rgba`, `xyzw`, `uvw` properties to get and set vector entries:
```python
import taichi.math as tm

@ti.kernel
def example():
    v = tm.vec3(1.0)  # (1.0, 1.0, 1.0)
    w = tm.vec4(0.0, 1.0, 2.0, 3.0)
    v.rgg += 1.0  # v = (2.0, 3.0, 1.0)
    w.zxy += tm.sin(v)
```

Matrix multiplication
Each Taichi vector is implemented as a column vector. Ensure that you put the matrix before the vector in a matrix multiplication.
```python
@ti.kernel
def example():
    M = ti.Matrix([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
    v = tm.vec3(1, 2, 3)
    w = (M @ v).xyz  # [1, 2, 3]
```

GLSL-standard functions
```python
@ti.kernel
def example():
    v = tm.vec3(0., 1., 2.)
    w = tm.smoothstep(0.0, 1.0, v.xyz)
    w = tm.clamp(w, 0.2, 0.8)
```
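
For readers unfamiliar with GLSL, the sketch below writes out the standard GLSL definitions of `clamp`, `mix`, and `smoothstep` in plain Python and applies them component by component to the same input. These are reference formulas from GLSL rather than Taichi's implementation; it is an assumption that `taichi.math` matches them component-wise.

```python
def clamp(x: float, lo: float, hi: float) -> float:
    """Restrict x to the closed interval [lo, hi]."""
    return max(lo, min(x, hi))


def mix(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b with weight t."""
    return a * (1.0 - t) + b * t


def smoothstep(edge0: float, edge1: float, x: float) -> float:
    """Hermite interpolation between 0 and 1 as x goes from edge0 to edge1."""
    t = clamp((x - edge0) / (edge1 - edge0), 0.0, 1.0)
    return t * t * (3.0 - 2.0 * t)


v = [0.0, 1.0, 2.0]
# Same two steps as the kernel above, applied to each component.
w = [clamp(smoothstep(0.0, 1.0, c), 0.2, 0.8) for c in v]
print(w)
```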

CLI command `ti gallery`
This release introduces a CLI command `ti gallery`, allowing you to select and run Taichi examples in a pop-up window. To do so:
1. Open a terminal:
```bash
ti gallery
```

*A window pops up:*

<p align="center">
<img src=https://github.com/taichi-dev/taichi/releases/download/v1.0.0/taichi-gallery.jpg>
</p>

2. Click to run any example in the pop-up window.
*The console prints the corresponding source code at the same time.*
Improvements
Enhanced matrix type
As of v1.0.0, Taichi accepts matrix or vector types as parameters and return values. You can use `ti.types.matrix` or `ti.types.vector` as the type annotations.

Taichi also supports basic, read-only matrix slicing. Use the `mat[:,:]` syntax to quickly retrieve a specific portion of a matrix. See [Slicings](https://docs.taichi-lang.org/lang/articles/reference#slicings) for more information.

The following code example shows how to get numbers in four corners of a `3x3` matrix `mat`:
```python
import taichi as ti

ti.init()

@ti.kernel
def foo(mat: ti.types.matrix(3, 3, ti.i32)) -> ti.types.matrix(2, 2, ti.i32):
    corners = mat[::2, ::2]
    return corners

mat = ti.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
corners = foo(mat)  # [[1 3] [7 9]]
```

Note that in a slice, the lower bound, the upper bound, and the stride must be constant integers. If you want to use a variable index together with a slice, you should set `ti.init(dynamic_index=True)`. For example:
```python
import taichi as ti

ti.init(dynamic_index=True)

@ti.kernel
def foo(mat: ti.types.matrix(3, 3, ti.i32), ind: ti.i32) -> ti.types.matrix(3, 1, ti.i32):
    col = mat[:, ind]
    return col

mat = ti.Matrix([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
col = foo(mat, 2)  # [3 6 9]
```
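
If the two-dimensional slice syntax is unfamiliar, the same selections can be written with nested Python lists. This is only a plain-Python analogy for what the slices pick out, not a description of how Taichi implements them.

```python
mat = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Analogue of mat[::2, ::2]: every second row, then every second column.
corners = [row[::2] for row in mat[::2]]

# Analogue of mat[:, ind]: all rows, one column chosen at runtime.
ind = 2
col = [row[ind] for row in mat]

print(corners, col)
```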

More flexible Autodiff: Kernel Simplicity Rule removed
Flexibility is key to the user experience of an automatic-differentiation (AD) system. Before v1.0.0, the Taichi AD system required that a differentiable Taichi kernel consist only of multiple simply nested for-loops (shown in `task1` below). This was once called the Kernel Simplicity Rule (KSR). KSR prevented Taichi's users from writing differentiable kernels with multiple serial for-loops (shown in `task2` below) or with a mixture of serial for-loops and non-for statements (shown in `task3` below).
```python
# OK: multiple simply nested for-loops
@ti.kernel
def task1():
    for i in range(2):
        for j in range(3):
            for k in range(3):
                y[None] += x[None]

# Error: multiple serial for-loops
@ti.kernel
def task2():
    for i in range(2):
        for j in range(3):
            y[None] += x[None]
        for j in range(3):
            y[None] += x[None]

# Error: a mixture of serial for-loop and non-for
@ti.kernel
def task3():
    for i in range(2):
        y[None] += x[None]
        for j in range(3):
            y[None] += x[None]
```
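
Because every statement in these kernels adds `x` to `y`, the gradient that a correct AD system should report is simply the number of times that statement runs. The plain-Python sketch below (no Taichi; `finite_difference` is a hypothetical helper introduced here) computes reference values for `task2` and `task3` numerically. It assumes the loop nesting that the labels describe: two sibling inner loops in `task2`, and a statement followed by an inner loop in `task3`.

```python
def task2(x: float) -> float:
    y = 0.0
    for i in range(2):
        for j in range(3):
            y += x
        for j in range(3):
            y += x
    return y


def task3(x: float) -> float:
    y = 0.0
    for i in range(2):
        y += x
        for j in range(3):
            y += x
    return y


def finite_difference(f, x: float, h: float = 1e-3) -> float:
    """Central-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


print(finite_difference(task2, 1.0))  # close to 12 = 2 * (3 + 3)
print(finite_difference(task3, 1.0))  # close to 8 = 2 * (1 + 3)
```

Values like these can serve as a sanity check when comparing against gradients produced by an AD system.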

With KSR removed in this release, code with different kinds of for-loop structures can be differentiated, as shown in the snippet below.
```python
# OK: A complicated control flow that is still differentiable in Taichi
for j in range(2):
    for i in range(3):
        y[None] += x[None]
    for i in range(3):
        for ii in range(2):
            y[None] += x[None]
        for iii in range(2):
            y[None] += x[None]
            for iv in range(2):
                y[None] += x[None]
    for i in range(3):
        for ii in range(2):
            for iii in range(2):
                y[None] += x[None]
```


Taichi provides a [demo](https://github.com/taichi-dev/taichi/blob/master/python/taichi/examples/autodiff/diff_sph/diff_sph.py) to demonstrate how to implement a differentiable simulator using this enhanced Taichi AD system.
<p align="center">
<img width=35% src=https://github.com/taichi-dev/taichi/releases/download/v1.0.0/diff-sph-demo.gif>
</p>

f-string support in an `assert` statement
This release supports including an f-string in an `assert` statement as an error message. You can include scalar variables in the f-string. See the example below:
```python
import taichi as ti

ti.init(debug=True)

@ti.kernel
def assert_is_zero(n: ti.i32):
    assert n == 0, f"The number is {n}, not zero"

assert_is_zero(42)  # TaichiAssertionError: The number is 42, not zero
```

Note that the `assert` statement works only in debug mode.
Documentation changes
Taichi language reference
This release comes with [the first version of the Taichi language specification](https://docs.taichi-lang.org/lang/articles/reference), which attempts to provide an exhaustive description of the syntax and semantics of the Taichi language and makes a decent reference for Taichi's users and developers when they determine if a specific behavior is correct, buggy, or undefined.
API changes
Deprecated
| **Deprecated** | **Replaced by** |
| -------------- | -------------------- |
| `ti.ext_arr()` | `ti.types.ndarray()` |

Full changelog
- [example] Add diff sph demo (4769) (by **Mingrui Zhang**)
- [autodiff] Fix nullptr during adjoint codegen (4771) (by **Ye Kuang**)
- [bug] Fix kernel profiler on CPU backend (4768) (by **Lin Jiang**)
- [example] Fix taichi_dynamic example (4767) (by **Yi Xu**)
- [aot] Provide a convenient API to set devallocation as argument (4762) (by **Ailing**)
- [Lang] Deprecate ti.pyfunc (4764) (by **Lin Jiang**)
- [misc] Bump version to v1.0.0 (4763) (by **Yi Xu**)
- [SIMT] Add all_sync warp intrinsics (4718) (by **Yongmin Hu**)
- [doc] Taichi spec: calls, unary ops, binary ops and comparison (4663) (by **squarefk**)
- [SIMT] Add any_sync warp intrinsics (4719) (by **Yongmin Hu**)
- [Doc] Update community standard (4759) (by **notginger**)
- [Doc] Propose the RFC process (4755) (by **Ye Kuang**)
- [Doc] Fixed a broken link (4758) (by **Vissidarte-Herman**)
- [Doc] Taichi spec: conditional expressions and simple statements (4728) (by **Xiangyun Yang**)
- [bug] [lang] Let matrix initialize to the target type (4744) (by **Lin Jiang**)
- [ci] Fix ci nightly (4754) (by **Bo Qiao**)
- [doc] Taichi spec: compound statements, if, while (4658) (by **Lin Jiang**)
- [build] Simplify build command for android (4752) (by **Ailing**)
- [lang] Add PolygonMode enum for rasterizer (4750) (by **Ye Kuang**)
- [Aot] Support template args in AOT module add_kernel (4748) (by **Ye Kuang**)
- [lang] Support in-place operations on math vectors (4738) (by **Lin Jiang**)
- [ci] Add python 3.6 and 3.10 to nightly release (4740) (by **Bo Qiao**)
- [Android] Fix Android get height issue (4743) (by **Ye Kuang**)
- Updated logo (4745) (by **Vissidarte-Herman**)
- [Error] Raise an error when non-static condition is passed into ti.static_assert (4735) (by **Lin Jiang**)
- [Doc] Taichi spec: For (4689) (by **Lin Jiang**)
- [SIMT] [cuda] Use correct source lane offset for warp intrinsics (4734) (by **Bo Qiao**)
- [SIMT] Add shfl_xor_i32 warp intrinsics (4642) (by **Yongmin Hu**)
- [Bug] Fix warnings (4730) (by **Peng Yu**)
- [Lang] Add vector swizzle feature to math module (4629) (by **TiGeekMan**)
- [Doc] Taichi spec: static expressions (4702) (by **Lin Jiang**)
- [Doc] Taichi spec: assignment expressions (4725) (by **Xiangyun Yang**)
- [mac] Fix external_func test failures on arm backend (4733) (by **Ailing**)
- [doc] Fix deprecated tools APIs in docs, tests, and examples (4729) (by **Yi Xu**)
- [ci] Switch to self-hosted PyPI for nightly release (4706) (by **Bo Qiao**)
- [Doc] Taichi spec: boolean operations (4724) (by **Xiangyun Yang**)
- [doc] Fix deprecated profiler APIs in docs, tests, and examples (4726) (by **Yi Xu**)
- [spirv] Ext arr name should include arg id (4727) (by **Ailing**)
- [SIMT] Add shfl_sync_i32/f32 warp intrinsics (4717) (by **Yongmin Hu**)
- [Lang] Add 2x2/3x3 matrix solve with Guass elimination (4634) (by **Peng Yu**)
- [metal] Tweak Device to support Ndarray (4721) (by **Ye Kuang**)
- [build] Fix non x64 linux builds (4715) (by **Bob Cao**)
- [Doc] Fix 4 typos in doc (4714) (by **Jiayi Weng**)
- [simt] Subgroup reduction primitives (4643) (by **Bob Cao**)
- [misc] Remove legacy LICENSE.txt (4708) (by **Yi Xu**)
- [gui] Make GGUI VBO configurable for mesh (4707) (by **Yuheng Zou**)
- [Docs] Change License from MIT to Apache-2.0 (4701) (by **notginger**)
- [Doc] Update docstring for module misc (4644) (by **Zhao Liang**)
- [doc] Proofread GGUI.md (4676) (by **Vissidarte-Herman**)
- [refactor] Remove Expression::serialize and add ExpressionHumanFriendlyPrinter (4657) (by **PGZXB**)
- [Doc] Remove extension_libraries in doc site (4696) (by **LittleMan**)
- [Lang] Let assertion error message support f-string (4700) (by **Lin Jiang**)
- [Doc] Taichi spec: prims, attributes, subscriptions, slicings (4697) (by **Yi Xu**)
- [misc] Add compile-config to offline-cache key (4681) (by **PGZXB**)
- [refactor] Remove legacy usage of ext_arr/any_arr in codebase (4698) (by **Yi Xu**)
- [doc] Taichi spec: pass, return, break, and continue (4656) (by **Lin Jiang**)
- [bug] Fix chain assignment (4695) (by **Lin Jiang**)
- [Doc] Refactored GUI.md (4672) (by **Vissidarte-Herman**)
- [misc] Update linux version name (4685) (by **Jiasheng Zhang**)
- [bug] Fix ndrange when start > end (4690) (by **Lin Jiang**)
- [bug] Fix bugs in test_offline_cache.py (4674) (by **PGZXB**)
- [Doc] Fix gif link (4694) (by **Ye Kuang**)
- [Lang] Add math module to support glsl-style functions (4683) (by **LittleMan**)
- [Doc] Editorial updates (4688) (by **Vissidarte-Herman**)
- Editorial updates (4687) (by **Vissidarte-Herman**)
- [ci] [windows] Add Dockerfile for Windows build and test (CPU) (4667) (by **Bo Qiao**)
- [Doc] Taichi spec: list and dictionary displays (4665) (by **Yi Xu**)
- [CUDA] Fix the fp32 to fp64 promotion due to incorrect fmax/fmin call (4664) (by **Haidong Lan**)
- [misc] Temporarily disable a flaky test (4669) (by **Yi Xu**)
- [bug] Fix void return (4654) (by **Lin Jiang**)
- [Workflow] Use pre-commit hooks to check codes (4633) (by **Frost Ming**)
- [SIMT] Add ballot_sync warp intrinsics (4641) (by **Wimaxs**)
- [refactor] [cuda] Refactor offline-cache and support it on arch=cuda (4600) (by **PGZXB**)
- [Error] [doc] Add TaichiAssertionError and add assert to the lang spec (4649) (by **Lin Jiang**)
- [Doc] Taichi spec: parenthesized forms; expression lists (4653) (by **Yi Xu**)
- [Doc] Updated definition of a 0D field. (4651) (by **Vissidarte-Herman**)
- [Doc] Taichi spec: variables and scope; atoms; names; literals (4621) (by **Yi Xu**)
- [doc] Fix broken links and update docs. (4647) (by **Chengchen(Rex) Wang**)
- [Bug] Fix broken links (4646) (by **Peng Yu**)
- [Doc] Refactored field.md (4618) (by **Vissidarte-Herman**)
- [gui] Allow to configure the texture data type (4630) (by **Gabriel H**)
- [vulkan] Fixes the string comparison when querying extensions (4638) (by **Bob Cao**)
- [doc] Add docstring for ti.loop_config (4625) (by **Lin Jiang**)
- [SIMT] Add shfl_up_i32/f32 warp intrinsics (4632) (by **Yu Zhang**)
- [Doc] Examples directory update (4640) (by **dongqi shen**)
- [vulkan] Choose better devices (4614) (by **Bob Cao**)
- [SIMT] Implement ti.simt.warp.shfl_down_i32 and add stubs for other warp-level intrinsics (4616) (by **Yuanming Hu**)
- [refactor] Refactor Identifier::id_counter to global and local counter (4581) (by **PGZXB**)
- [android] Disable XDG on non-supported platform (4612) (by **Gabriel H**)
- [gui] [aot] Allow set_image to use user VBO (4611) (by **Gabriel H**)
- [Doc] Add docsting for Camera class in ui module (4588) (by **Zhao Liang**)
- [metal] Implement buffer_fill for unified device API (4595) (by **Ye Kuang**)
- [Lang] Matrix 3x3 eigen decomposition (4571) (by **Peng Yu**)
- [Doc] Set up the basis of Taichi specification (4603) (by **Yi Xu**)
- [gui] Make GGUI VBO configurable for particles (4610) (by **Yuheng Zou**)
- [Doc] Update with Python 3.10 support (4609) (by **Bo Qiao**)
- [misc] Bump version to v0.9.3 (4608) (by **Taichi Gardener**)
- [Lang] Deprecate ext_arr/any_arr in favor of types.ndarray (4598) (by **Yi Xu**)
- [Doc] Adjust CPU GUI document layout (4605) (by **Peng Yu**)
- [Doc] Refactored Type system. (4584) (by **Vissidarte-Herman**)
- [lang] Fix vector matrix ndarray to numpy layout (4597) (by **Bo Qiao**)
- [bug] Fix bug that caching kernels with same AST will fail (4582) (by **PGZXB**)
