Highlights:
- **Bug fixes**
  - Fix OffsetAndExtractBitsStmt optimization and improve documentation on virtual/physical indices (1259) (by **Yuanming Hu**)
  - Fix image I/O for channels = 1 and improve test coverage (unrevert) (1242) (by **彭于斌**)
  - Fix tuple assignment behavior in Taichi-scope (1247) (by **彭于斌**)
- **Documentation**
  - Standardize versioning and release workflow (1220) (by **Yuanming Hu**)
- **Language and syntax**
  - Support vector unpacking (e.g., "a, b = ti.Vector([1, 2])") (1252) (by **彭于斌**)
- **IR optimization passes**
  - Remove exceptions from IR check_out_of_bound and constant_fold (1251) (by **Xuanda Yang**)
Full changelog:
- [workflow] Improve PR title checker for the release tag (1244) (by **彭于斌**)
- [Lang] Support vector unpacking (e.g., "a, b = ti.Vector([1, 2])") (1252) (by **彭于斌**)
- [perf] Refactor kernel profiler (1261) (by **Yuanming Hu**)
- [Bug] [opt] [doc] Fix OffsetAndExtractBitsStmt optimization and improve documentation on virtual/physical indices (1259) (by **Yuanming Hu**)
- [perf] Improve dynamic SNode performance (stage 3) (1238) (by **xumingkuan**)
- [misc] Add auto-profiling to IR passes (1255) (by **xumingkuan**)
- [doc] Remove changelog link from readme (1209) (by **Chengchen(Rex) Wang**)
- [Opt] [ir] [refactor] Remove exceptions from IR check_out_of_bound and constant_fold (1251) (by **Xuanda Yang**)
- [Bug] [misc] Fix image I/O for channels = 1 and improve test coverage (unrevert) (1242) (by **彭于斌**)
- [Bug] [lang] Fix tuple assignment behavior in Taichi-scope (1247) (by **彭于斌**)
- [ir] Set root block kernel after cloning (1245) (by **Ye Kuang**)
- [cli] Add "ti dist" command to test in release mode (1227) (by **彭于斌**)
- [Doc] Standardize versioning and release workflow (1220) (by **Yuanming Hu**)
- [test] Add coverage threshold to make codecov not fail so easily (1246) (by **彭于斌**)
- [bug] [lang] Use functools.wraps for non-classfunc too (1233) (by **彭于斌**)
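
For readers unfamiliar with the last entry, here is a minimal pure-Python sketch of what `functools.wraps` preserves on a decorated function. The decorators `traced` and `traced_without_wraps` are hypothetical names made up for illustration; they are not Taichi's decorators, and the sketch does not show how Taichi applies `functools.wraps` internally.

```python
import functools


def traced(func):
    """Hypothetical decorator (not a Taichi API) that uses functools.wraps."""

    @functools.wraps(func)  # copies __name__, __doc__, etc. from func
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


def traced_without_wraps(func):
    """Same hypothetical decorator, but without functools.wraps."""

    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@traced
def add(a, b):
    """Return a + b."""
    return a + b


@traced_without_wraps
def sub(a, b):
    """Return a - b."""
    return a - b


# With wraps, the decorated function keeps its own metadata.
print(add.__name__, "|", add.__doc__)
# Without wraps, the metadata of the inner wrapper leaks through.
print(sub.__name__, "|", sub.__doc__)
```

Without `functools.wraps`, tools that rely on `__name__` or `__doc__` (help output, profilers, error messages) see the inner wrapper instead of the user's function.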