Highlights:
- **Language and syntax**
  - Rename "ti.cache_shared" to "ti.block_local" (2030) (by **Zhiya Luo**, welcome!)

Full changelog:
- [opt] Algebraic simplification for sar/shl/shr (2031) (by **xumingkuan**)
- [type] Support bit-level read and write in Python-scope (2029) (by **Jiafeng Liu**)
- [Lang] [refactor] Rename "ti.cache_shared" to "ti.block_local" (2030) (by **Zhiya Luo**)
- [type] Refactor bit pointers (2028) (by **Yuanming Hu**)
- [async] Use loop-unique info for fusion (2012) (by **xumingkuan**)
- [ir] [opt] Demote BitExtractStmt into a series of binary operations for optimization (1795) (by **彭于斌**)
- [misc] Fix Type* ownership in Python-scope (2026) (by **Yuanming Hu**)
- [example] Interpolate vertices for mciso_advanced.py to make it smoother (1991) (by **彭于斌**)
- [opengl] [refactor] Move rand_state from runtime to gtmp to reduce SSBO numbers (2021) (by **彭于斌**)
- [type] Add BitArrayType and corresponding SNodes (2017) (by **Xuanda Yang**)
- [opengl] [refactor] Reduce SSBO numbers: merge earg with args (2020) (by **彭于斌**)
- [misc] Add clear_profile_info() (2018) (by **Ye Kuang**)
- [opengl] [perf] Grid-stride loop for all types of loops (2016) (by **彭于斌**)
- [metal] Support pointer SNode in codegen (2015) (by **Ye Kuang**)
- [async] Support activation demotion in "if" statements (2009) (by **Yuanming Hu**)