Fixed
- Fix build problem with CUDA 10.2.
- Fix some bugs related to NVRTC.
0.3.1
Fixed
- Fix CPU build problem.
0.3.0
Added
- Add Ampere support: faster fp16, faster tf32, and much faster int8 kernels on Ampere GPUs.
- Add NVRTC support for the conv kernel.
Removed
- Drop Python 3.6 support.
Changed
- BREAKING CHANGE: change the dtype enum values for some important reason.
0.2.8
Fixed
- Fix missing sm37 in the supported architectures.
0.2.7
Added
- Add sm37 for cu102.
- Add compile info (CUDA arch) for more informative error messages.
0.2.6
Fixed
- Fix a small bug that incorrectly limited the simt arch to sm52.