- Add `model` parameter to support v4.0~v4.6 models.
- Add `ensemble` parameter to smooth predictions in areas where the estimation is uncertain.
- Fix corruption with FP16 mode on 4K video.
- Replace `multi` parameter with `factor_num`, `factor_den`, `fps_num` and `fps_den` for rational frame rate change.
- Add `sc` and `sc_threshold` parameters for scene change detection.
- Add `cuda_graphs` parameter to use CUDA Graphs.
- Add `fusion` parameter to enable fusion through nvFuser.
- Remove `device_type` parameter. No one bothers to run deep learning inference on CPU anyway.
- Add `num_streams` parameter for parallel execution.
- Remove `fp16` parameter. Precision is now controlled by the format of the clip: `RGBH` format uses FP16 mode and `RGBS` format uses FP32 mode.
- Add `trt`, `trt_max_workspace_size`, and `trt_cache_path` parameters for TensorRT support.

With TensorRT, it should run at least 40~50% faster than the previous version or the RIFE-ncnn-Vulkan implementation when using FP16 mode on GPUs with Tensor Cores.

For ease of installation on Windows, you can download the CUDA 7z file, which contains the required runtime libraries and the Python wheel file. Either add the extracted directory to your system `PATH` or copy the DLL files to a directory that is already in your system `PATH`. Finally, `pip install` the Python wheel file.
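
To illustrate the rational frame rate change introduced by `factor_num`, `factor_den`, `fps_num` and `fps_den`, here is a minimal sketch of the arithmetic using only the Python standard library. The helper `output_fps` is hypothetical and not part of the plugin. It assumes that the output rate is the source rate multiplied by `factor_num / factor_den`, and that `fps_num / fps_den`, when both are given, replaces that result as an explicit target rate. Both behaviours are assumptions made for illustration, not something these notes confirm.

```python
from fractions import Fraction


def output_fps(src_fps, factor_num=2, factor_den=1, fps_num=None, fps_den=None):
    """Hypothetical helper: compute the exact output frame rate.

    Assumption: an explicit fps_num/fps_den target takes precedence over
    the factor_num/factor_den multiplier.
    """
    if fps_num is not None and fps_den is not None:
        # Explicit target rate, kept as an exact rational number.
        return Fraction(fps_num, fps_den)
    # Otherwise scale the source rate by a rational factor.
    return src_fps * Fraction(factor_num, factor_den)


# 23.976 fps source (24000/1001) multiplied by 5/2.
src = Fraction(24000, 1001)
print(output_fps(src, factor_num=5, factor_den=2))  # 60000/1001
```

Keeping the rate as a numerator/denominator pair avoids the rounding drift that a floating-point multiplier would introduce for NTSC-style rates such as 24000/1001.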