[TorchCodec 0.1](https://pytorch.org/torchcodec/0.1/) is out! It is **packed** with exciting features, and it is the first release of TorchCodec that we're pushing for wide adoption.
- [Installation Instructions](https://github.com/pytorch/torchcodec?tab=readme-ov-file#installing-torchcodec)
- [Getting Started](https://pytorch.org/torchcodec/0.1/generated_examples/index.html)
New features and improvements
GPU Decoding
Decoding can now be done on CUDA GPUs by passing the `device` parameter: `decoder = VideoDecoder(..., device="cuda")`. GPU decoding can lead to faster decoding pipelines in a variety of cases. To learn how to install and use GPU decoding, follow our [GPU decoding example](https://pytorch.org/torchcodec/0.1/generated_examples/basic_cuda_example.html#sphx-glr-generated-examples-basic-cuda-example-py)!
Clip sampling
TorchCodec now supports fast clip samplers in the `torchcodec.samplers` namespace. We support random and regular sampling for both index-based and time-based strategies. Read more about samplers in our [sampling example](https://pytorch.org/torchcodec/0.1/generated_examples/sampling.html#sphx-glr-generated-examples-sampling-py)!
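As a rough illustration of index-based random sampling, the sketch below wraps `clips_at_random_indices` from `torchcodec.samplers`. The wrapper name `sample_training_clips` and its default values are hypothetical, and the keyword arguments reflect our reading of the 0.1 sampler documentation, so check the sampling example linked above before relying on them.

```python
def sample_training_clips(decoder, num_clips=4, num_frames_per_clip=8):
    """Return randomly placed clips using index-based sampling.

    `decoder` is expected to be a torchcodec.decoders.VideoDecoder.
    This wrapper and its defaults are illustrative, not part of TorchCodec.
    """
    # Imported lazily so this module still loads where TorchCodec is absent.
    from torchcodec.samplers import clips_at_random_indices

    return clips_at_random_indices(
        decoder,
        num_clips=num_clips,
        num_frames_per_clip=num_frames_per_clip,
        num_indices_between_frames=2,  # take every other frame within a clip
    )
```

With a real decoder, for instance `VideoDecoder("video.mp4")` (a placeholder path), the call would be `clips = sample_training_clips(decoder)`.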
Improvements to `VideoDecoder`
Note: `SimpleVideoDecoder` became `VideoDecoder`! See below for other changes.
- The `VideoDecoder` class now exposes the following parameters to provide users with more control:
  - `num_ffmpeg_threads`
  - `stream_index`
- Two new methods were added: `decoder.get_frames_at(indices=[3, 1, 10])` and `decoder.get_frames_played_at(seconds=[10.5, 0.3])`. When decoding multiple frames, calling these methods is much faster than calling `get_frame_at()` or `get_frame_played_at()` repeatedly.
Read more on the [docs](https://pytorch.org/torchcodec/0.1/generated/torchcodec.decoders.VideoDecoder.html#torchcodec.decoders.VideoDecoder).
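To make the difference between the per-frame and batch calls concrete, here is a minimal sketch. The helper names (`decode_one_by_one`, `decode_batched`, `decode_batched_by_time`) are hypothetical; only the decoder method names and the `indices` / `seconds` keywords come from the notes above.

```python
def decode_one_by_one(decoder, indices):
    """Slower pattern: one decoder call per requested frame."""
    return [decoder.get_frame_at(i) for i in indices]


def decode_batched(decoder, indices):
    """Preferred pattern: a single call for all requested frame indices."""
    return decoder.get_frames_at(indices=list(indices))


def decode_batched_by_time(decoder, seconds):
    """Time-based equivalent: a single call for all requested timestamps."""
    return decoder.get_frames_played_at(seconds=list(seconds))
```

Each helper takes an already constructed `VideoDecoder`, so the same code works whether the decoder was created on CPU or with `device="cuda"`.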
Speed improvements
Various performance improvements were made, including:
- The decoder now automatically switches between the lower-level `swscale` and `filtergraph` libraries. These libraries are mainly used to convert YUV colors to RGB, and `swscale` is usually faster. TorchCodec relies on one or the other when appropriate.
- We now avoid extra copies of the output frame tensor in batch-decoding APIs.
You can find detailed benchmark results on [the repo](https://github.com/pytorch/torchcodec/tree/main?tab=readme-ov-file#benchmark-results).
macOS support
TorchCodec now supports macOS! Just run `pip install torchcodec` and follow our normal [installation instructions](https://github.com/pytorch/torchcodec#installing-torchcodec).
Breaking changes
TorchCodec is still in the development stage, and some APIs may be updated in future versions without a deprecation cycle, depending on user feedback. For this release, a few important API changes were made, but we do not anticipate significant changes of this sort in future releases, and we now consider the existing APIs largely stable.
- The `SimpleVideoDecoder` class was renamed to `VideoDecoder`
- Methods of `VideoDecoder` containing the term "displayed" now use the term "played" instead. For example, `get_frame_displayed_at()` is now `get_frame_played_at()`. This is to accommodate future audio support.
- The `get_frames_at()` and `get_frames_displayed_at()` methods have been renamed to `get_frames_in_range()` and `get_frames_played_in_range()`. The method names `get_frames_at()` and `get_frames_played_at()` still exist, but they do something else (see the new features section).
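For codebases written against an earlier version, the renames above can be applied mechanically. The following is a minimal sketch, not an official migration tool: the `RENAMES` table only covers the names listed in these notes, and `migrate_source` is a hypothetical helper. All names are replaced in a single pass, and the helper must only be run on code that predates 0.1, since it would also rewrite a correct use of the new `get_frames_at()`.

```python
import re

# Old name -> new name, taken from the breaking changes listed above.
RENAMES = {
    "SimpleVideoDecoder": "VideoDecoder",
    "get_frame_displayed_at": "get_frame_played_at",
    "get_frames_at": "get_frames_in_range",
    "get_frames_displayed_at": "get_frames_played_in_range",
}

# Match whole identifiers only, so longer names sharing a prefix are untouched.
_PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, RENAMES)) + r")\b")


def migrate_source(code: str) -> str:
    """Rewrite pre-0.1 TorchCodec identifiers to their 0.1 names in one pass."""
    return _PATTERN.sub(lambda match: RENAMES[match.group(1)], code)
```

For example, `migrate_source("dec = SimpleVideoDecoder(path)")` yields a string that uses `VideoDecoder` instead.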
Bug fixes
- Time-based decoding APIs were returning the wrong frame when the timestamp corresponded to the second-to-last frame. See https://github.com/pytorch/torchcodec/pull/287 for more details.