A minimal version of JPEG decoding on GPUs was implemented in #3792. Here's a list of potential future improvements:
- Support for A100 devices
- Support for batch decoding (I didn't see any speed improvement in my experiments in #2786 (comment), "[WIP] nvJPEG support", but perhaps I missed something)
- Use a finer-grained API for the decoding phases, and potentially change the decoding backend depending on the image size, taking inspiration from https://github.com/NVIDIA/CUDALibrarySamples/tree/master/nvJPEG/nvJPEG-Decoder-MultipleInstances
- As per #3792 (comment), "Support for decoding jpegs on GPU with nvjpeg", we could:
  - Avoid creating tensor views and use some pointer arithmetic
  - Investigate whether the layout (CHW vs HWC) has an impact on performance
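
To make the last two items more concrete, here is a minimal sketch of the pointer arithmetic involved. It is not the torchvision implementation. It only assumes a contiguous `uint8` output buffer and shows the offset and pitch each output plane would have in either layout. The helper names `plane_layout` and `pixel_index` are hypothetical. The mapping to nvJPEG is also an assumption about how this could be wired up: a planar output such as `NVJPEG_OUTPUT_RGB` would take one pointer and pitch per channel, while an interleaved output such as `NVJPEG_OUTPUT_RGBI` would take a single pointer whose pitch covers all channels.

```python
def plane_layout(channels, height, width, layout="CHW"):
    """Return (byte_offset, pitch) pairs for a contiguous uint8 image buffer.

    Hypothetical helper: each pair describes one output plane that could be
    passed to a decoder as base_pointer + byte_offset, with the given row pitch.
    """
    if layout == "CHW":
        # Planar: one plane per channel, each plane is height * width bytes.
        plane_size = height * width
        return [(c * plane_size, width) for c in range(channels)]
    if layout == "HWC":
        # Interleaved: a single plane whose rows hold width * channels bytes.
        return [(0, width * channels)]
    raise ValueError(f"unknown layout: {layout!r}")


def pixel_index(layout, c, y, x, channels, height, width):
    """Flat byte index of channel c at row y, column x in the given layout."""
    if layout == "CHW":
        return c * height * width + y * width + x
    if layout == "HWC":
        return (y * width + x) * channels + c
    raise ValueError(f"unknown layout: {layout!r}")


if __name__ == "__main__":
    # A 3-channel, 4x5 image: three planes of 20 bytes each in CHW.
    print(plane_layout(3, 4, 5, "CHW"))
    print(plane_layout(3, 4, 5, "HWC"))
```

With offsets computed this way, the decoder's output pointers can be derived directly from the base data pointer of a single allocation, without creating one tensor view per channel. Whether CHW or HWC is faster would still need to be measured on real images.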