Add hardware acceleration to video decoding #331
Conversation
rvillalba-novetta commented on May 28, 2018:
- Add an optional dictionary parameter to input open, allowing the user to pass settings for hardware acceleration
- Add NVIDIA libraries to the FFmpeg build if available
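A hypothetical usage sketch of the proposed options-dictionary API. `av.open(..., options=...)` is real PyAV; the specific option names below mirror the ffmpeg CLI flags and are assumptions, not a documented PyAV interface:

```python
# Hypothetical hw-accel settings passed through to the demuxer/decoder.
# Option names are assumed to mirror the ffmpeg CLI, as in this PR's design.
HWACCEL_OPTIONS = {
    "hwaccel": "cuvid",     # assumed: which hardware decoder backend to use
    "hwaccel_device": "0",  # assumed: GPU index to decode on
}

def open_with_hwaccel(path, options=HWACCEL_OPTIONS):
    """Open a container with hw-accel options; requires an FFmpeg build
    compiled with the relevant hardware support."""
    import av  # deferred so this module loads even without PyAV installed
    return av.open(path, options=options)
```

Whether these options take effect depends entirely on how the underlying FFmpeg was built, which is why the PR also adds the NVIDIA libraries to the FFmpeg build when available.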
Note that this is related to #307.
I've rebased this onto the current develop. It is in the …
Any ideas on when this will be updated with the current branch and merged?
@mikeboers I'm totally confused about this PR. If the most up-to-date code is on your branch, let's close this PR and open a new one based on your branch?
(force-pushed from 437df6e to 18dc455)
I have rebased this PR on top of develop, pushed it to @rvillalba-novetta's repo and killed the …
Thanks for that @jlaine. Without putting much effort in (because as you can tell I'm not at the moment), can you identify why it is FFmpeg 4.x only?
It doesn't even compile on older FFmpegs, so I'm wondering if it's an entirely new API.
Looks like the headers go back to at least 3.3.
As far as the example is concerned, it replaces this (annoying) function: https://www.ffmpeg.org/doxygen/3.4/hw__decode_8c_source.html#l00047
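The callback linked above does one small job: scan the pixel formats the decoder offers and pick the requested hardware one, falling back to a software format otherwise. A pure-Python sketch of that selection logic (the function and format names here are illustrative, not PyAV or FFmpeg API):

```python
def pick_pixel_format(offered_formats, hw_format, sw_fallback="yuv420p"):
    """Mimic FFmpeg's get_hw_format callback from the hw_decode.c example:
    return the requested hardware pixel format if the decoder offers it,
    otherwise fall back to a plain software format."""
    for fmt in offered_formats:
        if fmt == hw_format:
            return fmt
    return sw_fallback

# A decoder may offer both a hw surface format and software formats;
# we prefer the hw one when it is available.
print(pick_pixel_format(["cuda", "nv12", "yuv420p"], "cuda"))  # cuda
print(pick_pixel_format(["nv12", "yuv420p"], "cuda"))          # yuv420p
```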
I have had great results with this patch. I additionally added …
Now that we've dropped FFmpeg < 4.0, this is more approachable as is. The tests pass. So now (at some point) we can decide if we like the API and how it is implemented.
Actually... I think I'm done with this at this point because:
Since it is unreasonable to think that anyone is decoding with PyAV just for playback, this seems pretty pointless to keep fighting. Thanks, everyone, for your time on this, but I'm getting off here. |
We will not likely continue down this path, as `man ffmpeg` makes the point that CPU decoding is about the same speed as GPU decoding, and so is only really of use for playback. I don't think PyAV's target is such high-performance playback, so we don't need to make the design concessions required for this branch. NOTE: This has not been tested to work. Two commits back is the original PR squashed into a single commit and is more likely to work, although if there is any hope of this being merged it will have to look more like this commit does. See (and further any discussion) #331 on GitHub.
Hmm, hardware acceleration in the context of PyAV for me is mainly for encoding, not decoding. For encoding there's a substantial performance gain, and you can do multiple encodes in parallel if you have the correct hardware.
Not to open a new "question-like" issue: is it expected that we can use hardware-accelerated encoders? For example, with PyAV 8.0.1 the first one triggers:
and it's used like:
I also tried … Any suggestions? There is also nothing in the documentation about this flow. Thank you!
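One way to answer the "is it expected to work" question is to probe what the local build actually knows about. A sketch, assuming PyAV's `av.codec.Codec(name, "w")` constructor (which raises for unknown codec names); whether `h264_nvenc` and friends exist depends entirely on how the underlying FFmpeg was compiled, not on PyAV itself:

```python
def first_available(candidates, is_available):
    """Pure selection helper: return the first candidate the probe accepts,
    or None if none are available."""
    for name in candidates:
        if is_available(name):
            return name
    return None

def find_hw_encoder(candidates=("h264_nvenc", "h264_videotoolbox", "h264_vaapi")):
    """Return the first hardware H.264 encoder name this PyAV/FFmpeg build
    supports, or None. The candidate names are common FFmpeg hw encoders."""
    import av  # deferred so this helper is importable without PyAV

    def probe(name):
        try:
            av.codec.Codec(name, "w")  # raises if this build lacks the codec
            return True
        except Exception:
            return False

    return first_available(candidates, probe)
```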
FWIW, I do want to receive an H.264 stream from a remote machine and render it directly on a Raspberry Pi. I may take a shot at this if PyAV sees it as out of scope.
There are other use cases where hardware decode is very useful outside of playback: decoding into hardware-accelerated memory. For my use case, I would like to decode into memory backed by the GPU to send as a tensor for machine-learning inference. PyAV is perfect as a shim over libav for finer-grained frame access and in-process control; however, without GPU-accelerated memory it's not quite as appealing. Note this is also very helpful for server-side on-demand rendering on headless GPU instances: decoding directly to a texture is quite nice.

We used C++ for this project with libav and NVDEC on AWS hardware (https://rarevolume.com/work/reuterstv/). It would have been much slower (and more expensive) without libav's NVDEC implementation. Being able to use Python + libav + NVDEC/NVENC would allow a lot of nice optimal code paths outside of rendering:
Thank you.
Hi, I was able to run some tests using VPF (NVIDIA's new Video Processing Framework) in Python and test these claims. They don't hold up. Not even close. One comment for those in the public: the shared Google Colab environment can have varying load, so performance benchmarking is difficult. For the same H.264 QuickTime .mov file, PyAV (libav, CPU) on a Google Colab machine has the following performance:
VPF on a T4 Colab GPU instance, try one:
Try 2 (less contention, perhaps?):
This is rendering back to the CPU, so it includes the GPU-to-CPU transfer. That is a non-trivial performance increase with NVDEC.
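Since the shared Colab environment makes single-run timings noisy, best-of-N wall-clock numbers are more trustworthy than one measurement. A minimal, decoder-agnostic harness (pure Python; plug in any PyAV or VPF decode loop as the workload):

```python
import time

def bench(fn, repeats=3):
    """Run fn() several times and return (best, all_runs) in wall-clock
    seconds. Best-of-N damps interference from other tenants sharing
    the machine, which matters on Colab-style instances."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return min(times), times

# Stand-in workload for illustration; replace with a real decode loop.
best, runs = bench(lambda: sum(i * i for i in range(100_000)))
print(f"best of {len(runs)} runs: {best:.4f}s")
```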
My suggestion to the VPF authors at NVIDIA is to leverage PyAV for demuxing and have them build a special NVENC/NVDEC packet decoder which can vend surfaces (GPU memory) or frames (CPU-backed memory which has been read back). This would mean hwaccel is ignored, so other existing backends don't get support. I do feel, however, that it's a murky fix and not really in line with having libav fully ported to Python.

Given the above perf delta, which is real and demonstrable (look at recent updates from Adobe switching to the NVIDIA encoder and decoder, as well as Apple using hardware acceleration in Video Toolbox via the T2 and AMD chipsets), the gains are tremendous for pro workflows. I highly suggest this is reconsidered. Can I help somehow?
Any news regarding the hardware acceleration? I'm currently using VPF to do this, but I'm having issues with RTSP transport of IP cameras. Could someone outline what a solution with VPF and PyAV in tandem would look like? Essentially I want PyAV to handle transport- and container-related stuff and then let VPF handle the H.264 decoding. Is this a valid use case for PyAV?
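A sketch of the split described above: PyAV handles the transport and container (its `container.demux()` and `bytes(packet)` are real API), and the raw packet bytes are handed to an external decoder. The `VpfDecoder` name in the usage comment is a placeholder, not VPF's actual API, and the Annex-B caveat is a general H.264 concern, not something PyAV solves for you:

```python
def demux_h264(url):
    """Use PyAV only for transport/container handling (e.g. RTSP) and
    yield raw H.264 packet bytes for an external decoder such as VPF.
    Caveat: packets from MP4/MOV are typically AVCC-framed; a hardware
    decoder may require conversion to an Annex-B bitstream first."""
    import av  # deferred so this module imports without PyAV installed
    with av.open(url) as container:
        stream = container.streams.video[0]
        for packet in container.demux(stream):
            if packet.dts is not None:  # skip the trailing flush packet
                yield bytes(packet)

# Hypothetical hand-off; VpfDecoder is a placeholder name:
# for data in demux_h264("rtsp://camera/stream"):
#     surface = VpfDecoder.decode(data)
```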
The VPF authors have found a way to combine these and are waiting for bitstream support in PyAV.
I think one of the main reasons to use hardware acceleration is to offload the processing to the GPU and keep the CPU free; it's not only about processing time. CPU workload dropped from 30+% to less than 2% after enabling hardware acceleration for decoding 1080p@30. I did that by using C++ to call the FFmpeg libraries, and I would say it is rather simple after all, as long as you know the concept behind it and which functions to call. Though I am using the precompiled FFmpeg libraries (libavcodec, libavformat, etc.) as it is tedious to compile every dependency for each different hardware target. I am more than happy to help if Mike is up for it. I could set up a remote workstation for you to test things out if that is your concern.