Commit 61c5e65

committed
improved documentation
1 parent c18a15c commit 61c5e65

4 files changed

+1471
-47
lines changed

README.md

Lines changed: 58 additions & 46 deletions
@@ -4,14 +4,16 @@ This ROS2 package supports encoding/decoding with the FFMpeg
44
library, for example encoding h264 and h265 or HEVC, using
55
Nvidia or other hardware acceleration when available.
66
This package is meant to be used by image transport plugins like
7-
the [ffmpeg image transport](https://github.com/ros-misc-utilities/ffmpeg_image_transport/).
7+
the [ffmpeg image transport](https://github.com/ros-misc-utilities/ffmpeg_image_transport/)
8+
and the [foxglove compressed video transport](https://github.com/ros-misc-utilities/foxglove_compressed_video_transport/).
89

910
## Supported systems
1011

1112
Continuous integration is tested under Ubuntu with the following ROS2 distros:
1213

1314
[![Build Status](https://build.ros2.org/buildStatus/icon?job=Hdev__ffmpeg_encoder_decoder__ubuntu_jammy_amd64&subject=Humble)](https://build.ros2.org/job/Hdev__ffmpeg_encoder_decoder__ubuntu_jammy_amd64/)
1415
[![Build Status](https://build.ros2.org/buildStatus/icon?job=Jdev__ffmpeg_encoder_decoder__ubuntu_noble_amd64&subject=Jazzy)](https://build.ros2.org/job/Jdev__ffmpeg_encoder_decoder__ubuntu_noble_amd64/)
16+
[![Build Status](https://build.ros2.org/buildStatus/icon?job=Kdev__ffmpeg_encoder_decoder__ubuntu_noble_amd64&subject=Kilted)](https://build.ros2.org/job/Kdev__ffmpeg_encoder_decoder__ubuntu_noble_amd64/)
1517
[![Build Status](https://build.ros2.org/buildStatus/icon?job=Rdev__ffmpeg_encoder_decoder__ubuntu_noble_amd64&subject=Rolling)](https://build.ros2.org/job/Rdev__ffmpeg_encoder_decoder__ubuntu_noble_amd64/)
1618

1719

@@ -34,10 +36,59 @@ and follow the [instructions here](https://github.com/ros-misc-utilities/.github
3436

3537
Make sure to source your workspace's ``install/setup.bash`` afterwards.
3638

37-
## ROS Parameters
39+
## API overview
3840

39-
This package does not expose ROS parameters. It is the upper layer's responsibility to e.g. manage the mapping between encoder and decoder, i.e. to tell the decoder class which libav decoder should be used for the decoding, or to set the encoding parameters.
41+
### Preliminaries
42+
When using libav it is important to understand the difference between the encoder and the codec.
43+
The *codec* is the standardized format in which images are encoded, for instance ``h264`` or ``hevc``.
44+
The *encoder* is a libav software module that can encode images for a given codec. For instance ``libx264``, ``libx264rgb``,
45+
``h264_nvenc``, and ``h264_vaapi`` are all *encoders* that encode for the *codec* h264.
46+
Some of the encoders are hardware accelerated, some can handle more image formats than others, but in the end they all encode video for a specific codec.
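The encoder/codec distinction can be pictured as a many-to-one mapping. The sketch below only restates the four encoder names from the text; the functions ``encoderToCodec()`` and ``encodersForCodec()`` are hypothetical helpers for illustration and are not part of this package or of libav.

```cpp
#include <map>
#include <string>
#include <vector>

// Encoder -> codec pairs taken from the text above. This table is purely
// illustrative (an assumption for this sketch), not part of the package API.
inline const std::map<std::string, std::string> & encoderToCodec()
{
  static const std::map<std::string, std::string> table = {
    {"libx264", "h264"},
    {"libx264rgb", "h264"},
    {"h264_nvenc", "h264"},
    {"h264_vaapi", "h264"},
  };
  return table;
}

// Returns all encoders in the table that produce the given codec.
inline std::vector<std::string> encodersForCodec(const std::string & codec)
{
  std::vector<std::string> result;
  for (const auto & kv : encoderToCodec()) {
    if (kv.second == codec) {
      result.push_back(kv.first);
    }
  }
  return result;
}
```

Calling ``encodersForCodec("h264")`` on this table yields all four encoders, which is the point: a decoder only needs to know the codec, not which encoder produced the packets.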
4047

48+
For the many AV options available for various libav encoders, and for ``qmax``, ``bit_rate`` and similar settings please refer to the ffmpeg documentation.
49+
50+
### The inner workings
51+
![overview_diagram](./doc/encoder_decoder.svg)
52+
53+
The diagram above shows the stations that a ROS Image message passes as it traverses the encoder and decoder.
54+
1) The ROS image (with [ROS sensor\_msgs/Image encoding](https://github.com/ros2/common_interfaces/blob/a2ef867438e6d4eed074d6e3668ae45187e7de86/sensor_msgs/include/sensor_msgs/image_encodings.hpp)) is first converted with the ROS [cv\_bridge](https://github.com/ros-perception/vision_opencv) into the ``cv_bridge_target_format``.
55+
This conversion is necessary because some ROS encodings (like bayer images) are not supported by libswscale.
56+
The ``cv_bridge_target_format`` can be set via ``setCVBridgeTargetFormat(const std::string & fmt)``.
57+
If this format is not set explicitly the image will be converted to the default format of ``bgr8``.
58+
This may not be what you want for e.g. ``mono8`` (gray) or Bayer images.
59+
Ideally the ``cv_bridge_target_format`` can be directly used by the libav encoder so the next step becomes a no-op.
60+
But at the very least ``cv_bridge_target_format`` must be an acceptable libswscale input format (with the exception of
61+
special hacks for encoding single-channel images, see below).
62+
2) The image is then converted to ``av_source_pixel_format`` using libswscale.
63+
The ``av_source_pixel_format`` can be set with ``setAVSourcePixelFormat()``, defaulting to something that is acceptable to the libav encoder.
64+
You can use ffmpeg (``ffmpeg -h encoder=libx264 | grep formats``) to list all formats that an encoder supports.
65+
Note that the ffmpeg/libav format string notation is different from the ROS encoding strings, and the ``av_source_pixel_format`` is specified using the libav convention, whereas the ``cv_bridge_target_format`` uses ROS convention!
66+
(If you choose to bypass the cv\_bridge conversion from step 1 by feeding the images to the encoder directly via the ``encodeImage(const cv::Mat & img ...)`` method, you must still set the ``cv_bridge_target_format`` such that the encoder knows what format the ``img`` argument has.)
67+
When aiming for lossless compression, beware of any ``av_source_pixel_format`` that reduces the color resolution, such as ``yuv420p``, ``nv12`` etc.
68+
For Bayer images, use the special hack for single-channel images.
69+
70+
3) The libav encoder encodes the packet with its supported codec, e.g. ``libx264`` will produce ``h264`` packets.
71+
The ``encoding`` field of the FFMPEGPacket message will document all image format conversions and the codec, in reverse order, separated by semicolons.
72+
This way the decoder can attempt to reproduce the original ``ros_encoding``.
73+
74+
4) The libav decoder decodes the packet into the original ``av_source_pixel_format``.
75+
76+
5) Finally the image is converted to ``output_message_format`` using libswscale.
77+
This format can be set (in ROS encoding syntax!) with ``setOutputMessageEncoding()``.
78+
The format must be supported by both ROS and libswscale (except when using the special hack for single-channel images).
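To make the chain of formats from the steps above concrete, here is a minimal sketch of how the four stages could be recorded in a semicolon-separated string and split again. The ``ConversionChain`` struct, both helper functions, and the exact field order are assumptions made for illustration; the actual layout of the ``encoding`` field is defined by the package source, not by this example.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper type for this sketch, not part of the package API.
struct ConversionChain
{
  std::string ros_encoding;             // ROS syntax, e.g. "rgb8"
  std::string cv_bridge_target_format;  // ROS syntax, e.g. "bgr8"
  std::string av_source_pixel_format;   // libav syntax, e.g. "yuv420p"
  std::string codec;                    // e.g. "h264"
};

// Joins the chain in reverse order (codec first), separated by semicolons.
// The field order is an assumption based on the description in the text.
std::string toEncodingString(const ConversionChain & c)
{
  return c.codec + ";" + c.av_source_pixel_format + ";" +
         c.cv_bridge_target_format + ";" + c.ros_encoding;
}

// Splits a semicolon-separated string into its parts.
std::vector<std::string> splitEncodingString(const std::string & s)
{
  std::vector<std::string> parts;
  std::stringstream ss(s);
  std::string item;
  while (std::getline(ss, item, ';')) {
    parts.push_back(item);
  }
  return parts;
}
```

With the example values from the comments, ``toEncodingString()`` returns ``h264;yuv420p;bgr8;rgb8``. The first element tells a decoder which codec to use, and the last one names the original ROS encoding it may try to reproduce.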
79+
80+
Note that only very few combinations of libav encoders, ``cv_bridge_target_format`` and ``av_source_pixel_format`` have been tested. Please provide feedback if you observe crashes or find obvious bugs. PRs are always appreciated!
81+
82+
### The special single-channel hack
83+
84+
Many libav encoders do not support single-channel formats (like mono8 or bayer).
85+
For this reason a special hack is implemented in the encoder that adds an empty (zero-value) color channel to the single-channel image.
86+
Later, the decoder removes it again.
87+
To utilize this hack, specify a ``cv_bridge_target_format`` of e.g. ``bayer_rggb8``. Without the special hack, this would trigger an error because Bayer formats are not acceptable to libswscale.
88+
Instead, the image is converted to ``yuv420p`` or ``nv12`` by adding an empty color channel.
89+
These formats are acceptable to most encoders.
90+
The decoder in turn recognizes that the ``cv_bridge_target_format`` is a single-channel format, but ``yuv420p``/``nv12`` are not, and therefore drops the color channel.
91+
This hack greatly improves the efficiency for lossless encoding of Bayer images because it avoids conversion to full RGB and back.
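The idea behind the hack can be sketched in a few lines, using ``nv12`` as the example. This is not the package's implementation: ``padToNV12()`` and ``dropChroma()`` are hypothetical names, and the sketch assumes tightly packed 8-bit rows (no stride padding) and even image dimensions. The single-channel data is copied unchanged into the luma plane, and the chroma plane is filled with zeros as described above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs a single-channel 8-bit image (e.g. mono8 or raw Bayer data) into an
// NV12-sized buffer: Y plane = original pixels, interleaved UV plane = zeros.
std::vector<uint8_t> padToNV12(
  const std::vector<uint8_t> & gray, std::size_t width, std::size_t height)
{
  assert(width % 2 == 0 && height % 2 == 0);  // 4:2:0 subsampling needs even sizes
  assert(gray.size() == width * height);
  // NV12 holds width*height luma bytes plus width*height/2 chroma bytes.
  std::vector<uint8_t> nv12(width * height * 3 / 2, 0);
  std::copy(gray.begin(), gray.end(), nv12.begin());
  return nv12;
}

// Inverse operation on the decoder side: keep the Y plane, drop the chroma.
std::vector<uint8_t> dropChroma(
  const std::vector<uint8_t> & nv12, std::size_t width, std::size_t height)
{
  assert(nv12.size() == width * height * 3 / 2);
  const auto luma_size = static_cast<std::ptrdiff_t>(width * height);
  return std::vector<uint8_t>(nv12.begin(), nv12.begin() + luma_size);
}
```

Because the luma plane is a byte-for-byte copy, ``dropChroma(padToNV12(img, w, h), w, h)`` returns the original image, which is why no debayering or color conversion is needed on either side.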
4192

4293
## API usage
4394

@@ -53,42 +104,15 @@ Using the encoder involves the following steps:
53104
- flushing the encoder (may result in additional callbacks with encoded packets)
54105
- destroying the ``Encoder`` object.
55106

56-
The ``Encoder`` class description has a short example code snippet.
57-
58-
When using libav it is important to understand the difference between the encoder and the codec. The *codec* is the standardized format in which images are encoded, for instance ``h264`` or ``hevc``. The *encoder* is a software module that can encode images for a given codec. For instance ``libx264``, ``libx264rgb``, ``h264_nvenc``, and ``h264_vaapi`` are all *encoders* that encode for the *codec* h264.
59-
Some of the encoders are hardware accelerated, some can handle more image formats than others, but in the end they all encode video for a specific codec. You set the libav encoder with the ``setEncoder()`` method.
60-
61-
For the many AV options available for various libav encoders, and for ``qmax``, ``bit_rate`` and similar settings please refer to the ffmpeg documentation.
62-
63-
The topic of "pixel formats" (also called "encodings") can be very confusing and frustrating, so here a few lines about it.
64-
- Images in ROS arrive as messages that are encoded in one of the formats allowed by the [sensor\_msgs/Image encodings](https://github.com/ros2/common_interfaces/blob/a2ef867438e6d4eed074d6e3668ae45187e7de86/sensor_msgs/include/sensor_msgs/image_encodings.hpp). If you pass that message in via the ``encodeImage(const Image & msg)`` message, it will first will be converted to an opencv matrix using the [cv_bridge](https://github.com/ros-perception/vision_opencv).
65-
If you don't set the ``cv_bridge_target_format`` via ``setCVBridgeTargetFormat(const std::string & fmt)``, the image will be converted to the default format of ``bgr8``. Depending on how performance sensitive you are (or if you have a ``mono8`` (gray) image!), this may not be what you want. Ideally, you should pick a ``cv_bridge_target_format`` that can be directly used by the libav decoder that you have chosen. You can use ffmpeg to list the formats that the libav encoder directly supports:
66-
```
67-
ffmpeg -h encoder=libx264 | grep formats
68-
```
69-
Note that the ffmpeg/libav format strings often diverge from the ROS encoding strings, so some guesswork and experimentation may be necessary to pick the right ``cv_bridge_target_format``.
70-
If you choose to bypass the cv\_bridge conversion by feeding the images to the encoder directly via the ``encodeImage(const cv::Mat & img ...)`` method, you must still set the ``cv_bridge_target_format`` such that encoder knows what format the ``img`` argument has.
71-
- Once the image is available in ``cv_bridge_target_format`` the encoder may perform another conversion to an image format that the libav encoder supports, the ``av_source_pixel_format``. Again, ``ffmpeg -h encoder=<your encoder>`` shows the supported formats. If you don't set the ``av_source_pixel_format`` with ``setAVSourcePixelFormat()``, the encoder will try to pick one that is supported by the libav encoder. That often works but may not be optimal.
72-
- Finally, the libav encoder produces a packet in its output codec, e.g. ``h264``, ``hevc`` etc. This encoding format is provided as the ``codec`` parameter when the encoder calls back with the encoded packet. Later, the codec needs to be provided to the decoder so it can pick a suitable libav decoder.
73-
74-
To summarize, the conversion goes as follows:
75-
```
76-
<ros_message> -> cv_bridge_target_format -> av_source_pixel_format --(libav encoder)--> codec
77-
```
78-
By properly choosing ``cv_bridge_target_format`` and ``av_source_pixel_format`` two of those conversions
79-
may become no-ops, but to what extend the cv\_bridge and libav actually recognize and optimize for this has not been looked into yet.
80-
81-
Note that only very few combinations of libav encoders, ``cv_bridge_target_format`` and ``av_source_pixel_format`` have been tested. Please provide feedback if you observe crashes or find obvious bugs. PRs are always appreciated!
107+
The ``Encoder`` class API description has a short example code snippet.
82108

83109
### Decoder
84110

85111
Using the decoder involves the following steps:
86112
- instantiating the ``Decoder`` object.
87113
- if so desired, setting the ROS output (image encoding) format.
88-
- initializing the decoder object. For this you need to know the encoding (codec, e.g. "h264").
89-
During initialization you also have to present an ordered list of libav decoders that
90-
should be used. If an empty list is provided, the decoder will attempt to find a suitable
91-
libav decoder based on the encoding, with a preference for hardware accelerated decoding.
114+
- initializing the decoder object. For this you need to know the encoding (codec, e.g. "h264"),
115+
and you have to specify the libav decoder name (e.g. "h264_cuvid").
92116
- feeding encoded packets to the decoder (and handling the callbacks
93117
when decoded images become available)
94118
- flushing the decoder (may result in additional callbacks with decoded images)
@@ -126,20 +150,8 @@ depth=1 &&cd jetson-ffmpeg && ./ffpatch.sh ../ffmpeg && cd ../ffmpeg && ./config
126150

127151
Then follow the section above on how to
128152
actually use that custom ffmpeg library. As always first test on the
129-
CLI that the newly compiled ``ffmpeg`` command now supports
130-
``h264_nvmpi``. The transport can then be configured to use
131-
nvmpi like so:
153+
CLI that the newly compiled ``ffmpeg`` command now supports ``h264_nvmpi``.
132154

133-
```
134-
parameters=[{'ffmpeg_image_transport.encoding': 'h264_nvmpi',
135-
'ffmpeg_image_transport.profile': 'main',
136-
'ffmpeg_image_transport.preset': 'll',
137-
'ffmpeg_image_transport.gop_size': 15}]
138-
```
139-
Sometimes the ffmpeg parameters show up under different names. If the above
140-
settings don't work, try the command ``ros2 param dump <name_of_your_node>``
141-
*after* subscribing to the ffmpeg image topic with e.g. ``ros2 topic hz``.
142-
From the output you can see what the correct parameter names are.
143155
## License
144156

145157
This software is issued under the Apache License Version 2.0.
