@@ -34,10 +36,59 @@ and follow the [instructions here](https://github.com/ros-misc-utilities/.github
34
36
35
37
Make sure to source your workspace's ``install/setup.bash`` afterwards.
36
38
37
-
## ROS Parameters
39
+
## API overview
38
40
39
-
This package does not expose ROS parameters. It is the upper layer's responsibility to e.g. manage the mapping between encoder and decoder, i.e. to tell the decoder class which libav decoder should be used for the decoding, or to set the encoding parameters.
41
+
### Preliminaries
42
+
When using libav it is important to understand the difference between the encoder and the codec.
43
+
The *codec* is the standardized format in which images are encoded, for instance ``h264`` or ``hevc``.
44
+
The *encoder* is a libav software module that can encode images for a given codec. For instance ``libx264``, ``libx264rgb``,
45
+
``h264_nvenc``, and ``h264_vaapi`` are all *encoders* that encode for the *codec* h264.
46
+
Some of the encoders are hardware accelerated, some can handle more image formats than others, but in the end they all encode video for a specific codec.
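The encoder/codec distinction can be pictured as a many-to-one lookup. The following is a minimal sketch that uses only the four encoder names mentioned above; the function names ``codecForEncoder`` and ``encodersForCodec`` are hypothetical and are not part of this package's API.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Illustrative table only: the four encoder names are taken from the text.
// All function names here are hypothetical helpers for this example.
inline const std::map<std::string, std::string> & encoderToCodec()
{
  static const std::map<std::string, std::string> table = {
    {"libx264", "h264"},
    {"libx264rgb", "h264"},
    {"h264_nvenc", "h264"},
    {"h264_vaapi", "h264"},
  };
  return table;
}

// Returns the codec for an encoder name, or an empty string if unknown.
inline std::string codecForEncoder(const std::string & encoder)
{
  const auto it = encoderToCodec().find(encoder);
  return it == encoderToCodec().end() ? std::string() : it->second;
}

// Returns all encoders in the table that produce the given codec,
// in the (alphabetical) iteration order of std::map.
inline std::vector<std::string> encodersForCodec(const std::string & codec)
{
  std::vector<std::string> result;
  for (const auto & kv : encoderToCodec()) {
    if (kv.second == codec) {
      result.push_back(kv.first);
    }
  }
  return result;
}
```

Several encoders map to the same codec, so a decoder only needs to know the codec, not which encoder produced the packets.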
40
47
48
+
For the many AV options available for various libav encoders, and for ``qmax``, ``bit_rate`` and similar settings please refer to the ffmpeg documentation.
49
+
50
+
### The inner workings
51
+

52
+
53
+
The diagram above shows the stations that a ROS Image message passes as it traverses the encoder and decoder.
54
+
1) The ROS image (with [ROS sensor\_msgs/Image encoding](https://github.com/ros2/common_interfaces/blob/a2ef867438e6d4eed074d6e3668ae45187e7de86/sensor_msgs/include/sensor_msgs/image_encodings.hpp)) is first converted with the ROS [cv\_bridge](https://github.com/ros-perception/vision_opencv) into the ``cv_bridge_target_format``.
55
+
This conversion is necessary because some ROS encodings (like Bayer images) are not supported by libswscale.
56
+
The ``cv_bridge_target_format`` can be set via ``setCVBridgeTargetFormat(const std::string & fmt)``.
57
+
If this format is not set explicitly the image will be converted to the default format of ``bgr8``.
58
+
This may not be what you want for e.g. ``mono8`` (gray) or Bayer images.
59
+
Ideally the ``cv_bridge_target_format`` can be directly used by the libav encoder so the next step becomes a no-op.
60
+
But at the very least ``cv_bridge_target_format`` must be an acceptable libswscale input format (with the exception of
61
+
special hacks for encoding single-channel images, see below).
62
+
2) The image is then converted to ``av_source_pixel_format`` using libswscale.
63
+
The ``av_source_pixel_format`` can be set with ``setAVSourcePixelFormat()``, defaulting to something that is acceptable to the libav encoder.
64
+
You can use ffmpeg (``ffmpeg -h encoder=libx264 | grep formats``) to list all formats that an encoder supports.
65
+
Note that the ffmpeg/libav format string notation is different from the ROS encoding strings, and the ``av_source_pixel_format`` is specified using the libav convention, whereas the ``cv_bridge_target_format`` uses ROS convention!
66
+
(If you choose to bypass the cv\_bridge conversion from step 1 by feeding the images to the encoder directly via the ``encodeImage(const cv::Mat & img ...)`` method, you must still set the ``cv_bridge_target_format`` such that the encoder knows what format the ``img`` argument has.)
67
+
When aiming for lossless compression, beware of any ``av_source_pixel_format`` that reduces the color resolution, such as ``yuv420p``, ``nv12`` etc.
68
+
For Bayer images, use the special hack for single-channel images.
69
+
70
+
3) The libav encoder encodes the packet with its supported codec, e.g. ``libx264`` will produce ``h264`` packets.
71
+
The ``encoding`` field of the FFMPEGPacket message documents all image format conversions and the codec, in reverse order, separated by semicolons.
72
+
This way the decoder can attempt to reproduce the original ``ros_encoding``.
73
+
74
+
4) The libav decoder decodes the packet into the original ``av_source_pixel_format``.
75
+
76
+
5) Finally the image is converted to ``output_message_format`` using libswscale.
77
+
This format can be set (in ROS encoding syntax!) with ``setOutputMessageEncoding()``.
78
+
The format must be supported by both ROS and libswscale (except when using the special hack for single-channel images).
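To illustrate how a receiver might read the ``encoding`` field described in step 3, here is a minimal sketch. It assumes that the field holds exactly four semicolon-separated entries in the order codec, ``av_source_pixel_format``, ``cv_bridge_target_format``, original ROS encoding; that layout is an interpretation of "in reverse order" and should be checked against the actual messages. ``EncodingChain``, ``splitEncoding`` and ``parseEncodingChain`` are hypothetical names, not part of this package.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Splits a semicolon-separated string into its components.
inline std::vector<std::string> splitEncoding(const std::string & encoding)
{
  std::vector<std::string> parts;
  std::stringstream ss(encoding);
  std::string item;
  while (std::getline(ss, item, ';')) {
    parts.push_back(item);
  }
  return parts;
}

// Hypothetical container for the conversion stages described in the text.
struct EncodingChain
{
  std::string codec;
  std::string av_source_pixel_format;
  std::string cv_bridge_target_format;
  std::string ros_encoding;
  bool valid{false};
};

// ASSUMPTION: four entries, ordered
// codec;av_source_pixel_format;cv_bridge_target_format;ros_encoding.
// Anything else is reported as invalid rather than guessed at.
inline EncodingChain parseEncodingChain(const std::string & encoding)
{
  EncodingChain chain;
  const auto parts = splitEncoding(encoding);
  if (parts.size() != 4) {
    return chain;  // valid stays false
  }
  chain.codec = parts[0];
  chain.av_source_pixel_format = parts[1];
  chain.cv_bridge_target_format = parts[2];
  chain.ros_encoding = parts[3];
  chain.valid = true;
  return chain;
}
```

Under these assumptions, a (made-up) field value of ``h264;yuv420p;bgr8;bayer_rggb8`` would parse into codec ``h264`` and original ROS encoding ``bayer_rggb8``.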
79
+
80
+
Note that only very few combinations of libav encoders, ``cv_bridge_target_format`` and ``av_source_pixel_format`` have been tested. Please provide feedback if you observe crashes or find obvious bugs. PRs are always appreciated!
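Because ``cv_bridge_target_format`` and ``output_message_format`` use ROS naming while ``av_source_pixel_format`` uses libav naming, it can help to keep a small translation table at hand. The sketch below covers only a few common 8-bit formats; the right-hand names are standard ffmpeg pixel format names, but the helper ``rosToLibavPixelFormat`` is hypothetical, and this package may use its own internal mapping.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical helper: translate a ROS image encoding string into the
// matching ffmpeg/libav pixel format name. Returns an empty string for
// encodings that are not in this deliberately partial table.
inline std::string rosToLibavPixelFormat(const std::string & ros_encoding)
{
  static const std::map<std::string, std::string> table = {
    {"bgr8", "bgr24"},
    {"rgb8", "rgb24"},
    {"bgra8", "bgra"},
    {"rgba8", "rgba"},
    {"mono8", "gray"},
  };
  const auto it = table.find(ros_encoding);
  return it == table.end() ? std::string() : it->second;
}
```

For example, a ``cv_bridge_target_format`` of ``bgr8`` corresponds to the libav name ``bgr24``, which is the string to look for in the output of ``ffmpeg -h encoder=<your encoder>``.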
81
+
82
+
### The special single-channel hack
83
+
84
+
Many libav encoders do not support single-channel formats (like ``mono8`` or Bayer).
85
+
For this reason a special hack is implemented in the encoder that adds an empty (zero-value) color channel to the single-channel image.
86
+
Later, the decoder removes it again.
87
+
To use this hack, specify a ``cv_bridge_target_format`` of e.g. ``bayer_rggb8``. Without the special hack, this would trigger an error because Bayer formats are not acceptable to libswscale.
88
+
Instead, the image is converted to ``yuv420p`` or ``nv12`` by adding an empty color channel.
89
+
These formats are acceptable to most encoders.
90
+
The decoder in turn recognizes that the ``cv_bridge_target_format`` is a single-channel format, but ``yuv420p``/``nv12`` are not, and therefore drops the color channel.
91
+
This hack greatly improves the efficiency for lossless encoding of Bayer images because it avoids conversion to full RGB and back.
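The idea behind the hack can be sketched on plain byte buffers. This is not the package's implementation (which works on libav frames and has to respect line strides); it only illustrates the principle for a planar ``yuv420p`` layout, assuming even image dimensions and tightly packed rows. Both function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Packs a single-channel 8-bit image (e.g. mono8 or raw Bayer data) into a
// planar yuv420p buffer: the image becomes the Y plane, and the two
// quarter-size chroma planes are left at zero, as described in the text.
inline std::vector<std::uint8_t> packSingleChannelAsYuv420p(
  const std::vector<std::uint8_t> & mono, std::size_t width, std::size_t height)
{
  assert(width % 2 == 0 && height % 2 == 0);  // sketch handles even sizes only
  assert(mono.size() == width * height);
  const std::size_t chroma_size = (width / 2) * (height / 2);
  // Zero-initialized, so the U and V planes stay empty.
  std::vector<std::uint8_t> yuv(width * height + 2 * chroma_size, 0);
  std::copy(mono.begin(), mono.end(), yuv.begin());
  return yuv;
}

// Recovers the single-channel image by keeping the Y plane and dropping
// the chroma planes.
inline std::vector<std::uint8_t> unpackSingleChannelFromYuv420p(
  const std::vector<std::uint8_t> & yuv, std::size_t width, std::size_t height)
{
  assert(yuv.size() >= width * height);
  const auto y_end = yuv.begin() + static_cast<std::ptrdiff_t>(width * height);
  return std::vector<std::uint8_t>(yuv.begin(), y_end);
}
```

The packed buffer is only 1.5 times the size of the input, and no demosaicing or color conversion touches the pixel values, which is why the round trip can stay lossless when the encoder itself is configured losslessly.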
41
92
42
93
## API usage
43
94
@@ -53,42 +104,15 @@ Using the encoder involves the following steps:
53
104
- flushing the encoder (may result in additional callbacks with encoded packets)
54
105
- destroying the ``Encoder`` object.
55
106
56
-
The ``Encoder`` class description has a short example code snippet.
57
-
58
-
When using libav it is important to understand the difference between the encoder and the codec. The *codec* is the standardized format in which images are encoded, for instance ``h264`` or ``hevc``. The *encoder* is a software module that can encode images for a given codec. For instance ``libx264``, ``libx264rgb``, ``h264_nvenc``, and ``h264_vaapi`` are all *encoders* that encode for the *codec* h264.
59
-
Some of the encoders are hardware accelerated, some can handle more image formats than others, but in the end they all encode video for a specific codec. You set the libav encoder with the ``setEncoder()`` method.
60
-
61
-
For the many AV options available for various libav encoders, and for ``qmax``, ``bit_rate`` and similar settings please refer to the ffmpeg documentation.
62
-
63
-
The topic of "pixel formats" (also called "encodings") can be very confusing and frustrating, so here a few lines about it.
64
-
- Images in ROS arrive as messages that are encoded in one of the formats allowed by the [sensor\_msgs/Image encodings](https://github.com/ros2/common_interfaces/blob/a2ef867438e6d4eed074d6e3668ae45187e7de86/sensor_msgs/include/sensor_msgs/image_encodings.hpp). If you pass that message in via the ``encodeImage(const Image & msg)`` message, it will first will be converted to an opencv matrix using the [cv_bridge](https://github.com/ros-perception/vision_opencv).
65
-
If you don't set the ``cv_bridge_target_format`` via ``setCVBridgeTargetFormat(const std::string & fmt)``, the image will be converted to the default format of ``bgr8``. Depending on how performance sensitive you are (or if you have a ``mono8`` (gray) image!), this may not be what you want. Ideally, you should pick a ``cv_bridge_target_format`` that can be directly used by the libav decoder that you have chosen. You can use ffmpeg to list the formats that the libav encoder directly supports:
66
-
```
67
-
ffmpeg -h encoder=libx264 | grep formats
68
-
```
69
-
Note that the ffmpeg/libav format strings often diverge from the ROS encoding strings, so some guesswork and experimentation may be necessary to pick the right ``cv_bridge_target_format``.
70
-
If you choose to bypass the cv\_bridge conversion by feeding the images to the encoder directly via the ``encodeImage(const cv::Mat & img ...)`` method, you must still set the ``cv_bridge_target_format`` such that encoder knows what format the ``img`` argument has.
71
-
- Once the image is available in ``cv_bridge_target_format`` the encoder may perform another conversion to an image format that the libav encoder supports, the ``av_source_pixel_format``. Again, ``ffmpeg -h encoder=<your encoder>`` shows the supported formats. If you don't set the ``av_source_pixel_format`` with ``setAVSourcePixelFormat()``, the encoder will try to pick one that is supported by the libav encoder. That often works but may not be optimal.
72
-
- Finally, the libav encoder produces a packet in its output codec, e.g. ``h264``, ``hevc`` etc. This encoding format is provided as the ``codec`` parameter when the encoder calls back with the encoded packet. Later, the codec needs to be provided to the decoder so it can pick a suitable libav decoder.
By properly choosing ``cv_bridge_target_format`` and ``av_source_pixel_format`` two of those conversions
79
-
may become no-ops, but to what extend the cv\_bridge and libav actually recognize and optimize for this has not been looked into yet.
80
-
81
-
Note that only very few combinations of libav encoders, ``cv_bridge_target_format`` and ``av_source_pixel_format`` have been tested. Please provide feedback if you observe crashes or find obvious bugs. PRs are always appreciated!
107
+
The ``Encoder`` class API description has a short example code snippet.
82
108
83
109
### Decoder
84
110
85
111
Using the decoder involves the following steps:
86
112
- instantiating the ``Decoder`` object.
87
113
- if so desired, setting the ROS output (image encoding) format.
88
-
- initializing the decoder object. For this you need to know the encoding (codec, e.g. "h264").
89
-
During initialization you also have to present an ordered list of libav decoders that
90
-
should be used. If an empty list is provided, the decoder will attempt to find a suitable
91
-
libav decoder based on the encoding, with a preference for hardware accelerated decoding.
114
+
- initializing the decoder object. For this you need to know the encoding (codec, e.g. "h264"),
115
+
and you have to specify the libav decoder name (e.g. "h264_cuvid").
92
116
- feeding encoded packets to the decoder (and handling the callbacks
93
117
when decoded images become available)
94
118
- flushing the decoder (may result in additional callbacks with decoded images)