Commit 10921a7

decoder flush(), decoder testing, improved api docs
1 parent c9dfc79 commit 10921a7

File tree

11 files changed

+1038
-291
lines changed

CMakeLists.txt

Lines changed: 3 additions & 1 deletion
@@ -138,13 +138,15 @@ if(BUILD_TESTING)
138138
ament_pep257()
139139
ament_xmllint()
140140

141+
find_package(ffmpeg_image_transport_msgs)
141142
find_package(ament_cmake_gtest REQUIRED)
142143
ament_add_gtest(${PROJECT_NAME}_encoder_test test/encoder_test.cpp
143144
WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/test)
144145
target_include_directories(${PROJECT_NAME}_encoder_test PUBLIC
145146
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
146147
$<INSTALL_INTERFACE:include>)
147-
target_link_libraries(${PROJECT_NAME}_encoder_test ${PROJECT_NAME})
148+
target_link_libraries(${PROJECT_NAME}_encoder_test
149+
${ffmpeg_image_transport_msgs_TARGETS} ${PROJECT_NAME})
148150
endif()
149151

150152
ament_export_include_directories(include)

README.md

Lines changed: 60 additions & 3 deletions
@@ -34,11 +34,68 @@ and follow the [instructions here](https://github.com/ros-misc-utilities/.github
3434

3535
Make sure to source your workspace's ``install/setup.bash`` afterwards.
3636

37-
## Parameters
37+
## ROS Parameters
3838

39-
This package has no parameters. It is the upper layer's responsibility to e.g. manage the mapping between encoder and decoder, i.e. to tell the decoder class which libav decoder should be used for the decoding, or to set the encoding parameters.
39+
This package does not expose ROS parameters. It is the upper layer's responsibility to e.g. manage the mapping between encoder and decoder, i.e. to tell the decoder class which libav decoder should be used for the decoding, or to set the encoding parameters.
4040

41-
### How to use a custom version of libav (aka ffmpeg)
41+
42+
## API usage
43+
44+
### Encoder
45+
46+
Using the encoder involves the following steps:
47+
- instantiating the ``Encoder`` object.
48+
- setting properties like the libav encoder to use, the encoding formats, and AV options.
49+
- initializing the encoder object. This requires knowledge of the image size and therefore
50+
can only be done when the first image is available. Note that many properties
51+
(encoding formats etc) must have been set *before* initializing the encoder.
52+
- feeding images to the encoder (and handling the callbacks when encoded packets become available)
53+
- flushing the encoder (may result in additional callbacks with encoded packets)
54+
- destroying the ``Encoder`` object.
55+
56+
The ``Encoder`` class description has a short example code snippet.
57+
58+
When using libav it is important to understand the difference between the encoder and the codec. The *codec* is the standardized format in which images are encoded, for instance ``h264`` or ``hevc``. The *encoder* is a software module that can encode images for a given codec. For instance ``libx264``, ``libx264rgb``, ``h264_nvenc``, and ``h264_vaapi`` are all *encoders* that encode for the *codec* h264.
59+
Some of the encoders are hardware accelerated, some can handle more image formats than others, but in the end they all encode video for a specific codec. You set the libav encoder with the ``setEncoder()`` method.
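
To make the encoder/codec distinction concrete, here is a minimal, self-contained sketch. The helper ``codecForEncoder()`` is hypothetical and not part of this package; the table only contains the four encoder names mentioned above.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical helper (not part of ffmpeg_encoder_decoder): returns the codec
// that a given libav encoder produces. The table is illustrative and only
// lists the encoders named in the surrounding text.
inline std::optional<std::string> codecForEncoder(const std::string & encoder)
{
  static const std::unordered_map<std::string, std::string> table = {
    {"libx264", "h264"},
    {"libx264rgb", "h264"},
    {"h264_nvenc", "h264"},
    {"h264_vaapi", "h264"},
  };
  const auto it = table.find(encoder);
  if (it == table.end()) {
    return std::nullopt;  // unknown encoder: let the caller decide what to do
  }
  return it->second;
}
```

Several different encoders map to the same codec, which is why the decoder side only needs to know the codec and not which encoder produced the packet.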
60+
61+
For the many AV options available for various libav encoders, and for ``qmax``, ``bit_rate`` and similar settings please refer to the ffmpeg documentation.
62+
63+
The topic of "pixel formats" (also called "encodings") can be very confusing and frustrating, so here are a few lines about it.
64+
- Images in ROS arrive as messages that are encoded in one of the formats allowed by the [sensor\_msgs/Image encodings](https://github.com/ros2/common_interfaces/blob/a2ef867438e6d4eed074d6e3668ae45187e7de86/sensor_msgs/include/sensor_msgs/image_encodings.hpp). If you pass that message in via the ``encodeImage(const Image & msg)`` method, it will first be converted to an OpenCV matrix using [cv_bridge](https://github.com/ros-perception/vision_opencv).
65+
If you don't set the ``cv_bridge_target_format`` via ``setCVBridgeTargetFormat(const std::string & fmt)``, the image will be converted to the default format of ``bgr8``. Depending on how performance sensitive you are (or if you have a ``mono8`` (gray) image!), this may not be what you want. Ideally, you should pick a ``cv_bridge_target_format`` that can be directly used by the libav encoder that you have chosen. You can use ffmpeg to list the formats that the libav encoder directly supports:
66+
```
67+
ffmpeg -h encoder=libx264 | grep formats
68+
```
69+
Note that the ffmpeg/libav format strings often diverge from the ROS encoding strings, so some guesswork and experimentation may be necessary to pick the right ``cv_bridge_target_format``.
70+
If you choose to bypass the cv\_bridge conversion by feeding the images to the encoder directly via the ``encodeImage(const cv::Mat & img ...)`` method, you must still set the ``cv_bridge_target_format`` so that the encoder knows what format the ``img`` argument has.
71+
- Once the image is available in ``cv_bridge_target_format`` the encoder may perform another conversion to an image format that the libav encoder supports, the ``av_source_pixel_format``. Again, ``ffmpeg -h encoder=<your encoder>`` shows the supported formats. If you don't set the ``av_source_pixel_format`` with ``setAVSourcePixelFormat()``, the encoder will try to pick one that is supported by the libav encoder. That often works but may not be optimal.
72+
- Finally, the libav encoder produces a packet in its output codec, e.g. ``h264``, ``hevc`` etc. This encoding format is provided as the ``codec`` parameter when the encoder calls back with the encoded packet. Later, the codec needs to be provided to the decoder so it can pick a suitable libav decoder.
73+
74+
To summarize, the conversion goes as follows:
75+
```
76+
<ros_message> -> cv_bridge_target_format -> av_source_pixel_format --(libav encoder)--> codec
77+
```
78+
By properly choosing ``cv_bridge_target_format`` and ``av_source_pixel_format`` two of those conversions
79+
may become no-ops, but to what extent cv\_bridge and libav actually recognize and optimize for this has not been looked into yet.
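
The conversion chain can be reasoned about with a small, self-contained sketch. Everything here is an assumption made for illustration: ``rosToAvPixelFormat()`` and ``planConversions()`` are hypothetical helpers, not functions of this package, and the table lists only three ROS encodings together with the libav pixel format names they correspond to.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper: translates a ROS image encoding string into the name of
// the matching libav pixel format. Returns an empty string if the encoding is
// not in this (deliberately tiny) table.
inline std::string rosToAvPixelFormat(const std::string & ros_encoding)
{
  static const std::unordered_map<std::string, std::string> table = {
    {"bgr8", "bgr24"},
    {"rgb8", "rgb24"},
    {"mono8", "gray"},
  };
  const auto it = table.find(ros_encoding);
  return it == table.end() ? std::string() : it->second;
}

struct ConversionPlan
{
  bool cv_bridge_is_noop;      // message encoding already equals the target format
  bool av_conversion_is_noop;  // target format already equals av_source_pixel_format
};

// Hypothetical helper: reports which of the two conversions could be skipped.
inline ConversionPlan planConversions(
  const std::string & msg_encoding, const std::string & cv_bridge_target_format,
  const std::string & av_source_pixel_format)
{
  ConversionPlan plan;
  plan.cv_bridge_is_noop = (msg_encoding == cv_bridge_target_format);
  const std::string av_name = rosToAvPixelFormat(cv_bridge_target_format);
  plan.av_conversion_is_noop = !av_name.empty() && (av_name == av_source_pixel_format);
  return plan;
}
```

For example, a ``bgr8`` message with ``cv_bridge_target_format`` set to ``bgr8`` and ``av_source_pixel_format`` set to ``bgr24`` would need neither conversion in this model. Whether the chosen libav encoder accepts ``bgr24`` still has to be checked with ``ffmpeg -h encoder=<your encoder>``.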
80+
81+
Note that only very few combinations of libav encoders, ``cv_bridge_target_format`` and ``av_source_pixel_format`` have been tested. Please provide feedback if you observe crashes or find obvious bugs. PRs are always appreciated!
82+
83+
### Decoder
84+
85+
Using the decoder involves the following steps:
86+
- instantiating the ``Decoder`` object.
87+
- if so desired, setting the ROS output (image encoding) format.
88+
- initializing the decoder object. For this you need to know the encoding (codec, e.g. "h264").
89+
During initialization you also have to present an ordered list of libav decoders that
90+
should be used. If an empty list is provided, the decoder will attempt to find a suitable
91+
libav decoder based on the encoding, with a preference for hardware accelerated decoding.
92+
- feeding encoded packets to the decoder (and handling the callbacks
93+
when decoded images become available)
94+
- flushing the decoder (may result in additional callbacks with decoded images)
95+
- destroying the ``Decoder`` object.
96+
The ``Decoder`` class description has a short example code snippet.
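
The decoder selection described above (a user-supplied list is tried in order, an empty list falls back to a search that prefers hardware decoders) can be sketched as follows. ``orderDecoderCandidates()`` is a hypothetical helper that is not part of the ``Decoder`` API, and the actual selection logic inside the class may differ. The hardware and software lists are assumed to come from a call such as ``Decoder::findDecoders(codec, &hw_decoders, &sw_decoders)``.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the Decoder API): builds the ordered list of
// libav decoder names to try. A non-empty user list is used as given; otherwise
// hardware decoders are placed before software decoders.
inline std::vector<std::string> orderDecoderCandidates(
  const std::vector<std::string> & user_decoders,
  const std::vector<std::string> & hw_decoders,
  const std::vector<std::string> & sw_decoders)
{
  if (!user_decoders.empty()) {
    return user_decoders;  // respect the caller's explicit ordering
  }
  std::vector<std::string> candidates(hw_decoders);
  candidates.insert(candidates.end(), sw_decoders.begin(), sw_decoders.end());
  return candidates;
}
```

With hardware decoders ``{"hevc_cuvid"}`` and software decoders ``{"hevc"}``, an empty user list yields ``hevc_cuvid`` first and ``hevc`` second in this sketch.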
97+
98+
## How to use a custom version of libav (aka ffmpeg)
4299

43100
Compile *and install* ffmpeg. Let's say the install directory is
44101
``/home/foo/ffmpeg/build``, then for it to be found while building,

include/ffmpeg_encoder_decoder/decoder.hpp

Lines changed: 155 additions & 54 deletions
@@ -16,6 +16,7 @@
1616
#ifndef FFMPEG_ENCODER_DECODER__DECODER_HPP_
1717
#define FFMPEG_ENCODER_DECODER__DECODER_HPP_
1818

19+
#include <ffmpeg_encoder_decoder/pts_map.hpp>
1920
#include <ffmpeg_encoder_decoder/tdiff.hpp>
2021
#include <ffmpeg_encoder_decoder/types.hpp>
2122
#include <functional>
@@ -36,114 +37,214 @@ extern "C" {
3637

3738
namespace ffmpeg_encoder_decoder
3839
{
40+
/**
41+
* \brief Decodes ffmpeg encoded messages via libav (ffmpeg).
42+
*
43+
* The Decoder class facilitates decoding of messages that have been encoded with the
44+
* Encoder class by leveraging libav, the collection of libraries used by ffmpeg.
45+
* Sample code:
46+
```
47+
void imageCallback(const sensor_msgs::msg::Image::ConstSharedPtr & img, bool isKeyFrame)
48+
{
49+
// process decoded image here...
50+
}
51+
52+
ffmpeg_encoder_decoder::Decoder decoder;
53+
ffmpeg_image_transport_msgs::msg::FFMPEGPacket msg;
54+
msg.header.frame_id = "frame_id";
55+
msg.width = 640;
56+
msg.height = 480;
57+
msg.encoding = "hevc";
58+
msg.data.resize(10000, 0); // Obviously this is not a valid packet!!!
59+
60+
if (!decoder.isInitialized()) {
61+
decoder.initialize(msg.encoding, imageCallback, "hevc_cuvid");
62+
}
63+
64+
for (int64_t i = 0; i < 10; i++) {
65+
msg.header.stamp = rclcpp::Time(i, RCL_SYSTEM_TIME);
66+
if (!decoder.decodePacket(
67+
msg.encoding, &msg.data[0], msg.data.size(), msg.pts, msg.header.frame_id,
68+
msg.header.stamp)) {
69+
throw(std::runtime_error("error decoding packet!"));
70+
}
71+
}
72+
decoder.flush();
73+
```
74+
*/
3975
class Decoder
4076
{
4177
public:
78+
/**
79+
* \brief callback function signature
80+
* \param img pointer to decoded image
81+
* \param isKeyFrame true if the decoded image is a keyframe
82+
*/
4283
using Callback = std::function<void(const ImageConstPtr & img, bool isKeyFrame)>;
43-
using PTSMap = std::unordered_map<int64_t, rclcpp::Time>;
4484

85+
/**
86+
* \brief Constructor.
87+
*/
4588
Decoder();
89+
90+
/**
91+
* \brief Destructor. Calls reset();
92+
*/
4693
~Decoder();
94+
4795
/**
4896
* Test if decoder is initialized.
49-
* @return true if the decoder is initialized.
97+
* \return true if the decoder is initialized.
5098
*/
5199
bool isInitialized() const { return (codecContext_ != NULL); }
100+
52101
/**
53-
* Initialize decoder, with multiple decoders to pick from.
54-
* Will pick hardware accelerated decoders first if available.
55-
* If decoders.empty() a default decoder will be chosen (if available).
56-
* @param encoding the encoding from the first packet. Can never change!
57-
* @param callback the function to call when frame has been decoded
58-
* @param decoder the decoder to use. If empty string,
59-
* the decoder will try to find a suitable one based on the encoding
60-
* @return true if successful
61-
*/
62-
63-
bool initialize(const std::string & encoding, Callback callback, const std::string & decoder);
64-
/**
65-
* Initialize decoder with multiple decoders to pick from.
66-
* Will pick hardware accelerated decoders first if available.
67-
* If decoders.empty() a default decoder will be chosen (if available).
68-
* @param encoding the encoding from the first packet. Can never change!
69-
* @param callback the function to call when frame has been decoded
70-
* @param decoders the set of decoders to try sequentially. If empty()
71-
* the decoder will try to find a suitable one based on the encoding
72-
* @return true if successful
73-
*/
102+
* \brief Initializes the decoder for a given codec and libav decoder.
103+
*
104+
* Initializes the decoder with a single, named libav decoder.
105+
* If the name of the libav decoder string is empty, a suitable libav decoder
106+
* will be picked, or the initialization will fail if none is available.
107+
* \param codec the codec (encoding) from the first packet. Can never change!
108+
* \param callback the function to call when frame has been decoded.
109+
* \param decoder the name of the libav decoder to use. If empty string,
110+
* the decoder will try to find a suitable one based on the encoding.
111+
* \return true if initialized successfully.
112+
*/
113+
bool initialize(const std::string & codec, Callback callback, const std::string & decoder);
114+
115+
/**
116+
* \brief Initializes the decoder, trying different libav decoders in order.
117+
*
118+
* Initialize decoder with multiple libav decoders to pick from.
119+
* If decoders.empty() a default decoder will be chosen (if available).
120+
* \param codec the codec (encoding) from the first packet. Can never change!
121+
* \param callback the function to call when frame has been decoded.
122+
* \param decoders names of the libav decoders to try sequentially. If empty()
123+
* the decoder will try to find a suitable one based on the codec.
124+
* \return true if successful
125+
*/
74126
bool initialize(
75-
const std::string & encoding, Callback callback, const std::vector<std::string> & decoders);
127+
const std::string & codec, Callback callback, const std::vector<std::string> & decoders);
128+
76129
/**
77-
* Clears all decoder state but not timers, loggers, and other settings.
130+
* \brief Sets the ROS output message encoding format.
131+
*
132+
* Sets the ROS output message encoding format. Must be compatible with one of the
133+
* libav encoding formats, or else an exception will be thrown. If not set,
134+
* the output encoding will default to bgr8.
135+
* \param output_encoding defaults to bgr8
136+
* \throw std::runtime_error() if no matching libav pixel format could be found
137+
*/
138+
void setOutputMessageEncoding(const std::string & output_encoding);
139+
140+
/**
141+
* \brief Clears all decoder state except for timers, loggers, and other settings.
78142
*/
79143
void reset();
144+
80145
/**
146+
* \brief Decodes encoded packet.
147+
*
81148
* Decodes packet. Decoder must have been initialized beforehand. Calling this
82149
* function may result in callback with decoded frame.
83-
* @param encoding the name of the encoding (typically from msg encoding)
84-
* @param data pointer to packet data
85-
* @param size size of packet data
86-
* @param pts presentation time stamp of data packet
87-
* @param frame_id ros frame id (from message header)
88-
* @param stamp ros message header time stamp
89-
*/
150+
* \param encoding the name of the encoding/codec (typically from msg encoding)
151+
* \param data pointer to packet data
152+
* \param size size of packet data
153+
* \param pts presentation time stamp of data packet
154+
* \param frame_id ros frame id (from message header)
155+
* \param stamp ros message header time stamp
156+
* \return true if decoding was successful
157+
*/
90158
bool decodePacket(
91159
const std::string & encoding, const uint8_t * data, size_t size, uint64_t pts,
92160
const std::string & frame_id, const rclcpp::Time & stamp);
161+
93162
/**
94-
* Override default logger
95-
* @param logger the logger to override the default with
96-
*/
97-
void setLogger(rclcpp::Logger logger) { logger_ = logger; }
163+
* \brief Flush decoder.
164+
*
165+
* This method can only be called once at the end of the decoding stream.
166+
* It will force any buffered packets to be delivered as frames. No further
167+
* packets can be fed to the decoder after calling flush().
168+
* \return true if flush was successful (libav returns EOF)
169+
*/
170+
bool flush();
171+
98172
/**
99-
* deprecated, don't use!
173+
* \brief Overrides the default ("Decoder") logger.
174+
* \param logger the logger to override the default ("Decoder") with.
100175
*/
101-
static const std::unordered_map<std::string, std::string> & getDefaultEncoderToDecoderMap();
176+
void setLogger(rclcpp::Logger logger) { logger_ = logger; }
177+
102178
/**
103-
* Finds the name of hardware and software decoders that match a
104-
* certain encoding (or encoder)
179+
* \brief Finds all hardware and software decoders for a given codec.
180+
*
181+
* Finds the name of all hardware and software decoders that match
182+
* a certain codec (or encoder).
183+
* \param codec name of the codec, e.g. h264, hevc, etc.
184+
* \param hw_decoders non-null pointer for returning list of hardware decoders
185+
* \param sw_decoders non-null pointer for returning list of software decoders
105186
*/
106187
static void findDecoders(
107-
const std::string & encoding, std::vector<std::string> * hw_decoders,
188+
const std::string & codec, std::vector<std::string> * hw_decoders,
108189
std::vector<std::string> * sw_decoders);
190+
109191
/**
110-
* Finds the name of all hardware and software decoders that match
111-
* a certain encoding (or encoder)
112-
*/
113-
static std::vector<std::string> findDecoders(const std::string & encoding);
192+
* \brief Finds all decoders that can decode a given codec.
193+
*
194+
* Finds the name of all hardware and software decoders (combined)
195+
* that match a certain codec (or encoder).
196+
* \param codec name of the codec, e.g. h264, hevc, etc.
197+
* \return vector with names of matching libav decoders
198+
*/
199+
static std::vector<std::string> findDecoders(const std::string & codec);
200+
114201
/**
115-
* For performance debugging
202+
* \brief Enables or disables performance measurements. Poorly tested, may be broken.
203+
* \param p set to true to enable performance debugging.
116204
*/
117205
void setMeasurePerformance(bool p) { measurePerformance_ = p; }
206+
118207
/**
119-
* For performance debugging
208+
* \brief Prints performance timers. Poorly tested, may be broken.
209+
* \param prefix for labeling the printout
120210
*/
121211
void printTimers(const std::string & prefix) const;
212+
122213
/**
123-
* For performance debugging
214+
* \brief Resets performance debugging timers. Poorly tested, may be broken.
124215
*/
125216
void resetTimers();
126217

218+
// ------------------- deprecated functions ---------------
219+
/**
220+
* \deprecated Use findDecoders(codec) instead.
221+
*/
222+
[[deprecated("use findDecoders(codec) now.")]]
223+
static const std::unordered_map<std::string, std::string> & getDefaultEncoderToDecoderMap();
224+
127225
private:
128-
rclcpp::Logger logger_;
129226
bool initSingleDecoder(const std::string & decoder);
130227
bool initDecoder(const std::vector<std::string> & decoders);
228+
int receiveFrame();
131229
// --------------- variables
230+
rclcpp::Logger logger_;
132231
Callback callback_;
133-
PTSMap ptsToStamp_; // mapping of header
134-
232+
PTSMap ptsToStamp_;
135233
// --- performance analysis
136234
bool measurePerformance_{false};
137235
TDiff tdiffTotal_;
138-
// --- libav stuff
236+
// --- libav related variables
139237
AVRational timeBase_{1, 100};
140238
std::string encoding_;
141239
AVCodecContext * codecContext_{NULL};
142-
AVFrame * decodedFrame_{NULL};
240+
AVFrame * swFrame_{NULL};
143241
AVFrame * cpuFrame_{NULL};
144-
AVFrame * colorFrame_{NULL};
242+
AVFrame * outputFrame_{NULL};
145243
SwsContext * swsContext_{NULL};
146244
enum AVPixelFormat hwPixFormat_;
245+
std::string outputMsgEncoding_;
246+
enum AVPixelFormat outputAVPixFormat_ { AV_PIX_FMT_NONE };
247+
int bitsPerPixel_; // output format bits/pixel including padding
147248
AVPacket packet_;
148249
AVBufferRef * hwDeviceContext_{NULL};
149250
};
