Add parsing #2772
Conversation
Pull Request Overview
This PR adds parsing functionality to OpenVINO GenAI, implementing both base parsers for batch processing and incremental parsers for streaming scenarios. The changes introduce parser classes for structured output processing, specifically reasoning content and tool calling, with support for both C++ and Python APIs.
- Introduces new parser base classes (`ParserBase`, `IncrementalParserBase`) with concrete implementations (`DeepSeekR1ReasoningParser`, `Llama32PythonicParser`, `BaseReasoningParser`)
- Adds a `TextParserStreamer` class that extends `TextStreamer` with parsing capabilities for incremental text processing
- Integrates parsers into the generation pipeline through `GenerationConfig` and provides comprehensive test coverage
Reviewed Changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 10 comments.
File | Description
---|---
tests/python_tests/test_parsers.py | Python test cases for parser functionality
tests/cpp/parser.cpp | C++ unit tests for various parser implementations
tests/cpp/CMakeLists.txt | Build configuration update to include nlohmann_json dependency
src/python/py_streamers.cpp | Python bindings for TextParserStreamer class
src/python/py_parsers.cpp | Python bindings for parser base classes
src/python/py_openvino_genai.cpp | Integration of parser module initialization
src/python/py_generation_config.cpp | Adds parsers field to GenerationConfig Python bindings
src/python/openvino_genai/py_openvino_genai.pyi | Type definitions for parser classes and updated exports
src/python/openvino_genai/__init__.py | Python module exports for parser classes
src/cpp/src/text_streamer.cpp | Implementation of TextParserStreamer class
src/cpp/src/parsers.hpp | Internal header with parser implementation details
src/cpp/src/parsers.cpp | Core parser implementations and registration logic
src/cpp/src/llm/pipeline.cpp | Integration of parsers into LLM generation pipeline
src/cpp/src/generation_config.cpp | Adds parsers field support to GenerationConfig
src/cpp/include/openvino/genai/text_streamer.hpp | Public header for TextParserStreamer
src/cpp/include/openvino/genai/parsers.hpp | Public header defining parser interfaces
src/cpp/include/openvino/genai/llm_pipeline.hpp | Adds parsed results field to DecodedResults
src/cpp/include/openvino/genai/generation_config.hpp | Adds parsers field to GenerationConfig
samples/cpp/text_generation/parsed_output_sample.cpp | Sample demonstrating parser usage
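For orientation, a rough sketch of how these pieces could fit together in the spirit of that sample. The class and field names (`GenerationConfig::parsers`, `DecodedResults::parsed`, `Llama32PythonicToolParser`, `JsonContainer::to_json_string`) come from this PR's summary and hunks, while the exact constructor and `generate` call shapes used here are assumptions, not the sample's actual content:

```cpp
#include <iostream>
#include <memory>

#include "openvino/genai/llm_pipeline.hpp"
#include "openvino/genai/parsers.hpp"

int main(int argc, char* argv[]) {
    // Assumed invocation: parsed_output_sample <MODEL_DIR>
    ov::genai::LLMPipeline pipe(argv[1], "CPU");

    ov::genai::GenerationConfig config = pipe.get_generation_config();
    config.max_new_tokens = 256;
    // Attach a tool-call parser so the structured result lands in DecodedResults::parsed.
    config.parsers = { std::make_shared<ov::genai::Llama32PythonicToolParser>() };

    ov::genai::DecodedResults res = pipe.generate("What is the weather in Paris?", config);
    std::cout << res.texts[0] << "\n";
    // Each parsed entry is a JsonContainer; printing it as a JSON string is assumed here.
    for (const auto& msg : res.parsed)
        std::cout << msg.to_json_string() << "\n";
}
```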
```python
extended = stream_string[:]
extended.append("")

for parser in parsers:
    for (prev_subword, subword) in zip(extended, stream_string):
```
Copilot AI (Sep 25, 2025)
The zip operation creates pairs where `prev_subword` comes from `extended` (which has an extra empty string at the end) and `subword` comes from the original `stream_string`. Because both sequences are iterated from the same index, each subword is effectively paired with itself rather than with its predecessor, so the intended previous/current relationship is lost.
Suggested change:
```diff
- extended = stream_string[:]
- extended.append("")
- for parser in parsers:
-     for (prev_subword, subword) in zip(extended, stream_string):
+ # Pair each subword with its previous subword (first element has no previous)
+ prev_subwords = [""] + stream_string[:-1]
+ for parser in parsers:
+     for (prev_subword, subword) in zip(prev_subwords, stream_string):
```
Force-pushed: 450c7a8 → b15fc5f → 290c821 → e9d96fa
```cpp
class ReasoningParser : public IncrementalParserBase {
private:
    std::shared_ptr<ReasoningParserImpl> m_impl;
```
Suggested change:
```diff
- std::shared_ptr<ReasoningParserImpl> m_impl;
+ std::unique_ptr<ReasoningParserImpl> m_impl;
```
There is actually a difficulty with `unique_ptr`: it didn't compile. I suspect it requires an out-of-line destructor, but I haven't managed to get it working yet. Can we leave it as is for the moment and continue the review while I try to change it to `unique_ptr`?
Or maybe we can even leave it as a shared_ptr? The only place the pointer is created is in the constructor, so we are sure it's really unique, and we already use shared_ptr for other pimpls in our codebase.
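For reference, a minimal sketch of the usual pimpl-with-`unique_ptr` workaround, assuming `ReasoningParserImpl` is defined only in the .cpp file and `IncrementalParserBase` comes from this PR's parsers.hpp: the destructor has to be declared in the header but defined where the impl type is complete, otherwise `std::unique_ptr` cannot instantiate its deleter and the build fails.

```cpp
#include <memory>

// parsers.hpp (sketch): forward-declare the impl, declare the destructor.
class ReasoningParserImpl;

class ReasoningParser : public IncrementalParserBase {
public:
    ReasoningParser();
    ~ReasoningParser();  // declared here, defined out of line

private:
    std::unique_ptr<ReasoningParserImpl> m_impl;
};

// parsers.cpp (sketch): the impl type is complete here, so the compiler
// can generate the defaulted destructor and make_unique call.
class ReasoningParserImpl { /* ... actual parser state ... */ };

ReasoningParser::ReasoningParser() : m_impl(std::make_unique<ReasoningParserImpl>()) {}
ReasoningParser::~ReasoningParser() = default;
```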
```python
streamer = TextParserStreamer(genai_tokenizer, parsers=[DeepSeekR1ReasoningParser()])
breakpoint()
```
remove this
```cpp
namespace ov {
namespace genai {

class IncrementalParserBase {
```
No virtual destructor - better add it
Yes, I will add it.
```cpp
    static std::string name() { return "Phi4ReasoningParser"; }
};

class ParserBase {
```
Should have a virtual destructor.
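A minimal sketch of what both destructor comments ask for, with the rest of the interfaces elided:

```cpp
class IncrementalParserBase {
public:
    virtual ~IncrementalParserBase() = default;  // safe deletion through base pointers
    // ... existing virtual parse() interface ...
};

class ParserBase {
public:
    virtual ~ParserBase() = default;
    // ... existing virtual parse() interface ...
};
```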
```diff
- return m_pimpl->generate(inputs, generation_config, streamer);
+ auto res = m_pimpl->generate(inputs, generation_config, streamer);
+
+ // If streamer is of StreamerBase type, and it is TextParserStreamer, get parsed message
```
I would've extracted this logic into a separate method, something like `apply_parsers`; otherwise `generate` is too overburdened with unrelated logic. It is also too complex, with too much nesting; better to flatten it a little.
Agreed. Will do that
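A rough sketch of the kind of extraction being suggested. It mirrors the pipeline hunk quoted further down (building a `JsonContainer` with a `content` field per decoded text), but the helper name and the assumption that `ParserBase::parse` takes a `JsonContainer&` are guesses, not the PR's actual interface:

```cpp
// Hypothetical helper extracted from generate(): run the configured parsers
// over each decoded text and collect the structured results.
static void apply_parsers(ov::genai::DecodedResults& res,
                          const std::vector<std::shared_ptr<ov::genai::ParserBase>>& parsers) {
    if (parsers.empty())
        return;
    for (size_t i = 0; i < res.texts.size(); ++i) {
        ov::genai::JsonContainer msg;
        msg["content"] = res.texts[i];
        for (const auto& parser : parsers)
            parser->parse(msg);  // assumed signature
        res.parsed.push_back(msg);
    }
}
```

With something like this, `generate` would only need a single `apply_parsers(res, generation_config->parsers);` call after decoding.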
```cpp
    : m_starts_with_thinking(starts_with_thinking),
      m_keep_original_content(keep_original_content) {}

std::string parse(
```
This method is too big, consider refactoring it into smaller ones.
src/cpp/src/parsers.cpp (Outdated)
```cpp
}

// Ensure the backends are registered before main
static bool are_backends_registered = register_backends();
```
Do we really need this bool, and why is it so important to have them registered before `main`? Can't they be lazily initialized on the first request? Maybe it's better to introduce some kind of singleton, like a `ParserRegistry`.
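A minimal sketch of the lazily initialized registry this comment hints at; the factory map mirrors the `registered_incremental_parsers` entries quoted below, while the class name and method names are assumptions and the parser types come from this PR's headers:

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Hypothetical lazily-initialized registry (Meyers singleton): the factory map
// is built on first use instead of via a file-scope static before main().
class ParserRegistry {
public:
    using Factory = std::function<std::shared_ptr<IncrementalParserBase>()>;

    static ParserRegistry& instance() {
        static ParserRegistry registry;  // constructed on first call, thread-safe since C++11
        return registry;
    }

    std::shared_ptr<IncrementalParserBase> create(const std::string& name) const {
        auto it = m_factories.find(name);
        return it != m_factories.end() ? it->second() : nullptr;
    }

private:
    ParserRegistry() {
        m_factories[DeepSeekR1ReasoningParser::name()] =
            [] { return std::make_shared<DeepSeekR1ReasoningParser>(/*starts_with_thinking=*/true); };
        m_factories[Phi4ReasoningParser::name()] =
            [] { return std::make_shared<Phi4ReasoningParser>(/*starts_with_thinking=*/false); };
    }

    std::map<std::string, Factory> m_factories;
};
```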
src/cpp/src/parsers.cpp (Outdated)
```cpp
// Regex to capture the [...] part
std::smatch m;
const std::string& text = input["content"].get_string();
std::regex r(R"(\[.*?\])");
```
Suggested change:
```diff
- std::regex r(R"(\[.*?\])");
+ std::regex r(R"(^\[.*?\]$)");
```
We also need to have anchors; otherwise strings like `xx[xxxx]xxx` will pass the test but won't be proper.
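A small self-contained illustration of why the anchors matter; whether the production code uses `std::regex_search` or `std::regex_match` is not visible in this hunk, so this is only a sketch:

```cpp
#include <iostream>
#include <regex>
#include <string>

int main() {
    const std::string good = "[get_weather(city=\"Paris\")]";
    const std::string bad  = "xx[get_weather(city=\"Paris\")]xxx";

    const std::regex unanchored(R"(\[.*?\])");
    const std::regex anchored(R"(^\[.*?\]$)");

    std::smatch m;
    // The unanchored pattern finds a bracketed span in both strings.
    std::cout << std::regex_search(good, m, unanchored) << " "
              << std::regex_search(bad, m, unanchored) << "\n";  // prints: 1 1
    // The anchored pattern (or std::regex_match) accepts only the well-formed string.
    std::cout << std::regex_search(good, m, anchored) << " "
              << std::regex_search(bad, m, anchored) << "\n";    // prints: 1 0
    return 0;
}
```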
src/cpp/src/parsers.cpp (Outdated)
```cpp
}

input["tool_calls"] = JsonContainer::array();
input["tool_calls"].push_back(JsonContainer({{"name", name}, {"arguments", kv}}));
```
What if there is more than one tool call?
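A sketch of how several pythonic calls in one list (e.g. `[foo(a=1), bar(b=2)]`) could be collected; the `JsonContainer` calls mirror the hunks in this review, while the helper name and the regexes are assumptions:

```cpp
#include <regex>
#include <string>

// Hypothetical handling of several calls inside one bracketed list.
// Assumes `input` is this PR's JsonContainer whose "content" holds e.g. "[foo(a=1), bar(b=2)]".
void extract_tool_calls(JsonContainer& input) {
    const std::string text = input["content"].get_string();
    input["tool_calls"] = JsonContainer::array();

    // One match per `name(arguments)` occurrence instead of only the first one.
    static const std::regex call_re(R"((\w+)\(([^)]*)\))");
    static const std::regex kv_re(R"((\w+)\s*=\s*([^,]+))");

    for (auto it = std::sregex_iterator(text.begin(), text.end(), call_re);
         it != std::sregex_iterator(); ++it) {
        JsonContainer kv;
        const std::string args = (*it)[2];
        for (auto kv_it = std::sregex_iterator(args.begin(), args.end(), kv_re);
             kv_it != std::sregex_iterator(); ++kv_it) {
            kv[std::string((*kv_it)[1])] = std::string((*kv_it)[2]);
        }
        input["tool_calls"].push_back(
            JsonContainer({{"name", std::string((*it)[1])}, {"arguments", kv}}));
    }
}
```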
src/cpp/src/parsers.cpp (Outdated)
```cpp
registered_incremental_parsers[DeepSeekR1ReasoningParser::name()] = []() { return std::make_shared<DeepSeekR1ReasoningParser>(/*starts_with_thinking*/ true); };
registered_incremental_parsers[Phi4ReasoningParser::name()] = []() { return std::make_shared<Phi4ReasoningParser>(/*starts_with_thinking*/ false); };

registered_base_parsers[Llama32PythonicToolParser::name()] = []() { return std::make_shared<Llama32PythonicToolParser>(); };
```
Why no `Llama32JsonToolParser`?
```cpp
if (!generation_config.has_value() || (*generation_config).parsers.empty()) {
    return res;
}
```
Suggested change:
```diff
- if (!generation_config.has_value() || (*generation_config).parsers.empty()) {
-     return res;
- }
```
```cpp
    return res;
}

std::vector<std::shared_ptr<ParserBase>> parsers = (*generation_config).parsers;
```
Suggested change:
```diff
- std::vector<std::shared_ptr<ParserBase>> parsers = (*generation_config).parsers;
+ std::vector<std::shared_ptr<ParserBase>> parsers = generation_config->parsers;
```
```cpp
for (auto& parser: m_parsers) {
    message = parser->parse(m_parsed_message, m_text_buffer, message);
    // Message can be modified inside parser, if parser for example extracted tool calling from message content
    // but parser
```
?
```python
from transformers import AutoTokenizer
from utils.hugging_face import convert_and_save_tokenizer, download_and_convert_model
import re
import textwrap
```
Suggested change:
```diff
- import textwrap
```
```cpp
JsonContainer msg;
msg["content"] = res.texts[i];
for (auto& parser: parsers) {
    // TODO: Check the state of incremental parser and reset if necessary
```
Only applicable for incremental parsers
```cpp
for (const auto& parsed: dr.parsed) {
    auto json_str = parsed.to_json_string();
    py::dict json_dict = json_mod.attr("loads")(json_str);
```
```cpp
void call_parser(py::dict& msg, std::function<void(JsonContainer&)> func) {
    auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(msg);
    auto msg_cpp = JsonContainer(msg_anymap);

    func(msg_cpp);

    auto json_str = msg_cpp.to_json_string();
    py::dict result = json_mod.attr("loads")(json_str);

    // update msg with result
    msg.clear();
    for (auto item : result) {
        msg[item.first] = item.second;
    }
}
```
Suggested change:
```diff
- void call_parser(py::dict& msg, std::function<void(JsonContainer&)> func) {
-     auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(msg);
-     auto msg_cpp = JsonContainer(msg_anymap);
-     func(msg_cpp);
-     auto json_str = msg_cpp.to_json_string();
-     py::dict result = json_mod.attr("loads")(json_str);
-     // update msg with result
-     msg.clear();
-     for (auto item : result) {
-         msg[item.first] = item.second;
-     }
- }
+ py::dict call_parser(py::dict& msg, std::function<void(JsonContainer&)> func) {
+     auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(msg);
+     auto msg_cpp = JsonContainer(msg_anymap);
+     func(msg_cpp);
+     auto json_str = msg_cpp.to_json_string();
+     return json_mod.attr("loads")(json_str);
+ }
```
```cpp
void init_parsers(py::module_& m) {
    py::class_<IncrementalParserBase, ConstructableIncrementalParserBase, std::shared_ptr<IncrementalParserBase>>(m, "IncrementalParserBase")
        .def(py::init<>())
        .def("parse", [](IncrementalParserBase& self,
```
Use the `call_parser` helper.
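A sketch of what using the helper could look like in this binding; only `call_parser` itself comes from the hunk quoted earlier, and the Python-facing signature of `parse` (message dict plus previous and delta text) is an assumption that would need to match the real interface:

```cpp
.def("parse", [](IncrementalParserBase& self,
                 py::dict msg,
                 const std::string& previous_text,
                 const std::string& delta_text) {
    // Reuse call_parser instead of hand-rolling dict <-> JsonContainer conversion.
    call_parser(msg, [&](JsonContainer& msg_cpp) {
        self.parse(msg_cpp, previous_text, delta_text);  // assumed C++ signature
    });
    return msg;
})
```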
```cpp
StreamingStatus write(JsonContainer& message) override {
    py::dict message_py;
    auto json_obj = message.to_json();
```
Changed API
```cpp
    message_py[py::cast(it.key())] = py::cast(it.value().get<std::string>());
}

// call python implementation which accepts py::dict instead of JsonContainer
```
Add more description about the custom Python function call.
Add test for a custom parser implementation
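A sketch of such a test on the C++ side; the header path matches the test file quoted below, but the exact `ParserBase::parse` signature is an assumption and would need to be adjusted to the interface actually declared in parsers.hpp:

```cpp
#include <cctype>
#include <string>

#include <gtest/gtest.h>
#include "openvino/genai/parsers.hpp"

using ov::genai::JsonContainer;
using ov::genai::ParserBase;

// Hypothetical user-defined parser: uppercases the message content.
class UppercaseContentParser : public ParserBase {
public:
    void parse(JsonContainer& message) override {  // assumed signature
        std::string content = message["content"].get_string();
        for (auto& c : content)
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        message["content"] = content;
    }
};

TEST(ParserTest, CustomParserIsApplied) {
    UppercaseContentParser parser;
    JsonContainer msg;
    msg["content"] = "hello";
    parser.parse(msg);
    EXPECT_EQ(msg["content"].get_string(), "HELLO");
}
```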
```cpp
    kv[std::string((*it)[1])] = std::string((*it)[2]);
}

input["tool_calls"] = JsonContainer::array();
```
Duplicated `input["tool_calls"] = JsonContainer::array();` (see L191).
```cpp
#include <nlohmann/json.hpp>

using json = nlohmann::json;
```
`nlohmann/json` is unused.
```cpp
.def(py::init<>())
.def_property_readonly("texts", [](const DecodedResults &dr) -> py::typing::List<py::str> { return pyutils::handle_utf8((std::vector<std::string>)dr); })
.def_readonly("scores", &DecodedResults::scores)
.def_property_readonly("parsed", [](const DecodedResults& dr) -> py::dict {
```
The return type seems to be `list`.
Suggested change:
```diff
- .def_property_readonly("parsed", [](const DecodedResults& dr) -> py::dict {
+ .def_property_readonly("parsed", [](const DecodedResults& dr) -> py::list {
```
```cpp
static py::object json_mod = py::module_::import("json");
py::list result_dicts;

for (const auto& parsed: dr.parsed) {
    auto json_str = parsed.to_json_string();
    py::dict json_dict = json_mod.attr("loads")(json_str);

    result_dicts.append(json_dict);
}
return result_dicts;
```
Once PR #2816 is merged, you can use utility function:
Suggested change:
```diff
- static py::object json_mod = py::module_::import("json");
- py::list result_dicts;
- for (const auto& parsed: dr.parsed) {
-     auto json_str = parsed.to_json_string();
-     py::dict json_dict = json_mod.attr("loads")(json_str);
-     result_dicts.append(json_dict);
- }
- return result_dicts;
+ return pyutils::json_container_to_py_object(dr.parsed);
```
For now it also uses json string serialization, but can be optimized later with direct native conversion.
```cpp
auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(msg);
auto msg_cpp = JsonContainer(msg_anymap);
```
Suggested change:
```diff
- auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(msg);
- auto msg_cpp = JsonContainer(msg_anymap);
+ auto msg_cpp = pyutils::py_object_to_json_container(msg);
```
BTW, conversion through AnyMap does not preserve object key order, as AnyMap sorts keys alphabetically.
```cpp
    message_py[py::cast(it.key())] = py::cast(it.value().get<std::string>());
}

// call python implementation which accepts py::dict instead of JsonContainer
```
Is it possible to use the C++ implementation instead of Python to prevent double conversion? Or am I missing something?
```cpp
auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(message_py);
message = JsonContainer(msg_anymap);
```
Suggested change:
```diff
- auto msg_anymap = ov::genai::pybind::utils::py_object_to_any_map(message_py);
- message = JsonContainer(msg_anymap);
+ message = pyutils::py_object_to_json_container(message_py);
```
```cpp
static py::object json_mod = py::module_::import("json");

auto res = self.get_parsed_message();
auto json_str = res.to_json_string();
py::dict json_dict = json_mod.attr("loads")(json_str);

return json_dict;
```
Suggested change:
```diff
- static py::object json_mod = py::module_::import("json");
- auto res = self.get_parsed_message();
- auto json_str = res.to_json_string();
- py::dict json_dict = json_mod.attr("loads")(json_str);
- return json_dict;
+ return pyutils::json_container_to_py_object(self.get_parsed_message());
```
```diff
  add_executable(${TEST_TARGET_NAME} ${tests_src} $<TARGET_OBJECTS:openvino_genai_obj>)

- target_link_libraries(${TEST_TARGET_NAME} PRIVATE $<TARGET_PROPERTY:openvino::genai,LINK_LIBRARIES> gtest_main gmock_main)
+ target_link_libraries(${TEST_TARGET_NAME} PRIVATE $<TARGET_PROPERTY:openvino::genai,LINK_LIBRARIES> gtest_main gmock_main nlohmann_json::nlohmann_json)
```
Shouldn't be needed anymore
```cpp
#include <gtest/gtest.h>
#include "openvino/genai/generation_config.hpp"
#include "openvino/genai/parsers.hpp"
#include "nlohmann/json.hpp"
```
Can we test with `JsonContainer` instead of `nlohmann::json`? `JsonContainer` has an equality operator that compares JSON objects under the hood.
Description
Ticket: CVS-170883
Checklist: