feat: support int64 <=> int32 auto conversion #1407
Conversation
Signed-off-by: Bo Wang <[email protected]>
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Code conforms to Python style guidelines
Code conforms to C++ style guidelines
core/partitioning/shape_analysis.cpp
Outdated
}
}
}
// TODO: This part might be necessary for some models, now checking to verify
@bowang007 should this be uncommented?
Yes, this has been optimized and refactored; that part is now included.
@inocsin Can you verify these changes on key models?
Seems like tests are failing for partitioning?
Sure, we have asked users to test this PR with their models.
Signed-off-by: Bo Wang <[email protected]>
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Signed-off-by: Bo Wang <[email protected]>
Code conforms to Python style guidelines
There are some changes that do not conform to C++ style guidelines:
diff --git a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp b/tmp/changes.txt
index ad68c0a..d7b7e23 100644
--- a/home/runner/work/TensorRT/TensorRT/tests/core/partitioning/test_type_auto_conversion.cpp
+++ b/tmp/changes.txt
@@ -12,8 +12,8 @@ bool checkInsertedCastNodeNumber(torch_tensorrt::core::partitioning::SegmentedBl
cnt++;
}
}
- std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count << " aten::to nodes)"
- << std::endl;
+ std::cout << "Found count of " << cnt << " inserted aten::to nodes, (looking for " << target_count
+ << " aten::to nodes)" << std::endl;
return target_count == cnt;
}
@@ -61,7 +61,6 @@ TEST(Partitioning, ExplicitNodeAutoConversionCorrectly) {
LOG_DEBUG(seg_block << " cur seg block");
}
ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
-
}
TEST(Partitioning, ImplicitAutoConversionCorrectly) {
@@ -105,5 +104,3 @@ TEST(Partitioning, ImplicitAutoConversionCorrectly) {
}
ASSERT_TRUE(checkInsertedCastNodeNumber(segmented_blocks[1], 2));
}
-
-
ERROR: Some files do not conform to style guidelines
When compiling BART (https://huggingface.co/facebook/bart-base) using Torch-TensorRT, this PR currently segfaults on my machine. Will add an additional comment with the line which is causing the issue.
There are some changes that do not conform to C++ style guidelines (same formatting diff as above).
ERROR: Some files do not conform to style guidelines
Code conforms to Python style guidelines
gs-olive left a comment
With the suggested edits, the PR is functioning for the BART model and successfully casts Long tensors to Int tensors. The suggestions relate to bugs arising from input vs. output checking and usage of different aten::to schemas.
core/partitioning/shape_analysis.cpp
Outdated
auto const_zero = g->insertConstant(0);
const_zero->setType(torch::jit::BoolType::get());
auto none_val = g->insertNode(g->createNone())->output();
cast_node = g->create(torch::jit::aten::to, {g->inputs()[index], const_type, const_zero, const_zero, none_val});
Add an if/else here to use g->inputs() if is_input is true; otherwise use g->outputs().
core/partitioning/shape_analysis.cpp
Outdated
}

torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
  torch::jit::Node* cast_node = getUpstreamCastNode(seg_block.raw_inputs()[index]);
Add an if/else here to use raw_inputs() if is_input is true; otherwise use raw_outputs().
core/partitioning/shape_analysis.cpp
Outdated
// if we can find an upstream aten::to node, we use its parameters for creating the new cast node
if (cast_node) {
  std::unordered_map<torch::jit::Value*, torch::jit::Value*> value_map;
  value_map.insert({cast_node->inputs()[0], g->inputs()[index]});
May need an if/else here to check if insert should be g->inputs()[index] or g->outputs()[index].
core/partitioning/shape_analysis.cpp
Outdated
// auto cast_node = g->prependNode(g->create(torch::jit::aten::to, {g->inputs()[i], const_type, const_zero,
// const_zero, none_val})); seg_block.inputs()[i]->replaceAllUsesAfterNodeWith(cast_node,
// cast_node->outputs()[0]); LOG_DEBUG(seg_block << " in shape analysis");
Consider removing commented code if not needed.
core/partitioning/shape_analysis.cpp
Outdated
if (!is_input) {
  // if this value is output, we need to cast it to int32
  auto const_val = g->insertConstant(3);
  value_map.insert({cast_node->inputs()[1], const_val});
Throws an error when the upstream aten::to node does not have dtype as its second argument. For example, the schema aten::to.prim_Device(Tensor(a) self, Device? device, int? dtype=None, bool non_blocking=False, bool copy=False) -> Tensor(b|a) has Device as its second value, and this insertion transforms it into an invalid schema. We need to differentiate between schemas to ensure the dtype is placed in the right position. It seems that valid aten::to schemas have dtype as either the second or third argument, or not at all. I believe there should be a check in getUpstreamCastNode to see whether dtype is among the arguments, and then a second check here to see whether it is the second or third argument in the schema.
The check here could be something like an if/else checking the debugName at the second index, as in:
if (cast_node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::DeviceObjType) {
value_map.insert({cast_node->inputs()[2], const_val});
} else {
value_map.insert({cast_node->inputs()[1], const_val});
}
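For context on why the dtype position varies, here is a minimal standalone sketch (not code from the PR; it only assumes PyTorch is installed). Scripting Tensor.to with a dtype versus a device selects different aten::to overloads, and the graph dump shows where each schema places its arguments:

```python
import torch

@torch.jit.script
def to_dtype(x: torch.Tensor) -> torch.Tensor:
    # Selects the dtype overload: dtype is the second argument.
    return x.to(torch.int32)

@torch.jit.script
def to_device(x: torch.Tensor, d: torch.device) -> torch.Tensor:
    # Selects the device overload: device is second, dtype (if any) comes later.
    return x.to(d)

# Both graphs contain aten::to nodes, but the argument layouts differ,
# which is why a positional check on the dtype argument is needed.
print(to_dtype.graph)
print(to_device.graph)
```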
core/partitioning/shape_analysis.cpp
Outdated
auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
May need an additional check to ensure that the aten::to schema is valid for dtype insertion, as some of these schemas do not take an integer dtype at all, for example:
aten::to(Tensor(a) self, bool non_blocking=False, bool copy=False) -> Tensor(b|a)
aten::to(Tensor(a) self, Device device, ScalarType dtype, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)
aten::to(Tensor(a) self, Tensor other, bool non_blocking=False, bool copy=False, MemoryFormat? memory_format=None) -> Tensor(a)
A check could be something like an additional && with
(node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
(node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType)
Hey @gs-olive, any reproducer for this?
What I'm not sure about: for the getUpstreamCastNode() function, when we pass in an int32 value, will the first cast node found be the one that casts this value to int64? If that's the case, then we don't need this check.
In other words, is it possible that the first cast node involving the passed value casts some other value? And if the first cast node is not the one that casts to int64, will the second cast node be what we want?
Hi @bowang007 - as an update, while this is no longer throwing an error on my end, my thought was that we do need this check you have, but maybe it should be more stringent, something like:
if ((node->kind().toQualString() == std::string("aten::to")) &&
    ((node->inputs()[1]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType) ||
     (node->inputs()[2]->node()->output()->type()->kind() == torch::jit::TypeKind::IntType))) {
This is because, when the aten::to matches the second schema in my comment above, inserting a constant like 3 will cause the model to fail, since that schema expects a ScalarType rather than an int. I don't have a specific model that reproduces an error, and I don't think I encountered one while testing; I just think it is generally safer to be strict about the type of upstream cast node reused to recast to Int32. Specifically, if we are unsure whether a node has a valid schema for repurposing, we should choose the safer option and manually insert an Int32 cast node, as you do in createCastNode.
@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)
@gs-olive I got your point now, let me update this part.
Fixes #1346
Signed-off-by: Bo Wang <[email protected]>
There are some changes that do not conform to C++ style guidelines (same formatting diff as above).
ERROR: Some files do not conform to style guidelines
Code conforms to Python style guidelines
There are some changes that do not conform to C++ style guidelines (same formatting diff as above).
ERROR: Some files do not conform to style guidelines
Code conforms to Python style guidelines
gs-olive left a comment
One additional bugfix requested and one minor optional comment, and then the PR is successful when used for compilation + inference on BART.
core/partitioning/shape_analysis.cpp
Outdated
torch::jit::Node* createCastNode(SegmentedBlock& seg_block, size_t index, bool is_input) {
  auto cast_raw_value = is_input ? seg_block.raw_inputs()[index] : seg_block.raw_outputs()[index];
  auto cast_subgraph_value = is_input ? seg_block.outputs()[index] : seg_block.outputs()[index];
Please update this line to:
auto cast_subgraph_value = is_input ? seg_block.inputs()[index] : seg_block.outputs()[index];
Currently, it uses the outputs regardless of the truth value of is_input. With this change, the PR (used along with PR #1416) works for compilation + inference with the BART model.
core/partitioning/shape_analysis.cpp
Outdated
auto cur_val = q.front();
q.pop();
auto node = cur_val->node();
if (node->kind().toQualString() == std::string("aten::to")) {
@bowang007 Please let me know what you think about the comment in the thread above:
#1407 (comment)
Signed-off-by: Bo Wang <[email protected]>
Code conforms to C++ style guidelines
Code conforms to Python style guidelines
Looks good! Fully functional now on BART!
Signed-off-by: Bo Wang <[email protected]>
Description
Support int64 <=> int32 type conversion.
Fixes #1382
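As a minimal illustration of the conversion this PR automates (a standalone sketch, not code from the PR): PyTorch integer tensors default to int64 (Long), while TensorRT engines generally do not accept int64 inputs, so casts equivalent to the following are inserted as aten::to nodes at segment boundaries:

```python
import torch

x = torch.tensor([1, 2, 3])
assert x.dtype == torch.int64  # integer literals default to Long (int64)

# int64 -> int32: the cast inserted before a TensorRT segment
y = x.to(torch.int32)
assert y.dtype == torch.int32

# int32 -> int64: the reverse direction the PR also supports
z = y.to(torch.int64)
assert z.dtype == torch.int64
```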
Type of change
Checklist: