-
Notifications
You must be signed in to change notification settings - Fork 13.5k
[ASan][libc++] Annotating std::basic_string
with all allocators
#75845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ASan][libc++] Annotating std::basic_string
with all allocators
#75845
Conversation
@llvm/pr-subscribers-libcxx Author: Tacet (AdvenamTacet) ChangesThis commit turns on ASan annotations in Originally suggested here: https://reviews.llvm.org/D146214 This commit is part of our efforts to support container annotations with (almost) every allocator. Annotating Support in ASan API exests since 1c5ad6d. This patch removes the check in std::basic_string annotation member function (__annotate_contiguous_container) to support different allocators. You can turn off annotations for a specific allocator based on changes from 2fa1bec. The motivation for a research and those changes was a bug, found by Trail of Bits, in a real code where an out-of-bounds read could happen as two strings were compared via a If you have any questions, please email: Full diff: https://github.com/llvm/llvm-project/pull/75845.diff 4 Files Affected:
diff --git a/libcxx/include/string b/libcxx/include/string
index fdffca5aed18be..ce2df8da8a91eb 100644
--- a/libcxx/include/string
+++ b/libcxx/include/string
@@ -1891,8 +1891,7 @@ private:
#if !defined(_LIBCPP_HAS_NO_ASAN) && defined(_LIBCPP_INSTRUMENTED_WITH_ASAN)
const void* __begin = data();
const void* __end = data() + capacity() + 1;
- if (!__libcpp_is_constant_evaluated() && __begin != nullptr &&
- is_same<allocator_type, __default_allocator_type>::value)
+ if (!__libcpp_is_constant_evaluated() && __asan_annotate_container_with_allocator<allocator_type>::value)
__sanitizer_annotate_contiguous_container(__begin, __end, __old_mid, __new_mid);
#endif
}
diff --git a/libcxx/test/libcxx/containers/strings/basic.string/asan.pass.cpp b/libcxx/test/libcxx/containers/strings/basic.string/asan.pass.cpp
new file mode 100644
index 00000000000000..4c3018c2b33cc5
--- /dev/null
+++ b/libcxx/test/libcxx/containers/strings/basic.string/asan.pass.cpp
@@ -0,0 +1,61 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// REQUIRES: asan
+
+// <string>
+
+// Basic test if ASan annotations work.
+
+#include <string>
+#include <cassert>
+#include <cstdlib>
+
+#include "asan_testing.h"
+#include "min_allocator.h"
+#include "test_iterators.h"
+#include "test_macros.h"
+
+extern "C" void __sanitizer_set_death_callback(void (*callback)(void));
+
+void do_exit() {
+ exit(0);
+}
+
+int main(int, char**)
+{
+ {
+ typedef cpp17_input_iterator<char*> MyInputIter;
+ // Should not trigger ASan.
+ std::basic_string<char, std::char_traits<char>, safe_allocator<char>> v;
+ char i[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
+ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
+ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'
+ };;
+
+ v.insert(v.begin(), MyInputIter(i), MyInputIter(i + 29));
+ assert(v[0] == 'a');
+ assert(is_string_asan_correct(v));
+ }
+
+ __sanitizer_set_death_callback(do_exit);
+ {
+ using T = char;
+ using C = std::basic_string<T, std::char_traits<T>, safe_allocator<T>>;
+ const T t[] = {'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
+ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j',
+ 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j'
+ };
+ C c(std::begin(t), std::end(t));
+ assert(is_string_asan_correct(c));
+ assert(__sanitizer_verify_contiguous_container(c.data(), c.data() + c.size() + 1, c.data() + c.capacity() + 1) != 0);
+ volatile T foo = c[c.size()+1]; // should trigger ASAN. Use volatile to prevent being optimized away.
+ assert(false); // if we got here, ASAN didn't trigger
+ ((void)foo);
+ }
+}
\ No newline at end of file
diff --git a/libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp b/libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
new file mode 100644
index 00000000000000..875b3904fc6b29
--- /dev/null
+++ b/libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
@@ -0,0 +1,77 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// UNSUPPORTED: c++03
+
+// <string>
+
+// Test based on: https://bugs.chromium.org/p/chromium/issues/detail?id=1419798#c5
+// Some allocators during deallocation may not call destructors and just reuse memory.
+// In those situations, one may want to deactivate annotations for a specific allocator.
+// It's possible with __asan_annotate_container_with_allocator template class.
+// This test confirms that those allocators work after turning off annotations.
+
+#include <assert.h>
+#include <stdlib.h>
+#include <string>
+#include <new>
+
+struct reuse_allocator {
+ static size_t const N = 100;
+ reuse_allocator() {
+ for (size_t i = 0; i < N; ++i)
+ __buffers[i] = malloc(8*1024);
+ }
+ ~reuse_allocator() {
+ for (size_t i = 0; i < N; ++i)
+ free(__buffers[i]);
+ }
+ void* alloc() {
+ assert(__next_id < N);
+ return __buffers[__next_id++];
+ }
+ void reset() { __next_id = 0; }
+ void* __buffers[N];
+ size_t __next_id = 0;
+} reuse_buffers;
+
+template <typename T>
+struct user_allocator {
+ using value_type = T;
+ user_allocator() = default;
+ template <class U>
+ user_allocator(user_allocator<U>) {}
+ friend bool operator==(user_allocator, user_allocator) {return true;}
+ friend bool operator!=(user_allocator x, user_allocator y) {return !(x == y);}
+
+ T* allocate(size_t) { return (T*)reuse_buffers.alloc(); }
+ void deallocate(T*, size_t) noexcept {}
+};
+
+template <class T>
+struct std::__asan_annotate_container_with_allocator<user_allocator<T>> {
+ static bool const value = false;
+};
+
+int main() {
+ using S = std::basic_string<char, std::char_traits<char>, user_allocator<char>>;
+
+ {
+ S* s = new (reuse_buffers.alloc()) S();
+ for (int i = 0; i < 100; i++)
+ s->push_back('a');
+ }
+ reuse_buffers.reset();
+ {
+ S s;
+ for (int i = 0; i < 1000; i++)
+ s.push_back('b');
+ }
+
+ return 0;
+}
\ No newline at end of file
diff --git a/libcxx/test/support/asan_testing.h b/libcxx/test/support/asan_testing.h
index 2dfec5c42b00b2..6bfc8280a4ead3 100644
--- a/libcxx/test/support/asan_testing.h
+++ b/libcxx/test/support/asan_testing.h
@@ -75,7 +75,7 @@ TEST_CONSTEXPR bool is_string_asan_correct(const std::basic_string<ChrT, TraitsT
return true;
if (!is_string_short(c) || _LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED) {
- if (std::is_same<Alloc, std::allocator<ChrT>>::value)
+ if (std::__asan_annotate_container_with_allocator<Alloc>::value)
return __sanitizer_verify_contiguous_container(c.data(), c.data() + c.size() + 1, c.data() + c.capacity() + 1) !=
0;
else
|
629d8e9
to
bc778ab
Compare
I think we are ready to turn on annotations for all allocators as no bugs are reported for #72677. @vitalybuka @ldionne let me know if you agree. |
bc778ab
to
f6f6618
Compare
383ff8d
to
d297670
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pretty opposed to a couple of aspects of this patch.
-
I think the customization point
std::__asan_annotate_container_with_allocator
encourages users to (A) use internals, (B) specialize STL types, and (C) configure the customization point for types they don't own. -
The added tests contain a bunch of UB.
Could you please provide an example that is well-formed? Then we can figure out a more appropriate way to provide the customization needed.
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Outdated
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan.pass.cpp
Outdated
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan.pass.cpp
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I have a few requested changes. Then I say we land this, and then take a short break to ensure everything that's landed so far is going to stick, and to do some cleanup.
One cleanup we need to undertake is one that removes our dependence on the optimizer removing dead calls for us (even if it does, that's likely at the cost of some other code it can't optimize because it's hit its inliner limits).
I'm going to hold off actually approving this patch until we figure out if the performance fixes for the earlier patches are sufficient.
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan.pass.cpp
Outdated
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Show resolved
Hide resolved
✅ With the latest revision this PR passed the C/C++ code formatter. |
27c62af
to
b606f9e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this LGTM, but it needs updating since we're not going to land the if constexpr
magic it's using. But I think we have a good handle on ensuring this is no overhead by other means.
If you wouldn't mind giving me a few days to test this internally before landing, I would appreciate it. But if there's time pressure to land this, that's OK too.
I will remove the constexpr magic Tomorrow. No big time pressure, I believe we can wait a few days. Probably it's good to test before landing to avoid reverting. Also, @ldionne didn't stamp this PR yet. Would be good, however, to work on short string annotations in parallel. I will update/finish #75882 Tomorrow, could you look at it as well then? |
This commit turns on ASan annotations in `std::basic_string` for all allocators by default. Originally suggested here: https://reviews.llvm.org/D146214 This commit is part of our efforts to support container annotations with (almost) every allocator. Annotating `std::basic_string` with default allocator is implemented in llvm#72677. Support in ASan API exests since llvm@dd1b7b7. This patch removes the check in std::basic_string annotation member function (__annotate_contiguous_container) to support different allocators. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: - [email protected] - [email protected]
This commit addresses some comments from a code review: - add size check, - add coments with context to a test, - add comments explaining what happens in the test, - remove volatile, - split an if to constexpr if and normal if.
3556d98
to
428a15f
Compare
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Outdated
Show resolved
Hide resolved
libcxx/test/libcxx/containers/strings/basic.string/asan_turning_off.pass.cpp
Outdated
Show resolved
Hide resolved
Co-authored-by: Louis Dionne <[email protected]>
Co-authored-by: Louis Dionne <[email protected]>
Also refactor one line.
@ldionne Thx for looking at it! I will merge when CI ends running. |
Buildbots are failing. https://lab.llvm.org/buildbot/#/builders/37/builds/29906 Error is not clear for me, but I think problem is not in that PR, therefore I'm not reverting.
Could it be that another error was detected? |
This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: #72677 Requires to pass CI without fails: - #75845 - #75858 Annotating `std::basic_string` with default allocator is implemented in #72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since dd1b7b7. You can turn off annotations for a specific allocator based on changes from 2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
Originally merged here: #75882 Reverted here: #78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - #79065 - #79066 Problematic code from `UniqueFunctionBase` for example: ```cpp #ifndef NDEBUG // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); #endif ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: #72677 Requires to pass CI without fails: - #75845 - #75858 Annotating `std::basic_string` with default allocator is implemented in #72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since dd1b7b7. You can turn off annotations for a specific allocator based on changes from 2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
…vm#75845) This commit turns on ASan annotations in `std::basic_string` for all allocators by default. Originally suggested here: https://reviews.llvm.org/D146214 String annotations added here: llvm#72677 This commit is part of our efforts to support container annotations with (almost) every allocator. Annotating `std::basic_string` with default allocator is implemented in llvm#72677. Additionally it removes `__begin != nullptr` because `data()` should never return a nullptr. Support in ASan API exists since llvm@1c5ad6d. This patch removes the check in std::basic_string annotation member function (__annotate_contiguous_container) to support different allocators. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. The motivation for a research and those changes was a bug, found by Trail of Bits, in a real code where an out-of-bounds read could happen as two strings were compared via a call to `std::equal` that took `iter1_begin`, `iter1_end`, `iter2_begin` iterators (with a custom comparison function). When object `iter1` was longer than `iter2`, read out-of-bounds on `iter2` could happen. Container sanitization would detect it. If you have any questions, please email: - [email protected] - [email protected]
This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm/llvm-project#72677 Requires to pass CI without fails: - llvm/llvm-project#75845 - llvm/llvm-project#75858 Annotating `std::basic_string` with default allocator is implemented in llvm/llvm-project#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm/llvm-project@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm/llvm-project@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected] NOKEYCHECK=True GitOrigin-RevId: d06fb0b29c7030497e0e6411cf256cabd71940c2
Originally merged here: llvm/llvm-project#75882 Reverted here: llvm/llvm-project#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm/llvm-project#79065 - llvm/llvm-project#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp #ifndef NDEBUG // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); #endif ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm/llvm-project#72677 Requires to pass CI without fails: - llvm/llvm-project#75845 - llvm/llvm-project#75858 Annotating `std::basic_string` with default allocator is implemented in llvm/llvm-project#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm/llvm-project@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm/llvm-project@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected] NOKEYCHECK=True GitOrigin-RevId: cb528ec5e6331ce207c7b835d7ab963bd5e13af7
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This is 3rd attempt to upstream short string annotations, it's the same as the previous one, but other PRs fixed issues withing LLVM: - llvm#79489 - llvm#79522 Additionaly annotations were updated (but it shouldn't have any impact on anything): - llvm#79292 Now, as far as I know, all buildbots should work without problems. Both previous reverts were not related to issues with string annotations, but with issues in LLVM/clang. Read PRs above and below for details. --- Previous description: Originally merged here: llvm#75882 Reverted here: llvm#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm#79065 - llvm#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm#72677 Requires to pass CI without fails: - llvm#75845 - llvm#75858 Annotating `std::basic_string` with default allocator is implemented in llvm#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: [email protected] [email protected]
This pull request is the third iteration aiming to integrate short string annotations. This commit includes: - Enabling basic_string annotations for short strings. - Setting a value of `__trivially_relocatable` in `std::basic_string` to `false_type` when compiling with ASan (nothing changes when compiling without ASan). Short string annotations make `std::basic_string` to not be trivially relocatable, because memory has to be unpoisoned. - Adding a `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS` modifier to two functions. - Creating a macro `_LIBCPP_ASAN_VOLATILE_WRAPPER` to prevent problematic stack optimizations (the macro modifies code behavior only when compiling with ASan). Previously we had issues with compiler optimization, which we understand thanks to @vitalybuka. This commit also addresses smaller changes in short string, since previous upstream attempts. Problematic optimization was loading two values in code similar to: ``` __is_long() ? __get_long_size() : __get_short_size(); ``` We aim to resolve it with the volatile wrapper. This commit is built on top of two previous attempts which descriptions are below. Additionally, in the meantime, annotations were updated (but it shouldn't have any impact on anything): - #79292 --- Previous PR: #79049 Reverted: a16f81f Previous description: Originally merged here: #75882 Reverted here: #78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - #79065 - #79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: #72677 Requires to pass CI without fails: - #75845 - #75858 Annotating `std::basic_string` with default allocator is implemented in #72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since dd1b7b7. You can turn off annotations for a specific allocator based on changes from 2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: - [email protected] - [email protected]
This pull request is the third iteration aiming to integrate short string annotations. This commit includes: - Enabling basic_string annotations for short strings. - Setting a value of `__trivially_relocatable` in `std::basic_string` to `false_type` when compiling with ASan (nothing changes when compiling without ASan). Short string annotations make `std::basic_string` to not be trivially relocatable, because memory has to be unpoisoned. - Adding a `_LIBCPP_STRING_INTERNAL_MEMORY_ACCESS` modifier to two functions. - Creating a macro `_LIBCPP_ASAN_VOLATILE_WRAPPER` to prevent problematic stack optimizations (the macro modifies code behavior only when compiling with ASan). Previously we had issues with compiler optimization, which we understand thanks to @vitalybuka. This commit also addresses smaller changes in short string, since previous upstream attempts. Problematic optimization was loading two values in code similar to: ``` __is_long() ? __get_long_size() : __get_short_size(); ``` We aim to resolve it with the volatile wrapper. This commit is built on top of two previous attempts which descriptions are below. Additionally, in the meantime, annotations were updated (but it shouldn't have any impact on anything): - llvm/llvm-project#79292 --- Previous PR: llvm/llvm-project#79049 Reverted: llvm/llvm-project@a16f81f Previous description: Originally merged here: llvm/llvm-project#75882 Reverted here: llvm/llvm-project#78627 Reverted due to failing buildbots. The problem was not caused by the annotations code, but by code in the `UniqueFunctionBase` class and in the `JSON.h` file. That code caused the program to write to memory that was already being used by string objects, which resulted in an ASan error. Fixes are implemented in: - llvm/llvm-project#79065 - llvm/llvm-project#79066 Problematic code from `UniqueFunctionBase` for example: ```cpp // In debug builds, we also scribble across the rest of the storage. memset(RHS.getInlineStorage(), 0xAD, InlineStorageSize); ``` --- Original description: This commit turns on ASan annotations in `std::basic_string` for short stings (SSO case). Originally suggested here: https://reviews.llvm.org/D147680 String annotations added here: llvm/llvm-project#72677 Requires to pass CI without fails: - llvm/llvm-project#75845 - llvm/llvm-project#75858 Annotating `std::basic_string` with default allocator is implemented in llvm/llvm-project#72677 but annotations for short strings (SSO - Short String Optimization) are turned off there. This commit turns them on. This also removes `_LIBCPP_SHORT_STRING_ANNOTATIONS_ALLOWED`, because we do not plan to support turning on and off short string annotations. Support in ASan API exists since llvm/llvm-project@dd1b7b7. You can turn off annotations for a specific allocator based on changes from llvm/llvm-project@2fa1bec. This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds. The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equals` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` object exceeded the length of `iter2`, an out-of-bounds read could occur on the `iter2` object. Container sanitization, upon enabling these annotations, would effectively identify and flag this potential vulnerability. If you have any questions, please email: - [email protected] - [email protected] NOKEYCHECK=True GitOrigin-RevId: 1a96179596099b8a3839050dbff02bfed94502e5
This commit turns on ASan annotations in
std::basic_string
for all allocators by default.Originally suggested here: https://reviews.llvm.org/D146214
String annotations added here: #72677
This commit is part of our efforts to support container annotations with (almost) every allocator. Annotating
std::basic_string
with default allocator is implemented in #72677.Additionally it removes
__begin != nullptr
becausedata()
should never return a nullptr.Support in ASan API exists since 1c5ad6d. This patch removes the check in std::basic_string annotation member function (__annotate_contiguous_container) to support different allocators.
You can turn off annotations for a specific allocator based on changes from 2fa1bec.
The motivation for a research and those changes was a bug, found by Trail of Bits, in a real code where an out-of-bounds read could happen as two strings were compared via a call to
std::equal
that tookiter1_begin
,iter1_end
,iter2_begin
iterators (with a custom comparison function). When objectiter1
was longer thaniter2
, read out-of-bounds oniter2
could happen. Container sanitization would detect it.If you have any questions, please email: