Commit e7db482

Tacet authored and copybara-github committed
[ASan][libc++] std::basic_string annotations (#72677)
This commit introduces basic annotations for `std::basic_string`, mirroring the approach used in `std::vector` and `std::deque`. Initially, only long strings with the default allocator will be annotated. Short strings (_SSO - short string optimization_) and strings with non-default allocators will be annotated in the near future, with separate commits dedicated to enabling them. The process will be similar to the workflow employed for enabling annotations in `std::deque`.

**Please note**: these annotations function effectively only when libc++ and libc++abi dylibs are instrumented (with ASan). This aligns with the prevailing behavior of Memory Sanitizer.

To avoid breaking everything, this commit also appends `_LIBCPP_INSTRUMENTED_WITH_ASAN` to `__config_site` whenever libc++ is compiled with ASan. If this macro is not defined, string annotations are not enabled. However, linking a binary that does **not** annotate strings with a dynamic library that annotates strings is not permitted.

Originally proposed here: https://reviews.llvm.org/D132769

Related patches on Phabricator:
- Turning on annotations for short strings: https://reviews.llvm.org/D147680
- Turning on annotations for all allocators: https://reviews.llvm.org/D146214

This PR is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in `std::vector` and `std::deque` collections. These enhancements empower ASan to effectively detect instances where the instrumented program attempts to access memory within a collection's internal allocation that remains unused. This includes cases where access occurs before or after the stored elements in `std::deque`, or between the `std::basic_string`'s size (including the null terminator) and capacity bounds.
The introduction of these annotations was spurred by a real-world software bug discovered by Trail of Bits, involving an out-of-bounds memory access during the comparison of two strings using the `std::equal` function. This function was taking iterators (`iter1_begin`, `iter1_end`, `iter2_begin`) to perform the comparison, using a custom comparison function. When the `iter1` range was longer than the `iter2` range, an out-of-bounds read could occur on the `iter2` object. With these annotations enabled, container sanitization would identify and flag this potential vulnerability.

This Pull Request introduces basic annotations for `std::basic_string`. Long strings exhibit structural similarities to `std::vector` and will be annotated accordingly. Short strings are already implemented, but will be turned on separately in a forthcoming commit. See [a comment](llvm/llvm-project#72677 (comment)) below for the SSO issues at the current moment.

Due to the functionality introduced in [D132522](llvm/llvm-project@dd1b7b7), the `__sanitizer_annotate_contiguous_container` function now offers compatibility with all allocators. However, enabling this support will be done in a subsequent commit. For the time being, only strings with the default allocator will be annotated.

If you have any questions, please email:
- [email protected]
- [email protected]

NOKEYCHECK=True
GitOrigin-RevId: 9ed20568e7de53dce85f1631d7d8c1415e7930ae
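The bug pattern described above can be illustrated with a minimal sketch. The comparator and both wrapper functions are hypothetical names chosen for illustration and are not taken from the affected code base. The three-iterator overload of `std::equal` assumes the second range is at least as long as the first, while the four-iterator overload (C++14) also receives the end of the second range and returns `false` on a length mismatch.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical custom comparator: case-insensitive character match.
inline bool same_char_ignoring_case(char a, char b) {
  return std::tolower(static_cast<unsigned char>(a)) ==
         std::tolower(static_cast<unsigned char>(b));
}

// Three-iterator overload: nothing tells std::equal where rhs ends.
// If rhs were shorter than lhs, the comparison would read past rhs's
// stored characters. The assert only documents that precondition and
// disappears when NDEBUG is defined.
inline bool equal_unchecked(const std::string& lhs, const std::string& rhs) {
  assert(rhs.size() >= lhs.size());
  return std::equal(lhs.begin(), lhs.end(), rhs.begin(),
                    same_char_ignoring_case);
}

// Four-iterator overload (C++14): both ranges are bounded, so ranges of
// different lengths compare unequal without any out-of-range read.
inline bool equal_checked(const std::string& lhs, const std::string& rhs) {
  return std::equal(lhs.begin(), lhs.end(), rhs.begin(), rhs.end(),
                    same_char_ignoring_case);
}
```

With string annotations enabled, an out-of-range read through the unchecked variant that lands in the unused part of a long string's allocation is the kind of access ASan is expected to report.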
1 parent 1f70899 commit e7db482

92 files changed: +862 -64 lines changed

CMakeLists.txt

Lines changed: 13 additions & 0 deletions
@@ -651,6 +651,19 @@ get_sanitizer_flags(SANITIZER_FLAGS "${LLVM_USE_SANITIZER}")
 add_library(cxx-sanitizer-flags INTERFACE)
 target_compile_options(cxx-sanitizer-flags INTERFACE ${SANITIZER_FLAGS})
 
+# _LIBCPP_INSTRUMENTED_WITH_ASAN informs that library was built with ASan.
+# Defining _LIBCPP_INSTRUMENTED_WITH_ASAN while building the library with ASan is required.
+# Normally, the _LIBCPP_INSTRUMENTED_WITH_ASAN flag is used to keep information whether
+# dylibs are built with AddressSanitizer. However, when building libc++,
+# this flag needs to be defined so that the resulting dylib has all ASan functionalities guarded by this flag.
+# If the _LIBCPP_INSTRUMENTED_WITH_ASAN flag is not defined, then parts of the ASan instrumentation code in libc++
+# will not be compiled into it, resulting in false positives.
+# For context, read: https://github.com/llvm/llvm-project/pull/72677#pullrequestreview-1765402800
+string(FIND "${LLVM_USE_SANITIZER}" "Address" building_with_asan)
+if (NOT "${building_with_asan}" STREQUAL "-1")
+  config_define(ON _LIBCPP_INSTRUMENTED_WITH_ASAN)
+endif()
+
 # Link system libraries =======================================================
 function(cxx_link_system_libraries target)
   if (NOT MSVC)

include/__config_site.in

Lines changed: 1 addition & 0 deletions
@@ -29,6 +29,7 @@
 #cmakedefine _LIBCPP_HAS_NO_WIDE_CHARACTERS
 #cmakedefine _LIBCPP_HAS_NO_STD_MODULES
 #cmakedefine _LIBCPP_HAS_NO_TIME_ZONE_DATABASE
+#cmakedefine _LIBCPP_INSTRUMENTED_WITH_ASAN
 
 // PSTL backends
 #cmakedefine _LIBCPP_PSTL_CPU_BACKEND_SERIAL

include/string

Lines changed: 230 additions & 52 deletions
Large diffs are not rendered by default.
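
Since the `include/string` diff is not rendered here, the following is a rough sketch of the annotation idea the commit message describes, applied to a toy buffer rather than to `std::basic_string`. It is not the libc++ implementation: `ToyBuffer` and the `SKETCH_HAS_ASAN` macro are hypothetical names. The only sanitizer API used is `__sanitizer_annotate_contiguous_container` from `<sanitizer/common_interface_defs.h>`, and it is called only when the translation unit is built with ASan. The region between `size + 1` (characters plus terminator) and `capacity + 1` is marked as unused.

```cpp
#include <cassert>
#include <cstddef>

// Detect ASan for Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__).
#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_ASAN 1
#  endif
#endif
#ifndef SKETCH_HAS_ASAN
#  define SKETCH_HAS_ASAN 0
#endif

#if SKETCH_HAS_ASAN
#  include <sanitizer/common_interface_defs.h>
#endif

// Hypothetical fixed-capacity, null-terminated character buffer.
class ToyBuffer {
public:
  explicit ToyBuffer(std::size_t capacity)
      : data_(new char[capacity + 1]), size_(0), capacity_(capacity) {
    data_[0] = '\0';
    // Freshly allocated memory is fully accessible; mark everything after
    // the terminator as unused.
    annotate(data_ + capacity_ + 1, data_ + size_ + 1);
  }

  ~ToyBuffer() {
    // Make the whole allocation accessible again before releasing it.
    annotate(data_ + size_ + 1, data_ + capacity_ + 1);
    delete[] data_;
  }

  ToyBuffer(const ToyBuffer&) = delete;
  ToyBuffer& operator=(const ToyBuffer&) = delete;

  // Appends one character; returns false when the buffer is full.
  bool push_back(char c) {
    assert(size_ <= capacity_);
    if (size_ == capacity_)
      return false;
    // Extend the accessible prefix by one element before writing to it.
    annotate(data_ + size_ + 1, data_ + size_ + 2);
    data_[size_++] = c;
    data_[size_] = '\0';
    return true;
  }

  std::size_t size() const { return size_; }
  std::size_t capacity() const { return capacity_; }
  const char* c_str() const { return data_; }

private:
  // Moves the boundary between used and unused memory from old_mid to new_mid.
  void annotate(const char* old_mid, const char* new_mid) const {
#if SKETCH_HAS_ASAN
    __sanitizer_annotate_contiguous_container(
        data_, data_ + capacity_ + 1, old_mid, new_mid);
#else
    (void)old_mid;
    (void)new_mid;
#endif
  }

  char* data_;
  std::size_t size_;
  std::size_t capacity_;
};
```

In a build without ASan the annotation calls compile to nothing, so the class behaves like a plain buffer. In an ASan build, reading `c_str()[size() + 1]` while spare capacity remains would be expected to produce a container-overflow report.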

test/std/strings/basic.string/string.capacity/capacity.pass.cpp

Lines changed: 9 additions & 0 deletions
@@ -15,6 +15,7 @@
 
 #include "test_allocator.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 #include "test_macros.h"
 
@@ -28,6 +29,7 @@ TEST_CONSTEXPR_CXX20 void test_invariant(S s, test_allocator_statistics& alloc_s
     while (s.size() < s.capacity())
       s.push_back(typename S::value_type());
     assert(s.size() == s.capacity());
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
 #ifndef TEST_HAS_NO_EXCEPTIONS
   catch (...) {
@@ -43,17 +45,20 @@ TEST_CONSTEXPR_CXX20 void test_string(const Alloc& a) {
   {
     S const s((Alloc(a)));
     assert(s.capacity() >= 0);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
   {
     S const s(3, 'x', Alloc(a));
     assert(s.capacity() >= 3);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
 #if TEST_STD_VER >= 11
   // Check that we perform SSO
   {
     S const s;
     assert(s.capacity() > 0);
     ASSERT_NOEXCEPT(s.capacity());
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
 #endif
 }
@@ -63,18 +68,22 @@ TEST_CONSTEXPR_CXX20 bool test() {
   test_string(test_allocator<char>());
   test_string(test_allocator<char>(3));
   test_string(min_allocator<char>());
+  test_string(safe_allocator<char>());
 
   {
     test_allocator_statistics alloc_stats;
     typedef std::basic_string<char, std::char_traits<char>, test_allocator<char> > S;
     S s((test_allocator<char>(&alloc_stats)));
     test_invariant(s, alloc_stats);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
     s.assign(10, 'a');
     s.erase(5);
     test_invariant(s, alloc_stats);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
     s.assign(100, 'a');
     s.erase(50);
     test_invariant(s, alloc_stats);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
 
   return true;
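
The tests above call `is_string_asan_correct` from `asan_testing.h`, whose definition is not part of the rendered diff. As a hedged sketch of what such a check could look like, the helper below uses `__sanitizer_verify_contiguous_container` from `<sanitizer/common_interface_defs.h>`. The names `stores_inline`, `annotation_looks_correct`, and `SKETCH_HAS_ASAN` are hypothetical. The sketch assumes the state this commit describes: only long strings with the default allocator are annotated, and only when `_LIBCPP_INSTRUMENTED_WITH_ASAN` is defined. The short-string test is a heuristic (does `data()` point inside the string object?), not libc++'s internal check.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Detect ASan for Clang (__has_feature) and GCC (__SANITIZE_ADDRESS__).
#if defined(__SANITIZE_ADDRESS__)
#  define SKETCH_HAS_ASAN 1
#elif defined(__has_feature)
#  if __has_feature(address_sanitizer)
#    define SKETCH_HAS_ASAN 1
#  endif
#endif
#ifndef SKETCH_HAS_ASAN
#  define SKETCH_HAS_ASAN 0
#endif

#if SKETCH_HAS_ASAN
#  include <sanitizer/common_interface_defs.h>
#endif

// Heuristic: with SSO the characters live inside the string object itself.
inline bool stores_inline(const std::string& s) {
  const char* obj_begin = reinterpret_cast<const char*>(&s);
  const char* obj_end = obj_begin + sizeof(std::string);
  std::less<const char*> before; // well-defined order for unrelated pointers
  return !before(s.data(), obj_begin) && before(s.data(), obj_end);
}

// Returns true when the annotation matches [data, data + size + 1) being
// accessible and [data + size + 1, data + capacity + 1) being unused, or
// when there is nothing to verify in this build.
inline bool annotation_looks_correct(const std::string& s) {
  assert(s.size() <= s.capacity());
#if SKETCH_HAS_ASAN && defined(_LIBCPP_INSTRUMENTED_WITH_ASAN)
  if (stores_inline(s))
    return true; // assumption: short strings are not annotated yet
  const char* beg = s.data();
  return __sanitizer_verify_contiguous_container(
             beg, beg + s.size() + 1, beg + s.capacity() + 1) != 0;
#else
  return true;
#endif
}
```

Outside an instrumented ASan build the helper always returns `true`, which mirrors how the tests stay meaningful only when libc++ itself was built with ASan.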

test/std/strings/basic.string/string.capacity/clear.pass.cpp

Lines changed: 8 additions & 0 deletions
@@ -15,31 +15,39 @@
 
 #include "test_macros.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void test(S s) {
   s.clear();
   assert(s.size() == 0);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void test_string() {
   S s;
   test(s);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 
   s.assign(10, 'a');
   s.erase(5);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
   test(s);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 
   s.assign(100, 'a');
   s.erase(50);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
   test(s);
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 TEST_CONSTEXPR_CXX20 bool test() {
   test_string<std::string>();
 #if TEST_STD_VER >= 11
   test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
+  test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
 #endif
 
   return true;

test/std/strings/basic.string/string.capacity/reserve.pass.cpp

Lines changed: 3 additions & 0 deletions
@@ -18,6 +18,7 @@
 
 #include "test_macros.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 template <class S>
 void test(typename S::size_type min_cap, typename S::size_type erased_index) {
@@ -33,6 +34,7 @@ void test(typename S::size_type min_cap, typename S::size_type erased_index) {
   assert(s == s0);
   assert(s.capacity() <= old_cap);
   assert(s.capacity() >= s.size());
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 template <class S>
@@ -47,6 +49,7 @@ bool test() {
   test_string<std::string>();
 #if TEST_STD_VER >= 11
   test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
+  test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
 #endif
 
   return true;
Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+//===----------------------------------------------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+// <string>
+
+// This test verifies that the ASan annotations for basic_string objects remain accurate
+// after invoking basic_string::reserve(size_type __requested_capacity).
+// Different types are used to confirm that ASan works correctly with types of different sizes.
+#include <string>
+#include <cassert>
+
+#include "test_macros.h"
+#include "asan_testing.h"
+
+template <class S>
+void test() {
+  S short_s1(3, 'a'), long_s1(100, 'c');
+  short_s1.reserve(0x1337);
+  long_s1.reserve(0x1337);
+
+  LIBCPP_ASSERT(is_string_asan_correct(short_s1));
+  LIBCPP_ASSERT(is_string_asan_correct(long_s1));
+
+  short_s1.clear();
+  long_s1.clear();
+
+  LIBCPP_ASSERT(is_string_asan_correct(short_s1));
+  LIBCPP_ASSERT(is_string_asan_correct(long_s1));
+
+  short_s1.reserve(0x1);
+  long_s1.reserve(0x1);
+
+  LIBCPP_ASSERT(is_string_asan_correct(short_s1));
+  LIBCPP_ASSERT(is_string_asan_correct(long_s1));
+
+  S short_s2(3, 'a'), long_s2(100, 'c');
+  short_s2.reserve(0x1);
+  long_s2.reserve(0x1);
+
+  LIBCPP_ASSERT(is_string_asan_correct(short_s2));
+  LIBCPP_ASSERT(is_string_asan_correct(long_s2));
+}
+
+int main(int, char**) {
+  test<std::string>();
+#ifndef TEST_HAS_NO_WIDE_CHARACTERS
+  test<std::wstring>();
+#endif
+#if TEST_STD_VER >= 11
+  test<std::u16string>();
+  test<std::u32string>();
+#endif
+#if TEST_STD_VER >= 20
+  test<std::u8string>();
+#endif
+
+  return 0;
+}
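
The test above runs over several character types because the unused region is measured in bytes, so its size depends on `sizeof(value_type)`. A small sketch of that arithmetic follows; `unused_capacity_bytes` is a hypothetical helper, not something defined in `asan_testing.h`. The allocation of a long string holds `capacity() + 1` elements, of which `size() + 1` (characters plus terminator) are in use, leaving `capacity() - size()` unused elements.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: bytes between the end of the stored characters
// (terminator included) and the end of the string's capacity.
template <class String>
std::size_t unused_capacity_bytes(const String& s) {
  assert(s.capacity() >= s.size());
  return (s.capacity() - s.size()) * sizeof(typename String::value_type);
}
```

For example, after `reserve(0x1337)` on a 100-character string, at least `(0x1337 - 100) * sizeof(value_type)` bytes are unused, which is four times as many bytes for `std::u32string` as for `std::string` at the same capacity.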

test/std/strings/basic.string/string.capacity/reserve_size.pass.cpp

Lines changed: 3 additions & 0 deletions
@@ -20,6 +20,7 @@
 
 #include "test_macros.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void
@@ -28,6 +29,7 @@ test(typename S::size_type min_cap, typename S::size_type erased_index, typename
   s.erase(erased_index);
   assert(s.size() == erased_index);
   assert(s.capacity() >= min_cap); // Check that we really have at least this capacity.
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 
 #if TEST_STD_VER > 17
   typename S::size_type old_cap = s.capacity();
@@ -39,6 +41,7 @@ test(typename S::size_type min_cap, typename S::size_type erased_index, typename
     assert(s == s0);
     assert(s.capacity() >= res_arg);
     assert(s.capacity() >= s.size());
+    LIBCPP_ASSERT(is_string_asan_correct(s));
 #if TEST_STD_VER > 17
     assert(s.capacity() >= old_cap); // reserve never shrinks as of P0966 (C++20)
 #endif

test/std/strings/basic.string/string.capacity/resize_and_overwrite.pass.cpp

Lines changed: 6 additions & 0 deletions
@@ -19,6 +19,7 @@
 
 #include "make_string.h"
 #include "test_macros.h"
+#include "asan_testing.h"
 
 template <class S>
 constexpr void test_appending(std::size_t k, size_t N, size_t new_capacity) {
@@ -37,6 +38,7 @@ constexpr void test_appending(std::size_t k, size_t N, size_t new_capacity) {
   const S expected = S(k, 'a') + S(N - k, 'b');
   assert(s == expected);
   assert(s.c_str()[N] == '\0');
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 template <class S>
@@ -55,6 +57,7 @@ constexpr void test_truncating(std::size_t o, size_t N) {
   const S expected = S(N - 1, 'a') + S(1, 'b');
   assert(s == expected);
   assert(s.c_str()[N] == '\0');
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 template <class String>
@@ -76,11 +79,14 @@ constexpr bool test() {
 void test_value_categories() {
   std::string s;
   s.resize_and_overwrite(10, [](char*&&, std::size_t&&) { return 0; });
+  LIBCPP_ASSERT(is_string_asan_correct(s));
   s.resize_and_overwrite(10, [](char* const&, const std::size_t&) { return 0; });
+  LIBCPP_ASSERT(is_string_asan_correct(s));
   struct RefQualified {
     int operator()(char*, std::size_t) && { return 0; }
   };
   s.resize_and_overwrite(10, RefQualified{});
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 int main(int, char**) {

test/std/strings/basic.string/string.capacity/resize_size.pass.cpp

Lines changed: 3 additions & 0 deletions
@@ -16,13 +16,15 @@
 
 #include "test_macros.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void test(S s, typename S::size_type n, S expected) {
   if (n <= s.max_size()) {
     s.resize(n);
     LIBCPP_ASSERT(s.__invariants());
     assert(s == expected);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
 #ifndef TEST_HAS_NO_EXCEPTIONS
   else if (!TEST_IS_CONSTANT_EVALUATED) {
@@ -61,6 +63,7 @@ TEST_CONSTEXPR_CXX20 bool test() {
   test_string<std::string>();
 #if TEST_STD_VER >= 11
   test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
+  test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
 #endif
 
   return true;

test/std/strings/basic.string/string.capacity/resize_size_char.pass.cpp

Lines changed: 13 additions & 0 deletions
@@ -16,13 +16,15 @@
 
 #include "test_macros.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void test(S s, typename S::size_type n, typename S::value_type c, S expected) {
   if (n <= s.max_size()) {
     s.resize(n, c);
     LIBCPP_ASSERT(s.__invariants());
     assert(s == expected);
+    LIBCPP_ASSERT(is_string_asan_correct(s));
   }
 #ifndef TEST_HAS_NO_EXCEPTIONS
   else if (!TEST_IS_CONSTANT_EVALUATED) {
@@ -57,12 +59,23 @@ TEST_CONSTEXPR_CXX20 void test_string() {
        'a',
        S("12345678901234567890123456789012345678901234567890aaaaaaaaaa"));
   test(S(), S::npos, 'a', S("not going to happen"));
+  //ASan:
+  test(S(), 21, 'a', S("aaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 22, 'a', S("aaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 23, 'a', S("aaaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 24, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 29, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 30, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 31, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 32, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
+  test(S(), 33, 'a', S("aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"));
 }
 
 TEST_CONSTEXPR_CXX20 bool test() {
   test_string<std::string>();
 #if TEST_STD_VER >= 11
   test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
+  test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
 #endif
 
   return true;

test/std/strings/basic.string/string.capacity/shrink_to_fit.pass.cpp

Lines changed: 9 additions & 0 deletions
@@ -15,6 +15,7 @@
 
 #include "test_macros.h"
 #include "min_allocator.h"
+#include "asan_testing.h"
 
 template <class S>
 TEST_CONSTEXPR_CXX20 void test(S s) {
@@ -25,6 +26,7 @@ TEST_CONSTEXPR_CXX20 void test(S s) {
   assert(s == s0);
   assert(s.capacity() <= old_cap);
   assert(s.capacity() >= s.size());
+  LIBCPP_ASSERT(is_string_asan_correct(s));
 }
 
 template <class S>
@@ -43,12 +45,19 @@ TEST_CONSTEXPR_CXX20 void test_string() {
   s.assign(100, 'a');
   s.erase(50);
   test(s);
+
+  s.assign(100, 'a');
+  for (int i = 0; i <= 9; ++i) {
+    s.erase(90 - 10 * i);
+    test(s);
+  }
 }
 
 TEST_CONSTEXPR_CXX20 bool test() {
   test_string<std::string>();
 #if TEST_STD_VER >= 11
   test_string<std::basic_string<char, std::char_traits<char>, min_allocator<char>>>();
+  test_string<std::basic_string<char, std::char_traits<char>, safe_allocator<char>>>();
 #endif
 
   return true;
