[asan] Fix `unknown-crash` being reported for multi-byte errors, and incorrect memory access addresses being reported #144480

wxwern · 2025-06-17T08:45:06Z

This PR fixes two intertwined bugs in ASan. The two changes need to be merged together to avoid breaking existing functionality.

`unknown-crash` reported for multi-byte errors

When an error reported by ASan spans multiple bytes, ASan may flag it as unknown-crash instead of the appropriate error name.

This error can be reproduced via a partial buffer overflow (on GCC, not Clang*), which reports unknown-crash instead of stack-buffer-overflow for the below:

https://godbolt.org/z/abrjrvnzj

// minimal repro (should occur on gcc-7 - gcc-15, x86_64)
//
// gcc -fsanitize=address reprod.c

struct X {
    char bytes[16];
};

__attribute__((noinline)) struct X out_of_bounds() {
    volatile char bytes[16];
    struct X* x_ptr = (struct X*)(bytes + 2);
    return *x_ptr;
}

int main() {
    struct X x = out_of_bounds();
    return x.bytes[0];
}

This is due to a flawed heuristic in asan_errors.cpp, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either:

exactly 8 bytes, or
16 or more bytes.
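
To make the failure mode concrete, the pre-patch heuristic can be modelled on a toy shadow array. This is a simplified sketch, not the runtime code: `kGranularity`, `OldHeuristicIndex` and the `std::vector` standing in for shadow memory are illustrative assumptions, and only the two checks are taken from the pre-patch lines of asan_errors.cpp.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constant: one shadow byte describes 8 application bytes.
constexpr std::size_t kGranularity = 8;

// Simplified model of the pre-patch heuristic. `shadow` stands in for shadow
// memory, `addr` is an offset into the modelled application memory, and the
// return value is the index of the shadow byte that would be used to name
// the error. The caller must keep the access inside the modelled range.
std::size_t OldHeuristicIndex(const std::vector<std::uint8_t>& shadow,
                              std::size_t addr, std::size_t access_size) {
  std::size_t i = addr / kGranularity;
  // "If we are accessing 16 bytes, look at the second shadow byte."
  if (shadow[i] == 0 && access_size > kGranularity) ++i;
  // "If we are in the partial right redzone, look at the next shadow byte."
  if (shadow[i] > 0 && shadow[i] < 128) ++i;
  return i;
}

// Example shadow for a 16-byte stack buffer followed by a right redzone:
//   {0x00, 0x00, 0xf3, 0xf3}
// A 16-byte access at offset 2 yields index 1 (shadow value 0x00), and an
// 8-byte access at offset 10 also yields index 1, so neither reaches 0xf3.
```

In this model the selected shadow byte is 0 in both cases, which carries no information about the kind of error. If the reported address is already the first poisoned byte (offset 16), the model returns index 2 and the redzone value is found.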

Reproducibility on Clang

The above example doesn't reproduce the issue on Clang due to another bug* masking this one. Specifically:

GCC-compiled binaries report the starting address and size of the failing read attempt to ASan.
Clang-compiled binaries use __asan_memcpy, which directly highlights the first byte access that overflows the buffer to ASan. This thus coincidentally allows the heuristic to always work. This appears to be an incorrect interpretation.

In order to replicate this bug on Clang (so that we can do tests), another bug in ASan must first be fixed, as below:

Incorrect reported address in `ACCESS_MEMORY_RANGE`

ACCESS_MEMORY_RANGE defined in asan_interceptors_memintrinsics.h reports the poisoned address (__bad) instead of the memory access start address (__offset) to ReportGenericError. (link).

We can determine that the latter (reporting __offset) should be the intended interpretation, as most error descriptions are decided by treating the given addr as a start address. For example, see: PrintAccessAndVarIntersection in asan_descriptions.cpp - it uses addr and access_size to determine whether a variable access overflows/underflows/etc. (link).

GCC also uses the latter interpretation, as mentioned above.
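
The effect of the two interpretations on the description can be illustrated with a reduced model of this kind of classification. `DescribeAccess` is a hypothetical name, and the logic is a simplification written for this example; the function in asan_descriptions.cpp takes more context into account.

```cpp
#include <cstddef>
#include <string>

// Hypothetical, reduced classifier: the variable occupies
// [var_beg, var_beg + var_size) and the access covers
// [addr, addr + access_size). `addr` is treated as the START of the access.
std::string DescribeAccess(std::size_t var_beg, std::size_t var_size,
                           std::size_t addr, std::size_t access_size) {
  const std::size_t var_end = var_beg + var_size;
  const std::size_t addr_end = addr + access_size;
  if (addr >= var_beg) {
    if (addr_end <= var_end) return "is inside";
    if (addr < var_end) return "partially overflows";
    return "overflows";
  }
  if (addr_end > var_beg) return "partially underflows";
  return "underflows";
}
```

For a 16-byte variable at offset 0 and a 16-byte read starting at offset 2, passing the start address gives "partially overflows". Passing the first poisoned address (offset 16) with the same size gives "overflows", which describes a different access from the one that actually happened.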

Existing tests assumed and checked for the former, incorrect interpretation. Their expected output has been corrected accordingly.

Applying this fix causes the unknown-crash bug, already visible in GCC-compiled binaries, to surface in Clang-compiled ones as well.

llvmbot · 2025-06-18T02:24:24Z

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Wern (wxwern)

Changes

When an error reported by asan spans multiple bytes, asan may flag it as unknown-crash instead of the appropriate error name.

This error can be reproduced via a partial buffer overflow (on gcc), which reports unknown-crash instead of stack-buffer-overflow for the below:

https://godbolt.org/z/abrjrvnzj

// minimal repro (should occur on gcc-7 - gcc-15)
//
// gcc -fsanitize=address reprod.c

struct X {
    char bytes[16];
};

__attribute__((noinline)) struct X out_of_bounds() {
    volatile char bytes[16];
    struct X* x_ptr = (struct X*)(bytes + 2);
    return *x_ptr;
}

int main() {
    struct X x = out_of_bounds();
    return x.bytes[0];
}

This is due to a flawed heuristic in asan_errors.cpp, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either:

exactly 8 bytes, or
16 or more bytes.

The above example doesn't reproduce the issue on clang as it reports errors via different pathways:

gcc-compiled binaries report the starting address and size of the failing read attempt to asan.
clang-compiled binaries highlight the first byte access that overflows the buffer to asan.

Note: out-of-scope, but this is also possibly misleading, as it still reports the full size of the read attempt, paired with an address that's not the start of the read.

This behavior appears to be identical for all past versions tested. I'm not aware of a way to replicate this specific issue with clang, though it might have impacted error reporting in other areas.

This patch resolves this issue via a linear scan of applicable shadow bytes (instead of the original heuristic, which, at best, only increments the shadow byte address by 1 for these scenarios).

Full diff: https://github.com/llvm/llvm-project/pull/144480.diff

1 Files Affected:

(modified) compiler-rt/lib/asan/asan_errors.cpp (+5-2)

diff --git a/compiler-rt/lib/asan/asan_errors.cpp b/compiler-rt/lib/asan/asan_errors.cpp
index 2a207cd06ccac..9e109c0895589 100644
--- a/compiler-rt/lib/asan/asan_errors.cpp
+++ b/compiler-rt/lib/asan/asan_errors.cpp
@@ -437,8 +437,11 @@ ErrorGeneric::ErrorGeneric(u32 tid, uptr pc_, uptr bp_, uptr sp_, uptr addr,
     bug_descr = "unknown-crash";
     if (AddrIsInMem(addr)) {
       u8 *shadow_addr = (u8 *)MemToShadow(addr);
-      // If we are accessing 16 bytes, look at the second shadow byte.
-      if (*shadow_addr == 0 && access_size > ASAN_SHADOW_GRANULARITY)
+      u8 *shadow_addr_upper_bound =
+          shadow_addr + (1 + ((access_size - 1) / ASAN_SHADOW_GRANULARITY));
+      // If the read could span multiple shadow bytes,
+      // do a sequential scan and look for the first bad shadow byte.
+      while (*shadow_addr == 0 && shadow_addr < shadow_addr_upper_bound)
         shadow_addr++;
       // If we are in the partial right redzone, look at the next shadow byte.
       if (*shadow_addr > 0 && *shadow_addr < 128) shadow_addr++;
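
The scan introduced by this diff can be sketched on a toy shadow array. This is a model under stated assumptions rather than the runtime code: `kGranularity`, `LinearScanIndex` and the `std::vector` are illustrative names, the bound is clamped so the model never indexes past the vector, and the loop tests the bound before reading the shadow byte, which is a choice made for this model.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constant: one shadow byte describes 8 application bytes.
constexpr std::size_t kGranularity = 8;

// Model of the patched lookup. Requires access_size >= 1 and a non-empty
// `shadow`; returns the index of the shadow byte used to name the error.
std::size_t LinearScanIndex(const std::vector<std::uint8_t>& shadow,
                            std::size_t addr, std::size_t access_size) {
  std::size_t i = addr / kGranularity;
  // Same bound as the patch: start + 1 + (access_size - 1) / granularity,
  // clamped here to the last valid index of the model.
  const std::size_t bound =
      std::min(i + 1 + (access_size - 1) / kGranularity, shadow.size() - 1);
  // Sequential scan for the first non-zero shadow byte.
  while (i < bound && shadow[i] == 0) ++i;
  // Unchanged follow-up step: a partially addressable granule means the
  // interesting value is in the next shadow byte.
  if (shadow[i] > 0 && shadow[i] < 128 && i + 1 < shadow.size()) ++i;
  return i;
}
```

With the shadow `{0x00, 0x00, 0xf3, 0xf3}`, a 16-byte access at offset 2 and an 8-byte access at offset 10 both end on index 2, the first redzone byte.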

compiler-rt/lib/asan/asan_errors.cpp

vitalybuka · 2025-06-18T02:42:45Z

We need a test for that

vitalybuka · 2025-06-18T02:46:33Z

> The above example doesn't reproduce the issue on clang as it reports errors via different pathways:

You can probably trigger that path through ACCESS_MEMORY_RANGE and INTERCEPTORs?

wxwern · 2025-06-18T03:11:51Z

> We need a test for that

> > The above example doesn't reproduce the issue on clang as it reports errors via different pathways:
>
> You can probably trigger that path through ACCESS_MEMORY_RANGE and INTERCEPTORs?

Thanks, will look into it. I haven't written tests yet because I haven't found a way to reproduce this with Clang; I'll add them once I have a reproducible example.

wxwern · 2025-06-26T09:19:12Z

@vitalybuka I've made some updates, and the PR description has been updated with more details and findings. Please do let me know if they're alright, thanks!

wxwern · 2025-07-03T07:10:08Z

Ping

wxwern · 2025-07-10T06:35:36Z

Ping

wxwern · 2025-07-18T02:30:30Z

Ping

wxwern · 2025-07-25T10:09:16Z

Ping @vitalybuka @ramosian-glider, could someone please look into this?

fmayer

Please add newlines at EOF for the tests

compiler-rt/lib/asan/asan_errors.cpp

fmayer

Please merge the tests and just set the different defines on the clang command line (and use different FileCheck prefixes)

ACCESS_MEMORY_RANGE, defined in asan_interceptors_memintrinsics.h, reports the poisoned address (__bad) to ReportGenericError instead of the start address of the memory access (__offset).

We can determine that the latter (__offset) is the intended interpretation, as most error descriptions are decided by treating the given address as a start address (for example, see PrintAccessAndVarIntersection in asan_descriptions.cpp, which decides whether a variable underflows or overflows depending on the given addr and access_size).

GCC also uses the latter interpretation. For instance, in buffer overflows, it appears to do its own processing and reports the start address of an overflowing read to ASan. This is in contrast to Clang, which uses __asan_memcpy directly.

This patch fixes the above issue. Existing tests assumed and checked for the former, incorrect behaviour. The error descriptions in those tests have thus been corrected.

When an error reported by ASan spans multiple bytes, ASan may flag it as 'unknown-crash' instead of the appropriate error name.

This error can be reproduced via a partial buffer overflow (with any GCC, or with Clang after applying the patch in the previous commit). Both report 'unknown-crash' instead of 'stack-buffer-overflow' for the below:

// minimal repro
// https://godbolt.org/z/abrjrvnzj
//
// gcc -fsanitize=address reprod.c

struct X {
    char bytes[16];
};

__attribute__((noinline)) struct X out_of_bounds() {
    volatile char bytes[16];
    struct X* x_ptr = (struct X*)(bytes + 2);
    return *x_ptr;
}

int main() {
    struct X x = out_of_bounds();
    return x.bytes[0];
}

This is due to a flawed heuristic in asan_errors.cpp, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either:

exactly 8 bytes, or
16 or more bytes.

This bug was previously hidden on Clang (but has always been present on GCC) until the previous commit's fix to address reporting. Specifically, ACCESS_MEMORY_RANGE in ASan previously reported the first poisoned byte (instead of the start address, as GCC does). This masked the above bug from occurring, as it coincidentally guaranteed that the heuristic would always work, at the cost of slightly inaccurate reports.

This patch resolves this issue via a linear scan of applicable shadow bytes (instead of the original heuristic, which, at best, only increments the shadow byte address by 1 for these scenarios).

fmayer · 2025-07-28T16:37:31Z

compiler-rt/lib/asan/asan_interceptors_memintrinsics.h

+      ReportStringFunctionSizeOverflow(__offset, __size, &stack);            \
+    }                                                                        \
+    if (UNLIKELY(!QuickCheckForUnpoisonedRegion(__offset, __size)) &&        \
+        (__bad = __asan_region_is_poisoned(__offset, __size))) {             \


This __bad is now unused

I suppose it's fair to remove __bad entirely then?

Removing seems reasonable to me. No reason to have unused variables afaict :-)

thurstond

Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:

438   if (AddrIsInMem(addr)) {
439 +   addr = __asan_region_is_poisoned(addr, access_size);
  ...

?

(*) We don't really want to reuse the addr variable but it is conceptually the same

davidmrdavid · 2025-07-28T17:27:52Z

compiler-rt/lib/asan/asan_errors.cpp

+      // We use the MEM_TO_SHADOW macro for the upper bound above instead of
+      // MemToShadow to skip the assertion that (addr + access_size) is within
+      // the valid memory range. The validity of the shadow address is checked
+      // via AddrIsInShadow in the while loop below.


dismissible nit - since this mentions the variables above, I would recommend having this before those variables are declared, to make the code easier to read. Otherwise, reading the comments requires backtracking to earlier lines to get the full context.

In other words, I recommend having this before the declaration of u8 *shadow_addr :-)

fmayer · 2025-07-28T17:29:31Z

> Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:
>
> 438   if (AddrIsInMem(addr)) {
> 439 +   addr = __asan_region_is_poisoned(addr, access_size);
>   ...
>
> ?
>
> (*) We don't really want to reuse the addr variable but it is conceptually the same

Isn't this what we had before the change by passing in __bad? Or are there other callsites?

github-actions · 2025-07-28T17:33:43Z

⚠️ C/C++ code formatter, clang-format found issues in your code. ⚠️

You can test this locally with the following command:

git-clang-format --diff HEAD~1 HEAD --extensions h,cpp,c -- compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp compiler-rt/lib/asan/asan_errors.cpp compiler-rt/lib/asan/asan_interceptors_memintrinsics.h compiler-rt/test/asan/TestCases/strcasestr-1.c compiler-rt/test/asan/TestCases/strcasestr-2.c compiler-rt/test/asan/TestCases/strcspn-1.c compiler-rt/test/asan/TestCases/strcspn-2.c compiler-rt/test/asan/TestCases/strpbrk-1.c compiler-rt/test/asan/TestCases/strpbrk-2.c compiler-rt/test/asan/TestCases/strspn-1.c compiler-rt/test/asan/TestCases/strspn-2.c compiler-rt/test/asan/TestCases/strstr-1.c compiler-rt/test/asan/TestCases/strstr-2.c compiler-rt/test/asan/TestCases/strtok.c compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp compiler-rt/test/asan/TestCases/heap-overflow-large-read.cpp

View the diff from clang-format here.

diff --git a/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp b/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp
index 566b1158a..51fdf56d4 100644
--- a/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp
+++ b/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp
@@ -6,9 +6,9 @@
 // RUN: not %run %t 100 2>&1 | FileCheck %s
 // RUN: not %run %t 10000 2>&1 | FileCheck %s
 
+#include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
-#include <stdio.h>
 
 int main(int argc, char *argv[]) {
   fprintf(stderr, "main\n");
diff --git a/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp b/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp
index 0e10d673c..e382517b2 100644
--- a/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp
+++ b/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp
@@ -24,9 +24,9 @@
 // RUN: not %run %t 13 2>&1 | FileCheck %s
 // RUN: not %run %t 19 2>&1 | FileCheck %s
 
-#include <stdlib.h>
 #include <assert.h>
 #include <stdio.h>
+#include <stdlib.h>
 
 struct X {
   char bytes[READ_SIZE];
@@ -34,7 +34,7 @@ struct X {
 
 __attribute__((noinline)) struct X out_of_bounds(int offset) {
   volatile char bytes[STACK_ALLOC_SIZE];
-  struct X* x_ptr = (struct X*)(bytes + offset);
+  struct X *x_ptr = (struct X *)(bytes + offset);
   return *x_ptr;
 }

fmayer · 2025-07-28T17:35:22Z

Please shorten the first line of the commit message a bit

thurstond · 2025-07-28T17:36:09Z

> > Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:
> >
> > 438   if (AddrIsInMem(addr)) {
> > 439 +   addr = __asan_region_is_poisoned(addr, access_size);
> >   ...
> >
> > ?
> >
> > (*) We don't really want to reuse the addr variable but it is conceptually the same
>
> Isn't this what we had before the change by passing in __bad?

For ACCESS_MEMORY_RANGE, it would make the behavior exactly the same as before, which makes it easier to reason that the change is not bad.

> Or are there other callsites

Yes, there are also other callsites, which is where the test behavior diverges, but it would still be preferable to avoid reimplementing parts of __asan_region_is_poisoned in ErrorGeneric (assuming this is what the change is meant to do).

thurstond · 2025-07-28T18:20:54Z

With the current patch set, I'm getting check-asan failures:

/usr/local/google/home/thurston/llvm-projectP/compiler-rt/lib/asan/tests/asan_mem_test.cpp:46
Death test: MEMSET(array + 1, element, size + sizeof(T))
    Result: died but not with expected error.
  Expected: contains regular expression "buffer-overflow.*WRITE.*located 0 bytes after"
Actual msg:
[  DEATH   ] =================================================================
[  DEATH   ] ==242206==ERROR: AddressSanitizer: unknown-crash on address 0xf2603288 at pc 0x5669907f bp 0xffe7cbc8 sp 0xffe7c7a0
[  DEATH   ] WRITE of size 2056 at 0xf2603288 thread T0

...

Failed Tests (64):
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/BCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/MAYBE_StrNDupOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/MemCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/MemCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/MemMoveOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/MemSetOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrCatOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrChrAndIndexOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrDupOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrNCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrNCatOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrNCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrNCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-calls-Test/AddressSanitizer/StrNLenOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/BCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/MAYBE_StrNDupOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/MemCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/MemCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/MemMoveOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/MemSetOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrCatOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrChrAndIndexOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrDupOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrNCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrNCatOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrNCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrNCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-i386-inline-Test/AddressSanitizer/StrNLenOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/BCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/MAYBE_StrNDupOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/MemCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/MemCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/MemMoveOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/MemSetOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrCatOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrChrAndIndexOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrDupOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrNCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrNCatOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrNCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrNCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-calls-Test/AddressSanitizer/StrNLenOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/BCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/MAYBE_StrNDupOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/MemCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/MemCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/MemMoveOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/MemSetOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrCatOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrChrAndIndexOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrDupOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrNCaseCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrNCatOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrNCmpOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrNCpyOOBTest
  AddressSanitizer-Unit :: ./Asan-x86_64-inline-Test/AddressSanitizer/StrNLenOOBTest

fmayer · 2025-07-28T20:17:10Z

> > > Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:
> > >
> > > 438   if (AddrIsInMem(addr)) {
> > > 439 +   addr = __asan_region_is_poisoned(addr, access_size);
> > >   ...
> > >
> > > ?
> > >
> > > (*) We don't really want to reuse the addr variable but it is conceptually the same
> >
> > Isn't this what we had before the change by passing in __bad?
>
> For ACCESS_MEMORY_RANGE, it would make the behavior exactly the same as before, which makes it easier to reason that the change is not bad.
>
> > Or are there other callsites
>
> Yes, there are also other callsites, which is where the test behavior diverges, but it would still be preferable to avoid reimplementing parts of __asan_region_is_poisoned in ErrorGeneric (assuming this is what the change is meant to do).

Wouldn't it be better to just change these callsites then?

thurstond · 2025-07-28T20:50:01Z

> Wouldn't it be better to just change these callsites then?

Per the patch description, it is desirable to change ErrorGeneric's parameter for consistency with other error descriptions.

wxwern · 2025-07-29T02:37:35Z

@thurstond

> Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:
>
> 438   if (AddrIsInMem(addr)) {
> 439 +   addr = __asan_region_is_poisoned(addr, access_size);
>   ...
>
> ?
>
> (*) We don't really want to reuse the addr variable but it is conceptually the same
>
> For ACCESS_MEMORY_RANGE, it would make the behavior exactly the same as before, which makes it easier to reason that the change is not bad.

I agree with this, and this change should work in most cases.

However, due to how the fast check in __asan_region_is_poisoned works (it reports the end address as poisoned if the end address is not in memory), heap-overflow-large-read.cpp reports an unknown-crash with a wild pointer error (matching prior behaviour where the filename was wild-pointer.cpp).

As a user, I would personally prefer this case to have a more user friendly error summary, since it's reasonably easy to tell it's an overflow, but if less code duplication is desired we can probably sacrifice accuracy for "rarer" cases.
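
The early-exit behaviour described above can be sketched with a toy model. This is an assumption-laden illustration based only on the description in this comment, not the runtime implementation: `FirstBadAddress`, `Addr`, `kGranularity` and the vector standing in for shadow memory are hypothetical names, and "in memory" is modelled as "covered by the vector".

```cpp
#include <cstdint>
#include <vector>

using Addr = std::uint64_t;

// Illustrative constant: one shadow byte describes 8 application bytes.
constexpr Addr kGranularity = 8;

// Toy model: returns the first "bad" address in [beg, beg + size), or 0 if
// none is found. In this model 0 doubles as "nothing poisoned", so it cannot
// represent a poisoned byte at address 0.
Addr FirstBadAddress(const std::vector<std::uint8_t>& shadow, Addr beg,
                     Addr size) {
  if (size == 0) return 0;
  const Addr mem_limit = shadow.size() * kGranularity;
  const Addr end = beg + size;  // may wrap around; handled below
  if (beg >= mem_limit) return beg;
  // Fast check: if the end address is outside modelled memory, it is
  // reported as the bad address and the shadow bytes are never scanned.
  if (end < beg || end >= mem_limit) return end;
  for (Addr a = beg; a < end; ++a) {
    const std::uint8_t s = shadow[a / kGranularity];
    // s == 0: fully addressable; 1..7: only the first s bytes addressable;
    // s >= 128: poisoned (redzone or similar).
    if (s != 0 && (s >= 128 || (a % kGranularity) >= s)) return a;
  }
  return 0;
}
```

With the shadow `{0x00, 0x00, 0xfa, 0xfa}`, a 16-byte read at offset 2 returns 16, the first redzone byte. A read of size 0x4567890123456789 from offset 0 returns 0x4567890123456789 instead, an address with no shadow byte that could classify the error as a heap overflow.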

> With the current patch set, I'm getting check-asan failures:

I can only replicate this if the patch is partially applied (i.e., with the change to ACCESS_MEMORY_RANGE from my first commit, but without the change to ErrorGeneric::ErrorGeneric from my second commit), which is expected, as the second commit fixes the unknown-crash reports.

I originally considered these two changes distinct issues, since it would otherwise not be obvious that it helps resolve the unknown-crash issue in GCC. Please let me know if it's reasonable to squash them.

wxwern · 2025-07-29T02:48:47Z

From the discussions here so far I believe the following would be most adequate and achieve the same goal:

ACCESS_MEMORY_RANGE reports __offset instead of __bad to ErrorGeneric (and __bad is removed as it is now unused)
ErrorGeneric::ErrorGeneric is updated to use __asan_region_is_poisoned for error message classification*, i.e., the one-liner shadow_addr = MemToShadow(__asan_region_is_poisoned(addr, size)), instead of shadow_addr = MemToShadow(addr) and the loop.

*assuming it's okay to have the caveat mentioned above

I'll proceed with restructuring this PR if there're no objections.

thurstond · 2025-07-29T03:29:05Z

> @thurstond
>
> > Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:
> >
> > 438   if (AddrIsInMem(addr)) {
> > 439 +   addr = __asan_region_is_poisoned(addr, access_size);
> >   ...
> >
> > ?
> >
> > (*) We don't really want to reuse the addr variable but it is conceptually the same
>
> I agree with this and it should work in most cases.
>
> However, due to how the fast check in __asan_region_is_poisoned works (it reports the end address as poisoned if the end address is not in memory), heap-overflow-large-read.cpp reports an unknown-crash with a wild pointer error (matching prior behaviour where the filename was wild-pointer.cpp).
>
> As a user, I would personally prefer this case to have a more user friendly error summary, since it's reasonably easy to tell it's an overflow, but if less code duplication is desired we can probably sacrifice accuracy for "rarer" cases.

Hmm, wild-pointer is a good point(er).

I think wild pointers would be better addressed in a followup patch, though. As is, even with the current patch, wild pointers are not entirely fixed; e.g., if I change wild-pointer.cpp to have p = 0xBADBADBADBAD, it will still print "unknown-crash". It's arguably a bit confusing that some wild pointers will print unknown-crash while others will print heap-overflow (and stack-overflow etc. in other cases).

> > With the current patch set, I'm getting check-asan failures:
>
> I can only replicate this if the patch is partially applied (i.e., change in ACCESS_MEMORY_RANGE from my first commit, but not ErrorGeneric::ErrorGeneric in my second commit), which is expected as the second commit fixes unknown-crash reports.

Sorry, I must have had a dirty checkout. Please ignore.

wxwern · 2025-07-29T07:06:24Z

After further testing, it seems changing it to __asan_region_is_poisoned has more side effects than expected on the original wild_pointer.cpp (tentatively renamed to heap-overflow-large-read.cpp).

Without this patch, it originally outputs something like:

ERROR: AddressSanitizer: unknown-crash on address 0x4568018703436799 at pc 0x5e95bdbf4263 bp 0x7ffc7bc601c0 sp 0x7ffc7bc5f980
READ of size 5001116549197948809 at 0x4568018703436799 thread T0
:
:
Address 0x4568018703436799 is a wild pointer inside of access range of size 0x4567890123456789.

The patch with the while-loop allows it to be reported as a heap-buffer-overflow (which is most appropriate).

It looks something like:

ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7921821e0010 at pc 0x55b05df80bfe bp 0x7ffedac15910 sp 0x7ffedac150d8
READ of size 5001116549197948809 at 0x7921821e0010 thread T0
:    
:
Shadow bytes around the buggy address:
  0x7921821dfd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7921821dfe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7921821dfe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7921821dff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7921821dff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x7921821e0000: fa fa[01]fa fa fa 01 fa fa fa fa fa fa fa fa fa
  0x7921821e0080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7921821e0100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7921821e0180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7921821e0200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x7921821e0280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
:
:

However, changing it to __asan_region_is_poisoned causes it to flag unknown-crash: the start address is still on the heap, but __asan_region_is_poisoned bails out early and so does not yield the heap shadow byte in this case. It also no longer outputs "Address ... is a wild pointer", since with the __bad to __offset replacement the provided address (the start address) is no longer a wild pointer (technically it never was, but the code previously treated it as such).

It looks something like:

ERROR: AddressSanitizer: unknown-crash on address 0x772b749e0010 at pc 0x591728c2aafe bp 0x7ffcd48fad20 sp 0x7ffcd48fa4e8
READ of size 5001116549197948809 at 0x772b749e0010 thread T0
:    
:
Shadow bytes around the buggy address:
  0x772b749dfd80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x772b749dfe00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x772b749dfe80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x772b749dff00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x772b749dff80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x772b749e0000: fa fa[01]fa fa fa 01 fa fa fa fa fa fa fa fa fa
  0x772b749e0080: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x772b749e0100: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x772b749e0180: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x772b749e0200: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x772b749e0280: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
:
:

I'm not really sure what to do in this case - should I leave it as the while-loop, remove the wild pointer test, or maybe something else?

wxwern changed the title ~~[libasan] Fix unknown-crash reported for multi-byte errors~~ [asan] Fix unknown-crash reported for multi-byte errors Jun 17, 2025

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch from 0a9f98e to 4a46b7a Compare June 17, 2025 09:06

wxwern marked this pull request as ready for review June 18, 2025 02:23

llvmbot added compiler-rt compiler-rt:asan Address sanitizer compiler-rt:sanitizer labels Jun 18, 2025

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch from 4a46b7a to 69f24b6 Compare June 18, 2025 02:26

vitalybuka self-requested a review June 18, 2025 02:37

vitalybuka reviewed Jun 18, 2025

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch from 69f24b6 to f39a0fe Compare June 18, 2025 03:03

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch 2 times, most recently from 63f33c6 to bed1f80 Compare June 26, 2025 08:36

wxwern changed the title ~~[asan] Fix unknown-crash reported for multi-byte errors~~ [asan] Fix unknown-crash reported for multi-byte errors and incorrect addresses Jun 26, 2025

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch 3 times, most recently from fec87fc to b4754b8 Compare June 26, 2025 09:12

wxwern requested a review from vitalybuka June 26, 2025 09:14

wxwern changed the title ~~[asan] Fix unknown-crash reported for multi-byte errors and incorrect addresses~~ [asan] Fix unknown-crash being reported for multi-byte errors, and incorrect memory access addresses being reported Jun 26, 2025

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch from b4754b8 to 57dc11a Compare July 3, 2025 07:08

fmayer reviewed Jul 25, 2025

fmayer reviewed Jul 25, 2025

wxwern added 3 commits July 28, 2025 12:18

[asan] Reformat spacing

334499e

wxwern force-pushed the fix-unknown-crash-desc-for-multi-byte-err branch from 57dc11a to 334499e Compare July 28, 2025 04:23

wxwern requested a review from fmayer July 28, 2025 04:26

fmayer reviewed Jul 28, 2025

thurstond reviewed Jul 28, 2025

davidmrdavid reviewed Jul 28, 2025

fmayer closed this Jul 28, 2025

fmayer reopened this Jul 28, 2025
