-
Notifications
You must be signed in to change notification settings - Fork 14.6k
[asan] Fix unknown-crash
being reported for multi-byte errors, and incorrect memory access addresses being reported
#144480
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[asan] Fix unknown-crash
being reported for multi-byte errors, and incorrect memory access addresses being reported
#144480
Conversation
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
unknown-crash
reported for multi-byte errorsunknown-crash
reported for multi-byte errors
0a9f98e
to
4a46b7a
Compare
@llvm/pr-subscribers-compiler-rt-sanitizer Author: Wern (wxwern) ChangesGiven that a reported error by asan spans multiple bytes, asan may flag the error as an This error can be reproduced via a partial buffer overflow (on gcc), which reports https://godbolt.org/z/abrjrvnzj
This is due to a flawed heuristic in
The above example doesn't reproduce the issue on clang as it reports errors via different pathways:
This behavior appears to be identical for all past versions tested. I'm not aware of a way to replicate this specific issue with clang, though it might have impacted error reporting in other areas. This patch resolves this issue via a linear scan of applicable shadow bytes (instead of the original heuristic, which, at best, only increments the shadow byte address by 1 for these scenarios). Full diff: https://github.com/llvm/llvm-project/pull/144480.diff 1 Files Affected:
diff --git a/compiler-rt/lib/asan/asan_errors.cpp b/compiler-rt/lib/asan/asan_errors.cpp
index 2a207cd06ccac..9e109c0895589 100644
--- a/compiler-rt/lib/asan/asan_errors.cpp
+++ b/compiler-rt/lib/asan/asan_errors.cpp
@@ -437,8 +437,11 @@ ErrorGeneric::ErrorGeneric(u32 tid, uptr pc_, uptr bp_, uptr sp_, uptr addr,
bug_descr = "unknown-crash";
if (AddrIsInMem(addr)) {
u8 *shadow_addr = (u8 *)MemToShadow(addr);
- // If we are accessing 16 bytes, look at the second shadow byte.
- if (*shadow_addr == 0 && access_size > ASAN_SHADOW_GRANULARITY)
+ u8 *shadow_addr_upper_bound =
+ shadow_addr + (1 + ((access_size - 1) / ASAN_SHADOW_GRANULARITY));
+ // If the read could span multiple shadow bytes,
+ // do a sequential scan and look for the first bad shadow byte.
+ while (*shadow_addr == 0 && shadow_addr < shadow_addr_upper_bound)
shadow_addr++;
// If we are in the partial right redzone, look at the next shadow byte.
if (*shadow_addr > 0 && *shadow_addr < 128) shadow_addr++;
|
4a46b7a
to
69f24b6
Compare
We need a test for that |
You can probably trigger that path through ACCESS_MEMORY_RANGE and INTERCEPTORs? |
69f24b6
to
f39a0fe
Compare
Thanks, will look into it. I've not written tests yet as I haven't found a way to reproduce this via |
This comment was marked as outdated.
This comment was marked as outdated.
63f33c6
to
bed1f80
Compare
unknown-crash
reported for multi-byte errorsunknown-crash
reported for multi-byte errors and incorrect addresses
fec87fc
to
b4754b8
Compare
unknown-crash
reported for multi-byte errors and incorrect addressesunknown-crash
being reported for multi-byte errors, and incorrect memory access addresses being reported
@vitalybuka I've made some updates, and the PR description has been updated with more details and findings. Please do let me know if they're alright, thanks! |
b4754b8
to
57dc11a
Compare
Ping |
2 similar comments
Ping |
Ping |
Ping @vitalybuka @ramosian-glider, could someone please look into this? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add newlines at EOF for the tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please merge the tests and just set the different defines on the clang command line (and use different FileCheck prefixes)
ACCESS_MEMORY_RANGE defined in asan_interceptors_memintrinsics.h reports the poisoned address (__bad), instead of the start address (__offset) during a memory access to ReportGenericError. We can determine that the latter (__offset) is the intended interpretation, as most error descriptions are decided by treating the given address as a start address (for example, see: PrintAccessAndVarIntersection in asan_descriptions.cpp, which decides whether a variable underflows or overflows depending on the given addr and access_size). GCC also uses the latter interpretation. For instance, in buffer overflows, it appears to do its own processing, and will report the start address of an overflowing read to ASan. This is in contrast to Clang, which uses __asan_memcpy directly. This patch fixes the above issue. Existing tests previously assumed and check for the former incorrect behaviour. The error descriptions in those tests have thus been corrected.
Given that a reported error by ASan spans multiple bytes, ASan may flag the error as an 'unknown-crash' instead of the appropriate error name. This error can be reproduced via a partial buffer overflow (any GCC, or after performing the patch in the previous commit to Clang). They'll report 'unknown-crash' instead of 'stack-buffer-overflow' for the below: # minimal reprod # https://godbolt.org/z/abrjrvnzj # # gcc -fsanitize=address reprod.c struct X { char bytes[16]; }; __attribute__((noinline)) struct X out_of_bounds() { volatile char bytes[16]; struct X* x_ptr = (struct X*)(bytes + 2); return *x_ptr; } int main() { struct X x = out_of_bounds(); return x.bytes[0]; } This is due to a flawed heuristic in asan_errors.cpp, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either: exactly 8 bytes, or 16 and more bytes. This bug was previously hidden from Clang (but has always been present in GCC) until the previous commit's fix on address reporting. Specifically, ACCESS_MEMORY_RANGE in ASan previously reports the first poisoned byte (instead of the start address, like in GCC). This masked the above bug from occuring, as it coincidentally guarantees the heuristic will always work, with slightly inaccurate reports. This patch resolves this issue via a linear scan of applicable shadow bytes (instead of the original heuristic, which, at best, only increments the shadow byte address by 1 for these scenarios).
57dc11a
to
334499e
Compare
ReportStringFunctionSizeOverflow(__offset, __size, &stack); \ | ||
} \ | ||
if (UNLIKELY(!QuickCheckForUnpoisonedRegion(__offset, __size)) && \ | ||
(__bad = __asan_region_is_poisoned(__offset, __size))) { \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This __bad
is now unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose it's fair to remove __bad
entirely then?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing seems reasonable to me. No reason to have unused variables afaict :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it work if the changes in ErrorGeneric::ErrorGeneric were replaced with a "one-line" (*) change:
438 if (AddrIsInMem(addr)) {
439 + addr = __asan_region_is_poisoned(addr, access_size);
...
?
(*) We don't really want to reuse the addr
variable but it conceptually the same
// We use the MEM_TO_SHADOW macro for the upper bound above instead of | ||
// MemToShadow to skip the assertion that (addr + access_size) is within | ||
// the valid memory range. The validity of the shadow address is checked | ||
// via AddrIsInShadow in the while loop below. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dismissible nit - since this mentions the variables above, I would recommend having this before those variables are declared, to make the code easier to read. Otherwise, reading the comments requires backtracking to earlier lines to get the full context.
In other words, I recommend having this before the declaration of u8 *shadow_addr
:-)
Isn't this what we had before the change by passing in |
You can test this locally with the following command:git-clang-format --diff HEAD~1 HEAD --extensions h,cpp,c -- compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp compiler-rt/lib/asan/asan_errors.cpp compiler-rt/lib/asan/asan_interceptors_memintrinsics.h compiler-rt/test/asan/TestCases/strcasestr-1.c compiler-rt/test/asan/TestCases/strcasestr-2.c compiler-rt/test/asan/TestCases/strcspn-1.c compiler-rt/test/asan/TestCases/strcspn-2.c compiler-rt/test/asan/TestCases/strpbrk-1.c compiler-rt/test/asan/TestCases/strpbrk-2.c compiler-rt/test/asan/TestCases/strspn-1.c compiler-rt/test/asan/TestCases/strspn-2.c compiler-rt/test/asan/TestCases/strstr-1.c compiler-rt/test/asan/TestCases/strstr-2.c compiler-rt/test/asan/TestCases/strtok.c compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp compiler-rt/test/asan/TestCases/heap-overflow-large-read.cpp View the diff from clang-format here.diff --git a/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp b/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp
index 566b1158a..51fdf56d4 100644
--- a/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp
+++ b/compiler-rt/test/asan/TestCases/heap-overflow-large-offset.cpp
@@ -6,9 +6,9 @@
// RUN: not %run %t 100 2>&1 | FileCheck %s
// RUN: not %run %t 10000 2>&1 | FileCheck %s
+#include <stdio.h>
#include <stdlib.h>
#include <string.h>
-#include <stdio.h>
int main(int argc, char *argv[]) {
fprintf(stderr, "main\n");
diff --git a/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp b/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp
index 0e10d673c..e382517b2 100644
--- a/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp
+++ b/compiler-rt/test/asan/TestCases/stack-buffer-overflow-partial.cpp
@@ -24,9 +24,9 @@
// RUN: not %run %t 13 2>&1 | FileCheck %s
// RUN: not %run %t 19 2>&1 | FileCheck %s
-#include <stdlib.h>
#include <assert.h>
#include <stdio.h>
+#include <stdlib.h>
struct X {
char bytes[READ_SIZE];
@@ -34,7 +34,7 @@ struct X {
__attribute__((noinline)) struct X out_of_bounds(int offset) {
volatile char bytes[STACK_ALLOC_SIZE];
- struct X* x_ptr = (struct X*)(bytes + offset);
+ struct X *x_ptr = (struct X *)(bytes + offset);
return *x_ptr;
}
|
Please shorten the first line of the commit message a bit |
For
Yes, there are also other callsites, which is where the test behavior diverges, but it would still be preferable to avoid reimplementing parts of __asan_region_is_poisoned in ErrorGeneric (assuming this is what the change is meant to do). |
With the current patch set, I'm getting
|
Wouldn't it be better to just change these callsites then? |
Per the patch description, it is desirable to change ErrorGeneric's parameter for consistency with other error descriptions. |
I agree with this, and this change should work in most cases. However, due to how the fast check in As a user, I would personally prefer this case to have a more user friendly error summary, since it's reasonably easy to tell it's an overflow, but if less code duplication is desired we can probably sacrifice accuracy for "rarer" cases.
I can only replicate this if the patch is partially applied (i.e., change in I originally considered these two changes distinct issues, since it would otherwise not be obvious that it helps resolves the |
From the discussions here so far I believe the following would be most adequate and achieve the same goal:
*assuming it's okay to have the caveat mentioned above I'll proceed with restructuring this PR if there're no objections. |
Hmm, wild-pointer is a good point(er). I think wild pointers would be better addressed in a followup patch, though. As is, even with the current patch, wild pointers are not entirely fixed e.g., if I change wild-pointer.cpp to have
Sorry, I must have had a dirty checkout. Please ignore. |
After further testing, it seems changing it to Without this patch, it originally outputs something like:
The patch with the while-loop allows it to be reported as a It looks something like:
However, changing it to It looks something like:
I'm not really sure what to do in this case - should I leave it as the while-loop, remove the wild pointer test, or maybe something else? |
This comprises of a fix for two intertwined bugs in ASan. The two changes would need to be simultaneously merged to not break any functionality.
unknown-crash
reported for multi-byte errorsGiven that a reported error by ASan spans multiple bytes, ASan may flag the error as an
unknown-crash
instead of the appropriate error name.This error can be reproduced via a partial buffer overflow (on GCC, not Clang*), which reports
unknown-crash
instead ofstack-buffer-overflow
for the below:https://godbolt.org/z/abrjrvnzj
This is due to a flawed heuristic in
asan_errors.cpp
, which won't always locate the appropriate shadow byte that would indicate a corresponding error. This can happen for any reported errors which span either:Reproducibility on Clang
The above example doesn't reproduce the issue on Clang due to another bug* masking this one. Specifically:
GCC-compiled binaries report the starting address and size of the failing read attempt to ASan.
Clang-compiled binaries use
__asan_memcpy
, which directly highlights the first byte access that overflows the buffer to ASan. This thus coincidentally allows the heuristic to always work. This appears to be an incorrect interpretation.In order to replicate this bug on Clang (so that we can do tests), another bug in ASan must first be fixed, as below:
Incorrect reported address in
ACCESS_MEMORY_RANGE
ACCESS_MEMORY_RANGE
defined inasan_interceptors_memintrinsics.h
reports the poisoned address (__bad
) instead of the memory access start address (__offset
) toReportGenericError
. (link).We can determine that the latter (reporting
__offset
) should be the intended interpretation, as most error descriptions are decided by treating the givenaddr
as a start address. For example, see:PrintAccessAndVarIntersection
inasan_descriptions.cpp
- it usesaddr
andaccess_size
to determine whether a variable access overflows/underflows/etc. (link).GCC also uses the latter interpretation, as mentioned above.
Existing tests previously assumed and check for the former incorrect interpretation. Corrections are made to update their output checks.
Performing this fix will result in the
unknown-crash
bug visible in GCC-compiled binaries to surface on Clang ones as well.