[dfsan] Add test case for sscanf #94700

Merged 1 commit on Jun 7, 2024

Conversation

Contributor

@thurstond thurstond commented Jun 6, 2024

This test case shows a limitation of DFSan's sscanf implementation (introduced in https://reviews.llvm.org/D153775): it simply ignores ordinary characters in the format string, instead of actually comparing them against the input. This may change the semantics of instrumented programs.

Importantly, this also means that DFSan's release_shadow_space.c test, which relies on sscanf to scrape the RSS from /proc/maps output, will incorrectly match lines that don't contain RSS information. As a result, it adds together numbers from irrelevant output (e.g., base addresses), resulting in test flakiness (#91287).

@llvmbot
Member

llvmbot commented Jun 6, 2024

@llvm/pr-subscribers-compiler-rt-sanitizer

Author: Thurston Dang (thurstond)

Changes

This test case shows a limitation of DFSan's sscanf implementation (introduced in https://reviews.llvm.org/D153775): it simply ignores ordinary characters in the format string, instead of actually comparing them against the input. This may change the semantics of instrumented programs.

Importantly, this also means that DFSan's release_shadow_space.c test, which relies on sscanf to scrape the RSS from /proc/maps output, will incorrectly match lines that don't contain RSS information. As a result, it is scraping numbers from irrelevant output (e.g., base addresses), and can therefore result in test flakiness
(#91287).


Full diff: https://github.com/llvm/llvm-project/pull/94700.diff

1 Files Affected:

  • (added) compiler-rt/test/dfsan/sscanf.c (+19)
diff --git a/compiler-rt/test/dfsan/sscanf.c b/compiler-rt/test/dfsan/sscanf.c
new file mode 100644
index 0000000000000..dbc2de4ba96c1
--- /dev/null
+++ b/compiler-rt/test/dfsan/sscanf.c
@@ -0,0 +1,19 @@
+// RUN: %clang_dfsan %s -o %t && %run %t
+// XFAIL: *
+
+#include <assert.h>
+#include <stdio.h>
+
+int main(int argc, char *argv[]) {
+  char buf[256] = "10000000000-100000000000 rw-p 00000000 00:00 0";
+  long rss = 0;
+  // This test exposes a bug in DFSan's sscanf, that leads to flakiness
+  // in release_shadow_space.c (see
+  // https://github.com/llvm/llvm-project/issues/91287)
+  if (sscanf(buf, "Garbage text before, %ld, Garbage text after", &rss) == 1) {
+    printf("Error: matched %ld\n", rss);
+    return 1;
+  }
+
+  return 0;
+}

@thurstond
Contributor Author

Relevant code in DFSan's scan_buffer:

static int scan_buffer(char *str, size_t size, const char *fmt,
                       dfsan_label *va_labels, dfsan_label *ret_label,
                       dfsan_origin *str_origin, dfsan_origin *ret_origin,
                       va_list ap) {
    ...
    if (*formatter.fmt_cur != '%') {
      // Ordinary character. Consume all the characters until a '%' or the end
      // of the string.
      for (; *(formatter.fmt_cur + 1) && *(formatter.fmt_cur + 1) != '%';
           ++formatter.fmt_cur) {
          // EDITOR'S NOTE: SHOULD THIS CHECK AGAINST THE INPUT STRING?
      }
      retval = formatter.scan();
      dfsan_set_label(0, formatter.str_cur(),
                      formatter.num_written_bytes(retval));

@thurstond thurstond merged commit 79cd6c3 into llvm:main Jun 7, 2024
7 of 8 checks passed