-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[MC,ELF] Emit warning if a string constant contains newline char #98060
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MC,ELF] Emit warning if a string constant contains newline char #98060
Conversation
@llvm/pr-subscribers-backend-arm @llvm/pr-subscribers-mc Author: Dmitriy Chestnykh (chestnykh) ChangesGAS emits warning about newline in the string constant so make the same behaviour. Full diff: https://github.com/llvm/llvm-project/pull/98060.diff 2 Files Affected:
diff --git a/llvm/lib/MC/MCParser/AsmLexer.cpp b/llvm/lib/MC/MCParser/AsmLexer.cpp
index e08404ae0ad92f..eda7aedd9f5d3d 100644
--- a/llvm/lib/MC/MCParser/AsmLexer.cpp
+++ b/llvm/lib/MC/MCParser/AsmLexer.cpp
@@ -18,6 +18,7 @@
#include "llvm/ADT/StringSwitch.h"
#include "llvm/MC/MCAsmInfo.h"
#include "llvm/MC/MCParser/MCAsmLexer.h"
+#include "llvm/Support/raw_ostream.h"
#include "llvm/Support/Compiler.h"
#include "llvm/Support/SMLoc.h"
#include "llvm/Support/SaveAndRestore.h"
@@ -646,13 +647,17 @@ AsmToken AsmLexer::LexQuote() {
return AsmToken(AsmToken::String, StringRef(TokStart, CurPtr - TokStart));
}
- // TODO: does gas allow multiline string constants?
+ // gas doesn't allow multiline string constants
+ // and emits a warning if a string constant contains newline character.
while (CurChar != '"') {
if (CurChar == '\\') {
// Allow \", etc.
CurChar = getNextChar();
}
+ if (CurChar == '\n')
+ outs() << "Warning: unterminated string; newline inserted\n";
+
if (CurChar == EOF)
return ReturnError(TokStart, "unterminated string constant");
diff --git a/llvm/test/MC/ELF/warn-newline-in-string-constant.s b/llvm/test/MC/ELF/warn-newline-in-string-constant.s
new file mode 100644
index 00000000000000..e126db30ee47a4
--- /dev/null
+++ b/llvm/test/MC/ELF/warn-newline-in-string-constant.s
@@ -0,0 +1,6 @@
+// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s -o %t | FileCheck %s
+
+.string "abcdefg
+12345678"
+
+// CHECK: Warning: unterminated string; newline inserted
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
58dfd35
to
af41795
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Printing text to stdout is not the correct way to report warnings, it should be done via source manager (and it is likely there are helper functions).
- AsmLexer is a lexer, not parser. It does not know the context the lexed string is used. In some contexts the warning may not be desired. The warning, if needed, should be emitted by a parser that knows full context, e.g. in
AsmParser::parseDirectiveAscii
.
I reworked |
llvm/lib/MC/MCParser/AsmParser.cpp
Outdated
@@ -3125,6 +3130,7 @@ bool AsmParser::parseDirectiveAscii(StringRef IDVal, bool ZeroTerminated) { | |||
do { | |||
if (parseEscapedString(Data)) | |||
return true; | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert unneeded change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
|
||
" | ||
|
||
// CHECK-WARN: warn-newline-in-escaped-string.s:3:21: warning: unterminated string; newline inserted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of string/string/string check/check/check, consider check/string check/string check/string that new tests use more.
You can use the following to test relative line numbers, so that inserting lines in the future would not require test update. We often omit the filename. When the filename is useful, use FileCheck %s -DFILE=%s
[[#@LINE+1]]:21: warning:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx, reworked
// CHECK-WARN: [[#@LINE-8]]:28: warning: unterminated string; newline inserted | ||
// CHECK-WARN: .asciz "another test string | ||
// CHECK-WARN: [[#@LINE-9]]:1: warning: unterminated string; newline inserted | ||
// CHECK-WARN: ^ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can delete ^
checks. With column numbers on the above line, ^
check isn't useful.
Use -NEXT
whenever applicable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
// CHECK-WARN: [[#@LINE-2]]:20: warning: unterminated string; newline inserted | ||
|
||
.ascii "test\nstring\xFF\n\n\xFF" | ||
// CHECK-WARN-NOT: [[#@LINE-1]]{{.*}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use FileCheck --implicit-check-not=warning:
and remove these -NOT:
patterns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@@ -0,0 +1,50 @@ | |||
// RUN: llvm-mc -filetype=obj -triple x86_64-pc-linux-gnu %s 2>&1 -o /dev/null | FileCheck -DFILE=%s --strict-whitespace %s --implicit-check-not=valid1_string --implicit-check-not=valid2_string --implicit-check-not=valid3_string --check-prefix=CHECK-WARN |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just use x86_64 as the triple to emphasize that the behavior is for all ELF (actually other object file formats as well, but we don't bother testing), not just Linux.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. I don't have write access so please merge this PR when it is ready
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
GAS emits warning about newline in the string constant so make the same behaviour.
Emit warning about newline characters in strings for `.string`, '.ascii' and '.asciz' directives like GAS.
Move the point of check into `parseEscapedString`. Check if the `Warning()` returned true to stop compilation.
- Remove `WarnNewline` arg - Emit warning to `\n` symbol unconditionally
- Add new testcases
- Revert unneeded change - Refactor test to use string/check style and [[#@line]] instead of hardcoded line numbers
- Use `CHECK-NEXT` - Use `implicit-check-not` instead of `CHECK-WARN-NOT`
- Add one `--implicit-check-not` with the appropriate pattern - Refactor checks
- Remove redundant `--check-prefix` - Remove `-DFILE=`
c584425
to
2f517c3
Compare
@@ -0,0 +1,55 @@ | |||
// RUN: llvm-mc -filetype=obj -triple x86_64 %s 2>&1 -o /dev/null \ | |||
// RUN: | FileCheck %s --implicit-check-not="{{[0-9]+:[0-9]+: warning: unterminated string}}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We usually just test --implicit-check-not=warning:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
subject
The convention is to omit the trailing period in the subject |
I've rebased the branch to check if that would resolve CI failure (it looked unrelated). |
@@ -0,0 +1,55 @@ | |||
// RUN: llvm-mc -filetype=obj -triple x86_64 %s 2>&1 -o /dev/null \ | |||
// RUN: | FileCheck %s --implicit-check-not=warning: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
continuation lines should be indented by 2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
) Summary: GAS emits warning about newline in the string constant so make the same behaviour. Test Plan: Reviewers: Subscribers: Tasks: Tags: Differential Revision: https://phabricator.intern.facebook.com/D60250926
…112995) #98060 introduced a warning for unterminated string constants, however it was only checking for `\n` which means that it produced strange results on Windows (always blaming column 1) including having the [associated test fail](https://buildkite.com/llvm-project/github-pull-requests/builds/111277#0192a122-c5c9-4e4e-bc5b-7532fec99ae4) if Git happened to use Windows newlines when creating the file. This fix for this is to detect both `\r` and `\n`, but don't double-warn for Windows newlines.
GAS emits warning about newline in the string constant so make the same behaviour.