Skip to content

[clang-format] extend clang-format directive with options to prevent formatting for one line #118566

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 15 commits into from

Conversation

a-tarasyuk
Copy link
Member

Fixes #54334

@a-tarasyuk a-tarasyuk marked this pull request as ready for review December 4, 2024 13:56
@llvmbot llvmbot added clang Clang issues not falling into any other category clang-format labels Dec 4, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 4, 2024

@llvm/pr-subscribers-clang

Author: Oleksandr T. (a-tarasyuk)

Changes

Fixes #54334


Full diff: https://github.com/llvm/llvm-project/pull/118566.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+1)
  • (modified) clang/include/clang/Format/Format.h (+9-2)
  • (modified) clang/lib/Format/DefinitionBlockSeparator.cpp (+2-1)
  • (modified) clang/lib/Format/Format.cpp (+41-18)
  • (modified) clang/lib/Format/FormatTokenLexer.cpp (+32-7)
  • (modified) clang/lib/Format/FormatTokenLexer.h (+7-1)
  • (modified) clang/lib/Format/IntegerLiteralSeparatorFixer.cpp (+2-2)
  • (modified) clang/lib/Format/SortJavaScriptImports.cpp (+7-3)
  • (modified) clang/lib/Format/TokenAnnotator.cpp (+3-1)
  • (modified) clang/unittests/Format/FormatTest.cpp (+53)
  • (modified) clang/unittests/Format/SortImportsTestJava.cpp (+9)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 755418e9550cf4..1d7fddb2236740 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -997,6 +997,7 @@ clang-format
   ``Never``, and ``true`` to ``Always``.
 - Adds ``RemoveEmptyLinesInUnwrappedLines`` option.
 - Adds ``KeepFormFeed`` option and set it to ``true`` for ``GNU`` style.
+- Adds ``off-line`` and ``off-next-line`` options to the ``clang-format`` directive
 
 libclang
 --------
diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 6383934afa2c40..b25d190178247d 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -5620,8 +5620,15 @@ inline StringRef getLanguageName(FormatStyle::LanguageKind Language) {
   }
 }
 
-bool isClangFormatOn(StringRef Comment);
-bool isClangFormatOff(StringRef Comment);
+enum class ClangFormatDirective {
+  None,
+  Off,
+  On,
+  OffLine,
+  OffNextLine,
+};
+
+ClangFormatDirective parseClangFormatDirective(StringRef Comment);
 
 } // end namespace format
 } // end namespace clang
diff --git a/clang/lib/Format/DefinitionBlockSeparator.cpp b/clang/lib/Format/DefinitionBlockSeparator.cpp
index 319236d3bd618c..709ad5e776cc5a 100644
--- a/clang/lib/Format/DefinitionBlockSeparator.cpp
+++ b/clang/lib/Format/DefinitionBlockSeparator.cpp
@@ -144,7 +144,8 @@ void DefinitionBlockSeparator::separateBlocks(
         return false;
 
       if (const auto *Tok = OperateLine->First;
-          Tok->is(tok::comment) && !isClangFormatOn(Tok->TokenText)) {
+          Tok->is(tok::comment) && parseClangFormatDirective(Tok->TokenText) ==
+                                       ClangFormatDirective::None) {
         return true;
       }
 
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index dcaac4b0d42cc5..11802e8f5b3738 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -3266,10 +3266,11 @@ tooling::Replacements sortCppIncludes(const FormatStyle &Style, StringRef Code,
       FormattingOff = false;
 
     bool IsBlockComment = false;
+    ClangFormatDirective CFD = parseClangFormatDirective(Trimmed);
 
-    if (isClangFormatOff(Trimmed)) {
+    if (CFD == ClangFormatDirective::Off) {
       FormattingOff = true;
-    } else if (isClangFormatOn(Trimmed)) {
+    } else if (CFD == ClangFormatDirective::On) {
       FormattingOff = false;
     } else if (Trimmed.starts_with("/*")) {
       IsBlockComment = true;
@@ -3452,9 +3453,10 @@ tooling::Replacements sortJavaImports(const FormatStyle &Style, StringRef Code,
         Code.substr(Prev, (Pos != StringRef::npos ? Pos : Code.size()) - Prev);
 
     StringRef Trimmed = Line.trim();
-    if (isClangFormatOff(Trimmed))
+    ClangFormatDirective CFD = parseClangFormatDirective(Trimmed);
+    if (CFD == ClangFormatDirective::Off)
       FormattingOff = true;
-    else if (isClangFormatOn(Trimmed))
+    else if (CFD == ClangFormatDirective::On)
       FormattingOff = false;
 
     if (ImportRegex.match(Line, &Matches)) {
@@ -4190,24 +4192,45 @@ Expected<FormatStyle> getStyle(StringRef StyleName, StringRef FileName,
   return FallbackStyle;
 }
 
-static bool isClangFormatOnOff(StringRef Comment, bool On) {
-  if (Comment == (On ? "/* clang-format on */" : "/* clang-format off */"))
-    return true;
+static unsigned skipWhitespace(unsigned Pos, StringRef Str, unsigned Length) {
+  while (Pos < Length && isspace(Str[Pos]))
+    ++Pos;
+  return Pos;
+}
 
-  static const char ClangFormatOn[] = "// clang-format on";
-  static const char ClangFormatOff[] = "// clang-format off";
-  const unsigned Size = (On ? sizeof ClangFormatOn : sizeof ClangFormatOff) - 1;
+ClangFormatDirective parseClangFormatDirective(StringRef Comment) {
+  size_t Pos = std::min(Comment.find("/*"), Comment.find("//"));
+  unsigned Length = Comment.size();
+  if (Pos == StringRef::npos)
+    return ClangFormatDirective::None;
 
-  return Comment.starts_with(On ? ClangFormatOn : ClangFormatOff) &&
-         (Comment.size() == Size || Comment[Size] == ':');
-}
+  Pos = skipWhitespace(Pos + 2, Comment, Length);
+  StringRef ClangFormatDirectiveName = "clang-format";
 
-bool isClangFormatOn(StringRef Comment) {
-  return isClangFormatOnOff(Comment, /*On=*/true);
-}
+  if (Comment.substr(Pos, ClangFormatDirectiveName.size()) ==
+      ClangFormatDirectiveName) {
+    Pos =
+        skipWhitespace(Pos + ClangFormatDirectiveName.size(), Comment, Length);
+
+    unsigned EndDirectiveValuePos = Pos;
+    while (EndDirectiveValuePos < Length) {
+      char Char = Comment[EndDirectiveValuePos];
+      if (isspace(Char) || Char == '*' || Char == ':')
+        break;
+
+      ++EndDirectiveValuePos;
+    }
+
+    return llvm::StringSwitch<ClangFormatDirective>(
+               Comment.substr(Pos, EndDirectiveValuePos - Pos))
+        .Case("off", ClangFormatDirective::Off)
+        .Case("on", ClangFormatDirective::On)
+        .Case("off-line", ClangFormatDirective::OffLine)
+        .Case("off-next-line", ClangFormatDirective::OffNextLine)
+        .Default(ClangFormatDirective::None);
+  }
 
-bool isClangFormatOff(StringRef Comment) {
-  return isClangFormatOnOff(Comment, /*On=*/false);
+  return ClangFormatDirective::None;
 }
 
 } // namespace format
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 7a264bddcdfe19..1bcb4acd7e8ad6 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -32,7 +32,8 @@ FormatTokenLexer::FormatTokenLexer(
       LangOpts(getFormattingLangOpts(Style)), SourceMgr(SourceMgr), ID(ID),
       Style(Style), IdentTable(IdentTable), Keywords(IdentTable),
       Encoding(Encoding), Allocator(Allocator), FirstInLineIndex(0),
-      FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),
+      FDS(FormatDisableState::None),
+      MacroBlockBeginRegex(Style.MacroBlockBegin),
       MacroBlockEndRegex(Style.MacroBlockEnd) {
   Lex.reset(new Lexer(ID, SourceMgr.getBufferOrFake(ID), SourceMgr, LangOpts));
   Lex->SetKeepWhitespaceMode(true);
@@ -82,7 +83,23 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
   assert(Tokens.empty());
   assert(FirstInLineIndex == 0);
   do {
-    Tokens.push_back(getNextToken());
+    FormatToken *NextToken = getNextToken();
+
+    if (FDS == FormatDisableState::None && NextToken->is(tok::comment) &&
+        parseClangFormatDirective(NextToken->TokenText) ==
+            ClangFormatDirective::OffLine) {
+      for (unsigned i = FirstInLineIndex; i < Tokens.size(); ++i)
+        Tokens[i]->Finalized = true;
+    }
+
+    if (Tokens.size() >= 1 && Tokens.back()->isNot(tok::comment) &&
+        FDS == FormatDisableState::SingleLine &&
+        (NextToken->NewlinesBefore > 0 || NextToken->IsMultiline)) {
+      FDS = FormatDisableState::None;
+    }
+
+    Tokens.push_back(NextToken);
+
     if (Style.isJavaScript()) {
       tryParseJSRegexLiteral();
       handleTemplateStrings();
@@ -1450,13 +1467,21 @@ void FormatTokenLexer::readRawToken(FormatToken &Tok) {
   if ((Style.isJavaScript() || Style.isProto()) && Tok.is(tok::char_constant))
     Tok.Tok.setKind(tok::string_literal);
 
-  if (Tok.is(tok::comment) && isClangFormatOn(Tok.TokenText))
-    FormattingDisabled = false;
+  if (Tok.is(tok::comment) &&
+      parseClangFormatDirective(Tok.TokenText) == ClangFormatDirective::On) {
+    FDS = FormatDisableState::None;
+  }
+
+  Tok.Finalized = FDS != FormatDisableState::None;
 
-  Tok.Finalized = FormattingDisabled;
+  if (Tok.is(tok::comment)) {
+    ClangFormatDirective FSD = parseClangFormatDirective(Tok.TokenText);
+    if (FSD == ClangFormatDirective::Off)
+      FDS = FormatDisableState::Range;
 
-  if (Tok.is(tok::comment) && isClangFormatOff(Tok.TokenText))
-    FormattingDisabled = true;
+    if (FSD == ClangFormatDirective::OffNextLine)
+      FDS = FormatDisableState::SingleLine;
+  }
 }
 
 void FormatTokenLexer::resetLexer(unsigned Offset) {
diff --git a/clang/lib/Format/FormatTokenLexer.h b/clang/lib/Format/FormatTokenLexer.h
index 71389d2ade2b73..3d3338357bab05 100644
--- a/clang/lib/Format/FormatTokenLexer.h
+++ b/clang/lib/Format/FormatTokenLexer.h
@@ -32,6 +32,12 @@ enum LexerState {
   TOKEN_STASHED,
 };
 
+enum class FormatDisableState {
+  None,
+  SingleLine,
+  Range,
+};
+
 class FormatTokenLexer {
 public:
   FormatTokenLexer(const SourceManager &SourceMgr, FileID ID, unsigned Column,
@@ -131,7 +137,7 @@ class FormatTokenLexer {
 
   llvm::SmallPtrSet<IdentifierInfo *, 8> TemplateNames, TypeNames;
 
-  bool FormattingDisabled;
+  FormatDisableState FDS;
 
   llvm::Regex MacroBlockBeginRegex;
   llvm::Regex MacroBlockEndRegex;
diff --git a/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp b/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp
index 87823ae32b1138..e93a8a84561002 100644
--- a/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp
+++ b/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp
@@ -93,9 +93,9 @@ IntegerLiteralSeparatorFixer::process(const Environment &Env,
     auto Location = Tok.getLocation();
     auto Text = StringRef(SourceMgr.getCharacterData(Location), Length);
     if (Tok.is(tok::comment)) {
-      if (isClangFormatOff(Text))
+      if (parseClangFormatDirective(Text) == ClangFormatDirective::Off)
         Skip = true;
-      else if (isClangFormatOn(Text))
+      else if (parseClangFormatDirective(Text) == ClangFormatDirective::On)
         Skip = false;
       continue;
     }
diff --git a/clang/lib/Format/SortJavaScriptImports.cpp b/clang/lib/Format/SortJavaScriptImports.cpp
index 1acce26ff2795e..f5038dd2080094 100644
--- a/clang/lib/Format/SortJavaScriptImports.cpp
+++ b/clang/lib/Format/SortJavaScriptImports.cpp
@@ -194,7 +194,9 @@ class JavaScriptImportSorter : public TokenAnalyzer {
     // Separate references from the main code body of the file.
     if (FirstNonImportLine && FirstNonImportLine->First->NewlinesBefore < 2 &&
         !(FirstNonImportLine->First->is(tok::comment) &&
-          isClangFormatOn(FirstNonImportLine->First->TokenText.trim()))) {
+          parseClangFormatDirective(
+              FirstNonImportLine->First->TokenText.trim()) ==
+              ClangFormatDirective::On)) {
       ReferencesText += "\n";
     }
 
@@ -375,9 +377,11 @@ class JavaScriptImportSorter : public TokenAnalyzer {
       // This is tracked in FormattingOff here and on JsModuleReference.
       while (Current && Current->is(tok::comment)) {
         StringRef CommentText = Current->TokenText.trim();
-        if (isClangFormatOff(CommentText)) {
+        ClangFormatDirective CFD = parseClangFormatDirective(CommentText);
+
+        if (CFD == ClangFormatDirective::Off) {
           FormattingOff = true;
-        } else if (isClangFormatOn(CommentText)) {
+        } else if (CFD == ClangFormatDirective::On) {
           FormattingOff = false;
           // Special case: consider a trailing "clang-format on" line to be part
           // of the module reference, so that it gets moved around together with
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index bc5239209f3aab..d73c7ed46df194 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -3559,7 +3559,9 @@ void TokenAnnotator::setCommentLineLevels(
     // If the comment is currently aligned with the line immediately following
     // it, that's probably intentional and we should keep it.
     if (NextNonCommentLine && NextNonCommentLine->First->NewlinesBefore < 2 &&
-        Line->isComment() && !isClangFormatOff(Line->First->TokenText) &&
+        Line->isComment() &&
+        parseClangFormatDirective(Line->First->TokenText) ==
+            ClangFormatDirective::None &&
         NextNonCommentLine->First->OriginalColumn ==
             Line->First->OriginalColumn) {
       const bool PPDirectiveOrImportStmt =
diff --git a/clang/unittests/Format/FormatTest.cpp b/clang/unittests/Format/FormatTest.cpp
index 250e51b5421664..567c57c821dd69 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -28273,6 +28273,59 @@ TEST_F(FormatTest, KeepFormFeed) {
                Style);
 }
 
+TEST_F(FormatTest, DisableLine) {
+  verifyFormat("int  a  =  1; // clang-format off-line\n"
+               "int b = 1;",
+               "int  a  =  1; // clang-format off-line\n"
+               "int  b  =  1;");
+
+  verifyFormat("int a = 1;\n"
+               "int  b  =  1; // clang-format off-line\n"
+               "int c = 1;",
+               "int  a  =  1;\n"
+               "int  b  =  1; // clang-format off-line\n"
+               "int  c  =  1;");
+
+  verifyFormat("int  a  =  1; /* clang-format off-line */\n"
+               "int b = 1;",
+               "int  a  =  1; /* clang-format off-line */\n"
+               "int  b  =  1;");
+
+  verifyFormat("int a = 1;\n"
+               "int  b  =  1; /* clang-format off-line */\n"
+               "int c = 1;",
+               "int  a  =  1;\n"
+               "int  b  =  1; /* clang-format off-line */\n"
+               "int  c  =  1;");
+}
+
+TEST_F(FormatTest, DisableNextLine) {
+  verifyFormat("// clang-format off-next-line\n"
+               "int  a  =  1;\n"
+               "int b = 1;",
+               "// clang-format off-next-line\n"
+               "int  a  =  1;\n"
+               "int  b  =  1;");
+
+  verifyFormat("// clang-format off-next-line\n"
+               "\n"
+               "\n"
+               "int  a  =  1;\n"
+               "int b = 1;",
+               "// clang-format off-next-line\n"
+               "\n"
+               "\n"
+               "int  a  =  1;\n"
+               "int  b  =  1;");
+
+  verifyFormat("/* clang-format off-next-line */\n"
+               "int  a  =  1;\n"
+               "int b = 1;",
+               "/* clang-format off-next-line */\n"
+               "int  a  =  1;\n"
+               "int  b  =  1;");
+}
+
 } // namespace
 } // namespace test
 } // namespace format
diff --git a/clang/unittests/Format/SortImportsTestJava.cpp b/clang/unittests/Format/SortImportsTestJava.cpp
index d577efa34f86e8..c77e1661300fb5 100644
--- a/clang/unittests/Format/SortImportsTestJava.cpp
+++ b/clang/unittests/Format/SortImportsTestJava.cpp
@@ -235,6 +235,15 @@ TEST_F(SortImportsTestJava, FormatTotallyOn) {
                  "import org.a;"));
 }
 
+TEST_F(SortImportsTestJava, DisableLine) {
+  EXPECT_EQ("// clang-format on\n"
+            "import org.a;\n"
+            "import org.b;",
+            sort("// clang-format on\n"
+                 "import org.b;\n"
+                 "import org.a;"));
+}
+
 TEST_F(SortImportsTestJava, FormatPariallyOnShouldNotReorder) {
   EXPECT_EQ("// clang-format off\n"
             "import org.b;\n"

@llvmbot
Copy link
Member

llvmbot commented Dec 4, 2024

@llvm/pr-subscribers-clang-format

Author: Oleksandr T. (a-tarasyuk)

Changes

Fixes #54334


Full diff: https://github.com/llvm/llvm-project/pull/118566.diff

11 Files Affected:

  • (modified) clang/docs/ReleaseNotes.rst (+1)
  • (modified) clang/include/clang/Format/Format.h (+9-2)
  • (modified) clang/lib/Format/DefinitionBlockSeparator.cpp (+2-1)
  • (modified) clang/lib/Format/Format.cpp (+41-18)
  • (modified) clang/lib/Format/FormatTokenLexer.cpp (+32-7)
  • (modified) clang/lib/Format/FormatTokenLexer.h (+7-1)
  • (modified) clang/lib/Format/IntegerLiteralSeparatorFixer.cpp (+2-2)
  • (modified) clang/lib/Format/SortJavaScriptImports.cpp (+7-3)
  • (modified) clang/lib/Format/TokenAnnotator.cpp (+3-1)
  • (modified) clang/unittests/Format/FormatTest.cpp (+53)
  • (modified) clang/unittests/Format/SortImportsTestJava.cpp (+9)
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 755418e9550cf4..1d7fddb2236740 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -997,6 +997,7 @@ clang-format
   ``Never``, and ``true`` to ``Always``.
 - Adds ``RemoveEmptyLinesInUnwrappedLines`` option.
 - Adds ``KeepFormFeed`` option and set it to ``true`` for ``GNU`` style.
+- Adds ``off-line`` and ``off-next-line`` options to the ``clang-format`` directive
 
 libclang
 --------
diff --git a/clang/include/clang/Format/Format.h b/clang/include/clang/Format/Format.h
index 6383934afa2c40..b25d190178247d 100644
--- a/clang/include/clang/Format/Format.h
+++ b/clang/include/clang/Format/Format.h
@@ -5620,8 +5620,15 @@ inline StringRef getLanguageName(FormatStyle::LanguageKind Language) {
   }
 }
 
-bool isClangFormatOn(StringRef Comment);
-bool isClangFormatOff(StringRef Comment);
+enum class ClangFormatDirective {
+  None,
+  Off,
+  On,
+  OffLine,
+  OffNextLine,
+};
+
+ClangFormatDirective parseClangFormatDirective(StringRef Comment);
 
 } // end namespace format
 } // end namespace clang
diff --git a/clang/lib/Format/DefinitionBlockSeparator.cpp b/clang/lib/Format/DefinitionBlockSeparator.cpp
index 319236d3bd618c..709ad5e776cc5a 100644
--- a/clang/lib/Format/DefinitionBlockSeparator.cpp
+++ b/clang/lib/Format/DefinitionBlockSeparator.cpp
@@ -144,7 +144,8 @@ void DefinitionBlockSeparator::separateBlocks(
         return false;
 
       if (const auto *Tok = OperateLine->First;
-          Tok->is(tok::comment) && !isClangFormatOn(Tok->TokenText)) {
+          Tok->is(tok::comment) && parseClangFormatDirective(Tok->TokenText) ==
+                                       ClangFormatDirective::None) {
         return true;
       }
 
diff --git a/clang/lib/Format/Format.cpp b/clang/lib/Format/Format.cpp
index dcaac4b0d42cc5..11802e8f5b3738 100644
--- a/clang/lib/Format/Format.cpp
+++ b/clang/lib/Format/Format.cpp
@@ -3266,10 +3266,11 @@ tooling::Replacements sortCppIncludes(const FormatStyle &Style, StringRef Code,
       FormattingOff = false;
 
     bool IsBlockComment = false;
+    ClangFormatDirective CFD = parseClangFormatDirective(Trimmed);
 
-    if (isClangFormatOff(Trimmed)) {
+    if (CFD == ClangFormatDirective::Off) {
       FormattingOff = true;
-    } else if (isClangFormatOn(Trimmed)) {
+    } else if (CFD == ClangFormatDirective::On) {
       FormattingOff = false;
     } else if (Trimmed.starts_with("/*")) {
       IsBlockComment = true;
@@ -3452,9 +3453,10 @@ tooling::Replacements sortJavaImports(const FormatStyle &Style, StringRef Code,
         Code.substr(Prev, (Pos != StringRef::npos ? Pos : Code.size()) - Prev);
 
     StringRef Trimmed = Line.trim();
-    if (isClangFormatOff(Trimmed))
+    ClangFormatDirective CFD = parseClangFormatDirective(Trimmed);
+    if (CFD == ClangFormatDirective::Off)
       FormattingOff = true;
-    else if (isClangFormatOn(Trimmed))
+    else if (CFD == ClangFormatDirective::On)
       FormattingOff = false;
 
     if (ImportRegex.match(Line, &Matches)) {
@@ -4190,24 +4192,45 @@ Expected<FormatStyle> getStyle(StringRef StyleName, StringRef FileName,
   return FallbackStyle;
 }
 
-static bool isClangFormatOnOff(StringRef Comment, bool On) {
-  if (Comment == (On ? "/* clang-format on */" : "/* clang-format off */"))
-    return true;
+static unsigned skipWhitespace(unsigned Pos, StringRef Str, unsigned Length) {
+  while (Pos < Length && isspace(Str[Pos]))
+    ++Pos;
+  return Pos;
+}
 
-  static const char ClangFormatOn[] = "// clang-format on";
-  static const char ClangFormatOff[] = "// clang-format off";
-  const unsigned Size = (On ? sizeof ClangFormatOn : sizeof ClangFormatOff) - 1;
+ClangFormatDirective parseClangFormatDirective(StringRef Comment) {
+  size_t Pos = std::min(Comment.find("/*"), Comment.find("//"));
+  unsigned Length = Comment.size();
+  if (Pos == StringRef::npos)
+    return ClangFormatDirective::None;
 
-  return Comment.starts_with(On ? ClangFormatOn : ClangFormatOff) &&
-         (Comment.size() == Size || Comment[Size] == ':');
-}
+  Pos = skipWhitespace(Pos + 2, Comment, Length);
+  StringRef ClangFormatDirectiveName = "clang-format";
 
-bool isClangFormatOn(StringRef Comment) {
-  return isClangFormatOnOff(Comment, /*On=*/true);
-}
+  if (Comment.substr(Pos, ClangFormatDirectiveName.size()) ==
+      ClangFormatDirectiveName) {
+    Pos =
+        skipWhitespace(Pos + ClangFormatDirectiveName.size(), Comment, Length);
+
+    unsigned EndDirectiveValuePos = Pos;
+    while (EndDirectiveValuePos < Length) {
+      char Char = Comment[EndDirectiveValuePos];
+      if (isspace(Char) || Char == '*' || Char == ':')
+        break;
+
+      ++EndDirectiveValuePos;
+    }
+
+    return llvm::StringSwitch<ClangFormatDirective>(
+               Comment.substr(Pos, EndDirectiveValuePos - Pos))
+        .Case("off", ClangFormatDirective::Off)
+        .Case("on", ClangFormatDirective::On)
+        .Case("off-line", ClangFormatDirective::OffLine)
+        .Case("off-next-line", ClangFormatDirective::OffNextLine)
+        .Default(ClangFormatDirective::None);
+  }
 
-bool isClangFormatOff(StringRef Comment) {
-  return isClangFormatOnOff(Comment, /*On=*/false);
+  return ClangFormatDirective::None;
 }
 
 } // namespace format
diff --git a/clang/lib/Format/FormatTokenLexer.cpp b/clang/lib/Format/FormatTokenLexer.cpp
index 7a264bddcdfe19..1bcb4acd7e8ad6 100644
--- a/clang/lib/Format/FormatTokenLexer.cpp
+++ b/clang/lib/Format/FormatTokenLexer.cpp
@@ -32,7 +32,8 @@ FormatTokenLexer::FormatTokenLexer(
       LangOpts(getFormattingLangOpts(Style)), SourceMgr(SourceMgr), ID(ID),
       Style(Style), IdentTable(IdentTable), Keywords(IdentTable),
       Encoding(Encoding), Allocator(Allocator), FirstInLineIndex(0),
-      FormattingDisabled(false), MacroBlockBeginRegex(Style.MacroBlockBegin),
+      FDS(FormatDisableState::None),
+      MacroBlockBeginRegex(Style.MacroBlockBegin),
       MacroBlockEndRegex(Style.MacroBlockEnd) {
   Lex.reset(new Lexer(ID, SourceMgr.getBufferOrFake(ID), SourceMgr, LangOpts));
   Lex->SetKeepWhitespaceMode(true);
@@ -82,7 +83,23 @@ ArrayRef<FormatToken *> FormatTokenLexer::lex() {
   assert(Tokens.empty());
   assert(FirstInLineIndex == 0);
   do {
-    Tokens.push_back(getNextToken());
+    FormatToken *NextToken = getNextToken();
+
+    if (FDS == FormatDisableState::None && NextToken->is(tok::comment) &&
+        parseClangFormatDirective(NextToken->TokenText) ==
+            ClangFormatDirective::OffLine) {
+      for (unsigned i = FirstInLineIndex; i < Tokens.size(); ++i)
+        Tokens[i]->Finalized = true;
+    }
+
+    if (Tokens.size() >= 1 && Tokens.back()->isNot(tok::comment) &&
+        FDS == FormatDisableState::SingleLine &&
+        (NextToken->NewlinesBefore > 0 || NextToken->IsMultiline)) {
+      FDS = FormatDisableState::None;
+    }
+
+    Tokens.push_back(NextToken);
+
     if (Style.isJavaScript()) {
       tryParseJSRegexLiteral();
       handleTemplateStrings();
@@ -1450,13 +1467,21 @@ void FormatTokenLexer::readRawToken(FormatToken &Tok) {
   if ((Style.isJavaScript() || Style.isProto()) && Tok.is(tok::char_constant))
     Tok.Tok.setKind(tok::string_literal);
 
-  if (Tok.is(tok::comment) && isClangFormatOn(Tok.TokenText))
-    FormattingDisabled = false;
+  if (Tok.is(tok::comment) &&
+      parseClangFormatDirective(Tok.TokenText) == ClangFormatDirective::On) {
+    FDS = FormatDisableState::None;
+  }
+
+  Tok.Finalized = FDS != FormatDisableState::None;
 
-  Tok.Finalized = FormattingDisabled;
+  if (Tok.is(tok::comment)) {
+    ClangFormatDirective FSD = parseClangFormatDirective(Tok.TokenText);
+    if (FSD == ClangFormatDirective::Off)
+      FDS = FormatDisableState::Range;
 
-  if (Tok.is(tok::comment) && isClangFormatOff(Tok.TokenText))
-    FormattingDisabled = true;
+    if (FSD == ClangFormatDirective::OffNextLine)
+      FDS = FormatDisableState::SingleLine;
+  }
 }
 
 void FormatTokenLexer::resetLexer(unsigned Offset) {
diff --git a/clang/lib/Format/FormatTokenLexer.h b/clang/lib/Format/FormatTokenLexer.h
index 71389d2ade2b73..3d3338357bab05 100644
--- a/clang/lib/Format/FormatTokenLexer.h
+++ b/clang/lib/Format/FormatTokenLexer.h
@@ -32,6 +32,12 @@ enum LexerState {
   TOKEN_STASHED,
 };
 
+enum class FormatDisableState {
+  None,
+  SingleLine,
+  Range,
+};
+
 class FormatTokenLexer {
 public:
   FormatTokenLexer(const SourceManager &SourceMgr, FileID ID, unsigned Column,
@@ -131,7 +137,7 @@ class FormatTokenLexer {
 
   llvm::SmallPtrSet<IdentifierInfo *, 8> TemplateNames, TypeNames;
 
-  bool FormattingDisabled;
+  FormatDisableState FDS;
 
   llvm::Regex MacroBlockBeginRegex;
   llvm::Regex MacroBlockEndRegex;
diff --git a/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp b/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp
index 87823ae32b1138..e93a8a84561002 100644
--- a/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp
+++ b/clang/lib/Format/IntegerLiteralSeparatorFixer.cpp
@@ -93,9 +93,9 @@ IntegerLiteralSeparatorFixer::process(const Environment &Env,
     auto Location = Tok.getLocation();
     auto Text = StringRef(SourceMgr.getCharacterData(Location), Length);
     if (Tok.is(tok::comment)) {
-      if (isClangFormatOff(Text))
+      if (parseClangFormatDirective(Text) == ClangFormatDirective::Off)
         Skip = true;
-      else if (isClangFormatOn(Text))
+      else if (parseClangFormatDirective(Text) == ClangFormatDirective::On)
         Skip = false;
       continue;
     }
diff --git a/clang/lib/Format/SortJavaScriptImports.cpp b/clang/lib/Format/SortJavaScriptImports.cpp
index 1acce26ff2795e..f5038dd2080094 100644
--- a/clang/lib/Format/SortJavaScriptImports.cpp
+++ b/clang/lib/Format/SortJavaScriptImports.cpp
@@ -194,7 +194,9 @@ class JavaScriptImportSorter : public TokenAnalyzer {
     // Separate references from the main code body of the file.
     if (FirstNonImportLine && FirstNonImportLine->First->NewlinesBefore < 2 &&
         !(FirstNonImportLine->First->is(tok::comment) &&
-          isClangFormatOn(FirstNonImportLine->First->TokenText.trim()))) {
+          parseClangFormatDirective(
+              FirstNonImportLine->First->TokenText.trim()) ==
+              ClangFormatDirective::On)) {
       ReferencesText += "\n";
     }
 
@@ -375,9 +377,11 @@ class JavaScriptImportSorter : public TokenAnalyzer {
       // This is tracked in FormattingOff here and on JsModuleReference.
       while (Current && Current->is(tok::comment)) {
         StringRef CommentText = Current->TokenText.trim();
-        if (isClangFormatOff(CommentText)) {
+        ClangFormatDirective CFD = parseClangFormatDirective(CommentText);
+
+        if (CFD == ClangFormatDirective::Off) {
           FormattingOff = true;
-        } else if (isClangFormatOn(CommentText)) {
+        } else if (CFD == ClangFormatDirective::On) {
           FormattingOff = false;
           // Special case: consider a trailing "clang-format on" line to be part
           // of the module reference, so that it gets moved around together with
diff --git a/clang/lib/Format/TokenAnnotator.cpp b/clang/lib/Format/TokenAnnotator.cpp
index bc5239209f3aab..d73c7ed46df194 100644
--- a/clang/lib/Format/TokenAnnotator.cpp
+++ b/clang/lib/Format/TokenAnnotator.cpp
@@ -3559,7 +3559,9 @@ void TokenAnnotator::setCommentLineLevels(
     // If the comment is currently aligned with the line immediately following
     // it, that's probably intentional and we should keep it.
     if (NextNonCommentLine && NextNonCommentLine->First->NewlinesBefore < 2 &&
-        Line->isComment() && !isClangFormatOff(Line->First->TokenText) &&
+        Line->isComment() &&
+        parseClangFormatDirective(Line->First->TokenText) ==
+            ClangFormatDirective::None &&
         NextNonCommentLine->First->OriginalColumn ==
             Line->First->OriginalColumn) {
       const bool PPDirectiveOrImportStmt =
diff --git a/clang/unittests/Format/FormatTest.cpp b/clang/unittests/Format/FormatTest.cpp
index 250e51b5421664..567c57c821dd69 100644
--- a/clang/unittests/Format/FormatTest.cpp
+++ b/clang/unittests/Format/FormatTest.cpp
@@ -28273,6 +28273,59 @@ TEST_F(FormatTest, KeepFormFeed) {
                Style);
 }
 
+TEST_F(FormatTest, DisableLine) {
+  verifyFormat("int  a  =  1; // clang-format off-line\n"
+               "int b = 1;",
+               "int  a  =  1; // clang-format off-line\n"
+               "int  b  =  1;");
+
+  verifyFormat("int a = 1;\n"
+               "int  b  =  1; // clang-format off-line\n"
+               "int c = 1;",
+               "int  a  =  1;\n"
+               "int  b  =  1; // clang-format off-line\n"
+               "int  c  =  1;");
+
+  verifyFormat("int  a  =  1; /* clang-format off-line */\n"
+               "int b = 1;",
+               "int  a  =  1; /* clang-format off-line */\n"
+               "int  b  =  1;");
+
+  verifyFormat("int a = 1;\n"
+               "int  b  =  1; /* clang-format off-line */\n"
+               "int c = 1;",
+               "int  a  =  1;\n"
+               "int  b  =  1; /* clang-format off-line */\n"
+               "int  c  =  1;");
+}
+
+TEST_F(FormatTest, DisableNextLine) {
+  verifyFormat("// clang-format off-next-line\n"
+               "int  a  =  1;\n"
+               "int b = 1;",
+               "// clang-format off-next-line\n"
+               "int  a  =  1;\n"
+               "int  b  =  1;");
+
+  verifyFormat("// clang-format off-next-line\n"
+               "\n"
+               "\n"
+               "int  a  =  1;\n"
+               "int b = 1;",
+               "// clang-format off-next-line\n"
+               "\n"
+               "\n"
+               "int  a  =  1;\n"
+               "int  b  =  1;");
+
+  verifyFormat("/* clang-format off-next-line */\n"
+               "int  a  =  1;\n"
+               "int b = 1;",
+               "/* clang-format off-next-line */\n"
+               "int  a  =  1;\n"
+               "int  b  =  1;");
+}
+
 } // namespace
 } // namespace test
 } // namespace format
diff --git a/clang/unittests/Format/SortImportsTestJava.cpp b/clang/unittests/Format/SortImportsTestJava.cpp
index d577efa34f86e8..c77e1661300fb5 100644
--- a/clang/unittests/Format/SortImportsTestJava.cpp
+++ b/clang/unittests/Format/SortImportsTestJava.cpp
@@ -235,6 +235,15 @@ TEST_F(SortImportsTestJava, FormatTotallyOn) {
                  "import org.a;"));
 }
 
+TEST_F(SortImportsTestJava, DisableLine) {
+  EXPECT_EQ("// clang-format on\n"
+            "import org.a;\n"
+            "import org.b;",
+            sort("// clang-format on\n"
+                 "import org.b;\n"
+                 "import org.a;"));
+}
+
 TEST_F(SortImportsTestJava, FormatPariallyOnShouldNotReorder) {
   EXPECT_EQ("// clang-format off\n"
             "import org.b;\n"

@mydeveloperday
Copy link
Contributor

I struggle with changes that encourage people to not be fully clang-formatted, I would prefer to ask why we need this feature, can we have some examples of where this would be used?

@BrandonStaab
Copy link

I struggle with changes that encourage people to not be fully clang-formatted, I would prefer to ask why we need this feature, can we have some examples of where this would be used?

This makes it so only one line isn't formatted instead of the current solution which is 2 lines unformatted. Additionally, if you start writing code in between the unformatted region you inherit the unformatted scope. Being able to disable the formatter for only a single line means the formatter will be disabled for the shortest amount of code possible.

@mydeveloperday
Copy link
Contributor

I struggle with changes that encourage people to not be fully clang-formatted, I would prefer to ask why we need this feature, can we have some examples of where this would be used?

This makes it so only one line isn't formatted instead of the current solution which is 2 lines unformatted. Additionally, if you start writing code in between the unformatted region you inherit the unformatted scope. Being able to disable the formatter for only a single line means the formatter will be disabled for the shortest amount of code possible.

I understand what is can be used for, I'm asking why its needed? I don't understand why people are needing to unformat just one line, what is broken?

The implementation IMHO just complicates the code (I much prefer the isClangFormatOn() function than the parseXXXX())

I don't even deny it might be a nice to have, my concern is why do we continue to appease the people who don't want to use clang-format warts and all. I would prefer we put the effort into fixing the formatting issues which mean people are having to use this rather than marking hundreds of lines as // clang-format off.

Rather than complicating the code with a myriad of rules that effectively give half a dozen ways of doing the same thing. To me the current // clang-format off/on is enough, for everything else log an issue, or better still submit a pull request.

@HazardyKnusperkeks
Copy link
Contributor

I struggle with changes that encourage people to not be fully clang-formatted, I would prefer to ask why we need this feature, can we have some examples of where this would be used?

This makes it so only one line isn't formatted instead of the current solution which is 2 lines unformatted. Additionally, if you start writing code in between the unformatted region you inherit the unformatted scope. Being able to disable the formatter for only a single line means the formatter will be disabled for the shortest amount of code possible.

I understand what is can be used for, I'm asking why its needed? I don't understand why people are needing to unformat just one line, what is broken?

The implementation IMHO just complicates the code (I much prefer the isClangFormatOn() function than the parseXXXX())

I don't even deny it might be a nice to have, my concern is why do we continue to appease the people who don't want to use clang-format warts and all. I would prefer we put the effort into fixing the formatting issues which mean people are having to use this rather than marking hundreds of lines as // clang-format off.

Rather than complicating the code with a myriad of rules that effectively give half a dozen ways of doing the same thing. To me the current // clang-format off/on is enough, for everything else log an issue, or better still submit a pull request.

I think there are multiple reasons to disable formatting for a range, and be the range only one line. A nice multiline example I stumbled upon is:

    Tree<char> tree[]
    {
                                    {'D', tree + 1, tree + 2},
        //
        //            ┌───────────────┴────────────────┐
        //            │                                │
                    {'B', tree + 3, tree + 4},       {'F', tree + 5, tree + 6},
        //            │                                │
        //  ┌─────────┴─────────────┐      ┌───────────┴─────────────┐
        //  │                       │      │                         │
          {'A'},                  {'C'}, {'E'},                    {'G'}
    };

(Kindly borrowed from https://en.cppreference.com/w/cpp/coroutine/generator)
We will never be able to format something like this with clang-format.

I myself have a few places where automatic formatting would be possible, but kind of hard, they all have to do with some aligning.

And #54334 (comment) makes a good example on why only one line disabling would be desired.

Also this would bring us in par with clang-tidy which has NOLINT and NOLINTNEXTLINE.

@owenca
Copy link
Contributor

owenca commented Dec 24, 2024

A nice multiline example I stumbled upon is:

    Tree<char> tree[]
    {
                                    {'D', tree + 1, tree + 2},
        //
        //            ┌───────────────┴────────────────┐
        //            │                                │
                    {'B', tree + 3, tree + 4},       {'F', tree + 5, tree + 6},
        //            │                                │
        //  ┌─────────┴─────────────┐      ┌───────────┴─────────────┐
        //  │                       │      │                         │
          {'A'},                  {'C'}, {'E'},                    {'G'}
    };

Which line(s) do you want clang-format to skip here?

And #54334 (comment) makes a good example on why only one line disabling would be desired.

The current way to skip the line in that example is as follows:

int foo(Resources *resources, int i, int j, int k) {
  if (i < 0 && j < 0) {
    // clang-format off
    myproject::detail::LogErrorPrintf( resources->logger, "both i and j can not be negative at the same time.\ni = %d, j = %d\n", i, j);
    // clang-format on
    return -1;
  }

  if (i < 0) {
    j *= 10;
  }
  if (j < 0) {
    k += 5;
  }
  return i + j * k;
}

What's wrong with that other than it may be less convenient? Actually, I prefer the current way as it makes the skipped line stand out. If we were to add // clang-format off-next-line, would "next line" mean the next physical or logical/unwrapped line?

@a-tarasyuk
Copy link
Member Author

If we were to add // clang-format off-next-line, would "next line" mean the next physical or logical/unwrapped line?

I would expect it to apply only to the physical line, similar to how other formatters work. However, the main concern doesn’t seem to be about its behavior but rather about extending clang-format directive with new options at all.

To me the current // clang-format off/on is enough, for everything else log an issue, or better still submit a pull request.

In most cases where users find the formatting unacceptable, they tend to disable it, while only a narrow group take the initiative to address the root cause by submitting pull requests or suggesting new rules. For instance, a search for clang-format usage on GitHub, even just for C++, shows hundreds of uses across various scenarios. Analyzing all of them and covering them with new rules would take a significant amount of time,. anyway, we can assume that these options are not without value.

From my perspective, these options are just additional ways to control formatting and give users more flexibility. These options aren’t intended to eliminate clang-format usage, and based on user feedback on the issue, it seems there are cases in which these options are useful. For instance,

// clang-format off
CommandStyle(set,Set);
// clang-format on

vs

CommandStyle(set,Set); // clang-format off-line

We don't know the context of why it was disabled. However, I would use // clang-format off-line in a similar way to // NOLINT., it's up to personal preference.

If these options introduce significant complexity that could lead to regressions without improving the development experience, I believe the discussion should be wrapped up, and both the PR and issue should be closed. Otherwise, it would be helpful to continue the discussion in the issue to gather more insights into why users need these changes. Does that make sense?

Copy link

github-actions bot commented Dec 30, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@owenca
Copy link
Contributor

owenca commented Jan 2, 2025

If we were to add // clang-format off-next-line, would "next line" mean the next physical or logical/unwrapped line?

I would expect it to apply only to the physical line, similar to how other formatters work. However, the main concern doesn’t seem to be about its behavior but rather about extending clang-format directive with new options at all.

This is an essential detail missing from the description of the proposed option for us to accept a new option and review the patch.

From my perspective, these options are just additional ways to control formatting and give users more flexibility. These options aren’t intended to eliminate clang-format usage, and based on user feedback on the issue, it seems there are cases in which these options are useful. For instance,

// clang-format off
CommandStyle(set,Set);
// clang-format on

vs

CommandStyle(set,Set); // clang-format off-line

For simplicity and clarify, I might accept a lone comment above the (physical) line to be skipped by clang-format, but not a trailing comment.

If these options introduce significant complexity that could lead to regressions without improving the development experience, I believe the discussion should be wrapped up, and both the PR and issue should be closed. Otherwise, it would be helpful to continue the discussion in the issue to gather more insights into why users need these changes. Does that make sense?

I’m fine with continuing the discussion in the GitHub issue, but I’m still with @mydeveloperday on this one.

@a-tarasyuk
Copy link
Member Author

I’m fine with continuing the discussion in the GitHub issue

Ok, let's move the discussion to the issue, and I'll delete the PR. From there, we can summarize all the concerns to proceed with a decision on this proposal. I think it would be useful to add comments or update the status of the issue to avoid anyone spending time on development because before I started working on it proposal seemed to have no concerns and was ready for development.

@a-tarasyuk a-tarasyuk closed this Jan 2, 2025
@owenca owenca removed the clang Clang issues not falling into any other category label Jan 3, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

clang-format: Disabling Formatting on a one line
6 participants