Add missing UTF-8 and raw string literal compiler diagnostics (#50393)
* Add new diagnostics
Add the new diagnostics that weren't already included in an appropriate diagnostic issue.
* Update affected files to focus on fixes
* copy edit
* fix lint issues
* lint, part 2
* Apply suggestions from code review
Co-authored-by: Copilot <[email protected]>
* Apply suggestions from code review
Co-authored-by: Copilot <[email protected]>
* One more edit pass
* fix warnings
* one more time on warnings
* one more warning pass
---------
Co-authored-by: Copilot <[email protected]>
.github/prompts/error-consolidation.md
1 addition & 1 deletion
@@ -64,7 +64,7 @@ Understand these instructions, then suggest a list of themes and the included er
64
64
65
65
## Move from description to resolution
66
66
67
-
Rework the highlighted section so the focus is on how to correct each error. This article doesn't need to explain the associated language feature. Instead, in each section, provide links to language reference or language specification material that explains the rules violated when these diagnostics appear. Add explanatory context after each correction (in parentheses with the error code). Provided brief reasons why each correction is needed. Use detailed, sentence-style explanations rather than brief imperative statements. For each recommendation put the affectived error codes in parentheses, and in **bold** style. Remove extensive examples. Remove all H3 headings in this section. If any errors are no longer produced in the latest version of C#, make a note of that.
67
+
Rework the highlighted section so the focus is on how to correct each error. This article doesn't need to explain the associated language feature. Instead, in each section, provide links to language reference or language specification material that explains the rules violated when these diagnostics appear. Add explanatory context after each correction (in parentheses with the error code). Provided brief reasons why each correction is needed. Use detailed, sentence-style explanations rather than brief imperative statements. For each recommendation put the affected error codes in parentheses, and in **bold** style. Remove extensive examples. Remove all H3 headings in this section. If any errors are no longer produced in the latest version of C#, make a note of that.
title: Errors and warnings for string literal declarations
3
-
description: This article helps you diagnose and correct compiler errors and warnings when you declare string literals as constants or variables.
2
+
title: Resolve errors and warnings for string literal declarations
3
+
description: Learn how to diagnose and correct C# compiler errors and warnings when you declare string literals, including basic strings, raw strings, and UTF-8 strings.
4
4
f1_keywords:
5
5
- "CS1009"
6
6
- "CS1011"
7
7
- "CS1012"
8
8
- "CS1039"
9
+
- "CS8996"
9
10
- "CS8997"
10
11
- "CS8998"
11
12
- "CS8999"
@@ -20,13 +21,16 @@ f1_keywords:
20
21
- "CS9008"
21
22
- "CS9009"
22
23
- "CS1010"
24
+
- "CS9026"
25
+
- "CS9047"
23
26
- "CS9274"
24
27
- "CS9315"
25
28
helpviewer_keywords:
26
29
- "CS1009"
27
30
- "CS1011"
28
31
- "CS1012"
29
32
- "CS1039"
33
+
- "CS8996"
30
34
- "CS8997"
31
35
- "CS8998"
32
36
- "CS8999"
@@ -41,14 +45,16 @@ helpviewer_keywords:
41
45
- "CS9008"
42
46
- "CS9009"
43
47
- "CS1010"
48
+
- "CS9026"
49
+
- "CS9047"
44
50
- "CS9274"
45
51
- "CS9315"
46
-
ms.date: 10/09/2025
52
+
ms.date: 12/08/2025
47
53
ai-usage: ai-assisted
48
54
---
49
-
# Errors and warnings for string literal declarations
55
+
# Resolve errors and warnings for string literal declarations
50
56
51
-
There are several errors related to declaring string constants or string literals.
57
+
The C# compiler generates errors and warnings when you declare string literals with incorrect syntax or use them in unsupported contexts. These diagnostics help you identify issues with basic string literals, character literals, raw string literals, and UTF-8 string literals.
52
58
53
59
<!-- The text in this list generates issues for Acrolinx, because they don't use contractions.
54
60
That's by design. The text closely matches the text of the compiler error / warning for SEO purposes.
@@ -58,6 +64,7 @@ That's by design. The text closely matches the text of the compiler error / warn
58
64
- [**CS1011**](#incorrectly-formed-string-literals): *Empty character literal.*
59
65
- [**CS1012**](#incorrectly-formed-string-literals): *Too many characters in character literal.*
- [**CS8996**](#incorrectly-formed-raw-string-literals): *Raw string literals are not allowed in preprocessor directives.*
61
68
- [**CS8997**](#incorrectly-formed-raw-string-literals): *Unterminated raw string literal.*
62
69
- [**CS8998**](#incorrectly-formed-raw-string-literals): *Not enough starting quotes for this raw string content.*
63
70
- [**CS8999**](#incorrectly-formed-raw-string-literals): *Line does not start with the same whitespace as the closing line of the raw string literal.*
@@ -71,54 +78,31 @@ That's by design. The text closely matches the text of the compiler error / warn
71
78
- [**CS9007**](#incorrectly-formed-raw-string-literals): *Too many closing braces for interpolated raw string literal.*
72
79
- [**CS9008**](#incorrectly-formed-raw-string-literals): *Sequence of '@' characters is not allowed.*
73
80
- [**CS9009**](#incorrectly-formed-raw-string-literals): *String must start with quote character.*
81
+
- [**CS9026**](#utf-8-string-literals): *The input string cannot be converted into the equivalent UTF-8 byte representation.*
82
+
- [**CS9047**](#utf-8-string-literals): *Operator cannot be applied to operands that are not UTF-8 byte representations.*
74
83
- [**CS9274**](#literal-strings-in-data-sections): *Cannot emit this string literal into the data section because it has XXHash128 collision with another string literal.*
75
84
- [**CS9315**](#literal-strings-in-data-sections): *Combined length of user strings used by the program exceeds allowed limit. Adding a string literal requires restarting the application.*
76
85
77
-
The following sections provide examples of common issues and how to fix them.
78
-
79
86
## Incorrectly formed string literals
80
87
81
-
The following errors concern string and character literal syntax and common mistakes when declaring literal values.
82
-
83
88
- **CS1009** - *Unrecognized escape sequence.*
84
89
- **CS1010** - *Newline in constant.*
85
90
- **CS1011** - *Empty character literal.*
86
91
- **CS1012** - *Too many characters in character literal.*
87
92
- **CS1039** - *Unterminated string literal.*
88
93
89
-
Common causes and fixes:
90
-
91
-
- Invalid escape sequences: An unexpected character follows a backslash (`\\`). Use valid escapes (`\\n`, `\\t`, `\\uXXXX`, `\\xX`) or use verbatim (`@"..."`) or raw string literals for content that includes backslashes.
92
-
- Empty or multi-character char literals: Character literals must contain exactly one UTF-16 code unit. Use a single character like `'x'` or a string / `System.Text.Rune` for characters outside the BMP.
93
-
- Unterminated strings: Ensure every string or verbatim string has a matching closing quote. For verbatim strings, the final `"` must be present; for normal strings ensure escaped quotes are balanced.
94
-
- A string literal spans multiple lines of C# source.
// public char CharField = ''; // CS1011 - invalid: empty character literal
94
+
To correct these errors, apply the following techniques:
108
95
109
-
// CS1012 - too many characters in char literal
110
-
char a = 'xx'; // CS1012 - too many characters
96
+
- Use one of the standard escape sequences defined in the [C# language specification](~/_csharpstandard/standard/lexical-structure.md#642-unicode-character-escape-sequences), such as `\n` (newline), `\t` (tab), `\\` (backslash), or `\"` (double quote) (**CS1009**). The compiler doesn't recognize escape sequences that aren't part of the language specification, so using undefined escape sequences causes this error because the compiler can't determine what character you intended to represent.
97
+
- Add the closing quote character to complete your string literal (**CS1039**). String literals must have both an opening and closing delimiter, so an unterminated string causes the compiler to treat subsequent source code as part of the string content, which leads to parsing errors.
98
+
- Add exactly one character between the single quotes in your character literal (**CS1011**, **CS1012**). Character literals represent a single character value and must contain exactly one character or a valid escape sequence, so empty character literals or those containing multiple characters violate the language rules for the `char` type.
99
+
- Split string literals that span multiple source lines by ending each line with a closing quote and starting the next line with an opening quote, using the `+` operator to concatenate them (**CS1010**). Regular string literals can't contain actual newline characters because the closing quote must appear on the same line as the opening quote, but you can achieve multi-line strings through concatenation or by using [verbatim strings](../tokens/verbatim.md) or [raw string literals](../tokens/raw-string.md), which allow embedded newlines as part of the string content.
For more information on literal strings and escape sequences, see the articles on [verbatim strings](../tokens/verbatim.md) and [raw strings](../tokens/raw-string.md).
101
+
For more information, see [strings](../builtin-types/reference-types.md#string-literals).
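
The corrections listed above can be sketched as follows. This is a minimal illustration and isn't part of the commit; the variable names and the file path are hypothetical, and each statement shows the corrected form rather than the code that triggers the diagnostic.

```csharp
using System;

// CS1009: use a recognized escape sequence, or a verbatim string for literal backslashes.
string escaped = "C:\\temp\\notes.txt";
string verbatim = @"C:\temp\notes.txt";

// CS1011, CS1012: a character literal holds exactly one character or one escape sequence.
char letter = 'x';
char tab = '\t';

// CS1010: split a long literal with the + operator instead of letting it span source lines.
string joined = "first part, " +
                "second part";

// CS1039: every string literal needs its closing quote.
string closed = "terminated";

Console.WriteLine(escaped == verbatim); // True: both spell the same path
Console.WriteLine(joined);              // first part, second part
Console.WriteLine($"{letter}{tab}{closed}");
```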
117
102
118
103
## Incorrectly formed raw string literals
119
104
120
-
The following errors are related to raw string literal syntax and usage.
121
-
105
+
- **CS8996** - *Raw string literals are not allowed in preprocessor directives.*
122
106
- **CS8997** - *Unterminated raw string literal.*
123
107
- **CS8998** - *Not enough starting quotes for this raw string content.*
124
108
- **CS8999** - *Line does not start with the same whitespace as the closing line of the raw string literal.*
@@ -133,30 +117,40 @@ The following errors are related to raw string literal syntax and usage.
133
117
- **CS9008** - *Sequence of '@' characters is not allowed.*
134
118
- **CS9009** - *String must start with quote character.*
135
119
136
-
Check these common causes and fixes:
120
+
To correct these errors, apply the following techniques:
121
+
122
+
- Use regular string literals or [verbatim string literals](../tokens/verbatim.md) instead of raw string literals in preprocessor directives like `#if`, `#define`, or `#pragma` (**CS8996**). Preprocessor directives are evaluated during the preprocessing phase before lexical analysis occurs, so the compiler can't recognize raw string literal syntax in these contexts because raw strings are identified during the later lexical analysis phase.
123
+
- Add a closing delimiter that matches the opening delimiter to complete your raw string literal (**CS8997**, **CS9004**). The [raw string literal syntax](../tokens/raw-string.md) requires that the opening and closing delimiters contain the same number of consecutive double-quote characters (at least three), so a missing or mismatched closing delimiter prevents the compiler from determining where the string content ends.
124
+
- Place the opening and closing delimiters of multi-line raw string literals on their own lines, with no other content on those lines (**CS9000**). The [multi-line raw string format rules](../tokens/raw-string.md) require delimiters to occupy dedicated lines to establish clear boundaries for the string content and to enable the whitespace trimming behavior that removes common leading indentation from all content lines.
125
+
- Add at least one line of content between the opening and closing delimiters of your multi-line raw string literal (**CS9002**). The language specification requires multi-line raw strings to contain actual content because empty multi-line raw strings serve no purpose and likely indicate incomplete code, whereas single-line raw strings (with delimiters on the same line) can be empty and are the appropriate syntax for empty string values.
126
+
- Adjust the indentation of your raw string content lines to match or exceed the indentation of the closing delimiter line (**CS8999**, **CS9003**). The [whitespace handling rules](../tokens/raw-string.md) for raw string literals use the closing delimiter's leading whitespace as the baseline for trimming common indentation from all content lines, so content lines with less indentation than the closing delimiter violate this trimming algorithm and indicate incorrect formatting.
127
+
- Increase the number of double-quote characters in your raw string delimiter to exceed any consecutive run of quote characters in the content (**CS8998**). The delimiter must contain more consecutive quotes than any sequence within the string content so the compiler can unambiguously distinguish between quote characters that are part of the content and the delimiter sequence that marks the end of the string.
128
+
- For interpolated raw string literals, ensure the number of dollar signs (`$`) at the start matches the number of consecutive opening or closing braces you need as literal content (**CS9005**, **CS9006**, **CS9007**). The [interpolated raw string syntax](../tokens/raw-string.md#raw-string-literal-text----in-string-literals) uses the dollar sign count to determine the brace escape sequence length, so `$$"""` requires `{{` for interpolation holes and allows single `{` characters as content, while mismatched brace sequences indicate either incorrect interpolation syntax or content that needs a different dollar sign count.
129
+
- Remove the `@` prefix from your raw string literal and use only the quote character delimiter (**CS9008**, **CS9009**). Raw string literals are a distinct syntax introduced in C# 11 that doesn't use the `@` verbatim string prefix, and the language specification doesn't allow combining the `@` verbatim syntax with raw string delimiters because raw strings already support multi-line content and don't require escape sequences.
137
130
138
-
- Unterminated or mismatched delimiters: Ensure your raw string starts and ends with the same number of consecutive double quotes (`"`). For multi-line raw strings, the opening and closing delimiter lines must appear on their own lines.
139
-
- Indentation and whitespace mismatch: The indentation of the closing delimiter defines the trimming of common leading whitespace for content lines. Make sure content lines align with that indentation.
140
-
- Insufficient quote or `$` counts for content: If the content begins with runs of quote characters or brace characters, increase the length of the delimiter (more `"`) or the number of leading `$` characters for interpolated raw strings so content can't be confused with delimiters or interpolation.
141
-
- Illegal characters or sequences: Avoid multiple `@` characters for verbatim/raw combinations and ensure you use verbatim interpolated forms when combining interpolation with multi-line raw strings.
131
+
> [!NOTE]
132
+
> **CS9001** is no longer produced in current versions of C#. Multi-line raw string literals now support interpolation without requiring verbatim format.
142
133
143
-
The following code shows a few examples of incorrectly formed raw string literals.
134
+
For more information, see [raw string literals](../tokens/raw-string.md).
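
As a rough sketch of several of the corrections above, the following snippet shows well-formed raw string literals. It isn't taken from the commit; the variable names and string content are hypothetical, and it assumes C# 11 or later.

```csharp
using System;

// CS8997, CS9000, CS8999: the delimiters sit on their own lines, and every content line
// starts with the same whitespace as the closing delimiter line.
string multiLine = """
    First line
    More text
    """;

// CS8998: the content contains a run of three quotes, so the delimiter uses four.
string quoted = """"Three quotes """ inside"""";

// CS9005 through CS9007: with two $ characters, {{ }} marks an interpolation
// and a single brace is literal content.
int count = 3;
string json = $$"""{"count": {{count}}}""";

Console.WriteLine(multiLine);
Console.WriteLine(quoted);
Console.WriteLine(json); // {"count": 3}
```

The closing delimiter's indentation is removed from each content line, so `multiLine` starts with `First line` and has no leading spaces.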
144
135
145
-
```csharp
146
-
// Unterminated raw string (CS8997)
147
-
var s = """This raw string never ends...
136
+
## UTF-8 string literals
148
137
149
-
// Delimiter must be on its own line (CS9000)
150
-
var t = """First line
151
-
More text
152
-
""";
153
-
```
138
+
- **CS9026** - *The input string cannot be converted into the equivalent UTF-8 byte representation.*
139
+
- **CS9047** - *Operator cannot be applied to operands that are not UTF-8 byte representations.*
154
140
155
-
For full syntax and more examples, see the [language reference on raw string literals](../tokens/raw-string.md).
141
+
To correct these errors, apply the following techniques:
142
+
143
+
- Remove characters or escape sequences that can't be encoded in UTF-8 from your `u8` string literal (**CS9026**). The [UTF-8 encoding specification](https://www.unicode.org/versions/Unicode15.0.0/ch03.pdf#G7404) supports the full Unicode character set but requires valid Unicode scalar values, so surrogate code points (values in the range U+D800 through U+DFFF) can't appear directly in UTF-8 strings because they're reserved for UTF-16 surrogate pair encoding rather than representing standalone characters, and attempting to encode them as UTF-8 would produce an invalid byte sequence.
144
+
- Ensure both operands of the addition operator are UTF-8 string literals (marked with the `u8` suffix) when concatenating UTF-8 strings (**CS9047**). The compiler provides special support for concatenating [UTF-8 string literals](../builtin-types/reference-types.md#utf-8-string-literals) at compile time, which produces `ReadOnlySpan<byte>` values representing the concatenated UTF-8 byte sequences, but mixing UTF-8 strings with regular `string` values or other types isn't supported because the type system can't determine whether to produce a byte span or a text string, and the underlying representations (UTF-8 bytes versus UTF-16 characters) are fundamentally incompatible.
145
+
146
+
For more information, see [UTF-8 string literals](../builtin-types/reference-types.md#utf-8-string-literals).
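
The two corrections above might look like the following in practice. This is a minimal sketch, not code from the commit; the variable names and string content are hypothetical, and it assumes C# 11 or later.

```csharp
using System;

// CS9047: both operands carry the u8 suffix, so the compiler can concatenate them.
ReadOnlySpan<byte> greeting = "Hello, "u8 + "world"u8;

// CS9026: the escape sequence names a valid Unicode scalar value (U+00E9),
// not a lone surrogate such as \uD800.
ReadOnlySpan<byte> accented = "caf\u00E9"u8;

Console.WriteLine(greeting.Length); // 12
Console.WriteLine(accented.Length); // 5: U+00E9 takes two bytes in UTF-8
```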
156
147
157
148
## Literal strings in data sections
158
149
159
150
- **CS9274**: *Cannot emit this string literal into the data section because it has XXHash128 collision with another string literal.*
160
151
- **CS9315**: *Combined length of user strings used by the program exceeds allowed limit. Adding a string literal requires restarting the application.*
161
152
162
-
**CS9274** indicate that your declaration can't be emitted in the data section. Disable this feature for your application. Debugging tools emit **CS9315** after you changed string data in the data section while debugging and your app must be restarted.
153
+
To fix these issues, try the following techniques:
154
+
155
+
- Disable the experimental data section string literals feature for your application when you encounter a hash collision (**CS9274**). This error indicates that two different string literals produced the same XXHash128 value, which prevents the optimization from working correctly, so you should remove the feature flag that enables this experimental behavior.
156
+
- Restart your application after modifying string literals during a debugging session when the data section feature is enabled (**CS9315**). The hot reload infrastructure can't update string literals stored in the data section because they're embedded in a special format that can't be modified at runtime, so continuing execution with the old string values would produce incorrect behavior.