Treat include_str! like it produces a raw string. #143077
Conversation
There are two reasons to do this:

- It just makes sense. The contents of an included file are just like the contents of a raw string literal, because string escapes like `\"` and `\x61` don't get special treatment.
- We can avoid escaping it when putting it into `token::Lit::StrRaw`, unlike `token::Lit::Str`. On a tiny test program that included an 80 MiB file, this reduced compile time from 2.2s to 1.0s.

The change is detectable from proc macros that use `to_string` on tokens, as the change to the `expand-expr.rs` test indicates. But this kind of change is allowable, and it seems very unlikely to cause problems in practice.

r? @petrochenkov
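As a concrete illustration of the first point (a hedged sketch; the file name `data.txt` and its contents are assumptions, not part of this PR), this is the escape behaviour that makes included contents raw-string-like:

```rust
// Suppose data.txt contains exactly the four characters  \x61
// (backslash, x, 6, 1). include_str! hands them through verbatim, just like
// a raw string literal would, whereas an ordinary string literal interprets
// the escape sequence.
fn main() {
    // let from_file = include_str!("data.txt"); // 4 chars, no escape processing
    let raw = r"\x61"; // 4 chars: '\\', 'x', '6', '1'
    let cooked = "\x61"; // 1 char: 'a', because \x61 is an escape here
    assert_eq!(raw.len(), 4);
    assert_eq!(cooked, "a");
}
```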
If we introduce …
Actually, that doesn't work. Because …
```rust
pub fn expr_str_raw(&self, span: Span, s: Symbol) -> P<ast::Expr> {
    let lit = token::Lit::new(token::StrRaw(0), s, None);
    self.expr(span, ast::ExprKind::Lit(lit))
}
```
I can only assume `0` means "with 0 `#`s". But that would produce an invalid token, right? I.e. if the file being included contains `"`, this would break. Moreover, for any number of `#`s you can construct a file which would break it (`"####...`).
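To make the failure mode concrete (an illustrative sketch, not taken from the thread): a zero-hash raw string simply cannot represent contents that contain a `"`, and any fixed number of hashes only moves the problem.

```rust
fn main() {
    // If the included file were the 9 bytes  say "hi"!  then emitting it as a
    // zero-hash raw string would amount to writing  r"say "hi"!" , which does
    // not lex: the literal ends at the first inner quote. One hash fixes this
    // particular file, but a file containing  "#  defeats r#"..."# in the same
    // way, and so on for any fixed hash count.
    let one_hash_is_enough_here = r#"say "hi"!"#;
    assert_eq!(one_hash_is_enough_here.len(), 9);
}
```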
`fn desugar_doc_comments` has some logic that counts how many hashes need to be added to keep the literal in `#[doc = r"my arbitrary string from a sugared doc comment"]` well formed.
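A minimal sketch of that kind of hash counting (an illustration of the idea only, not the actual `desugar_doc_comments` code; the name `required_hashes` is made up):

```rust
// Count how many `#` characters a raw string literal needs so that `contents`
// cannot terminate it early. A raw string r##"..."## ends at a `"` followed by
// the same number of `#`s, so we need one more `#` than the longest run of
// `#`s that immediately follows a `"` inside the contents.
fn required_hashes(contents: &str) -> usize {
    let mut max = 0;
    let mut chars = contents.chars().peekable();
    while let Some(c) = chars.next() {
        if c == '"' {
            let mut run = 0;
            while chars.peek() == Some(&'#') {
                chars.next();
                run += 1;
            }
            // `"` followed by `run` hashes would close a literal using `run`
            // hashes, so at least `run + 1` are needed.
            max = max.max(run + 1);
        }
    }
    max
}

fn main() {
    assert_eq!(required_hashes("no quotes here"), 0);
    assert_eq!(required_hashes(r#"contains a " quote"#), 1);
    assert_eq!(required_hashes("ends with \"###"), 4);
}
```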
Ah, forgot to mention, for doc comments the hash counter can overflow too.
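For scale (an illustrative sketch; that the hash count is stored in a `u8` is an assumption on my part, not stated in this thread):

```rust
fn main() {
    // A pathological included file: a quote followed by 300 hashes. By the
    // counting rule sketched above it needs 301 surrounding hashes, which no
    // longer fits in a u8.
    let contents = format!("\"{}", "#".repeat(300));
    let needed = 300 + 1;
    assert_eq!(contents.len(), 301);
    assert!(needed > u8::MAX as usize);
}
```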
Technically, token kind changes are user-observable.

```rust
macro_rules! expect_nonraw {
    ("a") => {}
}
macro_rules! expect_raw {
    (r"a") => {}
}

expect_nonraw!("a");
expect_nonraw!(r"a"); // ERROR no rules expected `r"a"`
expect_raw!(r"a");
expect_raw!("a"); // ERROR no rules expected `"a"`

fn main() {}
```
Apparently my memory is failing me.
Neither normal nor raw strings are really a perfect fit for representing included strings and doc comments. I'd rather introduce a new literal kind, "undelimited raw string", for all this stuff, if not for the concerns about token matching and compatibility (#143077 (comment)). If we discern the undelimited literals from regular raw strings (with …
We can perhaps crater a change like this.
This is clearly more complicated than I realised. The alternative suggestions might be worthwhile, but they don't have to happen in this PR.