Skip to content

rustdoc: Fix doctest heuristic for main fn wrapping #140420

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
64 changes: 27 additions & 37 deletions src/librustdoc/doctest/make.rs
Original file line number Diff line number Diff line change
Expand Up @@ -301,8 +301,6 @@ fn parse_source(source: &str, crate_name: &Option<&str>) -> Result<ParseSourceIn

let filename = FileName::anon_source_code(&wrapped_source);

// Any errors in parsing should also appear when the doctest is compiled for real, so just
// send all the errors that librustc_ast emits directly into a `Sink` instead of stderr.
let sm = Arc::new(SourceMap::new(FilePathMapping::empty()));
let fallback_bundle = rustc_errors::fallback_fluent_bundle(
rustc_driver::DEFAULT_LOCALE_RESOURCES.to_vec(),
Expand All @@ -311,7 +309,8 @@ fn parse_source(source: &str, crate_name: &Option<&str>) -> Result<ParseSourceIn
info.supports_color =
HumanEmitter::new(stderr_destination(ColorConfig::Auto), fallback_bundle.clone())
.supports_color();

// Any errors in parsing should also appear when the doctest is compiled for real, so just
// send all the errors that the parser emits directly into a `Sink` instead of stderr.
let emitter = HumanEmitter::new(Box::new(io::sink()), fallback_bundle);

// FIXME(misdreavus): pass `-Z treat-err-as-bug` to the doctest parser
Expand Down Expand Up @@ -339,9 +338,6 @@ fn parse_source(source: &str, crate_name: &Option<&str>) -> Result<ParseSourceIn
*prev_span_hi = hi;
}

// Recurse through functions body. It is necessary because the doctest source code is
// wrapped in a function to limit the number of AST errors. If we don't recurse into
// functions, we would thing all top-level items (so basically nothing).
Comment on lines -342 to -344
Copy link
Member Author

@fmease fmease Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated since #138104 which has removed the recursion as suggested in #138104 (comment).

fn check_item(item: &ast::Item, info: &mut ParseSourceInfo, crate_name: &Option<&str>) -> bool {
let mut is_extern_crate = false;
if !info.has_global_allocator
Expand All @@ -351,8 +347,6 @@ fn parse_source(source: &str, crate_name: &Option<&str>) -> Result<ParseSourceIn
}
match item.kind {
ast::ItemKind::Fn(ref fn_item) if !info.has_main_fn => {
// We only push if it's the top item because otherwise, we would duplicate
// its content since the top-level item was already added.
Comment on lines -354 to -355
Copy link
Member Author

@fmease fmease Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Outdated since #138104 which has removed the recursion as suggested in #138104 (comment).

if fn_item.ident.name == sym::main {
info.has_main_fn = true;
}
Expand Down Expand Up @@ -412,44 +406,40 @@ fn parse_source(source: &str, crate_name: &Option<&str>) -> Result<ParseSourceIn
let mut is_extern_crate = false;
match stmt.kind {
StmtKind::Item(ref item) => {
is_extern_crate = check_item(&item, &mut info, crate_name);
}
StmtKind::Expr(ref expr) => {
if matches!(expr.kind, ast::ExprKind::Err(_)) {
reset_error_count(&psess);
return Err(());
}
has_non_items = true;
is_extern_crate = check_item(item, &mut info, crate_name);
}
// We assume that the macro calls will expand to item(s) even though they could
// expand to statements and expressions. And the simple fact that we're trying
// to retrieve a `main` function inside it is a terrible idea.
// expand to statements and expressions.
StmtKind::MacCall(ref mac_call) => {
if info.has_main_fn {
continue;
Copy link
Member Author

@fmease fmease Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the culprit that led to #140412: The continue skipped the entire span-to-string logic happening below the match which lead to the (compilefail) global_asm!(…); getting dropped right on the floor and thus not even appearing in the generated crate.

}
let mut iter = mac_call.mac.args.tokens.iter();

while let Some(token) = iter.next() {
if let TokenTree::Token(token, _) = token
&& let TokenKind::Ident(name, _) = token.kind
&& name == kw::Fn
&& let Some(TokenTree::Token(fn_token, _)) = iter.peek()
&& let TokenKind::Ident(fn_name, _) = fn_token.kind
&& fn_name == sym::main
&& let Some(TokenTree::Delimited(_, _, Delimiter::Parenthesis, _)) = {
iter.next();
iter.peek()
if !info.has_main_fn {
// For backward compatibility, we look for the token sequence `fn main(…)`
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-added the historical context that was dropped in #138104 (I rephrased it).

// in the macro input (!) to crudely detect main functions "masked by a
// wrapper macro". For the record, this is a horrible heuristic!
// See <https://github.com/rust-lang/rust/issues/56898>.
let mut iter = mac_call.mac.args.tokens.iter();
while let Some(token) = iter.next() {
if let TokenTree::Token(token, _) = token
&& let TokenKind::Ident(kw::Fn, _) = token.kind
&& let Some(TokenTree::Token(ident, _)) = iter.peek()
&& let TokenKind::Ident(sym::main, _) = ident.kind
&& let Some(TokenTree::Delimited(.., Delimiter::Parenthesis, _)) = {
iter.next();
iter.peek()
}
{
info.has_main_fn = true;
}
{
info.has_main_fn = true;
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removing this break?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was tired and thought it would break to the outer for. Reverted.

}
}
}
_ => {
StmtKind::Expr(ref expr) => {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved this one closer to the other non-items.

if matches!(expr.kind, ast::ExprKind::Err(_)) {
reset_error_count(&psess);
return Err(());
}
has_non_items = true;
}
StmtKind::Let(_) | StmtKind::Semi(_) | StmtKind::Empty => has_non_items = true,
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replaced wildcard with exhaustive matching to prevent silently dropping non-items ever again (context: #140220 (comment)).

}

// Weirdly enough, the `Stmt` span doesn't include its attributes, so we need to
Expand Down
16 changes: 0 additions & 16 deletions tests/rustdoc-ui/doctest/macro-after-main.rs
Copy link
Member Author

@fmease fmease Apr 29, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doctest should definitely not pass! It never used to pass until the regression PR which added it. It's not a proper minimization of the diesel cargotest smoke test.

We're in an item context since there's a main fn and all three statements are considered items by us as expected. Therefore, the eprintln!(); which is only valid in expression contexts will and should lead to an error down the line.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem I identified was that the MacCall pattern was skipped if we already found a main function, making all other items "non function items", leading to the regression. My change was to instead always enter in this arm and inside it check if there is a main function only if we didn't find one yet.

The code does seem valid to me considering that you can generate items with macros inside a function.

Copy link
Member Author

@fmease fmease Apr 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, that was a real regression, I'm not refuting that. As I wrote in #140420 (comment), the problem was the continue you added as a fix which skipped the entire span-to-string logic which leads to us now dropping certain things on the floor.

With my PR applied, the ParseSourceInfo.everything_else contains:

include!("./auxiliary/macro-after-main.rs");

fn main() {}
eprintln!();

Which is entirely reasonable (we consider all macro calls as item macro calls which is the only sensible thing to do with all the constraints we have).

However, on master (which continues) it contains:

include!("./auxiliary/macro-after-main.rs");

fn main() {}

Notice how we dropped the eprintln!();! That's why this doctest passes and tests the wrong things and also why ehuss's doctest in #140412 is no longer compile_fail: On master, it simply drops the malformed global_asm!(…);! That's what I'm fixing here.

This file was deleted.

6 changes: 0 additions & 6 deletions tests/rustdoc-ui/doctest/macro-after-main.stdout

This file was deleted.

Loading