-
Notifications
You must be signed in to change notification settings - Fork 13.7k
make rustdoc::invalid_html_tags more robust #145535
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
make rustdoc::invalid_html_tags more robust #145535
Conversation
r? @notriddle rustbot has assigned @notriddle. Use |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
f2456a7
to
adf1e7d
Compare
This comment has been minimized.
This comment has been minimized.
Used pulldown-cmark is slightly old for reasons, this can be changed in more recent versions. |
// for some reason, pulldown-cmark splits html blocks into seperate events for each line. | ||
// we undo this, in order to handle multi-line tags. | ||
match (a, b) { | ||
((Event::Html(_), ra), (Event::Html(_), rb)) if ra.end == rb.start => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This won't work if you wrap an html block in a markdown blockquote.
$ cargo run --release -- --events
Finished `release` profile [optimized] target(s) in 0.06s
Running `target/release/pulldown-cmark --events`
> <div
> class="foo">
0..22: Start(BlockQuote(None))
2..22: Start(HtmlBlock)
2..7: Html(Borrowed("<div\n"))
9..22: Html(Borrowed("class=\"foo\">\n"))
2..22: End(HtmlBlock)
0..22: End(BlockQuote(None))
EOF
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think should be done instead, then? Unconditionally coalesce adjacent Html events, using Cow::Owned
if needed? That could potentially also break the check that looks for >
within the range..
The other option would be fully refactoring the parser to move its state into a struct so that it can be passed multiple strings and doesn't need to have the strings concatenated. If we're doing that we can also refactor the tag parsing a bit so unfinished tags (those missing >
) are actually treated as a separate error case from unclosed tags.
The latter would be more robust, but also a fair bit more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unconditionally coalesce adjacent Html events, using Cow::Owned if needed?
I think that's the easiest way to handle it.
while let Some((event, range)) = p.next() {
match event {
Event::Start(Tag::CodeBlock(_)) => in_code_block = true,
Event::Html(text) | Event::InlineHtml(text) if !in_code_block => {
extract_tags(&mut tags, &text, range, dox, &mut is_in_comment, &report_diag)
}
// for some reason, pulldown-cmark splits html blocks into seperate events for each line.
// we undo this, in order to handle multi-line tags.
Event::Tag(Tag::HtmlBlock) if !in_code_block => {
let mut html_text = String::new();
while let Some((event_inner, _)) = p.next() {
match event_inner {
Event::Html(text) => {
html_text.push_str(&text[..]);
}
Event::End(TagEnd::HtmlBlock) => break,
_ => unreachable!("html is supposed to be a leaf block"),
}
}
extract_tags(&mut tags, &html_text, range, dox, &mut is_in_comment, &report_diag);
}
Event::End(TagEnd::CodeBlock) => in_code_block = false,
_ => {}
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That still doesn't solve the issue of >
from backquotes. I think refactoring the parser is probably the correct thing to do here.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
50d97d1
to
6e6de07
Compare
This comment has been minimized.
This comment has been minimized.
previously, this lint did not distinguish between `<img` and `<img>`, and since the latter should be accepted under html5, the former was also accepted. the parser now also handles multi-line tags and multi-line attributes.
6e6de07
to
d022089
Compare
|
Seems like it's already a big improvement over the existing so let's merge it. @notriddle: Based on your comments, you're not completely satisfied with the current fix so I suggest we do iterative improvements in follow-up(s). Thanks to both of you in any case! @bors r+ rollup |
…tags-svg-145529, r=GuillaumeGomez make rustdoc::invalid_html_tags more robust best reviewed a commit at a time. I kept finding more edge case so I ended up having to make quite significant changes to the parser in order to make it preserve state across events and handle multiline attributes correctly. fixes rust-lang#145529
Rollup of 12 pull requests Successful merges: - #143193 (Port `#[link]` to the new attribute parsing infrastructure ) - #144373 (remove deprecated Error::description in impls) - #144885 (Implement some more checks in `ptr_guaranteed_cmp`. ) - #145535 (make rustdoc::invalid_html_tags more robust) - #145766 (test(rustfmt): Verify frontmatter is preserved) - #145811 (Fix some minor issues in comments) - #145814 (Handle unwinding fatal errors in codegen workers) - #145815 (Wait for DPkg frontend lock when trying to remove packages) - #145821 (compiletest: if a compiler fails, show its output) - #145845 (Make `x test distcheck` self-contained) - #145847 (Don't show warnings from xcrun with -Zverbose-internals) - #145856 (Update books) r? `@ghost` `@rustbot` modify labels: rollup
…tags-svg-145529, r=GuillaumeGomez make rustdoc::invalid_html_tags more robust best reviewed a commit at a time. I kept finding more edge case so I ended up having to make quite significant changes to the parser in order to make it preserve state across events and handle multiline attributes correctly. fixes rust-lang#145529
Rollup of 13 pull requests Successful merges: - #143193 (Port `#[link]` to the new attribute parsing infrastructure ) - #143689 (Allow linking a prebuilt optimized compiler-rt builtins library) - #144885 (Implement some more checks in `ptr_guaranteed_cmp`. ) - #145535 (make rustdoc::invalid_html_tags more robust) - #145766 (test(rustfmt): Verify frontmatter is preserved) - #145811 (Fix some minor issues in comments) - #145814 (Handle unwinding fatal errors in codegen workers) - #145815 (Wait for DPkg frontend lock when trying to remove packages) - #145821 (compiletest: if a compiler fails, show its output) - #145845 (Make `x test distcheck` self-contained) - #145847 (Don't show warnings from xcrun with -Zverbose-internals) - #145856 (Update books) - #145858 (Update wasm-component-ld dependency) r? `@ghost` `@rustbot` modify labels: rollup
Rollup of 12 pull requests Successful merges: - #143689 (Allow linking a prebuilt optimized compiler-rt builtins library) - #144885 (Implement some more checks in `ptr_guaranteed_cmp`. ) - #145535 (make rustdoc::invalid_html_tags more robust) - #145766 (test(rustfmt): Verify frontmatter is preserved) - #145811 (Fix some minor issues in comments) - #145814 (Handle unwinding fatal errors in codegen workers) - #145815 (Wait for DPkg frontend lock when trying to remove packages) - #145821 (compiletest: if a compiler fails, show its output) - #145845 (Make `x test distcheck` self-contained) - #145847 (Don't show warnings from xcrun with -Zverbose-internals) - #145856 (Update books) - #145858 (Update wasm-component-ld dependency) r? `@ghost` `@rustbot` modify labels: rollup
Rollup of 12 pull requests Successful merges: - #143689 (Allow linking a prebuilt optimized compiler-rt builtins library) - #144885 (Implement some more checks in `ptr_guaranteed_cmp`. ) - #145535 (make rustdoc::invalid_html_tags more robust) - #145766 (test(rustfmt): Verify frontmatter is preserved) - #145811 (Fix some minor issues in comments) - #145814 (Handle unwinding fatal errors in codegen workers) - #145815 (Wait for DPkg frontend lock when trying to remove packages) - #145821 (compiletest: if a compiler fails, show its output) - #145845 (Make `x test distcheck` self-contained) - #145847 (Don't show warnings from xcrun with -Zverbose-internals) - #145856 (Update books) - #145858 (Update wasm-component-ld dependency) r? `@ghost` `@rustbot` modify labels: rollup
Rollup merge of #145535 - lolbinarycat:rustdoc-invalid_html_tags-svg-145529, r=GuillaumeGomez make rustdoc::invalid_html_tags more robust best reviewed a commit at a time. I kept finding more edge case so I ended up having to make quite significant changes to the parser in order to make it preserve state across events and handle multiline attributes correctly. fixes #145529
Rollup of 12 pull requests Successful merges: - rust-lang/rust#143689 (Allow linking a prebuilt optimized compiler-rt builtins library) - rust-lang/rust#144885 (Implement some more checks in `ptr_guaranteed_cmp`. ) - rust-lang/rust#145535 (make rustdoc::invalid_html_tags more robust) - rust-lang/rust#145766 (test(rustfmt): Verify frontmatter is preserved) - rust-lang/rust#145811 (Fix some minor issues in comments) - rust-lang/rust#145814 (Handle unwinding fatal errors in codegen workers) - rust-lang/rust#145815 (Wait for DPkg frontend lock when trying to remove packages) - rust-lang/rust#145821 (compiletest: if a compiler fails, show its output) - rust-lang/rust#145845 (Make `x test distcheck` self-contained) - rust-lang/rust#145847 (Don't show warnings from xcrun with -Zverbose-internals) - rust-lang/rust#145856 (Update books) - rust-lang/rust#145858 (Update wasm-component-ld dependency) r? `@ghost` `@rustbot` modify labels: rollup
best reviewed a commit at a time.
I kept finding more edge case so I ended up having to make quite significant changes to the parser in order to make it preserve state across events and handle multiline attributes correctly.
fixes #145529