Skip to content

Commenting on a diff of a file with spaces separating short tokens makes the PR page inaccessible #14812

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
3 of 6 tasks
abread opened this issue Feb 26, 2021 · 23 comments · Fixed by #14804
Closed
3 of 6 tasks

Comments

@abread
Copy link

abread commented Feb 26, 2021

Rendering the PR page errors with template: repo/issue/view_content/comments:470:20: executing "repo/issue/view_content/comments" at <CommentMustAsDiff (index $comms 0)>: error calling CommentMustAsDiff: runtime error: slice bounds out of range [2:1]

I was unable to get a proper stacktrace (or logs), but I manually traced the offending slice indexing to this line.

The git-diff output filename extractor (readFileName) cannot properly split the old name from the new name (and panics because instead of finding b/<filename> to remove b/ it finds the single letter bit of the old filename).

git-diff outputs unquoted paths because Gitea sets core.quotePath (git config) to false. Interestingly, readFileName has code handling quoted paths already. It did work too when I changed the offending patch in the comments table to have quoted paths.

Steps to reproduce:

  1. Open a PR creating a file whose name is something like sdafdsaf z 1 - the first document.txt (anything in the format<any amount of characters that are not a space> <1 non-space character>[ <anything>] works as far as I can tell)
  2. Comment on any line of this new file in the PR
  3. The conversation tab of this PR now gives a 500 error when accessed

Fix suggestion

Maybe there's a really good reason to keep it, but imho leaving all paths returned by git quoted (core.quotePath=true) is the best option for sane path handling everywhere.

It may already be ok to do so: setting it to true didn't break any tests on my machine at least.

  • Gitea version (or commit ref): 1.13.2
  • Git version: 2.26.2
  • Operating system: Gentoo
    • using official ebuild with USE="acct build-client filecaps pam -sqlite"
    • started by OpenRC
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite (also tested)
  • Can you reproduce the bug at https://try.gitea.io:
  • Log gist: no logs appear (not even with Trace everything)
@zeripath
Copy link
Contributor

I've said it before and I'll say it again I hate our diff infrastructure.

@zeripath
Copy link
Contributor

OK so this is not rendering the diff pages themselves but rather a problem with rendering a comment on the pull itself right?

@abread
Copy link
Author

abread commented Feb 26, 2021

I think so yes (at least that's what I gather from the error message)

@zeripath
Copy link
Contributor

yup - https://try.gitea.io/templatepanic/templatepanic/pulls/3/files this doesn't panic.

@zeripath
Copy link
Contributor

OK so this is not a failure with the code I spent so effing long on making work for rendering of diffs in the first place.

@zeripath
Copy link
Contributor

No doubt the problem will be similar - in the end the reality is that there is no way to reliably split up: diff a/filename b/filename2 without forcing it to be always quoted.

@zeripath
Copy link
Contributor

Just to highlight this issue here is a particularly catastrophic example:

diff --git a/b b/b b/b b/b
deleted file mode 100644
index 92e798b..0000000
--- a/b b/b	
+++ /dev/null
@@ -1 +0,0 @@
-b b/b
diff --git a/b b/b b/b b/b b/b b/b b/b b/b
new file mode 100644
index 0000000..1bc3698
--- /dev/null
+++ b/b b/b b/b b/b	
@@ -0,0 +1 @@
+b b/b is now b b/b b/b b/b
diff --git a/b b/b b/b b/b b/b b/b
similarity index 100%
rename from b
rename to b b/b b/b b/b b/b
diff --git a/regular_text_file.rb b/regular_text_file.rb
new file mode 100644
index 0000000..6aa8f4e
--- /dev/null
+++ b/regular_text_file.rb
@@ -0,0 +1,3 @@
+# frozen_string_literal: true
+
+puts 'Foo�bar!'

from the diff --git lines alone what are the new and old filenames...

@zeripath
Copy link
Contributor

zeripath commented Feb 27, 2021

oh yeah and this particular doozy:

diff --git a/b b/b b/b b/b b/b b/b
similarity index 100%
rename from b b/b b/b b/b b/b
rename to b
diff --git a/b b/b b/b b/b b/b b/b
similarity index 100%
rename from b b/b b/b b/b
rename to b b/b

@abread
Copy link
Author

abread commented Feb 27, 2021

Yesterday I bit the bullet and looked through all the code calling git (started by looking for everything using os/exec and went up from there). I found only 43 places that are not ready to handle quoted paths (1 looks like dead code).

I think I can get a PR fixing this (and potentially other issues with paths) that flips core.quotedPath to true done today

@zeripath
Copy link
Contributor

No don't worry I'm there already

@zeripath
Copy link
Contributor

Just thinking about fixing the comment as patch code to make it quote or not but I guess that needs to be done as a different PR.

@zeripath
Copy link
Contributor

Hmm. I think I'm gonna merge the fix into #14804

zeripath added a commit to zeripath/gitea that referenced this issue Feb 27, 2021
When making comment patches unfortunately the patch does not always quote the filename
This makes the diff --git header ambiguous again.

This PR finally adds handling for ambiguity in to parse patch

Fix go-gitea#14812

Signed-off-by: Andrew Thornton <[email protected]>
@abread
Copy link
Author

abread commented Feb 27, 2021

I think it's easier to just assume paths are quoted in ParsePatch and change the ~3 callers to call git -c core.quotedpath=true ... no?

@zeripath
Copy link
Contributor

ah you've made the mistake of assuming that these particular "diffs" came from git - they're from some code that hacks them out of git diffs.

The reality is that these diffs are stored in plaintext in your db so we still need to be able to parse them.

In any case core.quotedpath doesn't guarantee what you think it does. The answer for forcing every git from 1.7.2+ to quote its paths is to pass git diff --src-prefix=\\a/ --dst-prefix=\\b/ and then trim off the \\.

Honestly it's just crazy.

@abread
Copy link
Author

abread commented Feb 27, 2021 via email

6543 pushed a commit that referenced this issue Feb 27, 2021
* CutDiffAroundLine makes the incorrect assumption that `---` and `+++` always represent part of the header of a diff.

This PR adds a flag to its parsing to prevent this problem and adds a streaming parsing technique to CutDiffAroundLine using an io.pipe instead of just sending data to an unbounded buffer.

Fix #14711

Signed-off-by: Andrew Thornton <[email protected]>

* Handle unquoted comment patch files

When making comment patches unfortunately the patch does not always quote the filename
This makes the diff --git header ambiguous again.

This PR finally adds handling for ambiguity in to parse patch

Fix #14812

Signed-off-by: Andrew Thornton <[email protected]>

* Add in testing for no error

There is no way currently for CutDiffAroundLine in this test to cause an
error however, it should still be tested.

Signed-off-by: Andrew Thornton <[email protected]>
zeripath added a commit to zeripath/gitea that referenced this issue Feb 27, 2021
Backport go-gitea#14804

* CutDiffAroundLine makes the incorrect assumption that `---` and `+++` always represent part of the header of a diff.

This PR adds a flag to its parsing to prevent this problem and adds a streaming parsing technique to CutDiffAroundLine using an io.pipe instead of just sending data to an unbounded buffer.

Fix go-gitea#14711

Signed-off-by: Andrew Thornton <[email protected]>

* Handle unquoted comment patch files

When making comment patches unfortunately the patch does not always quote the filename
This makes the diff --git header ambiguous again.

This PR finally adds handling for ambiguity in to parse patch

Fix go-gitea#14812

Signed-off-by: Andrew Thornton <[email protected]>

* Add in testing for no error

There is no way currently for CutDiffAroundLine in this test to cause an
error however, it should still be tested.

Signed-off-by: Andrew Thornton <[email protected]>
lafriks pushed a commit that referenced this issue Feb 28, 2021
Backport #14804

* CutDiffAroundLine makes the incorrect assumption that `---` and `+++` always represent part of the header of a diff.

This PR adds a flag to its parsing to prevent this problem and adds a streaming parsing technique to CutDiffAroundLine using an io.pipe instead of just sending data to an unbounded buffer.

Fix #14711

* Handle unquoted comment patch files

When making comment patches unfortunately the patch does not always quote the filename
This makes the diff --git header ambiguous again.

This PR finally adds handling for ambiguity in to parse patch

Fix #14812

* Add in testing for no error

There is no way currently for CutDiffAroundLine in this test to cause an
error however, it should still be tested.

Signed-off-by: Andrew Thornton <[email protected]>
@skillcoder
Copy link

I'm updatated gitea 1.13.2 -> 1.13.4
But I still have see broken PR

template: repo/issue/view_content/comments:470:20: executing "repo/issue/view_content/comments" at <CommentMustAsDiff (index $comms 0)>: error calling CommentMustAsDiff: runtime error: invalid memory address or nil pointer dereference

It's normal ?

@miili
Copy link

miili commented Mar 18, 2021

Same here. Is there any way to fix this?
Can we fix the bad PR or throw it away?

@zeripath
Copy link
Contributor

Please open a new issue instead of commenting on old issues and replicate the issue on try.gitea.io, and give us some logs.

Please refer to this issue on the new issue so we can go from there.

@zeripath
Copy link
Contributor

@miili without even giving us an example of what is breaking how can you expect there to be a fix? The provided testcase referred to in the in this issue is fixed and frankly it's extremely unhelpful of you to write things like: "Can we fix the bad PR or throw it away."

@miili
Copy link

miili commented Mar 19, 2021

@miili without even giving us an example of what is breaking how can you expect there to be a fix? The provided testcase referred to in the in this issue is fixed and frankly it's extremely unhelpful of you to write things like: "Can we fix the bad PR or throw it away."

@zeripath, thank you for you respone. The exact issue and how to replicate it has been described in great detail in this issue. I don't feel the need to document or replicate it again.

template: repo/issue/view_content/comments:470:20: executing "repo/issue/view_content/comments" at <CommentMustAsDiff (index $comms 0)>: error calling CommentMustAsDiff: runtime error: invalid memory address or nil pointer dereference

The bad PR is refering to the broken Pull Request on the production system, which cannot be accessed anymore.
Here is a link, if that helps https://git.pyrocko.org/pyrocko/pyrocko/pulls/159#issuecomment-785

Can the broken PR be fixed?

@zeripath
Copy link
Contributor

@miili could you update to 1.13.6 please we should get more information then

@zeripath
Copy link
Contributor

Also looking at:

https://git.pyrocko.org/pyrocko/pyrocko/pulls/159/files

I can't see an obvious thing that would cause the problem but I wonder
if you'd be able to help me out to figure out what the problem is or
could be.

Part of the problem is that the panic is occurring inside the template
executer so we're lacking more information as to exactly where the
panic is occuring. It would be possible to add a recover handler into
the function to help get information into why the result is failing
but I can't seem to replicate this locally at all.

If you would be able to give me a copy of your issue, pull_request and
comment tables as relating to this request I might be able to figure
it out.

1.13.6 has changed the code to give us more information about the panic but it would still be helpful to figure things out

@zeripath
Copy link
Contributor

@miili @skillcoder Are either of you associated with pyrocko? If so the upgrade to 1.13.6 has revealed the source of the problem.

As I suspected it was not associated to this PR.

@go-gitea go-gitea locked and limited conversation to collaborators May 13, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants