The documentation for git blame states:
It is easily shown that git blame follows partial-file renames as well:
This shows 1.txt for the first line, 2.txt for the second line, and again 1.txt for the third line. A colleague suspects this line is the source of renames being turned on (and I tend to agree), and because diff_opts.rename_score is not set, it defaults to 30000/50% instead of the documented 100%.
This causes problems (appearance of hanging) when running git blame on files in some massive repositories because Git needs to compare blob contents to do partial-file rename detection, which implies trying to download millions of files one-by-one to compare them.
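To make the cost argument concrete, here is a rough Python model of the two detection passes. It is only a sketch under stated assumptions, not Git's code: `find_rename_source` and `load_blob` are hypothetical names, the object IDs are toy strings, `difflib` stands in for Git's own similarity measure, and the constants assume the 0 to 60000 score scale on which 30000 is the 50% mentioned above.

```python
from difflib import SequenceMatcher

MAX_SCORE = 60000             # assumed internal value for 100% similarity
DEFAULT_RENAME_SCORE = 30000  # the 50% default described in this issue


def find_rename_source(added_oid, deleted, load_blob, min_score=DEFAULT_RENAME_SCORE):
    """Return (path, score) of the best rename source for an added blob, or None.

    deleted maps path -> object ID of files removed in the same commit.
    load_blob fetches a blob's text by object ID (the expensive step).
    """
    # Exact pass: equal object IDs mean identical content, nothing is fetched.
    for path, oid in deleted.items():
        if oid == added_oid:
            return path, MAX_SCORE
    if min_score >= MAX_SCORE:
        return None  # only exact renames wanted: skip the content comparison
    # Inexact pass: every candidate blob has to be fetched and compared.
    new_lines = load_blob(added_oid).splitlines()
    best = None
    for path, oid in deleted.items():
        old_lines = load_blob(oid).splitlines()
        ratio = SequenceMatcher(None, old_lines, new_lines).ratio()
        score = round(ratio * MAX_SCORE)
        if score >= min_score and (best is None or score > best[1]):
            best = (path, score)
    return best


# Toy object store: 2.txt is 1.txt with its second line changed.
blobs = {"a1": "one\ntwo\nthree\n", "b2": "one\n2\nthree\n"}
fetched = []


def load_blob(oid):
    fetched.append(oid)  # record every "download"
    return blobs[oid]


print(find_rename_source("b2", {"1.txt": "a1"}, load_blob), "fetched:", fetched)
print(find_rename_source("b2", {"1.txt": "a1"}, load_blob, min_score=MAX_SCORE))
```

With the 50% default the inexact pass has to fetch blob contents; with a 100% threshold the function returns after the ID comparison alone, which is the behavior the documentation seems to promise.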
We should consider changing the default blame behavior to only follow exact whole-file renames (i.e. where the blob SHA doesn't change).
More importantly, we need to add support for arguments like -M[<n>]/--find-renames[=<n>] like git log has.
Also, we probably want to add support for git config settings to control blame's rename behavior, similar to the existing diff.renames, merge.renames, status.renames settings.
At the very least, we should update the git blame documentation to clearly state the current behavior (and not assume that all readers have the same idea of what constitutes a whole-file rename: does it have to be exact, or is an inexact match acceptable, too?).
@Copilot: here are some guidelines on how to implement a fix for this:
Look for the existing implementations of -M/--find-renames support in diff*.c, and see how much you can reuse (avoid copying code where reuse is easy).
Then look at Documentation/*diff* for the related documentation.
Now, find the spot in *blame.c where the options for git blame are parsed, and add the appropriate code to support the new options. After that, add the documentation to the proper spot in Documentation/git-blame.adoc.
This would likely be a good time to commit the changes. Do pay attention to writing a thorough commit message, in particular one that preemptively addresses any questions as to what we're doing here and why (the intent, context, implementation and justification should be contained in the commit message, leaving out any explanation of things that are readily obvious from looking at the diff).
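For the option-parsing step above, it may help to pin down what <n> means. As far as I understand the git diff documentation, the number is read as a fraction with the decimal point in front of it unless it ends in %, so -M5 means 0.5 and is the same as -M50%, while -M05 is the same as -M5%. The following Python model of that rule is a sketch for discussion, not the C code in diff*.c; `parse_similarity` is a hypothetical name and the 60000 scale is the assumption already implied by "30000/50%".

```python
MAX_SCORE = 60000             # assumed internal value for 100% similarity
DEFAULT_RENAME_SCORE = 30000  # 50%


def parse_similarity(arg: str) -> int:
    """Convert the <n> of -M<n> into a score between 0 and MAX_SCORE."""
    if arg == "":
        return DEFAULT_RENAME_SCORE  # bare -M: keep the default (choice of this sketch)
    num, scale, seen_dot = 0, 1, False
    for ch in arg:
        if ch == "." and not seen_dot:
            scale, seen_dot = 1, True      # digits after the dot form the fraction
        elif ch == "%":
            scale = scale * 100 if seen_dot else 100
            break
        elif ch in "0123456789":
            if scale < 100000:             # ignore digits beyond useful precision
                scale *= 10
                num = num * 10 + int(ch)
        else:
            raise ValueError(f"invalid similarity value: {arg!r}")
    return MAX_SCORE if num >= scale else MAX_SCORE * num // scale


for value in ("50%", "5", "05", "0.9", "100%"):
    print(value, "->", parse_similarity(value))
```

Whatever blame ends up accepting, reusing the existing parser rather than re-implementing this rule would keep the two commands consistent.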
Now it would probably be a good time to find a good spot for a regression test. Look at t/*blame*.sh to see the existing examples; test cases are all enclosed in test_expect_success calls. Find a spot that looks like it is the most appropriate to test whole-file rename detection, and then add a new test case that verifies an inexact whole-file rename is only detected with specific -M values, by running git blame with two different values.
That would be another commit.
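One way to think about the fixture for such a test: the renamed file needs a known similarity that sits between the two -M values. The Python sketch below is only a planning aid with hypothetical helper names (`make_rename_fixture`, `unchanged_percent`, `expected_blame_path`); it counts unchanged lines, whereas Git measures similarity on content, so equal-length lines are used to keep the two notions close.

```python
def make_rename_fixture(total_lines: int, changed_lines: int):
    """Return (old, new) line lists where the first changed_lines lines differ."""
    old = [f"line {i:02d}" for i in range(total_lines)]
    new = [f"edit {i:02d}" if i < changed_lines else line
           for i, line in enumerate(old)]
    return old, new


def unchanged_percent(old, new) -> int:
    """Percentage of lines left untouched by the rename-with-edit."""
    same = sum(1 for a, b in zip(old, new) if a == b)
    return 100 * same // len(old)


def expected_blame_path(similarity: int, threshold: int, old_path: str, new_path: str) -> str:
    """Unchanged lines are attributed to old_path only if the rename is followed."""
    return old_path if similarity >= threshold else new_path


old, new = make_rename_fixture(total_lines=10, changed_lines=2)
similarity = unchanged_percent(old, new)
for threshold in (50, 90):
    print(threshold, expected_blame_path(similarity, threshold, "1.txt", "2.txt"))
```

Two of ten lines changed gives roughly 80% similarity, so a 50% threshold should attribute the untouched lines to the old path and a 90% threshold should not; the real test would assert exactly that difference in the blame output.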
Now, it would be a good time to implement the support for the blame.renames config setting. To understand how to do that, look for the implementation (*.c) and documentation (Documentation/config/*) of the above-mentioned .renames settings, and then imitate them.
Do augment the git blame -M test case by changing one existing git blame -M call to specify the config setting via git -c blame.renames=... blame ... instead, and then add another git blame invocation to the same test case that verifies that -M overrides blame.renames.
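The precedence that the augmented test case is meant to verify can be written down as a small decision function. This is a Python sketch of the intended behavior only: blame.renames does not exist yet, `follow_renames` and `parse_bool` are hypothetical names, the built-in default of "on" reflects the current behavior described in this issue, and values such as copies that diff.renames also accepts are left out.

```python
_TRUE = {"true", "yes", "on", "1"}
_FALSE = {"false", "no", "off", "0", ""}


def parse_bool(value: str) -> bool:
    """Interpret the usual boolean spellings of a config value."""
    v = value.strip().lower()
    if v in _TRUE:
        return True
    if v in _FALSE:
        return False
    raise ValueError(f"not a boolean: {value!r}")


def follow_renames(cli_min_score=None, config_value=None, builtin_default=True) -> bool:
    """Decide whether blame follows renames.

    cli_min_score: score given via -M on the command line, or None if absent.
    config_value: raw value of the proposed blame.renames setting, or None if unset.
    """
    if cli_min_score is not None:
        return True                      # an explicit -M wins over the config
    if config_value is not None:
        return parse_bool(config_value)  # otherwise the config decides
    return builtin_default               # nothing set: keep today's behavior


print(follow_renames(config_value="false"))
print(follow_renames(cli_min_score=54000, config_value="false"))
```

The second call is the case the extra git blame invocation should cover: the setting says off, but the command-line option still turns rename following on.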