Skip to content

Document two partial clone bugs, fix one #556

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions builtin/fetch.c
Original file line number Diff line number Diff line change
Expand Up @@ -335,6 +335,7 @@ static void find_non_local_tags(const struct ref *refs,
struct string_list_item *remote_ref_item;
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Jonathan Tan wrote (reply to this):

> From: Derrick Stolee <[email protected]>
> 
> When using partial-clone, do_oid_object_info_extended() can trigger a
> fetch for missing objects. This can be extremely expensive when asking
> for a tag or commit, as we are completely removed from the context of
> the missing object and thus supply no "haves" in the request.
> 
> 6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
> global variable that prevented these fetches in favor of a bitflag.
> However, some object existence checks were not updated to use this flag.
> 
> Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
> addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
> repreparing the pack-file structures. We need to be extremely careful
> about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
> due to updated refs.
> 
> This resolves a broken test in t5616-partial-clone.sh.
> 
> Signed-off-by: Derrick Stolee <[email protected]>

Thanks for catching this.

I wonder if the commit message in this patch could be better worded -
the first paragraph seems to say that fetching missing commits and tags
are expensive, but that is not the problem here; the problem is that the
client lazily fetches refs advertised by the server, thinking that it is
lacking them due to partial clone, even when there is no expectation
that the client have them (so the commits and tags are not truly
missing).

So I would reword the first paragraph as:

  When using partial clone, find_non_local_tags() in builtin/fetch.c
  checks each remote tag to see if its object also exists locally. There
  is no expectation that the object exist locally, but this function
  nevertheless triggers a lazy fetch if the object does not exist. This
  can be extremely expensive when asking for a commit, as we are
  completely removed from the context of the non-existent object and
  thus supply no "haves" in the request.

All this rests on my thinking that "missing" has the connotation (or
actual meaning) that we expect the object to be there. If we think that
"missing" can also mean that the remote has it but the local doesn't,
then you can ignore what I just said :-)

Other than that, both patches look good to me.

const struct ref *ref;
struct refname_hash_entry *item = NULL;
const int quick_flags = OBJECT_INFO_QUICK | OBJECT_INFO_SKIP_FETCH_OBJECT;

refname_hash_init(&existing_refs);
refname_hash_init(&remote_refs);
Expand All @@ -353,10 +354,9 @@ static void find_non_local_tags(const struct ref *refs,
*/
if (ends_with(ref->name, "^{}")) {
if (item &&
!has_object_file_with_flags(&ref->old_oid,
OBJECT_INFO_QUICK) &&
!has_object_file_with_flags(&ref->old_oid, quick_flags) &&
!oidset_contains(&fetch_oids, &ref->old_oid) &&
!has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
!has_object_file_with_flags(&item->oid, quick_flags) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);
item = NULL;
Expand All @@ -370,7 +370,7 @@ static void find_non_local_tags(const struct ref *refs,
* fetch.
*/
if (item &&
!has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
!has_object_file_with_flags(&item->oid, quick_flags) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);

Expand All @@ -391,7 +391,7 @@ static void find_non_local_tags(const struct ref *refs,
* checked to see if it needs fetching.
*/
if (item &&
!has_object_file_with_flags(&item->oid, OBJECT_INFO_QUICK) &&
!has_object_file_with_flags(&item->oid, quick_flags) &&
!oidset_contains(&fetch_oids, &item->oid))
clear_item(item);

Expand Down
31 changes: 31 additions & 0 deletions t/t5616-partial-clone.sh
Original file line number Diff line number Diff line change
Expand Up @@ -374,6 +374,37 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
grep "want $(cat hash)" trace
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 19, 2020 at 11:22 AM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
> [...]
> The tests are ordered in this way because if I swap the test order the
> tag test will succeed instead of fail. I believe this is because somehow
> we need the srv.bare repo to not have any tags when we clone, but then
> have tags in our next fetch.

This ordering requirement might deserve an in-code comment in the test
script itself.

More below...

> Signed-off-by: Derrick Stolee <[email protected]>
> ---
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> +       git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> +       for i in I J K
> +       do
> +               test_commit -C src $i &&
> +               git -C src branch $i
> +       done &&

If test_commit() or git-branch fail, those failures will go unnoticed.
You can fix this by bailing from the loop, like this:

    for i in I J K
    do
        test_commit -C src $i &&
        git -C src branch $i || return 1
    done &&

Same comment applies to the other new test.

> +       git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> +       git -C tag-test fetch --tags origin
> +'
> +
> +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
> +       git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
> +       ls pack-test/.git/objects/pack/*pack >pack-list &&
> +       test_line_count = 2 pack-list &&
> +       for i in A B C
> +       do
> +               test_commit -C src $i &&
> +               git -C src branch $i
> +       done &&
> +       git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
> +       git -C pack-test fetch origin &&
> +       ls pack-test/.git/objects/pack/*pack >pack-list &&
> +       test_line_count = 3 pack-list
> +'

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 2/19/2020 1:38 PM, Eric Sunshine wrote:
> On Wed, Feb 19, 2020 at 11:22 AM Derrick Stolee via GitGitGadget
> <[email protected]> wrote:
>> [...]
>> The tests are ordered in this way because if I swap the test order the
>> tag test will succeed instead of fail. I believe this is because somehow
>> we need the srv.bare repo to not have any tags when we clone, but then
>> have tags in our next fetch.
> 
> This ordering requirement might deserve an in-code comment in the test
> script itself.

Can do.

> More below...
> 
>> Signed-off-by: Derrick Stolee <[email protected]>
>> ---
>> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
>> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>> +       git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>> +       for i in I J K
>> +       do
>> +               test_commit -C src $i &&
>> +               git -C src branch $i
>> +       done &&
> 
> If test_commit() or git-branch fail, those failures will go unnoticed.
> You can fix this by bailing from the loop, like this:
> 
>     for i in I J K
>     do
>         test_commit -C src $i &&
>         git -C src branch $i || return 1
>     done &&
> 
> Same comment applies to the other new test.

Thanks!

-Stolee

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index fea56cda6d3..ed2ef45c37a 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>  	grep "want $(cat hash)" trace
>  '
>  
> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> +	for i in I J K
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> +	git -C tag-test fetch --tags origin
> +'

Is this about an ultra-recent regresssion?  When applied directly on
top of v2.25.0, this one seems to pass already without any change.

> +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 2 pack-list &&
> +	for i in A B C
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
> +	git -C pack-test fetch origin &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 3 pack-list
> +'
> +
>  . "$TEST_DIRECTORY"/lib-httpd.sh
>  start_httpd

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <[email protected]> wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> > +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> > +     git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> > +     for i in I J K
> > +     do
> > +             test_commit -C src $i &&
> > +             git -C src branch $i
> > +     done &&
> > +     git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> > +     git -C tag-test fetch --tags origin
> > +'
>
> Is this about an ultra-recent regresssion?  When applied directly on
> top of v2.25.0, this one seems to pass already without any change.

True, although both fail when applied atop "master".

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Junio C Hamano wrote (reply to this):

Eric Sunshine <[email protected]> writes:

> On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <[email protected]> wrote:
>> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>> > +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>> > +     git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>> > +     for i in I J K
>> > +     do
>> > +             test_commit -C src $i &&
>> > +             git -C src branch $i
>> > +     done &&
>> > +     git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
>> > +     git -C tag-test fetch --tags origin
>> > +'
>>
>> Is this about an ultra-recent regresssion?  When applied directly on
>> top of v2.25.0, this one seems to pass already without any change.
>
> True, although both fail when applied atop "master".

I flipped the first one (i.e. test #24) to expect success and run
bisect between 3f7553ac ("Merge branch 'jt/t5616-robustify'",
2020-02-12) and the tip of 'master'.

Interesting that bisecting it points at 684ceae3 ("fetch: default to
protocol version 2", 2019-12-23).

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 9a9178fd28..099406c2f1 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -384,6 +384,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_success 'verify fetch succeeds when asking for new tags' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
+	for i in I J K
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
+	git -C tag-test fetch --tags origin
+'
+
+test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 2 pack-list &&
+	for i in A B C
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
+	git -C pack-test fetch origin &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 3 pack-list
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
 
-- 
2.25.1-440-g39558b81cc

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 2/19/2020 4:17 PM, Junio C Hamano wrote:
> Eric Sunshine <[email protected]> writes:
> 
>> On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <[email protected]> wrote:
>>> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>>>> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>>>> +     git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>>>> +     for i in I J K
>>>> +     do
>>>> +             test_commit -C src $i &&
>>>> +             git -C src branch $i
>>>> +     done &&
>>>> +     git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
>>>> +     git -C tag-test fetch --tags origin
>>>> +'
>>>
>>> Is this about an ultra-recent regresssion?  When applied directly on
>>> top of v2.25.0, this one seems to pass already without any change.
>>
>> True, although both fail when applied atop "master".
> 
> I flipped the first one (i.e. test #24) to expect success and run
> bisect between 3f7553ac ("Merge branch 'jt/t5616-robustify'",
> 2020-02-12) and the tip of 'master'.
> 
> Interesting that bisecting it points at 684ceae3 ("fetch: default to
> protocol version 2", 2019-12-23).

Thanks for tracking this down. I had originally been working on
top of master, but then rebased onto v2.25.0 to test this on our
VFS for Git/Scalar fork [1]. I have since noticed that the test
passes in that case.

Thanks,
-Stolee

[1] https://github.com/microsoft/git/pull/247

'

# The following two tests must be in this order, or else
# the first will not fail. It is important that the srv.bare
# repository did not have tags during clone, but has tags
# in the fetch.

test_expect_failure 'verify fetch succeeds when asking for new tags' '
git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
for i in I J K
do
test_commit -C src $i &&
git -C src branch $i || return 1
done &&
git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
git -C tag-test -c protocol.version=2 fetch --tags origin
'

test_expect_success 'verify fetch downloads only one pack when updating refs' '
git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
ls pack-test/.git/objects/pack/*pack >pack-list &&
test_line_count = 2 pack-list &&
for i in A B C
do
test_commit -C src $i &&
git -C src branch $i || return 1
done &&
git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
git -C pack-test fetch origin &&
ls pack-test/.git/objects/pack/*pack >pack-list &&
test_line_count = 3 pack-list
'

. "$TEST_DIRECTORY"/lib-httpd.sh
start_httpd

Expand Down