Document two partial clone bugs, fix one #556

derrickstolee · 2020-02-19T15:49:04Z

While playing with partial clone, I discovered a few bugs and document them with tests in patch 1. One seems to be a server-side bug that happens in a somewhat rare situation, but not terribly unlikely. The other is a client-side bug that leads to quadratic amounts of data transfer; I fix this bug in patch 2.

UPDATES in V2:

 * Added "|| return 1" inside the for loops.
 * Added an in-test comment about the test ordering.
 * Required protocol.version=2 in the tags test due to the bisect Junio performed.
 * Updated the commit message via Jonathan Tan's suggestion.

You can ignore the stack traces I sent earlier, as those seem to be from states I cannot get into without being destructive to my .git directory.

Thanks,
-Stolee

Cc: [email protected], [email protected], [email protected], [email protected], [email protected]

derrickstolee · 2020-02-19T16:21:01Z

/submit

gitgitgadget · 2020-02-19T16:22:04Z

Submitted as [email protected]

gitgitgadget · 2020-02-19T18:14:13Z

builtin/fetch.c

@@ -335,6 +335,7 @@ static void find_non_local_tags(const struct ref *refs,
 	struct string_list_item *remote_ref_item;


On the Git mailing list, Jonathan Tan wrote (reply to this):

> From: Derrick Stolee <[email protected]>
>
> When using partial-clone, do_oid_object_info_extended() can trigger a
> fetch for missing objects. This can be extremely expensive when asking
> for a tag or commit, as we are completely removed from the context of
> the missing object and thus supply no "haves" in the request.
>
> 6462d5eb9a (fetch: remove fetch_if_missing=0, 2019-11-05) removed a
> global variable that prevented these fetches in favor of a bitflag.
> However, some object existence checks were not updated to use this flag.
>
> Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in
> addition to OBJECT_INFO_QUICK. The _QUICK option only prevents
> repreparing the pack-file structures. We need to be extremely careful
> about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist
> due to updated refs.
>
> This resolves a broken test in t5616-partial-clone.sh.
>
> Signed-off-by: Derrick Stolee <[email protected]>

Thanks for catching this.

I wonder if the commit message in this patch could be better worded - the first paragraph seems to say that fetching missing commits and tags are expensive, but that is not the problem here; the problem is that the client lazily fetches refs advertised by the server, thinking that it is lacking them due to partial clone, even when there is no expectation that the client have them (so the commits and tags are not truly missing). So I would reword the first paragraph as:

  When using partial clone, find_non_local_tags() in builtin/fetch.c checks each remote tag to see if its object also exists locally. There is no expectation that the object exist locally, but this function nevertheless triggers a lazy fetch if the object does not exist. This can be extremely expensive when asking for a commit, as we are completely removed from the context of the non-existent object and thus supply no "haves" in the request.

All this rests on my thinking that "missing" has the connotation (or actual meaning) that we expect the object to be there. If we think that "missing" can also mean that the remote has it but the local doesn't, then you can ignore what I just said :-)

Other than that, both patches look good to me.

gitgitgadget · 2020-02-19T18:41:09Z

t/t5616-partial-clone.sh

@@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace


On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 19, 2020 at 11:22 AM Derrick Stolee via GitGitGadget
<[email protected]> wrote:
> [...]
> The tests are ordered in this way because if I swap the test order the
> tag test will succeed instead of fail. I believe this is because somehow
> we need the srv.bare repo to not have any tags when we clone, but then
> have tags in our next fetch.

This ordering requirement might deserve an in-code comment in the test
script itself. More below...

> Signed-off-by: Derrick Stolee <[email protected]>
> ---
> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> +	for i in I J K
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&

If test_commit() or git-branch fail, those failures will go unnoticed.
You can fix this by bailing from the loop, like this:

    for i in I J K
    do
        test_commit -C src $i &&
        git -C src branch $i || return 1
    done &&

Same comment applies to the other new test.

> +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> +	git -C tag-test fetch --tags origin
> +'
> +
> +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 2 pack-list &&
> +	for i in A B C
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
> +	git -C pack-test fetch origin &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 3 pack-list
> +'
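Editorial sketch: the reason the loop needs "|| return 1" is that a shell for loop reports only the exit status of its last iteration, so a failure in an earlier iteration is silently dropped. The standalone script below illustrates this outside the Git test harness. The helper names (step, loop_unchecked, loop_checked) are hypothetical and exist only for this illustration; step stands in for test_commit and fails on purpose for the item J.

```shell
#!/bin/sh
# Hypothetical stand-in for a test step: fails only for the item "J".
step () {
	test "$1" != J
}

# Without "|| return 1": the loop's exit status is that of the last
# iteration (K, which succeeds), so the failure on J is lost.
loop_unchecked () {
	for i in I J K
	do
		step $i &&
		: "more work for $i"
	done &&
	echo "unchecked: reached end"
}

# With "|| return 1": the first failing iteration aborts the function.
loop_checked () {
	for i in I J K
	do
		step $i &&
		: "more work for $i" || return 1
	done &&
	echo "checked: reached end"
}

loop_unchecked
loop_checked || echo "checked: stopped early with status $?"
```

Inside a test_expect_success or test_expect_failure body the same "return 1" works because the test body is evaluated inside a function by the test framework.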

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 2/19/2020 1:38 PM, Eric Sunshine wrote:
> On Wed, Feb 19, 2020 at 11:22 AM Derrick Stolee via GitGitGadget
> <[email protected]> wrote:
>> [...]
>> The tests are ordered in this way because if I swap the test order the
>> tag test will succeed instead of fail. I believe this is because somehow
>> we need the srv.bare repo to not have any tags when we clone, but then
>> have tags in our next fetch.
>
> This ordering requirement might deserve an in-code comment in the test
> script itself.

Can do.

> More below...
>
>> Signed-off-by: Derrick Stolee <[email protected]>
>> ---
>> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
>> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>> +	for i in I J K
>> +	do
>> +		test_commit -C src $i &&
>> +		git -C src branch $i
>> +	done &&
>
> If test_commit() or git-branch fail, those failures will go unnoticed.
> You can fix this by bailing from the loop, like this:
>
>     for i in I J K
>     do
>         test_commit -C src $i &&
>         git -C src branch $i || return 1
>     done &&
>
> Same comment applies to the other new test.

Thanks!
-Stolee

gitgitgadget · 2020-02-19T20:53:24Z

t/t5616-partial-clone.sh

@@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace


On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
> index fea56cda6d3..ed2ef45c37a 100755
> --- a/t/t5616-partial-clone.sh
> +++ b/t/t5616-partial-clone.sh
> @@ -374,6 +374,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
>  	grep "want $(cat hash)" trace
>  '
>
> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> +	for i in I J K
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> +	git -C tag-test fetch --tags origin
> +'

Is this about an ultra-recent regression? When applied directly on
top of v2.25.0, this one seems to pass already without any change.

> +test_expect_failure 'verify fetch downloads only one pack when updating refs' '
> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 2 pack-list &&
> +	for i in A B C
> +	do
> +		test_commit -C src $i &&
> +		git -C src branch $i
> +	done &&
> +	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
> +	git -C pack-test fetch origin &&
> +	ls pack-test/.git/objects/pack/*pack >pack-list &&
> +	test_line_count = 3 pack-list
> +'
> +
>  . "$TEST_DIRECTORY"/lib-httpd.sh
>  start_httpd

On the Git mailing list, Eric Sunshine wrote (reply to this):

On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <[email protected]> wrote:
> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
> > +test_expect_failure 'verify fetch succeeds when asking for new tags' '
> > +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
> > +	for i in I J K
> > +	do
> > +		test_commit -C src $i &&
> > +		git -C src branch $i
> > +	done &&
> > +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
> > +	git -C tag-test fetch --tags origin
> > +'
>
> Is this about an ultra-recent regression? When applied directly on
> top of v2.25.0, this one seems to pass already without any change.

True, although both fail when applied atop "master".

On the Git mailing list, Junio C Hamano wrote (reply to this):

Eric Sunshine <[email protected]> writes:

> On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <[email protected]> wrote:
>> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>> > +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>> > +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>> > +	for i in I J K
>> > +	do
>> > +		test_commit -C src $i &&
>> > +		git -C src branch $i
>> > +	done &&
>> > +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
>> > +	git -C tag-test fetch --tags origin
>> > +'
>>
>> Is this about an ultra-recent regression? When applied directly on
>> top of v2.25.0, this one seems to pass already without any change.
>
> True, although both fail when applied atop "master".

I flipped the first one (i.e. test #24) to expect success and ran
bisect between 3f7553ac ("Merge branch 'jt/t5616-robustify'",
2020-02-12) and the tip of 'master'.

Interesting that bisecting it points at 684ceae3 ("fetch: default to
protocol version 2", 2019-12-23).

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 9a9178fd28..099406c2f1 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -384,6 +384,32 @@ test_expect_success 'fetch lazy-fetches only to resolve deltas, protocol v2' '
 	grep "want $(cat hash)" trace
 '
 
+test_expect_success 'verify fetch succeeds when asking for new tags' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
+	for i in I J K
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
+	git -C tag-test fetch --tags origin
+'
+
+test_expect_failure 'verify fetch downloads only one pack when updating refs' '
+	git clone --filter=blob:none "file://$(pwd)/srv.bare" pack-test &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 2 pack-list &&
+	for i in A B C
+	do
+		test_commit -C src $i &&
+		git -C src branch $i
+	done &&
+	git -C srv.bare fetch origin +refs/heads/*:refs/heads/* &&
+	git -C pack-test fetch origin &&
+	ls pack-test/.git/objects/pack/*pack >pack-list &&
+	test_line_count = 3 pack-list
+'
+
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd
--
2.25.1-440-g39558b81cc

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 2/19/2020 4:17 PM, Junio C Hamano wrote:
> Eric Sunshine <[email protected]> writes:
>
>> On Wed, Feb 19, 2020 at 3:52 PM Junio C Hamano <[email protected]> wrote:
>>> "Derrick Stolee via GitGitGadget" <[email protected]> writes:
>>>> +test_expect_failure 'verify fetch succeeds when asking for new tags' '
>>>> +	git clone --filter=blob:none "file://$(pwd)/srv.bare" tag-test &&
>>>> +	for i in I J K
>>>> +	do
>>>> +		test_commit -C src $i &&
>>>> +		git -C src branch $i
>>>> +	done &&
>>>> +	git -C srv.bare fetch --tags origin +refs/heads/*:refs/heads/* &&
>>>> +	git -C tag-test fetch --tags origin
>>>> +'
>>>
>>> Is this about an ultra-recent regression? When applied directly on
>>> top of v2.25.0, this one seems to pass already without any change.
>>
>> True, although both fail when applied atop "master".
>
> I flipped the first one (i.e. test #24) to expect success and ran
> bisect between 3f7553ac ("Merge branch 'jt/t5616-robustify'",
> 2020-02-12) and the tip of 'master'.
>
> Interesting that bisecting it points at 684ceae3 ("fetch: default to
> protocol version 2", 2019-12-23).

Thanks for tracking this down. I had originally been working on top of
master, but then rebased onto v2.25.0 to test this on our VFS for
Git/Scalar fork [1]. I have since noticed that the test passes in that
case.

Thanks,
-Stolee

[1] https://github.com/microsoft/git/pull/247

gitgitgadget · 2020-02-19T21:14:26Z

On the Git mailing list, Derrick Stolee wrote (reply to this):

On 2/19/2020 11:21 AM, Derrick Stolee via GitGitGadget wrote:
> While playing with partial clone, I discovered a few bugs and document them
> with tests in patch 1. One seems to be a server-side bug that happens in a
> somewhat rare situation, but not terribly unlikely. The other is a
> client-side bug that leads to quadratic amounts of data transfer; I fix this
> bug in patch 2.

While I was able to demonstrate these bugs, after looking at my real-world
example with these fixes I found _yet another_ set of issues.

My "real world" test is the following: run "git init" and then populate
the config file with these contents:

[core]
        repositoryformatversion = 1
        filemode = true
        bare = false
        logallrefupdates = true
[remote "origin"]
        url = <url-that-supports-partial-clone>
        fetch = +refs/heads/*:refs/remotes/origin/*
        promisor = true
        partialclonefilter = blob:none
[branch "master"]
        remote = origin
        merge = refs/heads/master

Then run "git fetch origin".
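Editorial sketch: the same repository state can probably be reached with git commands rather than by editing .git/config by hand. The helper name setup_partial_fetch_repo is hypothetical, and the URL argument is a placeholder for a server that supports partial clone; the config keys are the ones listed above. "git init" normally writes the filemode, bare and logallrefupdates entries itself, so only the remaining keys are set here.

```shell
#!/bin/sh
# Hypothetical helper: create an empty repository configured like the
# hand-written config above. $1 is the directory, $2 the remote URL.
setup_partial_fetch_repo () {
	dir=$1 url=$2

	git init -q "$dir" &&
	git -C "$dir" config core.repositoryformatversion 1 &&
	git -C "$dir" config remote.origin.url "$url" &&
	git -C "$dir" config remote.origin.fetch "+refs/heads/*:refs/remotes/origin/*" &&
	git -C "$dir" config remote.origin.promisor true &&
	git -C "$dir" config remote.origin.partialclonefilter blob:none &&
	git -C "$dir" config branch.master.remote origin &&
	git -C "$dir" config branch.master.merge refs/heads/master
}

# Usage (not run here, since it needs a real server):
#   setup_partial_fetch_repo repo <url-that-supports-partial-clone> &&
#   git -C repo fetch origin
```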

First, we check if the repository contains a .gitmodules file, which
triggers a download of HEAD:.gitmodules. First HEAD, then its blob.
(This may be a bit of a red herring: I may have forgotten to delete
the contents of my refs/heads/master when setting up the test.)

#0  promisor_remote_get_direct (repo=repo@entry=0x555555a71680 <the_repo>, oids=oids@entry=0x7fffffffda70, oid_nr=oid_nr@entry=1) at promisor-remote.c:237
#1  0x000055555572bbfa in oid_object_info_extended (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffda70, oi=oi@entry=0x7fffffffd8e0, flags=flags@entry=0) at sha1-file.c:1483
#2  0x000055555572bd7c in read_object (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffda70, type=type@entry=0x7fffffffda64, size=size@entry=0x7fffffffda68) at sha1-file.c:1537
#3  0x000055555572be15 in read_object_file_extended (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffda70, type=type@entry=0x7fffffffda64, size=size@entry=0x7fffffffda68, 
    lookup_replace=lookup_replace@entry=1) at sha1-file.c:1579
#4  0x000055555572c04c in repo_read_object_file (size=0x7fffffffda68, type=0x7fffffffda64, oid=0x7fffffffda70, r=0x555555a71680 <the_repo>) at object-store.h:192
#5  read_object_with_reference (r=r@entry=0x555555a71680 <the_repo>, oid=oid@entry=0x7fffffffdc20, required_type_name=<optimized out>, size=size@entry=0x7fffffffdae8, 
    actual_oid_return=actual_oid_return@entry=0x7fffffffdaf0) at sha1-file.c:1619
#6  0x0000555555755a53 in get_tree_entry (r=r@entry=0x555555a71680 <the_repo>, tree_oid=tree_oid@entry=0x7fffffffdc20, name=name@entry=0x5555557d5ffe ".gitmodules", oid=0x7fffffffdd40, 
    mode=mode@entry=0x7fffffffdcb0) at tree-walk.c:573
#7  0x000055555572f628 in get_oid_with_context_1 (repo=repo@entry=0x555555a71680 <the_repo>, name=name@entry=0x5555557d5ff9 "HEAD:.gitmodules", flags=flags@entry=0, prefix=prefix@entry=0x0, 
    oid=oid@entry=0x7fffffffdd40, oc=oc@entry=0x7fffffffdcb0) at sha1-name.c:1899
#8  0x000055555572fea3 in get_oid_with_context (oc=0x7fffffffdcb0, oid=0x7fffffffdd40, flags=0, str=0x5555557d5ff9 "HEAD:.gitmodules", repo=0x555555a71680 <the_repo>) at sha1-name.c:1946
#9  repo_get_oid (r=r@entry=0x555555a71680 <the_repo>, name=name@entry=0x5555557d5ff9 "HEAD:.gitmodules", oid=oid@entry=0x7fffffffdd40) at sha1-name.c:1602
#10 0x000055555573d962 in config_from_gitmodules (fn=fn@entry=0x55555573da30 <gitmodules_fetch_config>, repo=0x555555a71680 <the_repo>, data=data@entry=0x7fffffffdda0) at submodule-config.c:648
#11 0x000055555573ebed in config_from_gitmodules (data=0x7fffffffdda0, repo=<optimized out>, fn=0x55555573da30 <gitmodules_fetch_config>) at submodule-config.c:637
#12 fetch_config_from_gitmodules (max_children=max_children@entry=0x555555a32894 <submodule_fetch_jobs_config>, recurse_submodules=recurse_submodules@entry=0x555555a3288c <recurse_submodules>)
    at submodule-config.c:799
#13 0x00005555555a6920 in cmd_fetch (argc=2, argv=0x7fffffffe2e8, prefix=0x0) at builtin/fetch.c:1762
#14 0x0000555555570b9d in run_builtin (argv=<optimized out>, argc=<optimized out>, p=<optimized out>) at git.c:444
#15 handle_builtin (argc=<optimized out>, argv=<optimized out>) at git.c:674
#16 0x0000555555571d05 in run_argv (argv=0x7fffffffe030, argcp=0x7fffffffe03c) at git.c:741
#17 cmd_main (argc=<optimized out>, argv=<optimized out>) at git.c:872
#18 0x00005555555707c8 in main (argc=6, argv=0x7fffffffe2c8) at common-main.c:52

THEN, we start walking the refs to see if we have the objects
locally:

#0  promisor_remote_get_direct (repo=repo@entry=0x555555a71680 <the_repo>, oids=oids@entry=0x555555af7ed0, oid_nr=oid_nr@entry=1) at promisor-remote.c:237
#1  0x000055555572bbfa in oid_object_info_extended (r=0x555555a71680 <the_repo>, oid=<optimized out>, oi=0x555555a71940 <blank_oi>, oi@entry=0x0, flags=flags@entry=4) at sha1-file.c:1483
#2  0x000055555572c420 in repo_has_object_file_with_flags (flags=<optimized out>, oid=<optimized out>, r=<optimized out>) at sha1-file.c:1935
#3  repo_has_object_file (r=<optimized out>, oid=<optimized out>) at sha1-file.c:1942
#4  0x00005555556e682f in ref_resolves_to_object (refname=0x555555af7f40 "refs/remotes/origin/next", oid=<optimized out>, flags=<optimized out>) at refs.c:261
#5  0x00005555556eb059 in files_ref_iterator_advance (ref_iterator=0x555555b27ed0) at refs/files-backend.c:754
#6  0x00005555556f1082 in ref_iterator_advance (ref_iterator=0x555555b27ed0) at refs/iterator.c:13
#7  do_for_each_repo_ref_iterator (r=0x555555a71680 <the_repo>, iter=0x555555b27ed0, fn=fn@entry=0x5555556e5930 <do_for_each_ref_helper>, cb_data=cb_data@entry=0x7fffffffdc80) at refs/iterator.c:417
#8  0x00005555556e7b79 in do_for_each_ref (refs=<optimized out>, prefix=prefix@entry=0x5555557cb095 "", fn=fn@entry=0x5555555a3cb0 <add_one_refname>, trim=trim@entry=0, flags=flags@entry=0, 
    cb_data=cb_data@entry=0x55555572bbfa <oid_object_info_extended+778>) at refs.c:1566
#9  0x00005555556e8918 in refs_for_each_ref (cb_data=0x55555572bbfa <oid_object_info_extended+778>, fn=0x5555555a3cb0 <add_one_refname>, refs=<optimized out>) at refs.c:1572
#10 for_each_ref (fn=fn@entry=0x5555555a3cb0 <add_one_refname>, cb_data=cb_data@entry=0x7fffffffdd40) at refs.c:1577
#11 0x00005555555a46d6 in find_non_local_tags (refs=0x970e93150326b500, refs@entry=0x555555b2ae40, head=head@entry=0x7fffffffde20, tail=tail@entry=0x7fffffffde28) at builtin/fetch.c:344
#12 0x00005555555a7a87 in get_ref_map (rs=0x7fffffffde90, rs=0x7fffffffde90, autotags=<synthetic pointer>, tags=<optimized out>, remote_refs=0x555555b2ae40, remote=<optimized out>) at builtin/fetch.c:523
#13 do_fetch (rs=0x7fffffffde90, transport=<optimized out>) at builtin/fetch.c:1367
#14 fetch_one (prune_tags_ok=<optimized out>, argv=<optimized out>, argc=<optimized out>, remote=<optimized out>) at builtin/fetch.c:1738
#15 cmd_fetch (argc=<optimized out>, argv=<optimized out>, prefix=<optimized out>) at builtin/fetch.c:1827
#16 0x0000555555570b9d in run_builtin (argv=<optimized out>, argc=<optimized out>, p=<optimized out>) at git.c:444
#17 handle_builtin (argc=<optimized out>, argv=<optimized out>) at git.c:674
#18 0x0000555555571d05 in run_argv (argv=0x7fffffffe050, argcp=0x7fffffffe05c) at git.c:741
#19 cmd_main (argc=<optimized out>, argv=<optimized out>) at git.c:872
#20 0x00005555555707c8 in main (argc=5, argv=0x7fffffffe2e8) at common-main.c:52

This is running through has_object_file(), but the worst part is that switching
it to has_object_file_with_flags(oid, OBJECT_INFO_SKIP_FETCH_OBJECT) will cause
warnings when we do not have the object. Something else must be done here.

Since this is more complicated to fix, I'm going to set this part aside
for now. I may come back with a test case that demonstrates the problem.

Thanks,
-Stolee

gitgitgadget · 2020-02-19T22:54:19Z

This patch series was integrated into pu via git@c58cd7e.

gitgitgadget · 2020-02-19T23:04:11Z

This branch is now known as ds/partial-clone-fixes.

gitgitgadget · 2020-02-21T01:59:56Z

This patch series was integrated into pu via git@65035e9.

gitgitgadget · 2020-02-21T03:51:02Z

This patch series was integrated into pu via git@f80c967.

While testing partial clone, I noticed some odd behavior. I was testing a way of running 'git init', followed by manually configuring the remote for partial clone, and then running 'git fetch'. Astonishingly, I saw the 'git fetch' process start asking the server for multiple rounds of pack-file downloads! When tweaking the situation a little more, I discovered that I could cause the remote to hang up with an error.

Add two tests that demonstrate these two issues.

In the first test, we find that when fetching with blob filters from a repository that previously did not have any tags, the 'git fetch --tags origin' command fails because the server sends "multiple filter-specs cannot be combined". This only happens when using protocol v2.

In the second test, we see that a 'git fetch origin' request with several ref updates results in multiple pack-file downloads. This must be due to Git trying to fault-in the objects pointed by the refs. What makes this matter particularly nasty is that this goes through the do_oid_object_info_extended() method, so there are no "haves" in the negotiation. This leads the remote to send every reachable commit and tree from each new ref, providing a quadratic amount of data transfer! This test is fixed if we revert 6462d5e (fetch: remove fetch_if_missing=0, 2019-11-05), but that revert causes other test failures. The real fix will need more care.

The tests are ordered in this way because if I swap the test order the tag test will succeed instead of fail. I believe this is because somehow we need the srv.bare repo to not have any tags when we clone, but then have tags in our next fetch.

Signed-off-by: Derrick Stolee <[email protected]>
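Editorial sketch: the "quadratic" claim can be made concrete with a toy model. This is an assumption for illustration, not a measurement of Git. Assume a linear history with some base commits, onto which new commits are added one at a time, each with its own new branch, as the for loops in the tests do. A lazy fetch with no "haves" then receives everything reachable from each new ref, while a negotiated fetch receives only the new commits. The function names are hypothetical.

```shell
#!/bin/sh
# Toy model: $1 = commits already present, $2 = number of new refs,
# each pointing at one more commit on the same linear history.

# Lazy fetch with no "haves": the server sends everything reachable
# from each new ref, i.e. base+1, base+2, ..., base+new commits.
commits_sent_without_haves () {
	base=$1 new=$2 total=0 i=1
	while [ "$i" -le "$new" ]
	do
		total=$((total + base + i))
		i=$((i + 1))
	done
	echo "$total"
}

# Negotiated fetch: only the commits the client lacks are sent.
commits_sent_with_haves () {
	echo "$2"
}

for n in 3 30 300
do
	echo "new refs: $n," \
		"without haves: $(commits_sent_without_haves 10 "$n")," \
		"with haves: $(commits_sent_with_haves 10 "$n")"
done
```

Under this model, 10 base commits and 3 new refs give 36 commits sent without "haves" versus 3 with them, and the gap grows with the square of the number of new refs.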

When using partial clone, find_non_local_tags() in builtin/fetch.c checks each remote tag to see if its object also exists locally. There is no expectation that the object exist locally, but this function nevertheless triggers a lazy fetch if the object does not exist. This can be extremely expensive when asking for a commit, as we are completely removed from the context of the non-existent object and thus supply no "haves" in the request.

6462d5e (fetch: remove fetch_if_missing=0, 2019-11-05) removed a global variable that prevented these fetches in favor of a bitflag. However, some object existence checks were not updated to use this flag.

Update find_non_local_tags() to use OBJECT_INFO_SKIP_FETCH_OBJECT in addition to OBJECT_INFO_QUICK. The _QUICK option only prevents repreparing the pack-file structures. We need to be extremely careful about supplying _SKIP_FETCH_OBJECT when we expect an object to not exist due to updated refs.

This resolves a broken test in t5616-partial-clone.sh.

Signed-off-by: Derrick Stolee <[email protected]>

derrickstolee · 2020-02-21T21:46:41Z

/submit

gitgitgadget · 2020-02-21T21:47:40Z

Submitted as [email protected]

gitgitgadget · 2020-02-22T17:28:44Z

On the Git mailing list, Junio C Hamano wrote (reply to this):

"Derrick Stolee via GitGitGadget" <[email protected]> writes:

> While playing with partial clone, I discovered a few bugs and document them
> with tests in patch 1. One seems to be a server-side bug that happens in a
> somewhat rare situation, but not terribly unlikely. The other is a
> client-side bug that leads to quadratic amounts of data transfer; I fix this
> bug in patch 2.
>
> UPDATES in V2:
>
>  * Added "|| return 1" inside the for loops.
>  * Added an in-test comment about the test ordering.
>  * Required protocol.version=2 in the tags test due to the bisect Junio
>    performed.
>  * Updated the commit message via Jonathan Tan's suggestion.

Now that this can safely be queued directly on v2.25.0, I'll
rebase it (earlier I queued it after the merge to make protocol v2
the default).

Thanks.

gitgitgadget · 2020-02-22T18:39:11Z

This patch series was integrated into pu via git@0577552.

gitgitgadget · 2020-02-25T20:10:08Z

This patch series was integrated into pu via git@ddebbbb.

gitgitgadget · 2020-02-25T20:10:09Z

This patch series was integrated into next via git@a26434b.

gitgitgadget · 2020-02-26T23:13:21Z

This patch series was integrated into pu via git@91b48d7.

gitgitgadget · 2020-02-28T00:01:43Z

This patch series was integrated into pu via git@992e35d.

gitgitgadget · 2020-03-03T02:39:44Z

This patch series was integrated into pu via git@444cff6.

gitgitgadget · 2020-03-03T02:39:45Z

This patch series was integrated into next via git@444cff6.

gitgitgadget · 2020-03-03T02:39:46Z

This patch series was integrated into master via git@444cff6.

gitgitgadget · 2020-03-03T02:39:49Z

Closed via 444cff6.

derrickstolee changed the title ~~Reveal two partial clone bugs, fix one~~ Document two partial clone bugs, fix one Feb 19, 2020

derrickstolee mentioned this pull request Feb 19, 2020

[WIP] Partial clone fix microsoft/git#247

Closed

gitgitgadget bot reviewed Feb 19, 2020

gitgitgadget bot added the pu label Feb 19, 2020

derrickstolee added 2 commits February 21, 2020 21:25

derrickstolee force-pushed the partial-clone-fix branch from 937a882 to 7c4c9f0 Compare February 21, 2020 21:25

weekly-digest bot mentioned this pull request Feb 23, 2020

Weekly Digest (16 February, 2020 - 23 February, 2020) #564

Closed

gitgitgadget bot added the next label Feb 25, 2020

weekly-digest bot mentioned this pull request Mar 1, 2020

Weekly Digest (23 February, 2020 - 1 March, 2020) #569

Closed

gitgitgadget bot added the master label Mar 3, 2020

gitgitgadget bot closed this Mar 3, 2020
