Commit 7b0f229

avar authored and gitster committed
commit-graph write: add progress output
Before this change the "commit-graph write" command didn't report any
progress. On my machine this command takes more than 10 seconds to
write the graph for linux.git, and around 1m30s on the
2015-04-03-1M-git.git[1] test repository (a test case for a large
monorepository).

Furthermore, since the gc.writeCommitGraph setting was added in
d5d5d7b ("gc: automatically write commit-graph files", 2018-06-27),
there was no indication at all from a "git gc" run that anything was
different. This is why one of the progress bars being added here uses
start_progress() instead of start_delayed_progress(), so that it's
guaranteed to be seen. E.g. on my tiny 867 commit dotfiles.git
repository:

    $ git -c gc.writeCommitGraph=true gc
    Enumerating objects: 2821, done.
    [...]
    Computing commit graph generation numbers: 100% (867/867), done.

On larger repositories, such as linux.git, the delayed progress bar(s)
will kick in, and we'll show what's going on instead of, as was
previously happening, printing nothing while we write the graph:

    $ git -c gc.writeCommitGraph=true gc
    [...]
    Annotating commits in commit graph: 1565573, done.
    Computing commit graph generation numbers: 100% (782484/782484), done.

Note that here we don't show "Finding commits for commit graph". This
is because under "git gc" we seed the search with the commit references
in the repository, and that set is too small to show any progress. We
would show it on e.g. a smaller repo such as git.git with
--stdin-commits:

    $ git rev-list --all | git -c gc.writeCommitGraph=true commit-graph write --stdin-commits
    Finding commits for commit graph: 100% (162576/162576), done.
    Computing commit graph generation numbers: 100% (162576/162576), done.

With --stdin-packs we don't show any estimation of how much is left to
do. This is because we might be processing more than one pack. We
could be less lazy here and show progress, either by detecting that
we're only processing one pack, or by first looping over the packs to
discover how many commits they have. I don't see the point in doing
that work. So instead we get (on 2015-04-03-1M-git.git):

    $ echo pack-<HASH>.idx | git -c gc.writeCommitGraph=true --exec-path=$PWD commit-graph write --stdin-packs
    Finding commits for commit graph: 13064614, done.
    Annotating commits in commit graph: 3001341, done.
    Computing commit graph generation numbers: 100% (1000447/1000447), done.

No GC mode uses --stdin-packs. It's what they use at Microsoft to
manually compute the generation numbers for their collection of large
packs which are never coalesced.

The reason we need a "report_progress" variable passed down from "git
gc" is so that we don't report this output when we're running in the
process "git gc --auto" detaches from the terminal.

Since we write the commit graph from the "git gc" process itself (as
opposed to what we do with say the "git repack" phase), we'd end up
writing the output to .git/gc.log and reporting it to the user next
time as part of the "The last gc run reported the following[...]"
error, see 329e6e8 ("gc: save log from daemonized gc --auto and print
it next time", 2015-09-19).

So we must keep track of whether or not we're running in that
daemonized mode, and if so print no progress.

See [2] and subsequent replies for a discussion of an approach not
taken in compute_generation_numbers(). I.e. we're saying "Computing
commit graph generation numbers", even though on an established
history we're mostly skipping over all the work we did in the
past. This is similar to the white lie we tell in the "Writing
objects" phase (not all objects are being written).

Always showing progress is considered more important than
accuracy. I.e. on a repository like 2015-04-03-1M-git.git we'd hang
for 6 seconds with no output on the second "git gc" if no changes were
made to any objects in the interim, if we took the approach in [2].

1. https://github.com/avar/2015-04-03-1M-git
2. <[email protected]>
   (https://public-inbox.org/git/[email protected]/)

Signed-off-by: Ævar Arnfjörð Bjarmason <[email protected]>
Signed-off-by: Junio C Hamano <[email protected]>
1 parent 1d4361b commit 7b0f229
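
The two output shapes quoted in the commit message (a bare running count when the total is passed as 0, and a percentage with "(done/total)" when the total is known) can be illustrated with a small standalone sketch. This is not git's progress.c: `sketch_progress` and `sketch_format` are hypothetical names, and the sketch only formats one line. It does not model the delay, throttling, or terminal handling of the real API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for git's struct progress; not the real layout. */
struct sketch_progress {
	const char *title;
	unsigned long total; /* 0 means "total unknown" */
};

/*
 * Format one progress line into buf and return what snprintf() returns.
 * With a known total we print a percentage and "(done/total)"; with
 * total == 0 all we can print is the running count.
 */
static int sketch_format(const struct sketch_progress *p, unsigned long done,
			 char *buf, size_t len)
{
	assert(p && buf && len > 0);
	if (p->total)
		return snprintf(buf, len, "%s: %3lu%% (%lu/%lu)", p->title,
				done * 100 / p->total, done, p->total);
	return snprintf(buf, len, "%s: %lu", p->title, done);
}
```

Under these assumptions, a meter created with a total of 867 renders like the "Computing commit graph generation numbers" line, while one created with a total of 0 renders like the "Annotating commits in commit graph" line.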

4 files changed, +60 -13 lines changed

builtin/commit-graph.c

Lines changed: 3 additions & 2 deletions
@@ -151,7 +151,7 @@ static int graph_write(int argc, const char **argv)
 	opts.obj_dir = get_object_directory();
 
 	if (opts.reachable) {
-		write_commit_graph_reachable(opts.obj_dir, opts.append);
+		write_commit_graph_reachable(opts.obj_dir, opts.append, 1);
 		return 0;
 	}
 
@@ -171,7 +171,8 @@ static int graph_write(int argc, const char **argv)
 	write_commit_graph(opts.obj_dir,
 			   pack_indexes,
 			   commit_hex,
-			   opts.append);
+			   opts.append,
+			   1);
 
 	string_list_clear(&lines, 0);
 	return 0;

builtin/gc.c

Lines changed: 2 additions & 1 deletion
@@ -646,7 +646,8 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 		clean_pack_garbage();
 
 	if (gc_write_commit_graph)
-		write_commit_graph_reachable(get_object_directory(), 0);
+		write_commit_graph_reachable(get_object_directory(), 0,
+					     !daemonized);
 
 	if (auto_gc && too_many_loose_objects())
 		warning(_("There are too many unreachable loose objects; "
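
The `!daemonized` argument works because the patch leaves the progress pointer as NULL when reporting is off and then calls the display and stop functions unconditionally. A minimal standalone sketch of that pattern follows. All `sketch_*` names are hypothetical stand-ins rather than git's progress API, and the update counter exists only so the behavior can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical progress handle; not git's struct progress. */
struct sketch_progress {
	const char *title;
	int updates; /* how many times we were asked to display */
};

/* Return NULL when reporting is disabled, mirroring "progress = NULL". */
static struct sketch_progress *sketch_start(struct sketch_progress *storage,
					    const char *title,
					    int report_progress)
{
	assert(storage && title);
	if (!report_progress)
		return NULL;
	storage->title = title;
	storage->updates = 0;
	return storage;
}

/* A NULL handle makes this a no-op, so callers need no extra branches. */
static void sketch_display(struct sketch_progress *p, unsigned long n)
{
	if (!p)
		return;
	p->updates++;
	fprintf(stderr, "\r%s: %lu", p->title, n);
}

static void sketch_stop(struct sketch_progress **p)
{
	if (!*p)
		return;
	fprintf(stderr, ", done.\n");
	*p = NULL;
}

/* Walk nr_commits items; return how many progress updates were emitted. */
static int sketch_write_graph(unsigned long nr_commits, int report_progress)
{
	struct sketch_progress storage, *progress;
	unsigned long i;
	int updates;

	progress = sketch_start(&storage, "Computing commit graph generation numbers",
				report_progress);
	for (i = 0; i < nr_commits; i++)
		sketch_display(progress, i + 1);
	updates = progress ? progress->updates : 0;
	sketch_stop(&progress);
	return updates;
}
```

Calling `sketch_write_graph(n, !daemonized)` with `daemonized` set to 1 writes nothing to stderr, which is the property the commit relies on to keep progress lines out of .git/gc.log.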

commit-graph.c

Lines changed: 52 additions & 8 deletions
@@ -13,6 +13,7 @@
 #include "commit-graph.h"
 #include "object-store.h"
 #include "alloc.h"
+#include "progress.h"
 
 #define GRAPH_SIGNATURE 0x43475048 /* "CGPH" */
 #define GRAPH_CHUNKID_OIDFANOUT 0x4f494446 /* "OIDF" */
@@ -548,6 +549,8 @@ struct packed_oid_list {
 	struct object_id *list;
 	int nr;
 	int alloc;
+	struct progress *progress;
+	int progress_done;
 };
 
 static int add_packed_commits(const struct object_id *oid,
@@ -560,6 +563,9 @@ static int add_packed_commits(const struct object_id *oid,
 	off_t offset = nth_packed_object_offset(pack, pos);
 	struct object_info oi = OBJECT_INFO_INIT;
 
+	if (list->progress)
+		display_progress(list->progress, ++list->progress_done);
+
 	oi.typep = &type;
 	if (packed_object_info(the_repository, pack, offset, &oi) < 0)
 		die(_("unable to get type of object %s"), oid_to_hex(oid));
@@ -587,12 +593,18 @@ static void add_missing_parents(struct packed_oid_list *oids, struct commit *com
 	}
 }
 
-static void close_reachable(struct packed_oid_list *oids)
+static void close_reachable(struct packed_oid_list *oids, int report_progress)
 {
 	int i;
 	struct commit *commit;
+	struct progress *progress = NULL;
+	int j = 0;
 
+	if (report_progress)
+		progress = start_delayed_progress(
+			_("Annotating commits in commit graph"), 0);
 	for (i = 0; i < oids->nr; i++) {
+		display_progress(progress, ++j);
 		commit = lookup_commit(the_repository, &oids->list[i]);
 		if (commit)
 			commit->object.flags |= UNINTERESTING;
@@ -604,26 +616,36 @@ static void close_reachable(struct packed_oid_list *oids)
 	 * closure.
 	 */
 	for (i = 0; i < oids->nr; i++) {
+		display_progress(progress, ++j);
 		commit = lookup_commit(the_repository, &oids->list[i]);
 
 		if (commit && !parse_commit(commit))
 			add_missing_parents(oids, commit);
 	}
 
 	for (i = 0; i < oids->nr; i++) {
+		display_progress(progress, ++j);
 		commit = lookup_commit(the_repository, &oids->list[i]);
 
 		if (commit)
 			commit->object.flags &= ~UNINTERESTING;
 	}
+	stop_progress(&progress);
 }
 
-static void compute_generation_numbers(struct packed_commit_list* commits)
+static void compute_generation_numbers(struct packed_commit_list* commits,
+				       int report_progress)
 {
 	int i;
 	struct commit_list *list = NULL;
+	struct progress *progress = NULL;
 
+	if (report_progress)
+		progress = start_progress(
+			_("Computing commit graph generation numbers"),
+			commits->nr);
 	for (i = 0; i < commits->nr; i++) {
+		display_progress(progress, i + 1);
 		if (commits->list[i]->generation != GENERATION_NUMBER_INFINITY &&
 		    commits->list[i]->generation != GENERATION_NUMBER_ZERO)
 			continue;
@@ -655,6 +677,7 @@ static void compute_generation_numbers(struct packed_commit_list* commits)
 			}
 		}
 	}
+	stop_progress(&progress);
 }
 
 static int add_ref_to_list(const char *refname,
@@ -667,19 +690,20 @@ static int add_ref_to_list(const char *refname,
 	return 0;
 }
 
-void write_commit_graph_reachable(const char *obj_dir, int append)
+void write_commit_graph_reachable(const char *obj_dir, int append,
+				  int report_progress)
 {
 	struct string_list list;
 
 	string_list_init(&list, 1);
 	for_each_ref(add_ref_to_list, &list);
-	write_commit_graph(obj_dir, NULL, &list, append);
+	write_commit_graph(obj_dir, NULL, &list, append, report_progress);
 }
 
 void write_commit_graph(const char *obj_dir,
 			struct string_list *pack_indexes,
 			struct string_list *commit_hex,
-			int append)
+			int append, int report_progress)
 {
 	struct packed_oid_list oids;
 	struct packed_commit_list commits;
@@ -692,9 +716,12 @@ void write_commit_graph(const char *obj_dir,
 	int num_chunks;
 	int num_extra_edges;
 	struct commit_list *parent;
+	struct progress *progress = NULL;
 
 	oids.nr = 0;
 	oids.alloc = approximate_object_count() / 4;
+	oids.progress = NULL;
+	oids.progress_done = 0;
 
 	if (append) {
 		prepare_commit_graph_one(the_repository, obj_dir);
@@ -721,6 +748,11 @@ void write_commit_graph(const char *obj_dir,
 		int dirlen;
 		strbuf_addf(&packname, "%s/pack/", obj_dir);
 		dirlen = packname.len;
+		if (report_progress) {
+			oids.progress = start_delayed_progress(
+				_("Finding commits for commit graph"), 0);
+			oids.progress_done = 0;
+		}
 		for (i = 0; i < pack_indexes->nr; i++) {
 			struct packed_git *p;
 			strbuf_setlen(&packname, dirlen);
@@ -733,15 +765,21 @@ void write_commit_graph(const char *obj_dir,
 			for_each_object_in_pack(p, add_packed_commits, &oids, 0);
 			close_pack(p);
 		}
+		stop_progress(&oids.progress);
 		strbuf_release(&packname);
 	}
 
 	if (commit_hex) {
+		if (report_progress)
+			progress = start_delayed_progress(
+				_("Finding commits for commit graph"),
+				commit_hex->nr);
 		for (i = 0; i < commit_hex->nr; i++) {
 			const char *end;
 			struct object_id oid;
 			struct commit *result;
 
+			display_progress(progress, i + 1);
 			if (commit_hex->items[i].string &&
 			    parse_oid_hex(commit_hex->items[i].string, &oid, &end))
 				continue;
@@ -754,12 +792,18 @@ void write_commit_graph(const char *obj_dir,
 				oids.nr++;
 			}
 		}
+		stop_progress(&progress);
 	}
 
-	if (!pack_indexes && !commit_hex)
+	if (!pack_indexes && !commit_hex) {
+		if (report_progress)
+			oids.progress = start_delayed_progress(
+				_("Finding commits for commit graph"), 0);
 		for_each_packed_object(add_packed_commits, &oids, 0);
+		stop_progress(&oids.progress);
+	}
 
-	close_reachable(&oids);
+	close_reachable(&oids, report_progress);
 
 	QSORT(oids.list, oids.nr, commit_compare);
 
@@ -799,7 +843,7 @@ void write_commit_graph(const char *obj_dir,
 	if (commits.nr >= GRAPH_PARENT_MISSING)
 		die(_("too many commits to write graph"));
 
-	compute_generation_numbers(&commits);
+	compute_generation_numbers(&commits, report_progress);
 
 	graph_name = get_commit_graph_filename(obj_dir);
 	if (safe_create_leading_directories(graph_name))
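
In close_reachable() a single counter `j` runs across all three loops, and the list can grow during the second loop, which may be why the total passed to start_delayed_progress() is 0. The standalone sketch below models that shape on a toy history where each commit has at most one parent. `sketch_list`, `sketch_close_reachable` and the `parent_of` array are hypothetical simplifications, not git data structures.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX 16

/* Toy stand-in for struct packed_oid_list; commit ids are small ints. */
struct sketch_list {
	int items[SKETCH_MAX];
	int nr;
	unsigned char seen[SKETCH_MAX]; /* plays the role of UNINTERESTING */
};

/*
 * parent_of[i] is the single parent of commit i, or -1 for a root.
 * Returns the number of progress ticks, i.e. the final value of the
 * shared counter that the patch calls "j".
 */
static unsigned long sketch_close_reachable(struct sketch_list *oids,
					    const int *parent_of)
{
	unsigned long ticks = 0;
	int i;

	/* Pass 1: mark everything we start with. */
	for (i = 0; i < oids->nr; i++) {
		ticks++;
		oids->seen[oids->items[i]] = 1;
	}

	/* Pass 2: append unseen parents; oids->nr grows while we iterate. */
	for (i = 0; i < oids->nr; i++) {
		int parent = parent_of[oids->items[i]];

		ticks++;
		if (parent >= 0 && !oids->seen[parent]) {
			assert(oids->nr < SKETCH_MAX);
			oids->seen[parent] = 1;
			oids->items[oids->nr++] = parent;
		}
	}

	/* Pass 3: clear the marks over the now larger list. */
	for (i = 0; i < oids->nr; i++) {
		ticks++;
		oids->seen[oids->items[i]] = 0;
	}
	return ticks;
}
```

Seeding a four-commit linear history with only its tip gives 1 + 4 + 4 = 9 ticks. When the seed already contains every commit, all three passes see the same list, which matches the --stdin-packs output in the commit message, where 3001341 is exactly 3 × 1000447.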

commit-graph.h

Lines changed: 3 additions & 2 deletions
@@ -52,11 +52,12 @@ struct commit_graph {
 
 struct commit_graph *load_commit_graph_one(const char *graph_file);
 
-void write_commit_graph_reachable(const char *obj_dir, int append);
+void write_commit_graph_reachable(const char *obj_dir, int append,
+				  int report_progress);
 void write_commit_graph(const char *obj_dir,
 			struct string_list *pack_indexes,
 			struct string_list *commit_hex,
-			int append);
+			int append, int report_progress);
 
 int verify_commit_graph(struct repository *r, struct commit_graph *g);
