Commit 42661c5

Add note to corpus to assure we don't forget another worst-case repo (#851)
1 parent d9d9bc0 commit 42661c5

1 file changed: +10 -4 lines changed

etc/corpus/README.md

````diff
--- a/etc/corpus/README.md
+++ b/etc/corpus/README.md
@@ -49,19 +49,25 @@ your respective `repo_metadata.jsonl` and the computed amount of repos to includ
 
 #### Add one large (100GB+) repository and one with a lot of commits repository by hand
 
-Invoke `git clone --bare https://github.com/NagatoDEV/PlayStation-Home-Master-Archive <corpus>/github.com/NagatoDEV/PlayStation-Home-Master-Archive` (after replacing `<curpus>` with your base path)
+Invoke `git clone --bare https://github.com/NagatoDEV/PlayStation-Home-Master-Archive <corpus>/github.com/NagatoDEV/PlayStation-Home-Master-Archive.git` (after replacing `<curpus>` with your base path)
 to obtain one sample of a huge repository with a lot of assets and other binary data whose tree spans more than 440k files.
 
 That way, we also get to see what happens when we have to handle huge binary files in massive trees.
 
 Another massive tree and a more than 1.3m commits comes in with this invocation:
 
-`git clone --bare https://github.com/archlinux/svntogit-community <corpus>/github.com/archlinux/svntogit-community`.
+`git clone --bare https://github.com/archlinux/svntogit-community <corpus>/github.com/archlinux/svntogit-community.git`.
+
+This repo has 100MB+ files with a lot of append-only changes to it, giving it a very imbalanced delta-tree that triggers worst-case behaviour that needed
+special mitigations:
+
+`git clone --bare https://github.com/fz139/vigruzki <corpus>/github.com/fz139/vigruzki.git`.
+
+All repos should be topped off with…
 
-Both repos should be topped off with
 ```shell
 cd <corpus>
-for d in github.com/archlinux/svntogit-community github.com/NagatoDEV/PlayStation-Home-Master-Archive; do
+for d in github.com/archlinux/svntogit-community.git github.com/NagatoDEV/PlayStation-Home-Master-Archive.git github.com/fz139/vigruzki.git; do
 git -C $d read-tree @
 git -C $d commit-graph write --no-progress --reachable
 done
````
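
The `.git` suffix this change appends to every clone target can be derived mechanically from the clone URL. The following is a minimal sketch, not part of the repository: the helper name `corpus_path` and the corpus root `/tmp/corpus` are assumptions made for illustration. It only prints the clone commands for the three repositories named in the diff instead of running them.

```shell
#!/bin/sh
# Hypothetical helper (not part of the repository): map a clone URL to the
# bare-repository path inside the corpus, i.e. <corpus>/<host>/<owner>/<name>.git
corpus_path() {
    corpus=$1
    url=$2
    rel=${url#https://}   # drop the scheme, keeping host/owner/name
    rel=${rel%.git}       # tolerate URLs that already end in .git
    printf '%s/%s.git\n' "$corpus" "$rel"
}

# Dry run: print the clone commands rather than executing them.
# /tmp/corpus is an assumed example base path.
for url in \
    https://github.com/NagatoDEV/PlayStation-Home-Master-Archive \
    https://github.com/archlinux/svntogit-community \
    https://github.com/fz139/vigruzki
do
    echo git clone --bare "$url" "$(corpus_path /tmp/corpus "$url")"
done
```

Dropping the `echo` would perform the clones; given the repository sizes stated in the README text, that is best done deliberately rather than as part of a quick check.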
